Forbid capital letters in various sentence and noun phrase constructors #3693

Open
balacij opened this issue Feb 16, 2024 · 0 comments
Labels
newcomers Good first issue to work on!

balacij (Collaborator) commented Feb 16, 2024

I'm sure there is an example that breaks my simple-minded approach, but my first instinct is for everything (even Crawford) to be lowercase in the data. The generator knows that Crawford is a proper noun, so it adds the capital at generation time. The exception would be acronyms that use funky capitalization, like LaTeX.

Originally posted by @smiths in #3692 (comment)
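
The idea in the quote above can be sketched roughly as follows. This is a minimal illustration, not Drasil's actual representation: the `NounKind` and `Noun` types and the `capFirst` and `renderNoun` functions are all hypothetical names made up for this example. It assumes the data is stored in lowercase and that a tag on the noun tells the generator when to capitalize. Acronyms with unusual capitalization, such as LaTeX, are not handled here and would need their own case.

```haskell
import Data.Char (toUpper)

-- Hypothetical types for illustration only; these are NOT Drasil's definitions.
data NounKind = Common | Proper
  deriving (Eq, Show)

data Noun = Noun
  { nounKind :: NounKind
  , nounText :: String  -- assumed to be stored entirely in lowercase
  }

-- Capitalize the first character of a string, leaving the rest untouched.
capFirst :: String -> String
capFirst []       = []
capFirst (c : cs) = toUpper c : cs

-- The renderer, not the stored data, decides on capitalization.
renderNoun :: Noun -> String
renderNoun (Noun Proper s) = capFirst s
renderNoun (Noun Common s) = s
```

With these definitions, `renderNoun (Noun Proper "crawford")` capitalizes the name at generation time, while a common noun is emitted exactly as stored.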

#3534 is the original issue, where @samm82 found sentence and noun phrase entries that contained hard-coded capital letters. We want Drasil to handle capitalization entirely on its own, which means enriching the data as much as we can and minimizing the raw capital letters we enter. To enforce this, we can add a simple error case to all of our sentence and noun phrase constructors that rejects any input containing capital letters.
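
One possible shape for such an error case is sketched below. The `Phrase` type and the `mkPhrase` and `mkPhrase'` constructors are hypothetical stand-ins, not Drasil's real constructors, and the check shown (rejecting any uppercase character) is an assumption about what "forbid capital letters" would mean in practice. Exceptions such as acronyms would need a separate, explicitly marked constructor.

```haskell
import Data.Char (isUpper)

-- Hypothetical stand-in for a noun phrase type; NOT Drasil's actual type.
newtype Phrase = Phrase String
  deriving (Eq, Show)

-- Smart constructor: reports an error if the input contains any uppercase
-- character, otherwise wraps the string unchanged.
mkPhrase :: String -> Either String Phrase
mkPhrase s
  | any isUpper s = Left ("capital letter in phrase data: " ++ show s)
  | otherwise     = Right (Phrase s)

-- Variant that fails loudly via 'error' when the check does not pass,
-- which is closer to a "simple error case" inside an existing constructor.
mkPhrase' :: String -> Phrase
mkPhrase' = either error id . mkPhrase
```

The `Either` version makes the failure visible in the type, while the `error` version keeps existing call sites unchanged at the cost of failing at run time. Which trade-off suits Drasil is left open here.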

To find examples of the problematic string entries that need fixing, use grep or rg, similar to what I did in my comment on #3692.

balacij added the newcomers (Good first issue to work on!) label on Feb 16, 2024