Spell checking tool #7209

Open
leventov opened this issue Mar 7, 2019 · 3 comments

@leventov
Member

leventov commented Mar 7, 2019

Introduce a tool to catch spelling and common typing problems, such as "the the", "decomission", etc.

In the meantime, we can use Checkstyle's Regexp pattern, but that's pretty inefficient(?) and may slow down the build.
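For illustration, the kind of regular expression such a rule would need for the "the the" case could look roughly like the sketch below. It is plain Java rather than a Checkstyle configuration, and the class and method names (`RepeatedWordCheck`, `findRepeatedWords`) are hypothetical, not anything that exists in Druid or Checkstyle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Druid or Checkstyle.
class RepeatedWordCheck {
    // A word, some whitespace, then the same word again, e.g. "the the".
    private static final Pattern REPEATED_WORD =
        Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);

    // Returns every repeated-word occurrence found in a single line of text.
    static List<String> findRepeatedWords(String line) {
        List<String> hits = new ArrayList<>();
        Matcher m = REPEATED_WORD.matcher(line);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }
}
```

A pattern like this also flags legitimate repetitions such as "had had", which is one source of the false positives a broader check would have to deal with.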

Inspired by #7154.

@gianm
Contributor

gianm commented Mar 8, 2019

I'm not sure if a generic spell check is worth it -- there might be way too many false positives and it would just get annoying -- but something that looks only for specific misspellings may be worth it. (Although, of limited usefulness, since it wouldn't be checking very much.)

I'd suggest doing something that can be applied through Maven (rather than IntelliJ inspections) since it's more accessible. Everyone working on Druid has Maven, not everyone has IntelliJ, and it's a better experience when people can verify their own stuff before submitting it up as a PR. The IntelliJ inspection errors won't be obvious until TC runs, which introduces delay in the PR cycle.

@gianm
Contributor

gianm commented Mar 8, 2019

Btw, did you just mean docs? I was figuring this suggestion was about adding a spell checker specifically for docs, but just realized the top comment didn't mention that.

@leventov
Member Author

leventov commented Mar 8, 2019

> I'm not sure if a generic spell check is worth it -- there might be way too many false positives and it would just get annoying -- but something that looks only for specific misspellings may be worth it. (Although, of limited usefulness, since it wouldn't be checking very much.)
>
> I'd suggest doing something that can be applied through Maven (rather than IntelliJ inspections) since it's more accessible. Everyone working on Druid has Maven, not everyone has IntelliJ, and it's a better experience when people can verify their own stuff before submitting it up as a PR. The IntelliJ inspection errors won't be obvious until TC runs, which introduces delay in the PR cycle.

Yes, that's exactly what I meant: a Maven plugin that is fed a list of stop words (BTW, we can include profanity there, too). My only concern with Checkstyle's Regexp rules is that they are regular expressions, and running a lot of them over the whole repo may slow down the build. The stop-word plugin should do a simple exact-match scan.
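
As a rough sketch of what the core of such an exact-match scan might look like (the Maven plugin wiring is left out, and `StopWordScanner` and `scan` are hypothetical names, not an existing plugin's API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scanning step only; no Maven plugin API is used here.
class StopWordScanner {
    // Reports "<line number>: <stop word>" for every stop word that occurs in a line.
    // Matching is a plain case-sensitive substring check, with no regular expressions.
    static List<String> scan(List<String> lines, List<String> stopWords) {
        List<String> findings = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            for (String word : stopWords) {
                if (line.contains(word)) {
                    findings.add((i + 1) + ": " + word);
                }
            }
        }
        return findings;
    }
}
```

Because this is a substring match, a stop word like "decomission" would also flag "decomissioned", which is probably the desired behavior for misspellings. Whether to lowercase lines before matching is a separate choice this sketch does not make.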
