jsoref spell checker: remove resource from validation #5693

Closed
romani opened this issue Apr 5, 2018 · 13 comments
Comments

Member

romani commented Apr 5, 2018

Folders the spell checker should skip:
src/test/resources/*
src/it/resources/*
These folders contain ugly code by design; all mistakes in them are made on purpose for testing.

Clean up https://github.com/checkstyle/contribution/blob/master/jsoref-spellchecker/whitelist.words to remove odd names.

Member Author

romani commented Apr 6, 2018

@jsoref, can you help us with this issue?

Contributor

jsoref commented Apr 8, 2018

Sure. Offhand, the place to put this is near the find argument. IIRC it currently excludes "images"; you could use the same style for "resources", assuming you're willing to exclude all folders named "resources".

I'll look in more detail tomorrow.
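
As an illustration of the exclusion style described in this comment, here is a minimal sketch using `find`. The spell checker's actual script is not shown in this thread, so the function name `list_checked_files` and the exact path patterns are assumptions. This variant skips only the two resource folders named in the issue rather than every folder called "resources".

```shell
# Hypothetical helper: list the files under "$1" that would be spell-checked.
# The "images" exclusion mirrors what the comment recalls; the two
# "resources" exclusions are the folders named in the issue.
list_checked_files() {
  find "$1" -type f \
    ! -path '*/images/*' \
    ! -path '*/src/test/resources/*' \
    ! -path '*/src/it/resources/*'
}
```

Calling `list_checked_files .` from the repository root would still list files under `src/main/resources`, because only the test and integration-test resource folders are filtered out.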

Member Author

romani commented Apr 8, 2018

Yes, you might already have a tool or command that can easily remove all words that are no longer found from the whitelist.
There are many resources folders, and some of them are worth validating with the spell checker; we will skip only src/test/resources/* and src/it/resources/*.
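
One way such a cleanup command could look is sketched below. It is not the tool's own command: the function name `prune_whitelist` is hypothetical, and it assumes a whitelist with one word per line and a plain case-insensitive substring search, which is cruder than the way the spell checker splits words.

```shell
# Hypothetical helper: print only the whitelist words that still occur
# somewhere under a source directory.
#   $1 = whitelist file (one word per line)
#   $2 = directory to search
prune_whitelist() {
  while IFS= read -r word; do
    # Skip blank lines in the whitelist.
    [ -n "$word" ] || continue
    # -r: recurse, -q: quiet, -i: ignore case, -F: fixed string (no regex).
    if grep -rqiF -e "$word" "$2"; then
      printf '%s\n' "$word"
    fi
  done < "$1"
}
```

For example, `prune_whitelist whitelist.words src/main > whitelist.pruned` would keep only words that still appear under `src/main`. It runs one `grep` per word, so it is slow on a large whitelist.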

jsoref added a commit to jsoref/checkstyle that referenced this issue Apr 8, 2018
@romani romani added this to the 8.10 milestone Apr 11, 2018
Member Author

romani commented Apr 11, 2018

@jsoref, thanks a lot for the filter.
Can you send us a PR to clean up the whitelist?

I want to activate your tool on PR validation. Right now it is disabled (it always returns 0 as its exit code).

Contributor

jsoref commented Apr 12, 2018

@romani sure: checkstyle/contribution#296 -- it's pretty small :-)

Contributor

jsoref commented Apr 12, 2018

If you're asking about this PR, that was already done in #5693 (reference)

Member Author

romani commented Apr 12, 2018

The biggest problem with the whitelist is that it contains too many words from our inputs/resources, for example https://github.com/checkstyle/contribution/blob/master/jsoref-spellchecker/whitelist.words#L3585

✔ ~/java/github/checkstyle/checkstyle [master|✔] 
$ grep -Rl "ZZZZZZ" *
src/test/resources/com/puppycrawl/tools/checkstyle/checks/javadoc/abstractjavadoc/InputAbstractJavadocPositionWithSinglelineComments.java

Once we get rid of them, the whitelist should become small, and it might become reasonable to host it in the main repo.
ZZZZZZ and similar words should not be placed in the whitelist.

Member Author

romani commented Apr 15, 2018

@jsoref, the whitelist file still contains odd words; please help clean it up.
I made several changes to the whitelist to resolve diffs at CIs.

Contributor

jsoref commented Apr 15, 2018

Hmm...
We could drop:

  • single letter repetitions (e.g. ZZZZZ)
  • unicode things: u[a-f]+
  • hex things: [a-f]+ (I'm not terribly comfortable with this one, it could easily result in letting a misspelling through)

I think this set should be a pretty good win...
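
The three patterns in the list above can be tried out on a word list with `grep`. This is a minimal sketch, not the spell checker's actual code: the function name `filter_whitelist` is hypothetical, and anchoring each pattern to the whole word and matching case-insensitively are assumptions.

```shell
# Hypothetical helper: read words on stdin (one per line) and drop those
# matching the three patterns from the list above.
filter_whitelist() {
  # 1. single-letter repetitions such as ZZZZZ (BRE back-reference)
  # 2. unicode-escape leftovers: u[a-f]+
  # 3. hex leftovers: [a-f]+
  LC_ALL=C grep -v '^\(.\)\1\{1,\}$' |
    LC_ALL=C grep -Eiv '^u[a-f]+$' |
    LC_ALL=C grep -Eiv '^[a-f]+$'
}
```

Fed the words `ZZZZZ`, `uffff`, `facade` and `checkstyle`, the function keeps only `checkstyle`. That `facade` is dropped as well illustrates the concern about the hex pattern: real words made only of the letters a-f, and misspellings of them, would go unchecked.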

Member Author

romani commented Apr 16, 2018

@jsoref, one more nuance:

~/java/github/checkstyle/contribution/jsoref-spellchecker [master L|✔] $ grep -C 2 "fffff" whitelist.words 
xfffb
xffff
xffffffff
xffffffffffffffff

~/java/github/checkstyle/checkstyle [master L|✔] $ grep -C 2 -r "xffffffffffffffff" *
.../MagicNumberCheckTest.java-            "103:36: " + getCheckMessage(MSG_KEY, "-2.5"),
.../MagicNumberCheckTest.java-            "109:34: " + getCheckMessage(MSG_KEY, "0xffffffff"),
.../MagicNumberCheckTest.java:            "110:36: " + getCheckMessage(MSG_KEY, "0xffffffffffffffffL"),

Also, words inside URLs are validated (they should be skipped):

~/java/github/checkstyle/contribution/jsoref-spellchecker [master L|✔] $ grep -C 2 "CHDCHAHD" whitelist.words 
CHDCEAHH
CHDCFJDG
CHDCHAHD
CHDCHBAE
CHDDCDHH

~/java/github/checkstyle/checkstyle [master L|✔] $ grep -r "CHDCHAHD" *
src/main/java/com/puppycrawl/tools/checkstyle/api/JavadocTokenTypes.java:     * <a href="https://docs.oracle.com/javase/8/docs/technotes/tools/unix/javadoc.html#CHDCHAHD">

How can we skip hex values in validation?
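
One possible approach, sketched here under assumptions, is to strip URLs and `0x` hex literals from the text before it is split into words. The function name `strip_noise` and both regular expressions are hypothetical; nothing in this thread shows how the spell checker tokenizes its input.

```shell
# Hypothetical pre-filter: remove URLs and 0x... hex literals from stdin
# so that fragments like CHDCHAHD or ffffffff never reach the word list.
strip_noise() {
  sed -E \
    -e 's#https?://[^[:space:]"<>]+##g' \
    -e 's/0[xX][0-9a-fA-F]+[lL]?//g'
}
```

Piping the `JavadocTokenTypes.java` line above through `strip_noise` would leave an empty `href=""`, and the `MagicNumberCheckTest.java` lines would keep their surrounding code but lose the `0xffffffff` literals.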

Member Author

romani commented Apr 16, 2018

One more proof that resources are not excluded:

$ grep -rl "soooooooooooooooooooooooooooooooooooolongfooooooooooooooooooooooooooooooooooooooooooo" *
src/it/resources/com/google/checkstyle/test/chapter3filestructure/rule32packagestate/InputLineLength.java
src/it/resources/com/google/checkstyle/test/chapter3filestructure/toolongpackagetotestcoveragegooglesjavastylerule/InputLineLength.java

Contributor

jsoref commented Apr 17, 2018

Ok, the problem is that the files don't start with a leading / and thus the exclude isn't working...
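
The failure mode described here can be reproduced with a small sketch. The file list and both filter functions are hypothetical stand-ins, since the real exclude expression is not quoted in this thread; the point is only that a pattern requiring a leading `/` never matches relative paths.

```shell
# Hypothetical file list, printed as relative paths (no leading slash).
list_files() {
  printf '%s\n' \
    'src/test/resources/InputFoo.java' \
    'src/it/resources/InputBar.java' \
    'src/main/resources/messages.properties' \
    'src/main/java/Main.java'
}

# Broken: requires a "/" before "src", which relative paths never have,
# so nothing is excluded.
exclude_broken() {
  grep -Ev '/src/(test|it)/resources/'
}

# Fixed: accept either the start of the line or a "/" before "src".
exclude_fixed() {
  grep -Ev '(^|/)src/(test|it)/resources/'
}
```

Running `list_files | exclude_broken` prints all four paths, while `list_files | exclude_fixed` prints only the two paths under `src/main`.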

Member Author

romani commented Apr 17, 2018

The issue is resolved.

@romani romani closed this as completed Apr 17, 2018