LUCENE-9727: build side support for running Hunspell tests. #2313

dweiss · 2021-02-07T11:03:30Z

No description provided.

dweiss · 2021-02-07T11:06:26Z

@donnerpeter I've done some minor edits so that those checks run via Gradle too. Can you take a peek? I made one change that may affect you - those dictionaries are now scanned recursively without the depth limit - this helps me to run against both repos at once.

donnerpeter · 2021-02-07T13:08:33Z

lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java

- * documentation}
+ * Loads all dictionaries from the directory specified in {@code hunspell.dictionaries} system
+ * property and prints their memory usage. All *.aff files are traversed directly inside the given
+ * directory or in its immediate subdirectories. Each *.aff file must have a same-named sibling


Not only immediate now?

Yep, thanks.
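
For readers following the Javadoc discussion above, here is a minimal sketch of what an unbounded recursive scan for *.aff files could look like. It is not the code from this PR: the class name `AffFileScanner` and both helper methods are hypothetical, and only the `hunspell.dictionaries` property name and the same-named sibling convention come from the Javadoc quoted in the diff.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, written for illustration only.
final class AffFileScanner {

  /** Returns every *.aff file under root, at any depth, in sorted order. */
  static List<Path> findAffFiles(Path root) throws IOException {
    // Files.walk without a maxDepth argument descends the whole tree,
    // so several repositories checked out under one parent are all covered.
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".aff"))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  /** Resolves the same-named sibling file with a .dic extension for an *.aff file. */
  static Path siblingDic(Path aff) {
    String name = aff.getFileName().toString();
    String base = name.substring(0, name.length() - ".aff".length());
    return aff.resolveSibling(base + ".dic");
  }

  public static void main(String[] args) throws IOException {
    String dir = System.getProperty("hunspell.dictionaries");
    if (dir == null) {
      System.out.println("hunspell.dictionaries is not set; nothing to scan");
      return;
    }
    for (Path aff : findAffFiles(Paths.get(dir))) {
      Path dic = siblingDic(aff);
      System.out.println(aff + " -> " + dic + (Files.exists(dic) ? "" : " (missing)"));
    }
  }
}
```

Pointing the property at a common parent directory is then enough to pick up dictionaries from more than one checkout.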

donnerpeter · 2021-02-07T13:10:35Z

lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java

+    int threads = Runtime.getRuntime().availableProcessors();
+    ExecutorService executor = Executors.newFixedThreadPool(threads);
+    try {
+      Deque<Path> failures = new ConcurrentLinkedDeque<>();


synchronizedList might look less surprising to readers :)

Doesn't matter, it's a test, but sure.
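
As an illustration of the suggestion above, this is a rough sketch of a fixed-size thread pool that records failures in a `Collections.synchronizedList`. It is an assumption about the general shape, not the PR's actual test code: `ParallelChecks`, `collectFailures`, the one-minute timeout and the sample file names are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper, written for illustration only.
final class ParallelChecks {

  /** Runs check on every item in parallel and returns the items for which it threw. */
  static <T> List<T> collectFailures(List<T> items, Consumer<T> check)
      throws InterruptedException {
    int threads = Runtime.getRuntime().availableProcessors();
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    // A synchronized wrapper makes add() safe to call from the worker threads.
    List<T> failures = Collections.synchronizedList(new ArrayList<>());
    try {
      for (T item : items) {
        executor.execute(
            () -> {
              try {
                check.accept(item);
              } catch (RuntimeException e) {
                failures.add(item);
              }
            });
      }
    } finally {
      // Stop accepting new tasks; already submitted ones still run.
      executor.shutdown();
    }
    // The timeout is an arbitrary value chosen for this sketch.
    if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
      throw new IllegalStateException("checks did not finish in time");
    }
    return new ArrayList<>(failures);
  }

  public static void main(String[] args) throws InterruptedException {
    List<String> failed =
        collectFailures(
            Arrays.asList("ok.aff", "broken.aff"),
            name -> {
              if (name.startsWith("broken")) {
                throw new IllegalStateException("cannot parse " + name);
              }
            });
    System.out.println("Failures: " + failed);
  }
}
```

Either collection works here, since the workers only append; the order of recorded failures depends on thread scheduling in both cases.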

donnerpeter · 2021-02-07T13:11:41Z

lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java

-        + "suffixes="
-        + RamUsageTester.humanSizeOf(dic.suffixes)
-        + ")";
+        + ("words=" + RamUsageTester.humanSizeOf(dic.words) + ", ")


A nice trick to save on LOCs in the face of this ruthless spotless formatter :)

Yes, such hints and groupings make logical sense and also help with formatting.
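
A small sketch of the grouping trick mentioned above: each parenthesized operand is one logical field, which tends to keep a label and its value together when the expression is formatted one operand per line. `GroupedConcat` and `humanSize` are hypothetical stand-ins (the real code calls `RamUsageTester.humanSizeOf`), and the exact size format is invented for the example.

```java
// Hypothetical example, written for illustration only.
final class GroupedConcat {

  /** Stand-in size formatter; the format is an assumption, not Lucene's. */
  static String humanSize(long bytes) {
    return bytes < 1024 ? bytes + " B" : (bytes / 1024) + " KB";
  }

  /** Builds a one-line summary with one parenthesized group per field. */
  static String describe(long words, long suffixes) {
    return "("
        + ("words=" + humanSize(words) + ", ")
        + ("suffixes=" + humanSize(suffixes))
        + ")";
  }

  public static void main(String[] args) {
    System.out.println(describe(2048, 10));
  }
}
```

The parentheses do not change the resulting string; they only change how the expression is grouped in the source.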

donnerpeter

@donnerpeter I've done some minor edits so that those checks run via Gradle too. Can you take a peek? I made one change that may affect you - those dictionaries are now scanned recursively without the depth limit - this helps me to run against both repos at once.

Thank you very much, LGTM overall! Do you check out both repos into some common parent directory? Looks like a nice idea.

donnerpeter · 2021-02-07T13:12:19Z

lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java

-        System.err.println("While checking " + aff + ":");
-        e.printStackTrace();
+    int threads = Runtime.getRuntime().availableProcessors();
+    ExecutorService executor = Executors.newFixedThreadPool(threads);


Thanks for parallelization! I've considered it, but never got to do it.

donnerpeter · 2021-02-07T13:14:23Z

lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestPerformance.java

@@ -40,8 +42,15 @@
 * en.txt}) in a directory specified in {@code -Dhunspell.corpora=...}
 */
 @TestCaseOrdering(TestCaseOrdering.AlphabeticOrder.class)
-@Ignore("enable manually")


I actually took advantage of this. Running all tests in a package skipped the ignored ones, and so the correctness was checked relatively quickly. Now that'd also include performance/allDict tests, and can become slower :( But I can work around that locally.

Not if you create a launch config that just has one of those properties (just the hunspell.dictionaries) - then perf tests would be skipped. If you launch everything from the same launcher then indeed this will be a problem. I don't care much but it'd be my preference to control it somehow without modifying the code itself. You could use a custom property, test group annotation... whatever works. I think two properties already separate these tests well enough though.

Sure, that's the way to go
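
To make the "two properties" idea above concrete, here is a minimal sketch of gating each check on its own system property, so a launcher that sets only one of them runs only the matching check. The property names `hunspell.dictionaries` and `hunspell.corpora` appear in this PR; the class `PropertyGate` and its method are hypothetical. In a JUnit test the same decision would typically go through `Assume.assumeTrue`, and the exact mechanism the PR uses is not shown in this thread.

```java
import java.util.Optional;

// Hypothetical helper, written for illustration only.
final class PropertyGate {

  /** Returns the property value, or empty if it is unset or blank. */
  static Optional<String> requiredProperty(String name) {
    String value = System.getProperty(name);
    return (value == null || value.trim().isEmpty()) ? Optional.empty() : Optional.of(value);
  }

  public static void main(String[] args) {
    // Each check is controlled by its own property, so no code edits are needed
    // to switch one on or off.
    for (String name : new String[] {"hunspell.dictionaries", "hunspell.corpora"}) {
      Optional<String> value = requiredProperty(name);
      if (value.isPresent()) {
        System.out.println("would run the check for " + name + " against " + value.get());
      } else {
        System.out.println("skipping the check for " + name + " (property not set)");
      }
    }
  }
}
```

Running with only `-Dhunspell.dictionaries=...` set would then skip the corpora-based check.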

dweiss · 2021-02-07T18:36:34Z

Do you check out both repos into some common parent directory?

Yes, that's what I did. I just wanted a superset of all the dictionaries... Perhaps eventually make it a gradle task to fetch them automatically and then run integration tests on them. Just goofing around with the code though - I'm not trying to step in your way!

donnerpeter · 2021-02-07T19:43:31Z

Do you check out both repos into some common parent directory?

Yes, that's what I did. I just wanted a superset of all the dictionaries... Perhaps eventually make it a gradle task to fetch them automatically and then run integration tests on them. Just goofing around with the code though - I'm not trying to step in your way!

No worries, and thanks :)

dweiss added 2 commits February 7, 2021 12:02

LUCENE-9727: add build-side support for running Hunspell validation checks.

728caac

Fixup.

3362e56

dweiss added 3 commits February 7, 2021 12:48

Use parallelism to speed up parsing dictionaries.

a4a21ad

Tidy.

1254050

Merge remote-tracking branch 'origin/master' into LUCENE-9727

dfc3ca6

donnerpeter reviewed Feb 7, 2021

donnerpeter approved these changes Feb 7, 2021

Correct comments, offending API use.

d66fa22

LUCENE-9740: add an ignored test to scan *.aff files for SET and FLAG directive offsets.

107498f

dweiss merged commit 903782d into apache:master Feb 8, 2021

epugh pushed a commit to epugh/lucene-solr-1 that referenced this pull request Feb 8, 2021

LUCENE-9727: build side support for running Hunspell tests. (apache#2313)

be453ed

epugh pushed a commit to epugh/lucene-solr-1 that referenced this pull request Feb 8, 2021

LUCENE-9727: build side support for running Hunspell tests. (apache#2313)

ba9bd6e

asfimport mentioned this pull request Dec 8, 2021

Add build-side support for running full validation checks against hunspell repos [LUCENE-9727] apache/lucene#10766

Closed
