Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to lucene-7.0.0-snapshot-a0aef2f #24775

Merged
merged 1 commit into from May 19, 2017

Conversation

@nknize
Copy link
Member

commented May 18, 2017

This PR upgrades master to a current Lucene snapshot with commit id a0aef2f.

@nknize nknize requested a review from jpountz May 18, 2017
@nknize

This comment has been minimized.

Copy link
Member Author

commented May 18, 2017

There is currently a failing test that I could use another set of eyes on. Might be something simple, but its in the Ukrainian analyzer which I have little experience with. @cbuescher Could you have a look?

REPRODUCE WITH: gradle :plugins:analysis-ukrainian:test -Dtests.seed=4FA45DF62C3EBDFD -Dtests.class=org.elasticsearch.index.analysis.SimpleUkrainianAnalyzerTests -Dtests.method="testBasicUsage" -Dtests.security.manager=true -Dtests.locale=es-NI -Dtests.timezone=Europe/Paris
ERROR   1.60s J0 | SimpleUkrainianAnalyzerTests.testBasicUsage <<< FAILURES!
   > Throwable #1: java.lang.NullPointerException
   >    at __randomizedtesting.SeedInfo.seed([4FA45DF62C3EBDFD:14B7E413A036E21D]:0)
   >    at morfologik.stemming.Dictionary.read(Dictionary.java:80)
   >    at org.apache.lucene.analysis.uk.UkrainianMorfologikAnalyzer.getDictionary(UkrainianMorfologikAnalyzer.java:154)
   >    at org.apache.lucene.analysis.uk.UkrainianMorfologikAnalyzer.createComponents(UkrainianMorfologikAnalyzer.java:148)
   >    at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:198)
   >    at org.elasticsearch.index.analysis.SimpleUkrainianAnalyzerTests.testAnalyzer(SimpleUkrainianAnalyzerTests.java:46)
   >    at org.elasticsearch.index.analysis.SimpleUkrainianAnalyzerTests.testBasicUsage(SimpleUkrainianAnalyzerTests.java:37)
   >    at java.lang.Thread.run(Thread.java:745)
@cbuescher

This comment has been minimized.

Copy link
Member

commented May 18, 2017

@nknize It appears the Ukrainian dictionary resource cannot be loaded by the Analyzer. When debugging this I see that in UkrainianMorfologikAnalyzer#getDictionary the resource that should be loaded by UkrainianMorfologikAnalyzer.class.getClassLoader().getResource("ua/net/nlp/ukrainian.dict") is null. Since this is internal to Lucene I don't know exactly what might have changed there but will see what happens instead on master where this seems to work.

@cbuescher

This comment has been minimized.

Copy link
Member

commented May 18, 2017

There appears to be a packaging problem with the lucene-analyzers-morfologik-7.0.0-snapshot jar. The one that is still used on master (snapshot-89f6d17) contains the ukrainian.dict, the one used in this PR (snapshot-a0aef2f) appears to be missing this file. From my gradle cache:

~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-morfologik/7.0.0-snapshot-89f6d17/1fe11b45d9f6a68ef1e9994bebd81d26632efc5$ jar -tf lucene-analyzers-morfologik-7.0.0-snapshot-89f6d17.jar
META-INF/
META-INF/MANIFEST.MF
META-INF/services/
org/
org/apache/
org/apache/lucene/
org/apache/lucene/analysis/
org/apache/lucene/analysis/morfologik/
org/apache/lucene/analysis/uk/
META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory
org/apache/lucene/analysis/morfologik/MorfologikAnalyzer.class
org/apache/lucene/analysis/morfologik/MorfologikFilter.class
org/apache/lucene/analysis/morfologik/MorfologikFilterFactory.class
org/apache/lucene/analysis/morfologik/MorphosyntacticTagsAttribute.class
org/apache/lucene/analysis/morfologik/MorphosyntacticTagsAttributeImpl.class
org/apache/lucene/analysis/morfologik/package-info.class
org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer$DefaultSetHolder.class
org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.class
org/apache/lucene/analysis/uk/mapping_uk.txt
org/apache/lucene/analysis/uk/package-info.class
org/apache/lucene/analysis/uk/stopwords.txt
org/apache/lucene/analysis/uk/ukrainian.dict
org/apache/lucene/analysis/uk/ukrainian.info
META-INF/LICENSE.txt
META-INF/NOTICE.txt

~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-morfologik/7.0.0-snapshot-a0aef2f/26d01ae0d15243b30874b2cb609be5d041890459$ jar -tf lucene-analyzers-morfologik-7.0.0-snapshot-a0aef2f.jar
META-INF/
META-INF/MANIFEST.MF
META-INF/services/
org/
org/apache/
org/apache/lucene/
org/apache/lucene/analysis/
org/apache/lucene/analysis/morfologik/
org/apache/lucene/analysis/uk/
META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory
org/apache/lucene/analysis/morfologik/MorfologikAnalyzer.class
org/apache/lucene/analysis/morfologik/MorfologikFilter.class
org/apache/lucene/analysis/morfologik/MorfologikFilterFactory.class
org/apache/lucene/analysis/morfologik/MorphosyntacticTagsAttribute.class
org/apache/lucene/analysis/morfologik/MorphosyntacticTagsAttributeImpl.class
org/apache/lucene/analysis/morfologik/package-info.class
org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer$DefaultSetHolder.class
org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.class
org/apache/lucene/analysis/uk/mapping_uk.txt
org/apache/lucene/analysis/uk/package-info.class
org/apache/lucene/analysis/uk/stopwords.txt
META-INF/LICENSE.txt
META-INF/NOTICE.txt
@nknize

This comment has been minimized.

Copy link
Member Author

commented May 18, 2017

Ah, I missed that issue! Thanks so much @jimczi

@cbuescher I can fix this as part of the PR if its easier. No difference to me.

@cbuescher

This comment has been minimized.

Copy link
Member

commented May 18, 2017

Great, I think it should be part of this PR. I just tried updating the plugin dependencies to:

-  compile "org.carrot2:morfologik-stemming:2.1.0"
-  compile "org.carrot2:morfologik-fsa:2.1.0"
+  compile "org.carrot2:morfologik-stemming:2.1.1"
+  compile "org.carrot2:morfologik-fsa:2.1.1"
+  compile "ua.net.nlp:morfologik-ukrainian-search:3.7.5"

And this seems to work. I got the version numbers from https://github.com/apache/lucene-solr/blob/master/lucene/ivy-versions.properties but I'm not entirely sure if these are the correct ones for this snapshot version (its lucene master, I'm not sure about how your branches are organized)

Furthermore I'm wondering how we are usually supposed to track transitive dependency version updates like this (e.g. going from morfologik-fsa:2.1.0 to morfologik-fsa:2.1.1 in this case) for lucene analyzers etc... Any thoughts on this are appreciated.

@nknize nknize force-pushed the nknize:upgrade/lucene7-a0aef2f branch May 18, 2017
@jimczi

This comment has been minimized.

Copy link
Member

commented May 18, 2017

Furthermore I'm wondering how we are usually supposed to track transitive dependency version updates like this (e.g. going from morfologik-fsa:2.1.0 to morfologik-fsa:2.1.1 in this case) for lucene analyzers etc... Any thoughts on this are appreciated.

I think it should be part of the upgrade process, I'll add a line in the docs about checking the external dependencies. For instance the upgrade of icu4j is missing in master:
https://issues.apache.org/jira/browse/LUCENE-7035
No need to add this change in this PR, we can fix the discrepancy in a follow up.

@jimczi
jimczi approved these changes May 18, 2017
Copy link
Member

left a comment

LGTM
@nknize you'll have to re-run updateSHAS to include the new jars.

Copy link
Contributor

left a comment

Thanks @nknize !

@nknize nknize force-pushed the nknize:upgrade/lucene7-a0aef2f branch May 18, 2017
This commit upgrades master to a current lucene snapshot with commit id a0aef2f.
@nknize nknize force-pushed the nknize:upgrade/lucene7-a0aef2f branch to deb7caf May 19, 2017
@nknize nknize merged commit deb7caf into elastic:master May 19, 2017
1 of 2 checks passed
1 of 2 checks passed
elasticsearch-ci Build triggered. sha1 is merged.
Details
CLA Commit author is a member of Elasticsearch
Details
@clintongormley clintongormley added v6.0.0-beta1 and removed v6.0.0 labels Jul 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.