-
Notifications
You must be signed in to change notification settings - Fork 256
Allow Users to specify Elasticsearch Major Version #9696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
….8.x - 7.x.x by default
…or compatibility with elasticsearch 7
chrisvittal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. Need to add an ENV_VAR call.
hail/Makefile
Outdated
| JAR ?= jar | ||
| endif | ||
|
|
||
| ELASTIC_MAJOR_VERSION ?= 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should move this up into the hail/spark/scala version block, and add a call to ENV_VAR like on line 20.
hail/build.gradle
Outdated
| return "6.8.13" | ||
| } | ||
| else if (elasticMajorVersion == "7") { | ||
| return "7.1.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use 7.1.1 here? It looks like the latest version is 7.10.0.
https://www.elastic.co/guide/en/elasticsearch/hadoop/current/install.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran into some version conflict with another dependency when I tried 7.10, just need to sort that out. Changed it to 7.1.1 just to make sure this selection method works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like by running 7.1.1 I can write data to 6.8.x, so I think I just don't trust that matrix at all anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that matrix is "... platform and software configurations that are eligible for support under our subscription offerings".
Yeah, I was also previously using Hail's ES-Hadoop 6.2.4 to write to ES 5.5, which is not officially supported according to that matrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, found a statement in documentation:
Elasticsearch for Apache Hadoop maintains backwards compatibility with the most recent minor version of Elasticsearch’s previous major release (5.X supports back to 2.4.X, 6.X supports back to 5.6.X, etc…).
-- https://www.elastic.co/guide/en/elasticsearch/hadoop/current/install.html#upgrading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, nice! Gnomad uses 6.8 right? I wonder if the right thing to do is just to change the default to use ES-hadoop 7 then. Anyone using 5 is probably pretty out of date already? Idk how much of a lift updating elasticsearch is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with defaulting to 7. gnomAD is on 6.8, so it's not an issue (and should be upgraded anyway). Everything less than 7 is now end of life (https://www.elastic.co/support/eol).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I landed on 7.8.1. Seems like that's the last one before I start running into py4j issues. I'll figure those out at a later date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a good plan to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also need to add a dependency to the $(SHADOW_JAR) target to ensure it's rebuilt when the elasticsearch version changes.
Users want to be able to use elasticsearch 7. This adds an argument to make so that we pull in a different version of
elasticsearch-spark-20depending on what they specify forELASTIC_MAJOR_VERSION.