Fix missing service files in ES-Hadoop jars #1265
Merged
When creating the jar files for ES-Hadoop, each integration copies the contents of the MR jar into itself, since the MR jar contains all the core code. Once each jar is built, they all contribute their contents to the top-level elasticsearch-hadoop jar (ignoring duplicate code files). A problem occurs during these jar transitions: the contents of `META-INF/services` are not copied along. This previously manifested as not being able to create a Spark SQL dataframe using the short name `"es"` when using the `elasticsearch-hadoop-x.x.x.jar`. Creating the dataframe using the short name would work fine when using the `elasticsearch-spark-yy_zz-x.x.x.jar`, because it contains the appropriate service file, which is never copied up to the root jar.

Now that we have Kerberos integrated, there are several items in different projects' services directories that all need to be copied around in order for different Kerberos features in Hadoop and Spark to function normally.
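One way to confirm the symptom described above is to list what a given jar actually ships under `META-INF/services`. The following is a minimal sketch for local inspection only; it is not part of this PR, and the helper name `list_service_providers` is hypothetical.

```python
import zipfile

SERVICES_PREFIX = "META-INF/services/"


def list_service_providers(jar_path):
    """Return {service file name: [provider class names]} for one jar.

    Hypothetical helper for inspecting a built jar; not part of the build.
    """
    providers = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Skip everything outside META-INF/services and directory entries.
            if not name.startswith(SERVICES_PREFIX) or name.endswith("/"):
                continue
            text = jar.read(name).decode("utf-8")
            # Service files hold one class name per line; '#' starts a comment.
            stripped = [line.split("#", 1)[0].strip() for line in text.splitlines()]
            providers[name[len(SERVICES_PREFIX):]] = [entry for entry in stripped if entry]
    return providers
```

Running this against both the root jar and an integration jar from the same build should make any missing service file visible as a key present in one result but absent from the other.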
We did not encounter these problems in testing because we make use of a separate hadoop testing jar, which is created directly from the sources of the projects instead of from the jar files, and which includes all the test and integration test sources.
This PR ensures that the contents of the `mr` project's `META-INF/services` directory are copied into the hive, pig, spark, and storm jars, and that the contents of all of the integrations' `META-INF/services` directories are copied into the root elasticsearch-hadoop jar.