Unable to save from spark(on mesos) to Elastic search #579

Closed
priyanknarvekar opened this issue Oct 27, 2015 · 8 comments

Comments

@priyanknarvekar

I got an error saying that two versions of the ES-Hadoop library were found on the classpath.
I noticed there are two entries, but both point to the same version.

15/10/26 22:18:06 ERROR Version: Multiple ES-Hadoop versions detected in the classpath; please use only one
jar:file:/tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S1/frameworks/3014350f-cd05-44af-9c9c-3974bdeed86e-0016/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S1/runs/3697bc14-f0bd-48b9-8599-9683867eec63/elasticsearch-spark_2.10-2.2.0-m1.jar
jar:file:/tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S1/frameworks/3014350f-cd05-44af-9c9c-3974bdeed86e-0016/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S1/runs/3697bc14-f0bd-48b9-8599-9683867eec63/./elasticsearch-spark_2.10-2.2.0-m1.jar

@costin
Member

costin commented Oct 27, 2015

There should be only one version of the jar, just as the error indicates. We can't reliably tell where a class is loaded from, or whether the containing jar (it can just be a folder) is the same version as the other one.
Is there any reason why you have two and not just one?

@priyanknarvekar
Author

If you look closely, it's actually a single version (elasticsearch-spark_2.10-2.2.0-m1.jar).
I am running Spark on Mesos and setting the jar via addjars in Spark, and I add it only once.

@costin
Member

costin commented Oct 27, 2015

I see there's only one jar but there are two different entries. What's your classpath?

@priyanknarvekar
Author

The spark-mesos executor is running with the following command (checked with ps -ef | grep java):

/usr/java/8/x86_64/jdk/jre/bin/java -cp /opt/apache/spark/latest/conf/:/opt/apache/spark/latest/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/opt/apache/spark/latest/lib/datanucleus-api-jdo-3.2.6.jar:/opt/apache/spark/latest/lib/datanucleus-core-3.2.10.jar:/opt/apache/spark/latest/lib/datanucleus-rdbms-3.2.9.jar -Xms1024m -Xmx1024m org.apache.spark.executor.MesosExecutorBackend

The jar seems to be added to the classpath at runtime by the Spark executor (from the log file).
However, there is only one entry for the addition of the jar, so I am not sure why ES detects it twice:

15/10/27 14:36:44 INFO Executor: Fetching file:/opt/libs/java/elasticsearch-spark_2.10-2.2.0-m1.jar with timestamp 1445970853892
15/10/27 14:36:44 INFO Utils: Copying /opt/libs/java/elasticsearch-spark_2.10-2.2.0-m1.jar to /tmp/spark-2b1f3e3f-77e9-428d-b098-a180bde62f47/-20273273801445970853892_cache
15/10/27 14:36:44 INFO Utils: Copying /tmp/spark-2b1f3e3f-77e9-428d-b098-a180bde62f47/-20273273801445970853892_cache to /tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S4/frameworks/3f470bbd-c050-42bc-bd36-e38041ce16de-0002/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S4/runs/fb3e5850-33dc-4229-a187-be6662d17589/./elasticsearch-spark_2.10-2.2.0-m1.jar
15/10/27 14:36:44 INFO Executor: Adding file:/tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S4/frameworks/3f470bbd-c050-42bc-bd36-e38041ce16de-0002/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S4/runs/fb3e5850-33dc-4229-a187-be6662d17589/./elasticsearch-spark_2.10-2.2.0-m1.jar to class loader

@costin
Member

costin commented Oct 27, 2015

"however there is only one entry for addition of jar so not sure why is it detected twice by es."

Because there are two different entries in the classpath. The fact that they point to the same jar makes this even weirder.
That is why I asked for the runtime classpath of your task; the logs above don't give the full picture.
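
For readers who want to inspect the runtime classpath themselves, here is a minimal diagnostic sketch in plain Java. It is not ES-Hadoop code: the class name `ClasspathProbe` and the method `locationsOf` are hypothetical names chosen for illustration. It asks the class loader for every location from which a given class file is visible. Note that the `java.class.path` property only reflects the `-cp` value given at JVM start, so a jar that is added to a class loader later at runtime (as the executor log above suggests) will not appear there, while it can still show up in the resource lookup.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of ES-Hadoop or Spark.
public class ClasspathProbe {

    // Returns every location from which the class file of `type` is visible
    // to the class loader that loaded it.
    public static List<URL> locationsOf(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            // Classes from the JDK itself report a null loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Static classpath passed to the JVM at startup.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        // Locations actually visible at runtime for this class.
        for (URL url : locationsOf(ClasspathProbe.class)) {
            System.out.println(url);
        }
    }
}
```

To use it inside a task, one would call `locationsOf` with a class from the library under suspicion and log the result; more than one URL means the class is reachable through more than one classpath entry.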

Either way, I've improved the check to normalize the URLs, which should alleviate the problem in this case. Can you please try the dev builds and report back?
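
To illustrate what URL normalization means here, the following is a rough sketch in plain Java. It is not the code from the referenced commit: the class `JarUrlNormalizer`, its methods, and the shortened `/tmp/run/...` paths are hypothetical and stand in for the long Mesos sandbox paths reported above. The idea is that two entries differing only by a `./` path segment collapse to the same location once the `jar:` prefix is stripped and the remaining `file:` URI is normalized.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the actual ES-Hadoop implementation.
public class JarUrlNormalizer {

    // Reduces a "jar:file:/...jar!/entry" style URL to a normalized "file:" URI string.
    public static String normalize(String url) {
        String s = url;
        if (s.startsWith("jar:")) {
            // "jar:" URIs are opaque, so URI.normalize() would leave them unchanged.
            s = s.substring("jar:".length());
        }
        int bang = s.indexOf("!/");
        if (bang >= 0) {
            // Drop the entry inside the jar and keep only the jar location.
            s = s.substring(0, bang);
        }
        // Removes "." segments and resolves ".." segments in the path.
        return URI.create(s).normalize().toString();
    }

    // Collapses URLs that point to the same jar after normalization.
    public static Set<String> distinctJars(List<String> urls) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String url : urls) {
            distinct.add(normalize(url));
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Shortened, made-up paths mirroring the shape of the two reported entries.
        List<String> reported = Arrays.asList(
            "jar:file:/tmp/run/elasticsearch-spark_2.10-2.2.0-m1.jar",
            "jar:file:/tmp/run/./elasticsearch-spark_2.10-2.2.0-m1.jar");
        System.out.println(distinctJars(reported));
    }
}
```

A version check built on `distinctJars` would only complain when more than one distinct location remains after normalization.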

costin added a commit that referenced this issue Oct 28, 2015
costin added a commit that referenced this issue Oct 28, 2015
relates #579

(cherry picked from commit 12bfda7)
@costin
Member

costin commented Oct 28, 2015

Pushed in 2.x and master.

@costin costin closed this as completed Oct 28, 2015
@priyanknarvekar
Author

I tried the snapshot build, but it didn't work; it still failed with a similar error.

15/10/28 19:07:05 ERROR Version: Multiple ES-Hadoop versions detected in the classpath; please use only one
jar:file:/tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S5/frameworks/3f470bbd-c050-42bc-bd36-e38041ce16de-0004/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S5/runs/c15377a2-537f-4540-a409-b3dbebf67802/./elasticsearch-spark_2.10-2.2.0.BUILD-SNAPSHOT.jar
jar:file:/tmp/mesos/slaves/3014350f-cd05-44af-9c9c-3974bdeed86e-S5/frameworks/3f470bbd-c050-42bc-bd36-e38041ce16de-0004/executors/3014350f-cd05-44af-9c9c-3974bdeed86e-S5/runs/c15377a2-537f-4540-a409-b3dbebf67802/elasticsearch-spark_2.10-2.2.0.BUILD-SNAPSHOT.jar

@costin
Member

costin commented Oct 29, 2015

Likely you haven't used the latest snapshot (since the job was stuck). I've manually pushed another one if you want to try it out.

Thanks
