ClassNotFoundException: EsHadoopNoNodesLeftException #585

@freakinruben

I've created a Spark job that reads from Elasticsearch and then writes back to it. Locally it runs fine, but when I submit it to a cluster I can't get it working. I've tried several things, described below.

The Spark cluster is a three-node test setup running spark-1.5.1-bin-without-hadoop.

I included the following packages in my uberjar:

  • org.elasticsearch/elasticsearch-spark_2.10: 2.2.0-m1 (also tried with 2.1.1)
  • org.elasticsearch/elasticsearch-hadoop-mr: 2.2.0-m1

The job then fails with

java.lang.ClassNotFoundException: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException

The ClassNotFoundException surprised me, and it also hides the actual reason for
the failure, so I tried to resolve the ClassNotFoundException first.

I downloaded the packages on all nodes, added them to the Spark classpath, restarted
Spark and resubmitted the job.

It then tells me

15/10/30 10:57:42 WARN scheduler.TaskSetManager:
Lost task 0.0 in stage 5.0 (TID 11, 10.20.0.97):
java.lang.Error: Multiple ES-Hadoop versions detected in the classpath; please use only one
....
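
A minimal sketch of how one might check where a class is visible on the classpath, assuming the error above is triggered by the same class file being found in more than one jar. The default class name `org.elasticsearch.hadoop.util.Version` is an assumption on my part (it is not shown in the trace above), and `ClasspathCheck` and `locate` are hypothetical names used only for illustration. It relies on nothing beyond the standard `ClassLoader.getResources` call.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class ClasspathCheck {

    // Returns every classpath location that provides the given class.
    static List<URL> locate(String className) throws IOException {
        // Class files are looked up as resources, e.g. "java/lang/String.class".
        String resource = className.replace('.', '/') + ".class";
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(cl.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; pass another fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.elasticsearch.hadoop.util.Version";
        List<URL> hits = locate(name);
        System.out.println(name + " found in " + hits.size() + " location(s)");
        for (URL url : hits) {
            System.out.println("  " + url);
        }
    }
}
```

Run with the same classpath the Spark executors use; more than one location listed for the same class would point at duplicate jars, and zero locations would match a ClassNotFoundException.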

That makes sense, since there are now multiple ES-Hadoop packages on the classpath.
So I built an uberjar without the two packages listed above, which fails with:

Exception in thread "main" java.lang.ExceptionInInitializerError
....
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.spark.rdd.api.java.JavaEsSpark

This also happens when I don't put the packages on the Spark classpath but instead
provide them when submitting the job:

  • submitted with --packages org.elasticsearch:elasticsearch-hadoop-mr:2.2.0-m1,org.elasticsearch:elasticsearch-spark_2.10:2.2.0-m1
  • submitted with --jars /home/hadoop/extra/elasticsearch-spark_2.10-2.2.0-m1.jar,/home/hadoop/extra/elasticsearch-hadoop-mr-2.2.0-m1.jar

So now I'm a bit confused about the right way to go about this. Do you have
any tips on how to run the job?
