java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.load #153

Closed
ghost opened this Issue Aug 8, 2016 · 2 comments


ghost commented Aug 8, 2016

When executing line 658, which reads an Avro file from S3:

val df1  = sqlContext.read.avro(avroFile) 

with build.sbt

scalaVersion := "2.11.7"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11"  % "2.0.0",
  "org.apache.spark" % "spark-sql_2.11"   % "2.0.0",
  "com.databricks"   %% "spark-avro"      % "3.0.0",
  "com.amazonaws"    % "aws-java-sdk-s3"  % "1.11.3"
)

Getting the following error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.load(Ljava/lang/String;)Lorg/apache/spark/sql/DataFrame;
    at com.databricks.spark.avro.package$AvroDataFrameReader$$anonfun$avro$2.apply(package.scala:45)
    at com.databricks.spark.avro.package$AvroDataFrameReader$$anonfun$avro$2.apply(package.scala:45)
    at ExportData$.data_transformation_for_index(ExportData.scala:658)
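
For reference, a minimal self-contained version of the failing read, assuming spark-avro 3.x's implicit import and Spark 2.0's SparkSession entry point (the S3 path below is a placeholder; the real path is not shown in the report), looks roughly like this:

    import org.apache.spark.sql.SparkSession
    import com.databricks.spark.avro._   // brings the .avro(...) extension into scope

    val spark = SparkSession.builder().appName("ExportData").getOrCreate()
    val sqlContext = spark.sqlContext

    // Placeholder S3 path; any Avro file reachable from the cluster would do.
    val avroFile = "s3a://some-bucket/some/path/part-00000.avro"

    val df1 = sqlContext.read.avro(avroFile)   // the call that throws NoSuchMethodError
    df1.printSchema()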

JoshRosen (Contributor) commented Aug 8, 2016

This looks like a duplicate of #150, but, helpfully, your issue includes the actual version numbers, so I'm going to mark that other issue as a duplicate of this one and will continue investigation here.

I'm a bit puzzled that the NoSuchMethodError is referencing a method which returns org/apache/spark/sql/DataFrame, given that DataFrame became an alias for Dataset[Row] in Spark 2.0 and this library is compiled against Spark 2.0 and is presumably running against that version.

When I look at package$AvroDataFrameReader$$anonfun$avro$2.class using javap I see it calling the Dataset-returning method:

public final org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> apply(java.lang.String);
    descriptor: (Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0
         1: getfield      #23                 // Field eta$0$2$1:Lorg/apache/spark/sql/DataFrameReader;
         4: aload_1
         5: invokevirtual #28                 // Method org/apache/spark/sql/DataFrameReader.load:(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
         8: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       9     0  this   Lcom/databricks/spark/avro/package$AvroDataFrameReader$$anonfun$avro$2;
            0       9     1  path   Ljava/lang/String;
      LineNumberTable:
        line 34: 0
    Signature: #55                          // (Ljava/lang/String;)Lorg/apache/spark/sql/Dataset<Lorg/apache/spark/sql/Row;>;

Given all of this, are you sure that you're using version 3.0.0 of spark-avro and don't somehow have an older conflicting version on your classpath?
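
One generic way to check (this is a debugging sketch, not something from the report) is to print the jar that each relevant class was actually loaded from at runtime; with --packages the locations usually point at jars in the local Ivy cache:

    // Where did the spark-avro classes come from? If this prints a spark-avro 2.x jar,
    // an older copy is shadowing 3.0.0 on the runtime classpath.
    val avroJar = Class.forName("com.databricks.spark.avro.DefaultSource")
      .getProtectionDomain.getCodeSource.getLocation
    println(s"spark-avro loaded from: $avroJar")

    // The same check for the Spark SQL classes themselves.
    val sqlJar = classOf[org.apache.spark.sql.DataFrameReader]
      .getProtectionDomain.getCodeSource.getLocation
    println(s"spark-sql loaded from: $sqlJar")

Inspecting the resolved dependency tree in sbt (e.g., with the sbt-dependency-graph plugin) can likewise reveal an older spark-avro being pulled in transitively.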

ghost commented Aug 10, 2016

I found the issue: I was passing the wrong version of com.databricks:spark-avro to spark-submit.

WRONG:

spark-submit  --packages com.databricks:spark-avro_2.10:2.0.1

RIGHT:

spark-submit  --packages com.databricks:spark-avro_2.11:3.0.0 
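
In other words, the coordinates passed to --packages have to match what the job was built against: the _2.11 suffix matches scalaVersion 2.11.x in build.sbt, and 3.0.0 is the spark-avro line built for Spark 2.0. A full invocation would look roughly like the following (the --class name is inferred from the stack trace and the jar path is a guess at a typical sbt layout):

spark-submit \
  --class ExportData \
  --packages com.databricks:spark-avro_2.11:3.0.0 \
  target/scala-2.11/exportdata_2.11-1.0.jar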

ghost closed this Aug 10, 2016

This issue was closed.
