Fail to Write RDD into HDFS with Parquet Format #344
I was wondering if anybody could help me fix this issue:
I tried to write a function that stores an RDD in HDFS in Parquet format. The approach that worked with the previous Spark version was to mimic the "adamSave" function in adam-core/src/main/scala/org/bdgenomics/adam/rdd/ADAMRDDFunctions.scala.
However, it stopped working after I upgraded Spark from 0.9 to 1.0.1. The error message is "could not instanciate class parquet.avro.AvroWriteSupport set in job conf at parquet.write.support.class"
I then checked the newer version of ADAM and found that you had modified the function and removed the step that sets the write support class.
I modified my code to be consistent with your newer version, but it still does not work. The error is:
This may not be the most appropriate place to ask, but ADAM is the only example I can find that uses avro-parquet and runs on Spark 1.0.1. Could you help me fix it? I would really appreciate it.
Parquet updated their docs to indicate that the ParquetAvroOutputFormat should be used instead of the ParquetOutputFormat + AvroWriteSupport. If you've just removed the AvroWriteSupport, you'll also need to change the ParquetOutputFormat over to the ParquetAvroOutputFormat.
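For illustration, here is a minimal sketch of what that change might look like. It is not ADAM's actual code: `ParquetSave` and `saveAsParquet` are hypothetical names, and it assumes the pre-Apache `parquet.avro` package, where the class is spelled `AvroParquetOutputFormat`, together with Spark 1.0.x. The schema is set through the output format, and no write support class is configured by hand.

```scala
import scala.reflect.{ClassTag, classTag}

import org.apache.avro.Schema
import org.apache.avro.generic.IndexedRecord
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed on Spark 1.0.x)
import org.apache.spark.rdd.RDD
import parquet.avro.AvroParquetOutputFormat

// Hypothetical helper object, named here only for illustration.
object ParquetSave {

  def saveAsParquet[T <: IndexedRecord: ClassTag](
      rdd: RDD[T], path: String, schema: Schema): Unit = {
    // new Job() is deprecated on Hadoop 2 (Job.getInstance() replaces it),
    // but it compiles against both Hadoop 1 and Hadoop 2.
    val job = new Job()

    // Register the Avro schema via the output format; this takes the place
    // of setting parquet.write.support.class to AvroWriteSupport manually.
    AvroParquetOutputFormat.setSchema(job, schema)

    // Parquet output formats expect a Void key, so pair each record with null.
    rdd
      .map(record => (null.asInstanceOf[java.lang.Void], record))
      .saveAsNewAPIHadoopFile(
        path,
        classOf[java.lang.Void],
        classTag[T].runtimeClass,
        classOf[AvroParquetOutputFormat],
        job.getConfiguration)
  }
}
```

A call would then look like `ParquetSave.saveAsParquet(records, "hdfs:///some/path", recordSchema)`, where all three arguments are placeholders for your own values.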
Thanks for your reply. Did you mean AvroParquetOutputFormat? I modified my code accordingly, but now the error is a ClassNotFoundException:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:1 failed 4 times, most recent failure: Exception failure in TID 6 on host Gene1.CS.UCLA.EDU: java.lang.ClassNotFoundException: parquet.avro.AvroParquetOutputFormat
I am pretty sure that I imported "parquet.avro.AvroParquetOutputFormat". My pom.xml declares the parquet-avro dependencies, and I can also find this class in the jar. Any suggestions? Thanks very much.