Unable to run pig job #68

Closed

mohitanchlia opened this issue Aug 14, 2013 · 13 comments

@mohitanchlia

I downloaded the latest build from the repo and ran a simple job like this:

grunt> A = LOAD 'twitter/tweet/_search?q=kim' USING org.elasticsearch.hadoop.pig.ESStorage();
grunt> DUMP A;

And got this error:

Pig Stack Trace

ERROR 2117: Unexpected error when launching map reduce job.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
at org.apache.pig.PigServer.openIterator(PigServer.java:838)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:538)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias A
at org.apache.pig.PigServer.storeEx(PigServer.java:937)
at org.apache.pig.PigServer.store(PigServer.java:900)
at org.apache.pig.PigServer.openIterator(PigServer.java:813)
... 12 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:352)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1266)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1251)
at org.apache.pig.PigServer.storeEx(PigServer.java:933)
... 14 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.elasticsearch.hadoop.mr.ESInputFormat.getSplits(ESInputFormat.java:335)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
at java.lang.Thread.run(Thread.java:662)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:676)
    at java.lang.Thread.dispatchUncaughtException(Thread.java:1874)

Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.elasticsearch.hadoop.mr.ESInputFormat.getSplits(ESInputFormat.java:335)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
at java.lang.Thread.run(Thread.java:662)

at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

@costin
Member

costin commented Aug 14, 2013

Looks like you're running YARN/Hadoop 2.0. We don't provide any official artifacts for it yet, but you can download the sources, compile them against the YARN jars, and you should be good to go.

@mohitanchlia
Author

Thanks! Is it just a matter of changing the pom files, or will I have to change the interfaces as well?

@costin
Member

costin commented Aug 14, 2013

Just changing the artifact is enough for compiling (note there's no pom since we're using Gradle, but it should be straightforward to make the change - we have a cdh branch for this as well).
There's no need to change any interfaces - the breaking change is in Hadoop, not in the clients (JobContext became an interface in Hadoop 2, hence the IncompatibleClassChangeError), and a simple recompilation fixes everything.

@mohitanchlia
Author

I just started going through the Gradle manual, but it would be helpful if you could tell me which file I need to change to add the dependency. Is it build.gradle?

Thanks

@costin
Member

costin commented Aug 15, 2013

Try using the yarn profile from the command line:

gradlew -P distro=hadoopYarn

costin added a commit that referenced this issue Aug 15, 2013
@costin
Member

costin commented Aug 15, 2013

I've changed the build system so a yarn-based binary is published as well - add the yarn classifier and you should be good to go.
Note that the pom still refers to the 1.x Hadoop artifacts (that's because there's only one pom shared across the entire project).

I've already pushed a build so try it out and let me know how it goes.

Thanks,

@mohitanchlia
Author

Thanks a lot for the quick response. I'll get the most recent snapshot build and try it out.

@mohitanchlia
Author

I took the jar elasticsearch-hadoop-1.3.0.BUILD-20130815.110726-110.jar, registered it with Pig, and tried running the job. I still get:

Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.elasticsearch.hadoop.mr.ESInputFormat.getSplits(ESInputFormat.java:335)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
at java.lang.Thread.run(Thread.java:662)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:676)
    at java.lang.Thread.dispatchUncaughtException(Thread.java:1874)

Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.elasticsearch.hadoop.mr.ESInputFormat.getSplits(ESInputFormat.java:335)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
"pig_1376586430066.log" 71L, 5234C

@costin
Member

costin commented Aug 15, 2013

That's because you are not using the 'yarn' jar but the normal one. You need this jar: https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/1.3.0.BUILD-SNAPSHOT/elasticsearch-hadoop-1.3.0.BUILD-20130815.110726-110-yarn.jar - notice the -yarn suffix.

See the readme page on how to do that with Maven.
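For reference, pulling in the YARN variant via Maven usually comes down to adding the yarn classifier to the dependency - a minimal sketch based on the snapshot coordinates above (the readme is the authoritative reference, so the exact snippet there may differ):

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch-hadoop</artifactId>
        <version>1.3.0.BUILD-SNAPSHOT</version>
        <classifier>yarn</classifier>
    </dependency>

You would also need the Sonatype snapshots repository (oss.sonatype.org) configured, since this is a snapshot build and is only published there.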

@mohitanchlia
Author

Thanks, my mistake. I tried the new jar and it worked. There was an ES error, but I think that is ES related. I have some questions on how reads are done in parallel, but I'll take a look at the split code. Thanks.

@costin
Member

costin commented Aug 15, 2013

It's great to hear that it's working.
As for the reads - you can check the docs or the mailing list. In short, we do the splitting based on shards - the upcoming reference docs go into detail about that.
You can read them under the doc/ folder until we publish them on the website.
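As a rough illustration (hypothetical index, not from this thread): the number of input splits for a read follows the number of primary shards of the target index, so an index created with five shards would typically be read by five parallel map tasks:

    curl -XPUT 'http://localhost:9200/twitter/' -d '{
      "settings" : { "index" : { "number_of_shards" : 5 } }
    }'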

@costin costin added the bug label Mar 6, 2014
@Foolius

Foolius commented Apr 3, 2014

I think I have a similar problem, although I'm not getting such a big stack trace.
I only get this error message: 2014-04-03 14:36:01,762 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve EsStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

with both the yarn and non-yarn versions.

@costin
Member

costin commented Apr 3, 2014

@Foolius please open a new issue - this one is 8 months old and the code base has changed significantly.
Also, please include info about your environment and, most importantly, your script; otherwise it's just a guessing game.
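For reference, ERROR 1070 usually means Pig cannot see the storage class on its classpath (or the class name is not fully qualified). The usual wiring looks roughly like the sketch below - the jar path is just a placeholder, and newer builds use EsStorage rather than the older ESStorage:

    REGISTER /path/to/elasticsearch-hadoop-<version>.jar;       -- placeholder path, point at your actual jar
    DEFINE EsStorage org.elasticsearch.hadoop.pig.EsStorage();  -- fully qualified class name
    A = LOAD 'twitter/tweet/_search?q=kim' USING EsStorage();
    DUMP A;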
