Hadoop's uber mode causes job failure #280

Closed
AndreasHoermandinger opened this issue Sep 25, 2014 · 11 comments

Comments

@AndreasHoermandinger

When mapreduce.job.ubertask.enable is set to true, applications that are executed in uber mode fail.
There seems to be a problem with the task id:

2014-09-25 17:01:27,976 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.IllegalArgumentException: TaskId string : attempt_1411539203871_0014_m_000000_0 is not properly formed
        at org.apache.hadoop.mapreduce.TaskID.forName(TaskID.java:233)
        at org.apache.hadoop.mapred.TaskID.forName(TaskID.java:195)
        at org.elasticsearch.hadoop.mr.HeartBeat.<init>(HeartBeat.java:51)
        at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.init(EsInputFormat.java:226)
        at org.elasticsearch.hadoop.mr.EsInputFormat$WritableShardRecordReader.init(EsInputFormat.java:367)
        at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.initialize(EsInputFormat.java:191)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:525)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
        at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
        at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
        at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
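
For context, the string in the exception has the shape of a task *attempt* ID rather than a task ID: it starts with `attempt` and carries one extra trailing field. The sketch below, in plain Java, only illustrates that difference in shape. The class name `TaskIdShape`, its helper methods, and the field-counting check are illustrative assumptions, not Hadoop's actual parsing code.

```java
public class TaskIdShape {
    // Assumed layouts, following the usual Hadoop naming:
    //   task ID:         task_<clusterTs>_<jobSeq>_<m|r>_<taskSeq>
    //   task attempt ID: attempt_<clusterTs>_<jobSeq>_<m|r>_<taskSeq>_<attemptSeq>

    // True when the string has the "task" prefix and exactly five fields.
    static boolean looksLikeTaskId(String s) {
        String[] parts = s.split("_");
        return parts.length == 5 && parts[0].equals("task");
    }

    // True when the string has the "attempt" prefix and exactly six fields.
    static boolean looksLikeAttemptId(String s) {
        String[] parts = s.split("_");
        return parts.length == 6 && parts[0].equals("attempt");
    }

    public static void main(String[] args) {
        // The value reported in the stack trace above.
        String fromLog = "attempt_1411539203871_0014_m_000000_0";
        System.out.println("task id?    " + looksLikeTaskId(fromLog));
        System.out.println("attempt id? " + looksLikeAttemptId(fromLog));
    }
}
```

Under these assumptions the logged value passes the attempt check and fails the task check, which is consistent with a task-ID parser rejecting it.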

Here are the full logs of the crashes in MapReduce and Pig:
https://gist.github.com/AndreasHoermandinger/a89ec7df4334fa2c98e0
As test data, I used the shakespeare.json file provided in the 10-minute Kibana walkthrough: http://www.elasticsearch.org/guide/en/kibana/current/snippets/shakespeare.json

@costin
Member

costin commented Sep 25, 2014

@AndreasHoermandinger Thanks - what version of es-hadoop and Pig are you using? Hadoop is 2.5.1, correct?

@AndreasHoermandinger
Author

Yes, correct
Pig is 0.13.0
es-hadoop is 2.0.1
PS: I forgot to mention that I only tested with Pig and MapReduce.

@costin
Member

costin commented Sep 25, 2014

@AndreasHoermandinger Hi, I've pushed a dev build with a potential fix to Maven (check out the latest 2.0.2.BUILD-SNAPSHOT). Can you please try it out and report back?

@AndreasHoermandinger
Author

@costin Yes, it works. Thank you!

costin added a commit that referenced this issue Sep 25, 2014
Hadoop 2.5.x introduced a bug where the task attempt is used for the
task id. To work around this, the code searches first for the task
attempt and only then falls back to the task id.

relates #280

(cherry picked from commit a2084e2)
costin added a commit that referenced this issue Sep 25, 2014
Hadoop 2.5.x introduced a bug where the task attempt is used for the
task id. To work around this, the code searches first for the task
attempt and only then falls back to the task id.

relates #280
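
The lookup order described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the es-hadoop implementation: the class `TaskIdFallback`, the method `resolveTaskId`, the use of a plain `Map` in place of a Hadoop configuration, and the two property names are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskIdFallback {
    // Assumed shape: attempt_<clusterTs>_<jobSeq>_<m|r>_<taskSeq>_<attemptSeq>
    private static final Pattern ATTEMPT =
            Pattern.compile("attempt_(\\d+_\\d+_[mr]_\\d+)_\\d+");
    // Assumed shape: task_<clusterTs>_<jobSeq>_<m|r>_<taskSeq>
    private static final Pattern TASK =
            Pattern.compile("task_\\d+_\\d+_[mr]_\\d+");

    // Looks for the attempt ID first and derives the task ID from it;
    // only if that is missing or malformed does it consult the task ID property.
    static String resolveTaskId(Map<String, String> props) {
        String attempt = props.get("mapreduce.task.attempt.id"); // assumed key
        if (attempt != null) {
            Matcher m = ATTEMPT.matcher(attempt);
            if (m.matches()) {
                // Drop the trailing attempt number and swap the prefix.
                return "task_" + m.group(1);
            }
        }
        String task = props.get("mapreduce.task.id"); // assumed key
        if (task != null && TASK.matcher(task).matches()) {
            return task;
        }
        throw new IllegalArgumentException("No usable task id found");
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // Simulates the reported situation: both keys hold an attempt string.
        props.put("mapreduce.task.attempt.id", "attempt_1411539203871_0014_m_000000_0");
        props.put("mapreduce.task.id", "attempt_1411539203871_0014_m_000000_0");
        System.out.println(resolveTaskId(props));
    }
}
```

Checking the attempt first means a malformed value under the task ID key is never parsed as long as a well-formed attempt ID is available.
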
@AndreasHoermandinger
Author

@costin I also tried writing with the fixed build, and I get the same error again:

2014-09-26 15:46:53,738 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalArgumentException: TaskId string : attempt_1411539203871_0035_m_000000_0 is not properly formed

@costin
Member

costin commented Sep 26, 2014

@AndreasHoermandinger Can you post the entire stack trace, please? You mentioned the fix worked; was that with reading or with writing?

@AndreasHoermandinger
Author

@costin Sorry, I forgot to mention: it worked when reading from Elasticsearch (LOAD .... USING EsStorage), but it now crashes when writing (STORE .... USING EsStorage):

https://gist.github.com/AndreasHoermandinger/5a55f356a1c8480fb6e6

@costin
Member

costin commented Sep 26, 2014

@AndreasHoermandinger Looks like there was a code path that wasn't addressed by the previous fix. I've remedied this and pushed another 2.0.2 build. Can you please try it out and report back?
Thanks!

@AndreasHoermandinger
Author

@costin Now writing works too. Thank you for the fast fixes!

@costin
Member

costin commented Sep 26, 2014

cheers!

costin added a commit that referenced this issue Sep 26, 2014
costin added a commit that referenced this issue Sep 26, 2014
Relates #280

(cherry picked from commit 145ad76)
@costin
Member

costin commented Sep 26, 2014

Marking as closed.
