" -file option is deprecated, please use generic option -files instead." #79

Open
UMDTERPS opened this Issue Oct 31, 2013 · 1 comment

2 participants

@UMDTERPS

Hello! I am trying to run a job for our data team and we are getting errors using dumbo. We are using the latest version of Dumbo and Cloudera.

Command used to run the job:

"ls[benjamin@arya dedup]$ dumbo start jaccard.py -input products -output products-output13 -hadoop /usr/ -hadooplib /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/"

Stacktrace:

13/10/30 13:05:32 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
13/10/30 13:05:32 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
packageJobJar: [/home/benjamin/mapreduce/jobs/dedup/typedbytes.pyc, /home/benjamin/mapreduce/jobs/dedup/jaccard.py, /home/benjamin/mapreduce/jobs/dedup/dumbo/backends/common.pyc] [] /tmp/streamjob5478521893861821465.jar tmpDir=null
13/10/30 13:05:33 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/30 13:05:34 INFO mapred.FileInputFormat: Total input paths to process : 1
13/10/30 13:05:35 INFO mapred.JobClient: Running job: job_201310231818_0015
13/10/30 13:05:36 INFO mapred.JobClient: map 0% reduce 0%
13/10/30 13:05:47 INFO mapred.JobClient: Task Id : attempt_201310231818_0015_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1649)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1617)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1641)

Any help would be greatly appreciated!

@a4tunado

Seams like hadoop-streaming.jar is missing on your nodes.
Check if your environtment points to the correct HADDOP_
paths or try to add hadoop-streaming*.jar with -libjar option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment