
Pydoop submit script fails #366

Closed
orwa-te opened this issue Feb 8, 2020 · 5 comments

Comments

@orwa-te

orwa-te commented Feb 8, 2020

I have tried to run the word count example linked here https://crs4.github.io/pydoop/tutorial/pydoop_script.html using pydoop script script.py hdfs_input hdfs_output, and it worked fine for me: I could see the results in HDFS. However, when I try to run the full-featured version of the program with "pydoop submit" as described here https://crs4.github.io/pydoop/tutorial/mapred_api.html#api-tutorial, using pydoop submit --upload-file-to-cache wc.py wc input output, it runs for a very long time without producing any response or result. The MapReduce job looks like it is stuck, and I always get something like this in the terminal:

2020-02-08 18:21:05,580 INFO mapreduce.Job: Job job_1581178676163_0001 running in uber mode : false
2020-02-08 18:21:05,583 INFO mapreduce.Job: map 0% reduce 0%
2020-02-08 18:31:34,480 INFO mapreduce.Job: Task Id : attempt_1581178676163_0001_m_000000_0, Status : FAILED
AttemptID:attempt_1581178676163_0001_m_000000_0 Timed out after 600 secs
^C[hdadmin@datanode3 pydoop]$

The MapReduce job fails when using "pydoop submit"!
What could cause the problem, and how can I solve it?
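
For context, the wc.py I am submitting is essentially the tutorial's MapReduce API word count; a minimal sketch of it (assuming the tutorial's Mapper/Reducer structure and the pydoop.mapreduce.pipes entry point, so the actual file may differ slightly) looks roughly like this:

import pydoop.mapreduce.api as api
import pydoop.mapreduce.pipes as pipes

class Mapper(api.Mapper):
    def map(self, context):
        # emit a (word, 1) pair for every token in the input line
        for word in context.value.split():
            context.emit(word, 1)

class Reducer(api.Reducer):
    def reduce(self, context):
        # sum all counts collected for this word
        context.emit(context.key, sum(context.values))

def __main__():
    # entry point invoked by the Hadoop pipes framework
    pipes.run_task(pipes.Factory(Mapper, reducer_class=Reducer))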

@orwa-te orwa-te closed this as completed Feb 9, 2020
@orwa-te orwa-te reopened this Feb 9, 2020
@simleo
Member

simleo commented Feb 10, 2020

To see what went wrong you have to check the individual task logs. You can access them via the Hadoop web UI.
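
If the web UI is hard to reach, the aggregated logs for the whole application can usually also be pulled from the command line (assuming YARN log aggregation is enabled on your setup), for example:

yarn logs -applicationId application_1581178676163_0001

That dumps the stdout, stderr and syslog of every container, including the failed map attempt.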

@orwa-te
Author

orwa-te commented Feb 10, 2020

After trying multiple times, the console gives me these messages:

2020-02-10 23:22:03,628 INFO mapreduce.Job: map 0% reduce 0%
2020-02-10 23:32:34,268 INFO mapreduce.Job: Task Id : attempt_1581369620079_0001_m_000000_0, Status : FAILED
AttemptID:attempt_1581369620079_0001_m_000000_0 Timed out after 600 secs
[2020-02-10 23:32:33.784]Sent signal OUTPUT_THREAD_DUMP (SIGQUIT) to pid 24623 as user hdadmin for container container_1581369620079_0001_01_000002, result=success
[2020-02-10 23:32:33.792]Container killed by the ApplicationMaster.
[2020-02-10 23:32:33.811]Container killed on request. Exit code is 143
[2020-02-10 23:32:33.812]Container exited with a non-zero exit code 143.

I opened "sys logs" from web UI and could not find any error or even warning messages, but "stderr" data is like this:

Feb 10, 2020 11:22:01 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Feb 10, 2020 11:22:01 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Feb 10, 2020 11:22:01 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Feb 10, 2020 11:22:01 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
Feb 10, 2020 11:22:01 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Feb 10, 2020 11:22:01 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Feb 10, 2020 11:22:02 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"

I searched for the message "Container exited with a non-zero exit code 143" and found that it may be related to the garbage collector or other memory allocation issues. If that is the case, why does the default pydoop script version run with no problems?

@simleo
Member

simleo commented Feb 11, 2020

I see. Try tweaking the memory settings and good luck :)
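
As a rough sketch of what tweaking could look like, assuming pydoop submit accepts Hadoop-style -D property overrides (check pydoop submit --help on your version) and using purely illustrative values:

pydoop submit \
    -D mapreduce.map.memory.mb=2048 \
    -D mapreduce.map.java.opts=-Xmx1536m \
    -D mapreduce.task.timeout=1200000 \
    --upload-file-to-cache wc.py wc input output

The same properties can also be set cluster-wide in mapred-site.xml.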

@simleo simleo closed this as completed Feb 11, 2020
@orwa-te
Author

orwa-te commented Feb 12, 2020

I am running Hadoop on a single-machine VM with 10 GB of RAM and 2 processing cores, running CentOS 7.
What is wrong with the following configuration settings? Here are the properties and their values (memory values in MB):

yarn-site.xml

yarn.scheduler.minimum-allocation-mb -> 512
yarn.scheduler.minimum-allocation-vcores -> 1
yarn.scheduler.maximum-allocation-vcores -> 2
yarn.nodemanager.resource.memory-mb -> 8192
yarn.nodemanager.resource.cpu-vcores -> 2

mapred-site.xml

mapreduce.map.memory.mb -> 3072
mapreduce.reduce.memory.mb -> 3072
mapreduce.map.java.opts -> Xmx2048m
mapreduce.reduce.java.opts ->  Xmx2048m
yarn.nodemanager.vmem-pmem-ratio -> 2.1

@simleo
Member

simleo commented Feb 13, 2020

That depends on many factors, including the Hadoop version you're running. You can try asking on the Hadoop mailing lists. In the Docker images we use for testing, the configuration is rather minimal. If you want, you can check it out here.
