
How long will it take to run "prepare.sh"? #97

Closed
TYoung1221 opened this issue May 27, 2015 · 8 comments

@TYoung1221

I am running HiBench to verify the performance of Spark SQL, and "prepare.sh" has been running for more than 3 hours without finishing. This is my console output:

yang@xxxxx:~/HiBench/bin$ ./run-all.sh
Prepare join ...
Exec script: /home/yang/HiBench/workloads/join/prepare/prepare.sh
Parsing conf: /home/yang/HiBench/conf/00-default-properties.conf
Parsing conf: /home/yang/HiBench/conf/10-data-scale-profile.conf
Parsing conf: /home/yang/HiBench/conf/99-user_defined_properties.conf
Parsing conf: /home/yang/HiBench/workloads/join/conf/00-join-default.conf
Parsing conf: /home/yang/HiBench/workloads/join/conf/10-join-userdefine.conf
Probing spark verison, may last long at first time...
start HadoopPrepareJoin bench
hdfs rm -r: /home/yang/hadoop/bin/hadoop --config /home/yang/hadoop/etc/hadoop fs -rm -r -skipTrash hdfs://andromeda:9000/HiBench/Join/Input
15/05/27 16:32:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
rm: `hdfs://xxxxx:9000/HiBench/Join/Input': No such file or directory
Pages:120000, USERVISITS:1000000
Submit MapReduce Job: /home/yang/hadoop/bin/hadoop --config /home/yang/hadoop/etc/hadoop jar /home/yang/HiBench/src/autogen/target/autogen-4.0-SNAPSHOT-jar-with-dependencies.jar HiBench.DataGen -t hive -b hdfs://xxxxx:9000/HiBench/Join -n Input -m 12 -r 6 -p 120000 -v 1000000 -o sequence
15/05/27 16:32:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/05/27 16:32:49 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/05/27 16:32:50 INFO mapreduce.Job: Running job: job_1432692021703_0006

And on my Hadoop web UI, the log shows the following:
2015-05-27 16:42:12,268 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-05-27 16:42:12,269 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).

So, is there anything wrong with my hdfs configuration?

@lvsoft
Contributor

lvsoft commented May 27, 2015

According to your log, you are running HiBench with the default data scale profile, which should finish in several minutes for each workload, even on a single node.

Can you paste your report/join/prepare/bench.log for details?

@TYoung1221
Author

Thank you for your reply!

This is the report/join/prepare/bench.log you mentioned:

15/05/27 16:32:45 INFO HiBench.HiveData: Generating hive data files...
15/05/27 16:32:45 INFO HiBench.HiveData: Initializing hive date generator...
15/05/27 16:32:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
curIndex: 349, total: 350
15/05/27 16:32:48 INFO HiBench.Dummy: Creating dummy file hdfs://andromeda:9000/HiBench/Join/temp/dummy with 12 slots...
15/05/27 16:32:48 INFO HiBench.HiveData: Creating table rankings...
15/05/27 16:32:48 INFO Configuration.deprecation: mapred.reduce.slowstart.completed.maps is deprecated. Instead, use mapreduce.job.reduce.slowstart.completedmaps
15/05/27 16:32:48 INFO HiBench.HiveData: Running Job: Create rankings
15/05/27 16:32:48 INFO HiBench.HiveData: Pages file hdfs://andromeda:9000/HiBench/Join/temp/dummy as input
15/05/27 16:32:48 INFO HiBench.HiveData: Rankings file hdfs://andromeda:9000/HiBench/Join/Input/rankings as output
15/05/27 16:32:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/27 16:32:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/27 16:32:49 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/05/27 16:32:49 INFO mapred.FileInputFormat: Total input paths to process : 1
15/05/27 16:32:50 INFO mapreduce.JobSubmitter: number of splits:12
15/05/27 16:32:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1432692021703_0006
15/05/27 16:32:50 INFO impl.YarnClientImpl: Submitted application application_1432692021703_0006
15/05/27 16:32:50 INFO mapreduce.Job: The url to track the job: http://andromeda.jp:8088/proxy/application_1432692021703_0006/
15/05/27 16:32:50 INFO mapreduce.Job: Running job: job_1432692021703_0006

@TYoung1221
Author

And this is the information from the tracking URL for job 0006:

///Application Overview
User: yang
Name: Create rankings
Application Type: MAPREDUCE
Application Tags:
State: ACCEPTED
FinalStatus: UNDEFINED
Started: 27-May-2015 16:32:50
Elapsed: 1hrs, 8mins, 11sec
Tracking URL: UNASSIGNED
Diagnostics:

///Application Metrics
Total Resource Preempted: <memory:0, vCores:0>
Total Number of Non-AM Containers Preempted: 0
Total Number of AM Containers Preempted: 0
Resource Preempted from Current Attempt: <memory:0, vCores:0>
Number of Non-AM Containers Preempted from Current Attempt: 0
Aggregate Resource Allocation: 0 MB-seconds, 0 vcore-seconds

@lvsoft
Contributor

lvsoft commented May 27, 2015

It seems like your YARN cluster has no available resources to run the job: it has been in the ACCEPTED state for hours, waiting to run. Could you please check the Memory Total under Cluster Metrics in the YARN web UI?
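The same cluster metrics can also be checked from the command line. A sketch, assuming the ResourceManager web UI listens on the default port 8088 on localhost:

```shell
# Query aggregate cluster resources from the ResourceManager REST API.
# In the JSON response, 'totalMB' and 'totalVirtualCores' are zero when
# no NodeManager has registered with the ResourceManager.
curl -s --max-time 5 http://localhost:8088/ws/v1/cluster/metrics \
  || echo "ResourceManager not reachable on localhost:8088"
```

`yarn node -list -all` is another way to see which NodeManagers (if any) have registered.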

@TYoung1221
Author

Thank you for your reply. Yes, the cluster's Memory Total is zero. What should I do?

@TYoung1221
Author

And my yarn-site.xml looks like this:

    <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>
    <property>
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
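Note that this yarn-site.xml only configures the shuffle service; each NodeManager's advertised capacity can also be set explicitly if it comes up with no resources. A hedged example (the property names are standard YARN settings; the values are placeholders to adapt to the machine):

```xml
<!-- Example only: advertise 8 GB of memory and 4 vcores per NodeManager -->
<property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>8192</value>
</property>
<property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
</property>
```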

@lvsoft
Contributor

lvsoft commented May 27, 2015

Have you started at least one YARN NodeManager? You need to run sbin/start-yarn.sh from your Hadoop directory to start the ResourceManager and the NodeManagers on all your workers. It seems only the ResourceManager is running.
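The check above can be sketched as follows (assuming `HADOOP_HOME` points at the Hadoop installation; the start command is shown commented out since it must run on the cluster's master node):

```shell
# Start the ResourceManager and the NodeManagers listed in the slaves file:
#   $HADOOP_HOME/sbin/start-yarn.sh
# Then verify the daemons: 'jps' should list ResourceManager on the master
# and NodeManager on every worker.
if command -v jps >/dev/null 2>&1; then
  jps
else
  echo "jps not found; check the running daemons in the YARN web UI instead"
fi
```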

@lvsoft
Contributor

lvsoft commented Jun 11, 2015

Seems like a trivial issue. I'll close it.

@lvsoft lvsoft closed this as completed Jun 11, 2015