
dfsioe YARN error while generating Input Data #80

Closed
manishagajbe opened this issue Mar 27, 2015 · 10 comments

@manishagajbe

I am getting the following error message while generating input data for the DFSIOE benchmark.

HiBench : 2.2 , yarn branch
JAVA : jdk1.7.0_45
Hadoop : 2.3.0
myHadoop : 2.1.0

15/03/26 16:50:40 INFO dfsioe.TestDFSIOEnh: maximum concurrent maps = 2
15/03/26 16:50:40 INFO dfsioe.TestDFSIOEnh: creating control file: 200 mega bytes, 256 files
java.io.IOException: Mkdirs failed to create /benchmarks/TestDFSIO-Enh/io_control
at org.apache.hadoop.fs.dfsioe.TestDFSIOEnh.createControlFile(TestDFSIOEnh.java:648)
at org.apache.hadoop.fs.dfsioe.TestDFSIOEnh.run(TestDFSIOEnh.java:598)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.dfsioe.TestDFSIOEnh.main(TestDFSIOEnh.java:624)

@adrian-wang
Contributor

Is your HDFS working, and do you have the right privileges?
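
For example, you could check by hand whether the benchmark path is writable (a minimal sketch, assuming the hdfs CLI is on your PATH and the default path from the log above):

hdfs dfs -mkdir -p /benchmarks/TestDFSIO-Enh/io_control   # try creating the control dir manually
hdfs dfs -ls /benchmarks/TestDFSIO-Enh                    # inspect ownership and permissions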

@manishagajbe
Author

I don't have a standard HDFS; I am using Lustre. I am able to run the TestDFSIO benchmark that comes with the Hadoop distribution.

@adrian-wang
Contributor

What do you mean by myHadoop and Hadoop?

@manishagajbe
Author

myHadoop : myHadoop provides a framework for launching Hadoop clusters within traditional high-performance compute clusters and supercomputers. It allows users to provision and deploy Hadoop clusters within the batch scheduling environment of such systems with minimal expertise required.

Hadoop : the open-source Apache Hadoop

I am able to run the simple wordcount, pi, and TestDFSIO examples too. Are there any parameters I can pass to DFSIOE using "-D" to force it to use a given base directory? For some reason, "-Dtest.build.data=..." doesn't seem to work when I run the example. I use this option with "TestDFSIO" and it works just fine. I even tried using an explicit path in the run script in the HiBench distribution.
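
For reference, the form that works for me with the stock TestDFSIO looks like this (illustrative; the tests jar path, output directory, and sizes are examples, not my exact command):

# Base-directory override honored by the stock TestDFSIO
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO \
    -Dtest.build.data=/lustre/scratch/benchmarks \
    -write -nrFiles 16 -fileSize 100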

I have similar problems with "Nutchindexing" and "Bayesian" where the default file to read from is "/usr/share/dict/linux.word" and I don;t have that file but I do have "/usr/share/dict/american". Is there a way to change the files to read from?

@adrian-wang
Contributor

TestDFSIOEnh may not support that argument.

For the dictionary file, you could make a soft link as a workaround.
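
Something along these lines (a sketch; the exact target filename depends on what the workload actually reads, per the comments above):

# Map the dictionary HiBench expects onto the one the system has
sudo ln -s /usr/share/dict/american /usr/share/dict/linux.words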

@hasonhai

Got the same issue. I'm using HDP 2.1 and Ambari 2.0.0.
DFSIOEnh fails when running this command to create the input data for the test:

${HADOOP_EXECUTABLE} jar ${DATATOOLS} org.apache.hadoop.fs.dfsioe.TestDFSIOEnh \
    -Dmapreduce.map.java.opts="-Dtest.build.data=${INPUT_HDFS} $MAP_JAVA_OPTS" \
    -Dmapreduce.reduce.java.opts="-Dtest.build.data=${INPUT_HDFS} $RED_JAVA_OPTS" \
    -write -skipAnalyze -nrFiles ${RD_NUM_OF_FILES} -fileSize ${RD_FILE_SIZE} -bufferSize 4096 

The job was accepted by YARN, but then it failed and the containers were killed.

15/04/22 08:18:54 INFO mapreduce.Job: Task Id : attempt_1429536760347_0016_m_000004_1, Status : FAILED
Error: java.io.IOException: Mkdirs failed to create /HiBench/benchmarks/TestDFSIO-Enh/io_data (exists=false, cwd=file:/hadoop/yarn/local/usercache/centos/appcache/application_1429536760347_0016/container_e02_1429536760347_0016_01_000017)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:851)
    at org.apache.hadoop.fs.dfsioe.TestDFSIOEnh$WriteMapperEnh.doIO(TestDFSIOEnh.java:209)
    at org.apache.hadoop.fs.dfsioe.IOMapperBase.map(IOMapperBase.java:123)
    at org.apache.hadoop.fs.dfsioe.IOMapperBase.map(IOMapperBase.java:41)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
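
Note that the trace shows cwd=file:/hadoop/yarn/local/..., i.e. the task resolved the path against the local filesystem rather than HDFS. One way to check the effective default filesystem (assuming the hdfs CLI is available on the node):

hdfs getconf -confKey fs.defaultFS
# Expect something like hdfs://<namenode>:8020; a file:/// value would explain the Mkdirs failure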

@manishagajbe
Author

I have created a soft link for the linux.words file; however, I now see the error
"ERROR: number of words should be greater than 0"
even though the linux.words file is not empty.
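
A couple of quick sanity checks on my side (illustrative commands, assuming the symlink workaround from above):

ls -l /usr/share/dict/linux.words   # confirm the link resolves to a real file
wc -l /usr/share/dict/linux.words   # confirm it actually contains word entries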

@viplav
Contributor

viplav commented Jul 4, 2015

We have been seeing the same issue as @hasonhai with Hortonworks HDP 2. I found the cause after debugging and have tested a fix. I can send a pull request if you are interested.

@manishagajbe
Author

Sure. Thanks.


@viplav
Contributor

viplav commented Jul 7, 2015

I have sent a pull request (#110) to fix the issue of using the wrong FS, which results in errors like "Mkdirs failed to create". Please take a look and merge so that people who run into this issue (e.g. on Hortonworks HDP 2) will benefit.
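
For anyone who wants to try the fix before it is merged, GitHub exposes pull-request heads as fetchable refs (illustrative commands, assuming a local git clone of HiBench):

git fetch origin pull/110/head:pr-110   # fetch the PR #110 branch into a local branch
git checkout pr-110                     # build and run HiBench from this branch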
