
fix libhdfs problem in CDH #10

Closed
mli opened this issue May 13, 2015 · 11 comments
Comments

@mli
Member

mli commented May 13, 2015

CDH is slightly different from native Hadoop. We can either support it directly or write a document telling users what to change. In particular:

  • it may not ship hdfs.h
  • it may only provide a libhdfs.a

Check the version; in my case:

$ hadoop version
Hadoop 2.3.0-cdh5.1.0

The solution:

  1. Get hdfs.h.

For my version, I downloaded

http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.1.0-src.tar.gz

extracted it, and copied

hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h

to the folder include/.

  2. Change make/dmlc.mk to use the .a version:
DMLC_LDFLAGS+= $(HADOOP_HDFS_HOME)/lib/native/libhdfs.a -L$(LIBJVM) -ljvm -Wl,-rpath=$(LIBJVM)

It should compile now. Then we need to set the environment properly (I guess this is also necessary for native Hadoop). In my case, I added to .bashrc:

export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CONF=/etc/hadoop/conf
export CLASSPATH=${HADOOP_CONF}:$(find ${HADOOP_HOME} -name '*.jar' | tr '\n' ':')
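Note that the `-name` pattern should be quoted as `'*.jar'`; otherwise the shell may expand `*.jar` against the current directory before find sees it. A tiny self-contained illustration of the jar-joining trick (throwaway temp directory, not the real Hadoop layout):

```shell
# Demonstrate the colon-joined jar list on a throwaway directory tree.
dir=$(mktemp -d)
mkdir -p "$dir/lib"
touch "$dir/a.jar" "$dir/lib/b.jar" "$dir/notes.txt"
# Quote '*.jar' so find, not the shell, interprets the pattern.
jars=$(find "$dir" -name '*.jar' | tr '\n' ':')
echo "$jars"   # the two .jar paths, colon-separated; notes.txt is excluded
rm -rf "$dir"
```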

There is also a warning on 64-bit CentOS:

 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...

but it seems safe to ignore according to this.

@gzc

gzc commented Jan 14, 2016

I used the .so and jars provided by Hadoop 2.6.2 instead of libhdfs.a. Everything went well: it compiles successfully and can access CDH HDFS :)

@wenmin-wu

@mli I made the changes according to your suggestions, but it still fails with the following error:
/usr/bin/ld: /usr/lib/hadoop/lib/native/libhdfs.a(hdfs.c.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/lib/hadoop/lib/native/libhdfs.a: could not read symbols: Bad value

@cicadas

cicadas commented Mar 19, 2016

@wenmin-wu I also encountered this error. Have you fixed it?

@peperxuhui

I have the same question as @wenmin-wu and @cicadas. Has anyone fixed it?

@cicadas

cicadas commented Mar 24, 2016

@peperxuhui
I think there are two solutions.

  1. I use CentOS 6, and I installed the package hadoop-libhdfs-devel:
sudo yum install hadoop-libhdfs-devel

This package contains libhdfs.so, and you can compile xgboost against it. However, since it is a dynamic library, you have to install hadoop-libhdfs on every node in your Hadoop cluster; otherwise you may hit a runtime error.
  2. The other solution is to compile Hadoop yourself: download the CDH Hadoop source code and recompile it with the -fPIC flag.

@peperxuhui

I tried what @cicadas said, but unfortunately, after adding cdh.repo to yum.repos.d, hadoop-libhdfs-devel was too big to download (it always failed with DOWNLOAD ERROR in China), so I tried the second way. I didn't know how to add -fPIC in CMakeLists.txt, so I compiled hadoop-2.6.4-src without it and got libhdfs.so in lib/native. I then added

DMLC_LDFLAGS+=$(HADOOP_HDFS_HOME)/lib/native/libhdfs.so -L$(LIBJVM) -ljvm -Wl,-rpath=$(LIBJVM)

to xgboost's config.mk instead of libhdfs.a. After xgboost compiled successfully, I set

export LD_LIBRARY_PATH=....../xgboost_hdfs_lib:$LD_LIBRARY_PATH

where ....../xgboost_hdfs_lib holds only libhdfs.so and libhdfs.so.0.0.0. After that, xgboost could run on YARN and read files from CDH HDFS. It worked well!
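For reference, this dynamic-link variant boils down to two pieces; /path/to/xgboost_hdfs_lib below is a hypothetical placeholder for wherever you put the .so files:

```shell
# In xgboost's config.mk: link against libhdfs.so instead of libhdfs.a
# DMLC_LDFLAGS += $(HADOOP_HDFS_HOME)/lib/native/libhdfs.so -L$(LIBJVM) -ljvm -Wl,-rpath=$(LIBJVM)

# At run time, every worker must be able to locate libhdfs.so;
# /path/to/xgboost_hdfs_lib is a hypothetical directory holding
# libhdfs.so and libhdfs.so.0.0.0.
export LD_LIBRARY_PATH=/path/to/xgboost_hdfs_lib:$LD_LIBRARY_PATH
```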

@cicadas

cicadas commented Mar 28, 2016

@peperxuhui
First download the Hadoop source code: wget 'http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.4.8-src.tar.gz'
Then download protobuf 2.5 (CDH 5.4.8 can only be compiled with protobuf 2.5.0): wget 'http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.bz2'

Unzip the CDH tarball and run the Maven command in the CDH directory:
mvn package -Pdist,native -DskipTests -Dtar
All the compiled *.so files end up under hadoop-hdfs-project/hadoop-hdfs/target/native.
To get a libhdfs.a built with position-independent code, here are the steps:
- cd hadoop-hdfs-project/hadoop-hdfs/target/native
- edit CMakeCache.txt and add "-fPIC" to all CMAKE_…_FLAGS arguments
- make clean && make
- Then you can find libhdfs.a
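The CMakeCache.txt edit can be scripted with `sed`. The sketch below runs against a stand-in cache file so it is safe to try anywhere; in the real tree you would run the `sed` line inside hadoop-hdfs-project/hadoop-hdfs/target/native and then `make clean && make`:

```shell
# Work in a scratch directory with a stand-in CMakeCache.txt for illustration.
cd "$(mktemp -d)"
printf 'CMAKE_C_FLAGS:STRING=-O2\nCMAKE_BUILD_TYPE:STRING=Release\n' > CMakeCache.txt
# Append -fPIC to every CMAKE_..._FLAGS entry (keeps a .bak backup).
sed -i.bak 's/^\(CMAKE_[A-Z_]*_FLAGS:STRING=.*\)$/\1 -fPIC/' CMakeCache.txt
# CMAKE_C_FLAGS now ends with -fPIC; CMAKE_BUILD_TYPE is untouched.
cat CMakeCache.txt
```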

@superbobry
Contributor

FYI here is a script to build libhdfs from CDH5 on both OSX and Linux.

@tqchen tqchen closed this as completed Nov 29, 2017
@antant-shenzhen-2013

antant-shenzhen-2013 commented Jan 25, 2018

@superbobry I followed your script to compile the .so files, but when I run dmlc-submit:

/home/luwl/github/xgboost/dmlc-core/tracker/dmlc-submit
--cluster=yarn --num-workers=4 --worker-cores=4 --worker-memory=16G --queue=root.ad
-f /usr/lib/hadoop/lib/native/libhdfs.so.0.0.0
/home/luwl/github/xgboost/xgboost /home/luwl/github/xgboost/demo/distributed-training/mushroom.aws.conf.small
nthread=4
data=hdfs://nameservice1/user/luwl/dsp/data/xgboost/small-train
eval[test]=hdfs://nameservice1/user/luwl/dsp/data/xgboost/small-test
model_dir=hdfs://nameservice1/user/luwl/dsp/model/ctr-xgboost-model-small

many containers show the following errors:

Container: container_e06_1510835833211_1431884_01_000009 on nfjd-hadoop02-node355.jpushoa.com_8041

LogType:stderr
Log Upload Time:Fri Jan 26 14:41:19 +0800 2018
LogLength:0
Log Contents:

LogType:stdout
Log Upload Time:Fri Jan 26 14:41:19 +0800 2018
LogLength:32
Log Contents:
Usage: launcher.py your command

Any suggestions?

@wenmin-wu

@antant-shenzhen-2013 I have posted a blog on CSDN about how to deploy xgboost on YARN; you can follow it step by step. The link is http://blog.csdn.net/u010306433/article/details/51403894

@antant-shenzhen-2013

antant-shenzhen-2013 commented Jan 29, 2018

@wenmin-wu thx, I will try it.
