This is a small library (and some test programs) whose purpose is to help use Hadoop's HDFS C API from C++ programs.
Most classes follow the RAII idiom and throw exceptions when errors are detected.
The code is hosted on http://github.com/tmacam/libhdfscpp and is distributed under the Apache 2.0 license.
zlib (tested with Ubuntu package version 1:1.2.3.3.dfsg-7ubuntu1)
On Debian and Ubuntu, installing the zlib1g-dev package should suffice.
Hadoop HDFS (version == 0.18.3)
AFAIK, we are only compatible with this particular version of Hadoop, or at least with this version of the libhdfs C API.
libhdfs
Although this library is part of the Hadoop distribution, and pre-built binaries for it are shipped with Hadoop, it is better to rebuild it from source.
Here is how:
The libhdfs source code is inside src/c++/libhdfs/ in the Hadoop distribution.
Inside the extras directory of the libhdfscpp distribution there are two patches. Copy Makefile.diff to the libhdfs source directory.
Go to the libhdfs directory and apply the Makefile patch:

    cd $HADOOP_HOME/src/c++/libhdfs
    cp Makefile Makefile.orig
    patch Makefile < Makefile.diff

Edit the patched Makefile. Modify the parts between # BEGIN tmacam and # END tmacam to match your installation settings. In my last install I was compiling for an i386 system, so this is what that section of the file looked like:

    # BEGIN tmacam
    # for i386 use OS_ARCH=i386 and N_BITS=32
    OS_ARCH=i386
    N_BITS=32
    # for amd64 use OS_ARCH=amd64 and N_BITS=64
    #OS_ARCH=amd64
    #N_BITS=64
    PLATFORM=linux
    JAVA_HOME=/usr/lib/jvm/java-6-sun
    LIBHDFS_DEST_DIR=/usr/local/libhdfs-0.18.3/
    LIBHDFS_BUILD_DIR=$(LIBHDFS_DEST_DIR)/lib
    LIBHDFS_INCLUDES_DIR=$(LIBHDFS_DEST_DIR)/lib
    #
    # execute 'make help' to see build instructions
    #
    # END tmacam

To build it, first run make help. It will produce output similar to this:

    LD_LIBRARY_PATH='/usr/lib/jvm/java-6-sun/jre/lib/i386/server/' make
    LD_LIBRARY_PATH='/usr/lib/jvm/java-6-sun/jre/lib/i386/server/' make install

All you have to do is run these commands.
Notice: this Makefile expects a JDK from Sun.
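
The LD_LIBRARY_PATH printed by make help appears to follow the Sun JDK layout, $JAVA_HOME/jre/lib/$OS_ARCH/server. As a minimal sketch, assuming that layout holds for your JDK, the shell function below prints that directory and fails early when libjvm.so is not there. The name jvm_lib_dir is hypothetical and is not part of either Makefile.

```shell
# Hypothetical helper; assumes the Sun JDK layout seen in the make help
# output: $JAVA_HOME/jre/lib/$OS_ARCH/server/libjvm.so
jvm_lib_dir() {
    java_home="$1"   # e.g. /usr/lib/jvm/java-6-sun
    os_arch="$2"     # i386 or amd64, as set in the Makefile
    dir="$java_home/jre/lib/$os_arch/server"
    if [ ! -f "$dir/libjvm.so" ]; then
        echo "libjvm.so not found in $dir" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

On the i386 setup described above, jvm_lib_dir /usr/lib/jvm/java-6-sun i386 should print the directory that make help reports, without the trailing slash. A failure here is a hint that the installed JDK uses a different layout and that the Makefile settings need adjusting.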
Edit libhdfscpp's Makefile, applying changes similar to those made while building libhdfs. In short, you have to configure OS_ARCH, PLATFORM, JAVA_HOME, N_BITS and LIBHDFS_HOME. Those settings are right at the top of the Makefile.
There are lots of library dependencies, libraries not in the default search path, and classes not in Java's default CLASSPATH, so it is a royal PITA to run programs compiled with libhdfs and, for that matter, with libhdfscpp. Thus, we recommend running programs that use libhdfscpp as follows:

    LD_LIBRARY_PATH=$(cat LDLIBPATH.txt) CLASSPATH=$(cat CLASSPATH.txt) ./hdfsread /user/hadoop/friends-txt-DELME2/part-00001
Notice that both LDLIBPATH.txt and CLASSPATH.txt are produced by libhdfscpp's Makefile.
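
To avoid retyping that prefix, the same idea can be wrapped in a small shell function. This is only a sketch: run_with_hdfs_env and the HDFSCPP_DIR variable are hypothetical names introduced here and are not shipped with libhdfscpp. HDFSCPP_DIR is assumed to point at the directory holding the two generated files and defaults to the current directory.

```shell
# Hypothetical wrapper, not part of libhdfscpp: run a command with the
# environment recorded in LDLIBPATH.txt and CLASSPATH.txt.
run_with_hdfs_env() {
    dir="${HDFSCPP_DIR:-.}"   # assumed location of the generated files
    for f in "$dir/LDLIBPATH.txt" "$dir/CLASSPATH.txt"; do
        if [ ! -r "$f" ]; then
            echo "missing $f; build libhdfscpp first" >&2
            return 1
        fi
    done
    # The assignments apply only to the command being run.
    LD_LIBRARY_PATH="$(cat "$dir/LDLIBPATH.txt")" \
    CLASSPATH="$(cat "$dir/CLASSPATH.txt")" \
    "$@"
}
```

With the function defined, the command above becomes run_with_hdfs_env ./hdfsread /user/hadoop/friends-txt-DELME2/part-00001. Checking for the two files first gives a readable error instead of a program that fails to find libjvm or the Hadoop classes.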
The build also produces a simple but functional .pc file for use with pkg-config.
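
A sketch of how the .pc file might be used when compiling your own program follows. The module name libhdfscpp is an assumption (use the base name of the .pc file the build actually produced), and hdfscpp_compile_cmd is a hypothetical helper. It relies only on the standard pkg-config options --exists, --cflags and --libs, and expects the directory containing the .pc file to be listed in PKG_CONFIG_PATH.

```shell
# PC_NAME is an assumption: set it to the base name of the generated .pc file.
PC_NAME="${PC_NAME:-libhdfscpp}"

# Hypothetical helper: print the g++ command line for building one source
# file against the library, so it can be inspected before being run.
hdfscpp_compile_cmd() {
    src="$1"   # C++ source file
    out="$2"   # name of the executable to produce
    if ! pkg-config --exists "$PC_NAME"; then
        echo "pkg-config cannot find $PC_NAME.pc; check PKG_CONFIG_PATH" >&2
        return 1
    fi
    printf 'g++ %s -o %s %s %s\n' \
        "$(pkg-config --cflags "$PC_NAME")" "$out" "$src" \
        "$(pkg-config --libs "$PC_NAME")"
}
```

For example, hdfscpp_compile_cmd myprog.cpp myprog prints a command you can paste into a shell, where myprog.cpp stands for your own source file. The resulting binary still needs the LD_LIBRARY_PATH and CLASSPATH settings described earlier in order to run.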
- http://hadoop.apache.org/common/docs/r0.20.0/libhdfs.html
- http://wiki.apache.org/hadoop/HDFS-APIs
- libhdfs/docs/api/hdfs_8h.html inside the Hadoop 0.18.3 distribution (API documentation for hdfs.h)
- http://community.abiquo.com/display/ADP/Using+libhdfs
- http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/fs/FileSystem.html
# vim:syn=rst et ai sts=4 ts=4 sw=4 tw=72: