
[C++/Python] Add troubleshooting section for setting up HDFS JNI interface #17331

Open

asfimport opened this issue Aug 2, 2017 · 5 comments

The hadoop library directory contains a libhdfs.a and a libhadoop.so but no libhdfs.so.

Environment: linux trusty-cdh5
Reporter: Martin Durant / @martindurant

Note: This issue was originally created as ARROW-1313. Please see the migration documentation for further details.

Wes McKinney / @wesm:
Can you provide more detail about your environment (i.e. so that this can be reproduced)? The location of libhdfs.so can vary a lot by Hadoop distribution.

Martin Durant / @martindurant:
Docker file: https://github.com/dask/hdfs3/blob/master/continuous_integration/Dockerfile

This uses an official .deb of CDH5, installed into /usr/lib/hadoop. There is no libhdfs.so anywhere in that directory.

Using java-7-openjdk-amd64.

Wes McKinney / @wesm:
It appears that, in this particular Hadoop distribution, libhdfs is packaged as a separate Linux package:

    apt-get install libhdfs0-dev

and then

    # find /usr -name \*.so -print | grep hdfs
    /usr/lib/libhdfs.so
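
Once libhdfs.so is installed somewhere, Arrow's HDFS interface can be pointed at it via the ARROW_LIBHDFS_DIR environment variable. The sketch below illustrates the kind of lookup involved: check ARROW_LIBHDFS_DIR first, then fall back to a few well-known directories. The directory list here is illustrative only, not the exact set Arrow searches:

```python
import os
from pathlib import Path

def find_libhdfs(extra_dirs=()):
    """Return the path to libhdfs.so, or None if it cannot be found.

    Lookup order (a simplified sketch, not Arrow's exact logic):
    1. The directory named by ARROW_LIBHDFS_DIR, if set.
    2. Any caller-supplied extra directories.
    3. A few well-known locations (illustrative, not exhaustive).
    """
    candidates = []
    env_dir = os.environ.get("ARROW_LIBHDFS_DIR")
    if env_dir:
        candidates.append(env_dir)
    candidates.extend(extra_dirs)
    candidates += [
        "/usr/lib",                       # libhdfs0-dev layout above
        "/usr/lib/hadoop/lib/native",     # typical CDH layout
        "/usr/local/hadoop/lib/native",   # typical tarball install
    ]
    for d in candidates:
        p = Path(d) / "libhdfs.so"
        if p.is_file():
            return str(p)
    return None
```

In practice, exporting ARROW_LIBHDFS_DIR=/usr/lib (for the libhdfs0-dev layout shown above) is usually enough to make the library discoverable.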

Martin Durant / @martindurant:
That would install the whole of Hadoop as system packages, so there would be two separate installations alongside the CDH install from before.
libhdfs.so is only 200 kB; can it not be distributed?

Wes McKinney / @wesm:
My understanding is that the safest thing to do in production is to use the libhdfs.so that ships with your particular Hadoop distribution, since there may be internal details specific to that version of Hadoop. While the public C API is the same between versions, in theory there could be internal details in the JNI implementation that break the Java "ABI". The Hadoop community would be able to give better advice.
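
A quick sanity check on whichever libhdfs.so you end up with is to dlopen it and look up one of the public C API entry points (e.g. hdfsConnect, which is part of the documented libhdfs API). The helper below is a generic sketch, not Arrow code; the /usr/lib/libhdfs.so path in the comment is just the libhdfs0-dev layout mentioned earlier:

```python
import ctypes

def has_symbol(lib_path, symbol):
    """Try to dlopen a shared library and look up one exported symbol.

    Returns False if the library cannot be loaded at all (missing file,
    wrong architecture, unresolved dependencies) or if the symbol is
    absent. Useful as a pre-flight check before wiring a distribution's
    libhdfs.so into Arrow, e.g.:
        has_symbol("/usr/lib/libhdfs.so", "hdfsConnect")
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except (OSError, TypeError):
        return False
    try:
        getattr(lib, symbol)
        return True
    except AttributeError:
        return False
```

Note that loading libhdfs.so typically also requires libjvm.so to be resolvable (it is a JNI wrapper), so a load failure here often points at the Java installation rather than libhdfs itself.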
