Unable to load libhdfs #18621

asfimport · 2021-04-15T12:42:23Z

I am using pyarrow 3.0.0 with Python 3.7 and Hadoop 2.10.1 on Windows 10 64-bit, and I am getting the error below.

I am also using pyspark 3.1.1 and am not able to save a dataframe to HDFS. With pyspark 3.0.0 I was able to save a dataframe to HDFS.

Please help:

import pyarrow as pa
fs = pa.hdfs.connect(host='localhost', port=9001)
__main__:1: DeprecationWarning: pyarrow.hdfs.connect is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 219, in connect
    extra_conf=extra_conf
  File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 229, in _connect
    extra_conf=extra_conf)
  File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 45, in __init__
    self._connect(host, port, user, kerb_ticket, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 75, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 99, in pyarrow.lib.check_status
OSError: Unable to load libhdfs: The specified module could not be found.

Reporter: Sukesh Pabolu

Original Issue Attachments:

image-2021-04-15-20-04-50-069.png

Note: This issue was originally created as ARROW-12399. Please see the migration documentation for further details.

asfimport · 2021-04-15T14:17:53Z

Joris Van den Bossche / @jorisvandenbossche:
Could you try with pyarrow.fs.HadoopFileSystem(host='localhost', port=9001) instead? (The hdfs.connect() method is deprecated in favor of pyarrow.fs.HadoopFileSystem, which is also backed by a somewhat different implementation.)

asfimport · 2021-04-15T14:22:52Z

Sukesh Pabolu:

from pyarrow import fs
fs.HadoopFileSystem(host='localhost', port=9001)
Traceback (most recent call last):
  File "", line 1, in
  File "pyarrow\_hdfs.pyx", line 83, in pyarrow._hdfs.HadoopFileSystem.__init__
  File "pyarrow\error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow\error.pxi", line 99, in pyarrow.lib.check_status
OSError: Unable to load libhdfs: The specified module could not be found.

asfimport · 2021-04-15T14:28:46Z

Joris Van den Bossche / @jorisvandenbossche:
A further question: did you check the docs at https://arrow.apache.org/docs/python/filesystems.html#hadoop-file-system-hdfs ? They specify some environment variables that need to be set. Are those set?
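
A quick way to check this from the same Python session is sketched below. It assumes the variables in question are HADOOP_HOME, JAVA_HOME and CLASSPATH, with ARROW_LIBHDFS_DIR as an optional override that falls back to the lib/native folder under HADOOP_HOME. The helper names missing_hdfs_env and libhdfs_dir are made up for this illustration and are not part of pyarrow.

```python
import os

# Variable names assumed from the linked pyarrow docs page.
REQUIRED = ("HADOOP_HOME", "JAVA_HOME", "CLASSPATH")


def missing_hdfs_env(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


def libhdfs_dir(env=None):
    """Return the directory expected to hold libhdfs, or None if unknown."""
    env = os.environ if env is None else env
    explicit = env.get("ARROW_LIBHDFS_DIR")
    if explicit:
        # An explicit override wins over the HADOOP_HOME default.
        return explicit
    home = env.get("HADOOP_HOME")
    return os.path.join(home, "lib", "native") if home else None


# Report on the current process environment.
print("missing variables:", missing_hdfs_env())
print("libhdfs directory:", libhdfs_dir())
```

If the reported directory exists but does not contain the native HDFS library, the "Unable to load libhdfs" error would be expected regardless of which connect API is used.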

asfimport · 2021-04-15T14:34:53Z

Sukesh Pabolu:

asfimport · 2021-04-16T04:42:25Z

Sukesh Pabolu:
I have set all the environment variables.

asfimport · 2021-04-19T04:43:50Z

Sukesh Pabolu:
I am waiting for a further reply.

asfimport · 2021-04-19T07:59:12Z

Joris Van den Bossche / @jorisvandenbossche:
The assignment of "ARROW_LIBHDFS_DIR" might not be fully correct (it should be "C:\\hadoop\\lib\\native" with double back-slashes), but it might also not be necessary to set this variable at all if it does not point to something other than $HADOOP_HOME/lib/native.

(I don't have HDFS myself, so I can't help further beyond asking those initial questions.)

asfimport · 2021-06-22T12:40:17Z

Antoine Pitrou / @pitrou:
[~sukeshpabolu] Have you checked the above suggestion (use double backslashes to circumvent escaping issues)?
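
To illustrate the escaping issue, here is a minimal sketch, assuming the variable is assigned from a Python string literal (the thread does not confirm how it was actually set). The path is the one suggested earlier in this thread. In a normal string literal, a single backslash followed by "n" is read as a newline, so the "\native" part of the path gets corrupted.

```python
# Path taken from the suggestion earlier in this thread.
escaped = "C:\\hadoop\\lib\\native"  # every backslash doubled
raw = r"C:\hadoop\lib\native"        # raw string, no escape processing

# Only the last separator is left single here, to isolate the "\n" escape.
broken = "C:\\hadoop\\lib\native"

print(escaped == raw)   # True: both spell the same path
print("\n" in broken)   # True: the path now contains a newline
print(repr(broken))
```

Either the doubled backslashes or the raw-string form should give the intended directory. Setting the variable through the Windows system settings instead of Python avoids string escaping altogether.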

asfimport added this to the 3.0.0 milestone Jan 11, 2023
