[Python] Allow config hadoop_bin in pyarrow hdfs.py  #19822

@asfimport

Description

Currently, hadoop_bin is derived from HADOOP_HOME if that environment variable is set; otherwise it falls back to the bare hadoop command.

https://github.com/apache/arrow/blob/master/python/pyarrow/hdfs.py#L130

However, in some environment setups the hadoop binary lives in a different location. Could we check a HADOOP_BIN environment variable first, something like:

if 'HADOOP_BIN' in os.environ:
    hadoop_bin = os.environ['HADOOP_BIN']
elif 'HADOOP_HOME' in os.environ:
    hadoop_bin = '{0}/bin/hadoop'.format(os.environ['HADOOP_HOME'])
else:
    hadoop_bin = 'hadoop'
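
To make the lookup order easier to test in isolation, the same logic could be pulled into a small helper. This is only a sketch: `resolve_hadoop_bin` and its `environ` parameter are hypothetical names, not existing pyarrow API. It also differs from the snippet above in one respect: an empty `HADOOP_BIN` or `HADOOP_HOME` is treated as unset.

```python
import os


def resolve_hadoop_bin(environ=None):
    """Return the hadoop executable to use (hypothetical helper, not pyarrow API).

    Lookup order: HADOOP_BIN, then HADOOP_HOME/bin/hadoop, then plain 'hadoop'.
    """
    # Allow a mapping to be passed in so the function can be tested
    # without mutating the real process environment.
    env = os.environ if environ is None else environ

    # Assumption: an empty value is treated the same as an unset variable.
    if env.get('HADOOP_BIN'):
        return env['HADOOP_BIN']
    if env.get('HADOOP_HOME'):
        return '{0}/bin/hadoop'.format(env['HADOOP_HOME'])

    # Fall back to whatever 'hadoop' resolves to on the PATH.
    return 'hadoop'


if __name__ == '__main__':
    print(resolve_hadoop_bin())
```

Existing setups that only define HADOOP_HOME, or neither variable, would behave as they do today; only an explicitly set HADOOP_BIN changes the result.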

Reporter: Wenbo Zhao

Note: This issue was originally created as ARROW-3503. Please see the migration documentation for further details.
