ARROW-2919: [C++/Python] Improve HdfsFile error messages, fix Python unit test suite #3209

Closed
wesm wants to merge 4 commits

@wesm wesm commented Dec 18, 2018

This also resolves ARROW-3957 and ARROW-4053.

Summary:

  • Properly initialize NativeFile when opening a file from HDFS. This broke when the "closed" property was added along with some other refactoring, and it wasn't caught because these tests aren't run regularly
  • Slightly improve the handling of filesystem URIs. Some tests failed without these changes because the docker-compose HDFS containers don't allow writes from $USER
  • Improve the error message when calling "info" on a file that does not exist
  • Improve the error message when calling "ls" on a directory that does not exist
  • Suggest checking whether you are connecting to the right HDFS port when getting errno 255
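
To illustrate the last three items, here is a rough sketch of how a failed HDFS call could be turned into a more helpful message. It is not the code in this PR: `describe_hdfs_failure` and `HDFS_BAD_PORT_ERRNO` are hypothetical names, and the sketch assumes a missing path is reported as `ENOENT`.

```python
import errno
import os

# Hypothetical constant: the errno value the PR description associates with
# a likely wrong HDFS port.
HDFS_BAD_PORT_ERRNO = 255


def describe_hdfs_failure(operation, path, errno_value):
    """Build an error message for a failed HDFS call (illustrative only)."""
    if errno_value == errno.ENOENT:
        # Assumption: a non-existent file or directory surfaces as ENOENT.
        detail = "path does not exist"
    else:
        detail = os.strerror(errno_value)
    message = f"HDFS {operation} failed for '{path}': {detail} (errno {errno_value})"
    if errno_value == HDFS_BAD_PORT_ERRNO:
        # Add the hint about the port suggested in the summary above.
        message += ". Check whether you are connecting to the correct HDFS port"
    return message
```

The point of the design is that the message names the operation and the path, and only adds the port hint for the one errno value where it is likely to help.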

Change-Id: If58d642c2d627bcf9d5902064d03d13f8b5bd4fe
…ue to non-existent file or bad port

Change-Id: Ic490a94a3837a609066c29a815efd1902a2b7bf3
@wesm wesm requested review from kszucs and pitrou December 18, 2018 03:44

@pitrou pitrou left a comment

Looks mostly good to me. Some minor comments.

auto path = this->ScratchPath("path-does-not-exist");

HdfsPathInfo info;
Status s = this->client_->GetPathInfo(path, &info);
Member

Assert that the Status is a particular error?

Member Author

done
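
A Python analogue of the point raised in this thread, as a sketch only: the test should check for the specific failure, not just that the call failed. The C++ test under review works with `Status`; the `FakeHdfsClient` class and `fails_with_missing_path_error` helper below are hypothetical and are not part of pyarrow.

```python
class FakeHdfsClient:
    """Hypothetical stand-in for an HDFS client; not pyarrow's API."""

    def __init__(self, existing_paths):
        self._existing = set(existing_paths)

    def info(self, path):
        # Raise a specific error type that names the offending path.
        if path not in self._existing:
            raise FileNotFoundError(f"HDFS path does not exist: {path}")
        return {"path": path}


def fails_with_missing_path_error(client, path):
    """Return True only if info() fails with the expected, specific error."""
    try:
        client.info(path)
    except FileNotFoundError as exc:
        # Check both the error type and that the message names the path.
        return path in str(exc)
    return False
```

Any other exception type propagates out of the helper, so an unrelated failure is not mistaken for the expected one.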


self.fs, _ = _get_filesystem_and_path(filesystem, a_path)
self.paths = path_or_paths
Member

This temporary assignment looks a bit superfluous. You could just access path_or_paths below.

Member Author

fixed

Change-Id: Ib7b69f3287876b95db5b2b03d3e9c3daf5f9c26d
Change-Id: I35dec2a1a19cdaf1688281b6798680e79bf95501

@pitrou pitrou left a comment

+1, LGTM

@wesm wesm closed this in 758bd55 Dec 18, 2018
@wesm wesm deleted the ARROW-2919 branch December 18, 2018 18:25
cav71 pushed a commit to cav71/arrow that referenced this pull request Dec 22, 2018
…unit test suite

Author: Wes McKinney <wesm+git@apache.org>

Closes apache#3209 from wesm/ARROW-2919 and squashes the following commits:

b11e5b6 <Wes McKinney> Restore arrow_dependencies to Gandiva dependencies
20e8784 <Wes McKinney> Code review comments
4ba93bb <Wes McKinney> More helpful error messages when GetPathInfo or ListDirectory fails due to non-existent file or bad port
3c67ea6 <Wes McKinney> Basic fixes to get Python unit tests passing again