ARROW-7311: [Python] Return filesystem and path from URI #6197

Closed
kszucs wants to merge 3 commits

@kszucs
Member

kszucs commented Jan 14, 2020

This should supersede #5977

kszucs requested a review from pitrou January 14, 2020 21:36

@fhoering
Contributor

Ok. Thanks for the PR. Sorry for the delay on the other PR. I will close it.

What about the ergonomics of this new API in Python? Do you think we can just wrap it to fit the old API? Or replace the methods with shorter versions like ls, read, write?
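
To make the suggestion concrete, here is a minimal, stdlib-only sketch of the idea: resolve a URI to a (filesystem, path) pair and expose short `ls`, `read`, and `write` helpers. It does not use pyarrow, and `LocalFS`, `BACKENDS`, and `fs_from_uri` are hypothetical names chosen for illustration, not part of this PR or of any existing API.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import url2pathname


class LocalFS:
    """Hypothetical stand-in backend with short, shell-like method names."""

    def ls(self, path):
        # Return the sorted entry names directly under `path`.
        return sorted(p.name for p in Path(path).iterdir())

    def read(self, path):
        # Return the whole file as bytes.
        return Path(path).read_bytes()

    def write(self, path, data):
        # Create or overwrite the file with `data` (bytes).
        Path(path).write_bytes(data)


# Hypothetical registry: an S3 or HDFS backend would be added here.
BACKENDS = {"file": LocalFS}


def fs_from_uri(uri):
    """Return a (filesystem, path) pair for `uri` (assumed semantics)."""
    parts = urlsplit(uri)
    if not parts.scheme:
        # A bare path is treated as a local path and passed through as-is.
        return LocalFS(), uri
    if parts.scheme not in BACKENDS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Decode percent-escapes and convert the URL path to a native path.
    return BACKENDS[parts.scheme](), url2pathname(parts.path)
```

A caller would then write `fs, path = fs_from_uri(uri)` followed by `fs.ls(path)`, whichever backend the scheme selects. Whether such short names suit every backend is the open question in this thread.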

@kszucs
Member Author

kszucs commented Jan 15, 2020

@fhoering the Unix-like filesystem methods might not fit every backend, such as S3 or HDFS. I think we'll need to deprecate the old filesystem API in favor of the new one.

cc @pitrou

@fhoering
Contributor

fhoering commented Jan 15, 2020

OK. Obviously there is no need to be exactly Unix-like, but we should at least use a concise format. When thinking of a filesystem, everybody somehow thinks of DOS/Unix filesystem commands, and the S3/HDFS filesystem shells also map to these. So IMO the most intuitive option is to just stick to that.

I would also like to upstream additional helper methods from https://github.com/dask/hdfs3. I have already wrapped some of them in my project (criteo/cluster-pack#23, https://github.com/criteo/cluster-pack/blob/master/cluster_pack/filesystem.py). My goal is to make all this disappear and use only pyarrow's filesystem.

So I will create a Jira ticket and document what I think could be improved. I can also work on that if it is agreed on and a PR would be accepted.

@pitrou
Member

pitrou left a comment

+1

python/pyarrow/tests/test_fs.py (review thread resolved)

@kszucs
Member Author

kszucs commented Jan 15, 2020

@fhoering sounds good to me, please create a Jira ticket.

@jorisvandenbossche
Member

Thanks @kszucs!
