ARROW-7311: [Python] Return filesystem and path from URI #6197
OK, thanks for the PR, and sorry for the delay on the other PR; I will close it. What about the ergonomics of this new API in Python? Do you think we can just wrap it to fit the old API, or replace the methods with shorter versions like ls, read, write?
OK. Obviously there is no need to be exactly Unix-like, but we should at least use a concise format. When thinking of a filesystem, everybody somehow thinks of DOS/Unix filesystem commands, and the S3/HDFS filesystem shells also map to these. So IMO the most intuitive option is to just stick to that. I would also like to upstream additional helper methods from https://github.com/dask/hdfs3. I have already wrapped some of them in my project (criteo/cluster-pack#23, https://github.com/criteo/cluster-pack/blob/master/cluster_pack/filesystem.py). My goal is to make all of this disappear and use only pyarrow's filesystem. So I will create a Jira ticket and document what I think could be improved. I can also work on that if it is agreed and the PR will be accepted.
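For illustration only, the wrapping idea discussed above could look roughly like the sketch below. It is not the API from this PR: `VerboseLocalFS` is a hypothetical stand-in backed by the local disk, and `ShellLikeFS` with its `ls`, `read`, and `write` methods is an assumed wrapper shape, not something agreed in this thread.

```python
import os
import tempfile


class VerboseLocalFS:
    """Hypothetical stand-in for a filesystem API with long method names."""

    def list_directory(self, path):
        # Return entry names in a stable order.
        return sorted(os.listdir(path))

    def open_input_stream(self, path):
        return open(path, "rb")

    def open_output_stream(self, path):
        return open(path, "wb")


class ShellLikeFS:
    """Thin wrapper exposing short, shell-style names (assumed design)."""

    def __init__(self, fs):
        self._fs = fs

    def ls(self, path):
        return self._fs.list_directory(path)

    def read(self, path):
        with self._fs.open_input_stream(path) as f:
            return f.read()

    def write(self, path, data):
        with self._fs.open_output_stream(path) as f:
            f.write(data)


if __name__ == "__main__":
    fs = ShellLikeFS(VerboseLocalFS())
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.bin")
        fs.write(target, b"hello")
        print(fs.ls(tmp))
        print(fs.read(target))
```

The wrapper only delegates, so the underlying filesystem object stays usable on its own and the short names remain a convenience layer.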
+1
@fhoering sounds good to me, please create a Jira ticket.
Thanks @kszucs!
This should supersede #5977