Is there a way to map to pyarrow.hdfs.connect? #1113
Comments
Could you please explain how you'd like your alternative implementation to be handled by fsspec? As I understand it, you would like the "bfs" protocol to be registered with fsspec, and have it create the arrow fs, wrap it, and return an fsspec instance. Is this right?
@martindurant Yes, that's exactly what I want. Do you have any suggestions for such custom protocols? Should we do it downstream or upstream?
Could you please try:
    fsspec.implementations.arrow.HadoopFileSystem("bfs://service-endpoint")
to see if this does the right thing? If so, the following change

--- a/fsspec/implementations/arrow.py
+++ b/fsspec/implementations/arrow.py
@@ -260,6 +260,8 @@ class HadoopFileSystem(ArrowFSWrapper):
         out = {}
         if ops.get("host", None):
             out["host"] = ops["host"]
+        if ops.get("protocol"):
+            out["host"] = f"{ops.get('protocol')}://{out['host']}"
         if ops.get("username", None):
             out["user"] = ops["username"]

should allow for

    fsspec.register_implementation("bfs", fsspec.implementations.arrow.HadoopFileSystem)
    with fsspec.open("bfs://service-endpoint/tmp/license.txt") as f:
        use_somehow(f)
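To illustrate what the proposed change is meant to do, here is a minimal standalone sketch of the same idea using only the standard library. It is not fsspec's code: the function name `hdfs_kwargs_from_url` is hypothetical, and `urllib.parse.urlsplit` stands in for fsspec's own URL parsing. Unlike the diff above, the sketch only prefixes the protocol when a host is actually present, which is an assumption about the intended behaviour.

```python
from urllib.parse import urlsplit


def hdfs_kwargs_from_url(url):
    """Hypothetical helper: derive HDFS-style connection kwargs from a URL.

    The protocol is folded back into the host (e.g. "bfs://service-endpoint"),
    so an HDFS-compatible backend could tell which scheme was requested.
    """
    parts = urlsplit(url)
    out = {}
    if parts.hostname:
        out["host"] = parts.hostname
        # Assumption: only prefix the protocol when a host was given.
        if parts.scheme:
            out["host"] = f"{parts.scheme}://{out['host']}"
    if parts.username:
        out["user"] = parts.username
    if parts.port:
        out["port"] = parts.port
    return out


print(hdfs_kwargs_from_url("bfs://service-endpoint/tmp/license.txt"))
# {'host': 'bfs://service-endpoint'}
```

Whether the resulting `host` value is accepted depends on the underlying HDFS-compatible client, which is what the test requested above would confirm.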
@Jeffwan, did my proposed solution work for you?
Let's say we have a file system called bfs://, which is an equivalent implementation of HDFS. We can support pyarrow operations in the following way. Could you tell me whether fsspec can support our protocol?