New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++] Handle URI paths for filesystems? #34386
Comments
Ah, so Maybe it would be simpler to add Otherwise I worry that there may be many entry points into filesystem that accept a path (and there are many filesystem implementations). For example, this change would touch the S3 filesystem and the GCS filesystem and so on. I think there is some advantage in keeping the filesystem implementations as simple as possible. |
That would also work. In my use case, I would like to get all the paths back to pass on to another system (which does also use Arrow filesystems), but there is one step that would at least like to parse and validate that all the URIs are consistent for a filesystem. |
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: #34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: apache#34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: apache#34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: apache#34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
Describe the enhancement requested
Given a URI like
s3://bucket/foo/bar.parquet
, right now, you can turn that URI into a FileSystem instance + a non-URI path. That is useful, but then the FileSystem APIs don't work with URIs. For instance, it's explicitly rejected here:arrow/cpp/src/arrow/filesystem/s3fs.cc
Lines 416 to 419 in 0844924
So if you have a set of URIs and want to work with them with filesystems, you have to parse the URIs yourself. Given we already do the URI munging internally, it might be nice for FileSystem instances to expose a method to parse URIs into a path as well (raising an error if the URI is incompatible).
Component(s)
C++
The text was updated successfully, but these errors were encountered: