[Ruby] Add support for loading table by Arrow Dataset #18794
Comments
Kouhei Sutou / @kou: |
Kanstantsin Ilchanka / @simpl1g: Also, doing this causes a segfault on version 5.0.0: Arrow::S3FileSystem.new |
Kouhei Sutou / @kou:
|
Kouhei Sutou / @kou: |
Kanstantsin Ilchanka / @simpl1g: ruby-2.7.4/gems/gobject-introspection-3.4.9/lib/gobject-introspection/loader.rb:616:in `invoke': [file-system-dataset-factory][set-file-system-uri]: NotImplemented: Got S3 URI but Arrow compiled without S3 support (Arrow::Error::NotImplemented) |
Kouhei Sutou / @kou: |
Kanstantsin Ilchanka / @simpl1g: |
Kouhei Sutou / @kou: The following patch will work:
|
Kanstantsin Ilchanka / @simpl1g: Questions:
|
Kouhei Sutou / @kou:
Could you send the changes you tried to Homebrew? |
Kouhei Sutou / @kou:
You can use general user/password syntax for URI: |
Kouhei Sutou / @kou:
Convenient APIs ( |
Kanstantsin Ilchanka / @simpl1g:
|
Kanstantsin Ilchanka / @simpl1g:
I'm not sure that this is valid for S3, since it requires different authentication. I expect something like this (see arrow/python/pyarrow/_s3fs.pyx, Line 57 in 0dfe592):
s3_fs = Arrow::S3FileSystem.new(access_key: 'key', secret_key: 'key', region: 'region')
table = Arrow::Table.load(URI("s3://bucket/path"), filesystem: s3_fs) |
Kouhei Sutou / @kou: S3FileSystem uses user/password information in URI if they exist: arrow/cpp/src/arrow/filesystem/s3fs.cc Lines 330 to 335 in 0dfe592
|
Kanstantsin Ilchanka / @simpl1g: However, a secret key can contain `/`, for example, and then the URI fails to parse. I have such a case in one of our production buckets:
URI("s3://access_key:secret/key@bucket/file.csv")
# rfc3986_parser.rb:67:in `split': bad URI(is not URI?)
|
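One possible way around the parse error above is to percent-encode the credentials before embedding them in the URI. The following is only a minimal sketch: the key values are placeholders, and it assumes that whatever consumes the URI percent-decodes the user/password part before using it, which this thread does not confirm for Arrow's S3 file system.

```ruby
require "uri"
require "erb"

# Placeholder credentials; the secret contains "/" like the failing case above.
access_key = "access_key"
secret_key = "secret/key"

# An unescaped "/" ends the authority part early, so the parser rejects the URI.
begin
  URI("s3://#{access_key}:#{secret_key}@bucket/file.csv")
rescue URI::InvalidURIError => e
  puts "unescaped secret rejected: #{e.message}"
end

# ERB::Util.url_encode percent-encodes every character outside
# a-z, A-Z, 0-9, "_", "-", "." and "~", so "/" becomes "%2F".
escaped_access = ERB::Util.url_encode(access_key)
escaped_secret = ERB::Util.url_encode(secret_key)

uri = URI("s3://#{escaped_access}:#{escaped_secret}@bucket/file.csv")

# Decode the password part to recover the original secret key.
recovered_secret = URI.decode_www_form_component(uri.password)

puts uri.host         # bucket name
puts uri.path         # object path
puts recovered_secret # original secret key
```

With the secret escaped, the URI parses and the bucket, path, and credentials can be read back separately. Whether the escaped form is accepted end to end still depends on the consumer decoding it.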
Kouhei Sutou / @kou: |
Kouhei Sutou / @kou:
If you use "s3://..." locally, the initialization process takes a long time because of a timeout. See also: aws/aws-sdk-cpp#1410 |
Kanstantsin Ilchanka / @simpl1g: |
Kanstantsin Ilchanka / @simpl1g: |
Reporter: Kouhei Sutou / @kou
Assignee: Kouhei Sutou / @kou
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-13687. Please see the migration documentation for further details.