-
Notifications
You must be signed in to change notification settings - Fork 3.8k
GH-40754: [Python] Expose tls_ca_file_path to S3FileSystem #45881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@github-actions crossbow submit -g python |
This comment was marked as outdated.
This comment was marked as outdated.
python/pyarrow/_s3fs.pyx
Outdated
@@ -406,6 +408,8 @@ cdef class S3FileSystem(FileSystem): | |||
retry_strategy.max_attempts) | |||
else: | |||
raise ValueError(f'Invalid retry_strategy {retry_strategy!r}') | |||
if tls_ca_file_path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if tls_ca_file_path: | |
if tls_ca_file_path is not None: |
python/pyarrow/_s3fs.pyx
Outdated
tls_ca_file_path : str, default None | ||
Use custom CA certificate for TLS verification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tls_ca_file_path : str, default None | |
Use custom CA certificate for TLS verification | |
tls_ca_file_path : str, default None | |
If set, this should be the path of a file containing TLS certificates | |
in PEM format which will be used for TLS verification. |
Thanks for submitting this @bw513 . Is there a reason this PR is still in draft? I've posted two comments, feel free to update. |
Really sorry, I got prioritised onto other work and am now back on this. I made it a draft as there seemed to be some different approaches being suggested on #40754. I'll fix-up based on the suggestions above and remove the draft. Thanks for the comments! |
ed19cc0
to
1fc1a46
Compare
1fc1a46
to
f477507
Compare
If I understand correctly, the failure is due to R format which is probably an unrelated issue. Should I rebase off a stable release such as apache-arrow-20.0.0? |
Could you rebase on the current main? |
f477507
to
2adc66e
Compare
2adc66e
to
61bc36e
Compare
That looks like it works, thanks! I have also realised that I missed making the test a bit more specific as suggested, so I have fixed that now too. |
@github-actions crossbow submit -g python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I'll just wait for CI
Revision: 61bc36e Submitted crossbow builds: ursacomputing/crossbow @ actions-643d9fc53b |
CI failures are unrelated. |
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 4636e8a. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Rationale for this change
Currently, when using the pyarrow.fs.S3FileSystem, it is not immediately obvious how provide an alternative TLS certificate authority when working with (for example) a self-hosted minio using a self-signed certificate. It is possible to do this via environment variables, as per #40754; however, the user may not be able to easily specify these depending on their python execution environment.
What changes are included in this PR?
Adds an additional parameter to S3FileSystem to specify the file path to a custom certificate.
Are these changes tested?
Yes. A unit-test for S3FileSystem pickling is included; and I have manually verified that I can access files from a self-hosted minio with TLS enabled.
Are there any user-facing changes?
An additional parameter
tls_ca_file_path
is available on the S3FileSystem constructor.