@cj-zhukov
Contributor

Which issue does this PR close?

Rationale for this change

This PR consolidates all the external_dependency examples (dataframe_to_s3, query_aws_s3) into a single example binary. We have agreed on the pattern, so it can now be applied to the remaining examples.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@cj-zhukov
Contributor Author

High-Level Overview

This PR consolidates all external_dependency examples (dataframe_to_s3, query_aws_s3) into a single example binary.
Previously, each example had its own file, but now they can be executed via subcommands using:

cargo run --example external_dependency -- [dataframe_to_s3|query_aws_s3]
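
For readers unfamiliar with the pattern, here is a minimal sketch of how a single example binary might map the subcommand name to an example. It is not the code from this PR: the `ExampleKind` enum and the `parse_example` helper are hypothetical names, and only the two subcommand strings are taken from the command above.

```rust
use std::str::FromStr;

/// Hypothetical enum naming the two subcommands listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExampleKind {
    DataframeToS3,
    QueryAwsS3,
}

impl FromStr for ExampleKind {
    type Err = String;

    /// Maps the subcommand string to a variant, or explains what was expected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "dataframe_to_s3" => Ok(Self::DataframeToS3),
            "query_aws_s3" => Ok(Self::QueryAwsS3),
            other => Err(format!(
                "unknown example '{other}', expected dataframe_to_s3 or query_aws_s3"
            )),
        }
    }
}

/// Hypothetical helper: takes the arguments that follow `--` and picks the
/// example named by the first one. A missing argument is reported as an error.
fn parse_example<I>(args: I) -> Result<ExampleKind, String>
where
    I: IntoIterator<Item = String>,
{
    let name = args
        .into_iter()
        .next()
        .ok_or_else(|| "missing example name".to_string())?;
    name.parse()
}
```

A `main` function would presumably call `parse_example(std::env::args().skip(1))` and then `match` on the result to run the selected example, printing the error and exiting on failure.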

@cj-zhukov
Contributor Author

It looks like the nyc-tlc bucket is no longer publicly accessible.
Anonymous ListBucket and GetObject requests now return AccessDenied, which makes the example fail even with --no-sign-request:

aws s3 ls s3://nyc-tlc/ --no-sign-request

I updated the example to use a test Parquet file uploaded to a user-controlled S3 bucket instead.

@2010YOUY01
Contributor

> It looks like the nyc-tlc bucket is no longer publicly accessible. Anonymous ListBucket and GetObject requests now return AccessDenied, which makes the example fail even with --no-sign-request:
>
> aws s3 ls s3://nyc-tlc/ --no-sign-request
>
> I updated the example to use a test Parquet file uploaded to a user-controlled S3 bucket instead.

If the CI is currently running this example without errors, the address should be valid — am I missing something? 🤔

@cj-zhukov
Contributor Author

> > It looks like the nyc-tlc bucket is no longer publicly accessible. Anonymous ListBucket and GetObject requests now return AccessDenied, which makes the example fail even with --no-sign-request:
> >
> > aws s3 ls s3://nyc-tlc/ --no-sign-request
> >
> > I updated the example to use a test Parquet file uploaded to a user-controlled S3 bucket instead.
>
> If the CI is currently running this example without errors, the address should be valid - am I missing something? 🤔

I might be missing something here, but I wasn’t able to access the s3://nyc-tlc bucket from my machine. Even aws s3 ls s3://nyc-tlc/ --no-sign-request returns AccessDenied.

One possibility is that CI is running the example with temporary AWS credentials in the environment, so the request becomes signed, which would explain why it still works there.
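
One rough way to test this hypothesis would be to check whether the standard AWS credential variables are set in the environment where the example runs. The sketch below is only an illustration: the `env_has_aws_credentials` helper is a hypothetical name, and it only looks at `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. Credentials can also come from other sources (for example a shared credentials file), which this check does not cover.

```rust
/// Hypothetical helper: returns true when both static credential variables
/// are present and non-empty. The lookup is passed in as a closure so the
/// check can be exercised without touching the real process environment.
fn env_has_aws_credentials<F>(lookup: F) -> bool
where
    F: Fn(&str) -> Option<String>,
{
    ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
        .iter()
        .all(|&name| lookup(name).map_or(false, |value| !value.is_empty()))
}
```

Against the real environment it could be called as `env_has_aws_credentials(|key| std::env::var(key).ok())`. If that returns true in CI, it would be consistent with the requests being signed there.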

To confirm whether the bucket is still publicly accessible, could you please try running the same aws s3 ls command on your side?

If it is public, I will revert my change; if not, we can keep the example using a user-controlled bucket to avoid confusion for others.
