allow boto3 as a HTTP data (range) provider #11

Closed
candrsn opened this issue Oct 1, 2021 · 2 comments

candrsn commented Oct 1, 2021

If the httpx client that talks to the S3-compatible bucket were abstracted just a bit into its own class,
AWS boto3 could be swapped in when needed.

I have started working on such an abstraction in this repo / branch:
https://github.com/candrsn/sqlite-s3-query/tree/feature/boto3_client

It is not ready to merge, but I want to start discussions before I am ready to open a PR.

The abstraction would move the current S3 API elements into a class that has minimal symmetry with the boto3 S3 client API:
x = boto3.client('s3')

and also

session = boto3.Session()
s3 = session.client('s3')

That would simplify the S3 API access methods.
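As a rough illustration of what "minimal symmetry" could mean, here is a sketch of an interface that mirrors only the two boto3 S3 client methods needed for range reads. The names `S3ObjectReader`, `InMemoryReader` and `read_range` are hypothetical and do not come from the branch above; the keyword arguments (`Bucket`, `Key`, `Range`, `VersionId`) and response keys (`Body`, `ContentLength`, `VersionId`) follow boto3's conventions.

```python
import io
from typing import Any, Dict, Optional, Protocol


class S3ObjectReader(Protocol):
    """Hypothetical interface: the two boto3-style methods needed for range reads."""

    def head_object(self, **kwargs: Any) -> Dict[str, Any]: ...

    def get_object(self, **kwargs: Any) -> Dict[str, Any]: ...


class InMemoryReader:
    """Toy implementation holding a single object, for illustration only."""

    def __init__(self, data: bytes, version_id: str = 'v1') -> None:
        self._data = data
        self._version_id = version_id

    def head_object(self, **kwargs: Any) -> Dict[str, Any]:
        return {'ContentLength': len(self._data), 'VersionId': self._version_id}

    def get_object(self, **kwargs: Any) -> Dict[str, Any]:
        data = self._data
        if 'Range' in kwargs:
            # HTTP byte ranges are inclusive at both ends: "bytes=0-3" is 4 bytes
            start, end = kwargs['Range'][len('bytes='):].split('-')
            data = data[int(start):int(end) + 1]
        return {'Body': io.BytesIO(data), 'VersionId': self._version_id}


def read_range(reader: S3ObjectReader, bucket: str, key: str,
               start: int, end: int, version_id: Optional[str] = None) -> bytes:
    """Fetch bytes start..end (inclusive) through any object with this interface."""
    kwargs = {'Bucket': bucket, 'Key': key, 'Range': f'bytes={start}-{end}'}
    if version_id is not None:
        kwargs['VersionId'] = version_id
    return reader.get_object(**kwargs)['Body'].read()
```

Under these assumptions, a real `boto3.client('s3')` could be passed to `read_range` in place of `InMemoryReader`, and an httpx-backed class exposing the same two methods would be the other implementation.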

Owner

michalc commented Oct 2, 2021

I'm very tempted not to make it possible to swap boto3 in via an abstraction layer. This is a fairly low-level library (at least as far as a lot of Python libraries go, I think), with what I hope is a minimal set of dependencies, and I would like to keep it that way for flexibility and performance reasons. I know it's very typical for Python code to require boto3 in order to talk to AWS, but for S3 (or an S3-compatible service), I think it's fairly unnecessary.

(I've even been thinking about how to remove the http client as a dependency... but realistically that won't happen, at least not any time soon)

But... if you really want to use this library with boto3 right now, you probably can. You just need to present an httpx-like interface via get_http_client, and convert

  • HEAD requests -> head_object
  • GET requests, extracting the versionId from the query string parameters and the range from the headers -> get_object
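The conversion described above could be sketched roughly as follows. This is not code from the library: `Boto3HttpClient` and `AdapterResponse` are hypothetical names, and the sketch assumes the library only calls `stream(method, url, params=..., headers=...)` as a context manager and then uses `raise_for_status()`, `headers` and `iter_bytes()` on the response. It also assumes virtual-hosted-style URLs, where the bucket is the first label of the hostname.

```python
from contextlib import contextmanager
from urllib.parse import unquote, urlsplit


class AdapterResponse:
    """Hypothetical stand-in for the parts of an httpx response assumed to be used."""

    def __init__(self, status_code, headers, body=None):
        self.status_code = status_code
        self.headers = headers
        self._body = body

    def raise_for_status(self):
        # boto3 raises botocore.exceptions.ClientError on failure, so a
        # response object only ever exists for a successful call
        pass

    def iter_bytes(self, chunk_size=65536):
        if self._body is None:  # HEAD responses have no body
            return
        while True:
            chunk = self._body.read(chunk_size)
            if not chunk:
                break
            yield chunk


class Boto3HttpClient:
    """Hypothetical httpx-like wrapper around a boto3 S3 client."""

    def __init__(self, s3_client):
        self._s3 = s3_client

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False

    @contextmanager
    def stream(self, method, url, params=(), headers=()):
        parsed = urlsplit(url)
        # Assumption: https://<bucket>.s3.<region>.amazonaws.com/<key>
        kwargs = {
            'Bucket': parsed.hostname.split('.')[0],
            'Key': unquote(parsed.path.lstrip('/')),
        }
        params = dict(params)
        headers = {name.lower(): value for name, value in dict(headers).items()}
        if 'versionId' in params:
            kwargs['VersionId'] = params['versionId']

        if method == 'HEAD':
            result = self._s3.head_object(**kwargs)
        elif method == 'GET':
            if 'range' in headers:
                kwargs['Range'] = headers['range']
            result = self._s3.get_object(**kwargs)
        else:
            raise ValueError(f'Unsupported method: {method}')

        # boto3 exposes the raw response headers (lower-cased names) here
        metadata = result.get('ResponseMetadata', {})
        yield AdapterResponse(
            status_code=metadata.get('HTTPStatusCode', 200),
            headers=metadata.get('HTTPHeaders', {}),
            body=result.get('Body'),
        )
```

If those assumptions hold, something like `get_http_client=lambda: Boto3HttpClient(boto3.client('s3'))` would plug it in. Any request signing headers the library adds are simply ignored here, since boto3 signs its own requests.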

Owner

michalc commented Oct 10, 2021

After thinking a bit more, I'm happy to not add any deliberate support for boto3, and just have the abstraction be at the http(s) level.

michalc closed this as completed Oct 10, 2021