Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configurable S3 path style for auto backups #1563

Merged
merged 5 commits into from
Jan 5, 2024

Conversation

jtackaberry
Copy link
Contributor

@jtackaberry jtackaberry commented Jan 4, 2024

AWS has deprecated path-style URLs but it's required for some S3-like alternatives such as Minio.

This is a PR for #1560, which allows the user to force the S3 path style to work with Minio. Marked as draft as updates to tests are still needed.

I will pair it with a docs PR against rqlite/rqlite.io later tonight, to show examples of how to configure rqlite with Wasabi (which it turns out doesn't need forced path style) and Minio (which does).

AWS has deprecated the path-style URLs but it's required for some
S3-like alternatives such as Minio.
@jtackaberry jtackaberry marked this pull request as draft January 4, 2024 21:57
@jtackaberry
Copy link
Contributor Author

jtackaberry commented Jan 4, 2024

I'll submit a separate PR to add logging when backups occur.

Edit: changed my mind and slipped it into this one. It's just a one-liner. Am I being lazy? :)

@jtackaberry
Copy link
Contributor Author

This PR is ready for review.

Doc updates over at rqlite/rqlite.io#7

auto/backup/uploader.go Show resolved Hide resolved
secretKey string
bucket string
key string
forcePathStyle bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading this: https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config

whether S3ForcePathStyle is set or not, uploading will work, correct? I'm reading the AWS SDK docs, but it's not clear what the implications are of setting this variable. This is distinct from this PR, does it matter if this variable is set or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazon S3 works fine when it's set to true, yeah, but Amazon has deprecated this mode and is phasing it out, so for Amazon-native S3 I wouldn't want to enable it even if it does currently work.

https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/

@otoolep otoolep merged commit dda26e2 into rqlite:master Jan 5, 2024
18 checks passed
@otoolep
Copy link
Member

otoolep commented Jan 5, 2024

Looks good -- thanks.

@otoolep
Copy link
Member

otoolep commented Jan 5, 2024

I'm guessing we should do something similar for auto-restore from these systems?

@jtackaberry
Copy link
Contributor Author

I'm guessing we should do something similar for auto-restore from these systems?

Unfortunately I forgot to test it (I will tomorrow) but that should be covered by the PR. downloader.go was updated.

@otoolep
Copy link
Member

otoolep commented Jan 5, 2024

Ah yes, I see that now, missed it when reviewing the PR.

@otoolep
Copy link
Member

otoolep commented Jan 5, 2024

Is there any suggestions for testing? Could Minio be launched in CircleCI, the way I launch currently launch Consul and etcd for some end-to-end testing?

https://github.com/rqlite/rqlite/blob/master/.circleci/config.yml#L237

https://github.com/rqlite/rqlite/blob/master/system_test/e2e/auto_state.py

The idea would be we would launch Minio, and run an end-to-end upload and restore using a local Minio endpoint.

@jtackaberry
Copy link
Contributor Author

Unfortunately I forgot to test it (I will tomorrow) but that should be covered by the PR.

Restores tested, worked fine.

Could Minio be launched in CircleCI, the way I launch currently launch Consul and etcd for some end-to-end testing?

In principle, yes. I'm not sure about the container execution environment with CircleCI and how you access services like etcd (or minio in this case) from rqlite, i.e. how you know the IP it's on. I'm guessing all the containers share the same virtual interface so you just connect to localhost?

Using docker directly -- I'll let you translate to CircleCI-speak :) -- you can start a test instance of Minio like this:

$ docker run --name minio-test -p 9000:9000 -e MINIO_ROOT_USER=test -e MINIO_ROOT_PASSWORD=testpass quay.io/minio/minio server /data

In order to create a test bucket, you need to use Minio's mc tool. This binary is available in the minio container image, so if you can exec into it after the container is running that'd be one option, or alternatively mc is available as a standalone container image at quay.io/minio/mc. Assuming minio is available over localhost, you can create the bucket like so:

mc alias set test http://localhost:9000 test testpass
mc mb test/rqlite

This will create a bucket called rqlite.

rqlite can then point to it using the root user. (Obviously you wouldn't do this in production -- you'd create a separate policy that grants access to the bucket, create a user, and attach the policy to that user -- but this is good enough for testing.)

{
  "version": 1
  "type": "s3",
  "interval": "5s",
  "sub": {
    "access_key_id": "test",
    "secret_access_key": "testpass",
    "bucket": "rqlite",
    "endpoint": "http://localhost:9000",
    "path": "backups/citest.sqlite.gz",
    "region": "us-east-1",
    "force_path_style": true
  }
}

The endpoint needs to be prefixed with http:// here. Without an explicit scheme it assumes https. But now that I run through this recipe, I see the stringified form of S3Client is looking a little janky: s3://http://localhost:9000/rqlite/backups/citest.sqlite.gz.

I could submit another PR to strip the scheme from the endpoint before constructing the string (which would lose the fact of http or https in the string), or just drop the s3:// prefix altogether for the non-Amazon-native case so it's prefixed with http:// or https://. Do you have a preference? AFAICT the stringified form of S3Client is just used for display/logging purposes. s3:// isn't really a defined standard, so once you're outside of Amazon land all bets are off.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants