
feat(fs/s3): Initial support for s3 filesystem backend #1900

Merged: 6 commits into flipt-io:main from adh-s3, Jul 25, 2023

Conversation

@ahobson (Contributor) commented Jul 24, 2023

This follows the same pattern as fs/git to add support for fetching feature flag data from S3.

In an AWS environment, this would allow deploying a readonly container without requiring communication with a git repository outside of the deployment environment.
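
For context, a sketch of what the storage configuration for this backend might look like (key names here are illustrative, not necessarily the final schema):

storage:
  type: object
  object:
    type: s3
    s3:
      region: us-east-1
      bucket: my-flipt-flags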

Let me know if you have suggestions or thoughts on how to improve the implementation.

Thank you for flipt.

Testing

The unit tests rely on an existing s3 bucket, using minio to simulate one locally.

Just like for the git backend, mage dagger:run test:unit does the work to start, provision, and configure the test suite appropriately.

However, if you do want to run this test locally to experiment or investigate, you can always do the following:

# in one terminal session
docker run -it --rm --name minio \
  -e MINIO_ROOT_USER=user -e MINIO_ROOT_PASSWORD=password -p 9009:9009 \
  quay.io/minio/minio:latest server /data --address ":9009"
# in another session first use the minio provisioner binary
# this pushes the contents of the provided directory into a bucket in minio
AWS_ACCESS_KEY_ID=user AWS_SECRET_ACCESS_KEY=password \
  go run ./build/internal/cmd/minio/... -minio-url http://localhost:9009 -testdata-dir ./internal/storage/fs/s3/testdata
# then you can run the test for `fs.Source`
AWS_ACCESS_KEY_ID=user AWS_SECRET_ACCESS_KEY=password TEST_S3_ENDPOINT=http://localhost:9009 \
  go test -v ./internal/storage/fs/s3/...
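
The TEST_S3_ENDPOINT variable points the client at minio rather than AWS. As a rough sketch of what that wiring can look like with a recent aws-sdk-go-v2 (the endpoint, region, and bucket name here are assumptions for illustration):

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.Background(), config.WithRegion("us-east-1"))
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		// Point the client at local minio instead of AWS.
		o.BaseEndpoint = aws.String("http://localhost:9009")
		// minio addresses buckets by path rather than by virtual host.
		o.UsePathStyle = true
	})
	out, err := client.ListObjectsV2(context.Background(), &s3.ListObjectsV2Input{
		Bucket: aws.String("testdata"), // assumed bucket name
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, obj := range out.Contents {
		fmt.Println(aws.ToString(obj.Key))
	}
}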

@ahobson ahobson requested a review from a team as a code owner July 24, 2023 14:19
@markphelps (Collaborator) left a comment

This is amazing @ahobson !! Thank you so much for the contribution!! Very much appreciated, I think an s3 backend makes total sense. Also the code looks 👨🏻‍🍳 💋

One question about the UX from a config standpoint

Again thank you very much this is awesome!

Review comment on internal/config/storage.go (outdated, resolved)

codecov bot commented Jul 24, 2023

Codecov Report

Merging #1900 (0961f2d) into main (c2374aa) will decrease coverage by 0.38%.
The diff coverage is 62.80%.

@@            Coverage Diff             @@
##             main    #1900      +/-   ##
==========================================
- Coverage   71.64%   71.26%   -0.38%     
==========================================
  Files          58       60       +2     
  Lines        5501     5743     +242     
==========================================
+ Hits         3941     4093     +152     
- Misses       1335     1419      +84     
- Partials      225      231       +6     
Files Changed                      Coverage          Δ
internal/cmd/grpc.go                0.00% <0.00%>    (ø)
internal/s3fs/s3fs.go              65.32% <65.32%>   (ø)
internal/storage/fs/s3/source.go   78.12% <78.12%>   (ø)
internal/config/storage.go         79.68% <91.30%>   (+6.51%) ⬆️


@markphelps (Collaborator) left a comment

Looks great @ahobson, just a couple of minor suggestions and one typo. Exactly what I had in mind!! Thank you!

Review comments on: build/testing/integration.go (outdated, resolved), internal/cmd/grpc.go (resolved), internal/config/storage.go (resolved)
@GeorgeMac (Contributor) left a comment

This is so great. Thanks for taking the time to build this @ahobson 🙏 and for the attention to the tests. It was a nice Monday surprise.

I couldn't come up with any blocking feedback either, let's get this in 🙌 💪

I tried to ponder if there was something we could do with s3 and prefix listing for directory hierarchy, but the more I sat on it, the more I realized it was probably bringing zero value for lots of pointless complexity. What you have here makes sense. It's enough to support multiple namespaces via multiple files in a single bucket 👌

We're looking to make these FS backends non-experimental in a coming release. We can definitely get some docs written up around this. I wonder, going forward, if it would be valuable to have a new sub-command (e.g. flipt upload) that takes the flipt index file, finds the flag files in your project, and puts just the minimum set in a target bucket. Food for thought / a potential nice-to-have.

Comment on lines 72 to 84
	// try to return fs compatible error if possible
	var nsbe *types.NoSuchBucket
	if errors.As(err, &nsbe) {
		return nil, pathError
	}
	var nske *types.NoSuchKey
	if errors.As(err, &nske) {
		return nil, pathError
	}
	var nfe *types.NotFound
	if errors.As(err, &nfe) {
		return nil, pathError
	}
Contributor

Take it or leave it: We have a little utility to make this terser if you want

flipt/errors/errors.go

Lines 8 to 19 in 2416a01

// As is a utility for one-lining errors.As statements.
// e.g. cerr, match := errors.As[MyCustomError](err).
func As[E error](err error) (e E, _ bool) {
	return e, errors.As(err, &e)
}

// AsMatch is the same as As but it returns just a boolean to represent
// whether or not the wrapped type matches the type parameter.
func AsMatch[E error](err error) (match bool) {
	_, match = As[E](err)
	return
}

Suggested change

	// try to return fs compatible error if possible
	var nsbe *types.NoSuchBucket
	if errors.As(err, &nsbe) {
		return nil, pathError
	}
	var nske *types.NoSuchKey
	if errors.As(err, &nske) {
		return nil, pathError
	}
	var nfe *types.NotFound
	if errors.As(err, &nfe) {
		return nil, pathError
	}

becomes:

	// try to return fs compatible error if possible
	if flipterrors.AsMatch[*types.NoSuchBucket](err) ||
		flipterrors.AsMatch[*types.NoSuchKey](err) ||
		flipterrors.AsMatch[*types.NotFound](err) {
		return nil, pathError
	}

Comment on lines +99 to +103
// Stat implements fs.StatFS. For the s3 filesystem, this gets the
// objects in the s3 bucket and stores them for later use. Stat can
// only be called on the current directory as the s3 filesystem only
// supports walking a single bucket configured at creation time.
func (f *FS) Stat(name string) (fs.FileInfo, error) {
Contributor

Just for my clarity then: the FS implementation needs to be primed with a call to Stat before it has any entries in it? (Unless you happen to know the available keys beforehand, of course.)

I see that fs.WalkDir starts with a call to Stat, which is why that works for the snapshot implementation we use.

I think this is good. I was just contemplating why we want to perform the operation on Stat and not, say, on ReadDir instead. Based on how we interact with fs.FS in the snapshot, what you have here works great.
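
A tiny self-contained sketch of that access pattern, using fstest.MapFS as a stand-in for the s3 FS (file names are made up): fs.WalkDir stats the root first, which is exactly the hook the s3 implementation uses to list the bucket.

package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

func main() {
	// Any fs.FS is walked the same way: Stat(".") on the root, then ReadDir.
	fsys := fstest.MapFS{
		"features.yml":      {Data: []byte("namespace: default")},
		"prod.features.yml": {Data: []byte("namespace: prod")},
	}
	var keys []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			keys = append(keys, path)
		}
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // [features.yml prod.features.yml]
}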

Contributor Author

I was trying to do a minimum viable s3fs and part of that was looking into how fs.WalkDir works and seeing that the call to Stat will happen before the call to ReadDir.

Re-reading the s3fs code, I suppose I could have ReadDir populate the filesystem entries if necessary, but because this internal flipt s3fs is only used by fs.WalkDir, I didn't think it was necessary, or maybe even desirable, as I haven't tested that all the operations work independently of each other.

Said another way, this isn't a general purpose s3 filesystem wrapper and I was trying to be explicit about that.

I'd vote leaving it as it is for now, and if we come up with another use case where this restriction is a problem, we can revisit it then.

Thoughts?

Contributor

Yep agreed 👍 and thanks for the clarification there.

I am going to sync up with the rest of the team later and figure out if we can fit this nicely into next week's (1.24) release. I suspect we can 👍

@ahobson (Contributor Author) commented Jul 25, 2023

> I wonder, going forward, if it would be valuable to have a new sub-command (e.g. flipt upload) that takes the flipt index file, finds the flag files in your project, and puts just the minimum set in a target bucket. Food for thought / a potential nice-to-have.

Hmm. The aws cli provides aws s3 cp for single files and aws s3 sync for directories and I'd be hard pressed to believe I can do better than that. I guess if someone had a directory of files and only some of them were for feature flags maybe a specialized command would help, but aws s3 sync has include and exclude filters to do pretty much the same thing, I would think.
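
For instance, a hypothetical invocation (directory and bucket names are made up) that syncs only the flag files plus the index:

# sync only the feature flag files and the .flipt.yml index
aws s3 sync ./my-project s3://my-flipt-bucket \
  --exclude "*" \
  --include "*.features.yml" \
  --include ".flipt.yml"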

@GeorgeMac (Contributor) commented

> I wonder, going forward, if it would be valuable to have a new sub-command (e.g. flipt upload) that takes the flipt index file, finds the flag files in your project, and puts just the minimum set in a target bucket. Food for thought / a potential nice-to-have.

> Hmm. The aws cli provides aws s3 cp for single files and aws s3 sync for directories and I'd be hard pressed to believe I can do better than that. I guess if someone had a directory of files and only some of them were for feature flags maybe a specialized command would help, but aws s3 sync has include and exclude filters to do pretty much the same thing, I would think.

@ahobson I agree that aws cli for syncing would be the best course of action at first.
This is just a long term thought for a complete end to end experience for users of Flipt.
We have a GH action in the pipeline for things like validating Flipt configuration files and it works by invoking flipt validate directly. So this would slot nicely in there.

Additionally, we already have the concept of the .flipt.yml file for indexing which files in a directory tree should be considered by Flipt. I would just imagine invoking this, using the full paths as keys (plus copying the .flipt.yml index file itself), and packaging that into a bucket's contents:

func listStateFiles(logger *zap.Logger, source fs.FS) ([]string, error) {
	// This is the default variable + value for the FliptIndex. It will preserve its value if
	// a .flipt.yml can not be read for whatever reason.
	idx := FliptIndex{
		Version: "1.0",
		Include: []string{
			"**features.yml", "**features.yaml", "**.features.yml", "**.features.yaml",
		},
	}

	// Read index file
	inFile, err := source.Open(indexFile)
	if err == nil {
		if derr := yaml.NewDecoder(inFile).Decode(&idx); derr != nil {
			return nil, fmt.Errorf("yaml: %w", derr)
		}
	}

	if err != nil {
		if !errors.Is(err, fs.ErrNotExist) {
			return nil, err
		} else {
			logger.Debug("index file does not exist, defaulting...", zap.String("file", indexFile), zap.Error(err))
		}
	}

	var includes []glob.Glob
	for _, g := range idx.Include {
		glob, err := glob.Compile(g)
		if err != nil {
			return nil, fmt.Errorf("compiling include glob: %w", err)
		}

		includes = append(includes, glob)
	}

	filenames := make([]string, 0)
	if err := fs.WalkDir(source, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}

		if d.IsDir() {
			return nil
		}

		for _, glob := range includes {
			if glob.Match(path) {
				filenames = append(filenames, path)
				return nil
			}
		}

		return nil
	}); err != nil {
		return nil, err
	}

	if len(idx.Exclude) > 0 {
		var excludes []glob.Glob
		for _, g := range idx.Exclude {
			glob, err := glob.Compile(g)
			if err != nil {
				return nil, fmt.Errorf("compiling include glob: %w", err)
			}

			excludes = append(excludes, glob)
		}

	OUTER:
		for i := range filenames {
			for _, glob := range excludes {
				if glob.Match(filenames[i]) {
					filenames = append(filenames[:i], filenames[i+1:]...)
					continue OUTER
				}
			}
		}
	}

	return filenames, nil
}

But as I said, not for this pass. Just a potential idea to build upon to create something that gives a similar experience to the git and local backends.
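
Sketching the upload half one step further (purely hypothetical: uploadStateFiles and its wiring don't exist anywhere yet), it could be as simple as putting each listed file into the bucket under its full path:

package upload

import (
	"context"
	"io/fs"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// uploadStateFiles is a hypothetical sketch of the proposed "flipt upload"
// subcommand: copy every indexed state file (plus the .flipt.yml index) into
// a bucket, using each file's full path as its object key.
func uploadStateFiles(ctx context.Context, client *s3.Client, bucket string, source fs.FS, filenames []string) error {
	for _, name := range append(filenames, ".flipt.yml") {
		f, err := source.Open(name)
		if err != nil {
			return err
		}
		_, perr := client.PutObject(ctx, &s3.PutObjectInput{
			Bucket: aws.String(bucket),
			Key:    aws.String(name), // full path as key
			Body:   f,                // fs.File implements io.Reader
		})
		f.Close()
		if perr != nil {
			return perr
		}
	}
	return nil
}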

@markphelps (Collaborator) left a comment

so cool. thanks again!!

@markphelps (Collaborator) commented

@all-contributors please add @ahobson for code

@allcontributors (Contributor) commented

@markphelps

I've put up a pull request to add @ahobson! 🎉

@markphelps markphelps enabled auto-merge (squash) July 25, 2023 18:13
@markphelps markphelps merged commit db8f92b into flipt-io:main Jul 25, 2023
18 of 20 checks passed
@ahobson ahobson deleted the adh-s3 branch July 25, 2023 18:29