Add custom S3 CSV billing export endpoint #1991
Conversation
Signed-off-by: Klaas Jan Dijksterhuis <klaasjand@users.noreply.github.com>
Overall I think it's a nice addition. I'd just like to see a non-conflicting way to configure it.
pkg/filemanager/filemanager.go (Outdated)
@@ -44,13 +46,91 @@ func NewFileManager(path string) (FileManager, error) {
	return NewGCSStorageFile(path)
case strings.Contains(path, "blob.core.windows.net"):
	return NewAzureBlobFile(path)
case strings.HasPrefix(path, "http://"), strings.HasPrefix(path, "https://"):
With this change the s3 protocol will be assumed for any http(s) link. That would require a breaking change for a potential extension in the future, like adding a new destination type. I would suggest instead using one of the following three approaches:
- Introduce a new protocol prefix string, like customs3://endpoint.com/my-file.csv or minio://endpoint.com/my-file.csv
- Add a domain filter, like minio.com (probably the least flexible option)
- Extend the existing S3File struct to support the new destination. (I'm not familiar with MinIO, but it looks like you're using the existing AWS library and the same method calls.) Despite this being my preferred option, there might be potential challenges that I may not fully comprehend.
I'm leaning towards option 1 because I'd rather not mess up the current AWS implementation too much or add extra complexity to it. I would like to hear your thoughts on the following: introduce a new protocol prefix string sort of like you mentioned, e.g. customs3://fqdn:port/bucket-name/path/to/file.csv, and always enforce secure https, so this translates to https://fqdn:port/bucket-name/path/to/file.csv.
Sounds good to me.
I changed the protocol prefix string/scheme to alts3 (alternative S3), so alts3://fqdn:port/bucket-name/path/to/file.csv translates to https://fqdn:port/bucket-name/path/to/file.csv.
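The scheme translation described above can be sketched in Go with only the standard library. parseAltS3Path is a hypothetical helper for illustration, not the PR's actual implementation:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseAltS3Path splits an alts3:// path into its https endpoint, bucket,
// and object key. Hypothetical illustration of the scheme discussed above.
func parseAltS3Path(path string) (endpoint, bucket, key string, err error) {
	u, err := url.Parse(path)
	if err != nil {
		return "", "", "", err
	}
	if u.Scheme != "alts3" || u.Host == "" {
		return "", "", "", fmt.Errorf("invalid alts3 path: %s", path)
	}
	// The first path segment is the bucket, the remainder is the object key.
	var ok bool
	bucket, key, ok = strings.Cut(strings.TrimPrefix(u.Path, "/"), "/")
	if !ok || key == "" {
		return "", "", "", fmt.Errorf("invalid alts3 path: %s", path)
	}
	// alts3 always enforces secure https, as agreed above.
	return "https://" + u.Host, bucket, key, nil
}

func main() {
	endpoint, bucket, key, _ := parseAltS3Path("alts3://fqdn:9000/bucket-name/path/to/file.csv")
	fmt.Println(endpoint, bucket, key)
	// → https://fqdn:9000 bucket-name path/to/file.csv
}
```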
pkg/filemanager/filemanager.go (Outdated)
@@ -7,6 +7,7 @@ import (
	"io"
	"net/url"
	"os"
+	gp "path"
minor: I would rather change a variable name instead of creating an alias for a standard package. The alias makes the code harder to understand.
I removed the alias and renamed variables/arguments.
pkg/filemanager/filemanager.go (Outdated)
	return nil, fmt.Errorf("invalid s3 path: %s", path)
}

const defaultRegion = "us-east-1"
Do you need to hardcode the region?
I suspect this won't be correct for the majority of the people. If it's a required parameter I would rather fail with an error early and let the user fix it.
> Do you need to hardcode the region? I suspect this won't be correct for the majority of the people. If it's a required parameter I would rather fail with an error early and let the user fix it.

For the region I used this: minio/minio#15063, which is also described in the MinIO documentation, but both are based on using the AWS CLI and not the SDK. I realize that's quite a bold assumption. After reading this: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/endpoints/ it's probably better to make it customizable, as there is already an environment variable for that: AWS_REGION.
After reviewing the AWS SDK docs for endpoint configuration I changed the implementation, which made it a little more straightforward. See also: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/endpoints/
We are using the v1 endpoint resolver because OpenCost currently uses github.com/aws/aws-sdk-go-v2/service/s3 v1.31.0; from v1.38.0 and above we would have to switch to the v2 endpoint resolver, see: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/endpoints/#migration. I also added a comment about it in the code.
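For reference, a v1-style resolver along the lines of those docs could be sketched as follows. This is an illustrative sketch only, assuming the aws-sdk-go-v2 modules are available; newS3Client and customEndpoint are hypothetical names, not the PR's actual code:

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// newS3Client builds an S3 client that routes every request to a custom
// endpoint via the v1 EndpointResolverWithOptions mechanism. From
// service/s3 v1.38.0 upward, the per-client v2 resolver should be used
// instead (see the migration guide linked above).
func newS3Client(ctx context.Context, customEndpoint string) (*s3.Client, error) {
	resolver := aws.EndpointResolverWithOptionsFunc(
		func(service, region string, options ...interface{}) (aws.Endpoint, error) {
			return aws.Endpoint{URL: customEndpoint}, nil
		})

	cfg, err := config.LoadDefaultConfig(ctx,
		config.WithEndpointResolverWithOptions(resolver))
	if err != nil {
		return nil, fmt.Errorf("loading AWS config: %w", err)
	}

	// MinIO-style endpoints usually require path-style addressing
	// (https://endpoint/bucket/key rather than bucket.endpoint/key).
	return s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.UsePathStyle = true
	}), nil
}
```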
pkg/filemanager/filemanager.go (Outdated)
	s3Client: s3.NewFromConfig(cfg),
	bucket:   bucket, // bucket
	key:      key,    // path/to/file.csv
}, nil
Can you just return S3File{} here? I don't see the difference.
> Can you just return S3File{} here? I don't see the difference.

You're right. My initial thought was to keep the original AWS S3 and the custom S3 code completely separated, with each having its own Upload and Download implementation. But in retrospect it's probably better to reuse the code.
I can see arguments for having it separate as well.
I don't have a strong opinion on this. I think either way is fine.
… sdk based s3 client endpoint construction Signed-off-by: Klaas Jan Dijksterhuis <klaasjand@users.noreply.github.com>
…rt alias for path Signed-off-by: Klaas Jan Dijksterhuis <klaasjand@users.noreply.github.com>
LGTM
@mattray Can you merge it?