Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decompress GZIP'd user data #1762

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Decompress GZIP'd user data #1762

wants to merge 1 commit into from

Conversation

cartermckinnon
Copy link
Member

@cartermckinnon cartermckinnon commented Apr 11, 2024

Issue #, if available:

Fixes #1734

Description of changes:

Adds support for GZIP-compressed user data.

The following scenarios are supported:

  1. User data that consists of a NodeConfig that is compressed with GZIP.
  2. User data that is a multi-part MIME document that is compressed with GZIP.
  3. User data that is a multi-part MIME document containing parts that are individually GZIP'd (and have the Content-Encoding: gzip header).
  4. A combination of 2 and 3.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Member

@ndbaker1 ndbaker1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change LGTM. doesn't seem like we are going to adopt decompressing individual parts, so can we do a doc update before merging?

@cartermckinnon
Copy link
Member Author

doesn't seem like we are going to adopt decompressing individual parts

@ndbaker1 latest rev adds support for this, PTAL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote the unit tests for this config provider; they're more verbose but I think being explicit about the input is more clear and lets us more easily add test cases in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you do any integration testing here? Or is that covered by the existing integration tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the change, thanks for skipping over the junk to setup mime documents and testing the actual provider interface

Comment on lines +37 to +39
type userDataConfigProvider struct {
userDataProvider userDataProvider
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this wrapper?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for testing, it allows us to mock out the IMDS calls

return &userDataConfigProvider{}
return &userDataConfigProvider{
userDataProvider: &imdsUserDataProvider{},
}
}

func (ics *userDataConfigProvider) Provide() (*internalapi.NodeConfig, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is ics the name for userDataConfigProvider?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's an artifact from when it was something like imdsConfigProvider? 🤣
maybe we could just generally name the config provider references cp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh i like just calling it this but that's a Java-ism that gophers detest

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha. Let's change it to something that makes sense :P

Comment on lines +108 to +114
if part.Header.Get(contentEncodingHeader) == "gzip" {
decompressedNodeConfigPart, err := decompressIfGZIP(nodeConfigPart)
if err != nil {
return nil, err
}
nodeConfigPart = decompressedNodeConfigPart
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to make a check that it's gzip and then gracefully handle it not being gzip?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't just calling decompressIfGZIP in all cases work better since its fallible in constant time and safe either way?

To not hit this branch you'd have to set the nodeConfig mediaType but forget to add the gzip encoding header, which im sure would only happen on accident. maybe handling it explicitly is more "proper", but not sure if its necessary

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIME semantics dictate that if the section is GZIP'd, this header needs to be there. If a user (or library) is creating MIME documents that use GZIP but omit this header, that should fail because it's not conformant to the spec.

Generally, folks are only using GZIP compression via a library which should implement this properly -- folks aren't hand-writing and hand-compressing their MIME docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you do any integration testing here? Or is that covered by the existing integration tests?

Comment on lines +108 to +114
if part.Header.Get(contentEncodingHeader) == "gzip" {
decompressedNodeConfigPart, err := decompressIfGZIP(nodeConfigPart)
if err != nil {
return nil, err
}
nodeConfigPart = decompressedNodeConfigPart
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't just calling decompressIfGZIP in all cases work better since its fallible in constant time and safe either way?

To not hit this branch you'd have to set the nodeConfig mediaType but forget to add the gzip encoding header, which im sure would only happen on accident. maybe handling it explicitly is more "proper", but not sure if its necessary

return &userDataConfigProvider{}
return &userDataConfigProvider{
userDataProvider: &imdsUserDataProvider{},
}
}

func (ics *userDataConfigProvider) Provide() (*internalapi.NodeConfig, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's an artifact from when it was something like imdsConfigProvider? 🤣
maybe we could just generally name the config provider references cp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the change, thanks for skipping over the junk to setup mime documents and testing the actual provider interface

Comment on lines +27 to +29
type userDataProvider interface {
GetUserData() ([]byte, error)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice way to make testing the provider interface easier 👍

@cartermckinnon
Copy link
Member Author

/ci
+workflow:os_distros al2023

Copy link
Contributor

@cartermckinnon roger that! I've dispatched a workflow. 👍

Copy link
Contributor

@cartermckinnon the workflow that you requested has completed. 🎉

AMI variantBuildTest
1.23 / al2023success ✅failure ❌
1.24 / al2023success ✅success ✅
1.25 / al2023success ✅success ✅
1.26 / al2023success ✅success ✅
1.27 / al2023success ✅success ✅
1.28 / al2023success ✅success ✅
1.29 / al2023success ✅success ✅
1.30 / al2023success ✅success ✅

@cartermckinnon
Copy link
Member Author

cartermckinnon commented May 16, 2024

One flake:

[Fail] [sig-api-machinery] CustomResourcePublishOpenAPI [Privileged:ClusterAdmin] [It] updates the published spec when one version gets renamed [Conformance]

Not related

@awslabs awslabs deleted a comment from github-actions bot May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nodeadm doesn't support userdata content-encoded with gzip
3 participants