Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Amazon's IMDSv2 #249

Merged
merged 4 commits into from Dec 3, 2020
Merged

Conversation

RichVanderwal
Copy link
Contributor

@RichVanderwal RichVanderwal commented Nov 25, 2020

This implements a newer method of authentication to AWS instance metadata endpoints.

Under IMDSv1, the agent simply called to an endpoint and received JSON with information about the instance size, availability zone, and other facts.

This newer IMDSv2 implements a session protocol. The agent calls two endpoints: one to get a session token that's valid for a specified time, and then the original endpoint with the session token to get the instance information.

As of this writing, AWS normally accepts both IMDSv1 and IMDSv2. A customer can tell AWS to only allow IMDSv2, so we're fine if we simply perform IMDSv2 calls exclusively.

Tests for these changes will need to be higher-level integration tests, since they rely on AWS services. Let me know if you think otherwise!

Links

Amazon's IMDSv2 Documentation
New Relic Java Agent Implementation

Details

One decision still in flight is what the timeout values should be for each call to the two endpoints. Since the original timeout was one second, I've split the difference and set the timeout for both calls to 500 milliseconds. These timeouts could be reduced, since the endpoints are reported to be very fast by the documentation above (< 10ms). Other New Relic language agents have other implementations, including a 100 millisecond timeout, and a back-off scheme using 15, 30, 60, 120, and 300 milliseconds.

@CLAassistant
Copy link

CLAassistant commented Nov 25, 2020

CLA assistant check
All committers have signed the CLA.

@RichVanderwal RichVanderwal changed the title Implement Implement Amazon's IMDSv2 Nov 25, 2020
@RichVanderwal RichVanderwal self-assigned this Nov 25, 2020
@RichVanderwal RichVanderwal added enhancement p1 Highest priority work items ready-for-review labels Nov 25, 2020
@RichVanderwal RichVanderwal linked an issue Nov 25, 2020 that may be closed by this pull request
@RichVanderwal RichVanderwal removed the request for review from krnowak November 25, 2020 00:38
Copy link

@krnowak krnowak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's not ready for review, but just found it in the review queue in email, so just had a quick look.

Comment on lines 83 to 84
ret = nil
err = unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should return here, if an error happened, right?

Suggested change
ret = nil
err = unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)}
return nil, unexpectedAWSErr{e: fmt.Errorf("error contacting AWS IMDSv2 token endpoint, %v", err)}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, maybe you could create a request to awsEndpoint before calling getAWSToken and add the token to the request's headers only if err != nil. This would make IMDSv2 optional in case someone wants to stick with IMDSv1 (if that is even possible).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing this! After thinking harder about it, we shouldn't throw an unExpectedAWSErr at all, we're expecting a timeout unless they're in AWS. Fixing this and the test.

A more thorough implementation would raise errors on other HTTP errors, but I'm deferring that work in the interest of time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the extra request to awsEndpoint, I'd rather avoid making three calls to AWS and increasing the overhead by another 0.5 second on startup. As of now, IMDSv2 endpoints are always available -- the only control the customer has is to turn IMDSv1 off to help their security posture.

awsTokenEndpointPath = "/latest/api/token"
awsEndpoint = "http://" + awsHostname + awsEndpointPath
awsTokenEndpoint = "http://" + awsHostname + awsTokenEndpointPath
awsTokenTTL = "60" // seconds this AWS utiliation session will last
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
awsTokenTTL = "60" // seconds this AWS utiliation session will last
awsTokenTTL = "60" // seconds this AWS utilization session will last

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@RichVanderwal RichVanderwal removed the request for review from krnowak December 3, 2020 17:44
@RichVanderwal RichVanderwal merged commit d57e5db into newrelic:develop Dec 3, 2020
@RichVanderwal RichVanderwal deleted the IMDSv2 branch December 3, 2020 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement p1 Highest priority work items
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AWS IMDS V2 Support
4 participants