Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow non-static AWS credentials for Route 53, gated by "ambient credentials" flags #363

Merged
merged 12 commits into from Mar 26, 2018

Conversation

euank
Copy link
Contributor

@euank euank commented Feb 28, 2018

This includes all the changes in #243, but also implements the flags discussed in #308 and fixes the failing test.

The flags are implemented such that ambient credentials are only used if the access id and secret key are blank, which I personally think is the correct behavior.

This should not be merged yet because I the following still needs to be done:

  • Manually test each behaviour (each flag on/off and the expected failure/success of empty issuer, but valid ambient credentials)
    • default flags, regular issuer, regular cluster issuer, ambient cluster issuer work, ambient issuer errors
    • --issuer-ambient-credentials results in all four issuers working
    • --cluster-issuer-ambient-credentials=false results in the ambient cluster issuer no longer working
  • Evaluate adding e2e tests, hopefully add (edit: punted, this is difficult with the current e2e setup)
  • Add documentation about flag / clusterissuers

My current status on the above is that I'm having trouble getting the e2e tests to run locally; if I have trouble for longer I may file a separate issue about that.

Release note:

Issuers using the AWS Route53 solver may attempt to find credentials using the environment, EC2 IAM Role, and other sources available to the cert-manager controller. This behavior is on by default for cluster issuers and off by default for issuers. This behavior may be enabled or disabled for all issuers or cluster issuers using the `--issuer-ambient-credentials` and `--cluster-issuer-ambient-credentials` flags on the cert-manager controller.

@jetstack-bot jetstack-bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Feb 28, 2018
@jetstack-bot jetstack-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 28, 2018
@munnerz
Copy link
Member

munnerz commented Feb 28, 2018 via email

@jetstack-bot jetstack-bot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 28, 2018
Copy link
Member

@munnerz munnerz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this PR, and thanks to @mattmoyer for the original work on the AWS provider too.

I share your concerns around ensuring we test this properly - this is in effect a security option, and so we need to ensure it does properly secure what it says. I think explicit 'ambient' variants of the dns provider constructor functions (like you've done here) and then unit tests to ensure they are not called with ambient credentials are disabled works well (at least, for now).

I think we also need to update our documentation to explain how ClusterIssuer and Issuer now differ - up until this point, they have behaved identically out of the box. This could definitely be a surprise to users and without proper documentation, might lead them to think "cert-manager just works some of the time /shrug".

secretAccessKey,
providerConfig.Route53.HostedZoneID,
providerConfig.Route53.Region,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm thinking in order to test this, we could probably just add a unit test that ensures the correct function is called (i.e. ambient or not). We will struggle otherwise to run these tests, as we currently don't mock the entire provider environment (and thus, we have no way to properly e2e test DNS providers at the moment either).

@euank
Copy link
Contributor Author

euank commented Mar 6, 2018

I pushed some additional fixups. I'm hitting some hangups testing this still, so it's still not ready for merge, but a bit closer.

@munnerz munnerz added this to the v0.3 milestone Mar 8, 2018
@euank
Copy link
Contributor Author

euank commented Mar 14, 2018

I added some infrastructure to the dns package to make it easier to unit test this sort of change in #391; unfortunately this and that PR will merge conflict.
Since that PR is significantly simpler, hopefully that one can get in, and then I can rebase this and add unit testing for this on top of the work in that one.

@jetstack-bot jetstack-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 15, 2018
@jetstack-bot jetstack-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 16, 2018
@euank
Copy link
Contributor Author

euank commented Mar 16, 2018

Rebased and added a unit test, I'm still not happy with the test coverage of this and still don't consider it quite ready for merge.

I do think review of what's here is reasonable though

Copy link
Member

@munnerz munnerz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me! Let me know when you're happy with this or need me to take another look 😄

// https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html#specifying-credentials
} else {
glog.V(5).Infof("not using ambient credentials")
config.WithCredentials(credentials.NewStaticCredentials(accessKeyID, secretAccessKey, ""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these credentials are invalid, will the default chain still be used? (i.e. does WithCredentials append to the chain, or replace?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It replaces the default credential chain. Even if they're an empty string, it will error out that they're invalid without trying anything else.

}
}
return m, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether this test file was meant to be removed? Can it be reworked for the new constructors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentional.

My understanding was this isn't being run in CI, and even if it were, these tests are really just testing the aws sdk, not testing any of the code here as far as I could tell

args: []interface{}{"", "", "", "us-west-2", true},
},
},
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a corresponding fixture where ambient is set to false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

mattmoyer and others added 7 commits March 23, 2018 14:30
This change maintains backwards compatibility, but makes the `accessKeyID` and `secretAccessKeySecretRef` fields of the `route53` DNS provider optional.
If not provided, AWS credentials will be loaded from `AWS_*` environment variables or the EC2 metadata service.
This should also work for things that impersonate the EC2 metadata service, such as [kube2iam](https://github.com/jtblin/kube2iam) and [kail](https://github.com/uswitch/kiam).

Signed-off-by: Matt Moyer <moyer@heptio.com>
Signed-off-by: Matt Moyer <moyer@heptio.com>
The zone id is never read from the environment; this test tests
functionality which doesn't exist in the actual software, so there's no
point in having it.
This implements ambient credential support for AWS, gated behind flags
for issuers and cluster issuers.

This adds the pair of flags discussed in
cert-manager#308.

It provides an implementation for those flag's effects for the route53
solver.
I'm convinced this test was never run and also did not provide any
significant value in this project.
@euank euank force-pushed the nonstatic-aws-creds branch 3 times, most recently from 48099de to e6884f1 Compare March 24, 2018 00:25
@euank
Copy link
Contributor Author

euank commented Mar 24, 2018

@munnerz I'm happy with this now. You should definitely do a pass on the docs I added since I'm sure there's a better way to explain some of that, but I've manually tested the code and think this is an okay to merge from my end.

I do wish there was a better way to deal with e2e testing this, but that seems like quite a rabbit hole to go down right now. I may do so later if my time allows.

Copy link
Member

@munnerz munnerz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor pending question around how the REGION environment variable is handled, other this looks good to me 😄 thanks for all the work!

@@ -35,6 +35,9 @@ spec:
key: api-key.txt
- name: route53
route53:
# Credentials may optionally be omitted to use ambient credentials if
# --issuer-ambient-credentials is enabled and
# ambient credentials are available to cert-manager.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering - would this be specified with something like:

- name: route53
  route53: {}

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I should read the docs 😄

- name: route53
  route53:
    region: us-east1-a


They will only be used if no credentials are supplied, even if the supplied credentials are incorrect.

By default, they may be used by cluster-issuers, but not regular issuers. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/cluster-issuers/ClusterIssuers

`--cluster-issuer-ambient-credentials=false` flags on the cert-manager may be
used to override this behavior.

Note that ambient credentials are disabled for regular 'Issuer's by default to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Issuer's? I think this should just be Issuers?


## Ambient Credentials

By default, a Cluster Issuer will be able to use 'ambient credentials' of the 'cert-manager' deployment for supported challenges. Currently, only the 'route53' challenge makes use of ambient credentials. To learn more about this behavior, see the [ambient credentials][ambient-creds] document.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, only the 'route53' ACME DNS challenge validation makes...

const testZoneID = "testzoneid"

func TestAmbientRegionFromEnv(t *testing.T) {
os.Setenv("AWS_REGION", "us-east-1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like if the 'region' field is not set in the Issuer resource, we use the environment to determine which to use (this var).

I guess this happens as part of credential loading in the default chain? & additionally, if AWS_REGION is set, but ambient credentials are disabled, will the region still be used? It's not such a big security concern, but it could cause strange behaviour for users and mask the fact they've not set the region explicitly in the YAML.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Region handling is distinct from credential chains. The two sources that will be consulted for region are that environment variable, and the 'shared config' (the ini file at ~/.aws/config). Notably, ec2 metadata isn't consulted.

The current implementation of the ambient option does not impact those settings. I didn't really think about that, so I left it as it was before.

The region thing is a bit silly here since route53 is a global service, so the region only really matters if you want to reduce api latency by hitting a nearer server, or if you're using a different aws partition (like aws china or govcloud).

I'll go ahead and make it respect the ambient config for now, but I think the above may be an argument for defaulting regardless it in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit to do the above, and updated those unit tests.

This notably results in the region being a required field if the
'ambient' option is not set for a given issuer.
@euank
Copy link
Contributor Author

euank commented Mar 25, 2018

/retest

unrelated flake

@munnerz
Copy link
Member

munnerz commented Mar 26, 2018

/lgtm
/approve

🎉 thanks for this @mattmoyer and @euank! Lots of work done here, but I think we've laid out a good framework and best practice for how we handle this in future, and can start rolling out similar changes for other providers too 😄.

Really appreciate it, it's not easy finding time to go back and forth on PRs like this!

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 26, 2018
@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants