feature: allow arbitrarily named AWS IAM roles to be used by nodes joining a cluster #11356

Closed
nicktrav opened this issue Apr 29, 2021 · 13 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@nicktrav (Contributor)

1. Describe IN DETAIL the feature/behavior/change you would like to see.

We have a use case where we'd like to add a node to an existing cluster using an AWS IAM role that is different from the one kops-controller is set up to look for (most likely a superset of its permissions). Because the role doesn't have the name kops-controller expects, the node cannot be admitted.

To highlight the issue: as of kOps 1.19, kops-controller authenticates a node when it attempts to join the cluster (implemented in #9653). Part of this process uses the AWS STS API to determine the node's role (the request is built in aws_authenticator.go). The returned role is then checked against the role(s) kops-controller was configured to accept (in aws_verifier.go).

The roles that kops-controller accepts are assembled in template_functions.go, and are inferred from the instance profile.

The problem I'm facing: when the node joining the cluster has a role that is not in the list of "allowed" roles (inferred from the IAM profile of the instance running kops-controller, or from the instance profile name), the node cannot join the cluster, and a 403 response is returned.
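To make the mismatch concrete, here is an illustrative sketch (the role and cluster names are made up, and this is pseudo-config rather than the actual kops-controller schema):

```yaml
# Illustrative only -- not the real kops-controller config format.
# Role names kops-controller accepts, inferred from the cluster's
# instance profiles:
allowedRoles:
  - masters.mycluster.example.com
  - nodes.mycluster.example.com

# Role presented by the joining node via STS:
#   arn:aws:iam::123456789012:role/my-custom-node-role
# "my-custom-node-role" is not in allowedRoles, so the join is
# rejected with a 403.
```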

I'm wondering if anyone has thoughts on how we might configure a new node with a different IAM role, such that it will be accepted by kops-controller when attempting to join.

2. Feel free to provide a design supporting your feature request.

Not quite sure. Perhaps some way of opting out of the AWS IAM role validation? I'm missing some context/background on why and how the IAM role is used or required, versus just validating that the instance running the node exists (i.e., why check the role at all, rather than only verifying the instance).

Prior to 1.19, any IAM role for the node could be used, provided it contained the requisite permissions to fetch config from the state store, etc.

Perhaps I'm just misunderstanding the current flow and this is already achievable, in which case, great!

@johngmyers (Member)

The check is to ensure that the request is indeed coming from an instance that belongs to the relevant ASG.

Information about configuring an instance group's instances to have additional or externally defined policies is in the documentation.
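For reference, a minimal sketch of the documented cluster-spec mechanism for granting extra permissions to the managed node role (the bucket name and policy statement are illustrative):

```yaml
# Cluster spec excerpt: additionalPolicies attaches extra inline
# policy statements to the kops-managed node role.
spec:
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": ["arn:aws:s3:::my-extra-bucket/*"]
        }
      ]
```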

@nicktrav (Contributor, Author)

> The check is to ensure that the request is indeed coming from an instance that belongs to the relevant ASG.

Is there any risk to allowing a user to optionally disable this role check? For example, if I'd like to admit a node that is not part of the ASG and has a different role?

@johngmyers (Member)

The risk is the instance could be under control of an attacker, allowing that attacker to escalate to kubelet-level privileges with the API server and obtain the other node-level credentials that kops-controller mints.

@EladDolev (Contributor)

One use case (which I've been relying on for years) is rendering kops output as Terraform code and overriding the IAM roles for different instance groups. The problem is that the built-in kOps mechanism for this only works if all IAM roles/profiles are controlled from outside of kOps, which is not very convenient. From the documentation:

> NOTE: Currently kOps only supports using existing instance profiles for every instance group in the cluster, not a mix of existing and managed instance profiles. This is due to the lifecycle overrides being used to prevent creation of the IAM-related resources.

@johngmyers (Member)

@EladDolev then I believe your request is to allow a mix of existing and managed instance profiles.

@EladDolev (Contributor)

indeed :-)

@johngmyers (Member)

@EladDolev that would seem to be a different ticket. It's not clear to me why that wouldn't work. Please run a test and file an issue to either change the code or the documentation, as appropriate.

@EladDolev (Contributor)

I'm just proposing an approach for implementing the feature requested in this ticket: if kOps supported a mix of existing and managed instance profiles, this would be very easy. The documentation is right; currently it is not supported.

> I'm wondering if anyone has thoughts on how we might configure a new node with a different IAM role, such that it will be accepted by kops-controller when attempting to join.

Currently, with kops-controller, this is only possible by creating the role and the instance profile under the same name and passing spec.iam.profile to the instance group. The catch is that to make this work, you need to create and pass IAM roles/profiles for all instance groups, masters included.

TBH I don't know the code, but it sounds quite simple to just remove the all-or-nothing condition (either pass all IAM roles or none of them) and, whenever spec.iam.profile is passed, behave as if --lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges were set.
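For concreteness, the per-instance-group workaround described above looks roughly like this (the cluster name and profile ARN are illustrative), combined with the --lifecycle-overrides flags quoted above on kops update cluster:

```yaml
# InstanceGroup excerpt: spec.iam.profile points the group at an
# existing instance profile (ARN is illustrative). Currently this must
# be done for every instance group in the cluster, masters included.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: mycluster.example.com
  name: nodes
spec:
  iam:
    profile: arn:aws:iam::123456789012:instance-profile/kops-custom-node-role
```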

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 11, 2021
@olemarkus (Member)

If someone writes a PR for any of these features, we can review it. This will probably not be prioritised otherwise.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 11, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue.

In response to this:

> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
