feature: allow arbitrarily named AWS IAM roles to be used by nodes joining a cluster #11356

Closed
nicktrav opened this issue Apr 29, 2021 · 13 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@nicktrav (Contributor)

1. Describe IN DETAIL the feature/behavior/change you would like to see.

We have a use case where we'd like to add a node to an existing cluster using an AWS IAM role that is different from the one kops-controller is set up to look for (most likely a superset of its permissions). Because the role doesn't have the name kops-controller expects, the node cannot be admitted.

To highlight the issue: as of kOps 1.19, kops-controller authenticates a node when it attempts to join the cluster (implemented in #9653). Part of this process uses the AWS STS API to determine the node's role (the request is built in aws_authenticator.go). The returned role is then checked against the role(s) kops-controller was configured to accept (in aws_verifier.go).

The roles that kops-controller accepts are assembled in template_functions.go, and are inferred from the instance profile.

The problem I'm facing: when the node joining the cluster has a role that is not in the list of "allowed" roles (inferred from the IAM profile of the instance running kops-controller, or from the instance profile name), the node cannot join the cluster, and a 403 response is returned.
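To make the mismatch concrete, here is an illustrative sketch (the role and cluster names are made up, and this is pseudo-config rather than the actual kops-controller schema):

```yaml
# Illustrative only -- not the real kops-controller config format.
# Role names kops-controller accepts, inferred from the cluster's
# instance profiles:
allowedRoles:
  - masters.mycluster.example.com
  - nodes.mycluster.example.com

# Role presented by the joining node via STS:
#   arn:aws:iam::123456789012:role/my-custom-node-role
# "my-custom-node-role" is not in allowedRoles, so the join is
# rejected with a 403.
```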

I'm wondering if anyone has thoughts on how we might configure a new node with a different IAM role, such that it will be accepted by kops-controller when attempting to join.

2. Feel free to provide a design supporting your feature request.

Not quite sure. Perhaps some way of opting out of the AWS IAM role validation? I'm missing some context/background on why and how the IAM role is used or required, versus just validating that the instance running the node exists (i.e., why check the role at all, rather than only verifying the instance).

Prior to 1.19, any IAM role for the node could be used, provided it contained the requisite permissions to fetch config from the state store, etc.

Perhaps I'm just misunderstanding the current flow and this is already achievable, in which case, great!

@johngmyers (Member)

The check is to ensure that the request is indeed coming from an instance that belongs to the relevant ASG.

Information about configuring an instance group's instances to have additional or externally defined policies is in the documentation.
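For reference, a minimal sketch of the documented cluster-spec mechanism for granting extra permissions to the managed node role (the bucket name and policy statement are illustrative):

```yaml
# Cluster spec excerpt: additionalPolicies attaches extra inline
# policy statements to the kops-managed node role.
spec:
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": ["arn:aws:s3:::my-extra-bucket/*"]
        }
      ]
```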

@nicktrav (Contributor, Author)

> The check is to ensure that the request is indeed coming from an instance that belongs to the relevant ASG.

Is there any risk to allowing a user to optionally disable this role check? For example, if I'd like to admit a node that is not part of the ASG and has a different role?

@johngmyers (Member)

The risk is the instance could be under control of an attacker, allowing that attacker to escalate to kubelet-level privileges with the API server and obtain the other node-level credentials that kops-controller mints.

@EladDolev (Contributor)

One use case (which I've been relying on for years) is rendering kops output as Terraform code and overriding the IAM roles for different instance groups. The problem is that the built-in kOps mechanism for this only works if all IAM roles/profiles are controlled from outside of kOps, which is not very convenient. From the documentation:

> NOTE: Currently kOps only supports using existing instance profiles for every instance group in the cluster, not a mix of existing and managed instance profiles. This is due to the lifecycle overrides being used to prevent creation of the IAM-related resources.

@johngmyers (Member)

@EladDolev then I believe your request is to allow a mix of existing and managed instance profiles.

@EladDolev (Contributor)

indeed :-)

@johngmyers (Member)

@EladDolev that would seem to be a different ticket. It's not clear to me why that wouldn't work. Please run a test and file an issue to either change the code or the documentation, as appropriate.

@EladDolev (Contributor)

I'm just proposing an approach for implementing the feature requested in this ticket: if kOps supported a mix of existing and managed instance profiles, this would be very easy. The documentation is right; currently it is not supported.

> I'm wondering if anyone has thoughts on how we might configure a new node with a different IAM role, such that it will be accepted by kops-controller when attempting to join.

Currently, with kops-controller, this is only possible by creating the role and the instance profile under the same name and passing spec.iam.profile to the instance group. The catch is that to make this work, you need to create and pass IAM roles/profiles for all instance groups, masters included.

TBH I don't know the code, but it sounds quite simple to just remove the all-or-nothing condition (either pass all IAM roles or none of them) and, whenever spec.iam.profile is passed, behave as if --lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges were set.
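For concreteness, the per-instance-group workaround described above looks roughly like this (the cluster name and profile ARN are illustrative), combined with the --lifecycle-overrides flags quoted above on kops update cluster:

```yaml
# InstanceGroup excerpt: spec.iam.profile points the group at an
# existing instance profile (ARN is illustrative). Currently this must
# be done for every instance group in the cluster, masters included.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: mycluster.example.com
  name: nodes
spec:
  iam:
    profile: arn:aws:iam::123456789012:instance-profile/kops-custom-node-role
```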

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 11, 2021
@olemarkus (Member)

If someone writes a PR for any of these features, we can review it. This will probably not be prioritised otherwise.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 11, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue.

In response to this:

> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
