Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only apply external policies when these are defined #9867

Merged
merged 1 commit into from
Sep 10, 2020
Merged

Only apply external policies when these are defined #9867

merged 1 commit into from
Sep 10, 2020

Conversation

kesor
Copy link
Contributor

@kesor kesor commented Sep 3, 2020

We experience fatal errors with this new feature (that we are not using) because some of the instance groups in our clusters include spec.iam.profile while others use the managed nodes role/profile. The ExternalPolicies feature is broken, and this makes it go away when not specifically asked for in the cluster configuration.

At the moment, without this patch, the kops update cluster operation results in -

F0903 17:45:44.372520   48506 task.go:59] found duplicate tasks with name "IAMRolePolicy/node-policyoverride": *awstasks.IAMRolePolicy {"ID":null,"Lifecycle":"Sync","Name":"node-policyoverride","Role":{"ID":null,"Lifecycle":"Sync","Name":"nodes.mycluster.domain.com","RolePolicyDocument":{"Name":"","Resource":{}},"ExportWithID":"nodes"},"PolicyDocument":null,"ExternalPolicies":null,"Managed":true} and *awstasks.IAMRolePolicy {"ID":null,"Lifecycle":"Sync","Name":"node-policyoverride","Role":{"ID":null,"Lifecycle":"Sync","Name":"secondgroup.mycluster.domain.com","RolePolicyDocument":{"Name":"","Resource":{}},"ExportWithID":"nodes"},"PolicyDocument":null,"ExternalPolicies":null,"Managed":true}

With this patch, everything works correctly without fatal errors.

@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Sep 3, 2020
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 3, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @kesor!

It looks like this is your first PR to kubernetes/kops 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kops has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @kesor. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 3, 2020
@mikesplain
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 3, 2020
@johngmyers
Copy link
Member

The usual process is to PR into master branch, then cherry-pick into a release branch. Please send a PR into master branch instead.

@hakman hakman changed the base branch from release-1.18 to master September 4, 2020 06:22
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 4, 2020
@hakman hakman changed the base branch from master to release-1.18 September 4, 2020 06:22
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 4, 2020
@hakman
Copy link
Member

hakman commented Sep 4, 2020

@johngmyers I don't think this PR is correct, or in any case has nothing to do with the original issue.
@kesor I am still trying to reproduce this and I am having a hard time as the initial issue doesn't have the IGs in the cluster manifest.

PS: sorry for breaking the tests, but should be easy to fix.

@kesor kesor changed the base branch from release-1.18 to release September 4, 2020 09:32
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. do-not-merge/contains-merge-commits and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 4, 2020
@kesor kesor changed the base branch from release to master September 4, 2020 09:34
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 4, 2020
@kesor
Copy link
Contributor Author

kesor commented Sep 4, 2020

The usual process is to PR into master branch, then cherry-pick into a release branch. Please send a PR into master branch instead.

rebased on master

@kesor
Copy link
Contributor Author

kesor commented Sep 4, 2020

@johngmyers I don't think this PR is correct, or in any case has nothing to do with the original issue.

Which "original issue"? The one I describe in the PR description?
I have been chasing for the reason why trying to upgrade our clusters to 1.18 fails, and eventually foundthat found duplicate tasks with name "IAMRolePolicy/node-policyoverride" is directly related to the externalPolicies feature which we are not using, while the clusters work just fine with 1.17 as-is right now.

@kesor I am still trying to reproduce this and I am having a hard time as the initial issue doesn't have the IGs in the cluster manifest.

Which initial issue?

@kesor
Copy link
Contributor Author

kesor commented Sep 4, 2020

/retest

@hakman hakman modified the milestones: v1.18, v1.19 Sep 4, 2020
@hakman
Copy link
Member

hakman commented Sep 4, 2020

@kesor I thought it's related to #9852.

I am trying to understand how to reproduce this to see if there is anything else that has to be considered.

@kesor
Copy link
Contributor Author

kesor commented Sep 4, 2020

@kesor I thought it's related to #9852.

I am trying to understand how to reproduce this to see if there is anything else that has to be considered.

This is unrelated. The bug in #9852 talks about the issue of shared iam.profile not working. This is kind of mentioned in the documentation for IAM in a "NOTE" quote.

This issue is talking about, if and when iam.profile is "correctly" used, then the new 1.18 feature externalPolicies is actually in conflict with it and causes a bug that can be avoided by "closing off" this externalPolicies new feature completely, and not scheduling its AddTask unless there were actually some externalPolicies present in the cluster spec.

@kesor
Copy link
Contributor Author

kesor commented Sep 4, 2020

/remove-label do-not-merge/contains-merge-commits

@k8s-ci-robot
Copy link
Contributor

@kesor: The label(s) /remove-label do-not-merge/contains-merge-commits cannot be applied. These labels are supported: api-review, community/discussion, community/maintenance, community/question, cuj/build-train-deploy, cuj/multi-user, platform/aws, platform/azure, platform/gcp, platform/minikube, platform/other, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash

In response to this:

/remove-label do-not-merge/contains-merge-commits

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubernetes kubernetes deleted a comment from k8s-ci-robot Sep 4, 2020
Copy link
Member

@hakman hakman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it would be helpful if you would open first an issue with an easy way to reproduce.

For this case, the proposed change indeed skips the AddTask for externalPolicies which seems reasonable at first sight. Except that the pattern used by kops for tasks is to reuse them for deleting and updating the resources and skipping the task will make it impossible to disable the external policies once they are applied.

IMHO the correct solution would be to check if the IAM profile is not shared (meaning is overridden in the IG spec) and add the task only for the kops managed profile.

pkg/model/iam.go Outdated Show resolved Hide resolved
@hakman
Copy link
Member

hakman commented Sep 10, 2020

One more thing, would you mind adding the same condition for the next code block? I think there is a similar issue there.

kops/pkg/model/iam.go

Lines 265 to 267 in 036ea69

// Generate additional policies if needed, and attach to existing role
{
additionalPolicy := ""

@kesor
Copy link
Contributor Author

kesor commented Sep 10, 2020

One more thing, would you mind adding the same condition for the next code block? I think there is a similar issue there.

kops/pkg/model/iam.go

Lines 265 to 267 in 036ea69

// Generate additional policies if needed, and attach to existing role
{
additionalPolicy := ""

@hakman We are using additionalPolicies in our clusters, and it works fine. So adding a condition here, will actually break things! So no ... addtionalPolicies and externalPolicies are apparently different enough that its a bad idea to do it.

@hakman
Copy link
Member

hakman commented Sep 10, 2020

From my point of view, additional policies should not be combined with existing IAM profiles.
kops should not modify the existing IAM profiles, because those could be used in multiple clusters, with different additional policies.

Could you maybe explain what would be the use case for it?

@kesor
Copy link
Contributor Author

kesor commented Sep 10, 2020

From my point of view, additional policies should not be combined with existing IAM profiles.
kops should not modify the existing IAM profiles, because those could be used in multiple clusters, with different additional policies.

Could you maybe explain what would be the use case for it?

Maybe it shouldn't ... but it does, and that means that closing it off behind a condition will break things.

The idea is that even though the IAM profile was created before the IG, it doesn't mean that the person managing the cluster should loose the ability to affect ALL IAM ROLES OF ALL IGs from one central location - which is the cluster additionalPolicy.node: location.

In any case, its another issue and doesn't really have to do with externalPolicies breaking 1.18.

@hakman
Copy link
Member

hakman commented Sep 10, 2020

I understand your point of view, but the external/shared policies should only be modified externally.
I agree that it is a similar, but different issue.
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 10, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman, kesor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 10, 2020
@k8s-ci-robot k8s-ci-robot merged commit 0a428be into kubernetes:master Sep 10, 2020
@kesor kesor deleted the v1.18.0-fix-external-policies branch September 10, 2020 17:54
k8s-ci-robot added a commit that referenced this pull request Sep 16, 2020
…9924-upstream-release-1.18

Automated cherry pick of #9867: only apply external policy tasks on non-shared iam #9924: Only add additional policies to kops managed IAMRoles
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants