
[WIP] Drop IAM roles for aws master/workers #697

Closed
wants to merge 2 commits into from

Conversation

rajatchopra
Contributor

This is a first step towards granularizing all IAM-role credentials, so that each credential is stored within the scope of the component/operator that requires it and grants only what that component needs.

This is a work in progress; the author still needs to understand everything that will break.

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 16, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rajatchopra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 16, 2018
@@ -57,7 +57,6 @@ func TFVars(cfg *types.InstallConfig, bootstrapIgn, masterIgn, workerIgn string)
if m.Platform.AWS != nil {
config.AWS.Master = aws.Master{
EC2Type: m.Platform.AWS.InstanceType,
IAMRoleName: m.Platform.AWS.IAMRoleName,
Contributor
I think this forces Terraform to create one for you.

Member

I think this forces Terraform to create one for you.

I haven't tracked that down, but we'll also want to remove/prune some of:

$ git grep iam_role origin/pr/697
origin/pr/697:data/data/aws/bootstrap/main.tf:  role = "${var.iam_role == "" ?
origin/pr/697:data/data/aws/bootstrap/main.tf:    join("|", aws_iam_role.bootstrap.*.name) :
origin/pr/697:data/data/aws/bootstrap/main.tf:    join("|", data.aws_iam_role.bootstrap.*.name)
origin/pr/697:data/data/aws/bootstrap/main.tf:data "aws_iam_role" "bootstrap" {
origin/pr/697:data/data/aws/bootstrap/main.tf:  count = "${var.iam_role == "" ? 0 : 1}"
origin/pr/697:data/data/aws/bootstrap/main.tf:  name  = "${var.iam_role}"
origin/pr/697:data/data/aws/bootstrap/main.tf:resource "aws_iam_role" "bootstrap" {
origin/pr/697:data/data/aws/bootstrap/main.tf:  count = "${var.iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/bootstrap/main.tf:resource "aws_iam_role_policy" "bootstrap" {
origin/pr/697:data/data/aws/bootstrap/main.tf:  count = "${var.iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/bootstrap/main.tf:  role  = "${aws_iam_role.bootstrap.id}"
origin/pr/697:data/data/aws/bootstrap/variables.tf:variable "iam_role" {
origin/pr/697:data/data/aws/iam/main.tf:  role = "${var.worker_iam_role == "" ?
origin/pr/697:data/data/aws/iam/main.tf:    join("|", aws_iam_role.worker_role.*.name) :
origin/pr/697:data/data/aws/iam/main.tf:    join("|", data.aws_iam_role.worker_role.*.name)
origin/pr/697:data/data/aws/iam/main.tf:data "aws_iam_role" "worker_role" {
origin/pr/697:data/data/aws/iam/main.tf:  count = "${var.worker_iam_role == "" ? 0 : 1}"
origin/pr/697:data/data/aws/iam/main.tf:  name  = "${var.worker_iam_role}"
origin/pr/697:data/data/aws/iam/main.tf:resource "aws_iam_role" "worker_role" {
origin/pr/697:data/data/aws/iam/main.tf:  count = "${var.worker_iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/iam/main.tf:resource "aws_iam_role_policy" "worker_policy" {
origin/pr/697:data/data/aws/iam/main.tf:  count = "${var.worker_iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/iam/main.tf:  role  = "${aws_iam_role.worker_role.id}"
origin/pr/697:data/data/aws/iam/variables.tf:variable "worker_iam_role" {
origin/pr/697:data/data/aws/main.tf:  iam_role                    = "${var.tectonic_aws_master_iam_role_name}"
origin/pr/697:data/data/aws/main.tf:  master_iam_role          = "${var.tectonic_aws_master_iam_role_name}"
origin/pr/697:data/data/aws/main.tf:  worker_iam_role = "${var.tectonic_aws_worker_iam_role_name}"
origin/pr/697:data/data/aws/master/main.tf:  role = "${var.master_iam_role == "" ?
origin/pr/697:data/data/aws/master/main.tf:    join("|", aws_iam_role.master_role.*.name) :
origin/pr/697:data/data/aws/master/main.tf:    join("|", data.aws_iam_role.master_role.*.name)
origin/pr/697:data/data/aws/master/main.tf:data "aws_iam_role" "master_role" {
origin/pr/697:data/data/aws/master/main.tf:  count = "${var.master_iam_role == "" ? 0 : 1}"
origin/pr/697:data/data/aws/master/main.tf:  name  = "${var.master_iam_role}"
origin/pr/697:data/data/aws/master/main.tf:resource "aws_iam_role" "master_role" {
origin/pr/697:data/data/aws/master/main.tf:  count = "${var.master_iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/master/main.tf:resource "aws_iam_role_policy" "master_policy" {
origin/pr/697:data/data/aws/master/main.tf:  count = "${var.master_iam_role == "" ? 1 : 0}"
origin/pr/697:data/data/aws/master/main.tf:  role  = "${aws_iam_role.master_role.id}"
origin/pr/697:data/data/aws/master/variables.tf:variable "master_iam_role" {
origin/pr/697:data/data/aws/variables-aws.tf:variable "tectonic_aws_master_iam_role_name" {
origin/pr/697:data/data/aws/variables-aws.tf:variable "tectonic_aws_worker_iam_role_name" {
origin/pr/697:pkg/tfvars/aws/aws.go:    IAMRoleName      string            `json:"tectonic_aws_master_iam_role_name,omitempty" yaml:"iamRoleName,omitempty"`
origin/pr/697:pkg/tfvars/aws/aws.go:    IAMRoleName      string            `json:"tectonic_aws_worker_iam_role_name,omitempty" yaml:"iamRoleName,omitempty"`

as well. I'm not sure about the bootstrap role. Maybe we should keep that?
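For context, the pattern in those grep hits is Terraform's count-based conditional: when the `iam_role` variable is empty the module creates a role, otherwise it looks up the existing role by name, and `join()` over the two mutually exclusive lists picks whichever one is populated. A minimal sketch of that pattern, reassembled from the grep output above (the `assume_role_policy` placeholder and the `role_name` local are illustrative, not the exact module code):

```hcl
variable "iam_role" {
  type    = "string"
  default = ""
}

# Look up an existing role when the caller supplies a name.
data "aws_iam_role" "bootstrap" {
  count = "${var.iam_role == "" ? 0 : 1}"
  name  = "${var.iam_role}"
}

# Otherwise create one.
resource "aws_iam_role" "bootstrap" {
  count              = "${var.iam_role == "" ? 1 : 0}"
  assume_role_policy = "..." # trust policy elided
}

# Exactly one of the two lists above is non-empty, so join() over the
# splatted names yields the single active role name either way.
locals {
  role_name = "${var.iam_role == "" ?
    join("|", aws_iam_role.bootstrap.*.name) :
    join("|", data.aws_iam_role.bootstrap.*.name)}"
}
```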

Member

Also should prune:

$ git grep -i iamrole origin/pr/697 -- pkg
origin/pr/697:pkg/tfvars/aws/aws.go:    IAMRoleName      string            `json:"tectonic_aws_master_iam_role_name,omitempty" yaml:"iamRoleName,omitempty"`
origin/pr/697:pkg/tfvars/aws/aws.go:    IAMRoleName      string            `json:"tectonic_aws_worker_iam_role_name,omitempty" yaml:"iamRoleName,omitempty"`
origin/pr/697:pkg/types/aws/machinepool.go:     // IAMRoleName defines the IAM role associated
origin/pr/697:pkg/types/aws/machinepool.go:     IAMRoleName string `json:"iamRoleName"`
origin/pr/697:pkg/types/aws/machinepool.go:     if required.IAMRoleName != "" {
origin/pr/697:pkg/types/aws/machinepool.go:             a.IAMRoleName = required.IAMRoleName

and related.

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 19, 2018
@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 19, 2018
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 19, 2018
@rajatchopra rajatchopra force-pushed the iam_roles branch 4 times, most recently from c85bdb7 to d5ba752 Compare November 20, 2018 19:38
@abhinavdahiya
Contributor

So it looks like

[core@ip-10-0-0-241 ~]$ journalctl -fu kubelet
-- Logs begin at Mon 2018-11-26 21:50:51 UTC. --
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: Flag --client-ca-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: I1126 21:57:05.079437    4543 server.go:418] Version: v1.11.0+d4cacc0
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: I1126 21:57:05.079878    4543 aws.go:1032] Building AWS cloudprovider
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: I1126 21:57:05.079917    4543 aws.go:994] Zone not specified in configuration file; querying AWS metadata service
Nov 26 21:57:05 ip-10-0-0-241 hyperkube[4543]: F1126 21:57:05.084483    4543 server.go:262] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0fc2564fadb34ac2f: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""
Nov 26 21:57:05 ip-10-0-0-241 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Nov 26 21:57:05 ip-10-0-0-241 systemd[1]: Failed to start Kubernetes Kubelet.
Nov 26 21:57:05 ip-10-0-0-241 systemd[1]: Unit kubelet.service entered failed state.
Nov 26 21:57:05 ip-10-0-0-241 systemd[1]: kubelet.service failed.
Nov 26 21:57:15 ip-10-0-0-241 systemd[1]: kubelet.service holdoff time over, scheduling restart.
Nov 26 21:57:15 ip-10-0-0-241 systemd[1]: Stopped Kubernetes Kubelet.
Nov 26 21:57:15 ip-10-0-0-241 systemd[1]: Starting Kubernetes Kubelet...
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --rotate-certificates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --allow-privileged has been deprecated, will be removed in a future version
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --minimum-container-ttl-duration has been deprecated, Use --eviction-hard or --eviction-soft instead. Will be removed in a future version.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --client-ca-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --rotate-certificates has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --allow-privileged has been deprecated, will be removed in a future version
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --minimum-container-ttl-duration has been deprecated, Use --eviction-hard or --eviction-soft instead. Will be removed in a future version.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --client-ca-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: I1126 21:57:15.301090    4558 server.go:418] Version: v1.11.0+d4cacc0
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: I1126 21:57:15.301374    4558 aws.go:1032] Building AWS cloudprovider
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: I1126 21:57:15.301405    4558 aws.go:994] Zone not specified in configuration file; querying AWS metadata service
Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: F1126 21:57:15.305323    4558 server.go:262] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0fc2564fadb34ac2f: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""

The kubelet on the master nodes is not running due to missing AWS credentials.
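For background on why dropping the instance profile produces exactly this failure: the AWS SDK resolves credentials through a default chain, trying environment variables, then the shared credentials file, then the EC2 instance profile via the metadata service. With the instance profile removed and nothing else configured on the node, every link fails and the SDK reports `NoCredentialProviders`. A simplified model of that chain (not the SDK's actual implementation, just the resolution order):

```python
from typing import Optional


def resolve_credentials(env: dict, has_shared_file: bool,
                        has_instance_profile: bool) -> Optional[str]:
    """Simplified model of the AWS SDK default credential chain:
    providers are tried in order and the first that yields
    credentials wins; if none do, the chain fails."""
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    if has_shared_file:
        return "shared-credentials-file"
    if has_instance_profile:
        return "instance-profile"
    # -> "NoCredentialProviders: no valid providers in chain"
    return None


# A node with an IAM instance profile resolves via the metadata service...
assert resolve_credentials({}, False, True) == "instance-profile"
# ...but with the profile dropped and nothing else on the node, it fails.
assert resolve_credentials({}, False, False) is None
```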

@wking
Member

wking commented Nov 26, 2018

Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: F1126 21:57:15.305323    4558 server.go:262] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0fc2564fadb34ac2f: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""

Does anyone know off the top of their head why kubelet would care about listing AWS instances?

@rajatchopra
Contributor Author

Nov 26 21:57:15 ip-10-0-0-241 hyperkube[4558]: F1126 21:57:15.305323    4558 server.go:262] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0fc2564fadb34ac2f: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""

Does anyone know off the top of their head why kubelet would care about listing AWS instances?

The InitCloudProvider wants to verify the instance ID so that the node can mark itself ready. This code is supposed to be removed soon in favour of an external controller playing with the taints, but I am not sure when that will land (or maybe it has landed already but not made it into our snapshot).

@wking
Member

wking commented Nov 27, 2018

This code is supposed to be removed soon in favour of an external controller playing with the taints...

Looks like this feature is still beta in 1.13. So is there a way to get sub-creds into the kubelet without empowering the entire node?

@openshift-ci-robot
Contributor

@rajatchopra: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 3, 2018
@openshift-ci-robot
Contributor

@rajatchopra: The following tests failed, say /retest to rerun them all:

Test name           | Commit  | Rerun command
ci/prow/e2e-aws     | 95a9bad | /test e2e-aws
ci/prow/rhel-images | 95a9bad | /test rhel-images

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@rajatchopra
Contributor Author

Closing this PR because we are blocked on the kubelet needing cloud-config+secrets to work without IAM instance roles.

@rajatchopra rajatchopra closed this Jan 4, 2019
wking added a commit to wking/openshift-installer that referenced this pull request Jan 30, 2019
We're planning on dropping instance profiles in favor of the new
credentials operator [1], because we want AWS access to have
operator/pod/namespace granularity and not instance granularity.  Many
pods could be running on a given instance, and not all of them should
have a given permission.  While we're blocked from dropping these at
the moment due to kubelet cloud-config+secrets [2], we can drop the
user-facing knobs for this feature now.  Then pivoting the internal
approach once we get the kubelet sorted will be a non-breaking change.

[1]: https://github.com/openshift/cloud-credential-operator
[2]: openshift#697 (comment)
wking added further commits to wking/openshift-installer referencing this pull request (Jan 30 and Feb 1, 2019) with the same commit message as above.
@rajatchopra rajatchopra deleted the iam_roles branch February 6, 2019 20:30