This repository has been archived by the owner on May 7, 2021. It is now read-only.

aws: kola iam role can get into weird state if user doesn't have passrole perms #1047

Open
dustymabe opened this issue Aug 21, 2019 · 5 comments


@dustymabe
Member

If I initially run kola without the iam:PassRole permission, I end up with an instance profile that doesn't have a role associated with it.

In this case I run kola on a fresh account (no CreatedBy=mantle resources) and it fails because I didn't have passrole permissions:

[coreos-assembler]$ kola -p aws --aws-ami ami-0e884738127697eb9 --aws-region us-east-1 -b fcos run coreos.ignition.resource.s3
=== RUN   coreos.ignition.resource.s3
--- FAIL: coreos.ignition.resource.s3 (0.32s)
        harness.go:507: Cluster failed starting machines: error verifying IAM instance profile: adding role "kola" to instance profile "kola": AccessDenied: User: arn:aws:iam::013116697141:user/dusty-fcos is not authorized to perform: iam:PassRole on resource: role kola
        status code: 403, request id: 08080270-c449-11e9-a7a4-a589214ff8db
FAIL, output in _kola_temp/aws-2019-08-21-1922-321
harness: test suite failed

Subsequent runs of kola won't re-attempt to fix the error (i.e. a kola role exists so continue on):

[coreos-assembler]$ 
[coreos-assembler]$ kola -p aws --aws-ami ami-0e884738127697eb9 --aws-region us-east-1 -b fcos run coreos.ignition.resource.s3
=== RUN   coreos.ignition.resource.s3
--- FAIL: coreos.ignition.resource.s3 (351.12s)
        harness.go:507: Cluster failed starting machines: machine "i-0d3611d49fc18f878" failed to start: ssh journalctl failed: dial tcp 52.201.248.149:22: connect: connection refused
) on machine i-0d3611d49fc18f878 consolening (fs/kernfs/dir.c:1503 kernfs_remove_by_name_ns+0x83/0x90
FAIL, output in _kola_temp/aws-2019-08-21-1925-336
harness: test suite failed

This test eventually fails because there is no role in the instance profile:

$ curl http://169.254.169.254/latest/meta-data/iam/info
{
  "Code" : "Success",
  "Message" : "Instance Profile does not contain a role.  Please see documentation at http://docs.amazonwebservices.com/IAM/latest/UserGuide/RolesTroubleshooting.html.",
  "LastUpdated" : "2019-08-21T18:43:09Z",
  "InstanceProfileArn" : "arn:aws:iam::00000000000:instance-profile/kola",
  "InstanceProfileId" : "AIPARGFOZ5J262XIR3ZOJ"
}

I'm guessing we should either do a check for passrole early and fail before we even try to create the kola role, or we should check the instance profile later to make sure it contains a role before continuing. We could do both :)

@arithx
Contributor

arithx commented Aug 21, 2019

At a minimum, I think we need to clean up the resources if an error occurs during initial creation. It's probably also worth looking into a one-time, per-run check that the instance profile contains the role; as the code is currently laid out, I think we'd end up checking once per cluster, if not once per machine.

@dustymabe dustymabe changed the title kola can get into weird state in aws account aws: kola iam role can get into weird state if user doesn't have passrole perms Aug 21, 2019
@dustymabe
Member Author

dustymabe commented Aug 21, 2019

After I added passrole to my user and deleted the existing kola role I got a successful test:

$ kola -p aws --aws-ami ami-0e884738127697eb9 --aws-region us-east-1 -b fcos run coreos.ignition.resource.s3
=== RUN   coreos.ignition.resource.s3
--- PASS: coreos.ignition.resource.s3 (85.21s)
PASS, output in _kola_temp/aws-2019-08-21-1937-351

@cgwalters
Member

This is "kola prep isn't idempotent" right?

@dustymabe
Member Author

I think it tries to be but is missing a step. It checks for the role but I don't think it checks the instance profile to make sure it contains the role.

@arithx
Contributor

arithx commented Aug 21, 2019

@cgwalters the code currently checks whether the instance profile exists. If it does, it returns immediately without checking the underlying roles; if it doesn't, it attempts to create the underlying roles. That led to this failure case: creating the underlying roles failed, but the instance profile already existed.
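
For illustration, a minimal sketch of a check that does not stop at "the profile exists" but also looks at the roles inside it. The `profileStore` interface, `ensureProfileHasRole`, and the in-memory `memStore` are hypothetical and exist only for this example; they are not mantle's code or the AWS SDK API, and creation of the role itself is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// profileStore is a hypothetical stand-in for the IAM calls involved.
type profileStore interface {
	// RolesInProfile reports whether the profile exists and which roles it holds.
	RolesInProfile(profile string) (roles []string, exists bool, err error)
	CreateProfile(profile string) error
	AddRoleToProfile(profile, role string) error
}

// ensureProfileHasRole creates the profile if it is missing and attaches the
// role if the profile does not already contain it, so a run that failed
// halfway can be repaired by simply running again.
func ensureProfileHasRole(s profileStore, profile, role string) error {
	roles, exists, err := s.RolesInProfile(profile)
	if err != nil {
		return fmt.Errorf("looking up instance profile %q: %w", profile, err)
	}
	if !exists {
		if err := s.CreateProfile(profile); err != nil {
			return fmt.Errorf("creating instance profile %q: %w", profile, err)
		}
	}
	for _, r := range roles {
		if r == role {
			return nil // already in the desired state
		}
	}
	if err := s.AddRoleToProfile(profile, role); err != nil {
		return fmt.Errorf("adding role %q to instance profile %q: %w", role, profile, err)
	}
	return nil
}

// memStore is an in-memory fake used to replay the scenario from this issue.
type memStore struct {
	profiles map[string][]string
	denyAdd  bool // simulates the missing iam:PassRole permission
}

func (m *memStore) RolesInProfile(p string) ([]string, bool, error) {
	roles, ok := m.profiles[p]
	return roles, ok, nil
}

func (m *memStore) CreateProfile(p string) error {
	m.profiles[p] = []string{}
	return nil
}

func (m *memStore) AddRoleToProfile(p, role string) error {
	if m.denyAdd {
		return errors.New("AccessDenied: not authorized to perform iam:PassRole")
	}
	m.profiles[p] = append(m.profiles[p], role)
	return nil
}

func main() {
	s := &memStore{profiles: map[string][]string{}, denyAdd: true}
	fmt.Println("first run:", ensureProfileHasRole(s, "kola", "kola"))
	s.denyAdd = false // permission granted between runs
	fmt.Println("second run:", ensureProfileHasRole(s, "kola", "kola"))
	fmt.Println("roles in profile:", s.profiles["kola"])
}
```

In this fake, the first run leaves an empty profile behind, just as in the report, and the second run attaches the role instead of returning early.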
