
Retrieve credentials from instance profile #22

Closed

Conversation

@saliceti saliceti commented Jan 6, 2016

What

Adds an option to retrieve AWS credentials from the instance profile. This avoids hard-coding AWS keys and greatly reduces security risk.

Solves issue #20.

How to review

This can be tested on an AWS instance with the right profile:

  • Build:
$ scripts/build
  • Write a test JSON, pointing to an existing file in a bucket the role has access to:
{
 "source": {
   "bucket": "[bucket name]",
   "versioned_file": "[file]",
   "region_name": "[region]"
 }
}
  • Test:
$ cat test-profile.json | ./built-check 
[{"version_id":"qLKI5ZfFKVmNX0SgbHgF_WggU9uA.dAC"}]
$ cat test-profile.json | ./built-in .
1.44 KB / 1.44 KB [======================================] 100.00 % 14.20 KB/s 0{"version":{},"metadata":[{"name":"filename","value":"concourse-state.json"},{"name":"url","value":"https://s3-eu-west-1.amazonaws.com/hectorcolin-state/concourse-state.json"}]}
$ cat test-profile.json | ./built-out ./concourse-state.json 
1.45 KB / 1.45 KB [=======================================] 100.00 % 3.92 KB/s 0{"version":{"version_id":"FC0rWcQp7_p0Y9zJBdrC6NrBeB748xuN"},"metadata":[{"name":"filename","value":"concourse-state.json"},{"name":"url","value":"https://s3-eu-west-1.amazonaws.com/hectorcolin-state/concourse-state.json?versionId=FC0rWcQp7_p0Y9zJBdrC6NrBeB748xuN"}]}
  • Integration tests can be run on an AWS instance with the right profile using the S3_USE_INSTANCE_PROFILE environment variable. AWS keys are not required anymore.
$ cd integration
$ S3_USE_INSTANCE_PROFILE=true ginkgo
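
The steps above assume the instance actually has the right profile attached. As a quick sanity check (an editorial sketch, not part of this PR), something like the following asks the EC2 metadata service which instance profile it sees; it assumes aws-sdk-go v1 and has to run on the AWS instance itself:

// Sanity check (not part of this PR): query the EC2 metadata service for the
// attached instance profile before running the manual steps above.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
)

func main() {
	sess, err := session.NewSession()
	if err != nil {
		log.Fatal(err)
	}

	meta := ec2metadata.New(sess)
	info, err := meta.IAMInfo()
	if err != nil {
		log.Fatalf("no instance profile available: %v", err)
	}

	fmt.Println("instance profile:", info.InstanceProfileArn)
}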

Add option to retrieve AWS credentials from
[instance profile](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html).
This can be tested on an AWS instance with the right profile with the S3_USE_INSTANCE_PROFILE variable.
@concourse-bot
Collaborator

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

  • #111155964 Retrieve credentials from instance profile

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

@vito
Member

vito commented Jan 6, 2016

This is pretty neat, thanks for the PR and instructions! However I worry that it's a bad trend to set for resources to each implement multiple ways of acquiring their configuration/credentials. In the long run that'll end up with a lot of fragmentation spread across many different resources and their maintainers (resources being Concourse's third party extension point).

Perhaps this could be tackled elsewhere, maybe in Concourse itself and how it manages the pipeline config?

Sorry for not replying sooner to #20 - that should have perhaps sparked this discussion.

Also tagging concourse/semver-resource#4 - same goes there.

@vito
Member

vito commented Jan 6, 2016

fwiw there's an ongoing discussion in Slack about better ways for Concourse to consume credentials

@keymon
Contributor

keymon commented Jan 6, 2016

Replying from concourse/semver-resource#4.

Yeah, I do agree. Also, having to repeat the same configuration across several resources has a smell. And the S3 and semver resources are actually fairly similar functionally, yet use completely different library implementations.

But whatever the outcome and any future design, the S3 and semver resources are basic resources and these two PRs are quite minimal: in both cases the actual change is just a couple of lines, so I don't think it is worth delaying the merge for a better future implementation. With them we can protect our AWS accounts and avoid the overhead and risks of distributing and maintaining static AWS credentials.

We will be really happy to participate in the discussions and provide further feedback on the topic.

@vito
Member

vito commented Jan 6, 2016

Does this pattern mean anything running on your worker now has access to your s3/semver credentials? Seems like that'd be bad for pull requests or any other untrusted code, and possibly only works because we haven't yet locked down the network security well enough. Could you expand on the actual mechanics of this a bit more?

@keymon
Contributor

keymon commented Jan 6, 2016

We want to avoid the need to manage static AWS credentials and instead get the benefits of IAM instance profiles (for details and pros see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html).

The associated role will have a limited policy, of course, but you are right about the risk that untrusted code can access the temporary credentials, given that task containers do not restrict networking at all. In our case we will run "trusted" code and deployment pipelines.

That is a different topic though; I think Concourse should allow implementing some sort of security groups for tasks, similar to CF security groups. That, together with a vault as you discussed, could provide a nice solution.

But for the time being, adding this small feature will allow us to move forward and adapt later.

@vito
Member

vito commented Jan 7, 2016

We already have a feature planned to restrict network access on the workers, which would break this: https://www.pivotaltracker.com/story/show/110743530

While this may unblock you now it's only a (possibly very) temporary fix and kind of goes against Concourse's design of having the workers be stateless. The code change is small, but for me the cons outweigh the pros.

What are you currently doing to manage your pipeline credentials? We use a pipeline template with {{params}} for credentials, and load up those credentials from the store when we run set-pipeline, so we never have credentials on our local machines. Something like fly set-pipeline -p foo -c template.yml -l <(lpass show ...). This makes credential rotation pretty straightforward.

If you absolutely need this change you could always push the image to your own Docker registry and point your Concourse instance at it.

@keymon
Contributor

keymon commented Jan 7, 2016

What are the cons exactly?

And I don't understand what you mean by "goes against Concourse's design of having the workers be stateless", or how that relates to retrieving credentials from instance profiles.

@vito
Member

vito commented Jan 7, 2016

The cons:

  • Sets a precedent for having each resource implement multiple ways of getting their configuration, rather than having resources remain as simple as possible with a single configuration entry point.
  • Depends on the workers being configured in a particular way and living on a particular infrastructure. Part of why Concourse uses containers and a single config file for pipelines is to minimize the amount of configuration you have to do on the workers, so that pipelines are self-contained.
  • Relies on a security hole that we're planning to fix soon.

@keymon
Contributor

keymon commented Jan 7, 2016

I understand these points, but I disagree with the relevance of the arguments.

First, retrieving instance profile credentials is a common, documented procedure in any AWS client.

So much so that the libraries used by the s3 and semver resources already implement it (the official aws-sdk-go and mitchellh/goamz). We are not implementing it in the client ourselves; we are using a feature of the libraries both resources already use.

Other projects like the BOSH AWS CPI, the Cloud Foundry cloud controller, Terraform, and almost every AWS client library implement it.

I understand your concern, but we are not implementing a feature to retrieve the credentials from an alien external service or store like consul, etcd, or vault. That would indeed add complexity. Instead we are exposing a first-class feature of most S3 clients, implemented by the client libraries and recommended by AWS itself.

Second, about the workers requiring special configuration: the only requirement would be to assign the IAM role to the Concourse machine, and that is a decision of the end user who deploys Concourse. Our machines are deployed on AWS and rely on multiple pieces of configuration (VPC, subnets, etc.), and the IAM role is just one more. I do not consider this adding state to the worker; it is setting security restrictions and roles. For instance, it would be no different from depending on an allocated public IP to pass a firewall.

Third, about closing access to AWS metadata: although I am happy that additional security settings are being added, I hope they will be configurable enough to allow the workers and resources to access the AWS API and metadata services, or any other service we consider necessary. In that case, we would explicitly allow the worker to access the AWS metadata service.

Finally, I want to explain our specific case: we want to use a Concourse instance to orchestrate the deployment pipeline of a CF installation, from bosh-init to final deployment. We use IAM profiles across the whole stack: terraform, bosh-init, bosh, cloud-controller, etc. The Concourse host has, of course, an IAM profile assigned so these tools can work. The only components still forcing us to use static credentials are the Concourse resources.

Hope this helps to clarify my point of view.

BTW, thank you for your time and prompt responses.

@vito
Member

vito commented Jan 7, 2016

No problem, and thanks for staying civil through this; I understand it's pretty frustrating to have a PR kind of shut down like this.

Point 1: I'm not worried about the code in this PR or the API it's using, I'm worried about the abstractions that this leaks through and the result of this pattern being followed to all other resources that consume credentials (almost all of them). Resource authors shouldn't have to carry around a swiss army knife of secret stores to integrate with. They should have a single entrypoint. It doesn't matter if each integration is individually easy to implement; if you repeat it for each secret store, and repeat that for every resource, and repeat the sad realities of things like retry logic complicating matters, and updating each resource every time an API changes, etc. etc... things get way more complicated.

Whether it's a one-liner talking to AWS using a library the resource already happened to use, or whether it's importing new libraries to integrate with Vault or some other credential store, it's the same issue. There are many small changes using well-documented APIs that could be added to Concourse that help someone in the short-term but go against its design ethos and compromise its vision in the long-term.

Point 2: even though it's a small amount of state, it's still an additional configuration primarily to integrate with your pipeline, as opposed to configuration to get the worker to run at all. This makes the pipeline configuration no longer portable.

Point 3: we'll keep it in mind but I'm pretty skeptical of things requiring this kind of access for legitimate reasons (usually it's an abstraction or feature missing somewhere else). Admittedly our own pipeline relies on this for the testflight job, which we should fix (not like we haven't tried). We could also let privileged builds have full access to the network, which feels in line with how privileged works already (only used for trusted builds).

I don't disagree with your use case, I disagree with this approach. I'd like to find other ways of implementing this that don't contradict Concourse's design patterns behind resources and stateless workers. In the short-term, rather than having us contradict our own design, you could just use your own resource that has these changes applied.

@keymon
Contributor

keymon commented Jan 7, 2016

One last thought about this:

"This makes the pipeline configuration no longer portable"

Actually, if this PR is merged, the pipeline can still define access_key_id and secret_access_key. If they are set as empty strings, it will try to use IAM profiles; if not, it will use the static credentials.

Because of that, the pipeline stays portable, exactly the same, and the user can still use static credentials when running outside an instance with the role.
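
To make the described fallback concrete, here is a minimal, hypothetical sketch (it assumes aws-sdk-go v1; the PR's actual code is not shown in this thread and may use a different library): empty keys in the pipeline source fall back to the instance profile, otherwise the static keys are used. Either way the source schema is unchanged, which is why the pipeline stays portable.

// Hypothetical sketch of the fallback described above, assuming aws-sdk-go v1;
// this is not the PR's actual code.
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/credentials/ec2rolecreds"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func newS3Client(accessKey, secretKey, region string) (*s3.S3, error) {
	sess, err := session.NewSession()
	if err != nil {
		return nil, err
	}

	var creds *credentials.Credentials
	if accessKey == "" && secretKey == "" {
		// No static keys configured: pull temporary credentials from the
		// EC2 instance profile via the metadata service.
		creds = ec2rolecreds.NewCredentials(sess)
	} else {
		creds = credentials.NewStaticCredentials(accessKey, secretKey, "")
	}

	cfg := aws.NewConfig().WithRegion(region).WithCredentials(creds)
	return s3.New(sess, cfg), nil
}

func main() {
	// Empty keys here mean "use the instance profile".
	client, err := newS3Client("", "", "eu-west-1")
	if err != nil {
		log.Fatal(err)
	}
	_ = client
}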

"even though it's a small amount of state, it's still an additional configuration primarily"

If this PR is not merged, it forces us to:

  1. Fork the s3 and semver resources, maintain them, and build images and/or BOSH releases.
  2. Deploy the new resources on the Concourse workers. This makes our workers stateful, because it requires the resources to be installed. As a consequence:
    • We cannot use vanilla concourse-lite for bootstrap pipelines; instead we will have to run a provisioner in Vagrant or create and maintain a custom AMI.
    • We will have to add the new configuration to the Concourse BOSH manifest.
  3. Our pipeline won't be portable or reusable by third parties anymore, because it uses a custom resource.

@vito
Member

vito commented Jan 8, 2016

Closing this since we won't be merging it; thanks again for the PR though. We'll continue to think about better ways to bring credentials into the pipeline config.

@dcarley

dcarley commented Jan 8, 2016

Hi, I work on the same team as @saliceti and @keymon. Some closing thoughts/details:

What are you currently doing to manage your pipeline credentials? We use a pipeline template with {{params}} for credentials, and load up those credentials from the store when we run set-pipeline, so we never have credentials on our local machines. Something like fly set-pipeline -p foo -c template.yml -l <(lpass show ...). This makes credential rotation pretty straightforward.

That's what we do currently with static (pre-generated) AWS access keys. We can't continue using static access keys because of the risk of them being leaked. IAM instance profiles are Amazon's solution to this problem and their recommended best practice. They make access keys directly available to the instances that need them and regularly rotate them for you.

I understand the concerns about fragmentation and state, but it seems that any credentials solution is still not going to allow us or other people to use instance profiles. They have to be a property of the worker instance (which is assigned a profile) and a function of an AWS library (which uses the metadata service) running on that worker. Even if we hacked together something to get instance profile keys out and put them back into configuration, we couldn't be sure that the keys would be for the correct worker or hadn't since been rotated.

We're happy to help with further work towards this, but for now we'll have to fork any resources which use S3.

If you absolutely need this change you could always push the image to your own Docker registry and point your Concourse instance at it.

Are you referring to overriding the Docker image of an existing resource by using groundcrew.additional_resource_types? Should that be documented, or is it intentionally not?

@vito
Member

vito commented Jan 8, 2016

Yeah, it may be that we simply can't recommend that people use instance profiles. That's not too surprising though considering there are many services that AWS provides that can couple you to their infrastructure, and we make a point to not use them when the abstractions just don't fit together or when it leaves the problem unsolved for other infrastructures or deployment schemes. For example, we currently rely on the ELB for SSL termination, and plan to push that down into Concourse itself. There are likely other tools for credential rotation that we could recommend instead, and that fit more cleanly into the Concourse model.

groundcrew.additional_resource_types is the property to set. We may not mention it at concourse.ci; we've been leaving some of the nonessential property docs up to the release itself so there aren't two sources of truth:

http://bosh.io/jobs/groundcrew?source=github.com/concourse/concourse&version=0.70.0#p=groundcrew.additional_resource_types

@vito
Member

vito commented Jan 8, 2016

Relevant discussion for supporting instance profile usage in Vault, which I think we could much more easily work into a Concoursey credential rotation scheme:

hashicorp/vault#307

@rogeruiz

rogeruiz commented Jun 13, 2017

This came up again for us recently: we simply don't want to bother rotating AWS credentials in Concourse, and I believe that if Concourse supports s3, an AWS service, it should support both popular use cases (creds and instance profiles). So here's another example of a fork that is required because we don't have the support we need in concourse/s3-resource. 😕 😞

https://github.com/18F/s3-resource

@tracyde

tracyde commented Sep 28, 2017

I would like to re-open this discussion and come up with a solution that would allow instance profiles to be used. I am currently working on a project in an AWS environment where the use of access and secret keys is not allowed and the only alternatives are instance profiles or a custom STS token service.

I don't think I am the only one who would benefit from this discussion. How should we proceed?

@pivotal-issuemaster

@saliceti Please sign the Contributor License Agreement!


@keymon
Contributor

keymon commented Sep 29, 2017

@tracyde I have to say that, looking at it with some perspective, I kind of agree with @vito that it would be a bad idea to include such a feature in the s3-resource itself, due to the lack of security around the AWS metadata endpoint. If you allow a worker VM to assume a role, any container can do so too.

But even the new creds management feature in Concourse (I think; I might be wrong) does not solve this issue (see https://concourse.ci/creds.html).

Currently you can use the forked version from this PR.

But in the "medium" run, I am considering implementing a PoC with the following approach:

  1. First, a task (or custom resource) would generate a temporary AWS STS token. The task would output a file with the token. Options to get the token:
  2. A [wrapped S3-resource] would read that file, intercept the JSON from Concourse, inject the credentials, and call the original S3 resource (a rough sketch follows at the end of this comment).

The reason I propose splitting it into two steps is that you can use worker pools so that only one container has permission to generate STS tokens. For instance, you can use a worker colocated with the ATC as the privileged container that runs the step to get the credentials.

For 2, I did something similar to inject credentials from credstash (an AWS DynamoDB+KMS cred store) here: https://github.com/redfactorlabs/concourse-smuggler-resource/tree/master/examples/credstash_wrapper
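
For point 2, a rough, hypothetical sketch of such a wrapper is below (the binary path and credentials file location are illustrative assumptions, not taken from any existing resource): it reads the request Concourse pipes to the resource on stdin, merges in the temporary credentials written by the STS step, and hands the rewritten request to the original S3 resource script. Whether a session token can also be injected depends on the source schema of the wrapped resource.

// Hypothetical wrapper sketch (illustrative names and paths): inject STS
// credentials from a file into the resource request JSON, then run the stock
// S3 resource script with the rewritten request on its stdin.
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"os"
	"os/exec"
)

type stsCreds struct {
	AccessKeyID     string `json:"access_key_id"`
	SecretAccessKey string `json:"secret_access_key"`
}

func main() {
	// Request JSON that Concourse pipes to the resource on stdin.
	var request map[string]interface{}
	if err := json.NewDecoder(os.Stdin).Decode(&request); err != nil {
		log.Fatal(err)
	}

	// Temporary credentials written by the earlier STS step (path is an
	// assumption for illustration only).
	raw, err := os.ReadFile("/tmp/build/sts-creds/credentials.json")
	if err != nil {
		log.Fatal(err)
	}
	var creds stsCreds
	if err := json.Unmarshal(raw, &creds); err != nil {
		log.Fatal(err)
	}

	// Merge the credentials into the source section of the request.
	source, _ := request["source"].(map[string]interface{})
	if source == nil {
		source = map[string]interface{}{}
	}
	source["access_key_id"] = creds.AccessKeyID
	source["secret_access_key"] = creds.SecretAccessKey
	request["source"] = source

	body, err := json.Marshal(request)
	if err != nil {
		log.Fatal(err)
	}

	// Hand the rewritten request to the original resource script, preserving
	// any arguments (e.g. the destination directory for "in").
	cmd := exec.Command("/opt/resource/in.orig", os.Args[1:]...)
	cmd.Stdin = bytes.NewReader(body)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}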

shaun7pan added a commit to FidelityInternational/s3-resource that referenced this pull request May 5, 2022