Certificate management should be done generically across all resources #1027

Closed
vito opened this Issue Apr 24, 2017 · 24 comments

Comments

@vito
Member

vito commented Apr 24, 2017

(I've preemptively labeled this with anti/contributor-burden and anti/worker-state as both are relevant to this issue as trade-offs, depending on the approach.)

What challenge are you facing?

Today, each resource that talks to an external service (i.e. all resources) has some form of the following configuration:

  • insecure_skip_verify / ignore_ssl / skip_ssl_verification / etc.
  • ca_cert / ca_certs

This is a substantial burden to both resource authors and users. Every author must implement these, and ideally they'd all agree on a naming convention (we've already lost that battle). Every user must also then configure their certs across all resources, which is especially annoying if the format is inconsistent.
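
To make the burden concrete, here is a minimal sketch of the TLS boilerplate each resource ends up carrying. It is only an illustration in Python: `build_ssl_context` is a hypothetical helper, the option names are the ones listed above, and the assumption that `ca_cert` / `ca_certs` hold PEM text (one string or a list of strings) is mine, since real resources differ on exactly that point.

```python
import ssl

# Option spellings taken from the list above; every resource picks its own.
SKIP_VERIFY_KEYS = ("insecure_skip_verify", "ignore_ssl", "skip_ssl_verification")
CA_CERT_KEYS = ("ca_cert", "ca_certs")


def build_ssl_context(source: dict) -> ssl.SSLContext:
    """Build a TLS context from a resource's `source` config (hypothetical helper)."""
    context = ssl.create_default_context()

    if any(source.get(key) for key in SKIP_VERIFY_KEYS):
        # check_hostname must be turned off before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context

    for key in CA_CERT_KEYS:
        value = source.get(key)
        if not value:
            continue
        # Assumption: the value is PEM text, either one string or a list of strings.
        pems = [value] if isinstance(value, str) else list(value)
        for pem in pems:
            context.load_verify_locations(cadata=pem)

    return context
```

Every resource needs some variant of this, in its own language and with its own option names, which is the duplication this issue is about.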

A Modest Proposal

One proposal would be to just bind-mount /etc/ssl/certs, read-only, into each resource container. We actually tried this already, and it does work, with some gotchas:

  • /etc/ssl/certs often contains symlinks to other paths outside of /etc/ssl/certs, requiring them to be bind-mounted, too.
  • /etc/ssl/certs is platform-specific, although it seems common for platforms that store certs in another location to at least bind-mount /etc/ssl/certs to it (e.g. Fedora).
  • It's possible for /etc/ssl/certs or, more likely, the symlinks contained therein, to clobber some part of the resource rootfs.
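
The symlink gotcha can be checked on a given worker with a short script. This is a rough sketch, not something Concourse ships: `symlinks_escaping` is a hypothetical name, and it only reports links whose resolved target lies outside the directory, which are the ones that would need their own bind mounts.

```python
import os
from typing import Dict


def symlinks_escaping(cert_dir: str) -> Dict[str, str]:
    """Map each symlink under cert_dir to its resolved target, for targets outside cert_dir."""
    root = os.path.realpath(cert_dir)
    escaping = {}
    # os.walk does not follow symlinked directories by default; they show up in dirnames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # commonpath compares whole path components, so /etc/ssl/certs2 is not "inside".
            if os.path.commonpath([root, target]) != root:
                escaping[path] = target
    return escaping
```

Running `symlinks_escaping("/etc/ssl/certs")` on a worker would list the extra paths that a plain bind mount of the directory leaves dangling inside the container.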

Another caveat is that you then have to configure CA certs on your workers, which at face value goes against Concourse's "stateless workers" mantra. Arguably, though, it's similar in spirit to proxy configuration: it's configuration the worker needs to reach the network, and so the worker is the source of truth. This is especially true if the worker is configured with a man-in-the-middle SSL proxy, for example, since the cert for the SSL proxy has to be configured.

Call for alternate proposals

We've already explored this particular proposal, so this issue is both to surface our goals and to foster discussion around alternative approaches.

@concourse-bot

concourse-bot commented Apr 24, 2017

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

  • #144237061 Certificate management should be done generically across all resources

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

@vito

Member

vito commented Apr 25, 2017

@julz Have you or @ematpl thought about this (the original proposal) before as a potential Garden feature? Certificate propagation is a problem that both Concourse and Diego have had to tackle, and it's a lot of platform-specific orchestration which kind of leaks through the Garden abstraction level. This feels somewhat similar to propagating /etc/hosts and /etc/resolv.conf in, although much more wart-covered.

@haydonryan

haydonryan commented May 2, 2017

I'm hitting this issue at a large, security-conscious enterprise customer. We are using Concourse to deploy PCF in a vSphere environment. They use their internal CA certs to sign all internal-facing resources, including their Bitbucket server. Because there is no ability to add a CA cert, they're having to disable SSL, which violates their security conventions. Most on-prem customers use internally signed certs, so this would be an important feature for the enterprise.

@billimek

billimek commented May 2, 2017

Same boat as @haydonryan here.

If I'm reading the issue correctly, solving this with the proposed solution would require managing the installation of that cert on the host running the Concourse worker (and somehow resolving the rat's nest of symlinks?).

@DRuggeri

DRuggeri commented May 2, 2017

Yeah, this is painful, @haydonryan. I'm also suffering from the same affliction.

@vito vito removed the scheduled label May 8, 2017

@clarafu clarafu added this to the Staging milestone May 9, 2017

@chendrix

Contributor

chendrix commented May 11, 2017

concourse/semver-resource#31 attempted to add skip_ssl_verification to the semver-resource

@bonzofenix

bonzofenix commented May 15, 2017

@haydonryan @vito we are hitting this issue at Allstate. We are encrypting all our internal traffic at a firewall level with a custom CA that needs to be trusted by all resources. :(

@pn-santos

pn-santos commented May 16, 2017

We at Yoti also use a custom CA for our internal Docker registry... it's a pain to need to include ca_certs in every pipeline and task definition in Docker image resources.
It would be nice to have a single point for configuring any custom CAs that should be trusted by all workers and associated containers.

@topherbullock

Member

topherbullock commented Jun 1, 2017

Considering pushing support for container cert mounting down to Garden: cloudfoundry/guardian#83

@labrown

labrown commented Oct 18, 2017

I can echo haydonryan's comment of May 2nd pretty much exactly. I'm in the same situation, using Concourse to deploy PCF in an environment with pervasive internal self-signed certificates.

I like calebwashburn's suggestion of June 2nd: using BOSH to insert trusted certificates into the workers and then having all images pick up those certs automatically. That would solve my issue.

@DRuggeri

DRuggeri commented Oct 31, 2017

Thanks for checking in @vito. I understand your point about the "me too's".
I'd like to follow up and inquire: is there anything we from the community/as consumers can do to help? Are the use cases clear enough? Has enough been said about how trust is managed across the use cases (like: BOSH managed, just plain old ca-certs file, or as a separate file altogether)? Do you need testers or folks to bounce ideas off of?

I'm very eager to see this available so am happy to help however I can... IF I can

@jama-pivotal jama-pivotal added this to Icebox in Runtime via automation Nov 13, 2017

@jama-pivotal jama-pivotal removed this from Icebox in Operations Nov 13, 2017

@jama-pivotal jama-pivotal moved this from Icebox to Backlog in Runtime Nov 13, 2017

@topherbullock

Member

topherbullock commented Nov 15, 2017

Here's the current plan for tackling this:

  1. Create a magical ssl-certs volume on each worker on startup for certs to live in
  2. Add certs to the ssl-certs volume by either:
    • a) creating the volume using the import strategy, and importing from the worker's /etc/ssl/certs directory (resolving and copying over all symlinked files)
    • b) streaming in a folder of certs configured on the ATC
  3. Bind-mount the ssl-certs volume on the worker at /etc/ssl/certs when creating any resource containers

The options for hydrating the volume with certs - 2a and 2b - each have their own tradeoffs and impact on how this feature is used:
2a is probably the most 'Concourse-y' option in that it maintains the rule of workers being externally managed, telling the ATC their specific network configuration or hardware capabilities via tags. The gotcha is that this involves a new "setup" command executed across all worker deployment scenarios we support, and it will need to be configurable. Deployments which use the concourse binary will be easier (just add it to the worker), but for BOSH the groundcrew job is probably the best place to put it. It would be nice if it could be done in one place.
2b would allow an operator to ensure that any worker which registers has the required certs, would reduce operational overhead, and would mean workers don't have any state on them, which reduces guesswork about how resources behave on individual workers.
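
For illustration, the "resolving and copying over all symlinked files" part of option 2a could look roughly like the sketch below. It is not how Concourse implements the import strategy; `import_certs` is a hypothetical helper, and it assumes a flat cert directory (no subdirectories) for brevity.

```python
import os
import shutil
from typing import List


def import_certs(src_dir: str, volume_dir: str) -> List[str]:
    """Copy every cert in src_dir into volume_dir, replacing symlinks with the files they point to."""
    os.makedirs(volume_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        # os.path.isfile follows symlinks, so directories and dangling links are skipped.
        if not os.path.isfile(src):
            continue
        # shutil.copy2 follows symlinks by default, so the destination is a regular file.
        shutil.copy2(src, os.path.join(volume_dir, name))
        copied.append(name)
    return copied
```

Because the resulting volume contains only regular files, bind-mounting it at /etc/ssl/certs in step 3 does not drag in any paths outside the mount.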

@labrown

labrown commented Nov 17, 2017

I'm still new to Concourse, but I like option 2b better. "Just works" is easier to maintain, if you ask me.

@jama-pivotal

Member

jama-pivotal commented Nov 20, 2017

Going to split this into separate issues to track the "work to be done™":

  • Look into how we can create a volume on the workers when they start (import volume)
  • Look into both scenarios for getting the certs on the workers
  • Test that the resources work using the certs in certain scenarios (MITM SSL proxy, private Docker registry, Minio, etc.)

@vito

Member

vito commented Feb 13, 2018

🎊 The eagle has landed! Accepted via BOSH and also via binaries on Fedora.

This will be shipped in 3.9, on by default for the BOSH release and opt-in for the binaries (since we can't assume as much about how they're deployed/configured). We can polish this up a bit more and have it on by default in the binaries in the future.
