
Container network segregation / firewalling #986

Closed
carmstrong opened this Issue May 16, 2014 · 34 comments

@carmstrong
Contributor

carmstrong commented May 16, 2014

Application containers don't need access to etcd, for example.

@carmstrong carmstrong self-assigned this May 16, 2014

@carmstrong carmstrong referenced this issue May 16, 2014

Closed

Tasklist: Deis is ops/HA-ready #984

16 of 16 tasks complete

@gabrtv gabrtv added the security label May 16, 2014

@gabrtv

Member

gabrtv commented May 16, 2014

As it stands today, deployed containers can access anything on the network including unrestricted services running on the CoreOS host (like etcd). To fix this, we need to deploy a unit file that configures iptables on all Deis machines.

Here is a first stab at network security requirements for containers:

  • Containers must be able to access the outside world
  • Containers must be able to access other containers
  • Containers cannot access the CoreOS host (SSH, etcd, etc)

Deis containers are a special case as they require access to etcd. How we accomplish this is an interesting problem. We may be able to launch a sidekick proxy or leverage some more advanced Docker networking.

I see 4 steps:

  1. Think through attack vectors from malicious containers and agree on the requirements
  2. Prototype an iptables unit that locks down traffic according to those requirements
  3. Prototype a mechanism for allowing certain containers restricted access to etcd (proxy?)
  4. Deploy the new iptables unit file across all Deis hosts using custom user-data
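A rough sketch of what such a unit (step 2) might look like. This is an untested sketch, not a proposed rule set: the interface name and the etcd ports (4001/7001 for client/peer in this etcd era) are assumptions to be confirmed against the requirements above.

```ini
[Unit]
Description=Block container access to host services (sketch)
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Containers reach the host via the docker0 bridge; drop traffic from it
# to SSH and to etcd's client/peer ports, while leaving outbound and
# container-to-container traffic alone.
ExecStart=/usr/sbin/iptables -I INPUT -i docker0 -p tcp --dport 22 -j DROP
ExecStart=/usr/sbin/iptables -I INPUT -i docker0 -p tcp --dport 4001 -j DROP
ExecStart=/usr/sbin/iptables -I INPUT -i docker0 -p tcp --dport 7001 -j DROP

[Install]
WantedBy=multi-user.target
```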

Thoughts?

@moretea


moretea commented Jul 10, 2014

What about starting by binding critical services, such as Docker and etcd, to $COREOS_PRIVATE_IPV4?
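For example, a cloud-config fragment along these lines (flag names per the etcd of that era; treat this as an untested sketch):

```yaml
#cloud-config
coreos:
  etcd:
    # bind client and peer endpoints to the machine's private address only,
    # so etcd is unreachable from public interfaces
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
```

Note this alone doesn't stop containers on the same host from reaching the private address; it only keeps external entities out.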

@bacongobbler

Member

bacongobbler commented Jul 11, 2014

@moretea could you please expand on that a bit? I'm not quite sure I understand the question.

@gust1n


gust1n commented Jul 11, 2014

We heavily use etcd, watching the /services path to discover new nodes and connect over ZeroMQ (not using your router). Restricting access to this would make us poll the DNS router instead, which would make life harder for us...

@benmccann


benmccann commented Jul 16, 2014

Wouldn't some applications want access to etcd for their own use? It seems like use of etcd authentication and ACLs may be a solution which allows applications access to etcd without being able to access Deis's internally used etcd keys.

@carmstrong

Contributor Author

carmstrong commented Jul 22, 2014

As far as I know, there is no way to support keyspacing in etcd. They support client SSL certs, but no way to restrict keys based on different clients. ACL support is tracked in etcd-io/etcd#91, but there haven't been any updates in quite some time...

If anyone has a better way to handle this before etcd implements ACL support, we're definitely interested!

@carmstrong carmstrong added this to the 1.0 milestone Jul 23, 2014

@paulczar

Contributor

paulczar commented Aug 10, 2014

The most important step here is to bind etcd to the private network only on the hosts; this way external entities cannot access it.

You could also run etcd in a container that also runs iptables; that way you can restrict access regardless of whether or not the host (CoreOS) supports firewalling.

@davedoesdev


davedoesdev commented Sep 11, 2014

Is there a way to stop one application from accessing another application?

@gabrtv

Member

gabrtv commented Sep 11, 2014

@davedoesdev not currently. In fact, that goes a level beyond what we're proposing here.

If you have time I would love to have you write up the use-case in a separate GitHub issue so we can come up with some proposals for addressing it.

@iamveen


iamveen commented Sep 24, 2014

Perhaps some inspiration can be drawn from the geard implementation.

I think it would be pretty awesome if fleet had something like this built-in, rather than relying on docker linking. Something about seeing all those env vars just gets under my skin :\

@davedoesdev


davedoesdev commented Sep 24, 2014

It'd be great to have support for https://github.com/zettio/weave/

@intellix

Contributor

intellix commented Sep 24, 2014

Also confused by this issue. I thought etcd was primarily for discovering services in your application. Like my API asking etcd for the host/port of my database. If it can't access etcd then how do I know where the services exist?

@carmstrong

Contributor Author

carmstrong commented Sep 24, 2014

I thought etcd was primarily for discovering services in your application.

etcd is for the Deis control plane's services to coordinate with each other. Applications shouldn't be concerning themselves with etcd, and instead should be using environment variables to configure things like database endpoints.
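To make that concrete, here is a hypothetical example of what an app sees at runtime. `DATABASE_URL` and the credentials are illustrative names, not anything Deis sets automatically; the point is just that the endpoint arrives via the environment rather than via etcd.

```shell
# Hypothetical: the operator injects configuration into the app's
# environment (e.g. via `deis config:set DATABASE_URL=...`), so the app
# never needs to talk to etcd itself.
DATABASE_URL="postgres://user:secret@db.example.com:5432/mydb"

# Parse the endpoint out of the environment variable.
host_and_port="${DATABASE_URL#*@}"  # drop scheme and credentials
db_host="${host_and_port%%:*}"      # keep only the hostname
db_port_path="${host_and_port#*:}"  # "5432/mydb"
db_port="${db_port_path%%/*}"       # keep only the port

echo "connecting to $db_host on port $db_port"
```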

@carmstrong carmstrong removed their assignment Oct 10, 2014

@davidillsley


davidillsley commented Nov 16, 2014

I'm wary of the statement "In practice, this is really only a concern when clusters are running untrusted applications.". It's also a concern when you have multiple applications, one of which has a security vulnerability allowing the application access to etcd to be exploited.

Is this the appropriate forum to discuss changing that text, or should I raise another issue?

@carmstrong

Contributor Author

carmstrong commented Nov 16, 2014

@davidillsley That's a great point. I think this issue is a great place to discuss appropriate changes.

@blaggacao


blaggacao commented Jan 20, 2015

Out of curiosity, has any thought been given to swapping etcd for Consul as the K/V backend? It comes with ACLs. It might be sensible to keep using etcd for the control layer until the folks at CoreOS catch up, and meanwhile introduce Consul for the application plane.

@carmstrong wouldn't environment variables cause trouble when they morph dynamically in the context of the application runtime? I'm not sure exactly what that would be good for, but my best guess is some low-level multi-tenancy managed by the application itself (or better: some intermediate controller code). Say, dynamic database creation with automated credential creation and propagation. There we are still at the hosting level.

Update:
Maybe we could conceptually think of a platform controller plane, with its highly segregated, lightweight, self-contained backing services and platform-wide first-level backing services (e.g. Ceph, DNS, networking), and a more flexible application controller plane which implements opinionated second-level backing services (e.g. DB, K/V, monitoring, etc.), but in the sense of "opinionated by the operator towards the developer", not so much "by the publisher towards the operator"...

@gabrtv

Member

gabrtv commented Jan 21, 2015

Out of curiosity, has any thought been given to swapping etcd for Consul as the K/V backend?

Yes, we have actively explored Consul as it can potentially add a lot of value around health monitoring and service discovery. Unfortunately, it's also a lot of duplicated infrastructure (another raft cluster) and a significant engineering effort. As a result, it's not a high priority at this moment.

In the interest of providing better network segregation and security, how do folks feel about:

  1. Separate k/v stores (and raft clusters) for platform-level and application-level concerns?
  2. Enforcing an overlay network to isolate application traffic from platform traffic?
@blaggacao


blaggacao commented Jan 22, 2015

👍
3. As a "soft" design rule / documentation guideline: make most of the application file system(s) read-only? I'm not sure this is really an effective measure, but it seems that for 12-factor apps it becomes an option. Or am I conceptually missing something?

@bacongobbler

Member

bacongobbler commented Jan 22, 2015

The filesystem as per Heroku's cedar stack is ephemeral, so that's been the design we've been following. Same goes for any other PaaS out there*. Is there a specific reason that a read-only filesystem would be necessary?

Note: Heroku used to have a read-only filesystem in their bamboo stack which you could only write to ./tmp and ./logs, but they deprecated that in favour of documenting their dynos as ephemeral when they migrated over to the cedar stack.

*: Stackato, Cloud Foundry, Flynn, Dokku, etc.

@blaggacao


blaggacao commented Jan 22, 2015

Beware, this isn't an in-depth opinion of mine, and it may be in the wrong tradeoff scope.

The intention was to close the runtime app down almost completely against injection of malicious tools, since the app itself is probably the vulnerable point. I don't think this should ever be enforced; I just thought of building my containers this way and deploying them directly as container images.

I'm just guessing out loud and don't know much about actual attack patterns. However, not being able to write to a filesystem sounds secure: extra compromising tools would have to be loaded and executed directly in RAM, and persistent storage could be specially monitored for (or locked down against) improper writes. It would just make any attack pattern involving filesystem writes more difficult.

Maybe this could be an interesting pattern for (some parts of) the control plane as well, reducing attack surface.

@apps4u


apps4u commented Feb 5, 2015

I'm not an expert in Deis or CoreOS, so this might be a bad idea, but would there be a way to create VLANs, then have the Deis containers on a VLAN that can access the required services, but have all other containers on a different VLAN that is locked out of things like etcd? This seems like an easy solution, so I'm guessing it won't work or someone would have thought of it already, but I figured I might as well ask.

@wenzowski

Contributor

wenzowski commented Feb 24, 2015

Seems to me that #3072 must be considered when undertaking this if it's to be implemented in iptables the way geard does it. In terms of etcd access, what about binding keys to environment variables on the container and SIGHUP'ing it whenever the environment variables change?

@azurewraith


azurewraith commented Feb 25, 2015

In the interest of providing better network segregation and security, how do folks feel about:

  • Separate k/v stores (and raft clusters) for platform-level and application-level concerns?
  • Enforcing an overlay network to isolate application traffic from platform traffic?

Seems reasonable to me. What is the game plan for driving this home?

@carmstrong

Contributor Author

carmstrong commented Feb 25, 2015

Seems reasonable to me. What is the game plan for driving this home?

For a change like this, we'd like the larger Deis community's input. Next step would be to open a proposal PR (see #2911 for a good example) adding documentation around the segregation, as if it had already been implemented. Once everyone agrees on that, implementation can commence with a high degree of confidence that it will be merged without significant changes.

@Brandl


Brandl commented May 1, 2015

Dear Deis Developers,

I came across your software today and read through your documentation. It seems really promising and close to what I am looking for. But I was really scared when I came across the security section:

"Deis is not suitable for multi-tenant environments or hosting untrusted code."

If this is actually the case, I would call it a security-critical bug, because if there really is a scenario where getting a web app hacked poses a threat to all other containers and the actual host, that would be terrifying.

Maybe I'm just misinterpreting this issue, so what is the actual impact of this?

@apps4u


apps4u commented May 2, 2015

Just don't run this if you have users you don't trust, like random online users; it's not designed to work like that. It's safe when, say, a company runs all its own containers on the platform.

@azurewraith


azurewraith commented May 2, 2015

@Brandl has a point: even if you trust all your users, a security breach (after all, we are on the cloud) can jeopardize the security of the other containers / Deis core.

@gabrtv

Member

gabrtv commented May 2, 2015

@Brandl you are correct that a compromised container can pose a threat to other containers and anything accessible over the network (including the underlying host).

Ideally containers would run on an isolated-by-default network segment with explicit access grants to endpoints as permitted (e.g. other containers, control plane infrastructure and external/third-party services). Unfortunately, achieving this level of network security is non-trivial and we want to be open about that, hence the disclosure in our documentation.

I do want to be clear that there is a big difference between running untrusted code by design and running it by accident via application compromise. While the net result is the same (untrusted code running inside the cluster), the latter scenario is no different in Deis than what you'll find on comparable container platforms or orchestration systems like Mesos, Kubernetes, etc.

We do intend to support multi-tenant environments eventually. We are exploring technologies along the lines of ambassadord and Apcera's semantic pipelines (not open source) to help us get there. We also need help from upstream projects like Docker and etcd. However, until we undergo an audit by a 3rd-party security vendor, it's unlikely we will remove the disclaimer re: true multi-tenancy.

@apps4u


apps4u commented May 3, 2015

That is correct, but they are two different things: having your application hacked is different from running code that is designed to be unsafe. Running code in a container that is designed to bring down your cluster could happen if you run untrusted code; but if your application was hacked and someone got access to the underlying system, that is not a Deis issue: it would also be an issue if you ran a CoreOS cluster or a Docker cluster.
So yes, the platform is as secure as any other like system, but if you let someone run a container that was designed to cause you issues, that could be a problem. Until Deis is truly safe to run in a multi-tenant environment, you should make sure you run trusted code; then you will be safe, unless you leave a hole in your application, but that risk is there whether you run Deis or not.


@bacongobbler

Member

bacongobbler commented Jul 16, 2015

related: #3812

@bacongobbler bacongobbler added the v2 label Jul 16, 2015

@nlsrchtr


nlsrchtr commented Sep 24, 2015

Maybe the authentication and authorization features introduced in etcd 2.1 could help solve part of the problem. Combined with network-layer security, like that drafted in #3812, it would be even better.
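For reference, etcd 2.1's auth API looks roughly like this. This is an untested sketch against a live etcd >= 2.1 cluster; the role and path names are illustrative, not Deis's actual keyspace:

```shell
etcdctl user add root            # create the root user (prompts for a password)
etcdctl auth enable              # turn authentication on
etcdctl role add app-reader      # a role for application containers
etcdctl role grant app-reader --path '/services/*' --read
etcdctl user add app
etcdctl user grant app --roles app-reader
```

With something like this, application containers could be handed credentials scoped to their own key prefix while the Deis control plane keys stay off-limits.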

@jokeyrhyme


jokeyrhyme commented Nov 10, 2015

Regarding @nlsrchtr 's point, it does seem as though key prefixes can now be used to restrict access in etcd: etcd-io/etcd#2384
From my naive glance at the Godeps, it seems etcd 2.0 is being used. Is there work underway to upgrade to etcd 2.1 or newer?

@bacongobbler

Member

bacongobbler commented May 26, 2016

bumping this thread now with v2-related discussion. To bring us all up to speed on the release candidate:

  • Deis (now called Workflow) uses Kubernetes instead of Fleet
  • Kubernetes has service discovery built into the system, so etcd has been "removed" from the control plane (Kubernetes still uses etcd for SkyDNS, its built-in service discovery mechanism)

The last thing to resolve would be true network segregation based on the namespace the application is deployed in. The application should be able to communicate with any other apps deployed in the same namespace to facilitate #4173, but not with the Kubernetes API server, the Workflow API server, or any other application outside of its own namespace (internally, at least). Applications should still be able to go surf the public web, as well as target applications through the router. For example, application foo can talk to application bar via http://bar.example.com (through the router), but not through the pod's IP address (10.247.144.126) or through DNS ($ ping bar), unless bar happened to be deployed in the same namespace as foo.

@deis-admin


deis-admin commented Jan 19, 2017

This issue was moved to deis/controller#1216

@deis-admin deis-admin closed this Jan 19, 2017
