This repository has been archived by the owner on Jan 21, 2020. It is now read-only.

Support for creating and managing infrastructure for swarms #84

Open
chungers opened this issue Jun 27, 2016 · 15 comments

@chungers
Contributor

chungers commented Jun 27, 2016

This is the top priority for supporting Cloud. We are also seeking convergence with Docker4X (Editions). The requirements are:

  • Ensure a Swarm-compatible networking environment exists
    • On IaaS providers, make sure subnets exist with the proper firewall rules so that Swarm manager and agent nodes can be launched. If no existing networks are specified by the user, provision new networks:
      • On AWS this can be done by running a pre-configured CFN template (this is the approach taken by Docker4AWS)
      • On Azure this can be done by running an Azure Resource Manager (ARM) template.
      • On platforms that do not support resource templates / scripting like CFN or ARM, this will be accomplished via API calls and host configuration (e.g. setting up ufw on hosts)
  • Launch Swarm
    • Launch manager nodes -- on AWS / Azure this is done via supported machine images and userdata on launched instances that executes swarm init and swarm join for managers (see the sketch after this list). On other platforms where there are no Docker-provided machine images, we must install, configure and launch the latest Docker Engines for this purpose.
    • Launch agent / worker nodes -- same approach as for manager nodes, but with different instance initialization to join the worker node to the swarm.
  • Auto-scale the swarm
    • In general, use each platform's best practice -- an autoscaling group on AWS, an availability set on Azure.
    • On other platforms like Packet and Digital Ocean this can mean active processes / nodes that emulate the functionality of an autoscaling group. Initially this can just mean maintaining a constant instance count.
  • Swarm upgrade
    • Upgrade the manager nodes and agent nodes. This entails stopping / upgrading / starting the manager nodes, then draining, upgrading and rejoining the agent nodes.
    • See the AWS implementation, which uses DynamoDB for coordination; the expected behavior may have to be emulated on other platforms.
  • Load balancer integration
    • On supported platforms, integrate with native load balancer solutions -- ELB on AWS, ALB on Azure.
    • Initially L4 routing (port to service published ports); L7 and an L4/L7 LB hierarchy in future releases.
    • Emulate similar functionality on other platforms, possibly using nginx or haproxy.
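
To make the manager / worker launch step concrete, here is a minimal sketch (in Go, since that is the project's language) of how a provisioner might render the instance userdata that runs `docker swarm init` on the first manager and `docker swarm join` everywhere else. The helper name and addresses are illustrative only, and how the join token is distributed securely is an open question in this issue.

```go
package main

import "fmt"

// renderUserdata is a hypothetical helper: it returns the cloud-init / userdata
// script a provisioner could attach to a new instance. The first manager runs
// `docker swarm init`; all other nodes run `docker swarm join` against a
// manager address with a join token obtained out of band.
func renderUserdata(role, managerAddr, joinToken string) string {
	switch role {
	case "first-manager":
		return fmt.Sprintf("#!/bin/sh\ndocker swarm init --advertise-addr %s\n", managerAddr)
	case "manager", "worker":
		return fmt.Sprintf("#!/bin/sh\ndocker swarm join --token %s %s:2377\n", joinToken, managerAddr)
	}
	return ""
}

func main() {
	// Example: userdata for a worker joining an existing swarm.
	fmt.Print(renderUserdata("worker", "10.0.0.10", "SWMTKN-1-example"))
}
```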

@friism
Contributor

friism commented Jun 27, 2016

On other platforms where there are no Docker-provided machine images, we must install, configure and launch the latest Docker Engines for this purpose

As much as possible, I think we should also try to auto-build and publish machine images for those platforms. If nothing else, I'm guessing Moby has to work there.

I'm guessing there's an implied assumption that there can be only Moby Linux? (I personally think that's great, btw)

@friism
Contributor

friism commented Jun 27, 2016

@chungers there's no mention of authentication - is that assumed to be handled by Docker Cloud? I think that's a good design, but it might be worth spelling out how libmachete gets that and sets it up.

It might also be worth spelling out how libmachete gets credentials to operate the IaaS, how it stores them and what happens if creds have to be rolled.

@chungers
Contributor Author

chungers commented Jun 27, 2016

On Moby Linux:

  1. Certainly a great idea, and for AWS / Azure this would simplify the implementation greatly.
  2. Distribution -- however, custom image support may not be available on all platforms. From a first look at the API docs:
    • Digital Ocean seems to have support for custom images
    • Packet seems to support only a Packet-defined list of operating systems
    • GCP seems to have support for custom / shared images
    • Note that these may not be the same as publishing supported images (as in through some marketplace mechanism)
  3. Distribution of the images may have to go through official channels? I will leave this as a question for Partner / Product Management.
  4. There will still be a requirement to allow users to install / run Swarm on top of their own images (in many cases, larger shops have images that have gone through internal security audits / patches).

This issue seems to be RFD-worthy, in particular because it would also involve the team in Cambridge in ensuring that Moby is ready for public release. @justincormack @nathanleclaire

@fermayo - can you comment on the requirements from Cloud on the use of Moby? Thanks

@friism
Contributor

friism commented Jun 27, 2016

@chungers agree image distribution is a mess - even on Azure it's shaping up to be a major pain.

There will still be requirements to allow users install / run Swarm on top of their own images

Agreed, but we may be able to ignore that for now for the purposes of Docker Cloud since that's already a regimented setup and environment that we prescribe. I defer to @fermayo though.

@nathanleclaire

As far as other cloud platforms go, I'm adamant that we should roll out support for other platforms very slowly and only once we start to have a really definitive picture of "what it means to be a Docker edition". Mandating that Moby Linux is used for the image could be one such requirement, but I don't think we should be in any hurry to open source Moby just to make other editions possible. Keeping Moby free from the OSS firehose right now is critical.

As tempting as it is to release tools for each cloud platform ASAP, AWS and Azure will hit 80%+ of customer requirements in the short term, and we need to make sure that we are confident in both the product offering and the expectations of partners when we roll it out to other platforms. Otherwise, it will be chaos (the definition of what a cloud edition should be is still very nebulous even to us).

Very few of the people who got drivers merged into Machine actually stuck around to ensure their quality or contribute to the core platform once the initial PRs were merged. Downstream contributors have to be supervised and followed up with and that eats up a ton of time and resources we would be better off devoting to AWS, Azure, OSX, and Windows editions. If we release buggy or half-baked editions for other cloud platforms Docker, Inc. will be blamed and held responsible, not these other cloud platforms.

@nathanleclaire

@chungers So, here's the main challenge with libmachete on Azure as I see it:

As far as I can tell (and I've discussed this with Azure team members), Azure doesn't have any equivalent of Instance IAM roles -- at all! So, almost by definition, getting libmachete on Azure to work roughly equivalently to what we have for AWS today is a secrets management problem.

Basically, what we're working with out of the gate when trying to make Azure API requests from a VM running on Azure is:

  1. Somehow the user needs to be able to connect to their browser portal to authenticate a given client. The way you authenticate the azure xplat CLI, for instance, is that you run azure login, and it spits out a URL that you go to in order to obtain a login token, which you manually feed back into the CLI to authenticate that client. I'm not sure if credentials obtained this way could be re-distributed to other instances.
  2. By default these client credentials expire every 2 weeks and would need a refresh :| (hopefully they've fixed the bug that would cause them to become invalid almost immediately). So we'd need a way to refresh the credentials as well.

It used to be the case that in the old "Classic" (pre-Azure Resource Manager) deployment model one could authenticate with an account ID and certificate, but it's hard to divine from the Azure docs which one we should be moving forward with -- presumably ARM (and many features with one auth method don't work with the other and vice versa, so choosing wrongly here could really mess us up later).
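
For reference, the ARM path boils down to a standard OAuth2 client-credentials grant against Azure AD. A minimal sketch, assuming a service principal's tenant ID, client ID and secret are already available to libmachete (how they get onto the cluster, and how they are refreshed when they expire, is exactly the open problem above):

```go
package main

import (
	"context"
	"fmt"
	"net/url"
	"os"

	"golang.org/x/oauth2/clientcredentials"
)

func main() {
	// Assumes AZURE_TENANT_ID / AZURE_CLIENT_ID / AZURE_CLIENT_SECRET are set;
	// these are the service-principal credentials this thread is worried about storing.
	conf := &clientcredentials.Config{
		ClientID:     os.Getenv("AZURE_CLIENT_ID"),
		ClientSecret: os.Getenv("AZURE_CLIENT_SECRET"),
		TokenURL:     "https://login.microsoftonline.com/" + os.Getenv("AZURE_TENANT_ID") + "/oauth2/token",
		EndpointParams: url.Values{
			"resource": {"https://management.azure.com/"},
		},
	}
	tok, err := conf.Token(context.Background())
	if err != nil {
		// An error here is what credential expiry / rotation looks like in practice.
		fmt.Fprintln(os.Stderr, "token request failed:", err)
		os.Exit(1)
	}
	fmt.Println("ARM access token expires at:", tok.Expiry)
}
```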

@nathanleclaire

As for the Moby image on Azure, the only way to distribute custom images to other users in Azure (using ARM templates alone, without any additional hacks) is to actually get the image vetted and approved in their "Azure Marketplace". Users can have "staging" images that are not public, so we could distribute them privately, but no one-click deploy (to other accounts) is possible without going through their image Marketplace.

So, we are planning to work with them to begin this process ASAP.

@nathanleclaire

FWIW, as far as initial Swarm bootstrap goes, I am still of the position that we should work with the Swarm team to have a first-class way for Swarm to do its own leader election given an arbitrarily started set of nodes. What we are doing with DynamoDB today is taking the leader election into our own hands. Editions shouldn't handle distributed logic; Swarm should.
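
For context, the DynamoDB coordination mentioned above is essentially a conditional-write lock. A minimal sketch of that pattern (table name, key and item layout are illustrative, not the actual Docker for AWS schema):

```go
package main

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

// tryAcquireLeadership does a conditional PutItem on a single lock item:
// only one manager node succeeds, the rest fail the condition and act as followers.
func tryAcquireLeadership(table, nodeID string) (bool, error) {
	svc := dynamodb.New(session.Must(session.NewSession()))
	_, err := svc.PutItem(&dynamodb.PutItemInput{
		TableName: aws.String(table),
		Item: map[string]*dynamodb.AttributeValue{
			"LockKey":  {S: aws.String("swarm-primary-manager")},
			"Owner":    {S: aws.String(nodeID)},
			"Acquired": {S: aws.String(time.Now().UTC().Format(time.RFC3339))},
		},
		// Succeeds only if no other node has claimed the lock yet.
		ConditionExpression: aws.String("attribute_not_exists(LockKey)"),
	})
	if err != nil {
		// A ConditionalCheckFailedException simply means another node is already leader.
		return false, err
	}
	return true, nil
}

func main() {
	ok, err := tryAcquireLeadership("swarm-bootstrap-lock", "i-0123456789abcdef0")
	fmt.Println("leader:", ok, "err:", err)
}
```

If SwarmKit handled bootstrap / leader election natively, this kind of external coordination would not be needed at all.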

@friism
Contributor

friism commented Jun 27, 2016

FWIW, as far as initial Swarm bootstrap goes, I am still of the position that we should work with the Swarm team to have a first-class way for Swarm to do its own leader election

+1, tracking here: moby/swarmkit#853

Also agree that we're not at all ready to involve 3rd parties in the edition maintenance process. If we do Digital Ocean / Packet / Softlayer it has to be us doing it, building Moby and templates based on whatever infrastructure we come up with for AWS / Azure.

That still leaves us to determine how libmachete will support @fermayo and @mghazizadeh's roadmap:

  • Cloud regresses to only support AWS and Azure?
  • Libmachete and the Editions team take on publishing Moby machine images for the currently supported IaaS providers (Packet, Softlayer, DigitalOcean)?
  • Libmachete sets up Packet, Softlayer, DigitalOcean in some more primitive way, e.g. on top of Ubuntu or using BYOM?

Agree with @chungers that this is RFD-worthy.

@chungers
Contributor Author

chungers commented Jun 27, 2016

On Authn:

  1. On Azure, we will assume the OAuth flow (instead of the older management certificate method). This is my understanding of the plan with Cloud.
  2. For Cloud, my understanding is that Cloud will have the user's access key and secret, and there is a mechanism in libmachete today to support sending in the credentials via an API call (HTTP POST) -- see the sketch after this list.
  3. After the credentials are sent to the Engine / libmachete, future libmachete calls to provision instances or even launch new swarms will not require passing IaaS API credentials over the wire, as they are managed by libmachete.
  4. The same libmachete API will be used to update API credentials in the event of API key rotation. Triggering this update, however, is the responsibility of the client of libmachete (Cloud app server, Engine CLI).
  5. Currently the storage of secrets (e.g. SSH keys, API credentials) in libmachete uses an abstraction over a filesystem. The actual storage mechanism -- keywhiz, vault, some encrypted secure storage -- is TBD. There is a hard dependency here on the actual storage implementation.
  6. For the storage implementation, a distributed (and secure) filesystem is preferable over some centralized kv store that requires additional management. A peer-to-peer setup like Infinit could be of use here?
  7. We will leverage AWS IamRole-ish functionality where possible or where it makes sense. AWS is more mature in this regard and this is a big feature gap in Azure.
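
A minimal sketch of what point 2 could look like from the client side. The endpoint path and payload shape are hypothetical -- the issue only states that credentials are sent to libmachete once via HTTP POST and then managed server-side:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Hypothetical payload: the IaaS credentials Cloud holds on behalf of the user.
	payload, err := json.Marshal(map[string]string{
		"provider":   "aws",
		"access_key": os.Getenv("AWS_ACCESS_KEY_ID"),
		"secret_key": os.Getenv("AWS_SECRET_ACCESS_KEY"),
	})
	if err != nil {
		panic(err)
	}
	// Hypothetical endpoint; the same call would be reused for key rotation (point 4).
	resp, err := http.Post("http://localhost:8888/credentials/aws", "application/json", bytes.NewReader(payload))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```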

On Authz:

  1. There's an effort within Project Neptuno to provide engine-API-level authn and authz.
  2. From libmachete's perspective, once trust between it and its client has been established, it's the responsibility of the client to enforce authz, potentially supporting role-based access to the libmachete API (since infrastructure provisioning clearly requires admin / super-user access).
  3. The trust between the Cloud client and the remote swarm / engine API / libmachete is TBD (maybe already addressed in Neptuno).

@nathanleclaire

Currently the storage of secrets (e.g. SSH key, API credentials) in libmachete uses abstraction over a filesystem. The actual storage mechanism -- keywhiz, vault, some encrypted secure storage -- is TBD. There is a hard dependency here on the actual storage implementation.

So I guess maybe we just punt a bit on the actual secrets management piece, and assume we will have just a filesystem to read/write secrets from? Then fill that gap in later. Since secrets management will need to be solved for Docker in general, it makes sense to try and integrate directly with that piece for this use case instead of rolling our own if we have a pretty strong guarantee that it will arrive in the next 6 months or so.
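
A thin abstraction would let us punt cleanly: callers only see a key/value secret interface, and the filesystem backend can later be swapped for vault / keywhiz / whatever Docker-wide secrets management lands. A rough sketch of what that interim shape could be (names are illustrative, not the current libmachete API):

```go
package secrets

import (
	"io/ioutil"
	"os"
	"path/filepath"
)

// Store is a hypothetical abstraction: read/write opaque secrets by key so the
// backend (filesystem now, vault/keywhiz/etc. later) can change without touching callers.
type Store interface {
	Put(key string, value []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// fsStore is the interim filesystem-backed implementation discussed above.
type fsStore struct{ dir string }

func NewFileStore(dir string) Store { return fsStore{dir: dir} }

func (s fsStore) Put(key string, value []byte) error {
	return ioutil.WriteFile(filepath.Join(s.dir, key), value, 0600)
}

func (s fsStore) Get(key string) ([]byte, error) {
	return ioutil.ReadFile(filepath.Join(s.dir, key))
}

func (s fsStore) Delete(key string) error {
	return os.Remove(filepath.Join(s.dir, key))
}
```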

@fermayo

fermayo commented Jun 28, 2016

My thoughts on what you guys are discussing in this thread:

Regarding cloud providers to support
The most critical ones to support are AWS, Digital Ocean and Azure, in that order (if we base it on current Cloud stats). GCE is not currently supported in Cloud, but it's one of the big providers we are missing. Packet and Softlayer are lower priority and I don't think we should rush to support them. Digital Ocean, Packet and Softlayer all carry CoreOS as a base OS image, so there's definitely a way to get Moby in there if the strategy is to always use Moby for simplicity/support/security/other reasons. IMHO I would prefer that to mixing Moby with other distros just to be able to support more providers. But I agree with you that this is a bigger discussion (RFD).

In any case, we should always support "BYON" (or in this case, bring-your-own-cluster, "BYOC"), which is extremely popular in Cloud today, so we will make sure clusters deployed outside of Cloud can be easily "registered". This will make supporting low priority providers such as Softlayer even lower priority.

Authentication and Authorization
Our current proposal (pending approval, RFD71) is for Cloud to provide a native authorization plugin (and possibly an authentication plugin) that uses Docker ID accounts and the teams and orgs information stored in Cloud as the source of truth for authN/authZ. In this case, libmachete will need to deploy this plugin as part of the provisioning process. We will also need to install the plugin as part of the "registration" of already-deployed clusters with Cloud (BYOC).
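
For anyone unfamiliar with the mechanism: a Docker authorization plugin is just an HTTP service that answers AuthZPlugin.AuthZReq / AuthZPlugin.AuthZRes calls with an allow/deny decision. A minimal sketch follows -- the actual policy (checking Docker ID, team and org membership against Cloud) is stubbed out, and a real plugin would typically listen on a unix socket under /run/docker/plugins rather than a TCP port:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type authZResponse struct {
	Allow bool   `json:"Allow"`
	Msg   string `json:"Msg,omitempty"`
	Err   string `json:"Err,omitempty"`
}

// decide is a placeholder: a real plugin would inspect fields such as
// req["User"], req["RequestMethod"] and req["RequestUri"] and consult
// Cloud's teams/orgs data before deciding.
func decide(req map[string]interface{}) authZResponse {
	return authZResponse{Allow: true}
}

func main() {
	handler := func(w http.ResponseWriter, r *http.Request) {
		var req map[string]interface{}
		_ = json.NewDecoder(r.Body).Decode(&req)
		_ = json.NewEncoder(w).Encode(decide(req))
	}
	http.HandleFunc("/AuthZPlugin.AuthZReq", handler) // authorize the request
	http.HandleFunc("/AuthZPlugin.AuthZRes", handler) // authorize the response
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```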

Endpoint information
In order to simplify the overall implementation, the current proposal is for Cloud to store an endpoint (or a list of endpoints) where the managers of a specific swarm are reachable.

For Docker for AWS, for example, this will mean an ELB that always points to the managers, which libmachete (the AWS controller?) keeps up to date should the list of managers change.
For "BYOC", the user will manage this and specify this information on registration, and keep it up to date manually.

Provider credentials on the cluster
Tough one. Cloud will definitely store provider credentials in order to do the provisioning of the cluster. Ideally the cluster is self-sufficient and can authenticate with the underlying cloud provider without any stored credentials (using instance roles). Due to the lack of support for this functionality in any cloud provider other than AWS, I agree we need to solve this in a more generic way.

The easiest solution would be for libmachete to store credentials on the cluster (securely) and for the user to update them when necessary. If the cluster is deployed through Cloud, we could copy/sync those credentials to the cluster, and let the user keep them updated in Cloud. But definitely needs a bit more thought.

Auto-scale swarm
Not sure of the requirement here - Cloud will only need manual scaling up and down of the cluster.

Swarm bootstrap

FWIW, as far as initial Swarm bootstrap goes, I am still of the position that we should work with the Swarm team to have a first-class way for Swarm to do its own leader election

+1

@friism
Contributor

friism commented Jun 28, 2016

@fermayo sounds good - to confirm, with BYOC, libmachete will not get involved, right?

@fermayo

fermayo commented Jun 28, 2016

@friism not if we make the user specify the managers' endpoint manually for all BYOC. However, when doing BYOC on a Docker for AWS/Azure cluster, we could make it a bit more elegant: once the engine ports are secured using Cloud's plugin, libmachete could create an ELB that points to the managers (and keeps it up to date) and automatically register that endpoint with Cloud, instead of making the user do it manually. WDYT?

chungers pushed a commit to chungers/infrakit that referenced this issue on Sep 30, 2017: …fo_realz_this_time -- "Bump AMIs to 1.12.0-rc3 for reals this time"