OrleansDockerUtils -> Docker and Docker Swarm Clusters for Orleans #2571
Cross-referencing the concern from #2542 that applies here as well - #2542 (comment).
I'm closing this issue and the related PR since they aren't required anymore. With the latest release of Docker (1.13) we don't need an extra provider to run Orleans in it. I'll create a doc page and a sample to explain how to make it work.
Interesting. Please do. I will be interested to read.
@galvesribeiro - did you ever get around to creating the doc page? I am just starting to investigate how to get our service running in Docker and could use some direction :-). |
@mtdaniels - Will try get to it today. If you have any immediate questions please ping me on gitter and I'll try to help. |
@gabikliot and others interested in Orleans+Docker check out #3004 and #3005 |
This issue is for discussion about the design of `OrleansDockerUtils`, so we can add native support for Docker containers as Orleans Silos. The implementation is in PR #2569.
Goal
The idea behind this effort is to enable an Orleans Silo to run in containers with minimal effort by leveraging both Orleans' provider model and the Docker APIs. This PR introduces a `DockerMembershipOracle` and a `DockerGatewayProvider`, following the design of #2542 but using Docker APIs instead of SF services.
Some context
Without going too deep, here is a brief explanation of common topics in the Docker ecosystem, so that reviewers feel more comfortable with it.
Docker Daemon
This is the Docker process running in the underlying OS (`dockerd` on Linux and macOS, `dockerd.exe` on Windows). This process is responsible for managing the containers and provides a way (using `libcontainer`) to talk to the underlying OS kernel in order to provide the container runtime environment. Each Docker host (in production or on your development machine, whether single-node or clustered) has one instance of this process. The daemon also exposes an API so that 3rd-party developers can integrate with and automate container management.
Docker Swarm
Swarm is the most used clustering (and orchestration) technology for Docker, created by Docker Inc. Swarm implements almost 100% of the Docker APIs, which means that Docker CLI (and API) commands issued to Swarm are (almost) no different from the ones issued to a single-node Docker Daemon. While in Swarm mode, commands are issued to the Swarm Manager nodes, which apply the requested actions across the cluster of Docker Daemons.
Docker Compose
A YAML file describing grouped/correlated services/containers and the dependencies between them, as well as networking and volume mapping settings. This file can be pushed to a single Docker Daemon or to a Swarm cluster and will deploy the whole set of containers, networks, and whatever other artifacts it describes. It is the most used composition tech for Docker containers nowadays.
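As a hedged illustration, a minimal `docker-compose.yml` for a silo service might look like the sketch below. The image name, ports, and deployment id are hypothetical; the label keys are the ones defined later in the Design section.

```yaml
version: "3"
services:
  silo:
    image: mycompany/orleans-silo:latest   # hypothetical silo host image
    labels:
      com.microsoft.orleans.silo: ""               # IS_DOCKER_SILO (marker, no value)
      com.microsoft.orleans.siloport: "11111"      # SILO_PORT
      com.microsoft.orleans.gatewayport: "30000"   # GATEWAY_PORT
      com.microsoft.orleans.deploymentid: "prod-1" # DEPLOYMENT_ID
    networks:
      - orleans-net
networks:
  orleans-net:
```

Pushing this file to a single daemon or a Swarm cluster deploys the whole set, and every container created from the `silo` service carries these labels.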
Container
A container is basically an isolated share of the host kernel with its own processes. PID 1 of each container is the command passed when the container starts, or the one described in the `Dockerfile` (the container image description file). In our case it would be our Silo host application. If for whatever reason that process crashes or exits, the container will also exit.
Labels
Docker allows one to add metadata to containers and/or images through labels. They are essentially key/value pairs. A label can be anything, as long as the key and value can be represented as strings and the value survives `JSON.stringify()`. You can even apply a label without a value.
Design
I'm trying to make clustering of Orleans silos simple by leveraging the existing features of Docker. I'm not going to go into details about how a Swarm (or any other) cluster works, since that is beyond the scope here, but some aspects will be mentioned when necessary. Whether we run against a single Docker Daemon, a Swarm cluster, or any other clustering stack, as long as it implements the Docker API, we should be fine.
The main idea is to follow almost the same implementation as the SF Membership Oracle and Gateway provider, except that instead of having partitions and SF discovery services to keep the cluster data, we are not keeping the cluster data anywhere! (I know it sounds crazy, but keep reading.)
The following labels were defined. They can be applied to each container at startup, in `docker-compose.yml`, or in their images, so they are always applied to every container derived from them (and can be overridden when starting a container):
`IS_DOCKER_SILO = "com.microsoft.orleans.silo"` -> Should be set on Docker images/containers which run an Orleans Silo host
`SILO_PORT = "com.microsoft.orleans.siloport"` -> The inter-silo port used in the cluster for all Silos
`GATEWAY_PORT = "com.microsoft.orleans.gatewayport"` -> The client gateway port
`GENERATION = "com.microsoft.orleans.generation"` -> The silo generation
`DEPLOYMENT_ID = "com.microsoft.orleans.deploymentid"` -> The key for a deployment using this component. It makes sure that everyone involved in the Orleans deployment (client or silo) talks to everyone on the same deployment.
Besides the common Orleans membership protocol (pings, etc.), which is already known, and the implementation shared with the SF Membership Oracle, here is how the two key components of this Membership Oracle for Docker work:
`DockerSiloResolver` -> This class is responsible for reading the Docker API (currently on a `Timer`), refreshing its silo list, and publishing it to `IDockerStatusListener` listeners. It queries the Docker APIs for all containers that have the `DEPLOYMENT_ID` label set to the `deploymentId` and the `IS_DOCKER_SILO` label with no value. This ensures that even if there are other containers using that deploymentId for something else (say, if the developer uses it as a general aggregate of containers), only the containers which are supposed to host silos are returned.
`DockerMembershipOracle` -> When initialized, it registers itself with the `IDockerSiloResolver` registered on DI and waits for updates. Its implementation is essentially a copy of the SF one, so there is not much to explain about it. (I would suggest, in a later PR, making this class abstract or something similar so we can remove this boilerplate code.)
`DockerGatewayProvider` -> When initialized, it registers itself with a `DockerSiloResolver` (no DI on the client!) and waits for updates. It inherits most of the implementation of the SF one and uses the same filter logic, but only returns to the client the silos which have gateways installed. Yes, for now I'm assuming all silos in the cluster are gateways. We could probably change that and also require the `GATEWAY_PORT` label. For the sake of V1, we can live with it.
This implementation works. Silos come up and down (and crash) and are refreshed in the silo/gateway lists. You are either Up or Down. Since a container is immutable, you don't restart a container because the silo process is dead. If it dies, the container is automatically killed and discarded, and you need to start another fresh container, which will have a new container name, IP address, ID, etc.
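The filtering rule described above (match on `DEPLOYMENT_ID`, require the `IS_DOCKER_SILO` marker with no value) can be sketched as follows. This is Python pseudocode against hypothetical container metadata, not the actual C# implementation:

```python
# Sketch of the DockerSiloResolver filter: given container metadata as
# returned by a "list containers" style call, keep only the containers
# that host silos for the given deployment. Shapes are illustrative.

IS_DOCKER_SILO = "com.microsoft.orleans.silo"
DEPLOYMENT_ID = "com.microsoft.orleans.deploymentid"

def resolve_silos(containers, deployment_id):
    """Return the containers that host silos for this deployment."""
    silos = []
    for c in containers:
        labels = c.get("Labels", {})
        # Must carry the deployment key with the matching value...
        if labels.get(DEPLOYMENT_ID) != deployment_id:
            continue
        # ...and the silo marker label, which is present but valueless.
        if IS_DOCKER_SILO not in labels:
            continue
        silos.append(c)
    return silos

containers = [
    {"Id": "a1", "Labels": {DEPLOYMENT_ID: "prod-1", IS_DOCKER_SILO: ""}},
    {"Id": "b2", "Labels": {DEPLOYMENT_ID: "prod-1"}},  # same deployment, not a silo
    {"Id": "c3", "Labels": {DEPLOYMENT_ID: "test-9", IS_DOCKER_SILO: ""}},  # other deployment
]
print([c["Id"] for c in resolve_silos(containers, "prod-1")])  # -> ['a1']
```

This is why non-silo containers sharing the deploymentId are harmless: the marker label, not the deployment key alone, selects the silos.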
Silo health
The common MBR providers we have (Azure, SQL, AWS) rely on a consistent persistence table to store the cluster's silo health. In the membership protocol, silos "ping" each other; if a silo doesn't get a reply, it suspects that the target silo is dead and writes its own identity plus the timestamp of that attempt to the suspected silo's record in the MBR table. If the number of suspicions is greater than the configured threshold, that silo is marked as dead and no messages/activations are sent to it.
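To make the table-based voting rule concrete, here is a minimal sketch (hypothetical names and threshold, not Orleans code) of the suspicion mechanism just described:

```python
# Each failed ping appends a (voter, timestamp) suspicion to the suspected
# silo's record; once enough distinct voters have voted, the silo is
# declared dead. The threshold value here is an assumption.
import time

NUM_VOTES_FOR_DEATH = 2  # hypothetical configured threshold

def record_suspicion(table, suspect, voter):
    """Write 'voter suspects suspect at this time' into the MBR table."""
    table.setdefault(suspect, []).append((voter, time.time()))

def is_dead(table, suspect):
    """Dead once distinct voters reach the configured threshold."""
    voters = {v for v, _ in table.get(suspect, [])}
    return len(voters) >= NUM_VOTES_FOR_DEATH

table = {}
record_suspicion(table, "siloB", "siloA")
print(is_dead(table, "siloB"))  # one voter -> False
record_suspicion(table, "siloB", "siloC")
print(is_dead(table, "siloB"))  # two distinct voters -> True
```

The point of the sketch is that the rule needs a shared, writable store for the votes, which is exactly what the Docker-based implementation lacks.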
OK. Given that context, we have a problem with this implementation that relies on Docker...
We don't have a table! This implementation doesn't have an `IMembershipTable`; we are instead implementing an `IMembershipOracle`. The idea behind this provider is to rely on Docker labels to aggregate silos and gateways by attaching labels to the containers that are part of the cluster.
There are 2 ways to detect whether a silo is alive in this implementation:
The container `running` state.
A `HEALTHCHECK` instruction in the `Dockerfile` -> This runs an arbitrary command inside the container. We can use it to try to connect to the silo ports (a common approach in Linux-based containers). If the command can't connect, or times out, Docker will kill the container, and the next refresh of the MBR Oracle will catch that change.
The problem is: what if the other containers/silos in the cluster can't ping another silo, for example during a network partition? That is why we have the votes. The problem is that we can't store anything in Docker. The labels aren't updatable, and the Raft implementation in Swarm is just for Swarm; we can't touch it.
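A hedged example of such a `HEALTHCHECK`: the base image, paths, and port below are hypothetical, and the probe simply checks that the inter-silo port accepts TCP connections. Note that Docker itself marks the container unhealthy; an orchestrator (e.g. Swarm mode) is what replaces the failed task.

```dockerfile
FROM microsoft/dotnet:latest          # hypothetical base image for the silo host
COPY ./silo /app
WORKDIR /app

# Probe the inter-silo port every 30s; after 3 failed probes the
# container is marked unhealthy, and Swarm can replace the task.
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD bash -c "exec 3<>/dev/tcp/localhost/11111" || exit 1

ENTRYPOINT ["dotnet", "OrleansSiloHost.dll"]
```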
My first idea was to use another approach common on Unix systems: creating a lock file.
Suggested approach to votes
The votes would happen through these lock files: when a silo suspects another one, it drops a file into the suspected silo's directory. In case of a network partition (or any other reason), SiloB's Oracle also keeps checking its own directory, so it can suicide in case the number of files is greater than the allowed threshold.
I know it is not the best thing to do (dealing with files), but I don't see another way to make the votes work. Suggestions are appreciated.
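One possible sketch of the lock-file voting idea, under stated assumptions (the directory layout, file naming, and threshold below are hypothetical): each silo owns a directory, other silos drop an empty vote file named after themselves into it, and each silo suicides when its own directory fills past the threshold.

```python
# Hypothetical lock-file voting sketch: when SiloA fails to ping SiloB,
# it writes a vote file into SiloB's directory on a shared volume; each
# silo watches its own directory and exits once votes reach the threshold.
import os
import tempfile

VOTE_THRESHOLD = 2  # hypothetical configured threshold

def cast_vote(votes_dir, suspect, voter):
    """An empty file named after the voter acts as the 'lock file' vote."""
    suspect_dir = os.path.join(votes_dir, suspect)
    os.makedirs(suspect_dir, exist_ok=True)
    open(os.path.join(suspect_dir, voter), "w").close()

def should_suicide(votes_dir, me):
    """A silo checks its own directory against the threshold."""
    my_dir = os.path.join(votes_dir, me)
    votes = os.listdir(my_dir) if os.path.isdir(my_dir) else []
    return len(votes) >= VOTE_THRESHOLD

votes_dir = tempfile.mkdtemp()
cast_vote(votes_dir, "siloB", "siloA")
print(should_suicide(votes_dir, "siloB"))  # -> False
cast_vote(votes_dir, "siloB", "siloC")
print(should_suicide(votes_dir, "siloB"))  # -> True
```

Using one file per voter (rather than appending to a shared file) avoids write contention and makes each vote idempotent, at the cost of requiring a shared volume visible to all silos.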
IMHO this is enough for a V1 but I appreciate any suggestions.
Challenges for a V2
Label updates
Since containers are immutable, they can't be changed/updated. Labels, however, are essentially metadata, and Docker allows updating a label at runtime, although there are two downsides. The plan would be to set the `GENERATION` value at silo start and update it afterwards.
Docker event stream
It would be nice if, instead of just polling the Docker API, we could listen to the Docker API's stream of events, like `container_created`, `container_running`, and `container_dead`, and react accordingly to update the silo list. However, the events stream API is very unstable/experimental at the moment and we shouldn't rely on it yet, because only Docker Daemons with experimental mode turned on emit these events. Production environments like AWS or Azure ACS don't have it.
Kill an unresponsive silo
Somehow we could kill a container that is still running but has become unresponsive. This is just a matter of invoking an API. I just don't know when to invoke it, or the best way to declare a silo dead so that this API can be invoked.
I think that is it.
cc: @gabikliot @sergeybykov @ReubenBond