
Docker documentation #3759

Closed
wants to merge 4 commits

Conversation

benjaminpetit
Member

Rewrite of the Docker documentation to use the "Sample-Docker" sample, and addition of a small page about Orleans networking when using containers.

I am currently writing a tutorial on deploying with AKS.

@dsarfati
Contributor

dsarfati commented Dec 7, 2017

We are currently using Orleans in AKS. If you want some help or reviewers, let @seniorquico and me know.

@benjaminpetit
Member Author

@seniorquico @dsarfati yes yes yes, definitely yes. I am trying to write a tutorial from scratch to deploy on AKS, but since you have more experience than I do, please feel free to contact me on Gitter so we can discuss!

@galvesribeiro
Member

As we discussed offline, here are my 2 cents on this PR...

I believe it is a good and simple (as the doc name states) hello world with Docker. My only concern is that we are misleading people by presenting this doc and sample as the officially supported way of running Orleans in Docker, as we do with other documents like SF and Cloud Services.

As I told you, this document contains many misunderstandings about how Docker works in real-world scenarios with Swarm/Compose.

The doc I wrote, which is being deleted in this PR, shows a complete real-world use case: docker-compose/Swarm for production, and docker-compose for development/debugging. I'm still using it for my daily work and side projects, and it works perfectly for me and for others in the community who asked me for that document. The only things lacking in that document are Kubernetes coverage for both dev and production, which I haven't had time to finish the doc/sample for yet, and updated image names for the latest .NET Core 2.0 images. All the rest remains the same.

So, I still think introducing this document/sample will mislead people into thinking that Orleans is supposed to run this way, and will definitely bring more confusion and questions about how to run Orleans on containers in real life.

Don't take this the wrong way; I'm just saying that the other doc is a complete real-world scenario (even though it needs updates), and it's a shame it will be deleted in favor of this document, which doesn't show how you are supposed to use Orleans in Docker.

@seniorquico
Contributor

@galvesribeiro There are some notable differences between the Docker and K8s networking models:

https://kubernetes.io/docs/concepts/cluster-administration/networking/

  • all containers can communicate with all other containers without NAT
  • all nodes can communicate with all containers (and vice-versa) without NAT
  • the IP that a container sees itself as is the same IP that others see it as

We are running an Orleans cluster in Azure AKS using the AzureTable membership provider. The client and silo pods run in the same K8s cluster and live on the same VNet (the Azure AKS networking configuration is similar to that on GCE). The Azure AKS team pledged in a recent webinar to enable support for custom networking configurations in late Q1 or early Q2 2018. We use Minikube for local dev/test.
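
For reference, here is a minimal silo-side sketch of that setup (assuming the Orleans 2.x hosting API and the Microsoft.Orleans.Clustering.AzureStorage package; the cluster/service ids, ports, and connection string below are placeholders, not our actual configuration):

using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Orleans.Configuration;
using Orleans.Hosting;

public static class SiloProgram
{
    public static async Task Main()
    {
        var silo = new SiloHostBuilder()
            // Cluster identity shared by every silo and client in the deployment.
            .Configure<ClusterOptions>(options =>
            {
                options.ClusterId = "orleans-docker-demo"; // placeholder
                options.ServiceId = "sample-app";          // placeholder
            })
            // Azure Table membership: each silo pod registers its own IP here.
            .UseAzureStorageClustering(options =>
                options.ConnectionString = "<azure-storage-connection-string>")
            // Advertise the pod's own IP/ports; this works under the Kubernetes
            // networking model because the IP a pod sees is the IP others see.
            .ConfigureEndpoints(siloPort: 11111, gatewayPort: 30000)
            .ConfigureLogging(logging => logging.AddConsole())
            .Build();

        await silo.StartAsync();
        await Task.Delay(-1); // keep the container process alive
    }
}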

@benjaminpetit With the alignment and release of Google Kubernetes Engine, Azure AKS, and Amazon EKS, I can see the benefit of having a simplified guide for users of these environments. I'm happy to try and pitch in. However, after a quick glance, I agree that the existing documentation calls out a number of gotchas and includes some important notes that appear to have been dropped.

@galvesribeiro
Member

@seniorquico yes, I know those differences in terms of networking. But the overall concept of containers remains the same. That was my point.

@SebastianStehle
Contributor

Are you going to fix the AppDomain.CurrentDomain.ProcessExit stuff for 2.0?

@ReubenBond
Member

@SebastianStehle is there an issue on that? If not, could you open one?

@SebastianStehle
Contributor

The issue is mentioned in the docs

@benjaminpetit
Member Author

@galvesribeiro @seniorquico Yes, my mistake; I think we should not drop the current documentation, there is value in it. Since the current page is more of a step-by-step guide, does it make sense to move it to the tutorial section?

Then, from there, we could add links to the new documentation. For example:

  • Add a link to the networking consideration
  • Move the debugging part to a subsection in the documentation

etc.

What do you think? Can we sync sometime on Gitter or Skype to discuss this? I don't have experience with real production environments using Docker/Swarm/K8s yet (only some small side projects), and your experience there would be very valuable.

@ReubenBond Yes, the issue is #3621. And yes, we are going to fix it for 2.0 and find a workaround on Windows if needed.

@seniorquico
Contributor

@benjaminpetit I can make myself available for a Gitter/Skype discussion. I can monitor Gitter for time proposals, or feel free to reach out via email: kyle@zapic.com


It is important that silo and client can communicate between each other. The easiest way to ensure that is that they both are on the same network.

When a silo starts, it will write its IP Adress on the Membership Table. This entry will then be used by other silos and clients from the cluster to interact with it.
Contributor

'Adress' typo

@galvesribeiro
Member

@benjaminpetit I'm OK with your suggestion. I just want to make sure that everyone looking for a complete end-to-end scenario is covered.

@veikkoeeva
Contributor

veikkoeeva commented Dec 13, 2017

Recording Metaparticle and its use of Docker from .NET here as a potential point of interest: https://metaparticle.io/tutorials/dotnet/.

Edit: https://github.com/Azure/open-service-broker-azure

@ifle

ifle commented Dec 29, 2017

Is there any progress? The Docker sample in the master branch is out of date. How should we deploy Docker to production?
@benjaminpetit @galvesribeiro @seniorquico @SebastianStehle @ReubenBond
Can someone write a blog post about real-world experience using Docker in production? Are there any problems using Docker with Service Fabric?

@benjaminpetit
Member Author

Yes, still working on it...

I don't have any experience running a production service using Docker, but it would be awesome if someone could share their experience!

@seniorquico
Contributor

Here's an overview of our team's experience setting up dev, test, and production workflows around a Kubernetes environment.

We have a project with an Orleans cluster and an ASP.NET frontend. It was built to run in a Service Fabric environment on Azure on Windows VMs. We recently upgraded to the Orleans 2.x beta packages, ASP.NET Core 2, and changed the projects' target frameworks to netstandard2.0/netcoreapp2.0. Our project now runs in a managed Kubernetes environment on Azure on Linux VMs. The upgrade was relatively easy, only a few cross-platform API gotchas IIRC.

Of note, we were able to retain our Orleans cluster membership strategy (Azure Table storage) thanks to the networking requirements imposed by Kubernetes and the networking implementation provided by the managed Kubernetes environment.

Here's a simplified and minimal sketch of our system:

[image: simplified diagram of the Kubernetes system layout]

We currently use Visual Studio 2017 and the Visual Studio Tools for Docker extension. The extension provides a Docker project type, a docker-compose project template, and the Visual Studio hooks to run/debug in containers. We have some devs working in Visual Studio Code, but porting the run/debug commands to Visual Studio Code tasks is still in my backlog.

We have a couple of workflows for running dev environments. From Visual Studio, we can run/debug docker-compose/containers using the extension. This currently requires Docker for Windows with a Moby VM. Linux Containers on Windows (LCOW) isn't quite ready (TODO: cite my sources). Depending on the task, we can also run/debug the projects directly on the host OS. From Visual Studio Code, we currently just run/debug the projects directly on the host OS. We create a Minikube environment for running local test environments (which also doesn't support LCOW yet).

Here's an abridged view of the Visual Studio solution:

[image: abridged Visual Studio solution showing the Docker/docker-compose projects]

The extension will run/debug by invoking docker-compose with the docker-compose.yml, docker-compose.override.yml, and docker-compose.vs.debug.g.yml (auto-generated) configuration files. From what I read, Visual Studio Team Services will invoke docker-compose with the docker-compose.yml and docker-compose.ci.build.yml configuration files. We adopted the Visual Studio Team Services pattern and run it on our Shippable CI nodes with just a few additional settings layered in with the docker-compose.ci.build.shippable.yml configuration file like so:

$ docker-compose -f ./docker-compose.ci.build.yml -f ./docker-compose.ci.build.shippable.yml up
$ docker-compose -f ./docker-compose.yml build
$ docker-compose -f ./docker-compose.ci.build.yml -f ./docker-compose.ci.build.shippable.yml down

Our Shippable CD pipeline pushes new container versions to the Kubernetes deployments in our Azure test environment, runs E2E tests, and pushes approved container versions to the Kubernetes deployments in our Azure production environment.

Most of our Docker configuration was generated. If anyone is interested in this approach and has any specific questions about the config, I could look into posting more specific information and/or publishing a sample project.

@ifle

ifle commented Dec 30, 2017

@seniorquico thanks. We have an application in active development and want to deploy it to Azure. We are weighing the best way to publish an Orleans application to Azure: Service Fabric as a stateless service, Docker on Service Fabric, or Kubernetes.
We also have a web application alongside the Orleans backend. Our legacy web application is still classic ASP.NET MVC/Web Pages; it is not a .NET Core application.
Again, thanks a lot.

@seniorquico
Contributor

seniorquico commented Dec 30, 2017

@ifle The "best" way is likely a subjective design decision and/or a technical requirements/limitations decision. I'll admit it-- we were a bit overwhelmed at first by the various architecture options due to the rapidly changing service offerings from Azure and other cloud providers. Just as an Azure example, this marketing page lists the following options for running containers:

...and this list was completely different several weeks ago! Given the specific requirements you highlighted...

I think Service Fabric is currently the only way to run a Windows container on Azure. However, I recall seeing Windows containers on the product roadmap for a couple of the above service offerings. Alternatively, you don't have to run the ASP.NET MVC/Web Pages frontend in a container. Prior to attempting a full container-based deployment, we experimented with running our ASP.NET Core frontend in Windows-based App Service environments. The "disadvantage" we encountered with this approach was the need to secure frontend and Orleans silo communications using a point-to-site VPN. That may not be a problem for your application, though. Also, if TLS support gets baked into Orleans, the point-to-site VPN may not be necessary.

I was new to both Service Fabric and Kubernetes when we started our project. IMO, learning and deploying Kubernetes has been easier and resulted in a more stable deployment. I'm certain the stability aspect is entirely my fault, but it contributes to my reasoning that Kubernetes has been easier. YMMV. I'm sure someone else can/will argue the opposite. Coupling our experience with the scale of the Kubernetes community, the speed of Kubernetes development, and the recent "standardization" of Kubernetes environments on numerous cloud providers... we were swayed to make the migration I outlined earlier. But I'm certain not everyone will agree on these assessments, hopefully for the betterment of the community.

And, not really related to Orleans, but... After moving to the Kubernetes environment, we changed other aspects of our system. Notably, we removed public IPs/load balancers from our Azure VNet. We now use Cloudflare Warp to "hide behind" the Cloudflare edge servers. We're also playing around with VNet Service Endpoints and Storage Firewalls to improve our overall network security strategy. Also looking forward to Kubernetes RBAC in our production environment once it becomes available in AKS.

EDIT: Just a final, parting thought to make this a little more relevant to the PR discussion. Combining all of this with Gutemberg's previous comments, I think this makes Docker deployment documentation a complicated story. The docs could just focus on the core of creating a container with an Orleans silo, but that would leave quite a bit to be desired if you're trying to piece together how to move it to a production environment. On the other hand, trying to document in detail all of the potential ways to achieve a Docker deployment is unrealistic and (most likely) undesirable. Although similar in spirit, our workflow doesn't look anything like the current Docker deployment documentation. Further, parts of it aren't quite correct when applied to our Kubernetes environment (e.g. the warning in the "Gateway configuration" section).

@ifle

ifle commented Dec 31, 2017

Thanks a lot, @seniorquico. Very useful and informative. I agree; maybe SF is the best solution for us.


## [Docker Deployment](Docker-Deployment.md)

Orleans can also be deployed using Docker containers
Member

I think it should state something like:

Orleans can also be deployed for development purposes using regular Docker containers. The recommended, reliable deployment for real-world production scenarios still uses one of the orchestrators supported by Docker, like Swarm or Kubernetes. They provide highly available deployments by spinning up several container replicas across the cluster, one for each silo.

@ifle

ifle commented Jan 10, 2018

How do you configure the client gateway in Docker? According to the documentation we are not using a membership provider; we manually add a gateway to the client configuration. I don't understand how this will work when the Orleans cluster is scaled up or down.

@galvesribeiro
Member

@ifle You will not add silos one by one to the gateway list. Just add one gateway, which is the service name of your Swarm Service object or Kubernetes Service object. That way, requests from the client to the cluster only ever use the Service name, regardless of how many silos are scaled up or down.

@ifle

ifle commented Jan 10, 2018

Thanks for your answer. I used the following code and added one gateway. I don't understand how the client will update the list of silos in the cluster.

var hostEntry = await Dns.GetHostEntryAsync("ifle.server");
var ip = hostEntry.AddressList[0];
config.Gateways.Add(new IPEndPoint(ip, 30000));

Is there an example of how to initialize membership for Kubernetes?

@seniorquico
Contributor

You may use Kubernetes's service discovery features similar to @galvesribeiro's suggestion for Docker Swarm.

Alternatively, you could use one of the existing membership providers. For example, we use the Azure Table Storage membership provider-- same approach as the non-Docker examples. This works within a Kubernetes cluster thanks to the networking model requirements.
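
To make that concrete, here is a minimal client-side sketch under the same assumptions (Orleans 2.x ClientBuilder plus the Azure Table clustering package; the ids and connection string are placeholders). The client pulls its gateway list from the membership table that the silos register in, so nothing silo-specific is hard-coded and scaling silo pods requires no client changes:

using System.Threading.Tasks;
using Orleans;
using Orleans.Configuration;
using Orleans.Hosting;

public static class ClientProgram
{
    public static async Task Main()
    {
        var client = new ClientBuilder()
            .Configure<ClusterOptions>(options =>
            {
                options.ClusterId = "orleans-docker-demo"; // must match the silos
                options.ServiceId = "sample-app";          // placeholder
            })
            // Gateways are discovered from the same Azure Table the silos
            // write to, so no silo/pod IPs appear in the client configuration.
            .UseAzureStorageClustering(options =>
                options.ConnectionString = "<azure-storage-connection-string>")
            .Build();

        await client.Connect();
        // var grain = client.GetGrain<IHelloGrain>(0); // hypothetical grain interface
    }
}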

@ifle

ifle commented Jan 10, 2018

Do you maybe have a small example of an Orleans application, like a hello world, running in Kubernetes?
I'm new to both Orleans and Kubernetes. That would be very helpful. Thanks a lot.

@galvesribeiro
Member

@ifle By using the service name, both Swarm and Kubernetes will resolve it (as in var hostEntry = await Dns.GetHostEntryAsync("ifle.server"); above) to a service IP. That means the client thinks it is connected to a single gateway. In reality, whenever the client sends a message to the service IP, it will be routed to one of the available nodes behind that service, and the client will never know.

So, from the client's perspective it doesn't matter whether you have one or multiple silos in the cluster. All it cares about is that it has a single IP to connect through.

I've been using it for a while and it is working perfectly.

NOTE: People may think the DNS resolution will fail if you make a request after the name has been resolved and the target silo is no longer available. That is not true. The DNS resolution is for the Swarm/Kubernetes service IP, not the target container/silo IP. Once the service is up, its IP remains the same as long as you have at least one node in your Swarm/Kubernetes cluster. Rest assured that your packets will arrive at one of the containers behind the service regardless of the number of containers there.

@seniorquico
Contributor

@ifle I'm sorry, but I don't have a "hello world" Kubernetes example. That is part of the scope of this PR, though.

NOTE: People may think the DNS resolution will fail if you make a request after the name has been resolved and the target silo is no longer available. That is not true. The DNS resolution is for the Swarm/Kubernetes service IP, not the target container/silo IP. Once the service is up, its IP remains the same as long as you have at least one node in your Swarm/Kubernetes cluster. Rest assured that your packets will arrive at one of the containers behind the service regardless of the number of containers there.

To call it out explicitly... This behavior is not in Kubernetes OOTB. Starting with Kubernetes version 1.2, the iptables proxy is the default service proxy implementation. From the docs:

However, unlike the userspace proxier, the iptables proxier cannot automatically retry another Pod if the one it initially selects does not respond, so it depends on having working readiness probes.

https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

Although the DNS resolution will succeed and provide the virtual IP address of the service proxy, packets will only reliably make it to your silo containers if you configure the appropriate readiness/liveness probes. When a silo goes down, the probe will update the service endpoints. When the service endpoints are changed, kube-proxy will update the iptables rules if necessary.
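
To illustrate what those probes need on the silo side (purely a hypothetical sketch, not something from this PR or any Orleans package): the silo container can expose a tiny HTTP health endpoint for the Kubernetes readinessProbe/livenessProbe to poll, so kube-proxy removes the pod from the service endpoints while the silo is not ready.

using System;
using System.Net;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: run inside the silo process and answer
// http://<pod-ip>:8880/healthz/ with 200 while the silo reports healthy,
// so a Kubernetes readiness/liveness probe keeps the pod in (or drops it
// from) the service endpoint list.
public static class HealthProbeEndpoint
{
    public static async Task RunAsync(Func<bool> isSiloHealthy)
    {
        var listener = new HttpListener();
        listener.Prefixes.Add("http://*:8880/healthz/");
        listener.Start();

        while (true)
        {
            var context = await listener.GetContextAsync();
            var healthy = isSiloHealthy();
            context.Response.StatusCode = healthy ? 200 : 503;
            var payload = Encoding.UTF8.GetBytes(healthy ? "ok" : "unavailable");
            await context.Response.OutputStream.WriteAsync(payload, 0, payload.Length);
            context.Response.Close();
        }
    }
}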

@galvesribeiro
Member

To call it out explicitly... This behavior is not in Kubernetes OOTB. Starting with Kubernetes version 1.2, the iptables proxy is the default service proxy implementation. From the docs:

However, unlike the userspace proxier, the iptables proxier cannot automatically retry another Pod if the one it initially selects does not respond, so it depends on having working readiness probes.

I may not have expressed myself correctly. I know how kube-proxy works. I'm just saying that the service name resolves to the service IP. That's all. If a silo dies, whatever Kubernetes does behind the scenes doesn't matter for this particular scenario. What matters is that the client has a stable DNS name which resolves to a stable IP, which in turn redirects to one of the backend pods.

With that in mind, the service will always hit a pod if there is at least one replica available.

@ifle

ifle commented Jan 11, 2018

@seniorquico @galvesribeiro Thanks. I will try to deploy my Orleans application to Kubernetes.
Are there any non-default requirements in Kubernetes for running Orleans? I have a web app (Orleans client) and a backend app (Orleans cluster), both running in Windows containers. Thanks again.

P.S. Sorry for my bad English

sergeybykov added this to the 2.0.0 milestone Jan 11, 2018
@suraciii
Contributor

suraciii commented Jan 13, 2018

@galvesribeiro

In reality, whenever the client sends a message to the service IP, it will be routed to one of the available nodes behind that service, and the client will never know.

But the client ultimately uses a fixed address; after silos scale in or out, the silo address the client used may become invalid(?), so maybe it needs to resolve the silo address in real time,
like resolving addresses in DnsGatewayListProvider.GetGateways(),
or just allowing hostnames in ClientConfiguration.Gateways, so the client could connect to gwy.tcp://siloname directly.

@galvesribeiro
Member

@csyszf No, you don't need the actual silo address. Your grain client can live just fine using the resolved service IP address.

GrainClient -> Service DNS (Kubernetes or Swarm) -> Service IP -> any silo IP

The only requirement for that is that all your silos are configured with the gateway enabled.

@seniorquico
Contributor

@csyszf If you're using Kubernetes, it's the service (specifically, its list of service endpoints) that needs to be updated whenever the set of silo pods changes. The service endpoints need to be updated when new silo pods come online (scale out, after a deployment update, etc.) and go offline (crash, scale in, after a deployment update, etc.).

The appropriate method of updating the service endpoints will be highly dependent on your specific environment and configuration. If you're working with something along the lines of the Kubernetes defaults (e.g. ClusterIP and iptables proxy), the label selector will cover most of the updates automatically. However, be sure to configure readiness/liveness probes on your silo pods to minimize chances of initial connection issues with the iptables proxy (#3759 (comment)).

@benjaminpetit
Member Author

Closing this one since it is now very outdated

benjaminpetit deleted the gh-pages-docker branch August 14, 2019 16:43
github-actions bot locked and limited conversation to collaborators Dec 9, 2023