[Proposal] Deploying truly distributed Druid clusters with docker #21

Open
xiaoyao1991 opened this issue Sep 28, 2016 · 10 comments
@xiaoyao1991

Hey guys,

I see that the image created from this repo runs every node together in a single container. While that's helpful for users who want to test and try out Druid, it isn't particularly useful for deploying a production cluster.

I've been working on using Docker to deploy a truly distributed Druid cluster lately. I have something working and I'd like to share it and contribute it back, though I'm not sure how valuable it will be. Take a look at this fork to see what I've done so far. While the settings in that fork are specific to my team's research purposes, they can be generalized and made extensible. I'm still exploring Docker deployments, so I may have made some stupid mistakes.

Because containers are lightweight, the Docker philosophy encourages running only one service per container and having containers talk to each other, rather than running all dependencies and services in one container. Therefore, I defined a separate image for each type of Druid node (druid-broker, druid-coordinator, etc.) as well as for each dependency (druid-zookeeper, druid-mysql, druid-kafka, etc.). I also packed the JVM and runtime configurations into a separate image (druid-conf).

I had a discussion earlier with @cheddar on deployment stuff. He made it clear that deployment in general should only have 3 steps:

  1. Download the artifact.
  2. Download the configurations.
  3. Run the artifact with the configurations.

I followed this guideline: to run a specific Druid node, say a broker, all you need to do is:

  1. Pull the druid-broker image.
  2. Pull the druid-conf image.
  3. Run druid-conf in a container, and then link it as a volume provider (using --volumes-from) for the druid-broker container.
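
The three steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the image names come from the post, but the `latest` tags on the pulls, the container name `node-1-conf` passed in as an argument, the `run` helper, and the `DRY_RUN` switch are hypothetical. It also assumes the druid-conf image starts with a default command and declares the configuration directory as a volume. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. DRY_RUN and the run helper are assumptions for illustration;
# with DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# start_broker CONF_NAME
# CONF_NAME is a hypothetical name for the configuration container.
start_broker() {
  conf_name="$1"
  # 1. Pull the artifact.
  run docker pull druid-broker:latest
  # 2. Pull the configurations.
  run docker pull druid-conf:latest
  # 3a. Start the configuration container so its volumes can be shared.
  run docker run -d --name="$conf_name" druid-conf:latest
  # 3b. Run the broker with the configuration volumes mounted.
  run docker run -itd --volumes-from="$conf_name" --name=broker druid-broker:latest
}

start_broker node-1-conf
```

Setting `DRY_RUN=0` would execute the commands against whatever Docker host the client currently points at.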

Containers on different nodes can freely communicate with each other as long as they are on the same overlay network. I use docker-machine to provision and manage the remote nodes, and Docker Swarm for container orchestration. Running a broker node, for example, is as simple as:
docker run -itd --volumes-from=node-1-conf --name=broker --network=p-my-net --env="constraint:node==p-node-5" druid-broker:latest
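
As a rough sketch, the same pattern could be extended to several node types on one overlay network. The network name `p-my-net`, the host `p-node-5`, the container `node-1-conf`, and the `druid-broker` and `druid-coordinator` images appear in the post; the host `p-node-4`, the container `node-2-conf`, the `run` helper, and the `DRY_RUN` switch are hypothetical. The sketch assumes each configuration container already exists on the same host as the node that mounts it, and that the Docker client is pointed at the swarm manager. It prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch only. DRY_RUN and the run helper are assumptions for illustration;
# with DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# create_network NAME: create the overlay network shared by all containers.
create_network() {
  run docker network create --driver overlay "$1"
}

# start_node ROLE HOST CONF NETWORK
# Starts the druid-ROLE image on swarm host HOST, mounting volumes from the
# configuration container CONF and attaching to overlay network NETWORK.
start_node() {
  role="$1"; host="$2"; conf="$3"; net="$4"
  run docker run -itd --volumes-from="$conf" --name="$role" \
    --network="$net" --env="constraint:node==$host" "druid-$role:latest"
}

create_network p-my-net
# p-node-4 and node-2-conf are hypothetical; the broker line mirrors the post.
start_node coordinator p-node-4 node-2-conf p-my-net
start_node broker p-node-5 node-1-conf p-my-net
```

Keeping the host, configuration container, and network as arguments means adding another node type is one more `start_node` line, provided a matching `druid-ROLE` image exists.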

@guobingkun
Member

👍 I am totally on board with this.

@guobingkun guobingkun changed the title Deploying truly distributed Druid clusters with docker [Proposal] Deploying truly distributed Druid clusters with docker Sep 28, 2016
@saidimu

saidimu commented Oct 31, 2016

@xiaoyao1991 Any updates on this? Looks awesome!

@xiaoyao1991
Author

@saidimu I have one more thing to confirm before I put something together. In our experimental setup, we were using a simple NFS mount as deep storage instead of HDFS. I'm checking whether the nodes in the swarm overlay network can properly talk to HDFS.

@martin-liu

@xiaoyao1991 This is great. Any updates?

@xiaoyao1991
Author

@martin-liu Thanks. I've opened a preliminary PR (#23). I haven't yet had time to address the comments there that relate to documentation.

@sjtoik

sjtoik commented Oct 11, 2017

I have a setup for docker-compose and one for Kubernetes, if you would be interested in maintaining those.

@stingerpk

@sjtoik Is it possible to share a link to your setup?

@rathko

rathko commented Nov 15, 2017

@sjtoik +1 for Kubernetes setup

@mdh69

mdh69 commented Nov 21, 2017

@sjtoik +1 for Kubernetes setup

@sjtoik

sjtoik commented Dec 13, 2017

@stingerpk I'm still testing different aspects of the deployment and its features. If it isn't too much of a hassle to maintain both our environment-specific version and a public counterpart, I'll publish it.
