Conversation
Early review comment: I wouldn't touch the [...]. The reasoning is that, looking at the scheduling interface, it goes far beyond what a scheduler is supposed to do (for instance, [...]). Rough idea: [...] I believe it resonates better with what those types are doing. Thoughts?
I agree with the naming (for me, most of the comment is about renaming [...]). Although, I'm in favor of keeping the cluster as it is today; the [...]. So I would agree on something like: [...]
What do you think of this? One question though: where would you see the filters and strategies? Only in the Scheduler inside [...]?
I don't think [...]. Since all [...]
I'm not sure; initially I would say I don't think so. The goal of the project remains (by default) to be a simple container scheduler, so it kinda makes sense to keep it as a high-level item. Also, one point I didn't raise earlier: [...]
Good point. I don't know if filters and strategies will be re-used by the [...].
@vieux Thanks for the ping. This PR looks great and I'll be trying to catch up with these changes!
```go
}

// Entries are Mesos masters
func (s *MesosScheduler) NewEntries(entries []*discovery.Entry) {
```
I'm not sure I understand what this function is for.
The comment above says entries are Mesos masters, but in the code it's adding to the cluster, which I assume is slave nodes with swarm launched.
For Mesos there isn't any need to look for new nodes: when new nodes are added to the data center, the framework will get offers from them via the resourceOffer callback, and since you launched the task yourself you also know which slave it is; on the framework side you just need to listen for statusUpdate with TASK_RUNNING to get the information.
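A minimal sketch of that flow, using simplified stand-in types rather than the real Mesos framework API (the actual callback signatures, e.g. in mesos-go, differ):

```go
package main

import "fmt"

// Simplified stand-ins for the Mesos scheduler callbacks; hypothetical types.
type Offer struct {
	SlaveID  string
	Hostname string
}

type TaskStatus struct {
	TaskID  string
	SlaveID string
	State   string // e.g. "TASK_RUNNING"
}

// framework remembers which slave each task landed on.
type framework struct {
	taskToSlave map[string]string
}

// ResourceOffers fires whenever Mesos offers resources. New nodes added to
// the data center show up here automatically, so no separate node discovery
// is needed.
func (f *framework) ResourceOffers(offers []*Offer) {
	for _, o := range offers {
		fmt.Printf("offer from slave %s (%s)\n", o.SlaveID, o.Hostname)
		// ...accept the offer and launch the container as a task...
	}
}

// StatusUpdate fires as tasks change state; TASK_RUNNING tells the framework
// definitively which slave runs the container.
func (f *framework) StatusUpdate(status *TaskStatus) {
	if status.State == "TASK_RUNNING" {
		f.taskToSlave[status.TaskID] = status.SlaveID
	}
}

func main() {
	f := &framework{taskToSlave: make(map[string]string)}
	f.ResourceOffers([]*Offer{{SlaveID: "s1", Hostname: "node-1"}})
	f.StatusUpdate(&TaskStatus{TaskID: "t1", SlaveID: "s1", State: "TASK_RUNNING"})
	fmt.Println("t1 runs on", f.taskToSlave["t1"]) // t1 runs on s1
}
```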
This will be the generic way to receive Mesos masters.
I do think swarm will need access to the list of nodes, to be able to do logs, attach, pause, unpause, ps, etc.
Unless we have a way to look up containers / list containers and get the address of the node they are running on. In that case, we can build cluster.Container objects on the fly when calling Containers() or Container().
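A minimal sketch of that idea, under the assumption of a hypothetical containerLookup backend (none of these names are from the PR):

```go
package cluster

// Container is a hypothetical, pared-down stand-in for cluster.Container.
type Container struct {
	ID   string
	Node string // address of the node the container runs on
}

type containerInfo struct{ ID, NodeAddr string }

// containerLookup is whatever backend can list containers together with the
// address of the node they run on (e.g. the Mesos master).
type containerLookup interface {
	ListContainers() []containerInfo
}

// Containers builds cluster.Container values on the fly from the lookup,
// instead of keeping a locally synced copy of the cluster state.
func Containers(l containerLookup) []*Container {
	out := []*Container{}
	for _, info := range l.ListContainers() {
		out = append(out, &Container{ID: info.ID, Node: info.NodeAddr})
	}
	return out
}
```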
When you say receive Mesos masters, are you referring to when the master changes?
Swarm definitely needs the list of nodes, and you can query the master for all the slaves. It looks like this is being called back from the discovery service. One concern I have: since this is a separate discovery process that is proxying information, if for some reason the mesos slave is down it will still return information as long as the node is alive.
If all the docker commands it needs to proxy are about a running container, it is much better to get the information from Mesos, as that's more definitive.
Indeed, but we still have 2 issues with this:
- events: we need direct access to the nodes to get the events
- images: each time we run docker images, we do not want to query mesos for the list of all the nodes and then query every node for its images; it'll take way too long (see the sketch below).
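To make the cost argument concrete, here is a hedged sketch, with a hypothetical node type, of what direct access buys: a single parallel fan-out over the already-known docker nodes, with no Mesos round trip to discover them first.

```go
package cluster

// node is a hypothetical stand-in for a docker node the manager can reach
// directly over the remote API.
type node interface {
	Images() []string
}

// aggregateImages fans out to the already-known docker nodes in parallel;
// no query to Mesos for the node list is needed first.
func aggregateImages(nodes []node) []string {
	results := make(chan []string, len(nodes))
	for _, n := range nodes {
		go func(n node) { results <- n.Images() }(n)
	}
	var all []string
	for range nodes {
		imgs := <-results
		all = append(all, imgs...)
	}
	return all
}
```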
I updated the PR with @aluzzardi's and @tnachen's suggestions. Still lots of blurry things, but let's do baby steps and iterate :)
I wonder if these should be called "drivers" rather than "APIs" to avoid confusion with the Remote API?
@bfirsh good idea, I updated the title.
Super excited by this PR! If I understand this well: [...]
A few questions though: [...]
Also, thanks @vieux for taking those suggestions into account! I'm strongly leaning toward @aluzzardi's comments on the use of a [...].
In addition to the [...]:

```go
type Node interface {
	ID() string
	Name() string

	IP() string   // to inject the actual IP of the machine in docker ps (hostname:port or ip:port)
	Addr() string // to know where to connect with the proxy

	Images() []*Image             // used by the API
	Image(IdOrName string) *Image // used by the filters

	Containers() []*Container             // used by the filters
	Container(IdOrName string) *Container // used by the filters

	TotalCpus() int64   // used by the strategy
	UsedCpus() int64    // used by the strategy
	TotalMemory() int64 // used by the strategy
	UsedMemory() int64  // used by the strategy

	Labels() map[string]string // used by the filters

	IsHealthy() bool
}
```
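To make the "used by the strategy" annotations concrete, here is a toy example of a strategy consuming the Node interface above; leastLoaded is a hypothetical helper, not swarm code:

```go
// leastLoaded picks the healthy node with the most unused CPUs.
// A toy illustration only, not the actual swarm strategy code.
func leastLoaded(nodes []Node) Node {
	var best Node
	bestFree := int64(-1)
	for _, n := range nodes {
		if !n.IsHealthy() {
			continue // strategies should only place work on healthy nodes
		}
		if free := n.TotalCpus() - n.UsedCpus(); free > bestFree {
			best, bestFree = n, free
		}
	}
	return best
}
```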
Last week @alexwelch and I wrote a prototype Kubernetes cluster based on this PR. We got [...].
By the end of the week, I felt that a k8s driver is technically possible, but wouldn't be compelling to use. Maybe more work (or more k8s experience than I have!) is required, but as it stands Swarm wouldn't take advantage of many features of k8s, and wouldn't behave quite like docker, either. This might change in the future if there is a [...]. I'm very curious to see a real implementation of the Mesos driver. Maybe that work will give us some ideas for the problems we ran into. Sorry for the huge wall of text!
… a simple scheduler interface. Signed-off-by: Victor Vieux <vieux@docker.com>
Usable -> Total & Reserved -> Used Signed-off-by: Victor Vieux <vieux@docker.com>
Thanks @dpetersen for your valuable feedback. This PR is becoming very big, so we might merge it soon and refine the API in another PR; I'll definitely ping you on it.
LGTM
Proposal: Scheduler/Cluster Driver
I just noticed that the [...]. Can you make the two consistent, e.g., rename [...]?
Sure, will do.
@aluzzardi Hi. I pitched up here having followed the PR link from the Scheduler section of your Swarm post, https://blog.docker.com/2015/02/scaling-docker-with-swarm/ - if this is now "Cluster Driver", it is probably worth updating the post?
Hi all,
This is a very simple version of what a Cluster Driver might look like; it's a PoC and comments are very welcome.
Here are the key changes in this PR:
- Cluster is now an interface.
- A Cluster can use a Scheduler or not. Right now, SwarmCluster does and MesosCluster doesn't.
- There is a new mesos.go file with a lot of TODOs.

Of course I'd love to split this into multiple PRs, but here you can see the full picture.
Here is the simple interface:
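The interface body is missing here; below is a rough sketch of what it might contain, inferred only from the methods mentioned elsewhere in this thread (CreateContainer, RemoveContainer, Nodes, Containers, Container, Events, NewEntries). The exact signatures in the PR may differ.

```go
// Inferred from the methods mentioned in this thread; not the exact PR code.
type Cluster interface {
	// Scheduling side: for MesosCluster these go through the Mesos master.
	CreateContainer(config *ContainerConfig, name string) (*Container, error)
	RemoveContainer(container *Container, force bool) error

	// Read side: served by direct access to the docker nodes.
	Nodes() []*Node
	Containers() []*Container
	Container(IdOrName string) *Container
	Events(h EventHandler) error // EventHandler and this signature are assumed

	// NewEntries receives discovery entries; for MesosCluster, Mesos masters.
	NewEntries(entries []*discovery.Entry)
}
```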
The Scheduler is nowhere to be found here, because some Clusters might have one and others not.
might have not and not the other.Regarding
mesos
here how I see things:It will be a "hybrid" framework, we will use mesos for creating and remove container and direct access to the nodes for everything else.
Start with:

```
swarm manage --cluster mesos zk://<ip:port1>,<ip:port2>,<ip:port3>/mesos
```
The MesosCluster will receive Mesos masters from zookeeper in NewEntries, connect to them to get the list of actual docker nodes, and fill its internal cluster.
In CreateContainer and RemoveContainer, it'll use the Mesos master to interact with Mesos.
In Nodes, Containers, Container and Events, it will simply use the internal cluster of docker nodes.
of docker nodes.@aluzzardi: WDYT ?
@tnachen & @benh: can you take a look at the comments in mesos.go and tell me if I'm missing something?
@francisbouvier: thanks for your initial work in #178
@chanwit: we will keep it compiled in for now, but this API shouldn't be an issue with your PR.
Related to #213 and #214