Interface for executing workflows on multiple nodes #1273
Comments
This should be more robust than talking to PBS, right? Not sure how robust it is.
This refers to how the SoS mini-cluster works internally? It sounds like a Redis queue -- once we have "booked" multiple nodes, there have to be some existing solutions for using those nodes to implement the mini-cluster, don't there?
I cannot think of a use case for it, unless someone has lots of desktops in the lab not configured as a cluster, which would be weird (please correct me if I get it wrong). And the mechanism is not going to be robust anyway (each computer has to be configured separately, file sync is a concern, etc.), so I'm not very excited about the non-cluster multi-node mode unless you have a need for it. If we do not have to worry about the non-cluster mode, should we only change the
Our HPC limits outside connections but seems to allow inter-node communication on any port. Otherwise MPI jobs would not be able to run, I think.
Yes, the idea is like sending a task to another server with a process queue, but all in memory and over the network. For example, if previously we needed to rsync a bunch of files on the cluster head node, I would have to submit a bunch of rsync tasks. With this option I am basically starting a sos worker on the head node, which accepts rsync substeps through the network. There are pros and cons to this approach; at least on our system the cluster head node does not allow anything other than ssh on port 22, so the remote worker stuff will not work. This is one of the reasons we used ssh/rsync for everything, to achieve the best compatibility. I agree that we could implement this as another kind of task engine with a different kind of template.
My point is that I don't see this happening often... Mostly I see people working on a single HPC cluster, or running from their desktop but submitting to that cluster, or at most using a couple of random desktop computers for remote tasks (in which case the existing task model is not so bad). But it would be warranted if you have a solid use case. Implementing a new template seems the cleanest and most flexible approach, at least for the beta phase.
It all depends on how far we are willing to go, because if we only assume nodes of a cluster, the interface would be a lot cleaner, and we can automatically assume a shared disk and do not have to worry about path translation and file synchronization. Adding support for non-cluster multi-node computation looks useful, but introduces a lot of complexity (e.g. heterogeneous disks) for both developers and users, with little benefit. Right now I tend to agree that we should focus on the mini-cluster design.
Yes, and we can worry about the other case if there indeed is a need (at least among ourselves).
Currently the `-j` option specifies the number of workers, which are processes on the same machine. Although this more or less works on modern machines with plenty of cores, it would be better to allow the execution of workflows on multiple computers. We are almost there, because the master and workers already communicate through `http://localhost:port`, which can be almost trivially changed to `http://ip.address:port` if we can start the processes on a remote computer, and `-q none` can be used to disable tasks.

So there are currently two problems:
1. How to start a sos worker on a remote machine. This looks like some sort of `sos remote server --port`. We already have the `remote` command and a mechanism to run commands on a remote machine, so this is almost done. However, on a cluster it is a bit more difficult, since we need something along the lines of `mpistart`.

2. What would be a good interface? The question is whether or not we should allow a non-cluster multi-node mode, which is a lot harder to use.
Existing options are:

- `-j`, which could be extended to multiple nodes; this seems to make the most sense.
- `-r` (remote), which could be used to specify multiple remote machines.
- `-q`, which could be used to define special multi-node queues.

In any case we need to specify how workers are started.
`sos` should assume that workers are automatically started by the batch manager, and the master only needs to connect to them. In this case the master does not even have to know the number of workers in advance, and could be more intelligent in working with dead nodes. The procedure could be something like:

- The master process will start working, collect worker information, and treat the registered processes as workers.
- The worker processes will just start as workers. Here we assume that the master node will have 5 workers, and each node will also have five workers.
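The procedure above could be sketched as follows. This is a minimal illustration using plain TCP sockets, where the master does not know the number of workers in advance and simply records whoever registers; the port handling and message format here are assumptions for illustration, not SoS's actual protocol.

```python
import socket
import threading

def run_master(host="127.0.0.1", port=0):
    """Start a master that collects worker registrations.

    The master does not need to know the number of workers in
    advance: every process that connects and announces itself is
    added to the registry, which also makes it easier to deal
    with dead or missing nodes later.
    """
    registry = []  # list of (node_name, n_workers) tuples
    server = socket.create_server((host, port))
    server.settimeout(5)

    def accept_registrations():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                break  # master socket closed or idle: stop accepting
            with conn:
                # one-line registration message: "<node_name> <n_workers>"
                name, nproc = conn.recv(1024).decode().split()
                registry.append((name, int(nproc)))

    thread = threading.Thread(target=accept_registrations, daemon=True)
    thread.start()
    return server, registry, thread

def register_worker(master_host, master_port, name, nproc):
    """A worker announces itself to the master."""
    with socket.create_connection((master_host, master_port)) as conn:
        conn.sendall(f"{name} {nproc}".encode())
```

A worker started on another node would call `register_worker` with the master's `ip.address:port` instead of `localhost`, which is exactly the "almost trivial" endpoint change mentioned above.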
Alternatively, `sos` commands could be started by the master process on the nodes; the users need to specify the machines on which `sos` needs to be started, and optionally how many processes. The process could be started so that there will be 4 local workers, 4 workers on server1, and 4 workers on server2. In this case the workers will be started by `sos`.
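The command-line example in the original post did not survive formatting, but a hypothetical syntax could extend `-j` with per-node counts, e.g. something like `-j 4 server1:4 server2:4` (this syntax is an invented illustration, not an actual SoS interface). Parsing such specifications would be straightforward:

```python
def parse_node_specs(specs):
    """Parse worker specifications of the hypothetical form
    ['4', 'server1:4', 'server2:4'] into (host, n_workers) pairs.

    A bare number means workers on the local machine.
    """
    nodes = []
    for spec in specs:
        if ":" in spec:
            host, _, count = spec.partition(":")
            nodes.append((host, int(count)))
        else:
            nodes.append(("localhost", int(spec)))
    return nodes

# 4 local workers, 4 workers on server1, and 4 workers on server2
print(parse_node_specs(["4", "server1:4", "server2:4"]))
# → [('localhost', 4), ('server1', 4), ('server2', 4)]
```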
Relationship with other options:

- `-r` can remain untouched. That is to say, such a command with `-r server` would copy the workflow to `server`, start 4 workers on `server` and 4 workers on `server1`, as defined by `hosts.yml` on `server`.
- `-q` can remain untouched, as this is still the mechanism for tasks. It seems natural to use `-q none` in this mode, but we could still allow the use of `-q`.