This repository has been archived by the owner on Feb 10, 2021. It is now read-only.

workers can't reach multi-homed scheduler #28

Open
dkmichaels opened this issue May 23, 2017 · 3 comments

Comments

@dkmichaels

I'm using SGE and want to initialize Dask entirely from Python.

My main node is dual-homed, with 1G and 10G interfaces. The 10G is the one that my SGE cluster uses.

from dask_drmaa import DRMAACluster
from dask.distributed import Client

In [9]: cluster = DRMAACluster(hostname='master-10g')
INFO:dask_drmaa.core:Start local scheduler at master-10g

In [10]: cluster.scheduler_address
Out[10]: 'tcp://10.22.150.194:37386'  # this is the master-1g IP, not the one I want

Meanwhile, the workers are spinning trying to connect to the 1G IP:

tail worker.23523.1.err
distributed.worker - INFO - Trying to connect to scheduler: tcp://10.22.150.194:37386

Can this be extended to allow one to specify the scheduler interface / hostname / IP to give to the workers?

@mrocklin
Member

mrocklin commented May 24, 2017 via email

@dkmichaels
Author

dkmichaels commented Jul 18, 2017

Here's the workaround I hacked together -- suggestions for improvement welcome:

Replace these lines (note the first line has no effect in the current code):

def create_job_template(...):
        ...
        args = template['args']
        args = [self.scheduler_address] + template['args']
        ...

with:

        # replace the scheduler's 1G IP with its 10G IP
        args = [self.scheduler_address.replace('10.22.150.194', '10.22.250.1')]
        args = args + template['args']

Hardcoding IPs allows me to proceed with my testing, but this is really a hack.
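A slightly less brittle variant of the same workaround would swap only the host part of the address and take the replacement as a parameter, rather than string-replacing one hardcoded IP with another. This is a minimal sketch: `override_scheduler_host` is a hypothetical helper, not part of dask_drmaa, and it assumes the address always has the `scheme://host:port` shape shown above.

```python
from urllib.parse import urlsplit


def override_scheduler_host(address, new_host):
    """Return `address` with its host replaced by `new_host`.

    Hypothetical helper (not part of dask_drmaa). The scheme and the
    port are kept as they are; only the host is swapped.
    """
    parts = urlsplit(address)
    if parts.port is None:
        # No port in the address: keep just scheme and new host.
        return f"{parts.scheme}://{new_host}"
    return f"{parts.scheme}://{new_host}:{parts.port}"


# The 1G address reported by the scheduler, rewritten to the 10G IP.
print(override_scheduler_host('tcp://10.22.150.194:37386', '10.22.250.1'))
# prints tcp://10.22.250.1:37386
```

Inside create_job_template, the first element of args would then be the result of calling this helper on self.scheduler_address with the 10G address, which still has to come from somewhere (a constructor argument or a config value, for example).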

@jakirkham
Member

Sometimes using the nativeSpecification argument to DRMAA resolves issues like this. Would need to play around on your cluster and/or ask admins to know for sure.


3 participants