[BUG] LocalCUDACluster constructor fails with unexpected keyword "env" #274

Closed
sfleisch opened this issue Feb 9, 2019 · 2 comments

sfleisch commented Feb 9, 2019

Software:
Anaconda3 2018.12 (Python 3.7)
Rapids 0.5.0 - pip install of cudf-cuda100
dask-cuda - git commit e405194.
notebook: git commit dcbc0d9238e86a0a5383f6841b3f13223bd0bebb notebooks/dask/Dask_Hello_World.ipynb

import dask
from dask.delayed import delayed
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
cluster = LocalCUDACluster()

Exception:

The interesting part:

TypeError: __init__() got an unexpected keyword argument 'env'

/opt/anaconda3/lib/python3.7/site-packages/dask_cuda-0.0.0-py3.7.egg/dask_cuda/local_cuda_cluster.py in _start(self, ip, n_workers)
     77                 env={"CUDA_VISIBLE_DEVICES": cuda_visible_devices(i)},
     78             )
---> 79             for i in range(n_workers)
     80         ]
     81 

More of the code around the exception for context:


    @gen.coroutine
    def _start(self, ip=None, n_workers=0):
        """
        Start all cluster services.
        """
        if self.status == "running":
            return
        if (ip is None) and (not self.scheduler_port) and (not self.processes):
            # Use inproc transport for optimization
            scheduler_address = "inproc://"
        elif ip is not None and ip.startswith("tls://"):
            scheduler_address = "%s:%d" % (ip, self.scheduler_port)
        else:
            if ip is None:
                ip = "127.0.0.1"
            scheduler_address = (ip, self.scheduler_port)
        self.scheduler.start(scheduler_address)

        yield [
            self._start_worker(
                **self.worker_kwargs,
                env={"CUDA_VISIBLE_DEVICES": cuda_visible_devices(i)},
            )
            for i in range(n_workers)
        ]

The function signature for _start_worker doesn't have an env argument, though I'm not sure why it isn't absorbed by **kwargs. (Also, why call the private _start_worker when there is a public start_worker?)

%pinfo dask.distributed.LocalCluster._start_worker
Signature:
dask.distributed.LocalCluster._start_worker(
    ['self', 'death_timeout=60', '**kwargs'],
)
Docstring: <no docstring>
File:      /opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py
Type:      function
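
On the **kwargs question: the full traceback further down suggests that env is in fact absorbed by _start_worker's **kwargs, but is then forwarded to Nanny.__init__, which passes its own leftover **kwargs on to its parent constructor, and that is where the TypeError is raised. A minimal sketch of that forwarding chain, assuming hypothetical stand-in classes (BaseServer, NannyLike, and start_worker_like are illustrative names, not the real distributed code):

```python
# Hypothetical stand-ins: only the flow of keyword arguments is modeled here,
# not the real distributed classes or their full signatures.

class BaseServer:
    # The innermost constructor has no **kwargs, so unknown keywords fail here.
    def __init__(self, handlers, io_loop=None):
        self.handlers = handlers
        self.io_loop = io_loop


class NannyLike(BaseServer):
    # Accepts **kwargs, but only to pass them further down.
    def __init__(self, scheduler_address, death_timeout=None, **kwargs):
        self.scheduler_address = scheduler_address
        self.death_timeout = death_timeout
        super().__init__({}, io_loop=None, **kwargs)


def start_worker_like(death_timeout=60, **kwargs):
    # 'env' lands in kwargs here without complaint and is forwarded on.
    return NannyLike("inproc://", death_timeout=death_timeout, **kwargs)


def try_start(**kwargs):
    """Return the TypeError message, or None if construction succeeds."""
    try:
        start_worker_like(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_start(env={"CUDA_VISIBLE_DEVICES": "0"}))
    print(try_start())
```

Each layer with **kwargs accepts the argument, so the error only surfaces at the first constructor in the chain that neither names env nor takes **kwargs.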

The full exception:

tornado.application - ERROR - Multiple exceptions in yield list
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py", line 883, in callback
    result_list.append(f.result())
  File "/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py", line 207, in _start_worker
    silence_logs=self.silence_logs, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/distributed/nanny.py", line 99, in __init__
    **kwargs)
TypeError: __init__() got an unexpected keyword argument 'env'

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-aeeb11c1def4> in <module>
----> 1 cluster = LocalCUDACluster()

/opt/anaconda3/lib/python3.7/site-packages/dask_cuda-0.0.0-py3.7.egg/dask_cuda/local_cuda_cluster.py in __init__(self, n_workers, threads_per_worker, processes, memory_limit, **kwargs)
     51             threads_per_worker=threads_per_worker,
     52             memory_limit=memory_limit,
---> 53             **kwargs,
     54         )
     55 

/opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py in __init__(self, n_workers, threads_per_worker, processes, loop, start, ip, scheduler_port, silence_logs, diagnostics_port, services, worker_services, service_kwargs, asynchronous, security, **worker_kwargs)
    139             self.worker_kwargs['security'] = security
    140 
--> 141         self.start(ip=ip, n_workers=n_workers)
    142 
    143         clusters_to_close.add(self)

/opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py in start(self, **kwargs)
    169             self._started = self._start(**kwargs)
    170         else:
--> 171             self.sync(self._start, **kwargs)
    172 
    173     @gen.coroutine

/opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py in sync(self, func, *args, **kwargs)
    162             return future
    163         else:
--> 164             return sync(self.loop, func, *args, **kwargs)
    165 
    166     def start(self, **kwargs):

/opt/anaconda3/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    275             e.wait(10)
    276     if error[0]:
--> 277         six.reraise(*error[0])
    278     else:
    279         return result[0]

/opt/anaconda3/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

/opt/anaconda3/lib/python3.7/site-packages/distributed/utils.py in f()
    260             if timeout is not None:
    261                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262             result[0] = yield future
    263         except Exception as exc:
    264             error[0] = sys.exc_info()

/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py in run(self)
   1131 
   1132                     try:
-> 1133                         value = future.result()
   1134                     except Exception:
   1135                         self.had_exception = True

/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py in run(self)
   1139                     if exc_info is not None:
   1140                         try:
-> 1141                             yielded = self.gen.throw(*exc_info)
   1142                         finally:
   1143                             # Break up a reference to itself

/opt/anaconda3/lib/python3.7/site-packages/dask_cuda-0.0.0-py3.7.egg/dask_cuda/local_cuda_cluster.py in _start(self, ip, n_workers)
     77                 env={"CUDA_VISIBLE_DEVICES": cuda_visible_devices(i)},
     78             )
---> 79             for i in range(n_workers)
     80         ]
     81 

/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py in run(self)
   1131 
   1132                     try:
-> 1133                         value = future.result()
   1134                     except Exception:
   1135                         self.had_exception = True

/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py in callback(f)
    881             for f in children:
    882                 try:
--> 883                     result_list.append(f.result())
    884                 except Exception as e:
    885                     if future.done():

/opt/anaconda3/lib/python3.7/site-packages/tornado/gen.py in wrapper(*args, **kwargs)
    324                 try:
    325                     orig_stack_contexts = stack_context._state.contexts
--> 326                     yielded = next(result)
    327                     if stack_context._state.contexts is not orig_stack_contexts:
    328                         yielded = _create_future()

/opt/anaconda3/lib/python3.7/site-packages/distributed/deploy/local.py in _start_worker(self, death_timeout, **kwargs)
    205         w = W(self.scheduler.address, loop=self.loop,
    206               death_timeout=death_timeout,
--> 207               silence_logs=self.silence_logs, **kwargs)
    208         yield w._start()
    209 

/opt/anaconda3/lib/python3.7/site-packages/distributed/nanny.py in __init__(self, scheduler_ip, scheduler_port, scheduler_file, worker_port, ncores, loop, local_dir, services, name, memory_limit, reconnect, validate, quiet, resources, silence_logs, death_timeout, preload, preload_argv, security, contact_address, listen_address, worker_class, **kwargs)
     97         super(Nanny, self).__init__(handlers, io_loop=self.loop,
     98                                     connection_args=self.connection_args,
---> 99                                     **kwargs)
    100 
    101         if self.memory_limit:

TypeError: __init__() got an unexpected keyword argument 'env'

sfleisch commented Feb 9, 2019

Resolved. I needed to upgrade dask distributed.
This one can be closed.
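
For anyone hitting the same error, one way to check whether an installed constructor names a given keyword explicitly (rather than only swallowing it via **kwargs) is to inspect its signature. This is a rough sketch: has_explicit_parameter, old_init, and new_init are hypothetical names for illustration, and the assumption that the upgraded distributed names env explicitly on Nanny is not confirmed in this thread.

```python
import inspect


def has_explicit_parameter(func, name):
    """True if `func` declares `name` as a named parameter.

    A catch-all **kwargs does not count, since (as in this issue) it may
    simply forward the keyword to a constructor that rejects it.
    """
    param = inspect.signature(func).parameters.get(name)
    if param is None:
        return False
    return param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


# Hypothetical signatures standing in for an older and a newer constructor.
def old_init(self, scheduler_ip=None, **kwargs):
    pass


def new_init(self, scheduler_ip=None, env=None, **kwargs):
    pass


if __name__ == "__main__":
    print(has_explicit_parameter(old_init, "env"))
    print(has_explicit_parameter(new_init, "env"))
```

Against a real install, the same helper could be pointed at distributed.Nanny.__init__, alongside printing distributed.__version__ to see which release is actually being imported.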

dillon-cullinan transferred this issue from rapidsai/notebooks on Mar 31, 2020

pentschev (Member) commented

This seems to have been resolved quite some time ago, closing.
