can't import a local module from dask workers #187
If you have a fixed set of workers, then use `Client.upload_file`, as in the sketch below.
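For example, a minimal sketch reusing the `cluster` and `foo.py` from the report quoted below:

```python
from dask.distributed import Client

client = Client(cluster)  # `cluster` as created in the report below

# Ship foo.py to every worker currently connected to the scheduler.
# Note: workers that join the cluster later will not receive the file.
client.upload_file('foo.py')

import foo
f = foo.Foo()
future = client.scatter(f)  # workers can now unpickle foo.Foo
```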
If you are willing to push this module to GitHub, then consider having your workers pip-install it at startup by setting the `EXTRA_PIP_PACKAGES` environment variable in your worker-template (sketch below).
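For example, a sketch of a worker-template; the image, args, and package URL here are illustrative assumptions, not taken from this issue:

```yaml
# worker-template.yaml (sketch)
kind: Pod
spec:
  restartPolicy: Never
  containers:
  - name: dask-worker
    image: daskdev/dask:latest
    args: [dask-worker, --nthreads, '1']
    env:
    # The daskdev/dask images pip-install anything listed here at startup.
    - name: EXTRA_PIP_PACKAGES
      value: git+https://github.com/username/mymodule.git
```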
For changing the behavior of how functions defined in other files serialize, you'll have to take that upstream with cloudpickle/pickle. (The underlying behavior: cloudpickle serializes objects defined in `__main__`, such as notebook cells, by value, but serializes imported objects by reference, so the worker must be able to import the module itself; see the sketch below.)
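To illustrate the serialization difference, a standalone sketch (not from the original thread):

```python
import cloudpickle

# A class defined in __main__ (e.g. a notebook cell) is serialized by value:
# the class definition itself travels with the pickled bytes, so any worker
# can unpickle it without extra setup.
class Foo(object):
    pass

payload = cloudpickle.dumps(Foo())

# A class imported from a local module is serialized by reference ("foo.Foo"),
# so unpickling requires `import foo` to succeed on the worker. If foo.py is
# not on the worker's path, deserialization raises ModuleNotFoundError.
import foo  # the local module from the report below
payload = cloudpickle.dumps(foo.Foo())
```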
On Wed, Mar 28, 2018 at 10:06 PM, Ryan Abernathey wrote:
In my research, I commonly write a module where I dump longer functions /
classes and import it from my notebooks. This reduces clutter in the
notebook. Perhaps these functions are used in multiple notebooks, but I
don't consider them reusable / general / stable enough to actually package,
distribute, etc. I imagine many people work this way.
As a concrete example, I have a file in my examples directory on
pangeo.pydata.org called 'foo.py':
```python
class Foo(object):
    def __init__(self):
        pass
```
I import this from a notebook and create an instance
```python
import foo
f = foo.Foo()
```
Now I create a cluster and try to scatter this object
```python
from dask.distributed import Client
from dask_kubernetes import KubeCluster

cluster = KubeCluster(n_workers=1)
client = Client(cluster)
client.scatter(f)
```
I get a long error, the gist of which is `distributed.core - ERROR - No module named 'foo'`.
What is confusing to me is that this example would work perfectly fine if
I just defined Foo within a cell in the notebook.
We should think about how to support this sort of thing, because I feel
like lots of people work this way.
Obviously related to other customizable environment issues such as #136, #133, #125, #67, etc.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.