An attempt has been made to start a new process before the current process has finished its bootstrapping phase. #2557

muammar · 2019-03-08T05:15:10Z

I am new to dask and I would like to thank you all for this wonderful module. I am working on a project where I was able to parallelize a loop on a single machine, as you can see here. However, I would like to move over to dask.distributed. I applied the following changes to the class above:

diff --git a/mlchem/fingerprints/gaussian.py b/mlchem/fingerprints/gaussian.py
index ce6a72b..89f8638 100644
--- a/mlchem/fingerprints/gaussian.py
+++ b/mlchem/fingerprints/gaussian.py
@@ -6,7 +6,7 @@ from sklearn.externals import joblib
 from .cutoff import Cosine
 from collections import OrderedDict
 import dask
-import dask.multiprocessing
+from dask.distributed import Client
 import time


@@ -141,13 +141,14 @@ class Gaussian(object):
         for image in images.items():
             computations.append(self.fingerprints_per_image(image))

+        client = Client()
         if self.scaler is None:
-            feature_space = dask.compute(*computations, scheduler='processes',
+            feature_space = dask.compute(*computations, scheduler='distributed',
                                          num_workers=self.cores)
             feature_space = OrderedDict(feature_space)
         else:
             stacked_features = dask.compute(*computations,
-                                            scheduler='processes',
+                                            scheduler='distributed',
                                             num_workers=self.cores)

             stacked_features = numpy.array(stacked_features)

Doing so generates this error:

 File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

I have tried different ways of adding if __name__ == '__main__': without any success. This can be reproduced by running cu_training.py in https://github.com/muammar/mlchem/tree/master/examples. I would appreciate it if anyone could help me figure this out. I have no clue how I should change my code to make it work. Please feel free to close this report if it does not fit here.

Thanks.


mrocklin · 2019-03-08T06:11:12Z

Thanks for raising. I suspect that this is a duplicate of #2520

mrocklin closed this as completed Mar 8, 2019
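
For readers who land here with the same traceback, the idiom named in the error message can be sketched with the standard library alone. This is a minimal illustration, not the fix applied in mlchem or in #2520: `compute_all` and `main` are hypothetical names, and `math.sqrt` is only a stand-in for the per-image fingerprint computation. It assumes the cause is the one the message suggests, namely that worker processes re-import the main module and reach process-creating code at import time. With dask.distributed, the `Client()` call would presumably take the place of the executor creation, so it would have to sit inside a function reached only from the guarded block rather than run at import time.

```python
import math
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def compute_all(values, workers=2):
    """Apply math.sqrt to each value in freshly spawned worker processes."""
    # "spawn" starts every worker as a new interpreter that re-imports the
    # main module, so nothing at module level may start processes itself.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as executor:
        return list(executor.map(math.sqrt, values))


def main():
    # Hypothetical driver: all process creation happens inside this call.
    print(compute_all([1.0, 4.0, 9.0]))


if __name__ == "__main__":
    # Only the process launched directly runs main(); workers that
    # re-import this file skip the block and never start pools of their own.
    main()
```

Run as a script, this should print [1.0, 2.0, 3.0]. Moving the `main()` call out of the guarded block and up to module level is the kind of change that triggers the RuntimeError quoted above.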
