RuntimeError at training due to multiprocessing #29

Closed
anaston opened this issue Nov 19, 2021 · 4 comments

anaston commented Nov 19, 2021

Hi,

Congrats on this great tool. I wanted to give it a go but I get the error below when I train a reservoir (see reservoir.train) in Introduction_to_RC.ipynb.

I ran the code on macOS Big Sur (Version 11.6.1) and tried both a Jupyter notebook and copy-pasting the relevant code into a script run from Visual Studio Code. In the latter case, I even wrapped the script in if __name__ == '__main__': and called freeze_support(), as suggested in forums, but with no success.

Could you please help me resolve this issue?

Thanks a lot and Best,
Agoston

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Users/amihalik/Documents/projects/reservoir/vsc-test/.venv/lib/python3.9/site-packages/reservoirpy/__init__.py", line 6, in <module>
    from .utils.save import load
  File "/Users/amihalik/Documents/projects/reservoir/vsc-test/.venv/lib/python3.9/site-packages/reservoirpy/utils/save.py", line 11, in <module>
    from .. import regression_models
  File "/Users/amihalik/Documents/projects/reservoir/vsc-test/.venv/lib/python3.9/site-packages/reservoirpy/regression_models.py", line 22, in <module>
    from .utils.parallel import lock as global_lock
  File "/Users/amihalik/Documents/projects/reservoir/vsc-test/.venv/lib/python3.9/site-packages/reservoirpy/utils/parallel.py", line 17, in <module>
    manager = Manager()
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/managers.py", line 554, in start
    self._process.start()
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/usr/local/Cellar/python@3.9/3.9.8/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

@nTrouvain
Collaborator

Hello,

Could you try importing reservoirpy from within the 'if __name__ == "__main__":' statement?

PS: Thank you for using reservoirpy, we are glad you like it :)
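
For readers unfamiliar with the idiom mentioned above, here is a minimal sketch using only the standard library. It assumes the "spawn" start method (the default on macOS since Python 3.8); the function name main and the use of print as the child's target are illustrative choices, not part of reservoirpy.

```python
import multiprocessing as mp


def main():
    # Explicitly request "spawn": the child is a fresh interpreter that
    # re-imports this file before it runs its target.
    ctx = mp.get_context("spawn")
    # `print` is a builtin, so the child can locate it by name.
    child = ctx.Process(target=print, args=("hello from the child process",))
    child.start()
    child.join()
    return child.exitcode


if __name__ == "__main__":
    # Without this guard, the re-import in the child would reach the
    # process-starting code again and trigger the bootstrapping RuntimeError.
    print("child exit code:", main())
```

Everything that starts processes, directly or through an import, sits behind the guard, so the re-import done by the child is harmless.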

nTrouvain added the bug (Something isn't working) label Nov 22, 2021
nTrouvain self-assigned this Nov 22, 2021

anaston commented Nov 22, 2021

Thank you for the quick reply:)

Yes, I tried putting the import inside an 'if __name__ == "__main__":' statement, but I still get this RuntimeError. What is very odd, however, is that the code does not actually stop at the error: I added some extra code after the point where the error is printed in the terminal, and that code runs. Do you have any idea what's happening? Sorry, I am new to Python, so I might be missing something basic.

Please see the script I use below:

def split_timeserie_for_task1(forecast, train_length=20000):
    # X is the module-level time series created in the main block below.
    X_train, y_train = X[:train_length], X[forecast: train_length+forecast]
    X_test, y_test = X[train_length: -forecast], X[train_length+forecast:]

    return (X_train, y_train), (X_test, y_test)

def reset_esn():
    # Rebuild the ESN from the module-level hyperparameters set in the main block.
    Win = mat_gen.generate_input_weights(units, 1, input_scaling=input_scaling,
                                         proba=input_connectivity, input_bias=True,
                                         seed=seed)

    W = mat_gen.generate_internal_weights(units, sr=spectral_radius,
                                          proba=density, seed=seed)

    reservoir = ESN(leak_rate, W, Win, ridge=regularization)

    return reservoir

def r2_score(y_true, y_pred):
    return 1 - (np.sum((y_true - y_pred)**2) / np.sum((y_true - y_true.mean())**2))

def nrmse(y_true, y_pred):
    # Square each residual before summing (the original summed first, then squared).
    return np.sqrt(np.sum((y_true - y_pred)**2) / len(y_true)) / (y_true.max() - y_true.min())

if __name__ == '__main__':
    import numpy as np
    import matplotlib.pyplot as plt
    from reservoirpy import mat_gen, ESN
    from reservoirpy.datasets import mackey_glass

    # Generate data

    timesteps = 25000
    tau = 17
    X = mackey_glass(timesteps, tau=tau)

    # rescale between -1 and 1
    X = 2 * (X - X.min()) / (X.max() - X.min()) - 1

    forecast = 10
    (X_train, y_train), (X_test, y_test) = split_timeserie_for_task1(forecast)

    sample = 500
    figsizex = 15
    figsizey = 3

    # %% ESN training and prediction

    units = 100
    leak_rate = 0.3
    spectral_radius = 1.25
    input_scaling = 1.0
    density = 0.1
    input_connectivity = 0.2
    regularization = 1e-8
    seed = 1234

    Win = mat_gen.generate_input_weights(units, 1, input_scaling=input_scaling,
                                         proba=input_connectivity, input_bias=True,
                                         seed=seed)

    W = mat_gen.generate_internal_weights(units, sr=spectral_radius,
                                          proba=density, seed=seed)

    reservoir = ESN(leak_rate, W, Win, ridge=regularization)

    states = reservoir.train([X_train.reshape(-1, 1)], [y_train.reshape(-1, 1)],
                             return_states=True, verbose=True) # workers=1

    y_pred, states1 = reservoir.run([X_test.reshape(-1, 1)], return_states=True,
                                    verbose=True) # init_state=states[0][-1] workers=1

    y_pred = y_pred[0].reshape(-1, 1)
    states1 = states1[0]

    fig = plt.figure(figsize=(figsizex, figsizey)) # figsize=(15, 7)
    plt.plot(np.arange(sample), y_pred[:sample], lw=3, label="ESN prediction")
    plt.plot(np.arange(sample), y_test[:sample], linestyle="--", lw=2, label="True value")
    plt.plot(np.abs(y_test[:sample] - y_pred[:sample]), label="Absolute deviation")

    plt.legend()
    plt.show()

nTrouvain added this to To do in v0.3 Nov 26, 2021
nTrouvain moved this from To do to Issues in v0.3 Nov 26, 2021
nTrouvain moved this from Issues to In progress in v0.3 Nov 26, 2021
nTrouvain moved this from In progress to Issues in v0.3 Nov 26, 2021
nTrouvain moved this from Issues to In progress in v0.3 Nov 29, 2021
nTrouvain added a commit that referenced this issue Nov 29, 2021
@nTrouvain
Collaborator

Hello,
We have partially fixed the issue, but I am afraid it is caused by the way macOS instantiates processes.
Using the if __name__ == "__main__": statement is mandatory in your case when launching your code (it is good practice in Python in any case). Thank you for drawing this to our attention; we will document this error as best we can. In the meantime, the next releases, v0.2.4-post1 and v0.3.0b1, will stop raising this error systematically, including when parallelization is deactivated. You can deactivate it by passing "sequential" as the backend in the training functions, or by setting it globally with the set_joblib_backend("sequential") function.

nTrouvain moved this from In progress to Done in v0.3 Nov 29, 2021
nTrouvain added a commit that referenced this issue Nov 29, 2021
nTrouvain moved this from To do to Done in v0.2.4 - Maintainance Nov 29, 2021

anaston commented Nov 30, 2021

Thanks a lot for working on a fix and for the workaround. Feel free to let me know if I can help with any testing.

nTrouvain moved this from Done to Done (previous release) in v0.3 Feb 3, 2022