
Mac/Linux with multiprocessing, all workers are seeded the same random state #14729

Closed
FlorinAndrei opened this issue Oct 16, 2019 · 5 comments

@FlorinAndrei

Reproducing code example:

The full code is here; I will leave this branch untouched so you can see the behavior I'm describing:

https://github.com/FlorinAndrei/nsphere/tree/numpy-mp

On Mac or Linux, edit xpu_workers.py and comment out the rseed lines to trigger the bug.

You can tell the bug has been triggered because very few dots appear in the Monte Carlo simulation graph in the Jupyter notebook. There are supposed to be 100 dots there, but because of the bug far fewer appear, and the whole population is far less random, which affects the application as a whole.


What's really going on:

I create a pool of workers with:

import multiprocessing
from multiprocessing import Pool

p = Pool(processes=num_p)
arglist = [(points, d, num_p, sysmem, gpumem, pointloops)] * num_p
work_out = p.map(make_dots, arglist)

And within the worker I have something like this:

pts = np.random.random_sample((points, d)) - 0.5   # (points, d) array, uniform in [-0.5, 0.5) per coordinate

Parts of the pts array are returned as samples from all workers to the master process, and are collated in the work_out matrix. Each worker is supposed to make random samples - and of course the expectation is that each sample is different. https://dilbert.com/strip/2001-10-25

On Windows this works great.

On Mac and Linux, all pts arrays are generated with the exact same "random" content. The samples from workers are all identical. Within each sample the content looks random enough (just an eyeball estimate) but all samples coincide perfectly with each other.
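
Here is a minimal standalone sketch (not my project's code, just illustrative) that reproduces the symptom. On Linux, and on macOS under Python 3.7, multiprocessing defaults to the fork start method, so every worker inherits a copy of the parent's global NumPy RNG state, which was seeded once at import:

import numpy as np
from multiprocessing import Pool

def sample(_):
    # same kind of call as in make_dots, reduced to 3 values for readability
    return np.random.random_sample(3)

if __name__ == "__main__":
    with Pool(processes=4) as p:
        results = p.map(sample, range(4))
    for r in results:
        print(r)   # under fork, all four rows come out identical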

It's a very frustrating bug: the cause is hard to pin down, and it makes the code misbehave in weird ways.

I have to do this in each worker to get rid of the bug:

import random

rseed = random.randint(0, 2**32 - 1)   # randint is inclusive, and NumPy's legacy seed must be < 2**32
xp.random.seed(rseed)                  # xp is the array-module alias (NumPy or CuPy) used in the workers
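
A variation on that workaround (illustrative sketch, not my project's code): reseed once per worker process through a Pool initializer, so forked workers diverge without reseeding inside every task:

import os
import numpy as np
from multiprocessing import Pool

def _reseed_worker():
    # runs once in each worker; draw a fresh 32-bit seed from OS entropy
    np.random.seed(int.from_bytes(os.urandom(4), "little"))

def make_dots_stub(n):   # stand-in for the real worker function
    return np.random.random_sample(n)

if __name__ == "__main__":
    with Pool(processes=4, initializer=_reseed_worker) as p:
        print(p.map(make_dots_stub, [3] * 4))   # rows now differ across workers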

Numpy/Python version information:

1.16.4 3.7.4 (default, Jul  9 2019, 18:13:23) 
[Clang 10.0.1 (clang-1001.0.46.4)]
@mattip
Member

mattip commented Oct 16, 2019

Without diving too deeply into your code, I wonder if you have seen the new (as of 1.17) random.BitGenerator API? In particular, you might be interested in the work done to ensure parallel processes get "independent" streams. Please let us know if we could improve the documentation to make it clearer, and whether it helps solve your problem.
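
A minimal sketch of that approach (illustrative only; requires NumPy >= 1.17, and make_dots_stub is a stand-in for your worker): spawn independent child seeds with SeedSequence and give each worker its own Generator:

from numpy.random import SeedSequence, default_rng
from multiprocessing import Pool

def make_dots_stub(seed_seq):
    rng = default_rng(seed_seq)        # each worker gets its own independent stream
    return rng.random((3, 2)) - 0.5

if __name__ == "__main__":
    num_p = 4
    child_seeds = SeedSequence(12345).spawn(num_p)   # statistically independent children
    with Pool(processes=num_p) as p:
        work_out = p.map(make_dots_stub, child_seeds)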

@FlorinAndrei
Author

Thanks for the information; I was not aware of the new API. I'll give it a try and get back to you soon.

@FlorinAndrei
Author

FlorinAndrei commented Oct 17, 2019

Something I noticed right away: if I use the spawn start method for the multiprocessing pool, the random sequence issue disappears; the random number generator produces a different sequence in each worker, as it should.

multiprocessing.set_start_method('spawn')

spawn is the default on Windows, which is probably why the RNG works fine there. On Unix-like OSes, the default method is fork. If I force spawn regardless of the OS, the RNG behaves correctly.

https://docs.python.org/3.7/library/multiprocessing.html
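
For reference, a minimal sketch of forcing spawn (illustrative; make_dots_stub is a stand-in for my real worker). set_start_method has to be called once, under the __main__ guard, before the pool is created:

import multiprocessing
import numpy as np

def make_dots_stub(n):
    # spawned workers re-import NumPy, so each gets its own freshly seeded global RNG state
    return np.random.random_sample(n)

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")
    with multiprocessing.Pool(processes=4) as p:
        print(p.map(make_dots_stub, [3] * 4))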

Now let me take a look at your new API.

@FlorinAndrei
Author

FlorinAndrei commented Oct 18, 2019

@mattip I think I'll stop here for now. set_start_method('spawn') works fine for me and ensures that each worker gets a different RNG sequence.

I will investigate the new RNG API later, if I decide to make changes to my project. For now, though, I'm done - the master branch now does everything I need.

Thank you.

@mattip
Member

mattip commented Nov 4, 2019

Closing. Thanks for the update. Hopefully you will try the new API.

@mattip mattip closed this as completed Nov 4, 2019