Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue 19 : A note on reproducibility #29

Merged
merged 4 commits into from
May 11, 2021

Conversation

geoffreyp
Copy link
Contributor

add the function set_seed()

@sjvrijn
Copy link
Contributor

sjvrijn commented Mar 27, 2021

Looks good to me, do you agree @chkoar?

@sjvrijn
Copy link
Contributor

sjvrijn commented Mar 29, 2021

@chkoar
Copy link
Contributor

chkoar commented Mar 29, 2021

Looks good to me, do you agree @chkoar?

Yes. But I have the impression that this approach is not safe in threaded environments.
A small naive example:

import random
import numpy as np
import time
from moead_framework.problem.numerical import Zdt1
from concurrent.futures import ThreadPoolExecutor

random.seed(0)
problem = Zdt1(2)
expected = [
    [0.8444218515250481, 0.7579544029403025],
    [0.420571580830845, 0.25891675029296335],
]

solutions = []
solutions.append(problem.generate_random_solution().solution)
solutions.append(problem.generate_random_solution().solution)

assert np.array_equal(solutions, expected)

def worker(*args):
    time.sleep(np.random.randint(1,3))
    return problem.generate_random_solution().solution


for i in range(10):
    random.seed(0)
    with ThreadPoolExecutor() as executor:
        solutions = executor.map(worker, range(2))
        solutions = [x for x in solutions]

    assert np.array_equal(solutions, expected)

@geoffreyp geoffreyp linked an issue Mar 31, 2021 that may be closed by this pull request
@geoffreyp
Copy link
Contributor Author

geoffreyp commented Mar 31, 2021

You may want to include the use of your new set_seed() function in the tests:

It is available in 2c322fa

Yes. But I have the impression that this approach is not safe in threaded environments.

From the python documentation :

By re-using a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running.

The solution available is to instantiate a new instance of Random for each Thread and use this instance in problems and algorithms but it requires a lot of refactoring in the framework to replace each call of "import random" (and it is only for the Random module, the problem is maybe the same for np.random)

local_random = random.Random(n)
for i in xrange(100):
    a.append(local_random.randint(0, 1000))

I propose to add a note to prevent that this approach is not safe in threaded environments in python.

@sjvrijn
Copy link
Contributor

sjvrijn commented Mar 31, 2021

A note in the docstring set_seed and wherever else relevant in the rest of the documentation would sufficiently address this issue for me.

@geoffreyp
Copy link
Contributor Author

A note in the docstring set_seed and wherever else relevant in the rest of the documentation would sufficiently address this issue for me.

Available in 2b313c1

Copy link
Contributor

@sjvrijn sjvrijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be sufficient for me

@chkoar
Copy link
Contributor

chkoar commented May 11, 2021

Well, having this I believe is sufficient.

@geoffreyp geoffreyp merged commit 3722a8b into moead-framework:master May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[joss-reviews#2974] A note on reproducibility
3 participants