
MNT: Set local random seeds in all SimPEG tests #1289

Open
santisoler opened this issue Sep 27, 2023 · 0 comments
Labels
maintenance Maintaining code base without actual functionality changes

Comments

@santisoler
Member

Proposed new feature or change:

The issue

Several SimPEG tests need to create some form of synthetic data through a pseudo-random generator, usually via the numpy.random module. A best practice for reproducible tests is to set a random seed, ensuring that every run of the test suite is done against the same random values. These seeds are currently being set with the numpy.random.seed() function.

In several tests, these seeds are defined globally, outside the test functions and methods. For example:

import shutil
import unittest

import numpy as np

from SimPEG.potential_fields import gravity

np.random.seed(43)  # global seed, set at import time


class GravInvLinProblemTest(unittest.TestCase):
    def setUp(self):
        # Create a self.mesh
        dx = 5.0

Pytest executes these module-level calls while collecting the tests, effectively setting a single global seed for the whole run. This means the random state seen by each test depends on the order in which the tests run. If, for example, someone introduces additional tests "in the middle", the random state of the following tests changes, with the chance of them failing.

Minimal working example

For example, let's say we have two identical test files: test_first.py and test_second.py

import numpy as np

np.random.seed(5)

def test_value():
    random_value = np.random.randint(low=0, high=10)
    assert random_value == 3

When we run pytest on both files, the second one fails:

pytest test_first.py test_second.py
===================================== test session starts =====================================
platform linux -- Python 3.10.8, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/santi/tmp/testing-random
plugins: anyio-3.6.2
collected 2 items

test_first.py .                                                                         [ 50%]
test_second.py F                                                                        [100%]

========================================== FAILURES ===========================================
_________________________________________ test_value __________________________________________

    def test_value():
        random_value = np.random.randint(low=0, high=10)
>       assert random_value == 3
E       assert 6 == 3

test_second.py:8: AssertionError
=================================== short test summary info ===================================
FAILED test_second.py::test_value - assert 6 == 3
================================= 1 failed, 1 passed in 0.21s =================================

Solution

A way to solve this is to define local random seeds within each test and ditch the globally defined seeds.
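As a minimal sketch, the test from the example above could seed locally instead of at module level (same seed and expected value as shown earlier):

```python
import numpy as np


def test_value():
    # Seeding inside the test makes its random state independent
    # of collection order and of any other test module.
    np.random.seed(5)
    random_value = np.random.randint(low=0, high=10)
    assert random_value == 3
```

With this change, both test_first.py and test_second.py pass regardless of the order pytest runs them in, because each test resets its own random state.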

Moreover, it would be nice to move from the np.random.seed() and np.random.___() functions to NumPy's new random number generator objects.

For example, to create an array of 100 random elements drawn from a Gaussian distribution, we can do the following:

import numpy as np

random_num_generator = np.random.default_rng(seed=42)
gaussian = random_num_generator.normal(loc=0.0, scale=10.0, size=100)

These random number generator objects are future-proof (NumPy's RandomState is considered legacy) and they make the code clearer: it's easy to see which random state is being used when generating random numbers.
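One way to make this pattern reusable across a test module is a small pytest fixture that hands each test its own seeded generator. This is a hypothetical helper for illustration, not an existing SimPEG utility; the fixture name `rng` and the seed are arbitrary choices:

```python
import numpy as np
import pytest


@pytest.fixture
def rng():
    # Each test receives a fresh, independently seeded generator,
    # so collection order cannot affect its random state.
    return np.random.default_rng(seed=42)


def test_gaussian_shape(rng):
    gaussian = rng.normal(loc=0.0, scale=10.0, size=100)
    assert gaussian.shape == (100,)
```

Because the fixture builds a new Generator per test, two tests drawing the same sequence get identical values without sharing any global state.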

Related Issues and PRs

I started working on this in #1286, feel free to use that as inspiration. It would be nice to continue the work with the other test functions.

@lheagy lheagy added the maintenance Maintaining code base without actual functionality changes label Nov 29, 2023
@santisoler santisoler changed the title ENH: Set local random seeds in all SimPEG tests MNT: Set local random seeds in all SimPEG tests May 15, 2024
santisoler added a commit that referenced this issue Jun 6, 2024
Make use of the `random_seed` argument of the `make_synthetic_data`
method in magnetic tests. Remove the lines that set a global
`np.random.seed` in those tests.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 7, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the FDEM tests. Increase number of iterations for checking
derivatives where needed.

Part of the solution to #1289

---------

Co-authored-by: Lindsey Heagy <lindseyheagy@gmail.com>
santisoler added a commit that referenced this issue Jun 7, 2024
Add a new `random_seed` argument to the `test()` method of objective
functions to control their random state. Use Numpy's random number
generator for managing the random state and the generation of random
numbers. Minor improvements to the implementation of the test methods.
Update tests that make use of these methods, and make them use a seed
in every case.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 10, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the TDEM tests.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 17, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the static EM tests.

Part of the solution to #1289