
MNT: Set local random seeds in all SimPEG tests #1289

Open
santisoler opened this issue Sep 27, 2023 · 0 comments
Labels
maintenance Maintaining code base without actual functionality changes

Comments

@santisoler
Member

Proposed new feature or change:

The issue

Several SimPEG tests need to create some form of synthetic data through a pseudo-random generator, usually via the numpy.random module. A best practice for reproducible tests is to set a random seed, ensuring that every run of the test suite is done against the same random values. These seeds are currently being set with the numpy.random.seed() function.

In several tests, these seeds are defined globally, outside the test functions and methods. For example:

import shutil
import unittest

import numpy as np

from SimPEG.potential_fields import gravity

np.random.seed(43)  # global seed, set at import time


class GravInvLinProblemTest(unittest.TestCase):
    def setUp(self):
        # Create a self.mesh
        dx = 5.0

Pytest executes these module-level calls while collecting the tests, effectively setting a single global seed for the whole run. This means the random state seen by each test depends on the order in which the tests run. If, for example, someone introduces additional tests "in the middle", the random state of the following tests changes, with the chance of them failing.

Minimal working example

For example, let's say we have two identical test files: test_first.py and test_second.py

import numpy as np

np.random.seed(5)

def test_value():
    random_value = np.random.randint(low=0, high=10)
    assert random_value == 3

When we run pytest on both files, the second one fails:

pytest test_first.py test_second.py
===================================== test session starts =====================================
platform linux -- Python 3.10.8, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/santi/tmp/testing-random
plugins: anyio-3.6.2
collected 2 items

test_first.py .                                                                         [ 50%]
test_second.py F                                                                        [100%]

========================================== FAILURES ===========================================
_________________________________________ test_value __________________________________________

    def test_value():
        random_value = np.random.randint(low=0, high=10)
>       assert random_value == 3
E       assert 6 == 3

test_second.py:8: AssertionError
=================================== short test summary info ===================================
FAILED test_second.py::test_value - assert 6 == 3
================================= 1 failed, 1 passed in 0.21s =================================

Solution

A way to solve this is to define local random seeds within each test and ditch the globally defined seeds.
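As a minimal sketch, the test from the example above could seed locally instead of at module level (same seed and expected value as shown earlier):

```python
import numpy as np


def test_value():
    # Seeding inside the test makes its random state independent
    # of collection order and of any other test module.
    np.random.seed(5)
    random_value = np.random.randint(low=0, high=10)
    assert random_value == 3
```

With this change, both test_first.py and test_second.py pass regardless of the order pytest runs them in, because each test resets its own random state.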

Moreover, it would be nice to move from the np.random.seed() and np.random.___() functions to NumPy's new random number generator objects.

For example, to create an array of 100 random elements drawn from a Gaussian distribution, we can do the following:

import numpy as np

random_num_generator = np.random.default_rng(seed=42)
gaussian = random_num_generator.normal(loc=0.0, scale=10.0, size=100)

These random number generator objects are future-proof (NumPy's RandomState is considered legacy) and they make the code clearer: it's easy to see which random state is being used when generating random numbers.
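One way to make this pattern reusable across a test module is a small pytest fixture that hands each test its own seeded generator. This is a hypothetical helper for illustration, not an existing SimPEG utility; the fixture name `rng` and the seed are arbitrary choices:

```python
import numpy as np
import pytest


@pytest.fixture
def rng():
    # Each test receives a fresh, independently seeded generator,
    # so collection order cannot affect its random state.
    return np.random.default_rng(seed=42)


def test_gaussian_shape(rng):
    gaussian = rng.normal(loc=0.0, scale=10.0, size=100)
    assert gaussian.shape == (100,)
```

Because the fixture builds a new Generator per test, two tests drawing the same sequence get identical values without sharing any global state.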

Related Issues and PRs

I started working on this in #1286, feel free to use that as inspiration. It would be nice to continue the work with the other test functions.

@lheagy lheagy added the maintenance Maintaining code base without actual functionality changes label Nov 29, 2023
@santisoler santisoler changed the title ENH: Set local random seeds in all SimPEG tests MNT: Set local random seeds in all SimPEG tests May 15, 2024
santisoler added a commit that referenced this issue Jun 6, 2024
Make use of the `random_seed` argument of the `make_synthetic_data`
method in magnetic tests. Remove the lines that set a global
`np.random.seed` in those tests.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 7, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the FDEM tests. Increase number of iterations for checking
derivatives where needed.

Part of the solution to #1289

---------

Co-authored-by: Lindsey Heagy <lindseyheagy@gmail.com>
santisoler added a commit that referenced this issue Jun 7, 2024
Add a new `random_seed` argument to the `test()` method of objective
functions to control their random state. Use Numpy's random number
generator for managing the random state and the generation of random
numbers. Minor improvements to the implementation of the test methods.
Update tests that make use of these methods, and make them use a seed
in every case.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 10, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the TDEM tests.

Part of the solution to #1289
santisoler added a commit that referenced this issue Jun 17, 2024
Replace the usage of the deprecated functions in the `numpy.random` module
with NumPy's random number generator class and its methods, in most
of the static EM tests.

Part of the solution to #1289