
Fix random state for test_noisy_artificial_function_loss #977

Merged
merged 11 commits into facebookresearch:master from pacowong:fix_random_state Jan 26, 2021

Conversation

@pacowong (Contributor) commented Dec 24, 2020

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and Context / Related issue

Related issue: #966
The goal is to resolve the bugs in random_state management for test_noisy_artificial_function_loss, which cause reproducibility issues.
If you want to patch the current system immediately, I can add np.random.seed(seed) back to the test case.

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document and completed the CLA (see CLA).
  • All tests passed, and additional code has been covered with new tests.

@facebook-github-bot added the CLA Signed label Dec 24, 2020
@teytaud (Contributor) commented Dec 25, 2020

Thanks! For the failure in the "static" CI, just running "black nevergrad" should solve it; it is a formatting issue.

@pacowong (Contributor, Author) commented Dec 26, 2020

Merry Christmas! We need to decide whether ArtificialVariable is stochastic and, if so, which pseudorandom generator it should use, because it relies on utils.Transform, which is stochastic, while the test cases are pinned to fixed values. In other words, any change in how the pseudorandom generator is used will affect the results.

Until the decision is made, the following statement will remain fixed:

if noise_level > 0:
    parametrization.descriptors.deterministic_function = False

@teytaud (Contributor) commented Dec 26, 2020

Actually, there is deterministic_function, which means that the function is deterministic, and deterministic, which means that both the function and the transformation are deterministic.
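
For illustration, a minimal sketch of how these two flags would be set on a parametrization (only descriptors.deterministic_function appears verbatim in this thread; the rest of the API usage is an assumption about the nevergrad version at the time):

import nevergrad as ng

param = ng.p.Array(shape=(2,))
param.descriptors.deterministic_function = False  # the objective returns noisy values
# Clearing `deterministic` instead would declare both the function and the
# transformation stochastic (assumption based on the explanation above).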

@teytaud (Contributor) commented Dec 26, 2020

Can you show which bug you are fixing? This is still unclear to me.

@teytaud (Contributor) commented Dec 26, 2020

Merry Christmas to all people who have a Christmas or whichever day off you prefer during this time of the year :-)

@pacowong (Contributor, Author) commented Dec 29, 2020

Actually, there is deterministic_function, which means that the function is deterministic, and deterministic, which means that both the function and the transformation are deterministic.

This involves several function calls.
First of all, ArtificialFunction implements the evaluation_function method, which uses the _transform method:

def _transform(self, x: tp.ArrayLike) -> np.ndarray:
    data = self.transform_var.process(x)  # <<< self.transform_var is an ArtificialVariable object
    return np.array(data)

I found that the process method of the ArtificialVariable class is stochastic:

def process(  # pylint: disable=unused-argument
    self, data: tp.ArrayLike, deterministic: bool = True
) -> np.ndarray:
    if not self._transforms:  # This is true, as self._transforms is []
        self._initialize()  # <<< Trace this method
        ....

From the _initialize() method, we can see that it creates a utils.Transform object:

def _initialize(self) -> None:
    for transform_inds in tools.grouper(indices, n=self.block_dimension):
        self._transforms.append(
            utils.Transform(  # <<< This line calls __init__ in utils.Transform
                transform_inds,
                translation_factor=self.translation_factor,
                rotation=self.rotation,
                random_state=self.random_state,
            )
        )

Finally, the initializer of the Transform class is stochastic:

class Transform:
    """Defines a unique random transformation (index selection, translation, and optionally rotation)
    which can be applied to a point
    """

    def __init__(...):
        ...
        self.translation = ...  # <<< This line is stochastic

Therefore, my conclusion is that the evaluation_function of ArtificialFunction is stochastic. This is the bug I discovered.
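
To make the issue concrete, here is a minimal, self-contained sketch (toy classes, not the actual nevergrad code) of why drawing the translation from a passed-in RandomState, rather than from the global np.random, restores reproducibility:

import numpy as np

class ToyTransform:
    """Stand-in for utils.Transform: its translation is drawn at init time."""

    def __init__(self, dim: int, random_state: np.random.RandomState) -> None:
        # Drawing from a dedicated RandomState keeps the transform reproducible;
        # drawing from the global np.random would depend on unrelated prior draws.
        self.translation = random_state.normal(size=dim)

# Two transforms built from identically seeded states are identical:
t1 = ToyTransform(3, np.random.RandomState(12))
t2 = ToyTransform(3, np.random.RandomState(12))
assert np.array_equal(t1.translation, t2.translation)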

@jrapin (Contributor) left a comment

That's great, thanks!
More comments coming; I'm trying to get a better understanding of it little by little.

Also, I am considering adding a test which counts the number of calls to np.random and fails if we add one, with a message explaining the options :D Might be safer in the long run^^
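
A hedged sketch of what such a guard test could look like (run_xp is a hypothetical entry point, and the patching strategy is an assumption, not the actual implementation):

import numpy as np
from unittest import mock

def test_no_global_np_random_usage() -> None:
    calls = []
    original = np.random.randint

    def counting_randint(*args, **kwargs):
        calls.append(args)  # record every draw from the global generator
        return original(*args, **kwargs)

    with mock.patch("numpy.random.randint", side_effect=counting_randint):
        run_xp()  # hypothetical entry point for the code under test
    assert not calls, (
        f"{len(calls)} unexpected np.random.randint call(s); "
        "seed explicitly or use the parametrization's random_state"
    )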

@@ -110,7 +110,7 @@ def __init__(
         if mutation == "adaptive":
             self._adaptive_mr = 0.5
         if mutation == "coordinatewise_adaptive":
-            self._velocity = np.random.uniform(size=self.dimension) * arity / 4.0
+            self._velocity = self._rng.uniform(size=self.dimension) * arity / 4.0
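
A short sketch of why this one-line change matters (a simplified toy class, not the actual optimizer): an optimizer drawing from its own _rng is unaffected by unrelated draws from the global generator.

import numpy as np

class ToyOptimizer:
    def __init__(self, dimension: int, seed: int) -> None:
        self._rng = np.random.RandomState(seed)
        # Using self._rng instead of np.random makes this reproducible,
        # no matter what else has consumed the global stream:
        self._velocity = self._rng.uniform(size=dimension)

np.random.rand()  # an unrelated global draw
expected = np.random.RandomState(0).uniform(size=3)
assert np.array_equal(ToyOptimizer(3, seed=0)._velocity, expected)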

good catch!
I would have expected this to get caught by the CI though :s


same for the 3 changes below, nice catches!

nevergrad/benchmark/test_xpbase.py (resolved thread)
@pacowong (Contributor, Author) commented Jan 8, 2021

@jrapin Concerning your question about the comment:

    # Because copy() can have different random initialization for the parameters,
    # the function should be copied early.
    np.random.seed(seed)
    pfunc = func.copy()
    # assert False  # Paco: terminate early for debugging

Please trace the following code:

  1. ExperimentFunction::copy()
     Trace this line: output = self._internal_copy()
  2. ExperimentFunction::_internal_copy(self: EF) -> EF
     Trace this line: output: EF = self.__class__(...)
  3. ExperimentFunction::__init__
     Trace this line: self.parametrization = parametrization
  4. ExperimentFunction::parametrization(self, parametrization: p.Parameter) -> None
@parametrization.setter
def parametrization(self, parametrization: p.Parameter) -> None:
    self._parametrization = parametrization
    self._parametrization.freeze()
    # pylint: disable=pointless-statement
    # print(self._parametrization._random_state)  # Paco: you will see None output twice for each run in the console
    self._parametrization.random_state  # force initialization for synchronization of random state
    # # TODO investigate why this synchronization is needed
  5. Parameter::random_state(self) -> np.random.RandomState
     Somewhere in the program, the random_state of a Parameter variable will be read, and this will invoke:
    @property
    def random_state(self) -> np.random.RandomState:
        """Random state the instrumentation and the optimizers pull from.
        It can be seeded/replaced.
        """
        if self._random_state is None:
            # use the setter, to make sure the random state is propagated to the variables
            seed = np.random.randint(2 ** 32, dtype=np.uint32)
            self._set_random_state(np.random.RandomState(seed))
        assert self._random_state is not None
        return self._random_state

Because self._random_state is None holds, seed = np.random.randint(2 ** 32, dtype=np.uint32) will be invoked, drawing a seed from the global NumPy random stream.
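
The consequence can be reproduced stand-alone (a toy class, not nevergrad's Parameter): any draw from the global stream before the lazy initialization changes which random state gets created.

import numpy as np

class ToyParam:
    def __init__(self) -> None:
        self._random_state = None

    @property
    def random_state(self) -> np.random.RandomState:
        if self._random_state is None:
            # The seed comes from the *global* stream, so any earlier
            # global draw changes which state is created here:
            seed = np.random.randint(2 ** 32, dtype=np.uint32)
            self._random_state = np.random.RandomState(seed)
        return self._random_state

np.random.seed(0)
a = ToyParam().random_state.rand()
np.random.seed(0)
np.random.rand()  # one extra global draw before the lazy initialization...
b = ToyParam().random_state.rand()
print(a == b)  # ...prints False: the two runs are desynchronized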

@jrapin (Contributor) commented Jan 18, 2021

Sorry I took so long; with so many things happening, solving this slipped my mind.
In the flow you are describing, it looks like you are missing that step 5 is actually called in step 4: the last line of the parametrization setter calls the random_state getter, so the random state is initialized from the very beginning.

In both cases anyway, I don't think moving the seeding and the copy a few lines up is supposed to matter.

@pacowong (Contributor, Author):

When I added

if self.TEMP_DEBUG:
    assert False

inside the self._random_state is None block in step 5, the assertion was triggered. The seed needs to be reset before func.copy().

In addition, I moved func.copy() forward because it helps isolate the problem. For example, the random state of func may change after the optimizer uses it, as in the code below:

    xp.run()
    loss_ref = xp.result["loss"]
    # now with copy
    assert xp._optimizer is not None
    reco = xp._optimizer.provide_recommendation()
    assert reco is not None
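
A minimal illustration (toy code, not the test itself) of the interference being avoided: when the optimizer and the function share a random stream, running one advances the state seen by the other.

import numpy as np

shared = np.random.RandomState(0)

def func_sample() -> float:
    return shared.rand()  # the "function" draws from the shared stream

baseline = np.random.RandomState(0).rand()
shared.rand()  # the "optimizer" consumes a draw first...
assert func_sample() != baseline  # ...so the function sees a shifted stream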

@jrapin (Contributor) commented Jan 20, 2021

When I added

if self.TEMP_DEBUG:
    assert False

inside the self._random_state is None block in step 5, the assertion was triggered. The seed needs to be reset before func.copy().

Not sure what you mean by this, but I definitely agree that the seed needs to be set before a new function is initialized.

In addition, I moved func.copy() forward because it helps isolate the problem. For example, the random state of func may change after the optimizer uses it, as in the code below:

I see your point. My initial aim with the test was to prove that I could reinstantiate a function later on, in an isolated part of the code, and that it would provide the same results. When you move it, you show that the initial function does not change the copy. I'm not sure one claim is stronger than the other: in the first case you don't really know whether there are interferences, and in the second, the copy could have modified the initial function (I would expect it shouldn't have, although maybe sometimes it could). I guess both should be true^^

@jrapin (Contributor) commented Jan 20, 2021

Is there anything blocking you from continuing? I see the CI has some changed values. I guess the values can be updated if that's only due to different random generators.

@pacowong (Contributor, Author):

@jrapin The complaints from mypy are valid. Please check why the facebook:master branch passed the tests previously.

@pacowong changed the title from "[WIP] Fix random state for test_noisy_artificial_function_loss" to "Fix random state for test_noisy_artificial_function_loss" Jan 23, 2021
@jrapin mentioned this pull request Jan 25, 2021
@jrapin (Contributor) commented Jan 26, 2021

@pacowong the typing issue was due to a mypy version update which now seems to parse non-package files. It's now fixed in master, so you can merge master into your branch, and if things are still OK I'll merge this. Thanks for the fixes ;)

@jrapin self-requested a review January 26, 2021 09:10
@jrapin (Contributor) commented Jan 26, 2021

@pacowong thanks for the fix!

@jrapin merged commit a4e7c1f into facebookresearch:master Jan 26, 2021
@pacowong (Contributor, Author):

@jrapin You're welcome. ;)

@pacowong deleted the fix_random_state branch January 26, 2021 15:15