
Fix random state for test_noisy_artificial_function_loss #977

Merged
merged 11 commits into facebookresearch:master from pacowong:fix_random_state Jan 26, 2021

Conversation

@pacowong (Contributor) commented Dec 24, 2020

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and Context / Related issue

Related issue: #966
The goal is to resolve the bugs in random_state management for test_noisy_artificial_function_loss, which cause reproducibility issues.
If you want to patch the current system immediately, I can add np.random.seed(seed) back to the test case.

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document and completed the CLA (see CLA).
  • All tests passed, and additional code has been covered with new tests.

@facebook-github-bot added the CLA Signed label Dec 24, 2020
@teytaud (Contributor) commented Dec 25, 2020

Thanks! For the failure in the "static" CI, just running "black nevergrad" should solve it; it is a formatting issue.

@pacowong (Contributor, Author) commented Dec 26, 2020

Merry Christmas! We need to decide whether ArtificialVariable is stochastic and, if so, which pseudorandom generator it should use, because it relies on utils.Transform, which is stochastic, while the test cases are pinned to fixed values. In other words, any change in how the pseudorandom generator is used will affect the results.

Until the decision is made, the following statement will remain fixed:

if noise_level > 0:
    parametrization.descriptors.deterministic_function = False

@teytaud (Contributor) commented Dec 26, 2020

Actually, there is deterministic_function, which means that the function is deterministic, and deterministic, which means that both the function and the transformation are deterministic.
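
For illustration, a minimal sketch of how these two flags would be set on a parametrization (only descriptors.deterministic_function appears verbatim in this thread; the rest of the API usage is an assumption about the nevergrad version at the time):

import nevergrad as ng

param = ng.p.Array(shape=(2,))
param.descriptors.deterministic_function = False  # the objective returns noisy values
# Clearing `deterministic` instead would declare both the function and the
# transformation stochastic (assumption based on the explanation above).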

@teytaud (Contributor) commented Dec 26, 2020

Can you show which bug you are fixing? This is still unclear to me.

@teytaud (Contributor) commented Dec 26, 2020

Merry Christmas to all people who have a Christmas or whichever day off you prefer during this time of the year :-)

@pacowong (Contributor, Author) commented Dec 29, 2020

Actually, there is deterministic_function, which means that the function is deterministic, and deterministic, which means that both the function and the transformation are deterministic.

This involves several function calls.
First of all, ArtificialFunction implements the evaluation_function method, which uses the _transform method:

def _transform(self, x: tp.ArrayLike) -> np.ndarray:
    data = self.transform_var.process(x)  # <<< self.transform_var is an ArtificialVariable object
    return np.array(data)

I found that the process method of the ArtificialVariable class is stochastic:

def process(  # pylint: disable=unused-argument
    self, data: tp.ArrayLike, deterministic: bool = True
) -> np.ndarray:
    if not self._transforms:  # This is true, as self._transforms is []
        self._initialize()  # <<< Trace this method
        ....

From the _initialize() method, we can see that it creates a utils.Transform object:

def _initialize(self) -> None:
    for transform_inds in tools.grouper(indices, n=self.block_dimension):
        self._transforms.append(
            utils.Transform(  # <<< This line calls __init__ in utils.Transform
                transform_inds,
                translation_factor=self.translation_factor,
                rotation=self.rotation,
                random_state=self.random_state,
            )
        )

Finally, the initializer of the Transform class is stochastic:

class Transform:
    """Defines a unique random transformation (index selection, translation, and optionally rotation)
    which can be applied to a point
    """

    def __init__(...):
        ...
        self.translation = ...  # <<< This line is stochastic

Therefore, my conclusion is that the evaluation_function of ArtificialFunction is stochastic. This is the bug I discovered.
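
To make the issue concrete, here is a minimal, self-contained sketch (toy classes, not the actual nevergrad code) of why drawing the translation from a passed-in RandomState, rather than from the global np.random, restores reproducibility:

import numpy as np

class ToyTransform:
    """Stand-in for utils.Transform: its translation is drawn at init time."""

    def __init__(self, dim: int, random_state: np.random.RandomState) -> None:
        # Drawing from a dedicated RandomState keeps the transform reproducible;
        # drawing from the global np.random would depend on unrelated prior draws.
        self.translation = random_state.normal(size=dim)

# Two transforms built from identically seeded states are identical:
t1 = ToyTransform(3, np.random.RandomState(12))
t2 = ToyTransform(3, np.random.RandomState(12))
assert np.array_equal(t1.translation, t2.translation)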

@jrapin (Contributor) left a comment

That's great, thanks!
More comments coming; I'm trying to get a better understanding of it little by little.

Also, I am considering adding a test which counts the number of calls to np.random and fails if we add one, with a message explaining the options :D Might be safer in the long run^^
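
A hedged sketch of what such a guard test could look like (run_xp is a hypothetical entry point, and the patching strategy is an assumption, not the actual implementation):

import numpy as np
from unittest import mock

def test_no_global_np_random_usage() -> None:
    calls = []
    original = np.random.randint

    def counting_randint(*args, **kwargs):
        calls.append(args)  # record every draw from the global generator
        return original(*args, **kwargs)

    with mock.patch("numpy.random.randint", side_effect=counting_randint):
        run_xp()  # hypothetical entry point for the code under test
    assert not calls, (
        f"{len(calls)} unexpected np.random.randint call(s); "
        "seed explicitly or use the parametrization's random_state"
    )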

@@ -110,7 +110,7 @@ def __init__(
         if mutation == "adaptive":
             self._adaptive_mr = 0.5
         if mutation == "coordinatewise_adaptive":
-            self._velocity = np.random.uniform(size=self.dimension) * arity / 4.0
+            self._velocity = self._rng.uniform(size=self.dimension) * arity / 4.0
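
A short sketch of why this one-line change matters (a simplified toy class, not the actual optimizer): an optimizer drawing from its own _rng is unaffected by unrelated draws from the global generator.

import numpy as np

class ToyOptimizer:
    def __init__(self, dimension: int, seed: int) -> None:
        self._rng = np.random.RandomState(seed)
        # Using self._rng instead of np.random makes this reproducible,
        # no matter what else has consumed the global stream:
        self._velocity = self._rng.uniform(size=dimension)

np.random.rand()  # an unrelated global draw
expected = np.random.RandomState(0).uniform(size=3)
assert np.array_equal(ToyOptimizer(3, seed=0)._velocity, expected)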

good catch!
I would have expected this to get caught by the CI though :s


same for the 3 changes below, nice catches!

nevergrad/benchmark/test_xpbase.py (resolved thread)
@pacowong (Contributor, Author) commented Jan 8, 2021

@jrapin Concerning your question about the comment:

    # Because copy() can have different random initialization for the parameters,
    # the function should be copied early.
    np.random.seed(seed)
    pfunc = func.copy()
    # assert False  # Paco: terminate early for debugging

Please trace the following code:

  1. ExperimentFunction::copy()
     Trace this line: output = self._internal_copy()
  2. ExperimentFunction::_internal_copy(self: EF) -> EF
     Trace this line: output: EF = self.__class__(...)
  3. ExperimentFunction::__init__
     Trace this line: self.parametrization = parametrization
  4. ExperimentFunction::parametrization(self, parametrization: p.Parameter) -> None
@parametrization.setter
def parametrization(self, parametrization: p.Parameter) -> None:
    self._parametrization = parametrization
    self._parametrization.freeze()
    # pylint: disable=pointless-statement
    # print(self._parametrization._random_state)  # Paco: you will see None output twice for each run in the console
    self._parametrization.random_state  # force initialization for synchronization of random state
    # # TODO investigate why this synchronization is needed
  5. Parameter::random_state(self) -> np.random.RandomState
     Somewhere in the program, the random_state of a Parameter variable will be read, and this will invoke:
    @property
    def random_state(self) -> np.random.RandomState:
        """Random state the instrumentation and the optimizers pull from.
        It can be seeded/replaced.
        """
        if self._random_state is None:
            # use the setter, to make sure the random state is propagated to the variables
            seed = np.random.randint(2 ** 32, dtype=np.uint32)
            self._set_random_state(np.random.RandomState(seed))
        assert self._random_state is not None
        return self._random_state

Because self._random_state is None holds, seed = np.random.randint(2 ** 32, dtype=np.uint32) will be invoked, drawing a seed from the global NumPy random stream.
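
The consequence can be reproduced stand-alone (a toy class, not nevergrad's Parameter): any draw from the global stream before the lazy initialization changes which random state gets created.

import numpy as np

class ToyParam:
    def __init__(self) -> None:
        self._random_state = None

    @property
    def random_state(self) -> np.random.RandomState:
        if self._random_state is None:
            # The seed comes from the *global* stream, so any earlier
            # global draw changes which state is created here:
            seed = np.random.randint(2 ** 32, dtype=np.uint32)
            self._random_state = np.random.RandomState(seed)
        return self._random_state

np.random.seed(0)
a = ToyParam().random_state.rand()
np.random.seed(0)
np.random.rand()  # one extra global draw before the lazy initialization...
b = ToyParam().random_state.rand()
print(a == b)  # ...prints False: the two runs are desynchronized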

@jrapin (Contributor) commented Jan 18, 2021

Sorry I took so long; with so many things happening, solving this slipped my mind.
In the flow you are describing, it looks like you are missing that step 5 is actually called in step 4: the last line of the parametrization setter calls the random_state getter, so the random state is initialized from the very beginning.

In both cases anyway, I don't think moving the seeding and the copy a few lines up is supposed to matter.

@pacowong (Contributor, Author):

When I added

if self.TEMP_DEBUG:
    assert False

inside the self._random_state is None block in step 5, the assertion was triggered. The seed needs to be reset before func.copy().

In addition, I moved func.copy() forward because it helps isolate the problem. For example, the random state of func may change after the optimizer uses it, as in the code below:

    xp.run()
    loss_ref = xp.result["loss"]
    # now with copy
    assert xp._optimizer is not None
    reco = xp._optimizer.provide_recommendation()
    assert reco is not None
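
A minimal illustration (toy code, not the test itself) of the interference being avoided: when the optimizer and the function share a random stream, running one advances the state seen by the other.

import numpy as np

shared = np.random.RandomState(0)

def func_sample() -> float:
    return shared.rand()  # the "function" draws from the shared stream

baseline = np.random.RandomState(0).rand()
shared.rand()  # the "optimizer" consumes a draw first...
assert func_sample() != baseline  # ...so the function sees a shifted stream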

@jrapin (Contributor) commented Jan 20, 2021

When I added

if self.TEMP_DEBUG:
    assert False

inside the self._random_state is None block in step 5, the assertion was triggered. The seed needs to be reset before func.copy().

Not sure what you mean by this, but I definitely agree that the seed needs to be set before a new function is initialized.

In addition, I moved func.copy() forward because it helps isolate the problem. For example, the random state of func may change after the optimizer uses it, as in the code below:

I see your point. My initial aim with the test was to prove that I could reinstantiate a function later on, in an isolated part of the code, and that it would provide the same results. When you move it, you show that the initial function does not change the copy. I'm not sure one claim is stronger than the other: in the first case you don't really know whether there are interferences, and in the second, the copy could have modified the initial function (I would expect it shouldn't have, although maybe sometimes it could). I guess both should be true^^

@jrapin (Contributor) commented Jan 20, 2021

Is there anything blocking you from continuing? I see the CI has some changed values. I guess the values can be updated if that's only due to different random generators.

@pacowong (Contributor, Author):

@jrapin The complaints from mypy are valid. Please check why the facebook:master branch passed the tests previously.

@pacowong changed the title from "[WIP] Fix random state for test_noisy_artificial_function_loss" to "Fix random state for test_noisy_artificial_function_loss" Jan 23, 2021
@jrapin mentioned this pull request Jan 25, 2021
@jrapin (Contributor) commented Jan 26, 2021

@pacowong the typing issue was due to a mypy version update which now seems to parse non-package files. It's now fixed in master, so you can merge master into your branch, and if things are still OK I'll merge this. Thanks for the fixes ;)

@jrapin self-requested a review January 26, 2021 09:10
@jrapin (Contributor) commented Jan 26, 2021

@pacowong thanks for the fix!

@jrapin merged commit a4e7c1f into facebookresearch:master Jan 26, 2021
@pacowong (Contributor, Author):

@jrapin You're welcome. ;)

@pacowong deleted the fix_random_state branch January 26, 2021 15:15