
Question: How can I manage the reproducibility of an experiment? #35

Closed
angel-ayala opened this issue Aug 14, 2020 · 5 comments

Comments

@angel-ayala

Hi,
I'm currently setting a seed value for the environment using its seed method, but when I run the experiment multiple times I get very different results.
I know that many variables are involved and results may diverge, but I was wondering if there is any other parameter I should set in order to reduce this divergence.

I'm using your advanced experiment as a base to run a Gym CartPole environment with linearly decaying epsilon and learning rate parameters, just to learn how your library works; I previously coded this problem from scratch with Q-learning with successful results.

My code is something like this:

# Imports as of mushroom_rl 1.x (module paths may differ in newer versions)
import numpy as np

from mushroom_rl.algorithms.value import SARSALambdaContinuous
from mushroom_rl.approximators.parametric import LinearApproximator
from mushroom_rl.environments import Gym
from mushroom_rl.features import Features
from mushroom_rl.features.tiles import Tiles
from mushroom_rl.policy import EpsGreedy
from mushroom_rl.utils.parameters import LinearParameter

seed_val = 0      # example value
episodes = 1000   # example value

# Environment
environment = Gym(name='CartPole-v0', horizon=np.inf, gamma=1.)
environment.seed(seed_val)

# Policy with linearly decaying epsilon
linear_epsilon = LinearParameter(0.9, 0.1, n=episodes//2)
pi = EpsGreedy(epsilon=linear_epsilon)

# State codification with tile coding
n_tilings = 1
tilings = Tiles.generate(n_tilings, [1, 1, 6, 3],
                         environment.info.observation_space.low,
                         environment.info.observation_space.high)
features = Features(tilings=tilings)

approximator_params = dict(input_shape=(features.size,),
                           output_shape=(environment.info.action_space.n,),
                           n_actions=environment.info.action_space.n)

# Agent with linearly decaying learning rate
linear_alpha = LinearParameter(0.5, 0.1, n=episodes//2)
agent = SARSALambdaContinuous(environment.info, pi, LinearApproximator,
                              approximator_params=approximator_params,
                              learning_rate=linear_alpha,
                              lambda_coeff=.9, features=features)

Currently the agent is not learning, but that is not the issue; my concern is just the different results I obtain across runs.

Thanks,

@angel-ayala
Author

Another thing: I just noticed that the parameter value is updated on each step.
Is there any way to do this only when the episode ends?

My current plan is to write a class and call it in a step callback.

class EpisodicDecay:
    """Step callback that makes a Parameter decay per episode instead of per step."""
    def __init__(self, parameter):
        self.parameter = parameter
        # Snapshot of the parameter's internal update counter
        self._init_table = parameter._n_updates.table.copy()

    def __call__(self, dataset):
        if dataset[-1][-1]:  # last flag of the last sample: the episode has ended
            self._init_table += 1
        # Overwrite the per-step counter so the decay only advances once per episode
        self.parameter._n_updates.table = self._init_table.copy()
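
For reference, a minimal sketch of how such a callback could be wired into a training loop; the Core keyword (callback_step) and the learn arguments are assumptions that may differ between mushroom_rl versions:

from mushroom_rl.core import Core

# Assumed wiring: register EpisodicDecay as a step callback
# (the exact keyword, callback_step vs. callbacks, depends on the version).
episodic_decay = EpisodicDecay(linear_epsilon)
core = Core(agent, environment, callback_step=episodic_decay)
core.learn(n_episodes=episodes, n_steps_per_fit=1)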

@boris-il-forte
Collaborator

boris-il-forte commented Aug 14, 2020

You also need to set the numpy seed, as it influences the policy.
If you add torch, you should set that seed too.

In general, we cannot write a single method to set every seed, as different libraries may use different random generators.
E.g. the environment's seed method only applies to Gym environments; all the others use the default numpy generator.
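
For reference, a minimal sketch of seeding the relevant generators; seed_val, the environment object from the snippet above, and the torch call are assumptions that depend on which components your experiment actually uses:

import random
import numpy as np
import torch

seed_val = 0                 # example value
np.random.seed(seed_val)     # numpy generator, used e.g. by the epsilon-greedy policy
random.seed(seed_val)        # Python's random, in case any component relies on it
torch.manual_seed(seed_val)  # only needed if torch-based approximators are involved
environment.seed(seed_val)   # Gym environment seed, as in the snippet above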

For the parameter decay, the approach you propose is the only supported way to achieve that behavior. That's exactly one of the use cases of callbacks.

@angel-ayala
Author

Oh right!
Yes, I know how to do that, and it makes sense. Thanks!

And about the episodic decay parameter, that's OK, I can handle it.

Thanks!

@NishanthVAnand

I get a NotImplementedError when I try to set the seed on some of the environments. Any thoughts on how to fix it?

env = PuddleWorld()
env.seed(seed)
 File  "/python3.8/site-packages/mushroom_rl/core/environment.py", line 137, in seed
     raise NotImplementedError
 NotImplementedError
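
Based on the earlier comment that non-Gym environments rely on the default numpy generator, a minimal sketch (an assumption, not the library's prescribed approach) would be to seed numpy directly and guard the env.seed call:

import numpy as np

np.random.seed(seed)
try:
    env.seed(seed)
except NotImplementedError:
    # e.g. PuddleWorld: its randomness comes from the global numpy generator
    pass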

@boris-il-forte
Collaborator

see my answer to #78
