
Reference for DeepMind using FireResetEnv wrapper's functionality #240

Open
DennisSoemers opened this issue Dec 30, 2017 · 14 comments


In atari_wrappers.py (specifically, the wrap_deepmind function at the very bottom), the docstring says that the function configures the environment for "DeepMind-style Atari".

Can anyone provide a reference to a DeepMind paper confirming that they do in fact use the functionality implemented by the FireResetEnv wrapper? I was able to find the functionality of all the other wrappers applied in that function (and also the ones in the make_atari function above it) described in various DeepMind papers (such as the Mnih et al. (2015) DQN Nature paper), but I was unable to find anything resembling the behaviour of FireResetEnv.
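For context, the wrapper in question simply presses FIRE (and then action 2) once on every reset. Paraphrasing the current code in atari_wrappers.py (a sketch, using the old gym step API that returns four values):

```python
import gym

class FireResetEnv(gym.Wrapper):
    """Press FIRE on reset, for games that sit waiting for it before starting."""

    def __init__(self, env):
        gym.Wrapper.__init__(self, env)
        assert env.unwrapped.get_action_meanings()[1] == 'FIRE'
        assert len(env.unwrapped.get_action_meanings()) >= 3

    def reset(self, **kwargs):
        self.env.reset(**kwargs)
        obs, _, done, _ = self.env.step(1)  # FIRE
        if done:
            self.env.reset(**kwargs)
        obs, _, done, _ = self.env.step(2)  # second action, also used to start some games
        if done:
            self.env.reset(**kwargs)
        return obs
```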

It may just be a minor detail, but I do think it's important to be precise with this kind of stuff for the sake of reproducibility.


muupan commented Mar 18, 2018

I'm also curious whether the DeepMind papers use the same trick. I was unable to find it in their code (dqn, alewrap and xitari).

@DennisSoemers (Author)

@muupan I've continued looking into this since writing the issue. I have not gotten 100% confirmation anywhere, but I did email John Schulman (who authored the initial commit in this repository containing the comment claiming that this is DeepMind-style, although he did not originally implement that wrapper), and he has no idea where it came from.

Additionally, it appears that the intended functionality (automatically playing the actions required to start a game on reset) is already implemented in the Arcade Learning Environment itself, in the following line of code: https://github.com/mgbellemare/Arcade-Learning-Environment/blob/master/src/environment/stella_environment.cpp#L88

So it appears that anyone who uses the Arcade Learning Environment (which is basically everyone: DeepMind, but also everyone who runs Atari games through OpenAI Gym, etc.) already gets this functionality out of the box, even without FireResetEnv. I suspect FireResetEnv may be completely redundant. I've never personally tested how it affects performance anywhere; that would still be interesting to do, just to be sure. I did briefly test AirRaid and Asterix (which, as far as I can tell, are two games that supposedly require pressing FIRE to start), and they appeared to play just fine both with and without FireResetEnv.


muupan commented Mar 22, 2018

@DennisSoemers Thank you for the information. FireResetEnv does make a difference for Breakout, at least: Breakout has no getStartingActions entry in the ALE, but it needs FIRE to launch the ball. Without FireResetEnv, if the algorithm fails to learn to press FIRE, it gets stuck.
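A quick way to see this (a sketch, assuming the old gym step API, not anything from DeepMind's code): step Breakout with NOOP only and note that no reward is ever received and no life is ever lost, because the game just waits for FIRE.

```python
import gym

env = gym.make('BreakoutNoFrameskip-v4')
env.reset()
total_reward = 0.0
start_lives = env.unwrapped.ale.lives()
for _ in range(1000):
    _, reward, done, _ = env.step(0)  # 0 == NOOP in Breakout's action set
    total_reward += reward
    if done:
        break
print(total_reward, start_lives, env.unwrapped.ale.lives())
# Expected: total_reward stays 0.0 and no lives are lost; the ball is never launched.
```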

@DennisSoemers (Author)

Ah, I see, thanks for letting me know about that one. I guess the only way to tell for sure is to contact DeepMind and ask them if they're using anything like this. Right now I'm inclined to bet on "no" though.


muupan commented Mar 22, 2018

I sent an email to the author of the DQN paper. I hope he will answer it.

@DennisSoemers (Author)

@muupan Just wondering if you ever got a reply?


muupan commented Apr 15, 2018

Unfortunately not.


Kaixhin commented May 8, 2018

After doing some experiments I suspect that they don't use this wrapper, and like @muupan I was unable to find it in their code. It seems that for the DQN-based agents at least, DeepMind evaluates using an ɛ-greedy policy, where ɛ = 0.05. So during evaluation in the early stages of training, the ball does end up getting released quickly in Breakout (whereas when I used ɛ = 0.001 evaluation basically got stuck). I'm training a Rainbow agent now, will try to remember to report back with results once it is done.
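For reference, the evaluation policy I mean is just standard ɛ-greedy action selection (a minimal sketch below; `q_values` stands for whatever the agent's network outputs and is my own placeholder). With ɛ = 0.05 a random action is taken 5% of the time, which in Breakout eventually presses FIRE even if the greedy action never does.

```python
import random

def epsilon_greedy(q_values, epsilon=0.05):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    n_actions = len(q_values)
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_values[a])
```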

By the way, if you're interested in reproducibility, DeepMind's code shows that they use bilinear interpolation for downsampling, as opposed to the wrapper here which uses pixel area relation.
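Concretely (a sketch with OpenCV, leaving out the grayscale conversion both versions also do): the WarpFrame wrapper here resizes with cv2.INTER_AREA, while bilinear interpolation corresponds to cv2.INTER_LINEAR.

```python
import cv2

def warp_area(frame):
    # What baselines' WarpFrame does: pixel area relation.
    return cv2.resize(frame, (84, 84), interpolation=cv2.INTER_AREA)

def warp_bilinear(frame):
    # Closer to DeepMind's published preprocessing: bilinear interpolation.
    return cv2.resize(frame, (84, 84), interpolation=cv2.INTER_LINEAR)
```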


Kaixhin commented May 22, 2018

Got confirmation from Charles Beattie at DeepMind that they do not use anything like the FireResetEnv wrapper, so I have no idea where that came from.


muupan commented Oct 6, 2018

@Kaixhin

> It seems that for the DQN-based agents at least, DeepMind evaluates using an ɛ-greedy policy, where ɛ = 0.05

Which specific papers do you mean by the DQN-based agents? As far as I know, at least the PER paper seems to use 0.01 for up-to-30-noop evaluation (from Table 5 of http://arxiv.org/abs/1511.05952), while the QR-DQN paper seems to use 0.001 for up-to-30-noop evaluation (from "Best agent performance" subsection of https://arxiv.org/abs/1710.10044). Correct me if I'm wrong.


Kaixhin commented Oct 6, 2018

Sorry, yes, it may differ from paper to paper. I got ɛ = 0.05 from the (Nature) DQN paper, but more recently they seem to have been using ɛ = 0.001.

Unfortunately they are still changing settings: the new PopArt + IMPALA paper takes away termination on loss of life. I hope they'll settle on the setup in the Revisiting ALE paper, but as long as DeepMind is concerned with improving upon their own results, that seems unlikely.


muupan commented Oct 6, 2018

Thank you very much for clarifying it. I hope so, too.

AdamGleave added a commit to HumanCompatibleAI/baselines that referenced this issue Mar 24, 2019
@steffenvan

Hi @Kaixhin,
since this is still open, I was wondering what exactly you mean by "the PopArt + IMPALA paper takes away the loss of life wrapper"? I'm asking because I'm currently trying to reproduce those exact results with the existing wrappers.

Best,
Steffen


Kaixhin commented Apr 30, 2019

@steffenvan I assume that if you don't use the EpisodicLifeEnv wrapper you'll get an equivalent setup to the one they used in that paper.
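In terms of this repo's helpers, wrap_deepmind exposes an episode_life flag, so something like the sketch below should give that setup (the other flags are up to you; Breakout is just an example):

```python
from baselines.common.atari_wrappers import make_atari, wrap_deepmind

env = make_atari('BreakoutNoFrameskip-v4')
# episode_life=False skips the EpisodicLifeEnv wrapper; the rest of the
# DeepMind-style preprocessing stays in place.
env = wrap_deepmind(env, episode_life=False, clip_rewards=True, frame_stack=True)
```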
