Fix replay buffer dtype #554

ahtsan · 2019-03-02T05:26:38Z

The replay buffer should not have a default dtype, since each element in the replay buffer should have the same dtype as its source; e.g., observations should have the same dtype as env.observation_space. One example is DQN with a pixel environment, where we want the observations in the replay buffer to be np.uint8, the same as the observation space.
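
To illustrate the motivation, here is a minimal sketch using only NumPy. The helper `allocate_storage` and the 84x84 frame shape are assumptions made for illustration; they are not part of garage. It compares storage that follows the source dtype with storage forced to a float default.

```python
import numpy as np


def allocate_storage(size_in_transitions, shape, dtype):
    """Preallocate storage for one replay-buffer field (hypothetical helper)."""
    return np.zeros((size_in_transitions, *shape), dtype=dtype)


# Assumed pixel observation space: 84x84 frames stored as uint8.
obs_shape, obs_dtype = (84, 84), np.uint8

# Storage that follows the dtype of the source.
native = allocate_storage(100, obs_shape, obs_dtype)
# Storage with a hard-coded float default, as a buffer-wide dtype would force.
forced = allocate_storage(100, obs_shape, np.float64)

print(native.dtype, native.nbytes)
print(forced.dtype, forced.nbytes)
```

Under these assumptions the float64 storage takes eight times the memory of the uint8 storage for the same number of transitions.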

codecov · 2019-03-02T05:43:37Z

Codecov Report

Merging #554 into master will decrease coverage by 0.02%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master     #554      +/-   ##
==========================================
- Coverage   58.37%   58.35%   -0.03%     
==========================================
  Files         135      135              
  Lines        9159     9159              
  Branches     1361     1361              
==========================================
- Hits         5347     5345       -2     
+ Misses       3430     3423       -7     
- Partials      382      391       +9

| Impacted Files | Coverage Δ | |
|---|---|---|
| garage/replay_buffer/simple_replay_buffer.py | `100% <100%> (ø)` | ⬆️ |
| garage/replay_buffer/base.py | `83.05% <100%> (ø)` | ⬆️ |
| ...arage/tf/samplers/off_policy_vectorized_sampler.py | `71.79% <50%> (ø)` | ⬆️ |
| garage/tf/optimizers/first_order_optimizer.py | `65.27% <0%> (-2.78%)` | ⬇️ |
| garage/tf/policies/categorical_gru_policy.py | `80% <0%> (ø)` | ⬆️ |
| garage/misc/krylov.py | `20% <0%> (ø)` | ⬆️ |
| garage/sampler/stateful_pool.py | `53.78% <0%> (ø)` | ⬆️ |
| garage/tf/policies/gaussian_lstm_policy.py | `78.83% <0%> (ø)` | ⬆️ |
| garage/tf/policies/gaussian_gru_policy.py | `78.67% <0%> (ø)` | ⬆️ |
| garage/tf/policies/categorical_lstm_policy.py | `79.83% <0%> (ø)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b8ba37e...be27825.

ryanjulian · 2019-03-04T16:59:20Z

tests/garage/replay_buffer/test_replay_buffer.py

+        obs = env.reset()
+        replay_buffer = SimpleReplayBuffer(
+            env_spec=env, size_in_transitions=100, time_horizon=1)
+        replay_buffer.add_transition(


If this API operates on collections instead of single values, shouldn't it be add_transitions?

ahtsan · 2019-03-04

I think the API works specifically for VecEnvExecutor, where it adds a single transition for each of the n vec_env, so every argument is a list of length n. If I have two vec_env, I will call replay_buffer.add_transition(observation=[obs1, obs2], action=[act1, act2]). But this assumption is not clear. We should add a docstring stating that this replay buffer only works this way, rename it to something like VecEnvReplayBuffer, and create another replay buffer that works with a single environment.
@CatherineSue Please correct me if I am wrong.

CatherineSue · 2019-03-04

No, there is no such assumption.

CatherineSue · 2019-03-04

It adds a collection of transitions instead of a single transition. I agree the API should be add_transitions.

ahtsan · 2019-03-04

I see, we can also add more than one transition even when there is only one vec_env. I will rename the API and submit.

ryanjulian · 2019-03-04

If you would also like to have the add_transition API, it's trivial to define given add_transitions (just wrap the args in a list and call add_transitions).
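
The wrapping suggested above could look roughly like the sketch below. `TinyReplayBuffer` is a hypothetical toy class written for illustration, not garage's SimpleReplayBuffer, and its keyword-argument interface is an assumption based on the call shown earlier in this thread.

```python
import numpy as np


class TinyReplayBuffer:
    """Toy buffer for illustration only (hypothetical, not garage's class)."""

    def __init__(self):
        self._storage = {}

    def add_transitions(self, **kwargs):
        """Append a collection of transitions; each value is a sequence."""
        for key, values in kwargs.items():
            # np.asarray keeps the dtype of arrays that are passed in.
            self._storage.setdefault(key, []).extend(
                np.asarray(value) for value in values)

    def add_transition(self, **kwargs):
        """Append one transition by wrapping each value in a list."""
        self.add_transitions(
            **{key: [value] for key, value in kwargs.items()})

    def __len__(self):
        return max((len(v) for v in self._storage.values()), default=0)


buffer = TinyReplayBuffer()
# Two transitions at once, e.g. one per environment in a vectorized sampler.
buffer.add_transitions(
    observation=[np.zeros(2, np.uint8), np.ones(2, np.uint8)],
    action=[0, 1])
# A single transition through the convenience wrapper.
buffer.add_transition(observation=np.ones(2, np.uint8), action=1)
```

Keeping add_transitions as the primary method means the single-transition case needs no separate storage logic.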

ahtsan requested a review from a team as a code owner March 2, 2019 05:26

ahtsan requested review from CatherineSue and ryanjulian March 2, 2019 05:27

CatherineSue approved these changes Mar 2, 2019

ahtsan added 2 commits March 1, 2019 23:42

Fix replay buffer dtype

91c153b

Add test

6dd4c69

ahtsan force-pushed the replay_buffer_fix branch from 0643db5 to 6dd4c69 Compare March 2, 2019 07:44

ryanjulian approved these changes Mar 4, 2019

API change

be27825

ahtsan merged commit 16d2d04 into master Mar 5, 2019

ahtsan deleted the replay_buffer_fix branch March 5, 2019 01:21
