Support Actor preprocessing network reuse for continuous case, fixes in DQN network #1128

opcode81 · 2024-04-29T12:30:10Z

I have added the correct label(s) to this Pull Request or linked the relevant issue(s)
I have provided a description of the changes in this Pull Request
I have added documentation for my changes
If applicable, I have added tests to cover my changes.
I have reformatted the code using poe format
I have checked style and types with poe lint and poe type-check
(Optional) I ran tests locally with poe test
(or a subset of them with poe test-reduced) ,and they pass
(Optional) I have tested that documentation builds correctly with poe doc-build

This PR fixes a bug in DQN and lifts a limination in reusing the actor's preprocessing network for continuous environments.

atari_network.DQN:
- Fix input validation
- Fix output_dim not being set if features_only=True and output_dim_added_layer not None
continuous.Critic:
- Add flag apply_preprocess_net_to_obs_only to allow the
  preprocessing network to be applied to the observations only (without
  the actions concatenated), which is essential for the case where we want
  to reuse the actor's preprocessing network
- CriticFactoryReuseActor: Use the flag, fixing the case where we want to reuse an actor's
  preprocessing network for the critic (must be applied before concatenating
  the actions)
Minor improvements in docs/docstrings

* Fix input validation * Fix output_dim not being set if features_only=True and output_dim_added_layer not None

… the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network

preprocessing network for the critic (must be applied before concatenating the actions)

…s, replay_buffer_stack_num

tianshou/utils/net/continuous.py

codecov-commenter · 2024-04-29T21:18:26Z

Codecov Report

Attention: Patch coverage is 43.10345% with 33 lines in your changes are missing coverage. Please review.

Project coverage is 86.07%. Comparing base (081aded) to head (40f7724).

Files	Patch %	Lines
tianshou/utils/pickle.py	31.11%	31 Missing ⚠️
tianshou/highlevel/module/critic.py	0.00%	1 Missing ⚠️
tianshou/utils/net/continuous.py	90.00%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1128      +/-   ##
==========================================
- Coverage   86.36%   86.07%   -0.29%     
==========================================
  Files         102      103       +1     
  Lines        8579     8632      +53     
==========================================
+ Hits         7409     7430      +21     
- Misses       1170     1202      +32

Flag	Coverage Δ
unittests	`86.07% <43.10%> (-0.29%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

DQN:

be1c8cd

* Fix input validation * Fix output_dim not being set if features_only=True and output_dim_added_layer not None

opcode81 added bug Something isn't working high-level-api Mainly affects tianshou.highlevel labels Apr 29, 2024

opcode81 force-pushed the fix/continuous-actor-with-nonflat-obs branch 2 times, most recently from 94537e3 to ccc2aa5 Compare April 29, 2024 15:36

opcode81 changed the title ~~Fix/continuous actor with nonflat obs~~ Support Actor preprocessing network reuse for continuous case, fixes in DQN network Apr 29, 2024

opcode81 force-pushed the fix/continuous-actor-with-nonflat-obs branch from ccc2aa5 to 678bf40 Compare April 29, 2024 15:46

opcode81 added 6 commits April 29, 2024 18:27

Add pickle/serialisation utils: setstate and getstate

18ed981

continuous.Critic: Add flag apply_preprocess_net_to_obs_only to allow…

0b49484

… the preprocessing network to be applied to the observations only (without the actions concatenated), which is essential for the case where we want to reuse the actor's preprocessing network

CriticFactoryReuseActor: Fix the case where we want to reuse an actor's

d18ded3

preprocessing network for the critic (must be applied before concatenating the actions)

Fix typo

7473741

SamplingConfig: Improve docstrings of replay_buffer_save_only_last_ob…

250a129

…s, replay_buffer_stack_num

Improve docstrings

8ac6bf5

opcode81 force-pushed the fix/continuous-actor-with-nonflat-obs branch from 678bf40 to 8ac6bf5 Compare April 29, 2024 16:27

MischaPanch approved these changes Apr 29, 2024

View reviewed changes

tianshou/utils/net/continuous.py Show resolved Hide resolved

MischaPanch marked this pull request as ready for review April 29, 2024 19:49

opcode81 added 2 commits April 29, 2024 22:14

Mention CHANGELOG.md in PR template

8308392

Update change log with changes from #1128

40f7724

MischaPanch merged commit a65920f into master Apr 29, 2024
8 checks passed

MischaPanch deleted the fix/continuous-actor-with-nonflat-obs branch April 29, 2024 21:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Actor preprocessing network reuse for continuous case, fixes in DQN network #1128

Support Actor preprocessing network reuse for continuous case, fixes in DQN network #1128

opcode81 commented Apr 29, 2024 •

edited

Loading

codecov-commenter commented Apr 29, 2024 •

edited

Loading

Support Actor preprocessing network reuse for continuous case, fixes in DQN network #1128

Support Actor preprocessing network reuse for continuous case, fixes in DQN network #1128

Conversation

opcode81 commented Apr 29, 2024 • edited Loading

codecov-commenter commented Apr 29, 2024 • edited Loading

Codecov Report

opcode81 commented Apr 29, 2024 •

edited

Loading

codecov-commenter commented Apr 29, 2024 •

edited

Loading