Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in shaped reward networks #544

Merged
merged 16 commits into from Sep 8, 2022
Merged

Fix bug in shaped reward networks #544

merged 16 commits into from Sep 8, 2022

Conversation

levmckinney
Copy link
Collaborator

@levmckinney levmckinney commented Aug 30, 2022

Description

Addresses #543

Testing

  • Add test for shaping actually being applied during predict and predict processed.
  • Add test for ValueError being raised when you try to shape a wrapped reward net.

@levmckinney levmckinney linked an issue Aug 30, 2022 that may be closed by this pull request
@levmckinney levmckinney changed the title Fix bug in shapped reward networks Fix bug in shaped reward networks Aug 30, 2022
@codecov
Copy link

codecov bot commented Aug 30, 2022

Codecov Report

Merging #544 (fc9db6b) into master (de36306) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #544      +/-   ##
==========================================
+ Coverage   96.95%   96.96%   +0.01%     
==========================================
  Files          84       84              
  Lines        7460     7486      +26     
==========================================
+ Hits         7233     7259      +26     
  Misses        227      227              
Impacted Files Coverage Δ
src/imitation/rewards/reward_nets.py 97.14% <100.00%> (+0.06%) ⬆️
src/imitation/testing/reward_nets.py 100.00% <100.00%> (ø)
tests/rewards/test_reward_nets.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@levmckinney levmckinney marked this pull request as ready for review August 30, 2022 19:06
Copy link
Member

@AdamGleave AdamGleave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to reduce code duplication but otherwise this looks fine. Please request re-review once addressed.

src/imitation/rewards/reward_nets.py Outdated Show resolved Hide resolved
tests/rewards/test_reward_nets.py Show resolved Hide resolved
src/imitation/rewards/reward_nets.py Show resolved Hide resolved
src/imitation/rewards/reward_nets.py Outdated Show resolved Hide resolved
tests/rewards/test_reward_nets.py Outdated Show resolved Hide resolved
tests/rewards/test_reward_nets.py Show resolved Hide resolved
levmckinney and others added 2 commits September 3, 2022 22:48
Co-authored-by: Adam Gleave <adam@gleave.me>
@dfilan
Copy link
Collaborator

dfilan commented Sep 7, 2022

FWIW I'm kind of confused re: what about ForwardWrapper makes it so that it primarily changes the behaviour of forward, and why it looks so different from the code for PredictProcessedWrapper.

@levmckinney
Copy link
Collaborator Author

FWIW I'm kind of confused re: what about ForwardWrapper makes it so that it primarily changes the behaviour of forward, and why it looks so different from the code for PredictProcessedWrapper.

Since the forward wrapper inherits from RewardNet without implementing forward, it still an abstract class. Thus, anyone subclassing ForwardWrapper will have to implement forward if they want to be able to create concrete instances of the class.

The reason it looks so different is that: for the ForwardWrapepr we don't want all the methods calls on instances of ForwardWrapper to be pass through to their self.base. On the other hand, for the PredictProcessedWrapper we generally do.

Copy link
Member

@AdamGleave AdamGleave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@dfilan dfilan merged commit 838ccca into master Sep 8, 2022
@dfilan dfilan deleted the shaped_reward_fix branch September 8, 2022 21:58
@AdamGleave AdamGleave restored the shaped_reward_fix branch September 8, 2022 22:05
@AdamGleave AdamGleave deleted the shaped_reward_fix branch September 8, 2022 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ShapedRewardNet has no effect when calling predict_processed
3 participants