Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BranchDQN for large discrete action spaces #618

Merged
merged 54 commits into from
May 15, 2022
Merged

Conversation

BFAnas
Copy link
Contributor

@BFAnas BFAnas commented Apr 27, 2022

  • I have marked all applicable categories:
    • exception-raising fix
    • algorithm implementation fix
    • documentation modification
    • new feature
  • I have reformatted the code using make format (required)
  • I have checked the code using make commit-checks (required)
  • If applicable, I have mentioned the relevant/related issue(s)
  • If applicable, I have listed every items in this Pull Request below

@Trinkle23897
Copy link
Collaborator

Could you please resolve the merge conflict?

tianshou/utils/net/common.py Outdated Show resolved Hide resolved
tianshou/policy/imitation/cql.py Outdated Show resolved Hide resolved
tianshou/env/gym_wrappers.py Show resolved Hide resolved
test/offline/test_discrete_crr.py Outdated Show resolved Hide resolved
test/offline/test_discrete_cql.py Outdated Show resolved Hide resolved
test/offline/test_bcq.py Outdated Show resolved Hide resolved
test/offline/gather_pendulum_data.py Outdated Show resolved Hide resolved
test/offline/gather_cartpole_data.py Outdated Show resolved Hide resolved
test/modelbased/test_psrl.py Outdated Show resolved Hide resolved
test/continuous/test_ddpg.py Outdated Show resolved Hide resolved
@BFAnas BFAnas marked this pull request as ready for review April 29, 2022 07:44
@BFAnas
Copy link
Contributor Author

BFAnas commented Apr 29, 2022

@Trinkle23897 Could you check now? And please let me know if you have any comment :)

@Trinkle23897
Copy link
Collaborator

Thanks, will do it today.

@codecov-commenter
Copy link

codecov-commenter commented May 14, 2022

Codecov Report

Merging #618 (5b6cd12) into master (a03f19a) will increase coverage by 0.10%.
The diff coverage is 97.18%.

@@            Coverage Diff             @@
##           master     #618      +/-   ##
==========================================
+ Coverage   93.55%   93.66%   +0.10%     
==========================================
  Files          69       71       +2     
  Lines        4608     4733     +125     
==========================================
+ Hits         4311     4433     +122     
- Misses        297      300       +3     
Flag Coverage Δ
unittests 93.66% <97.18%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
tianshou/policy/modelfree/bdq.py 94.73% <94.73%> (ø)
tianshou/data/utils/converter.py 100.00% <100.00%> (ø)
tianshou/env/__init__.py 75.00% <100.00%> (+3.57%) ⬆️
tianshou/env/gym_wrappers.py 100.00% <100.00%> (ø)
tianshou/policy/__init__.py 100.00% <100.00%> (ø)
tianshou/policy/base.py 80.89% <100.00%> (ø)
tianshou/policy/imitation/base.py 100.00% <100.00%> (ø)
tianshou/policy/imitation/bcq.py 100.00% <100.00%> (ø)
tianshou/policy/imitation/cql.py 100.00% <100.00%> (ø)
tianshou/policy/imitation/discrete_bcq.py 98.24% <100.00%> (ø)
... and 5 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@BFAnas
Copy link
Contributor Author

BFAnas commented May 14, 2022

Done! the checks have passed! Ready to merge 🥳

@Trinkle23897
Copy link
Collaborator

Is it possible to make unit test faster, by tuning the hyperparameter?

============================== slowest durations ===============================
246.38s call     test/discrete/test_bdq.py::test_bdq
108.78s call     test/discrete/test_qrdqn.py::test_pqrdqn
67.95s call     test/continuous/test_redq.py::test_redq
59.09s call     test/pettingzoo/test_pistonball.py::test_piston_ball
58.40s call     test/offline/test_gail.py::test_gail
43.85s call     test/continuous/test_td3.py::test_td3

BDQ test is 4x slower than TD3...

@BFAnas
Copy link
Contributor Author

BFAnas commented May 14, 2022

Yes, I think it can be done. For now, the hyperparameters are not tuned yet. Which tool would you recommend for this?

@Trinkle23897
Copy link
Collaborator

Trinkle23897 commented May 14, 2022

Which tool would you recommend for this?

Hmm... I don't have a specific recommendation because the current hparam for all test scripts are tuned by my hand lol
Maybe optuna?

Maybe increase the learning rate and eps decay as the first step to see if it can work with 10 different seeds?

@BFAnas
Copy link
Contributor Author

BFAnas commented May 14, 2022

Tuned hyper-parameteres for bdq_test.py, now it runs almost 10 times faster 😃

Which tool would you recommend for this?

Hmm... I don't have a specific recommendation because the current hparam for all test scripts are tuned by my hand lol Maybe optuna?

Maybe increase the learning rate and eps decay as the first step to see if it can work with 10 different seeds?

Copy link
Collaborator

@Trinkle23897 Trinkle23897 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two slight issues.

tianshou/policy/modelfree/bdq.py Outdated Show resolved Hide resolved
tianshou/policy/modelfree/bdq.py Outdated Show resolved Hide resolved
BFAnas and others added 2 commits May 15, 2022 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants