Releases: yamatokataoka/reinforcement-learning-replications

0.0.6

03 Sep 02:39
aa5f89a

What's Changed

  • Fix numpy typing

Full Changelog: 0.0.5...0.0.6

0.0.5

27 Aug 03:19
d7eb873

What's Changed

  • Replay buffer accepts Experience data class
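As a rough illustration of what a replay buffer that accepts an Experience data class might look like, here is a minimal sketch. All class and method names below are hypothetical and not taken from the library's actual API:

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Experience:
    """One transition, grouped into a single data class."""
    observation: List[float]
    action: int
    reward: float
    next_observation: List[float]
    done: bool


class ReplayBuffer:
    def __init__(self, capacity: int = 10_000) -> None:
        self.capacity = capacity
        self.buffer: List[Experience] = []

    def add(self, experience: Experience) -> None:
        # Evict the oldest experience once capacity is reached.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
        self.buffer.append(experience)

    def sample(self, batch_size: int) -> List[Experience]:
        # Uniform random minibatch of stored experiences.
        return random.sample(self.buffer, batch_size)
```

Passing a single data class instead of separate arrays keeps the buffer's interface stable when fields are added to a transition.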

Full Changelog: 0.0.4...0.0.5

0.0.4

20 Aug 14:17
573ab0d

What's Changed

  • Simplify release flow in #56
  • Change log level in #58
  • Add save_model function in #63
  • Refactor collect_one_epoch_experience function in OnPolicyAlgorithm in #65
  • Move last reward bootstrap to train function for on-policy in #67
  • Refactor collect_one_epoch_experience in off-policy algorithm in #68
  • Use action selection function in #69
  • Delete common folder in #70
  • Use same data class for one epoch experience in #72
  • Bump up Python version to 3.10 in #77
  • Refactor mypy in #78
  • Refactor comments in #81
  • Bump down Python version in #84
  • Update example in #85
  • Refactor experience calculations in #90
  • Refactor number variable names in #91
  • Refactor ReplayBuffer in #92
  • Implement samplers in #93
  • Implement seed manager in #95
  • Extract ReplayBuffer in #97
  • Implement evaluator in #99
  • Add get_action_tensor and get_action_numpy in #101
  • Always log for tensorboard in #103
  • Refactor train functions in #104
  • Delete base algorithms in #106
  • Replace select_action_with_noise with add_noise_to_get_action in #107
  • Improve logging of train function in #108
  • Add helper functions to train function in #109
  • Add evaluation to integration tests in #111
  • Add metrics manager in #112
  • Improve model saving in #113
  • Benchmarking in #116
  • Fix ddpg and td3 in #118
  • Simplify seed management in #120
  • Update gym env versions in #121
  • Improve release flow in #122
  • Fix release flow in #124

Full Changelog: 0.0.3...0.0.4

0.0.3

29 Jan 04:24

What's Changed

  • Split policy into base policy, stochastic policy, and categorical policy in #25
  • Implement DDPG in #26
  • Create MLP class in #28
  • Refactor comments in #30
  • Refactor DDPG in #31
  • Store each variable in a list in #32
  • Implement TD3 in #33
  • Fix typing in #35
  • Logging improvement in #37
  • Support continuous action spaces in #39
  • Adopt the src layout in #41
  • Improve packaging in #46
  • Set up GitHub Actions for PyPI and Test PyPI in #47
  • Add linters in #50
  • Add integration test in #52
  • Set up mypy in #53

Full Changelog: 0.0.2...0.0.3

0.0.2

17 Nov 11:15

Reinforcement Learning Replications 0.0.2