v1.1.8

@FilippoAiraldi released this 26 Oct 16:52

Changes

Major

  • upgraded the csnlp dependency to csnlp==1.5.8
  • reworked the inner computations of both LstdQlearningAgent and LstdDpgAgent for better performance and closer adherence to the theory
  • reworked the inner workings of callbacks: they are now stored in an internal dict, which makes them easier to debug
  • fixed a disruptive bug in the computation of the parameters' bounds for constrained updates
  • implemented the mpcrl.optim sub-module, which provides different optimizers (see the sketch after this list), such as
    • Stochastic Gradient Descent
    • Newton's Method
    • Adam
    • RMSprop
  • switched the solver for the parameters' constrained updates to OSQP (QRQP was suffering from scaling issues)
  • removed the LearningRate class
  • implemented schedulers.Chain, which allows chaining multiple schedulers into a single one (a conceptual sketch also follows this list)
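
As a rough illustration of the kind of update rule the optimizers in mpcrl.optim implement, here is a small, library-agnostic sketch of an RMSprop-style step; the function name, arguments, and defaults below are purely illustrative and are not mpcrl's API.

    import numpy as np

    def rmsprop_step(theta, grad, sq_avg, lr=1e-2, alpha=0.99, eps=1e-8):
        # Running average of squared gradients, then a step scaled by its
        # square root (eps avoids division by zero).
        sq_avg = alpha * sq_avg + (1.0 - alpha) * grad**2
        theta = theta - lr * grad / (np.sqrt(sq_avg) + eps)
        return theta, sq_avg

    # Toy usage: minimize f(theta) = ||theta||^2 starting from [1, -2].
    theta = np.array([1.0, -2.0])
    sq_avg = np.zeros_like(theta)
    for _ in range(100):
        grad = 2.0 * theta
        theta, sq_avg = rmsprop_step(theta, grad, sq_avg)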
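
Likewise, here is a conceptual sketch of what chaining schedulers enables, namely running one scheduler for a number of steps and then handing over to the next. This is a stand-alone toy, not mpcrl's implementation, and the step/value interface it assumes may differ from the library's actual schedulers.

    class ConstantSketch:
        # Toy scheduler whose value never changes.
        def __init__(self, value):
            self.value = value

        def step(self):
            pass

    class ExponentialSketch:
        # Toy scheduler that multiplies its value by a factor at each step.
        def __init__(self, init_value, factor):
            self.value = init_value
            self.factor = factor

        def step(self):
            self.value *= self.factor

    class ChainSketch:
        # Runs each scheduler for a fixed number of steps, then moves on.
        def __init__(self, schedulers, lengths):
            self.schedulers, self.lengths = list(schedulers), list(lengths)
            self._idx = self._count = 0

        @property
        def value(self):
            return self.schedulers[self._idx].value

        def step(self):
            self._count += 1
            if self._count >= self.lengths[self._idx] and self._idx < len(self.schedulers) - 1:
                self._idx, self._count = self._idx + 1, 0
            else:
                self.schedulers[self._idx].step()

    # Constant learning rate for 10 steps, then exponential decay.
    chain = ChainSketch([ConstantSketch(0.1), ExponentialSketch(0.1, 0.9)], [10, 90])
    for _ in range(20):
        chain.step()
    print(chain.value)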

Minor

  • added the possibility of passing an integer argument as experience, which creates a buffer of the specified size (see the sketch after this list)
  • improvements to mpcrl.util.math
  • improvements to wrappers.agents.Log, which now uses lazy logging (see the example after this list)
  • fixed bugs in the on_episode_end and on_episode_start callback hooks
  • improvements to examples
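
For the integer experience argument mentioned above, the resulting behaviour is conceptually that of a fixed-size FIFO buffer. Below is a minimal stand-in using only the standard library; it is not necessarily how mpcrl implements it, and the experience=1000 setting in the comment is a hypothetical example.

    from collections import deque

    # Hypothetically, passing experience=1000 to an agent would correspond to
    # a transition buffer with this capacity.
    buffer = deque(maxlen=1000)
    for t in range(1500):
        buffer.append(("state", "action", "reward", t))  # placeholder transition
    assert len(buffer) == 1000  # the oldest 500 transitions were discarded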
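
On the lazy-logging change to wrappers.agents.Log: with Python's standard logging module, "lazy" logging means passing format arguments to the logger instead of pre-formatting the message, so the string is only built if the record is actually emitted. A generic illustration, not code taken from mpcrl:

    import logging

    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("demo")

    value = 42.0
    # Eager: the f-string is formatted even though DEBUG records are filtered out.
    logger.debug(f"current value = {value}")
    # Lazy: "%s" formatting is deferred and skipped while DEBUG is disabled.
    logger.debug("current value = %s", value)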