# Releases

## v1.1.8

### Changes

#### Major
- Upgraded dependency to `csnlp==1.5.8`.
- Reworked the inner computations of both `LstdQLearningAgent` and `LstdDpgAgent` for performance and closer adherence to theory.
- Reworked the inner workings of callbacks: they are now stored in an internal dict, making them easier to debug.
- Fixed a disruptive bug in the computation of the parameters' bounds for a constrained update.
- Implemented the `mpcrl.optim` sub-module, which contains different optimizers such as
  - Stochastic Gradient Descent
  - Newton's Method
  - Adam
  - RMSprop
- Moved the parameters' constrained-update solver to OSQP (QRQP was having scaling issues).
- Removed the `LearningRate` class.
- Implemented `schedulers.Chain`, which allows chaining multiple schedulers into a single one.
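The idea behind chaining schedulers is to run several schedules back to back, e.g. a warm-up phase followed by a constant phase. The following is a minimal self-contained sketch of that concept using plain Python generators; it does not use mpcrl's actual `schedulers.Chain` API, and all names here are illustrative.

```python
from itertools import chain


def linear(start: float, stop: float, steps: int):
    """Yield `steps` values moving linearly from `start` towards `stop`."""
    for i in range(steps):
        yield start + (stop - start) * i / steps


def constant(value: float, steps: int):
    """Yield the same value for `steps` steps."""
    for _ in range(steps):
        yield value


# Compose a warm-up schedule and a constant schedule into a single one,
# analogous in spirit to chaining multiple schedulers together.
values = list(chain(linear(0.0, 1e-2, 3), constant(1e-2, 2)))
print(values)
```

Each sub-schedule is exhausted before the next one starts, so the combined schedule behaves as one longer schedule.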
#### Minor

- Added the possibility to pass an integer argument as `experience`; this creates a buffer with the specified size.
- Improvements to `mpcrl.util.math`.
- Improvements to `wrappers.agents.Log` (now uses lazy logging).
- Fixed bugs in the `on_episode_start` and `on_episode_end` callback hooks.
- Improvements to the examples.