Feature/param reset #328

joshuaspear · 2023-08-29T12:36:33Z

Closes #326

…impl. Also: 1. Updated the conservative loss of discrete CQL so that the captured value includes the alpha multiplication, to align with continuous CQL. 2. Updated the critic loss of DDPG and continuous CQL to use dataclasses, aligning with DQN and discrete CQL.
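As a rough illustration of the two points above, a loss dataclass can carry the TD term and the conservative term separately, with the conservative term stored after the alpha multiplication so that the logged value matches what is optimized. This is only a sketch under stated assumptions: `CriticLossSketch`, `compute_loss`, their fields, and the plain floats standing in for tensors are hypothetical and are not the names used in this PR.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class CriticLossSketch:
    """Hypothetical container; d3rlpy's real loss classes may differ."""

    td_loss: float
    # Stored already multiplied by alpha (point 1 above).
    conservative_loss: float

    @property
    def critic_loss(self) -> float:
        # Total value that would be backpropagated.
        return self.td_loss + self.conservative_loss


def compute_loss(td_loss: float, raw_penalty: float, alpha: float) -> CriticLossSketch:
    # Apply alpha before storing so logging and optimization agree.
    return CriticLossSketch(td_loss=td_loss, conservative_loss=alpha * raw_penalty)


loss = compute_loss(td_loss=0.5, raw_penalty=2.0, alpha=0.25)
# Fields (not properties) become the per-component metrics to log.
metrics = dataclasses.asdict(loss)
```

Returning a dataclass rather than a bare scalar lets the training loop log each component without recomputing it.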

joshuaspear · 2023-08-29T12:37:21Z

@takuseno I am still working on the tests, but please let me know whether you think the implementation is a reasonable approach.

joshuaspear · 2023-08-29T17:25:28Z

d3rlpy/algos/qlearning/torch/callbacks.py

+    def _get_layers(self, q_func:nn.ModuleList)->List[nn.Module]:
+        all_modules = {nm:module for (nm, module) in q_func.named_modules()}
+        q_func_layers = [
+            *all_modules["_encoder._layers"],


@takuseno assuming you're happy with the general approach of using the epoch_callback to inject the parameter reset functionality, could you recommend a better way of obtaining the encoder and fc layers that is compatible with static typing?
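One possible way to avoid string keys such as `"_encoder._layers"` is to walk the module tree and filter by type with `isinstance`, which a type checker can narrow. The sketch below is only an illustration: `collect_linear_layers` and the toy `nn.Sequential` networks are hypothetical stand-ins, it assumes the layers of interest are `nn.Linear`, and it does not distinguish encoder layers from the final fc layer.

```python
from typing import List

from torch import nn


def collect_linear_layers(q_func: nn.Module) -> List[nn.Linear]:
    # nn.Module.modules() yields the module itself and every descendant,
    # so layers nested inside an nn.ModuleList are covered as well.
    return [m for m in q_func.modules() if isinstance(m, nn.Linear)]


# Toy stand-in for an ensemble of Q-functions (not d3rlpy's real classes).
toy_q_funcs = nn.ModuleList(
    [nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1)) for _ in range(2)]
)
layers = collect_linear_layers(toy_q_funcs)
```

Because the return type is `List[nn.Linear]`, attributes such as `in_features` are visible to the type checker without casts.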

takuseno · 2023-09-02T07:47:28Z

@joshuaspear Thanks for the proposal! For now, can we make this an experimental feature? I imagine something like this:

file location

Let's make an experimental directory:

d3rlpy/experimental/parameter_reset.py

usage

Just a rough illustration:

import d3rlpy

# e.g. 50% reset, every 1000 gradient steps
parameter_reset = d3rlpy.experimental.ParameterReset(reset_ratio=0.5, reset_interval=1000)

cql = d3rlpy.algos.CQLConfig().create()

def callback(algo, epoch, total_step):
    parameter_reset(algo.q_function, total_step)
    
cql.fit(..., callback=callback)

In this way, we can use the existing callback to inject the reset operation.

Parameter reset is still under investigation in the RL community. Once it matures, we can promote this out of experimental.
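For concreteness, here is a minimal sketch of what a callable with the `reset_ratio` and `reset_interval` arguments from the illustration above might look like. Everything in it is an assumption rather than the API that was eventually implemented: the name `ParameterResetSketch` is hypothetical, `reset_ratio` is interpreted as the probability that each individual weight of an `nn.Linear` layer is redrawn from a fresh initialization, and step 0 is skipped.

```python
import copy

import torch
from torch import nn


class ParameterResetSketch:
    """Hypothetical sketch; semantics of reset_ratio are assumed."""

    def __init__(self, reset_ratio: float, reset_interval: int) -> None:
        if not 0.0 <= reset_ratio <= 1.0:
            raise ValueError("reset_ratio must be in [0, 1]")
        if reset_interval <= 0:
            raise ValueError("reset_interval must be positive")
        self.reset_ratio = reset_ratio
        self.reset_interval = reset_interval

    @torch.no_grad()
    def __call__(self, q_function: nn.Module, total_step: int) -> bool:
        # Act only on positive multiples of the interval; report whether it ran.
        if total_step == 0 or total_step % self.reset_interval != 0:
            return False
        for layer in q_function.modules():
            if not isinstance(layer, nn.Linear):
                continue
            # A re-initialized copy supplies the fresh values.
            fresh = copy.deepcopy(layer)
            fresh.reset_parameters()
            for old, new in zip(layer.parameters(), fresh.parameters()):
                # Replace each element independently with probability reset_ratio.
                mask = torch.rand_like(old) < self.reset_ratio
                old.copy_(torch.where(mask, new, old))
        return True


q = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
reset = ParameterResetSketch(reset_ratio=1.0, reset_interval=1000)
before = q[0].weight.detach().clone()
skipped = reset(q, total_step=999)
unchanged = torch.equal(before, q[0].weight)
applied = reset(q, total_step=1000)
```

Copying into the existing parameters in place keeps the same tensor objects, so an optimizer that already holds references to them continues to work; whether optimizer state should also be reset is left open here.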

joshuaspear · 2023-09-02T10:52:44Z

Makes sense - will have a go next week :)

joshuaspear added 11 commits August 4, 2023 06:51

tracking of cql regularisation for continuous cql

4a6edc9

updated for linting and formatting

5b7185c

overwriting dr3 pull and aligning cql logging

7fd0a37

updated formatting

a52651e

update gitignore

a46d73f

corrected formatting

e87cb94

first draft of parameter reset callback

3ebd8cb

corrected formatting

2846d20

first go at tests for param reset callback

7017e6f

Merge branch 'master' into feature/param_reset

def6ad7

fixed issues in call method

d793c6f
