[FEATURE] ScannedRNN hidden state initialisation improvement #1058

Closed
lbeyers opened this issue Mar 6, 2024 · 0 comments
Labels
enhancement New feature or request


lbeyers commented Mar 6, 2024

Please describe the purpose of the feature. Is it related to a problem?

Every time a hidden state is initialised with ScannedRNN.initialise_carry, the caller must pass a variable giving the layer width. The name of this variable differs between systems (for example, hidden_size in some places and actor_network.pre_torso.layer_sizes[-1] in others). This is particularly a problem in the evaluator, where the variable cannot be named consistently across systems.

Describe the solution you'd like

When a ScannedRNN is initialised for use in a larger network, the hidden size must be recorded so that it is available whenever the hidden state has to be reinitialised. I propose the following solutions, in order of preference:

  • When a ScannedRNN is used in a larger network, let that network optionally take a hidden state as input. If no hidden state is provided, the network initialises one automatically.
  • Alternatively, rework ScannedRNN itself so that it initialises the hidden state automatically whenever it is not given one explicitly.
  • Failing that, keep an instance of the ScannedRNN inside the larger network and have that instance expose its hidden size.
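
To make the first and third options concrete, here is a minimal sketch in plain JAX. It is not the existing ScannedRNN: `RecurrentTorso`, `RNNParams`, `init_params` and the simple tanh recurrence (standing in for a GRU cell) are hypothetical names and simplifications chosen for illustration only. The point it shows is that the network stores `hidden_size` once, and `carry` becomes an optional argument that is initialised internally when omitted.

```python
from typing import NamedTuple, Optional, Tuple

import jax
import jax.numpy as jnp


class RNNParams(NamedTuple):
    """Parameters of a plain tanh RNN cell (hypothetical stand-in for a GRU)."""

    w_in: jnp.ndarray   # (input_size, hidden_size)
    w_rec: jnp.ndarray  # (hidden_size, hidden_size)
    bias: jnp.ndarray   # (hidden_size,)


class RecurrentTorso:
    """Hypothetical larger network that owns its recurrent layer."""

    def __init__(self, input_size: int, hidden_size: int):
        self.input_size = input_size
        # Stored once here, so callers never have to look the width up again.
        self.hidden_size = hidden_size

    def init_params(self, key: jnp.ndarray) -> RNNParams:
        k_in, k_rec = jax.random.split(key)
        return RNNParams(
            w_in=0.1 * jax.random.normal(k_in, (self.input_size, self.hidden_size)),
            w_rec=0.1 * jax.random.normal(k_rec, (self.hidden_size, self.hidden_size)),
            bias=jnp.zeros((self.hidden_size,)),
        )

    def initialise_carry(self, batch_size: int) -> jnp.ndarray:
        # Only the batch size is needed; the width comes from the network itself.
        return jnp.zeros((batch_size, self.hidden_size))

    def __call__(
        self,
        params: RNNParams,
        inputs: jnp.ndarray,  # (time, batch, input_size)
        carry: Optional[jnp.ndarray] = None,
    ) -> Tuple[jnp.ndarray, jnp.ndarray]:
        # If no hidden state is supplied, initialise one automatically.
        if carry is None:
            carry = self.initialise_carry(inputs.shape[1])

        def step(h, x):
            h_new = jnp.tanh(x @ params.w_in + h @ params.w_rec + params.bias)
            return h_new, h_new

        # Scan the cell over the leading time axis.
        return jax.lax.scan(step, carry, inputs)


torso = RecurrentTorso(input_size=4, hidden_size=8)
params = torso.init_params(jax.random.PRNGKey(0))
obs = jnp.ones((5, 3, 4))  # 5 timesteps, batch of 3, 4 features

# The evaluator-style call site needs no hidden-size variable at all.
final_carry, outputs = torso(params, obs)
```

With this shape of interface, a caller such as the evaluator either omits `carry` or calls `torso.initialise_carry(batch_size)`, so the layer width no longer has to be named consistently across systems.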

How do we know when implementation of this feature is complete?

Checklist:

  • In the evaluator and all system files, initialise_carry has access to a hidden size variable that is directly linked to the network the carry is then used for,
  • OR initialise_carry is only ever called inside the ScannedRNN or the larger network (i.e. the actor network or Q-learner network).

Additional context

This is a change that is too big to handle before the next deadline. It may affect all systems and their configs.

@lbeyers lbeyers added the enhancement New feature or request label Mar 6, 2024
@lbeyers lbeyers changed the title [FEATURE] ScannedRNN retains hidden dimension [FEATURE] ScannedRNN hidden state initialisation improvement Mar 7, 2024
@lbeyers lbeyers closed this as completed Mar 11, 2024