RL Tuning Tips #67
It's hard for me to say more without understanding the setting more completely. At a glance, the model seems to be set up correctly. Some things I notice are:
Based on your comments, I trained S4 models with more layers and a larger
It's very hard for me to say without knowing more details about the problem (and I'm not an expert in RL). If an MLP is doing well, perhaps the problem is Markovian and doesn't need a sequence model at all. Another sanity check could be to try other recurrent baselines such as an LSTM/GRU core, which should have a similar interface to S4(D). I know of another RL project using S4(D)/S5 where they found it was consistently much better than an LSTM, so this type of baseline could reveal whether the discrepancy is in the problem setup (e.g. if you find the MLP is better than any sequence model) or in the S4 usage specifically (e.g. if you find the LSTM is better than S4).
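The LSTM/GRU baseline suggested above would plug in through the same step-wise interface as S4(D). As a rough illustration of what that interface looks like, here is a minimal NumPy GRU cell exposing `step(u, state) -> (output, new_state)`; this is an illustrative toy, not the repo's API, and all names are assumptions:

```python
import numpy as np

class GRUCell:
    """Minimal NumPy GRU cell with an S4D-like step interface.
    Illustrative sketch only: step(u, state) -> (output, new_state)."""

    def __init__(self, d_input, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d_hidden)
        # Stacked weights for the update (z), reset (r), and candidate (n) gates.
        self.W = rng.uniform(-scale, scale, (3 * d_hidden, d_input))
        self.U = rng.uniform(-scale, scale, (3 * d_hidden, d_hidden))
        self.b = np.zeros(3 * d_hidden)
        self.d_hidden = d_hidden

    def step(self, u, h):
        """One timestep: input u of shape (d_input,), state h of shape (d_hidden,)."""
        gi = self.W @ u + self.b
        gh = self.U @ h
        i_z, i_r, i_n = np.split(gi, 3)
        h_z, h_r, h_n = np.split(gh, 3)
        sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
        z = sigmoid(i_z + h_z)          # update gate
        r = sigmoid(i_r + h_r)          # reset gate
        n = np.tanh(i_n + r * h_n)      # candidate state
        h_new = (1 - z) * h + z * n     # convex mix of old state and candidate
        return h_new, h_new             # GRU's output is its hidden state

cell = GRUCell(d_input=4, d_hidden=8)
h = np.zeros(8)
for t in range(10):
    y, h = cell.step(np.ones(4), h)
```

Since the state update is a convex combination of the previous state and a `tanh` candidate, the output stays in [-1, 1]; a drop-in S4D core would return `(y, new_ssm_state)` from its own `step` in the same loop.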
Hi Albert, given that people are trying S4D in RL settings, would it be possible to extend the current minimal s4d.py to also have a step function, so it can be run (and trained) in RNN mode, in case people are doing it wrong?
The minimal file is purposefully minimal. The documentation explains the additional features available in the main module and how to use them. For example, the step code: https://github.com/HazyResearch/state-spaces/blob/main/models/s4/s4.py#L1197
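The key property behind that step code is that a diagonal SSM gives identical outputs whether it is run in convolutional mode (used for training) or unrolled one step at a time (used for RNN-style inference). The toy below checks this equivalence numerically; the discretization and variable names are a hedged sketch, not the repo's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, dt = 8, 32, 0.1

# Toy diagonal SSM (S4D-style): diagonal A with negative real part for
# stability, plus input/output maps B and C.
A = -0.5 + 1j * rng.standard_normal(N)
B = np.ones(N, dtype=complex)
C = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Zero-order-hold discretization of the diagonal system.
Ab = np.exp(dt * A)           # discrete state matrix exp(dt * A)
Bb = (Ab - 1.0) / A * B       # discrete input matrix (exp(dt*A) - 1) / A * B

u = rng.standard_normal(L)

# Convolutional mode: y = K * u with kernel K_m = Re(sum_n C_n Ab_n^m Bb_n).
K = np.real(np.sum(C[:, None] * Bb[:, None] * Ab[:, None] ** np.arange(L), axis=0))
y_conv = np.convolve(u, K)[:L]

# Recurrent "step" mode: x_k = Ab * x_{k-1} + Bb * u_k, y_k = Re(C @ x_k).
x = np.zeros(N, dtype=complex)
y_step = np.empty(L)
for k in range(L):
    x = Ab * x + Bb * u[k]
    y_step[k] = np.real(C @ x)
```

If an RL setup trains in convolutional mode but acts in step mode, a mismatch in this equivalence (e.g. a stale or mis-initialized recurrent state) is one place a silent bug can hide.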
I'm currently writing a recurrent reinforcement learning library, with LSTMs, linear attention, etc., that I would like to add S4 to.
Unfortunately, I find S4D unable to learn even simple RL tasks (e.g. output the input from 4 timesteps ago).
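The delay task described above (target is the input from 4 timesteps ago) can be generated with a few lines of NumPy; `delay_task` is a hypothetical helper name. It is a useful memory probe because a memoryless model (e.g. an MLP on the current observation) cannot beat chance on it:

```python
import numpy as np

def delay_task(batch, length, delay=4, seed=0):
    """Hypothetical generator for the delay task: at time t the target is
    the input from `delay` steps earlier (zero before that)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(batch, length)).astype(np.float32)
    y = np.zeros_like(x)
    y[:, delay:] = x[:, :-delay]   # shift inputs right by `delay` steps
    return x, y

x, y = delay_task(batch=2, length=10, delay=4)
```

Checking whether a sequence model solves this before wiring it into the full RL loop separates "S4D cannot represent the memory" from "the RL training setup is at fault".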
Do you know of any configurations or tips for making S4/S4D models smaller/more robust? The learning rate is already set very low (1e-5) and I'm using enormous batch sizes (65,536 transitions per batch). Other recurrent models are able to learn, but S4 does not. This is what I have thus far:
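One S4-specific pitfall worth checking when tuning the learning rate: the main state-spaces repo gives its SSM kernel parameters (the state matrix and timescales) their own small, fixed learning rate and no weight decay, separate from the global optimizer settings. The helper below is a rough sketch of that parameter-grouping logic on plain dicts; the function and parameter names are illustrative, not the repo's API:

```python
def make_param_groups(named_params, base_lr=1e-3, weight_decay=0.01):
    """Split parameters into a normal group and per-parameter special groups.
    named_params: iterable of (name, param, special_hparams_or_None).
    Illustrative sketch of S4-style optimizer grouping, not the repo's code."""
    normal, special = [], []
    for name, param, hp in named_params:
        (special if hp else normal).append((name, param, hp))
    # Normal parameters use the global settings.
    groups = [{"params": [n for n, _, _ in normal],
               "lr": base_lr, "weight_decay": weight_decay}]
    # Special (SSM) parameters get their own lr and no weight decay.
    for name, _, hp in special:
        groups.append({"params": [name], "weight_decay": 0.0, **hp})
    return groups

params = [
    ("encoder.weight", None, None),
    ("s4.kernel.A",  None, {"lr": 1e-4}),  # SSM state matrix: low fixed lr
    ("s4.kernel.dt", None, {"lr": 1e-4}),  # SSM timescales: low fixed lr
]
groups = make_param_groups(params)
```

If a global LR of 1e-5 is being applied uniformly, the SSM parameters may be in an unintended regime relative to the rest of the network; splitting them out as above makes the two rates independently tunable.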