Can't tune hyperparameters with CustomSACPolicy - multiple values for keyword argument 'layers' #78

Closed
PierreExeter opened this issue Apr 16, 2020 · 5 comments
Labels: documentation, enhancement

Comments

@PierreExeter (Contributor)

Describe the bug
Running hyperparameter tuning with SAC and CustomSACPolicy returns:

TypeError: __init__() got multiple values for keyword argument 'layers'

Note that hyperparameter tuning works fine with MlpPolicy and that normal training works fine with CustomSACPolicy. The issue seems to be coming from Tensorflow.

Code example

After a recent git clone and using the default hyperparameters in hyperparameters/sac.yml:

python train.py --algo sac --env HopperBulletEnv-v0 -n 50000 -optimize --n-trials 100 --n-jobs 1

Full traceback:

Traceback (most recent call last):
  File "***/bin/anaconda3/lib/python3.7/site-packages/optuna/study.py", line 648, in _run_trial
    result = func(trial)
  File "***/rl-baselines-zoo/utils/hyperparams_opt.py", line 88, in objective
    model = model_fn(**kwargs)
  File "train.py", line 373, in create_model
    verbose=0, **kwargs)
  File "***/bin/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 125, in __init__
    self.setup_model()
  File "***/bin/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 145, in setup_model
    **self.policy_kwargs)
  File "***/rl-baselines-zoo/utils/utils.py", line 71, in __init__
    feature_extraction="mlp")
TypeError: __init__() got multiple values for keyword argument 'layers'
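
To illustrate the mechanism, here is a minimal, self-contained sketch of the keyword collision (hypothetical class names, not the zoo's actual code): the custom policy hardcodes layers in its constructor, while the tuning objective also passes layers through policy_kwargs, so the base __init__ receives the keyword twice.

# Minimal sketch of the collision (hypothetical names).
class BasePolicy:
    def __init__(self, layers=None, feature_extraction="cnn"):
        self.layers = layers
        self.feature_extraction = feature_extraction

class CustomPolicy(BasePolicy):
    def __init__(self, *args, **kwargs):
        # 'layers' is hardcoded here...
        super().__init__(*args, layers=[256, 256],
                         feature_extraction="mlp", **kwargs)

# ...so passing it again (as the tuner does via policy_kwargs) collides:
CustomPolicy(layers=[64, 64])
# TypeError: __init__() got multiple values for keyword argument 'layers'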

System Info

  • Stable Baselines: 2.10.0 (installed with pip)
  • rl-baselines-zoo commit: 645ea17
  • Python 3.7.4
  • Tensorflow: 1.14.0
  • Gym: 0.15.4
  • Pybullet: 2.5.8
  • Ubuntu 18.04
  • GPU: GeForce GTX 1060
  • CUDA: 10.2
@araffin (Owner) commented Apr 16, 2020

Hello,
This is expected: for hyperparameter tuning you should set policy='MlpPolicy', otherwise you will get the mentioned error, since CustomSACPolicy is already custom in terms of the number of layers. It would be nice to change CustomSACPolicy to MlpPolicy with policy_kwargs="dict(layers=[256,256])".
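
For reference, this is roughly how CustomSACPolicy is defined in the zoo's utils/utils.py, reconstructed from the traceback above (the actual file may differ slightly): the layers keyword is fixed in the subclass, so any layers passed in through policy_kwargs during tuning collides with it.

from stable_baselines.sac.policies import FeedForwardPolicy as SACPolicy

class CustomSACPolicy(SACPolicy):
    def __init__(self, *args, **kwargs):
        # 'layers' and 'feature_extraction' are fixed here, so a 'layers'
        # entry arriving via **kwargs triggers the duplicate-keyword error.
        super(CustomSACPolicy, self).__init__(*args, **kwargs,
                                              layers=[256, 256],
                                              feature_extraction="mlp")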

@araffin added the documentation and enhancement labels on Apr 16, 2020
@PierreExeter (Contributor, Author)

OK, thanks for your very quick reply.

@PierreExeter (Contributor, Author)

Just one doubt: is it OK then to tune the hyperparameters with policy='MlpPolicy' and then train the model with CustomSACPolicy? Does it not defeat the purpose of tuning in the first place, i.e. would hyperparameters optimised with one policy also be optimal for another policy?

@araffin (Owner) commented Apr 20, 2020

> Does it not defeat the purpose of tuning in the first place, i.e. would hyperparameters optimised with one policy also be optimal for another policy?

If your hyperparameter optimization allows architecture search:

net_arch = trial.suggest_categorical('net_arch', ["small", "medium", "big"])
net_arch = {
    'small': [64, 64],
    'medium': [256, 256],
    'big': [400, 300],
}[net_arch]
then it does make sense to have policy='MlpPolicy'.
However, if you fix the architecture (by commenting out the lines above), then you can use CustomSACPolicy (or, equivalently, MlpPolicy with policy_kwargs="dict(layers=[256,256])").
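
For instance, a minimal sketch of the fixed-architecture equivalent (not the zoo's train.py; the environment id and timestep count are reused from the command above):

import pybullet_envs  # noqa: F401 -- registers the Bullet envs with gym
from stable_baselines import SAC

# Same architecture as CustomSACPolicy, expressed via policy_kwargs instead,
# so the 'layers' keyword is only supplied once.
model = SAC('MlpPolicy', 'HopperBulletEnv-v0',
            policy_kwargs=dict(layers=[256, 256]), verbose=1)
model.learn(total_timesteps=50000)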

@PierreExeter (Contributor, Author)

OK, thanks a lot for your help. I'm closing this issue now.
