Can't tune hyperparameters with CustomSACPolicy - multiple values for keyword argument 'layers' #78

Closed
PierreExeter opened this issue Apr 16, 2020 · 5 comments
Labels: documentation, enhancement

Comments

@PierreExeter (Contributor)

Describe the bug
Running hyperparameter tuning with SAC and CustomSACPolicy returns:

TypeError: __init__() got multiple values for keyword argument 'layers'

Note that hyperparameter tuning works fine with MlpPolicy and that normal training works fine with CustomSACPolicy. The issue seems to be coming from Tensorflow.

Code example

After a recent git clone and using the default hyperparameters in hyperparameters/sac.yml:

python train.py --algo sac --env HopperBulletEnv-v0 -n 50000 -optimize --n-trials 100 --n-jobs 1

Full traceback:

Traceback (most recent call last):
  File "***/bin/anaconda3/lib/python3.7/site-packages/optuna/study.py", line 648, in _run_trial
    result = func(trial)
  File "***/rl-baselines-zoo/utils/hyperparams_opt.py", line 88, in objective
    model = model_fn(**kwargs)
  File "train.py", line 373, in create_model
    verbose=0, **kwargs)
  File "***/bin/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 125, in __init__
    self.setup_model()
  File "***/bin/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 145, in setup_model
    **self.policy_kwargs)
  File "***/rl-baselines-zoo/utils/utils.py", line 71, in __init__
    feature_extraction="mlp")
TypeError: __init__() got multiple values for keyword argument 'layers'
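
To illustrate the mechanism, here is a minimal, self-contained sketch of the keyword collision (hypothetical class names, not the zoo's actual code): the custom policy hardcodes layers in its constructor, while the tuning objective also passes layers through policy_kwargs, so the base __init__ receives the keyword twice.

# Minimal sketch of the collision (hypothetical names).
class BasePolicy:
    def __init__(self, layers=None, feature_extraction="cnn"):
        self.layers = layers
        self.feature_extraction = feature_extraction

class CustomPolicy(BasePolicy):
    def __init__(self, *args, **kwargs):
        # 'layers' is hardcoded here...
        super().__init__(*args, layers=[256, 256],
                         feature_extraction="mlp", **kwargs)

# ...so passing it again (as the tuner does via policy_kwargs) collides:
CustomPolicy(layers=[64, 64])
# TypeError: __init__() got multiple values for keyword argument 'layers'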

System Info

  • Stable Baselines: 2.10.0 (installed with pip)
  • rl-baselines-zoo commit: 645ea17
  • Python 3.7.4
  • Tensorflow: 1.14.0
  • Gym: 0.15.4
  • Pybullet: 2.5.8
  • Ubuntu 18.04
  • GPU: GeForce GTX 1060
  • CUDA: 10.2
@araffin (Owner) commented Apr 16, 2020

Hello,
This is expected: for hyperparameter tuning you should set policy='MlpPolicy', otherwise you will get the mentioned error, since CustomSACPolicy is already custom in terms of the number of layers. It would be nice to change CustomSACPolicy to MlpPolicy with policy_kwargs="dict(layers=[256,256])".
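
For reference, this is roughly how CustomSACPolicy is defined in the zoo's utils/utils.py, reconstructed from the traceback above (the actual file may differ slightly): the layers keyword is fixed in the subclass, so any layers passed in through policy_kwargs during tuning collides with it.

from stable_baselines.sac.policies import FeedForwardPolicy as SACPolicy

class CustomSACPolicy(SACPolicy):
    def __init__(self, *args, **kwargs):
        # 'layers' and 'feature_extraction' are fixed here, so a 'layers'
        # entry arriving via **kwargs triggers the duplicate-keyword error.
        super(CustomSACPolicy, self).__init__(*args, **kwargs,
                                              layers=[256, 256],
                                              feature_extraction="mlp")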

@araffin added the documentation and enhancement labels on Apr 16, 2020
@PierreExeter (Contributor, Author)

OK, thanks for your very quick reply.

@PierreExeter (Contributor, Author)

Just one doubt: is it OK then to tune the hyperparameters with policy='MlpPolicy' and then train the model with CustomSACPolicy? Does it not defeat the purpose of tuning in the first place, i.e. would hyperparameters optimised with one policy also be optimal for another policy?

@araffin (Owner) commented Apr 20, 2020

> Does it not defeat the purpose of tuning in the first place, i.e. would hyperparameters optimised with one policy also be optimal for another policy?

If your hyperparameter optimization allows architecture search:

net_arch = trial.suggest_categorical('net_arch', ["small", "medium", "big"])
net_arch = {
    'small': [64, 64],
    'medium': [256, 256],
    'big': [400, 300],
}[net_arch]
then it does make sense to have policy='MlpPolicy'.
However, if you fix the architecture (by commenting out the lines above), then you can use CustomSACPolicy (or, equivalently, MlpPolicy with policy_kwargs="dict(layers=[256,256])").
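
For instance, a minimal sketch of the fixed-architecture equivalent (not the zoo's train.py; the environment id and timestep count are reused from the command above):

import pybullet_envs  # noqa: F401 -- registers the Bullet envs with gym
from stable_baselines import SAC

# Same architecture as CustomSACPolicy, expressed via policy_kwargs instead,
# so the 'layers' keyword is only supplied once.
model = SAC('MlpPolicy', 'HopperBulletEnv-v0',
            policy_kwargs=dict(layers=[256, 256]), verbose=1)
model.learn(total_timesteps=50000)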

@PierreExeter (Contributor, Author)

OK, thanks a lot for your help. I'm closing this issue now.
