Thanks for your excellent work. I have a question about the "optimize" argument:
When I fix the first action dimension to a given symbolic policy on the "LunarLanderContinuous-v2" environment, the program reports an error at control.py, line 193:
TypeError: from_str_tokens() got an unexpected keyword argument 'optimize'
The run succeeds once "optimize" is removed, but the results do not reach those in the paper: across many runs my r_avg_test stays below 238, whereas the paper reports 251.66.
So I'm wondering how to run it successfully without removing "optimize", and how to reproduce the results in the paper.
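For reference, the workaround was simply dropping that keyword from the call in control.py. Roughly the following (the variable names and other arguments here are illustrative and may differ in the current code; only the removed "optimize" keyword reflects the actual change):

# dso/task/control/control.py, near line 193 -- sketch of the workaround.
# Before: the call passed an optimize=... keyword, which from_str_tokens()
# no longer accepts, hence the TypeError:
#   action_program = Program.from_str_tokens(str_tokens, optimize=False)
# After: the same call with the 'optimize' keyword removed:
action_program = Program.from_str_tokens(str_tokens)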
Here is my config file:
// This example contains the tuned entropy_weight and entropy_gamma
// hyperparameters used to solve LunarLanderContinuous-v2
{
   "task" : {
      "task_type" : "control",
      "env" : "LunarLanderContinuous-v2",
      "action_spec" : [["exp","cos","exp","mul","div","add","sub","add","add","add","exp","add","add","add","add","x2","x4","x4","5.0","x4","1.0","x5","x4","x4","5.0","x4","x4"], null]
   },
   "training" : {
      // Recommended to set this to as many cores as you can use!
      "n_cores_batch" : 16
   },
   "controller" : {
      "entropy_weight" : 0.02,
      "entropy_gamma" : 0.85
   }
}
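For completeness, this is how I launch the run (the filename is just an example; I'm following the usual DeepSymbolicOptimizer usage from the repo, which may differ slightly across versions):

# Launch DSO with the config above, saved as e.g. config_lunarlander.json.
from dso import DeepSymbolicOptimizer

model = DeepSymbolicOptimizer("config_lunarlander.json")
model.train()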
Sorry for the delay on this. I don't think the optimize flag is going to make any difference here. DSO is stochastic, and the code has undergone some changes since the paper, so it's going to be hard to exactly reproduce those results.
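If you want to gauge the run-to-run variance yourself, one option is to repeat training over a few seeds and compare the logged test rewards. A rough sketch (the "experiment"/"seed" config fields and the use of commentjson to parse the commented JSON are assumptions about the current config format and may need adjusting for your version):

# Repeat training over several seeds; per-run results end up in each
# experiment's log directory.
import copy
import commentjson
from dso import DeepSymbolicOptimizer

with open("config_lunarlander.json") as f:
    base_config = commentjson.load(f)  # the config file contains // comments

for seed in range(5):
    config = copy.deepcopy(base_config)
    config.setdefault("experiment", {})["seed"] = seed  # assumed field name
    model = DeepSymbolicOptimizer(config)
    model.train()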