I tried to run the Cart Pole experiment with the Adam update. The code is as follows:
from rllab.algos.trpo import TRPO
from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.box2d.cartpole_env import CartpoleEnv
from rllab.envs.normalized_env import normalize
from rllab.policies.gaussian_gru_policy import GaussianGRUPolicy
from rllab.policies.gaussian_mlp_policy import GaussianMLPPolicy
from rllab.optimizers.first_order_optimizer import FirstOrderOptimizer
env = normalize(CartpoleEnv())
policy = GaussianMLPPolicy(
    env_spec=env.spec,
    adaptive_std=True,
    # The neural network policy should have two hidden layers, each with 32 hidden units.
)
baseline = LinearFeatureBaseline(env_spec=env.spec)
algo = TRPO(
    env=env,
    policy=policy,
    baseline=baseline,
    batch_size=4000,
    max_path_length=100,
    n_itr=200,
    discount=0.99,
    step_size=0.01,
    optimizer=FirstOrderOptimizer(),
)
algo.train()
However, I was not able to run the code and I got the following error:
Traceback (most recent call last):
  File "/home/drl/rllab/examples/trpo_cartpole.py", line 30, in <module>
    algo.train()
  File "/home/drl/rllab/rllab/algos/batch_polopt.py", line 95, in train
    self.optimize_policy(itr, samples_data)
  File "/home/drl/rllab/rllab/algos/npo.py", line 110, in optimize_policy
    mean_kl = self.optimizer.constraint_val(all_input_values)
AttributeError: 'FirstOrderOptimizer' object has no attribute 'constraint_val'
I found that ConjugateGradientOptimizer has this attribute, but I got a different error when I added the constraint_val function to the FirstOrderOptimizer class.
I would appreciate it if you could tell me what the purpose of the constraint_val function is in the optimizer call.
Thank you
abhishm changed the title from "using the first_order_optimizer with TNPG gives error" to "using the first_order_optimizer with TRPO gives error" on May 12, 2016
Hi @abhishm, TRPO doesn't work with FirstOrderOptimizer because it has to solve a constrained optimization problem, where the constraint is given by the KL divergence between the old policy and the new one. You can use either ConjugateGradientOptimizer or PenaltyLbfgsOptimizer (which is what the PPO algorithm uses) instead.
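For reference, the constrained problem described here is usually written in the form below. This is the common textbook statement of the TRPO update, not a transcription of the rllab source; reading `step_size=0.01` from the script above as the KL bound $\delta$ is an assumption, and the symbols $\theta$, $A$, and $\delta$ are notation introduced only for this sketch.

```latex
\max_{\theta} \;
\mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
\left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)} \, A(s, a) \right]
\quad \text{subject to} \quad
\mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
\left[ D_{\mathrm{KL}}\!\left( \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s) \right) \right]
\le \delta
```

In the traceback, the return value of `constraint_val` is assigned to `mean_kl`, which suggests that the method evaluates the left-hand side of this constraint (the mean KL divergence) for the current parameters. A purely first-order optimizer has no notion of such a constraint, which would explain the missing attribute.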
If you are interested, you can also try writing a variant of the provided first-order optimizer that either uses a fixed penalty term or adjusts it adaptively (you can look at PenaltyLbfgsOptimizer for inspiration).
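To make the penalty idea concrete, here is a minimal sketch in plain Python. It does not use rllab and is not how PenaltyLbfgsOptimizer is implemented. It assumes a toy one-dimensional Gaussian policy with unit standard deviation and a linearized surrogate loss, and all names in it (`adapt_penalty`, `penalized_first_order_step`, `run`, `beta`, `target_kl`) are hypothetical. The KL constraint is replaced by a penalty `beta * KL`, the penalized loss is minimized with plain gradient descent, and `beta` is doubled or halved depending on whether the resulting KL overshoots or undershoots the target.

```python
def kl_gaussian_same_std(mu_old, mu_new, sigma=1.0):
    """KL(N(mu_old, sigma^2) || N(mu_new, sigma^2)) for a shared, fixed sigma."""
    return (mu_new - mu_old) ** 2 / (2.0 * sigma ** 2)


def adapt_penalty(beta, kl, target_kl, tolerance=1.5, factor=2.0):
    """Hypothetical rule: raise beta if KL overshoots the target, lower it if KL undershoots."""
    if kl > target_kl * tolerance:
        return beta * factor
    if kl < target_kl / tolerance:
        return beta / factor
    return beta


def penalized_first_order_step(mu_old, grad_surr, beta, lr=0.05, n_steps=200):
    """Minimize -grad_surr * (mu - mu_old) + beta * KL(old || new) by gradient descent.

    grad_surr stands in for the gradient of the surrogate objective (assumed constant here).
    """
    mu = mu_old
    for _ in range(n_steps):
        # Derivative of the penalized loss with respect to mu (sigma = 1).
        grad = -grad_surr + beta * (mu - mu_old)
        mu -= lr * grad
    return mu


def run(target_kl=0.01, n_outer=20):
    """Outer loop: take a penalized step, measure the KL, then adapt beta."""
    beta, mu = 1.0, 0.0
    history = []
    for _ in range(n_outer):
        mu_new = penalized_first_order_step(mu, grad_surr=1.0, beta=beta)
        kl = kl_gaussian_same_std(mu, mu_new)
        history.append((beta, kl))
        beta = adapt_penalty(beta, kl, target_kl)
        mu = mu_new
    return history


if __name__ == "__main__":
    for beta, kl in run():
        print("beta=%.1f  mean_kl=%.5f" % (beta, kl))
```

With a fixed penalty you would skip `adapt_penalty` and keep `beta` constant; the adaptive version trades that tuning burden for the choice of a target KL, which plays a role similar to the KL bound in the constrained formulation.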