save and restore env state before evaluation
MaximeBouton committed Jan 11, 2019
1 parent 895837b commit cf60925
Showing 1 changed file with 2 additions and 0 deletions.

src/solver.jl
@@ -97,11 +97,13 @@ function POMDPs.solve(solver::DeepQLearningSolver, env::AbstractEnvironment)
         end

         if t%solver.eval_freq == 0
+            saved_state = env.state
             scores_eval = evaluation(solver.evaluation_policy,
                                      policy, env,
                                      solver.num_ep_eval,
                                      solver.max_episode_length,
                                      solver.verbose)
+            env.state = saved_state
         end

         if t%solver.log_freq == 0

2 comments on commit cf60925

@zsunberg
Member


Seems like it would be better to use the functions from RLInterface rather than accessing the state field. What if someone tries to use this with a model that conforms to RLInterface, but doesn't have a state field, like a ROS simulator or something? I would recommend resetting the environment if it is done.
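
For illustration only, a minimal sketch of that suggestion in Julia, assuming RLInterface exports reset! and reusing solver, policy, env, and t from the training loop shown in the diff above (this is not what the commit implements):

    if t%solver.eval_freq == 0
        scores_eval = evaluation(solver.evaluation_policy,
                                 policy, env,
                                 solver.num_ep_eval,
                                 solver.max_episode_length,
                                 solver.verbose)
        # evaluation can leave env mid-episode or in a terminal state, so
        # restart a fresh exploration episode through the interface instead
        # of reading or writing env.state directly
        o = reset!(env)   # o is the initial observation of the new episode
    end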

@MaximeBouton
Contributor Author


Thanks, you are right. Unfortunately, nothing in the interface right now supports saving the current state of the environment and restoring it later.
The bug comes from evaluation and exploration sharing the same env object: during evaluation, any hidden state inside env is modified and ends up inconsistent with the state the exploration loop expects.
I can think of the following options:

  • only evaluate at the end of an episode
  • keep two copies of env, one for exploration and one for evaluation (see the sketch below)
  • add functions like hidden_states and set_hidden_states to RLInterface
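
As a rough sketch of the second option (illustration only; deepcopy of the environment is an assumption and only works if env is copyable):

    # created once, before the training loop: a separate copy so that
    # evaluation never touches the exploration environment
    eval_env = deepcopy(env)

    # inside the loop, evaluation runs on the copy
    if t%solver.eval_freq == 0
        scores_eval = evaluation(solver.evaluation_policy,
                                 policy, eval_env,
                                 solver.num_ep_eval,
                                 solver.max_episode_length,
                                 solver.verbose)
        # no need to save and restore env.state here
    end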

I will file an issue in DeepQLearning.jl and try to address this soon.
