# Generative Adversarial Imitation Learning

Generative Adversarial Imitation Learning (GAIL) was first proposed in the paper [Generative Adversarial Imitation Learning](https://arxiv.org/abs/1606.03476) by Jonathan Ho and Stefano Ermon. The project task is to implement the Generative Adversarial Imitation Learning model for driving scenarios using the BARK simulator.

GAIL is based on the setting of Reinforcement Learning (RL). In Reinforcement Learning, the agent interacts with the environment through its actions and receives rewards in return. The aim of the learning process is to maximize the cumulative reward by choosing the best action in every state.
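
To make the notion of cumulative reward concrete, here is a minimal sketch of how the return of one episode can be computed. The discount factor `gamma` and the function name are assumptions made for illustration; with `gamma=1.0` the function reduces to the plain sum of rewards.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum_t gamma**t * rewards[t] for a single episode."""
    total = 0.0
    # Walk backwards so that each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Toy episode: the only reward arrives at the last of three steps.
episode_rewards = [0.0, 0.0, 1.0]
print(discounted_return(episode_rewards, gamma=0.5))  # 0.25
```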

As the name suggests, GAIL belongs to a subgroup of RL called Imitation Learning. In this setup, the goal of the agent is to mimic expert behavior as closely as possible. Expert-like behavior earns higher rewards, while actions that deviate substantially from the expert behavior earn lower ones. In our case, expert trajectories were generated from real-life data, namely the Interaction Dataset, as well as from a pretrained SAC (Soft Actor-Critic) agent. These expert trajectories represent the expert knowledge: they contain many states together with the actions the expert produced in them.

As mentioned previously, learning in the RL setting is driven by the rewards the agent receives from the environment. In the Imitation Learning setting, the size of the reward is determined by how closely the agent mimics the expert behavior. What is special about the GAIL approach is that the reward comes from an adversarial game: the agent is represented by a generator network, which is trained based on the feedback of a discriminator network. The generator produces actions for given states, which are then evaluated by the discriminator. At the same time, the discriminator is trained to classify expert and agent state-action pairs. In this way, the generator tries to fool the discriminator, so it aims to act as expert-like as possible, while the discriminator tries to distinguish between expert and agent trajectories. Intuitively, learning converges when the generator has learned to act so similarly to the expert that the discriminator can no longer tell expert and agent trajectories apart. In game theory, this point is called the Nash equilibrium.
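
The adversarial game described above can be written down as a pair of objectives. The formulation below is one common GAN-style form, not necessarily the exact loss used in our implementation. It assumes that $D(s,a)$ is the discriminator's estimated probability that the state-action pair $(s,a)$ comes from the expert policy $\pi_E$, and that $\pi$ is the agent (generator) policy. Note that the original paper uses the mirrored labelling, in which $D$ is close to 1 for agent samples.

\begin{align*}
\max_{D}\; & \mathbb{E}_{(s,a)\sim\pi_E}\left[\log D(s,a)\right] + \mathbb{E}_{(s,a)\sim\pi}\left[\log\left(1 - D(s,a)\right)\right] \\
\min_{\pi}\; & \mathbb{E}_{(s,a)\sim\pi}\left[\log\left(1 - D(s,a)\right)\right]
\end{align*}

Under this convention, the equilibrium corresponds to $D(s,a) = \tfrac{1}{2}$ for all pairs: the discriminator can only guess.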

In practice, a GAIL agent is usually implemented as follows for sample efficiency: the agent interacts with the environment by following its current policy, which generates agent state-action pairs. These pairs are stored in a replay buffer for later learning. After a specified interval, a training step is carried out. This training step has two substeps: training the discriminator network and training the generator network.
* __Discriminator training:__ The discriminator is fed a batch of expert (from the expert trajectories) and agent (from the replay buffer) state-action pairs. It classifies all of them. Based on their true labels the loss is calculated and a gradient descent step is carried out in order to minimize the loss.
* __Generator training:__ The generator is fed a batch of states from the replay buffer and produces actions for them. The resulting state-action pairs are fed to the discriminator for classification. The negative output of the discriminator is used as the loss of the generator network. (This loss is close to -1 if the agent mimics the expert successfully.) The gradient of the loss is propagated all the way back to the generator network, which then takes a gradient step to minimize its loss.

As already stated, training runs until both the generator loss and the discriminator loss converge to steady-state values.
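
The two substeps above can be illustrated with a deliberately small NumPy sketch. Everything in it is an assumption made for illustration: the discriminator is a single logistic-regression layer instead of a neural network, the state-action pairs are random toy data, and the generator is not updated, only its loss is evaluated. It is not the TF2RL implementation.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Toy data: each row is a concatenated (state, action) pair.
expert_pairs = rng.normal(loc=1.0, scale=0.5, size=(64, 3))   # from expert trajectories
agent_pairs = rng.normal(loc=-1.0, scale=0.5, size=(64, 3))   # from the replay buffer

inputs = np.vstack([expert_pairs, agent_pairs])
labels = np.concatenate([np.ones(64), np.zeros(64)])  # 1 = expert, 0 = agent

weights = np.zeros(3)
bias = 0.0
learning_rate = 0.5


def discriminator_loss(w, b):
    """Binary cross-entropy between predictions and the true labels."""
    probs = sigmoid(inputs @ w + b)
    eps = 1e-8
    return -np.mean(labels * np.log(probs + eps)
                    + (1.0 - labels) * np.log(1.0 - probs + eps))


loss_before = discriminator_loss(weights, bias)

# Discriminator training: gradient descent steps on the classification loss.
for _ in range(100):
    probs = sigmoid(inputs @ weights + bias)
    grad_logits = (probs - labels) / len(labels)
    weights -= learning_rate * (inputs.T @ grad_logits)
    bias -= learning_rate * grad_logits.sum()

loss_after = discriminator_loss(weights, bias)

# Generator loss as described above: the negative discriminator output
# on agent pairs. It approaches -1 when the agent looks like the expert.
generator_loss = -np.mean(sigmoid(agent_pairs @ weights + bias))

print(f"discriminator loss: {loss_before:.3f} -> {loss_after:.3f}")
print(f"generator loss: {generator_loss:.3f}")
```

Because the toy agent data is easy to separate from the expert data, the discriminator loss drops quickly and the generator loss stays close to 0, which is the situation at the start of training.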


The training process is visualized in the following figure:


<img width=70% src="files/data/gail_overview.gif">

# Interaction Dataset
As our data source, we use the Interaction Dataset: https://arxiv.org/abs/1910.03088. We are interested in the merging scenarios:
* DR_DEU_Merging_MT
* DR_CHN_Merging_ZS

These scenarios contain a map specification and track specifications for multiple vehicles that drive on the map. The tracks represent the trajectories of the vehicles which consist of a number of consecutive recorded states. 

Have a look at how the Interaction Dataset is [integrated in BARK](https://github.com/bark-simulator/bark/blob/setup_tutorials/docs/tutorials/04_interaction_dataset.ipynb).
(Note that the dataset itself is NOT enclosed within BARK due to license limitations.)

# Expert Trajectories

The Interaction Dataset contains trajectories of many different vehicles with different wheel bases. Since only states are recorded in the dataset, we calculate the action the vehicle has taken to get from one state to the next ourselves.

The wheel base is needed to calculate the action according to the [Single Track Model](https://borrelli.me.berkeley.edu/pdfpub/IV_KinematicMPC_jason.pdf). Since we are only interested in trajectories of a vehicle with the same wheel base as our agent, we use a fixed wheel base when calculating the actions from successive states. This does not reduce the accuracy of the data: we simply treat all state trajectories as if they had been driven by the same car and calculate the actions that car would have needed to produce that behavior.
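
As an illustration of this calculation, the following sketch recovers an action from two successive states. It assumes a rear-axle-referenced kinematic single track model, in which the yaw rate is $\dot{\Theta} = \frac{v}{L}\tan(\delta)$ with wheel base $L$, and a simple finite difference over the time step. The function name, the state layout `(x, y, theta, v)` and the numeric values are hypothetical; the actual code in `generate_expert_trajectories` may differ.

```python
import math


def action_from_states(state, next_state, dt, wheel_base):
    """Estimate (acceleration, steering_angle) between two states.

    Each state is a tuple (x, y, theta, v); dt is the time between them.
    """
    _, _, theta, v = state
    _, _, next_theta, next_v = next_state

    # Finite-difference acceleration.
    acceleration = (next_v - v) / dt

    # Wrap the heading difference into [-pi, pi) before differentiating.
    d_theta = (next_theta - theta + math.pi) % (2.0 * math.pi) - math.pi

    # A vehicle at standstill cannot turn; avoid dividing by zero.
    if abs(v) < 1e-6:
        return acceleration, 0.0

    # theta_dot = v / L * tan(delta)  =>  delta = atan(L * theta_dot / v)
    steering_angle = math.atan(wheel_base * (d_theta / dt) / v)
    return acceleration, steering_angle


# Straight driving while speeding up from 10 m/s to 11 m/s within 0.2 s.
print(action_from_states((0.0, 0.0, 0.0, 10.0), (2.0, 0.0, 0.0, 11.0),
                         dt=0.2, wheel_base=2.7))
```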

Since the state variables and the actions have different magnitudes, we normalize all of them for training. The expert trajectories are normalized while the generated trajectories are being loaded. The loading function also takes the current environment (BARK runtime) as an input, so the trajectories are normalized according to the current parameters (the current state and action spaces).
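
A minimal sketch of such a normalization is given below. It assumes a linear mapping of each dimension from the bounds of the corresponding space to the range $[-1, 1]$; the target range, the function names and the example bounds are assumptions for illustration and may not match what the loading function actually does.

```python
import numpy as np


def normalize(values, low, high):
    """Map values linearly from [low, high] to [-1, 1], per dimension."""
    values = np.asarray(values, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    return 2.0 * (values - low) / (high - low) - 1.0


def denormalize(values, low, high):
    """Inverse of normalize: map values from [-1, 1] back to [low, high]."""
    values = np.asarray(values, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    return (values + 1.0) / 2.0 * (high - low) + low


# Hypothetical action bounds: acceleration in [-5, 5], steering angle in [-0.5, 0.5].
action_low = [-5.0, -0.5]
action_high = [5.0, 0.5]
print(normalize([2.5, 0.0], action_low, action_high))
```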

You can have a look at the source code in `bark_ml.library_wrappers.lib_tf2rl.generate_expert_trajectories`

# Generate Expert Trajectories
A short example script for generating expert trajectories from the Interaction Dataset is shown below.

In [None]:
import os 
import bark
from pprint import pprint
from bark_ml.library_wrappers.lib_tf2rl.generate_expert_trajectories import *

tracks_folder = os.path.join(os.getcwd(), 'data')
map_file = os.path.join(os.getcwd(), 'data/DR_DEU_Merging_MT_v01_shifted.xodr')
known_key = ('DR_DEU_Merging_MT_v01_shifted', 'vehicle_tracks_013')
ego_agent = 66

param_server = create_parameter_servers_for_scenarios(map_file, tracks_folder)[known_key]
generation_params = param_server["Scenario"]["Generation"]["InteractionDatasetScenarioGeneration"]
generation_params["TrackIds"] = [63, 64, 65, 66, 67, 68]
generation_params["StartTs"] = 232000
generation_params["EndTs"] = 259000
generation_params["EgoTrackId"] = ego_agent
param_server["Scenario"]["Generation"]["InteractionDatasetScenarioGeneration"] = generation_params

In [None]:
expert_trajectories = generate_expert_trajectories_for_scenario(param_server, sim_time_step=200, renderer="matplotlib_jupyter")

In [3]:
import pandas as pd
from IPython.display import display
from helpers import *

## The generated expert trajectories
The generated expert trajectories are stored in a dictionary with key-value pairs:
* `obs`: list, contains the observation vector for the timestep.
* `act`: list, contains the action that was carried out in that timestep.
* `next_obs`: list, contains the next observation after carrying out the action `act` in the state `obs`. 

### Format of observations

\begin{align*}
\begin{pmatrix}
x \\
y \\
\Theta \\
v
\end{pmatrix}
\text{ for the ego vehicle and the three nearest vehicles in the scene, where $x$ and $y$ are 2D coordinates, $\Theta$ is orientation and $v$ velocity.}
\end{align*}

Stored as `expert_trajectories[ego_agent]['obs']` 
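
To work with a single observation, it can help to view it as one row per vehicle. The sketch below assumes that an observation is a flat vector of 16 values, ordered vehicle by vehicle with the ego vehicle first and each vehicle contributing $(x, y, \Theta, v)$. This layout and the helper name `split_observation` are assumptions for illustration and should be checked against the observer that is actually used.

```python
import numpy as np

STATE_FIELDS = ("x", "y", "theta", "v")


def split_observation(observation, n_vehicles=4):
    """Reshape a flat observation into an array of shape (n_vehicles, 4)."""
    obs = np.asarray(observation, dtype=float)
    expected = n_vehicles * len(STATE_FIELDS)
    if obs.shape != (expected,):
        raise ValueError(f"expected {expected} values, got shape {obs.shape}")
    return obs.reshape(n_vehicles, len(STATE_FIELDS))


# Toy observation with recognizable values instead of real data.
toy_observation = np.arange(16)
per_vehicle = split_observation(toy_observation)

# Row 0 is assumed to be the ego vehicle.
ego_state = dict(zip(STATE_FIELDS, per_vehicle[0]))
print(ego_state)
```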

In [None]:
# Small number of observations for our agent

pd.options.display.float_format = '{:,.2f}'.format
display(observations_to_dataframe(expert_trajectories[ego_agent]['obs'][:5]))

### Format of actions
\begin{align*}
\begin{pmatrix}
a \\
\delta 
\end{pmatrix}
\text{, where $a$ is acceleration and $\delta$ is steering angle.}
\end{align*}

Stored as `expert_trajectories[ego_agent]['act']` 

In [5]:
# Small number of actions from our agent

pd.options.display.float_format = '{:,.2f}'.format
display(actions_to_dataframe(expert_trajectories[ego_agent]['act'][:5]))

| | Acceleration | Steering angle |
|---|---|---|
| 0 | -0.03 | 0.0 |
| 1 | -0.04 | -0.01 |
| 2 | -0.07 | 0.0 |
| 3 | -0.1 | -0.01 |
| 4 | -0.12 | 0.0 |


---

# GAIL implementation
The following section describes the implementation details of our Generative Adversarial Imitation Learning setup.

## TF2RL implementation
We use an off-the-shelf implementation, the library [TF2RL](https://github.com/keiohta/tf2rl). It implements several reinforcement learning algorithms in [TensorFlow 2](https://www.tensorflow.org/guide/effective_tf2).

The GAIL agent is built up as follows:
* __Generator:__ A complete DDPG agent with actor and critic networks. Each of them has 2 hidden layers.
* __Discriminator:__ A normal discriminator network with 2 hidden layers.

In this respect, the agent does not follow the traditional GAIL setup with 2 neural networks. Instead, it has 5 networks: the discriminator plus the 4 networks of the DDPG agent, which keeps a target copy of both the actor and the critic for greater stability during training. The DDPG agent's critic network receives the judgement of the discriminator network in place of the reward from the environment, and its training aims to maximize this reward.
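
The stabilizing role of the two target networks in DDPG comes from updating them only slowly towards the trained networks (a soft, or Polyak, update). The sketch below shows this update on plain NumPy arrays. The function name and the value of `tau` are assumptions for illustration; TF2RL handles this step internally.

```python
import numpy as np


def soft_update(target_weights, source_weights, tau=0.005):
    """Move each target weight array a fraction tau towards its source."""
    return [(1.0 - tau) * target + tau * source
            for target, source in zip(target_weights, source_weights)]


# Toy "networks": one weight matrix and one bias vector each.
actor_weights = [np.ones((2, 2)), np.ones(2)]
target_actor_weights = [np.zeros((2, 2)), np.zeros(2)]

target_actor_weights = soft_update(target_actor_weights, actor_weights, tau=0.1)
print(target_actor_weights[1])  # [0.1 0.1]
```

The same update is applied to the target critic, so the targets used in the critic loss change smoothly even when the trained networks change quickly.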

## Integration into BARK
The TF2RL based GAIL agent is integrated into the existing BARK concepts and is implemented in the following most important classes:
* __TF2RLWrapper:__ Wraps the BARK runtime to match the expectations of TF2RL about the environment. The observation and action normalization also takes place here.
    * Source: `bark_ml/library_wrappers/lib_tf2rl/tf2rl_wrapper.py`
* __BehaviorTF2RLAgent:__ Base class for TF2RL based agents.
    * Source: `bark_ml/library_wrappers/lib_tf2rl/agents/tf2rl_agent.py`
* __BehaviorGAILAgent:__ The TF2RL based GAIL agent.
    * Source: `bark_ml/library_wrappers/lib_tf2rl/agents/gail_agent.py`
* __TF2RLRunner:__ Base class for TF2RL based runners.
    * Source: `bark_ml/library_wrappers/lib_tf2rl/runners/tf2rl_runner.py`
* __GAILRunner:__ The TF2RL based GAIL runner.
    * Source: `bark_ml/library_wrappers/lib_tf2rl/runners/gail_runner.py`
    
In the following, the training process is demonstrated. Afterwards, the performance of a pre-trained agent can be visualized.

---

# Training
We will now train a GAIL agent using the implementation described above. There are several training parameters which can be set on demand:
* The number of steps to train for
* The frequency of testing during training
* The number of episodes in each testing round
* The usage of GPU accelerated calculations

In [6]:
# Customize some parameters here!

max_steps = 100000          # Number of steps to train for.
test_interval = 100         # Run a test round every `test_interval` steps.
test_episodes = 5           # Number of episodes in each test round.
gpu = 0                     # Use -1 for CPU only.

In [7]:
# imports
import os
from pathlib import Path

# BARK imports
from bark_project.bark.runtime.commons.parameters import ParameterServer
from bark.runtime.viewer.matplotlib_viewer import MPViewer
from bark.runtime.viewer.video_renderer import VideoRenderer

# BARK-ML imports
from bark_ml.environments.blueprints import ContinuousHighwayBlueprint, \
  ContinuousMergingBlueprint, ContinuousIntersectionBlueprint
from bark_ml.environments.single_agent_runtime import SingleAgentRuntime
from bark_ml.library_wrappers.lib_tf2rl.tf2rl_wrapper import TF2RLWrapper
from bark_ml.library_wrappers.lib_tf2rl.agents.gail_agent import BehaviorGAILAgent
from bark_ml.library_wrappers.lib_tf2rl.runners.gail_runner import GAILRunner
from bark_ml.library_wrappers.lib_tf2rl.load_expert_trajectories import load_expert_trajectories

PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')


## Training graphs

The next cell deletes the previous logs and launches tensorboard. After tensorboard has launched, please go on to the next cell and start the training. The tensorboard window refreshes itself every 30 seconds, but you can also refresh it manually in the upper right corner.

You should see the graphs for:
* **Common**
    * Common/average_step_count: The average number of steps the agent takes in the environment per scenario
    * Common/average_test_return: The average return during the test scenarios
    * Common/fps: The steps the agent takes in the environment per second
    * Common/training_return: The per scenario return of the agent during training
* **DDPG**
    * DDPG/actor_loss: The loss of the actor network
    * DDPG/critic_loss: The loss of the critic network
* **GAIL**
    * GAIL/Accuracy: The agent/expert distinguishing accuracy of the discriminator
    * GAIL/DiscriminatorLoss: The loss of the discriminator network
    * GAIL/JSdivergence: The Jensen–Shannon divergence measuring the similarity between the expert and agent

The GAIL agent should converge to a Common/average_test_return of 1, meaning it succeeds in every scenario it faces, after at most 10,000 scenarios.
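
For intuition about the GAIL/JSdivergence graph listed above, the following sketch computes the Jensen–Shannon divergence between two discrete distributions. It is a textbook illustration only: TF2RL estimates the quantity from discriminator outputs on continuous state-action pairs, and the function below is not taken from the library.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL divergence.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint distributions
```

The value is 0 for identical distributions and reaches its maximum of $\ln 2$ for distributions with disjoint support, so a decreasing curve indicates that agent and expert behavior are becoming harder to tell apart.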

***

Sometimes TensorBoard does not refresh correctly. If you don't see all of the above graphs after 300 scenarios, please right-click the TensorBoard frame and click _Reload frame_.

In [8]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

# launching tensorboard and deleting the previous runs logdirs:
%rm -r "data/logs"
%mkdir "data/logs"
%tensorboard --logdir "data/logs"

rm: cannot remove 'data/logs': No such file or directory


Reusing TensorBoard on port 6006 (pid 41312), started 0:26:37 ago. (Use '!kill 41312' to kill it.)

In [None]:
# load params from the json file to create the parameter server object
params = ParameterServer(filename="data/params/gail_params.json")

# customized parameters:
params["ML"]["Settings"]["GPUUse"] = gpu
tf2rl_params = params["ML"]["GAILRunner"]["tf2rl"]
tf2rl_params["max_steps"] = max_steps
tf2rl_params["test_interval"] = test_interval
tf2rl_params["test_episodes"] = test_episodes
params["ML"]["GAILRunner"]["tf2rl"] = tf2rl_params
# cap the warm-up phase at half of the training steps (integer division keeps it an int)
if params["ML"]["BehaviorGAILAgent"]["WarmUp"] > max_steps // 2:
    params["ML"]["BehaviorGAILAgent"]["WarmUp"] = max_steps // 2

# create environment
bp = ContinuousMergingBlueprint(params,
                                number_of_senarios=500,
                                random_seed=0)
env = SingleAgentRuntime(blueprint=bp,
                         render=False)

# wrapped environment for compatibility with tf2rl
wrapped_env = TF2RLWrapper(env, 
                           normalize_features=params["ML"]["Settings"]["NormalizeFeatures"])

# instantiate the GAIL agent
gail_agent = BehaviorGAILAgent(environment=wrapped_env,
                               params=params)

# load the expert trajectories
expert_trajectories, _, _ = load_expert_trajectories(
    params['ML']['ExpertTrajectories']['expert_path_dir'],
    normalize_features=params["ML"]["Settings"]["NormalizeFeatures"],
    env=env, # the unwrapped env has to be used, since that contains the unnormalized spaces.
    subset_size=params["ML"]["ExpertTrajectories"]["subset_size"]
    ) 

# instantiate a runner that is going to train the agent
runner = GAILRunner(params=params,
                 environment=wrapped_env,
                 agent=gail_agent,
                 expert_trajs=expert_trajectories)

# train the agent
runner.Train()

  if lane.find("userData"):
12:07:30.986 [INFO] (trainer.py:65) Restored None
12:07:31.038 [INFO] (irl_trainer.py:73) Total Epi:     1 Steps:       9 Episode Steps:     9 Return: -1.0000 FPS: 182.56
12:07:31.413 [INFO] (irl_trainer.py:73) Total Epi:     2 Steps:      12 Episode Steps:     3 Return: -1.0000 FPS: 545.44
12:07:31.421 [INFO] (irl_trainer.py:73) Total Epi:     3 Steps:      16 Episode Steps:     4 Return: -1.0000 FPS: 662.53
12:07:31.442 [INFO] (irl_trainer.py:73) Total Epi:     4 Steps:      37 Episode Steps:    21 Return: -1.0000 FPS: 1116.58
12:07:31.468 [INFO] (irl_trainer.py:73) Total Epi:     5 Steps:      66 Episode Steps:    29 Return: -1.0000 FPS: 1197.69
12:07:31.479 [INFO] (irl_trainer.py:73) Total Epi:     6 Steps:      75 Episode Steps:     9 Return: -1.0000 FPS: 1042.32
12:07:31.486 [INFO] (irl_trainer.py:73) Total Epi:     7 Steps:      78 Episode Steps:     3 Return: -1.0000 FPS: 592.23
12:07:31.494 [INFO] (irl_trainer.py:73) Total Epi:     8 Steps:      83 

12:07:32.161 [INFO] (irl_trainer.py:73) Total Epi:    66 Steps:     594 Episode Steps:     2 Return: -1.0000 FPS: 259.72
12:07:32.174 [INFO] (irl_trainer.py:73) Total Epi:    67 Steps:     607 Episode Steps:    13 Return: -1.0000 FPS: 1090.18
12:07:32.182 [INFO] (irl_trainer.py:73) Total Epi:    68 Steps:     614 Episode Steps:     7 Return: -1.0000 FPS: 1039.38
12:07:32.200 [INFO] (irl_trainer.py:73) Total Epi:    69 Steps:     633 Episode Steps:    19 Return:  1.0000 FPS: 1178.65
12:07:32.217 [INFO] (irl_trainer.py:73) Total Epi:    70 Steps:     650 Episode Steps:    17 Return: -1.0000 FPS: 1065.18
12:07:32.225 [INFO] (irl_trainer.py:73) Total Epi:    71 Steps:     653 Episode Steps:     3 Return: -1.0000 FPS: 489.06
12:07:32.240 [INFO] (irl_trainer.py:73) Total Epi:    72 Steps:     666 Episode Steps:    13 Return: -1.0000 FPS: 1002.06
12:07:32.249 [INFO] (irl_trainer.py:73) Total Epi:    73 Steps:     669 Episode Steps:     3 Return: -1.0000 FPS: 464.12
12:07:32.258 [INFO] (irl_tr



To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.



12:07:35.604 [INFO] (irl_trainer.py:73) Total Epi:   124 Steps:    1003 Episode Steps:     3 Return: -1.0000 FPS:  1.05
12:07:35.727 [INFO] (irl_trainer.py:73) Total Epi:   125 Steps:    1007 Episode Steps:     4 Return: -1.0000 FPS: 33.01
12:07:35.844 [INFO] (irl_trainer.py:73) Total Epi:   126 Steps:    1011 Episode Steps:     4 Return: -1.0000 FPS: 34.64
12:07:36.359 [INFO] (irl_trainer.py:73) Total Epi:   127 Steps:    1028 Episode Steps:    17 Return: -1.0000 FPS: 33.09
12:07:36.511 [INFO] (irl_trainer.py:73) Total Epi:   128 Steps:    1033 Episode Steps:     5 Return: -1.0000 FPS: 33.38
12:07:36.675 [INFO] (irl_trainer.py:73) Total Epi:   129 Steps:    1038 Episode Steps:     5 Return: -1.0000 FPS: 31.91
12:07:37.050 [INFO] (irl_trainer.py:73) Total Epi:   130 Steps:    1050 Episode Steps:    12 Return: -1.0000 FPS: 32.20
12:07:37.171 [INFO] (irl_trainer.py:73) Total Epi:   131 Steps:    1054 Episode Steps:     4 Return: -1.0000 FPS: 34.09
12:07:37.379 [INFO] (irl_trainer.py:73) 

12:07:48.811 [INFO] (irl_trainer.py:73) Total Epi:   193 Steps:    1431 Episode Steps:    16 Return: -1.0000 FPS: 36.22
12:07:49.250 [INFO] (irl_trainer.py:73) Total Epi:   194 Steps:    1446 Episode Steps:    15 Return: -1.0000 FPS: 34.31
12:07:49.726 [INFO] (irl_trainer.py:73) Total Epi:   195 Steps:    1462 Episode Steps:    16 Return:  1.0000 FPS: 33.78
12:07:50.419 [INFO] (irl_trainer.py:73) Total Epi:   196 Steps:    1484 Episode Steps:    22 Return: -1.0000 FPS: 31.81
12:07:50.720 [INFO] (irl_trainer.py:73) Total Epi:   197 Steps:    1494 Episode Steps:    10 Return: -1.0000 FPS: 33.47
12:07:51.353 [INFO] (irl_trainer.py:73) Total Epi:   198 Steps:    1514 Episode Steps:    20 Return:  1.0000 FPS: 31.67
12:07:51.940 [INFO] (irl_trainer.py:73) Total Epi:   199 Steps:    1533 Episode Steps:    19 Return: -1.0000 FPS: 32.47
12:07:52.547 [INFO] (irl_trainer.py:73) Total Epi:   200 Steps:    1553 Episode Steps:    20 Return: -1.0000 FPS: 33.03
12:07:52.798 [INFO] (irl_trainer.py:119)

12:08:20.097 [INFO] (irl_trainer.py:73) Total Epi:   261 Steps:    2452 Episode Steps:    13 Return: -1.0000 FPS: 33.44
12:08:20.597 [INFO] (irl_trainer.py:73) Total Epi:   262 Steps:    2468 Episode Steps:    16 Return: -1.0000 FPS: 32.12
12:08:21.045 [INFO] (irl_trainer.py:73) Total Epi:   263 Steps:    2482 Episode Steps:    14 Return: -1.0000 FPS: 31.37
12:08:21.513 [INFO] (irl_trainer.py:73) Total Epi:   264 Steps:    2498 Episode Steps:    16 Return: -1.0000 FPS: 34.31
12:08:21.800 [INFO] (irl_trainer.py:73) Total Epi:   265 Steps:    2507 Episode Steps:     9 Return: -1.0000 FPS: 31.58
12:08:22.123 [INFO] (irl_trainer.py:73) Total Epi:   266 Steps:    2518 Episode Steps:    11 Return: -1.0000 FPS: 34.23
12:08:22.397 [INFO] (irl_trainer.py:73) Total Epi:   267 Steps:    2527 Episode Steps:     9 Return: -1.0000 FPS: 33.25
12:08:22.696 [INFO] (irl_trainer.py:73) Total Epi:   268 Steps:    2537 Episode Steps:    10 Return: -1.0000 FPS: 33.64
12:08:23.070 [INFO] (irl_trainer.py:73) 

12:08:45.802 [INFO] (irl_trainer.py:73) Total Epi:   329 Steps:    3288 Episode Steps:    10 Return: -1.0000 FPS: 33.10
12:08:46.273 [INFO] (irl_trainer.py:73) Total Epi:   330 Steps:    3303 Episode Steps:    15 Return: -1.0000 FPS: 32.00
12:08:46.627 [INFO] (irl_trainer.py:73) Total Epi:   331 Steps:    3315 Episode Steps:    12 Return: -1.0000 FPS: 34.15
12:08:46.876 [INFO] (irl_trainer.py:73) Total Epi:   332 Steps:    3323 Episode Steps:     8 Return: -1.0000 FPS: 32.36
12:08:47.318 [INFO] (irl_trainer.py:73) Total Epi:   333 Steps:    3338 Episode Steps:    15 Return: -1.0000 FPS: 34.03
12:08:47.899 [INFO] (irl_trainer.py:73) Total Epi:   334 Steps:    3356 Episode Steps:    18 Return: -1.0000 FPS: 31.10
12:08:48.199 [INFO] (irl_trainer.py:73) Total Epi:   335 Steps:    3366 Episode Steps:    10 Return: -1.0000 FPS: 33.58
12:08:48.760 [INFO] (irl_trainer.py:73) Total Epi:   336 Steps:    3383 Episode Steps:    17 Return: -1.0000 FPS: 30.41
12:08:49.174 [INFO] (irl_trainer.py:73) 

12:09:13.287 [INFO] (irl_trainer.py:73) Total Epi:   398 Steps:    4161 Episode Steps:     5 Return: -1.0000 FPS: 36.78
12:09:13.764 [INFO] (irl_trainer.py:73) Total Epi:   399 Steps:    4177 Episode Steps:    16 Return: -1.0000 FPS: 33.71
12:09:14.034 [INFO] (irl_trainer.py:73) Total Epi:   400 Steps:    4186 Episode Steps:     9 Return: -1.0000 FPS: 33.71
12:09:14.233 [INFO] (irl_trainer.py:119) Evaluation Total Steps:    4186 Average Reward -0.2000 / Average Step Count  12.6 over  5 episodes
12:09:14.240 [INFO] (irl_trainer.py:73) Total Epi:   401 Steps:    4187 Episode Steps:     1 Return: -1.0000 FPS:  4.88
12:09:14.768 [INFO] (irl_trainer.py:73) Total Epi:   402 Steps:    4203 Episode Steps:    16 Return:  1.0000 FPS: 30.46
12:09:15.087 [INFO] (irl_trainer.py:73) Total Epi:   403 Steps:    4213 Episode Steps:    10 Return: -1.0000 FPS: 31.54
12:09:15.243 [INFO] (irl_trainer.py:73) Total Epi:   404 Steps:    4218 Episode Steps:     5 Return: -1.0000 FPS: 32.44
12:09:15.601 [INFO] 

12:09:43.323 [INFO] (irl_trainer.py:73) Total Epi:   466 Steps:    5130 Episode Steps:    26 Return:  1.0000 FPS: 32.43
12:09:43.533 [INFO] (irl_trainer.py:73) Total Epi:   467 Steps:    5137 Episode Steps:     7 Return: -1.0000 FPS: 33.61
12:09:44.045 [INFO] (irl_trainer.py:73) Total Epi:   468 Steps:    5154 Episode Steps:    17 Return: -1.0000 FPS: 33.34
12:09:44.349 [INFO] (irl_trainer.py:73) Total Epi:   469 Steps:    5164 Episode Steps:    10 Return: -1.0000 FPS: 33.16
12:09:44.933 [INFO] (irl_trainer.py:73) Total Epi:   470 Steps:    5183 Episode Steps:    19 Return:  1.0000 FPS: 32.64
12:09:45.373 [INFO] (irl_trainer.py:73) Total Epi:   471 Steps:    5197 Episode Steps:    14 Return: -1.0000 FPS: 32.04
12:09:45.786 [INFO] (irl_trainer.py:73) Total Epi:   472 Steps:    5210 Episode Steps:    13 Return:  1.0000 FPS: 31.57
12:09:46.244 [INFO] (irl_trainer.py:73) Total Epi:   473 Steps:    5225 Episode Steps:    15 Return:  1.0000 FPS: 32.91
12:09:46.421 [INFO] (irl_trainer.py:73) 

12:10:14.010 [INFO] (irl_trainer.py:73) Total Epi:   534 Steps:    6129 Episode Steps:     5 Return: -1.0000 FPS: 31.88
12:10:14.388 [INFO] (irl_trainer.py:73) Total Epi:   535 Steps:    6142 Episode Steps:    13 Return: -1.0000 FPS: 34.61
12:10:15.001 [INFO] (irl_trainer.py:73) Total Epi:   536 Steps:    6162 Episode Steps:    20 Return: -1.0000 FPS: 32.75
12:10:15.310 [INFO] (irl_trainer.py:73) Total Epi:   537 Steps:    6172 Episode Steps:    10 Return: -1.0000 FPS: 32.70
12:10:16.213 [INFO] (irl_trainer.py:73) Total Epi:   538 Steps:    6201 Episode Steps:    29 Return: -1.0000 FPS: 32.13
12:10:16.866 [INFO] (irl_trainer.py:73) Total Epi:   539 Steps:    6222 Episode Steps:    21 Return: -1.0000 FPS: 32.27
12:10:17.362 [INFO] (irl_trainer.py:73) Total Epi:   540 Steps:    6239 Episode Steps:    17 Return:  1.0000 FPS: 34.44
12:10:17.774 [INFO] (irl_trainer.py:73) Total Epi:   541 Steps:    6252 Episode Steps:    13 Return: -1.0000 FPS: 31.74
12:10:18.462 [INFO] (irl_trainer.py:73) 

12:10:45.469 [INFO] (irl_trainer.py:73) Total Epi:   602 Steps:    7136 Episode Steps:    14 Return:  1.0000 FPS: 33.45
12:10:46.085 [INFO] (irl_trainer.py:73) Total Epi:   603 Steps:    7157 Episode Steps:    21 Return:  1.0000 FPS: 34.23
12:10:46.502 [INFO] (irl_trainer.py:73) Total Epi:   604 Steps:    7170 Episode Steps:    13 Return:  1.0000 FPS: 31.30
12:10:46.910 [INFO] (irl_trainer.py:73) Total Epi:   605 Steps:    7184 Episode Steps:    14 Return: -1.0000 FPS: 34.50
12:10:47.313 [INFO] (irl_trainer.py:73) Total Epi:   606 Steps:    7197 Episode Steps:    13 Return: -1.0000 FPS: 32.38
12:10:47.703 [INFO] (irl_trainer.py:73) Total Epi:   607 Steps:    7209 Episode Steps:    12 Return:  1.0000 FPS: 30.94
12:10:48.541 [INFO] (irl_trainer.py:73) Total Epi:   608 Steps:    7236 Episode Steps:    27 Return:  1.0000 FPS: 32.32
12:10:48.796 [INFO] (irl_trainer.py:73) Total Epi:   609 Steps:    7244 Episode Steps:     8 Return: -1.0000 FPS: 31.56
12:10:49.251 [INFO] (irl_trainer.py:73) 

12:11:19.192 [INFO] (irl_trainer.py:73) Total Epi:   671 Steps:    8225 Episode Steps:    20 Return:  1.0000 FPS: 32.66
12:11:19.814 [INFO] (irl_trainer.py:73) Total Epi:   672 Steps:    8245 Episode Steps:    20 Return: -1.0000 FPS: 32.28
12:11:20.498 [INFO] (irl_trainer.py:73) Total Epi:   673 Steps:    8267 Episode Steps:    22 Return:  1.0000 FPS: 32.26
12:11:21.190 [INFO] (irl_trainer.py:73) Total Epi:   674 Steps:    8289 Episode Steps:    22 Return:  1.0000 FPS: 31.91
12:11:21.372 [INFO] (irl_trainer.py:73) Total Epi:   675 Steps:    8295 Episode Steps:     6 Return: -1.0000 FPS: 33.18
12:11:21.977 [INFO] (irl_trainer.py:73) Total Epi:   676 Steps:    8314 Episode Steps:    19 Return: -1.0000 FPS: 31.49
12:11:22.270 [INFO] (irl_trainer.py:73) Total Epi:   677 Steps:    8323 Episode Steps:     9 Return: -1.0000 FPS: 30.98
12:11:22.458 [INFO] (irl_trainer.py:73) Total Epi:   678 Steps:    8329 Episode Steps:     6 Return: -1.0000 FPS: 32.30
12:11:23.028 [INFO] (irl_trainer.py:73) 

12:11:48.524 [INFO] (irl_trainer.py:73) Total Epi:   739 Steps:    9159 Episode Steps:     7 Return: -1.0000 FPS: 31.69
12:11:48.813 [INFO] (irl_trainer.py:73) Total Epi:   740 Steps:    9168 Episode Steps:     9 Return: -1.0000 FPS: 31.33
12:11:49.223 [INFO] (irl_trainer.py:73) Total Epi:   741 Steps:    9181 Episode Steps:    13 Return: -1.0000 FPS: 31.89
12:11:49.633 [INFO] (irl_trainer.py:73) Total Epi:   742 Steps:    9194 Episode Steps:    13 Return: -1.0000 FPS: 31.91
12:11:50.086 [INFO] (irl_trainer.py:73) Total Epi:   743 Steps:    9207 Episode Steps:    13 Return:  1.0000 FPS: 28.89
12:11:50.551 [INFO] (irl_trainer.py:73) Total Epi:   744 Steps:    9222 Episode Steps:    15 Return:  1.0000 FPS: 32.37
12:11:50.718 [INFO] (irl_trainer.py:73) Total Epi:   745 Steps:    9228 Episode Steps:     6 Return: -1.0000 FPS: 36.36
12:11:51.239 [INFO] (irl_trainer.py:73) Total Epi:   746 Steps:    9245 Episode Steps:    17 Return:  1.0000 FPS: 32.74
12:11:51.663 [INFO] (irl_trainer.py:73) 

12:12:18.368 [INFO] (irl_trainer.py:73) Total Epi:   807 Steps:   10103 Episode Steps:     8 Return: -1.0000 FPS: 29.98
12:12:19.034 [INFO] (irl_trainer.py:73) Total Epi:   808 Steps:   10123 Episode Steps:    20 Return:  1.0000 FPS: 30.11
12:12:19.643 [INFO] (irl_trainer.py:73) Total Epi:   809 Steps:   10143 Episode Steps:    20 Return:  1.0000 FPS: 32.95
12:12:19.984 [INFO] (irl_trainer.py:73) Total Epi:   810 Steps:   10154 Episode Steps:    11 Return: -1.0000 FPS: 32.49
12:12:20.596 [INFO] (irl_trainer.py:73) Total Epi:   811 Steps:   10174 Episode Steps:    20 Return:  1.0000 FPS: 32.80
12:12:21.121 [INFO] (irl_trainer.py:73) Total Epi:   812 Steps:   10191 Episode Steps:    17 Return:  1.0000 FPS: 32.63
12:12:21.485 [INFO] (irl_trainer.py:73) Total Epi:   813 Steps:   10202 Episode Steps:    11 Return: -1.0000 FPS: 30.40
12:12:22.116 [INFO] (irl_trainer.py:73) Total Epi:   814 Steps:   10223 Episode Steps:    21 Return:  1.0000 FPS: 33.40
12:12:22.261 [INFO] (irl_trainer.py:73) 

12:12:51.364 [INFO] (irl_trainer.py:73) Total Epi:   876 Steps:   11186 Episode Steps:    17 Return:  1.0000 FPS: 33.09
12:12:51.748 [INFO] (irl_trainer.py:73) Total Epi:   877 Steps:   11199 Episode Steps:    13 Return: -1.0000 FPS: 34.07
12:12:52.098 [INFO] (irl_trainer.py:73) Total Epi:   878 Steps:   11209 Episode Steps:    10 Return: -1.0000 FPS: 28.69
12:12:52.847 [INFO] (irl_trainer.py:73) Total Epi:   879 Steps:   11233 Episode Steps:    24 Return:  1.0000 FPS: 32.17
12:12:53.516 [INFO] (irl_trainer.py:73) Total Epi:   880 Steps:   11256 Episode Steps:    23 Return: -1.0000 FPS: 34.46
12:12:53.987 [INFO] (irl_trainer.py:73) Total Epi:   881 Steps:   11271 Episode Steps:    15 Return: -1.0000 FPS: 31.98
12:12:54.568 [INFO] (irl_trainer.py:73) Total Epi:   882 Steps:   11290 Episode Steps:    19 Return:  1.0000 FPS: 32.83
12:12:54.796 [INFO] (irl_trainer.py:73) Total Epi:   883 Steps:   11297 Episode Steps:     7 Return: -1.0000 FPS: 31.15
12:12:55.552 [INFO] (irl_trainer.py:73) 

12:13:23.381 [INFO] (irl_trainer.py:73) Total Epi:   944 Steps:   12218 Episode Steps:    10 Return:  1.0000 FPS: 33.21
12:13:23.994 [INFO] (irl_trainer.py:73) Total Epi:   945 Steps:   12238 Episode Steps:    20 Return:  1.0000 FPS: 32.72
12:13:24.432 [INFO] (irl_trainer.py:73) Total Epi:   946 Steps:   12252 Episode Steps:    14 Return: -1.0000 FPS: 32.15
12:13:24.814 [INFO] (irl_trainer.py:73) Total Epi:   947 Steps:   12265 Episode Steps:    13 Return:  1.0000 FPS: 34.37
12:13:25.250 [INFO] (irl_trainer.py:73) Total Epi:   948 Steps:   12279 Episode Steps:    14 Return:  1.0000 FPS: 32.21
12:13:25.999 [INFO] (irl_trainer.py:73) Total Epi:   949 Steps:   12297 Episode Steps:    18 Return:  1.0000 FPS: 24.08
12:13:26.411 [INFO] (irl_trainer.py:73) Total Epi:   950 Steps:   12310 Episode Steps:    13 Return: -1.0000 FPS: 31.69
12:13:26.829 [INFO] (irl_trainer.py:73) Total Epi:   951 Steps:   12322 Episode Steps:    12 Return: -1.0000 FPS: 28.86
12:13:27.396 [INFO] (irl_trainer.py:73) 

12:13:54.669 [INFO] (irl_trainer.py:73) Total Epi:  1012 Steps:   13222 Episode Steps:    11 Return: -1.0000 FPS: 32.37
12:13:55.234 [INFO] (irl_trainer.py:73) Total Epi:  1013 Steps:   13240 Episode Steps:    18 Return:  1.0000 FPS: 31.92
12:13:55.705 [INFO] (irl_trainer.py:73) Total Epi:  1014 Steps:   13255 Episode Steps:    15 Return: -1.0000 FPS: 31.96
12:13:56.218 [INFO] (irl_trainer.py:73) Total Epi:  1015 Steps:   13272 Episode Steps:    17 Return:  1.0000 FPS: 33.25
12:13:56.559 [INFO] (irl_trainer.py:73) Total Epi:  1016 Steps:   13284 Episode Steps:    12 Return: -1.0000 FPS: 35.43
12:13:57.322 [INFO] (irl_trainer.py:73) Total Epi:  1017 Steps:   13308 Episode Steps:    24 Return:  1.0000 FPS: 31.52
12:13:57.689 [INFO] (irl_trainer.py:73) Total Epi:  1018 Steps:   13320 Episode Steps:    12 Return:  1.0000 FPS: 32.95
12:13:58.062 [INFO] (irl_trainer.py:73) Total Epi:  1019 Steps:   13332 Episode Steps:    12 Return: -1.0000 FPS: 32.32
12:13:58.505 [INFO] (irl_trainer.py:73) 

12:14:29.345 [INFO] (irl_trainer.py:73) Total Epi:  1081 Steps:   14344 Episode Steps:    14 Return:  1.0000 FPS: 32.18
12:14:29.800 [INFO] (irl_trainer.py:73) Total Epi:  1082 Steps:   14359 Episode Steps:    15 Return: -1.0000 FPS: 33.12
12:14:30.481 [INFO] (irl_trainer.py:73) Total Epi:  1083 Steps:   14382 Episode Steps:    23 Return:  1.0000 FPS: 33.88
12:14:30.909 [INFO] (irl_trainer.py:73) Total Epi:  1084 Steps:   14397 Episode Steps:    15 Return:  1.0000 FPS: 35.13
12:14:31.410 [INFO] (irl_trainer.py:73) Total Epi:  1085 Steps:   14413 Episode Steps:    16 Return: -1.0000 FPS: 32.04
12:14:31.903 [INFO] (irl_trainer.py:73) Total Epi:  1086 Steps:   14429 Episode Steps:    16 Return:  1.0000 FPS: 32.58
12:14:32.242 [INFO] (irl_trainer.py:73) Total Epi:  1087 Steps:   14440 Episode Steps:    11 Return: -1.0000 FPS: 32.68
12:14:32.465 [INFO] (irl_trainer.py:73) Total Epi:  1088 Steps:   14447 Episode Steps:     7 Return: -1.0000 FPS: 31.75
12:14:32.709 [INFO] (irl_trainer.py:73) 

12:15:04.324 [INFO] (irl_trainer.py:73) Total Epi:  1149 Steps:   15470 Episode Steps:    22 Return:  1.0000 FPS: 31.80
12:15:04.594 [INFO] (irl_trainer.py:73) Total Epi:  1150 Steps:   15479 Episode Steps:     9 Return: -1.0000 FPS: 33.60
12:15:05.193 [INFO] (irl_trainer.py:73) Total Epi:  1151 Steps:   15499 Episode Steps:    20 Return:  1.0000 FPS: 33.46
12:15:05.497 [INFO] (irl_trainer.py:73) Total Epi:  1152 Steps:   15508 Episode Steps:     9 Return: -1.0000 FPS: 29.89
12:15:06.066 [INFO] (irl_trainer.py:73) Total Epi:  1153 Steps:   15527 Episode Steps:    19 Return: -1.0000 FPS: 33.48
12:15:06.674 [INFO] (irl_trainer.py:73) Total Epi:  1154 Steps:   15547 Episode Steps:    20 Return:  1.0000 FPS: 33.01
12:15:07.112 [INFO] (irl_trainer.py:73) Total Epi:  1155 Steps:   15562 Episode Steps:    15 Return: -1.0000 FPS: 34.35
12:15:07.574 [INFO] (irl_trainer.py:73) Total Epi:  1156 Steps:   15577 Episode Steps:    15 Return:  1.0000 FPS: 32.60
12:15:08.353 [INFO] (irl_trainer.py:73) 

12:15:36.750 [INFO] (irl_trainer.py:73) Total Epi:  1214 Steps:   16506 Episode Steps:    18 Return:  1.0000 FPS: 30.68
12:15:37.142 [INFO] (irl_trainer.py:73) Total Epi:  1215 Steps:   16519 Episode Steps:    13 Return:  1.0000 FPS: 33.34
12:15:37.537 [INFO] (irl_trainer.py:73) Total Epi:  1216 Steps:   16532 Episode Steps:    13 Return:  1.0000 FPS: 33.10
12:15:37.976 [INFO] (irl_trainer.py:73) Total Epi:  1217 Steps:   16547 Episode Steps:    15 Return:  1.0000 FPS: 34.29
12:15:38.738 [INFO] (irl_trainer.py:73) Total Epi:  1218 Steps:   16571 Episode Steps:    24 Return:  1.0000 FPS: 31.56
12:15:39.180 [INFO] (irl_trainer.py:73) Total Epi:  1219 Steps:   16586 Episode Steps:    15 Return: -1.0000 FPS: 34.10
12:15:39.812 [INFO] (irl_trainer.py:73) Total Epi:  1220 Steps:   16606 Episode Steps:    20 Return:  1.0000 FPS: 31.79
12:15:40.372 [INFO] (irl_trainer.py:73) Total Epi:  1221 Steps:   16624 Episode Steps:    18 Return:  1.0000 FPS: 32.21
12:15:40.968 [INFO] (irl_trainer.py:73) 

12:16:11.765 [INFO] (irl_trainer.py:73) Total Epi:  1283 Steps:   17639 Episode Steps:    12 Return: -1.0000 FPS: 33.49
12:16:12.305 [INFO] (irl_trainer.py:73) Total Epi:  1284 Steps:   17656 Episode Steps:    17 Return:  1.0000 FPS: 31.58
12:16:12.827 [INFO] (irl_trainer.py:73) Total Epi:  1285 Steps:   17673 Episode Steps:    17 Return:  1.0000 FPS: 32.67
12:16:13.343 [INFO] (irl_trainer.py:73) Total Epi:  1286 Steps:   17690 Episode Steps:    17 Return: -1.0000 FPS: 33.10
12:16:13.856 [INFO] (irl_trainer.py:73) Total Epi:  1287 Steps:   17706 Episode Steps:    16 Return:  1.0000 FPS: 31.30
12:16:14.439 [INFO] (irl_trainer.py:73) Total Epi:  1288 Steps:   17725 Episode Steps:    19 Return:  1.0000 FPS: 32.66
12:16:15.018 [INFO] (irl_trainer.py:73) Total Epi:  1289 Steps:   17744 Episode Steps:    19 Return: -1.0000 FPS: 32.96
12:16:15.618 [INFO] (irl_trainer.py:73) Total Epi:  1290 Steps:   17764 Episode Steps:    20 Return:  1.0000 FPS: 33.40
12:16:16.392 [INFO] (irl_trainer.py:73) 

12:16:43.896 [INFO] (irl_trainer.py:73) Total Epi:  1349 Steps:   18659 Episode Steps:    14 Return: -1.0000 FPS: 32.55
12:16:44.317 [INFO] (irl_trainer.py:73) Total Epi:  1350 Steps:   18673 Episode Steps:    14 Return:  1.0000 FPS: 33.42
12:16:44.774 [INFO] (irl_trainer.py:73) Total Epi:  1351 Steps:   18688 Episode Steps:    15 Return:  1.0000 FPS: 32.99
12:16:45.132 [INFO] (irl_trainer.py:73) Total Epi:  1352 Steps:   18700 Episode Steps:    12 Return:  1.0000 FPS: 33.67
12:16:45.466 [INFO] (irl_trainer.py:73) Total Epi:  1353 Steps:   18711 Episode Steps:    11 Return: -1.0000 FPS: 33.09
12:16:45.833 [INFO] (irl_trainer.py:73) Total Epi:  1354 Steps:   18723 Episode Steps:    12 Return:  1.0000 FPS: 32.87
12:16:46.548 [INFO] (irl_trainer.py:73) Total Epi:  1355 Steps:   18746 Episode Steps:    23 Return:  1.0000 FPS: 32.24
12:16:47.195 [INFO] (irl_trainer.py:73) Total Epi:  1356 Steps:   18766 Episode Steps:    20 Return:  1.0000 FPS: 31.01
12:16:47.561 [INFO] (irl_trainer.py:73) 

12:17:16.539 [INFO] (irl_trainer.py:73) Total Epi:  1417 Steps:   19711 Episode Steps:    15 Return:  1.0000 FPS: 30.61
12:17:16.907 [INFO] (irl_trainer.py:73) Total Epi:  1418 Steps:   19722 Episode Steps:    11 Return: -1.0000 FPS: 30.02
12:17:17.553 [INFO] (irl_trainer.py:73) Total Epi:  1419 Steps:   19742 Episode Steps:    20 Return:  1.0000 FPS: 31.08
12:17:18.162 [INFO] (irl_trainer.py:73) Total Epi:  1420 Steps:   19762 Episode Steps:    20 Return:  1.0000 FPS: 32.93
12:17:18.654 [INFO] (irl_trainer.py:73) Total Epi:  1421 Steps:   19778 Episode Steps:    16 Return:  1.0000 FPS: 32.64
12:17:19.014 [INFO] (irl_trainer.py:73) Total Epi:  1422 Steps:   19789 Episode Steps:    11 Return: -1.0000 FPS: 30.78
12:17:19.647 [INFO] (irl_trainer.py:73) Total Epi:  1423 Steps:   19809 Episode Steps:    20 Return:  1.0000 FPS: 31.68
12:17:20.381 [INFO] (irl_trainer.py:73) Total Epi:  1424 Steps:   19833 Episode Steps:    24 Return:  1.0000 FPS: 32.75
12:17:20.908 [INFO] (irl_trainer.py:73) 

12:17:51.308 [INFO] (irl_trainer.py:73) Total Epi:  1486 Steps:   20845 Episode Steps:    14 Return:  1.0000 FPS: 33.57
12:17:51.580 [INFO] (irl_trainer.py:73) Total Epi:  1487 Steps:   20855 Episode Steps:    10 Return: -1.0000 FPS: 36.99
12:17:52.036 [INFO] (irl_trainer.py:73) Total Epi:  1488 Steps:   20870 Episode Steps:    15 Return:  1.0000 FPS: 33.06
12:17:52.505 [INFO] (irl_trainer.py:73) Total Epi:  1489 Steps:   20886 Episode Steps:    16 Return:  1.0000 FPS: 34.26
12:17:52.998 [INFO] (irl_trainer.py:73) Total Epi:  1490 Steps:   20901 Episode Steps:    15 Return:  1.0000 FPS: 30.56
12:17:53.412 [INFO] (irl_trainer.py:73) Total Epi:  1491 Steps:   20915 Episode Steps:    14 Return:  1.0000 FPS: 33.99
12:17:53.860 [INFO] (irl_trainer.py:73) Total Epi:  1492 Steps:   20930 Episode Steps:    15 Return:  1.0000 FPS: 33.64
12:17:54.254 [INFO] (irl_trainer.py:73) Total Epi:  1493 Steps:   20943 Episode Steps:    13 Return: -1.0000 FPS: 33.18
12:17:54.691 [INFO] (irl_trainer.py:73) 

12:18:24.587 [INFO] (irl_trainer.py:73) Total Epi:  1551 Steps:   21910 Episode Steps:    18 Return:  1.0000 FPS: 31.42
12:18:25.167 [INFO] (irl_trainer.py:73) Total Epi:  1552 Steps:   21929 Episode Steps:    19 Return:  1.0000 FPS: 32.89
12:18:25.496 [INFO] (irl_trainer.py:73) Total Epi:  1553 Steps:   21940 Episode Steps:    11 Return:  1.0000 FPS: 33.60
12:18:26.205 [INFO] (irl_trainer.py:73) Total Epi:  1554 Steps:   21964 Episode Steps:    24 Return: -1.0000 FPS: 33.96
12:18:26.602 [INFO] (irl_trainer.py:73) Total Epi:  1555 Steps:   21977 Episode Steps:    13 Return:  1.0000 FPS: 32.85
12:18:27.050 [INFO] (irl_trainer.py:73) Total Epi:  1556 Steps:   21992 Episode Steps:    15 Return:  1.0000 FPS: 33.70
12:18:27.567 [INFO] (irl_trainer.py:73) Total Epi:  1557 Steps:   22009 Episode Steps:    17 Return: -1.0000 FPS: 33.06
12:18:27.907 [INFO] (irl_trainer.py:73) Total Epi:  1558 Steps:   22020 Episode Steps:    11 Return: -1.0000 FPS: 32.49
12:18:28.431 [INFO] (irl_trainer.py:73) 

12:18:58.474 [INFO] (irl_trainer.py:73) Total Epi:  1619 Steps:   23003 Episode Steps:    19 Return: -1.0000 FPS: 30.09
12:18:58.936 [INFO] (irl_trainer.py:73) Total Epi:  1620 Steps:   23018 Episode Steps:    15 Return:  1.0000 FPS: 32.60
12:18:59.455 [INFO] (irl_trainer.py:73) Total Epi:  1621 Steps:   23035 Episode Steps:    17 Return:  1.0000 FPS: 32.88
12:19:00.136 [INFO] (irl_trainer.py:73) Total Epi:  1622 Steps:   23057 Episode Steps:    22 Return:  1.0000 FPS: 32.41
12:19:00.691 [INFO] (irl_trainer.py:73) Total Epi:  1623 Steps:   23075 Episode Steps:    18 Return:  1.0000 FPS: 32.49
12:19:01.093 [INFO] (irl_trainer.py:73) Total Epi:  1624 Steps:   23089 Episode Steps:    14 Return: -1.0000 FPS: 35.03
12:19:01.296 [INFO] (irl_trainer.py:73) Total Epi:  1625 Steps:   23095 Episode Steps:     6 Return: -1.0000 FPS: 29.81
12:19:01.965 [INFO] (irl_trainer.py:73) Total Epi:  1626 Steps:   23117 Episode Steps:    22 Return:  1.0000 FPS: 32.95
12:19:02.304 [INFO] (irl_trainer.py:73) 

12:19:32.268 [INFO] (irl_trainer.py:73) Total Epi:  1688 Steps:   24099 Episode Steps:    14 Return:  1.0000 FPS: 32.89
12:19:32.612 [INFO] (irl_trainer.py:73) Total Epi:  1689 Steps:   24110 Episode Steps:    11 Return: -1.0000 FPS: 32.23
12:19:32.906 [INFO] (irl_trainer.py:73) Total Epi:  1690 Steps:   24120 Episode Steps:    10 Return: -1.0000 FPS: 34.27
12:19:33.510 [INFO] (irl_trainer.py:73) Total Epi:  1691 Steps:   24140 Episode Steps:    20 Return:  1.0000 FPS: 33.19
12:19:33.912 [INFO] (irl_trainer.py:73) Total Epi:  1692 Steps:   24153 Episode Steps:    13 Return:  0.0000 FPS: 32.55
12:19:34.278 [INFO] (irl_trainer.py:73) Total Epi:  1693 Steps:   24165 Episode Steps:    12 Return: -1.0000 FPS: 32.93
12:19:34.782 [INFO] (irl_trainer.py:73) Total Epi:  1694 Steps:   24182 Episode Steps:    17 Return:  1.0000 FPS: 33.88
12:19:35.326 [INFO] (irl_trainer.py:73) Total Epi:  1695 Steps:   24200 Episode Steps:    18 Return:  1.0000 FPS: 33.22
12:19:35.699 [INFO] (irl_trainer.py:73) 

12:20:08.201 [INFO] (irl_trainer.py:73) Total Epi:  1756 Steps:   25258 Episode Steps:    18 Return:  1.0000 FPS: 32.96
12:20:08.705 [INFO] (irl_trainer.py:73) Total Epi:  1757 Steps:   25275 Episode Steps:    17 Return:  1.0000 FPS: 33.90
12:20:09.287 [INFO] (irl_trainer.py:73) Total Epi:  1758 Steps:   25294 Episode Steps:    19 Return:  1.0000 FPS: 32.71
12:20:09.899 [INFO] (irl_trainer.py:73) Total Epi:  1759 Steps:   25314 Episode Steps:    20 Return:  1.0000 FPS: 32.81
12:20:10.474 [INFO] (irl_trainer.py:73) Total Epi:  1760 Steps:   25332 Episode Steps:    18 Return:  1.0000 FPS: 31.43
12:20:10.784 [INFO] (irl_trainer.py:73) Total Epi:  1761 Steps:   25342 Episode Steps:    10 Return: -1.0000 FPS: 32.53
12:20:11.397 [INFO] (irl_trainer.py:73) Total Epi:  1762 Steps:   25362 Episode Steps:    20 Return:  1.0000 FPS: 32.71
12:20:11.840 [INFO] (irl_trainer.py:73) Total Epi:  1763 Steps:   25377 Episode Steps:    15 Return:  1.0000 FPS: 33.95
12:20:12.281 [INFO] (irl_trainer.py:73) 

12:20:41.909 [INFO] (irl_trainer.py:73) Total Epi:  1824 Steps:   26356 Episode Steps:    17 Return:  1.0000 FPS: 32.48
12:20:42.266 [INFO] (irl_trainer.py:73) Total Epi:  1825 Steps:   26368 Episode Steps:    12 Return: -1.0000 FPS: 33.81
12:20:42.929 [INFO] (irl_trainer.py:73) Total Epi:  1826 Steps:   26389 Episode Steps:    21 Return:  1.0000 FPS: 31.72
12:20:43.337 [INFO] (irl_trainer.py:73) Total Epi:  1827 Steps:   26402 Episode Steps:    13 Return: -1.0000 FPS: 32.06
12:20:43.707 [INFO] (irl_trainer.py:73) Total Epi:  1828 Steps:   26414 Episode Steps:    12 Return:  1.0000 FPS: 32.60
12:20:44.073 [INFO] (irl_trainer.py:73) Total Epi:  1829 Steps:   26426 Episode Steps:    12 Return:  1.0000 FPS: 32.98
12:20:44.782 [INFO] (irl_trainer.py:73) Total Epi:  1830 Steps:   26449 Episode Steps:    23 Return:  1.0000 FPS: 32.54
12:20:45.307 [INFO] (irl_trainer.py:73) Total Epi:  1831 Steps:   26466 Episode Steps:    17 Return:  1.0000 FPS: 32.50
12:20:45.844 [INFO] (irl_trainer.py:73) 

12:21:16.363 [INFO] (irl_trainer.py:73) Total Epi:  1893 Steps:   27466 Episode Steps:    20 Return:  1.0000 FPS: 31.78
12:21:16.971 [INFO] (irl_trainer.py:73) Total Epi:  1894 Steps:   27486 Episode Steps:    20 Return:  1.0000 FPS: 33.00
12:21:17.491 [INFO] (irl_trainer.py:73) Total Epi:  1895 Steps:   27502 Episode Steps:    16 Return:  1.0000 FPS: 30.86
12:21:17.837 [INFO] (irl_trainer.py:73) Total Epi:  1896 Steps:   27513 Episode Steps:    11 Return: -1.0000 FPS: 32.00
12:21:18.438 [INFO] (irl_trainer.py:73) Total Epi:  1897 Steps:   27532 Episode Steps:    19 Return:  1.0000 FPS: 31.75
12:21:19.018 [INFO] (irl_trainer.py:73) Total Epi:  1898 Steps:   27552 Episode Steps:    20 Return:  1.0000 FPS: 34.64
12:21:19.522 [INFO] (irl_trainer.py:73) Total Epi:  1899 Steps:   27569 Episode Steps:    17 Return:  1.0000 FPS: 33.87
12:21:19.999 [INFO] (irl_trainer.py:73) Total Epi:  1900 Steps:   27585 Episode Steps:    16 Return:  1.0000 FPS: 33.66
12:21:20.240 [INFO] (irl_trainer.py:119)

12:21:49.665 [INFO] (irl_trainer.py:73) Total Epi:  1961 Steps:   28547 Episode Steps:    17 Return:  1.0000 FPS: 32.13
12:21:50.115 [INFO] (irl_trainer.py:73) Total Epi:  1962 Steps:   28562 Episode Steps:    15 Return:  1.0000 FPS: 33.48
12:21:50.703 [INFO] (irl_trainer.py:73) Total Epi:  1963 Steps:   28581 Episode Steps:    19 Return:  1.0000 FPS: 32.43
12:21:51.181 [INFO] (irl_trainer.py:73) Total Epi:  1964 Steps:   28597 Episode Steps:    16 Return:  1.0000 FPS: 33.64
12:21:51.634 [INFO] (irl_trainer.py:73) Total Epi:  1965 Steps:   28611 Episode Steps:    14 Return:  1.0000 FPS: 31.06
12:21:52.056 [INFO] (irl_trainer.py:73) Total Epi:  1966 Steps:   28625 Episode Steps:    14 Return:  1.0000 FPS: 33.31
12:21:52.500 [INFO] (irl_trainer.py:73) Total Epi:  1967 Steps:   28640 Episode Steps:    15 Return:  1.0000 FPS: 33.98
12:21:52.801 [INFO] (irl_trainer.py:73) Total Epi:  1968 Steps:   28650 Episode Steps:    10 Return: -1.0000 FPS: 33.47
12:21:53.302 [INFO] (irl_trainer.py:73) 

12:22:24.434 [INFO] (irl_trainer.py:73) Total Epi:  2029 Steps:   29666 Episode Steps:    22 Return:  1.0000 FPS: 32.48
12:22:24.878 [INFO] (irl_trainer.py:73) Total Epi:  2030 Steps:   29681 Episode Steps:    15 Return:  1.0000 FPS: 33.94
12:22:25.363 [INFO] (irl_trainer.py:73) Total Epi:  2031 Steps:   29697 Episode Steps:    16 Return:  1.0000 FPS: 33.17
12:22:25.858 [INFO] (irl_trainer.py:73) Total Epi:  2032 Steps:   29713 Episode Steps:    16 Return:  1.0000 FPS: 32.45
12:22:26.178 [INFO] (irl_trainer.py:73) Total Epi:  2033 Steps:   29724 Episode Steps:    11 Return: -1.0000 FPS: 34.62
12:22:26.713 [INFO] (irl_trainer.py:73) Total Epi:  2034 Steps:   29741 Episode Steps:    17 Return:  1.0000 FPS: 31.94
12:22:27.199 [INFO] (irl_trainer.py:73) Total Epi:  2035 Steps:   29757 Episode Steps:    16 Return:  1.0000 FPS: 33.00
12:22:27.731 [INFO] (irl_trainer.py:73) Total Epi:  2036 Steps:   29774 Episode Steps:    17 Return:  1.0000 FPS: 32.06
12:22:28.398 [INFO] (irl_trainer.py:73) 

12:22:59.746 [INFO] (irl_trainer.py:73) Total Epi:  2098 Steps:   30803 Episode Steps:    16 Return:  1.0000 FPS: 31.68
12:23:00.221 [INFO] (irl_trainer.py:73) Total Epi:  2099 Steps:   30819 Episode Steps:    16 Return:  1.0000 FPS: 33.87
12:23:00.692 [INFO] (irl_trainer.py:73) Total Epi:  2100 Steps:   30834 Episode Steps:    15 Return:  1.0000 FPS: 32.00
12:23:00.915 [INFO] (irl_trainer.py:119) Evaluation Total Steps:   30834 Average Reward  0.6000 / Average Step Count  16.0 over  5 episodes
12:23:00.926 [INFO] (irl_trainer.py:73) Total Epi:  2101 Steps:   30835 Episode Steps:     1 Return:  1.0000 FPS:  4.30
12:23:01.267 [INFO] (irl_trainer.py:73) Total Epi:  2102 Steps:   30846 Episode Steps:    11 Return:  1.0000 FPS: 32.51
12:23:01.925 [INFO] (irl_trainer.py:73) Total Epi:  2103 Steps:   30866 Episode Steps:    20 Return:  1.0000 FPS: 30.48
12:23:02.541 [INFO] (irl_trainer.py:73) Total Epi:  2104 Steps:   30886 Episode Steps:    20 Return:  1.0000 FPS: 32.58
12:23:02.779 [INFO] 

12:23:33.777 [INFO] (irl_trainer.py:73) Total Epi:  2166 Steps:   31903 Episode Steps:    21 Return:  1.0000 FPS: 31.02
12:23:34.201 [INFO] (irl_trainer.py:73) Total Epi:  2167 Steps:   31917 Episode Steps:    14 Return:  1.0000 FPS: 33.12
12:23:34.584 [INFO] (irl_trainer.py:73) Total Epi:  2168 Steps:   31929 Episode Steps:    12 Return:  1.0000 FPS: 31.54
12:23:35.122 [INFO] (irl_trainer.py:73) Total Epi:  2169 Steps:   31946 Episode Steps:    17 Return:  1.0000 FPS: 31.71
12:23:35.691 [INFO] (irl_trainer.py:73) Total Epi:  2170 Steps:   31965 Episode Steps:    19 Return:  1.0000 FPS: 33.53
12:23:36.199 [INFO] (irl_trainer.py:73) Total Epi:  2171 Steps:   31981 Episode Steps:    16 Return:  1.0000 FPS: 31.63
12:23:36.702 [INFO] (irl_trainer.py:73) Total Epi:  2172 Steps:   31997 Episode Steps:    16 Return:  1.0000 FPS: 31.93
12:23:37.427 [INFO] (irl_trainer.py:73) Total Epi:  2173 Steps:   32020 Episode Steps:    23 Return:  1.0000 FPS: 31.81
12:23:37.989 [INFO] (irl_trainer.py:73) 

12:24:10.683 [INFO] (irl_trainer.py:73) Total Epi:  2234 Steps:   33075 Episode Steps:    18 Return:  1.0000 FPS: 33.69
12:24:11.267 [INFO] (irl_trainer.py:73) Total Epi:  2235 Steps:   33093 Episode Steps:    18 Return:  1.0000 FPS: 30.97
12:24:11.613 [INFO] (irl_trainer.py:73) Total Epi:  2236 Steps:   33103 Episode Steps:    10 Return:  1.0000 FPS: 29.05
12:24:12.177 [INFO] (irl_trainer.py:73) Total Epi:  2237 Steps:   33122 Episode Steps:    19 Return:  1.0000 FPS: 33.82
12:24:12.618 [INFO] (irl_trainer.py:73) Total Epi:  2238 Steps:   33137 Episode Steps:    15 Return:  1.0000 FPS: 34.10
12:24:13.101 [INFO] (irl_trainer.py:73) Total Epi:  2239 Steps:   33153 Episode Steps:    16 Return:  1.0000 FPS: 33.22
12:24:13.580 [INFO] (irl_trainer.py:73) Total Epi:  2240 Steps:   33169 Episode Steps:    16 Return: -1.0000 FPS: 33.57
12:24:14.179 [INFO] (irl_trainer.py:73) Total Epi:  2241 Steps:   33188 Episode Steps:    19 Return:  1.0000 FPS: 31.84
12:24:14.739 [INFO] (irl_trainer.py:73) 

12:24:44.065 [INFO] (irl_trainer.py:73) Total Epi:  2302 Steps:   34139 Episode Steps:    15 Return:  1.0000 FPS: 30.77
12:24:44.417 [INFO] (irl_trainer.py:73) Total Epi:  2303 Steps:   34151 Episode Steps:    12 Return:  1.0000 FPS: 34.33
12:24:44.736 [INFO] (irl_trainer.py:73) Total Epi:  2304 Steps:   34162 Episode Steps:    11 Return:  1.0000 FPS: 34.66
12:24:45.319 [INFO] (irl_trainer.py:73) Total Epi:  2305 Steps:   34180 Episode Steps:    18 Return:  1.0000 FPS: 31.01
12:24:45.841 [INFO] (irl_trainer.py:73) Total Epi:  2306 Steps:   34197 Episode Steps:    17 Return:  1.0000 FPS: 32.72
12:24:46.442 [INFO] (irl_trainer.py:73) Total Epi:  2307 Steps:   34216 Episode Steps:    19 Return:  1.0000 FPS: 31.71
12:24:46.863 [INFO] (irl_trainer.py:73) Total Epi:  2308 Steps:   34230 Episode Steps:    14 Return:  1.0000 FPS: 33.39
12:24:47.426 [INFO] (irl_trainer.py:73) Total Epi:  2309 Steps:   34248 Episode Steps:    18 Return:  1.0000 FPS: 32.07
12:24:48.015 [INFO] (irl_trainer.py:73) 

12:25:18.727 [INFO] (irl_trainer.py:73) Total Epi:  2371 Steps:   35273 Episode Steps:    17 Return:  1.0000 FPS: 35.08
12:25:19.345 [INFO] (irl_trainer.py:73) Total Epi:  2372 Steps:   35293 Episode Steps:    20 Return:  1.0000 FPS: 32.47
12:25:20.058 [INFO] (irl_trainer.py:73) Total Epi:  2373 Steps:   35315 Episode Steps:    22 Return:  1.0000 FPS: 30.99
12:25:20.631 [INFO] (irl_trainer.py:73) Total Epi:  2374 Steps:   35333 Episode Steps:    18 Return:  1.0000 FPS: 31.56
12:25:21.115 [INFO] (irl_trainer.py:73) Total Epi:  2375 Steps:   35349 Episode Steps:    16 Return:  1.0000 FPS: 33.22
12:25:21.510 [INFO] (irl_trainer.py:73) Total Epi:  2376 Steps:   35362 Episode Steps:    13 Return:  1.0000 FPS: 33.04
12:25:21.984 [INFO] (irl_trainer.py:73) Total Epi:  2377 Steps:   35377 Episode Steps:    15 Return:  1.0000 FPS: 31.77
12:25:22.684 [INFO] (irl_trainer.py:73) Total Epi:  2378 Steps:   35400 Episode Steps:    23 Return:  1.0000 FPS: 32.91
12:25:23.165 [INFO] (irl_trainer.py:73) 

12:25:53.183 [INFO] (irl_trainer.py:73) Total Epi:  2439 Steps:   36372 Episode Steps:    15 Return:  1.0000 FPS: 32.98
12:25:53.714 [INFO] (irl_trainer.py:73) Total Epi:  2440 Steps:   36389 Episode Steps:    17 Return:  1.0000 FPS: 32.12
12:25:54.276 [INFO] (irl_trainer.py:73) Total Epi:  2441 Steps:   36406 Episode Steps:    17 Return:  1.0000 FPS: 30.35
12:25:54.784 [INFO] (irl_trainer.py:73) Total Epi:  2442 Steps:   36422 Episode Steps:    16 Return:  1.0000 FPS: 31.62
12:25:55.316 [INFO] (irl_trainer.py:73) Total Epi:  2443 Steps:   36440 Episode Steps:    18 Return:  1.0000 FPS: 33.99
12:25:55.794 [INFO] (irl_trainer.py:73) Total Epi:  2444 Steps:   36456 Episode Steps:    16 Return:  1.0000 FPS: 33.61
12:25:56.253 [INFO] (irl_trainer.py:73) Total Epi:  2445 Steps:   36471 Episode Steps:    15 Return:  1.0000 FPS: 32.82
12:25:56.775 [INFO] (irl_trainer.py:73) Total Epi:  2446 Steps:   36489 Episode Steps:    18 Return:  1.0000 FPS: 34.57
12:25:57.394 [INFO] (irl_trainer.py:73) 

12:26:28.754 [INFO] (irl_trainer.py:73) Total Epi:  2507 Steps:   37511 Episode Steps:    16 Return:  1.0000 FPS: 30.76
12:26:29.087 [INFO] (irl_trainer.py:73) Total Epi:  2508 Steps:   37522 Episode Steps:    11 Return: -1.0000 FPS: 33.32
12:26:29.624 [INFO] (irl_trainer.py:73) Total Epi:  2509 Steps:   37539 Episode Steps:    17 Return:  1.0000 FPS: 31.78
12:26:30.145 [INFO] (irl_trainer.py:73) Total Epi:  2510 Steps:   37556 Episode Steps:    17 Return:  1.0000 FPS: 32.72
12:26:30.625 [INFO] (irl_trainer.py:73) Total Epi:  2511 Steps:   37572 Episode Steps:    16 Return:  1.0000 FPS: 33.46
12:26:31.168 [INFO] (irl_trainer.py:73) Total Epi:  2512 Steps:   37590 Episode Steps:    18 Return:  1.0000 FPS: 33.28
12:26:31.783 [INFO] (irl_trainer.py:73) Total Epi:  2513 Steps:   37609 Episode Steps:    19 Return:  1.0000 FPS: 30.98
12:26:32.440 [INFO] (irl_trainer.py:73) Total Epi:  2514 Steps:   37631 Episode Steps:    22 Return: -1.0000 FPS: 33.58
12:26:33.093 [INFO] (irl_trainer.py:73) 

12:27:04.932 [INFO] (irl_trainer.py:73) Total Epi:  2576 Steps:   38664 Episode Steps:    20 Return:  1.0000 FPS: 32.35
12:27:05.607 [INFO] (irl_trainer.py:73) Total Epi:  2577 Steps:   38685 Episode Steps:    21 Return:  1.0000 FPS: 31.20
12:27:06.140 [INFO] (irl_trainer.py:73) Total Epi:  2578 Steps:   38702 Episode Steps:    17 Return:  1.0000 FPS: 32.06
12:27:06.618 [INFO] (irl_trainer.py:73) Total Epi:  2579 Steps:   38718 Episode Steps:    16 Return:  1.0000 FPS: 33.59
12:27:06.809 [INFO] (irl_trainer.py:73) Total Epi:  2580 Steps:   38724 Episode Steps:     6 Return: -1.0000 FPS: 31.68
12:27:07.393 [INFO] (irl_trainer.py:73) Total Epi:  2581 Steps:   38743 Episode Steps:    19 Return:  1.0000 FPS: 32.66
12:27:07.711 [INFO] (irl_trainer.py:73) Total Epi:  2582 Steps:   38753 Episode Steps:    10 Return:  1.0000 FPS: 31.58
12:27:08.357 [INFO] (irl_trainer.py:73) Total Epi:  2583 Steps:   38774 Episode Steps:    21 Return:  1.0000 FPS: 32.62
12:27:08.941 [INFO] (irl_trainer.py:73) 

12:27:38.848 [INFO] (irl_trainer.py:73) Total Epi:  2644 Steps:   39754 Episode Steps:    17 Return:  1.0000 FPS: 33.00
12:27:39.405 [INFO] (irl_trainer.py:73) Total Epi:  2645 Steps:   39771 Episode Steps:    17 Return:  1.0000 FPS: 30.65
12:27:39.942 [INFO] (irl_trainer.py:73) Total Epi:  2646 Steps:   39788 Episode Steps:    17 Return:  1.0000 FPS: 31.77
12:27:40.296 [INFO] (irl_trainer.py:73) Total Epi:  2647 Steps:   39800 Episode Steps:    12 Return: -1.0000 FPS: 34.13
12:27:40.805 [INFO] (irl_trainer.py:73) Total Epi:  2648 Steps:   39816 Episode Steps:    16 Return:  1.0000 FPS: 31.56
12:27:41.321 [INFO] (irl_trainer.py:73) Total Epi:  2649 Steps:   39833 Episode Steps:    17 Return:  1.0000 FPS: 33.16
12:27:41.937 [INFO] (irl_trainer.py:73) Total Epi:  2650 Steps:   39854 Episode Steps:    21 Return: -1.0000 FPS: 34.13
12:27:42.484 [INFO] (irl_trainer.py:73) Total Epi:  2651 Steps:   39873 Episode Steps:    19 Return:  1.0000 FPS: 34.91
12:27:42.947 [INFO] (irl_trainer.py:73) 

12:28:15.313 [INFO] (irl_trainer.py:73) Total Epi:  2712 Steps:   40920 Episode Steps:    19 Return:  1.0000 FPS: 32.93
12:28:15.821 [INFO] (irl_trainer.py:73) Total Epi:  2713 Steps:   40937 Episode Steps:    17 Return:  1.0000 FPS: 33.62
12:28:16.354 [INFO] (irl_trainer.py:73) Total Epi:  2714 Steps:   40954 Episode Steps:    17 Return:  1.0000 FPS: 32.00
12:28:16.852 [INFO] (irl_trainer.py:73) Total Epi:  2715 Steps:   40970 Episode Steps:    16 Return:  1.0000 FPS: 32.31
12:28:17.430 [INFO] (irl_trainer.py:73) Total Epi:  2716 Steps:   40989 Episode Steps:    19 Return:  1.0000 FPS: 33.00
12:28:18.023 [INFO] (irl_trainer.py:73) Total Epi:  2717 Steps:   41008 Episode Steps:    19 Return:  1.0000 FPS: 32.17
12:28:18.583 [INFO] (irl_trainer.py:73) Total Epi:  2718 Steps:   41026 Episode Steps:    18 Return:  1.0000 FPS: 32.27
12:28:19.054 [INFO] (irl_trainer.py:73) Total Epi:  2719 Steps:   41041 Episode Steps:    15 Return:  1.0000 FPS: 31.99
12:28:19.558 [INFO] (irl_trainer.py:73) 

12:28:50.590 [INFO] (irl_trainer.py:73) Total Epi:  2781 Steps:   42049 Episode Steps:    19 Return:  1.0000 FPS: 32.89
12:28:51.022 [INFO] (irl_trainer.py:73) Total Epi:  2782 Steps:   42063 Episode Steps:    14 Return:  1.0000 FPS: 32.48
12:28:51.393 [INFO] (irl_trainer.py:73) Total Epi:  2783 Steps:   42075 Episode Steps:    12 Return:  1.0000 FPS: 32.53
12:28:51.739 [INFO] (irl_trainer.py:73) Total Epi:  2784 Steps:   42086 Episode Steps:    11 Return:  1.0000 FPS: 31.93
12:28:52.305 [INFO] (irl_trainer.py:73) Total Epi:  2785 Steps:   42104 Episode Steps:    18 Return:  1.0000 FPS: 31.95
12:28:52.900 [INFO] (irl_trainer.py:73) Total Epi:  2786 Steps:   42123 Episode Steps:    19 Return:  1.0000 FPS: 32.04
12:28:53.475 [INFO] (irl_trainer.py:73) Total Epi:  2787 Steps:   42142 Episode Steps:    19 Return:  1.0000 FPS: 33.17
12:28:53.932 [INFO] (irl_trainer.py:73) Total Epi:  2788 Steps:   42157 Episode Steps:    15 Return:  1.0000 FPS: 32.96
12:28:54.477 [INFO] (irl_trainer.py:73) 

12:29:24.706 [INFO] (irl_trainer.py:73) Total Epi:  2849 Steps:   43143 Episode Steps:    18 Return:  1.0000 FPS: 32.86
12:29:25.304 [INFO] (irl_trainer.py:73) Total Epi:  2850 Steps:   43163 Episode Steps:    20 Return: -1.0000 FPS: 33.55
12:29:25.728 [INFO] (irl_trainer.py:73) Total Epi:  2851 Steps:   43177 Episode Steps:    14 Return:  1.0000 FPS: 33.14
12:29:26.186 [INFO] (irl_trainer.py:73) Total Epi:  2852 Steps:   43192 Episode Steps:    15 Return:  1.0000 FPS: 32.95
12:29:26.912 [INFO] (irl_trainer.py:73) Total Epi:  2853 Steps:   43215 Episode Steps:    23 Return:  1.0000 FPS: 31.78
12:29:27.410 [INFO] (irl_trainer.py:73) Total Epi:  2854 Steps:   43231 Episode Steps:    16 Return:  1.0000 FPS: 32.25
12:29:27.919 [INFO] (irl_trainer.py:73) Total Epi:  2855 Steps:   43247 Episode Steps:    16 Return:  1.0000 FPS: 31.55
12:29:28.589 [INFO] (irl_trainer.py:73) Total Epi:  2856 Steps:   43268 Episode Steps:    21 Return:  1.0000 FPS: 31.43
12:29:29.219 [INFO] (irl_trainer.py:73) 

12:29:59.922 [INFO] (irl_trainer.py:73) Total Epi:  2917 Steps:   44270 Episode Steps:    16 Return:  1.0000 FPS: 32.90
12:30:00.480 [INFO] (irl_trainer.py:73) Total Epi:  2918 Steps:   44288 Episode Steps:    18 Return:  1.0000 FPS: 32.34
12:30:00.952 [INFO] (irl_trainer.py:73) Total Epi:  2919 Steps:   44303 Episode Steps:    15 Return:  1.0000 FPS: 32.10
12:30:01.437 [INFO] (irl_trainer.py:73) Total Epi:  2920 Steps:   44319 Episode Steps:    16 Return:  1.0000 FPS: 33.15
12:30:02.035 [INFO] (irl_trainer.py:73) Total Epi:  2921 Steps:   44338 Episode Steps:    19 Return:  1.0000 FPS: 31.85
12:30:02.404 [INFO] (irl_trainer.py:73) Total Epi:  2922 Steps:   44350 Episode Steps:    12 Return: -1.0000 FPS: 32.71
12:30:02.775 [INFO] (irl_trainer.py:73) Total Epi:  2923 Steps:   44362 Episode Steps:    12 Return: -1.0000 FPS: 32.58
12:30:03.212 [INFO] (irl_trainer.py:73) Total Epi:  2924 Steps:   44375 Episode Steps:    13 Return:  1.0000 FPS: 29.87
12:30:03.803 [INFO] (irl_trainer.py:73) 

12:30:36.422 [INFO] (irl_trainer.py:73) Total Epi:  2986 Steps:   45431 Episode Steps:    17 Return:  1.0000 FPS: 31.37
12:30:36.908 [INFO] (irl_trainer.py:73) Total Epi:  2987 Steps:   45447 Episode Steps:    16 Return:  1.0000 FPS: 33.07
12:30:37.245 [INFO] (irl_trainer.py:73) Total Epi:  2988 Steps:   45458 Episode Steps:    11 Return:  1.0000 FPS: 32.80
12:30:37.802 [INFO] (irl_trainer.py:73) Total Epi:  2989 Steps:   45475 Episode Steps:    17 Return:  1.0000 FPS: 30.64
12:30:38.445 [INFO] (irl_trainer.py:73) Total Epi:  2990 Steps:   45496 Episode Steps:    21 Return:  1.0000 FPS: 32.74
12:30:39.006 [INFO] (irl_trainer.py:73) Total Epi:  2991 Steps:   45514 Episode Steps:    18 Return:  1.0000 FPS: 32.23
12:30:39.579 [INFO] (irl_trainer.py:73) Total Epi:  2992 Steps:   45533 Episode Steps:    19 Return:  1.0000 FPS: 33.25
12:30:40.202 [INFO] (irl_trainer.py:73) Total Epi:  2993 Steps:   45553 Episode Steps:    20 Return:  1.0000 FPS: 32.25
12:30:40.791 [INFO] (irl_trainer.py:73) 

12:31:11.344 [INFO] (irl_trainer.py:73) Total Epi:  3054 Steps:   46554 Episode Steps:    14 Return:  1.0000 FPS: 32.79
12:31:11.495 [INFO] (irl_trainer.py:73) Total Epi:  3055 Steps:   46559 Episode Steps:     5 Return: -1.0000 FPS: 33.64
12:31:12.093 [INFO] (irl_trainer.py:73) Total Epi:  3056 Steps:   46578 Episode Steps:    19 Return:  1.0000 FPS: 31.89
12:31:12.392 [INFO] (irl_trainer.py:73) Total Epi:  3057 Steps:   46588 Episode Steps:    10 Return:  1.0000 FPS: 33.65
12:31:13.054 [INFO] (irl_trainer.py:73) Total Epi:  3058 Steps:   46609 Episode Steps:    21 Return:  1.0000 FPS: 31.83
12:31:13.644 [INFO] (irl_trainer.py:73) Total Epi:  3059 Steps:   46628 Episode Steps:    19 Return:  1.0000 FPS: 32.32
12:31:13.858 [INFO] (irl_trainer.py:73) Total Epi:  3060 Steps:   46635 Episode Steps:     7 Return: -1.0000 FPS: 32.96
12:31:14.349 [INFO] (irl_trainer.py:73) Total Epi:  3061 Steps:   46651 Episode Steps:    16 Return:  1.0000 FPS: 32.75
12:31:14.686 [INFO] (irl_trainer.py:73) 

12:31:45.857 [INFO] (irl_trainer.py:73) Total Epi:  3122 Steps:   47659 Episode Steps:    13 Return:  1.0000 FPS: 33.02
12:31:46.312 [INFO] (irl_trainer.py:73) Total Epi:  3123 Steps:   47674 Episode Steps:    15 Return:  1.0000 FPS: 33.10
12:31:46.832 [INFO] (irl_trainer.py:73) Total Epi:  3124 Steps:   47691 Episode Steps:    17 Return:  1.0000 FPS: 32.80
12:31:47.506 [INFO] (irl_trainer.py:73) Total Epi:  3125 Steps:   47712 Episode Steps:    21 Return: -1.0000 FPS: 31.30
12:31:48.104 [INFO] (irl_trainer.py:73) Total Epi:  3126 Steps:   47732 Episode Steps:    20 Return:  1.0000 FPS: 33.51
12:31:48.496 [INFO] (irl_trainer.py:73) Total Epi:  3127 Steps:   47745 Episode Steps:    13 Return: -1.0000 FPS: 33.40
12:31:49.091 [INFO] (irl_trainer.py:73) Total Epi:  3128 Steps:   47765 Episode Steps:    20 Return:  1.0000 FPS: 33.71
12:31:49.439 [INFO] (irl_trainer.py:73) Total Epi:  3129 Steps:   47776 Episode Steps:    11 Return: -1.0000 FPS: 31.81
12:31:49.828 [INFO] (irl_trainer.py:73) 

12:32:24.482 [INFO] (irl_trainer.py:73) Total Epi:  3191 Steps:   48905 Episode Steps:    10 Return:  1.0000 FPS: 30.88
12:32:25.149 [INFO] (irl_trainer.py:73) Total Epi:  3192 Steps:   48926 Episode Steps:    21 Return:  1.0000 FPS: 31.57
12:32:25.551 [INFO] (irl_trainer.py:73) Total Epi:  3193 Steps:   48939 Episode Steps:    13 Return:  1.0000 FPS: 32.51
12:32:26.053 [INFO] (irl_trainer.py:73) Total Epi:  3194 Steps:   48956 Episode Steps:    17 Return:  1.0000 FPS: 33.99
12:32:26.499 [INFO] (irl_trainer.py:73) Total Epi:  3195 Steps:   48971 Episode Steps:    15 Return:  1.0000 FPS: 33.75
12:32:27.127 [INFO] (irl_trainer.py:73) Total Epi:  3196 Steps:   48991 Episode Steps:    20 Return:  1.0000 FPS: 31.93
12:32:27.613 [INFO] (irl_trainer.py:73) Total Epi:  3197 Steps:   49006 Episode Steps:    15 Return:  1.0000 FPS: 31.04
12:32:28.120 [INFO] (irl_trainer.py:73) Total Epi:  3198 Steps:   49023 Episode Steps:    17 Return:  1.0000 FPS: 33.60
12:32:28.512 [INFO] (irl_trainer.py:73) 

12:32:56.100 [INFO] (irl_trainer.py:73) Total Epi:  3259 Steps:   49923 Episode Steps:    11 Return:  1.0000 FPS: 31.49
12:32:56.647 [INFO] (irl_trainer.py:73) Total Epi:  3260 Steps:   49941 Episode Steps:    18 Return:  1.0000 FPS: 32.98
12:32:56.733 [INFO] (irl_trainer.py:73) Total Epi:  3261 Steps:   49944 Episode Steps:     3 Return: -1.0000 FPS: 35.81
12:32:57.330 [INFO] (irl_trainer.py:73) Total Epi:  3262 Steps:   49963 Episode Steps:    19 Return:  1.0000 FPS: 31.89
12:32:57.427 [INFO] (irl_trainer.py:73) Total Epi:  3263 Steps:   49966 Episode Steps:     3 Return: -1.0000 FPS: 31.80
12:32:58.055 [INFO] (irl_trainer.py:73) Total Epi:  3264 Steps:   49985 Episode Steps:    19 Return:  1.0000 FPS: 30.37
12:32:58.701 [INFO] (irl_trainer.py:73) Total Epi:  3265 Steps:   50005 Episode Steps:    20 Return:  1.0000 FPS: 31.09
12:32:58.846 [INFO] (irl_trainer.py:73) Total Epi:  3266 Steps:   50010 Episode Steps:     5 Return: -1.0000 FPS: 34.92
12:32:58.940 [INFO] (irl_trainer.py:73) 

12:33:22.236 [INFO] (irl_trainer.py:73) Total Epi:  3327 Steps:   50748 Episode Steps:    15 Return:  1.0000 FPS: 34.84
12:33:22.965 [INFO] (irl_trainer.py:73) Total Epi:  3328 Steps:   50772 Episode Steps:    24 Return:  1.0000 FPS: 32.96
12:33:23.090 [INFO] (irl_trainer.py:73) Total Epi:  3329 Steps:   50776 Episode Steps:     4 Return: -1.0000 FPS: 32.45
12:33:23.311 [INFO] (irl_trainer.py:73) Total Epi:  3330 Steps:   50783 Episode Steps:     7 Return: -1.0000 FPS: 31.92
12:33:24.119 [INFO] (irl_trainer.py:73) Total Epi:  3331 Steps:   50808 Episode Steps:    25 Return:  1.0000 FPS: 31.00
12:33:24.770 [INFO] (irl_trainer.py:73) Total Epi:  3332 Steps:   50829 Episode Steps:    21 Return:  1.0000 FPS: 32.38
12:33:24.871 [INFO] (irl_trainer.py:73) Total Epi:  3333 Steps:   50832 Episode Steps:     3 Return: -1.0000 FPS: 30.19
12:33:25.557 [INFO] (irl_trainer.py:73) Total Epi:  3334 Steps:   50854 Episode Steps:    22 Return:  1.0000 FPS: 32.18
12:33:26.098 [INFO] (irl_trainer.py:73) 

12:33:47.151 [INFO] (irl_trainer.py:73) Total Epi:  3396 Steps:   51549 Episode Steps:    16 Return:  1.0000 FPS: 34.98
12:33:47.622 [INFO] (irl_trainer.py:73) Total Epi:  3397 Steps:   51564 Episode Steps:    15 Return:  1.0000 FPS: 32.00
12:33:47.788 [INFO] (irl_trainer.py:73) Total Epi:  3398 Steps:   51569 Episode Steps:     5 Return: -1.0000 FPS: 30.48
12:33:47.916 [INFO] (irl_trainer.py:73) Total Epi:  3399 Steps:   51573 Episode Steps:     4 Return: -1.0000 FPS: 31.80
12:33:48.375 [INFO] (irl_trainer.py:73) Total Epi:  3400 Steps:   51588 Episode Steps:    15 Return:  1.0000 FPS: 32.84
12:33:48.532 [INFO] (irl_trainer.py:119) Evaluation Total Steps:   51588 Average Reward -0.2000 / Average Step Count  10.4 over  5 episodes
12:33:48.540 [INFO] (irl_trainer.py:73) Total Epi:  3401 Steps:   51589 Episode Steps:     1 Return:  1.0000 FPS:  6.12
12:33:48.641 [INFO] (irl_trainer.py:73) Total Epi:  3402 Steps:   51592 Episode Steps:     3 Return: -1.0000 FPS: 30.39
12:33:49.230 [INFO] 

12:34:13.371 [INFO] (irl_trainer.py:73) Total Epi:  3464 Steps:   52390 Episode Steps:    17 Return:  1.0000 FPS: 32.71
12:34:14.048 [INFO] (irl_trainer.py:73) Total Epi:  3465 Steps:   52411 Episode Steps:    21 Return:  1.0000 FPS: 31.13
12:34:14.161 [INFO] (irl_trainer.py:73) Total Epi:  3466 Steps:   52415 Episode Steps:     4 Return: -1.0000 FPS: 35.93
12:34:14.710 [INFO] (irl_trainer.py:73) Total Epi:  3467 Steps:   52433 Episode Steps:    18 Return:  1.0000 FPS: 32.86
12:34:15.341 [INFO] (irl_trainer.py:73) Total Epi:  3468 Steps:   52454 Episode Steps:    21 Return:  1.0000 FPS: 33.43
12:34:15.517 [INFO] (irl_trainer.py:73) Total Epi:  3469 Steps:   52460 Episode Steps:     6 Return: -1.0000 FPS: 34.36
12:34:16.269 [INFO] (irl_trainer.py:73) Total Epi:  3470 Steps:   52484 Episode Steps:    24 Return:  1.0000 FPS: 31.98
12:34:16.883 [INFO] (irl_trainer.py:73) Total Epi:  3471 Steps:   52503 Episode Steps:    19 Return:  1.0000 FPS: 31.08
12:34:17.647 [INFO] (irl_trainer.py:73) 

12:34:44.733 [INFO] (irl_trainer.py:73) Total Epi:  3532 Steps:   53389 Episode Steps:    11 Return:  1.0000 FPS: 30.95
12:34:45.392 [INFO] (irl_trainer.py:73) Total Epi:  3533 Steps:   53410 Episode Steps:    21 Return:  1.0000 FPS: 31.99
12:34:46.015 [INFO] (irl_trainer.py:73) Total Epi:  3534 Steps:   53430 Episode Steps:    20 Return:  1.0000 FPS: 32.16
12:34:46.252 [INFO] (irl_trainer.py:73) Total Epi:  3535 Steps:   53438 Episode Steps:     8 Return: -1.0000 FPS: 34.13
12:34:46.339 [INFO] (irl_trainer.py:73) Total Epi:  3536 Steps:   53441 Episode Steps:     3 Return: -1.0000 FPS: 35.15
12:34:46.701 [INFO] (irl_trainer.py:73) Total Epi:  3537 Steps:   53453 Episode Steps:    12 Return:  1.0000 FPS: 33.47
12:34:46.814 [INFO] (irl_trainer.py:73) Total Epi:  3538 Steps:   53457 Episode Steps:     4 Return: -1.0000 FPS: 35.68
12:34:47.433 [INFO] (irl_trainer.py:73) Total Epi:  3539 Steps:   53478 Episode Steps:    21 Return:  1.0000 FPS: 34.00
12:34:47.963 [INFO] (irl_trainer.py:73) 

12:35:17.320 [INFO] (irl_trainer.py:119) Evaluation Total Steps:   54436 Average Reward  0.2000 / Average Step Count  17.4 over  5 episodes
12:35:17.327 [INFO] (irl_trainer.py:73) Total Epi:  3601 Steps:   54437 Episode Steps:     1 Return:  1.0000 FPS:  4.02
12:35:17.818 [INFO] (irl_trainer.py:73) Total Epi:  3602 Steps:   54453 Episode Steps:    16 Return:  1.0000 FPS: 32.76
12:35:18.434 [INFO] (irl_trainer.py:73) Total Epi:  3603 Steps:   54473 Episode Steps:    20 Return:  1.0000 FPS: 32.56
12:35:18.890 [INFO] (irl_trainer.py:73) Total Epi:  3604 Steps:   54488 Episode Steps:    15 Return:  1.0000 FPS: 33.09
12:35:19.341 [INFO] (irl_trainer.py:73) Total Epi:  3605 Steps:   54502 Episode Steps:    14 Return:  1.0000 FPS: 31.11
12:35:19.762 [INFO] (irl_trainer.py:73) Total Epi:  3606 Steps:   54516 Episode Steps:    14 Return:  1.0000 FPS: 33.45
12:35:20.296 [INFO] (irl_trainer.py:73) Total Epi:  3607 Steps:   54533 Episode Steps:    17 Return:  1.0000 FPS: 31.99
12:35:20.902 [INFO] 

12:35:54.263 [INFO] (irl_trainer.py:73) Total Epi:  3669 Steps:   55623 Episode Steps:    16 Return:  1.0000 FPS: 31.98
12:35:54.762 [INFO] (irl_trainer.py:73) Total Epi:  3670 Steps:   55639 Episode Steps:    16 Return:  1.0000 FPS: 32.31
12:35:55.385 [INFO] (irl_trainer.py:73) Total Epi:  3671 Steps:   55660 Episode Steps:    21 Return:  1.0000 FPS: 33.77
12:35:55.960 [INFO] (irl_trainer.py:73) Total Epi:  3672 Steps:   55679 Episode Steps:    19 Return:  1.0000 FPS: 33.16
12:35:56.541 [INFO] (irl_trainer.py:73) Total Epi:  3673 Steps:   55698 Episode Steps:    19 Return:  1.0000 FPS: 32.77
12:35:56.946 [INFO] (irl_trainer.py:73) Total Epi:  3674 Steps:   55710 Episode Steps:    12 Return:  1.0000 FPS: 29.76
12:35:57.378 [INFO] (irl_trainer.py:73) Total Epi:  3675 Steps:   55724 Episode Steps:    14 Return:  1.0000 FPS: 32.53
12:35:58.059 [INFO] (irl_trainer.py:73) Total Epi:  3676 Steps:   55745 Episode Steps:    21 Return:  1.0000 FPS: 30.94
12:35:58.744 [INFO] (irl_trainer.py:73) 

12:36:28.950 [INFO] (irl_trainer.py:73) Total Epi:  3737 Steps:   56725 Episode Steps:    21 Return:  1.0000 FPS: 32.94
12:36:29.515 [INFO] (irl_trainer.py:73) Total Epi:  3738 Steps:   56744 Episode Steps:    19 Return: -1.0000 FPS: 33.74
12:36:30.060 [INFO] (irl_trainer.py:73) Total Epi:  3739 Steps:   56762 Episode Steps:    18 Return:  1.0000 FPS: 33.13
12:36:30.686 [INFO] (irl_trainer.py:73) Total Epi:  3740 Steps:   56782 Episode Steps:    20 Return:  1.0000 FPS: 32.01
12:36:31.144 [INFO] (irl_trainer.py:73) Total Epi:  3741 Steps:   56797 Episode Steps:    15 Return:  1.0000 FPS: 32.88
12:36:31.711 [INFO] (irl_trainer.py:73) Total Epi:  3742 Steps:   56814 Episode Steps:    17 Return:  1.0000 FPS: 30.11
12:36:32.251 [INFO] (irl_trainer.py:73) Total Epi:  3743 Steps:   56832 Episode Steps:    18 Return:  1.0000 FPS: 33.39
12:36:32.918 [INFO] (irl_trainer.py:73) Total Epi:  3744 Steps:   56854 Episode Steps:    22 Return:  1.0000 FPS: 33.13
12:36:33.540 [INFO] (irl_trainer.py:73) 

12:37:04.125 [INFO] (irl_trainer.py:73) Total Epi:  3805 Steps:   57851 Episode Steps:    12 Return:  1.0000 FPS: 31.45
12:37:04.789 [INFO] (irl_trainer.py:73) Total Epi:  3806 Steps:   57872 Episode Steps:    21 Return:  1.0000 FPS: 31.75
12:37:05.473 [INFO] (irl_trainer.py:73) Total Epi:  3807 Steps:   57893 Episode Steps:    21 Return:  1.0000 FPS: 30.78
12:37:05.996 [INFO] (irl_trainer.py:73) Total Epi:  3808 Steps:   57909 Episode Steps:    16 Return:  1.0000 FPS: 30.64
12:37:06.646 [INFO] (irl_trainer.py:73) Total Epi:  3809 Steps:   57931 Episode Steps:    22 Return:  1.0000 FPS: 33.96
12:37:07.217 [INFO] (irl_trainer.py:73) Total Epi:  3810 Steps:   57950 Episode Steps:    19 Return:  1.0000 FPS: 33.40
12:37:07.794 [INFO] (irl_trainer.py:73) Total Epi:  3811 Steps:   57968 Episode Steps:    18 Return:  1.0000 FPS: 31.30
12:37:08.355 [INFO] (irl_trainer.py:73) Total Epi:  3812 Steps:   57986 Episode Steps:    18 Return:  1.0000 FPS: 32.20
12:37:08.920 [INFO] (irl_trainer.py:73) 

12:37:39.743 [INFO] (irl_trainer.py:73) Total Epi:  3874 Steps:   58980 Episode Steps:    14 Return:  1.0000 FPS: 33.31
12:37:40.175 [INFO] (irl_trainer.py:73) Total Epi:  3875 Steps:   58994 Episode Steps:    14 Return:  1.0000 FPS: 32.56
12:37:40.787 [INFO] (irl_trainer.py:73) Total Epi:  3876 Steps:   59013 Episode Steps:    19 Return:  1.0000 FPS: 31.14
12:37:41.193 [INFO] (irl_trainer.py:73) Total Epi:  3877 Steps:   59027 Episode Steps:    14 Return:  1.0000 FPS: 34.67
12:37:41.627 [INFO] (irl_trainer.py:73) Total Epi:  3878 Steps:   59041 Episode Steps:    14 Return:  1.0000 FPS: 32.39
12:37:42.014 [INFO] (irl_trainer.py:73) Total Epi:  3879 Steps:   59054 Episode Steps:    13 Return:  1.0000 FPS: 33.77
12:37:42.409 [INFO] (irl_trainer.py:73) Total Epi:  3880 Steps:   59067 Episode Steps:    13 Return:  1.0000 FPS: 33.03
12:37:43.036 [INFO] (irl_trainer.py:73) Total Epi:  3881 Steps:   59087 Episode Steps:    20 Return:  1.0000 FPS: 32.01
12:37:43.507 [INFO] (irl_trainer.py:73) 

12:38:14.478 [INFO] (irl_trainer.py:73) Total Epi:  3942 Steps:   60093 Episode Steps:    19 Return:  1.0000 FPS: 31.35
12:38:15.127 [INFO] (irl_trainer.py:73) Total Epi:  3943 Steps:   60113 Episode Steps:    20 Return:  1.0000 FPS: 30.91
12:38:15.648 [INFO] (irl_trainer.py:73) Total Epi:  3944 Steps:   60130 Episode Steps:    17 Return:  1.0000 FPS: 32.72
12:38:16.275 [INFO] (irl_trainer.py:73) Total Epi:  3945 Steps:   60150 Episode Steps:    20 Return:  1.0000 FPS: 32.04
12:38:16.886 [INFO] (irl_trainer.py:73) Total Epi:  3946 Steps:   60169 Episode Steps:    19 Return:  1.0000 FPS: 31.23
12:38:17.556 [INFO] (irl_trainer.py:73) Total Epi:  3947 Steps:   60191 Episode Steps:    22 Return:  1.0000 FPS: 32.90
12:38:18.104 [INFO] (irl_trainer.py:73) Total Epi:  3948 Steps:   60208 Episode Steps:    17 Return:  1.0000 FPS: 31.17
12:38:18.651 [INFO] (irl_trainer.py:73) Total Epi:  3949 Steps:   60225 Episode Steps:    17 Return:  1.0000 FPS: 31.15
12:38:19.076 [INFO] (irl_trainer.py:73) 

12:38:49.894 [INFO] (irl_trainer.py:73) Total Epi:  4010 Steps:   61226 Episode Steps:     9 Return:  1.0000 FPS: 33.31
12:38:50.374 [INFO] (irl_trainer.py:73) Total Epi:  4011 Steps:   61242 Episode Steps:    16 Return:  1.0000 FPS: 33.45
12:38:50.742 [INFO] (irl_trainer.py:73) Total Epi:  4012 Steps:   61254 Episode Steps:    12 Return:  1.0000 FPS: 32.84
12:38:51.177 [INFO] (irl_trainer.py:73) Total Epi:  4013 Steps:   61268 Episode Steps:    14 Return:  1.0000 FPS: 32.31
12:38:51.756 [INFO] (irl_trainer.py:73) Total Epi:  4014 Steps:   61287 Episode Steps:    19 Return:  1.0000 FPS: 32.89
12:38:52.337 [INFO] (irl_trainer.py:73) Total Epi:  4015 Steps:   61305 Episode Steps:    18 Return:  1.0000 FPS: 31.08
12:38:53.033 [INFO] (irl_trainer.py:73) Total Epi:  4016 Steps:   61327 Episode Steps:    22 Return:  1.0000 FPS: 31.69
12:38:53.526 [INFO] (irl_trainer.py:73) Total Epi:  4017 Steps:   61343 Episode Steps:    16 Return:  1.0000 FPS: 32.60
12:38:54.219 [INFO] (irl_trainer.py:73) 

12:39:25.010 [INFO] (irl_trainer.py:73) Total Epi:  4079 Steps:   62360 Episode Steps:    17 Return:  1.0000 FPS: 34.14
12:39:25.671 [INFO] (irl_trainer.py:73) Total Epi:  4080 Steps:   62381 Episode Steps:    21 Return:  1.0000 FPS: 31.85
12:39:26.310 [INFO] (irl_trainer.py:73) Total Epi:  4081 Steps:   62401 Episode Steps:    20 Return:  1.0000 FPS: 31.39
12:39:26.778 [INFO] (irl_trainer.py:73) Total Epi:  4082 Steps:   62416 Episode Steps:    15 Return:  1.0000 FPS: 32.15
12:39:27.390 [INFO] (irl_trainer.py:73) Total Epi:  4083 Steps:   62436 Episode Steps:    20 Return:  1.0000 FPS: 32.75
12:39:27.793 [INFO] (irl_trainer.py:73) Total Epi:  4084 Steps:   62449 Episode Steps:    13 Return:  1.0000 FPS: 32.46
12:39:28.216 [INFO] (irl_trainer.py:73) Total Epi:  4085 Steps:   62463 Episode Steps:    14 Return:  1.0000 FPS: 33.24
12:39:28.666 [INFO] (irl_trainer.py:73) Total Epi:  4086 Steps:   62477 Episode Steps:    14 Return:  1.0000 FPS: 31.20
12:39:29.211 [INFO] (irl_trainer.py:73) 

12:40:03.758 [INFO] (irl_trainer.py:73) Total Epi:  4147 Steps:   63597 Episode Steps:    19 Return:  1.0000 FPS: 30.37
12:40:04.326 [INFO] (irl_trainer.py:73) Total Epi:  4148 Steps:   63615 Episode Steps:    18 Return:  1.0000 FPS: 31.84
12:40:04.677 [INFO] (irl_trainer.py:73) Total Epi:  4149 Steps:   63627 Episode Steps:    12 Return:  1.0000 FPS: 34.46
12:40:05.122 [INFO] (irl_trainer.py:73) Total Epi:  4150 Steps:   63642 Episode Steps:    15 Return:  1.0000 FPS: 33.79
12:40:05.732 [INFO] (irl_trainer.py:73) Total Epi:  4151 Steps:   63662 Episode Steps:    20 Return:  1.0000 FPS: 32.86
12:40:06.402 [INFO] (irl_trainer.py:73) Total Epi:  4152 Steps:   63684 Episode Steps:    22 Return:  1.0000 FPS: 32.96
12:40:06.953 [INFO] (irl_trainer.py:73) Total Epi:  4153 Steps:   63701 Episode Steps:    17 Return:  1.0000 FPS: 30.98
12:40:07.307 [INFO] (irl_trainer.py:73) Total Epi:  4154 Steps:   63712 Episode Steps:    11 Return:  1.0000 FPS: 31.14
12:40:07.916 [INFO] (irl_trainer.py:73) 

12:40:37.599 [INFO] (irl_trainer.py:73) Total Epi:  4215 Steps:   64693 Episode Steps:    20 Return:  1.0000 FPS: 31.09
12:40:37.989 [INFO] (irl_trainer.py:73) Total Epi:  4216 Steps:   64705 Episode Steps:    12 Return:  1.0000 FPS: 30.97
12:40:38.622 [INFO] (irl_trainer.py:73) Total Epi:  4217 Steps:   64722 Episode Steps:    17 Return:  1.0000 FPS: 26.91
12:40:39.221 [INFO] (irl_trainer.py:73) Total Epi:  4218 Steps:   64739 Episode Steps:    17 Return:  1.0000 FPS: 28.48
12:40:39.931 [INFO] (irl_trainer.py:73) Total Epi:  4219 Steps:   64759 Episode Steps:    20 Return:  1.0000 FPS: 28.27
12:40:40.539 [INFO] (irl_trainer.py:73) Total Epi:  4220 Steps:   64779 Episode Steps:    20 Return:  1.0000 FPS: 33.00
12:40:41.071 [INFO] (irl_trainer.py:73) Total Epi:  4221 Steps:   64797 Episode Steps:    18 Return:  1.0000 FPS: 33.98
12:40:41.560 [INFO] (irl_trainer.py:73) Total Epi:  4222 Steps:   64812 Episode Steps:    15 Return:  1.0000 FPS: 30.78
12:40:41.978 [INFO] (irl_trainer.py:73) 

12:41:14.281 [INFO] (irl_trainer.py:73) Total Epi:  4284 Steps:   65869 Episode Steps:    19 Return:  1.0000 FPS: 31.83
12:41:14.682 [INFO] (irl_trainer.py:73) Total Epi:  4285 Steps:   65882 Episode Steps:    13 Return:  1.0000 FPS: 32.67
12:41:15.307 [INFO] (irl_trainer.py:73) Total Epi:  4286 Steps:   65902 Episode Steps:    20 Return:  1.0000 FPS: 32.17
12:41:15.987 [INFO] (irl_trainer.py:73) Total Epi:  4287 Steps:   65925 Episode Steps:    23 Return:  1.0000 FPS: 33.93
12:41:16.461 [INFO] (irl_trainer.py:73) Total Epi:  4288 Steps:   65941 Episode Steps:    16 Return:  1.0000 FPS: 33.90
12:41:17.211 [INFO] (irl_trainer.py:73) Total Epi:  4289 Steps:   65965 Episode Steps:    24 Return:  1.0000 FPS: 32.06
12:41:17.818 [INFO] (irl_trainer.py:73) Total Epi:  4290 Steps:   65985 Episode Steps:    20 Return:  1.0000 FPS: 33.09
12:41:18.412 [INFO] (irl_trainer.py:73) Total Epi:  4291 Steps:   66003 Episode Steps:    18 Return:  1.0000 FPS: 30.44
12:41:18.958 [INFO] (irl_trainer.py:73) 

12:41:48.918 [INFO] (irl_trainer.py:73) Total Epi:  4352 Steps:   66991 Episode Steps:    13 Return:  1.0000 FPS: 32.60
12:41:49.344 [INFO] (irl_trainer.py:73) Total Epi:  4353 Steps:   67004 Episode Steps:    13 Return:  1.0000 FPS: 30.64
12:41:49.737 [INFO] (irl_trainer.py:73) Total Epi:  4354 Steps:   67017 Episode Steps:    13 Return:  1.0000 FPS: 33.24
12:41:50.170 [INFO] (irl_trainer.py:73) Total Epi:  4355 Steps:   67031 Episode Steps:    14 Return:  1.0000 FPS: 32.46
12:41:50.786 [INFO] (irl_trainer.py:73) Total Epi:  4356 Steps:   67051 Episode Steps:    20 Return:  1.0000 FPS: 32.54
12:41:51.265 [INFO] (irl_trainer.py:73) Total Epi:  4357 Steps:   67066 Episode Steps:    15 Return:  1.0000 FPS: 31.46
12:41:51.806 [INFO] (irl_trainer.py:73) Total Epi:  4358 Steps:   67083 Episode Steps:    17 Return:  1.0000 FPS: 31.53
12:41:52.269 [INFO] (irl_trainer.py:73) Total Epi:  4359 Steps:   67098 Episode Steps:    15 Return:  1.0000 FPS: 32.59
12:41:52.846 [INFO] (irl_trainer.py:73) 

12:42:25.244 [INFO] (irl_trainer.py:73) Total Epi:  4420 Steps:   68144 Episode Steps:    20 Return:  1.0000 FPS: 32.46
12:42:25.843 [INFO] (irl_trainer.py:73) Total Epi:  4421 Steps:   68163 Episode Steps:    19 Return:  1.0000 FPS: 31.84
12:42:26.539 [INFO] (irl_trainer.py:73) Total Epi:  4422 Steps:   68185 Episode Steps:    22 Return:  1.0000 FPS: 31.70
12:42:27.132 [INFO] (irl_trainer.py:73) Total Epi:  4423 Steps:   68203 Episode Steps:    18 Return:  1.0000 FPS: 30.44
12:42:27.671 [INFO] (irl_trainer.py:73) Total Epi:  4424 Steps:   68220 Episode Steps:    17 Return:  1.0000 FPS: 31.67
12:42:28.131 [INFO] (irl_trainer.py:73) Total Epi:  4425 Steps:   68235 Episode Steps:    15 Return:  1.0000 FPS: 32.80
12:42:28.539 [INFO] (irl_trainer.py:73) Total Epi:  4426 Steps:   68248 Episode Steps:    13 Return:  1.0000 FPS: 31.98
12:42:28.831 [INFO] (irl_trainer.py:73) Total Epi:  4427 Steps:   68258 Episode Steps:    10 Return: -1.0000 FPS: 34.51
12:42:29.449 [INFO] (irl_trainer.py:73) 

12:43:01.599 [INFO] (irl_trainer.py:73) Total Epi:  4489 Steps:   69307 Episode Steps:    23 Return:  1.0000 FPS: 27.45
12:43:01.845 [INFO] (irl_trainer.py:73) Total Epi:  4490 Steps:   69315 Episode Steps:     8 Return:  1.0000 FPS: 32.83
12:43:02.335 [INFO] (irl_trainer.py:73) Total Epi:  4491 Steps:   69331 Episode Steps:    16 Return:  1.0000 FPS: 32.75
12:43:02.738 [INFO] (irl_trainer.py:73) Total Epi:  4492 Steps:   69344 Episode Steps:    13 Return:  1.0000 FPS: 32.36
12:43:03.175 [INFO] (irl_trainer.py:73) Total Epi:  4493 Steps:   69358 Episode Steps:    14 Return:  1.0000 FPS: 32.27
12:43:03.830 [INFO] (irl_trainer.py:73) Total Epi:  4494 Steps:   69379 Episode Steps:    21 Return:  1.0000 FPS: 32.14
12:43:04.350 [INFO] (irl_trainer.py:73) Total Epi:  4495 Steps:   69396 Episode Steps:    17 Return:  1.0000 FPS: 32.79
12:43:05.004 [INFO] (irl_trainer.py:73) Total Epi:  4496 Steps:   69416 Episode Steps:    20 Return:  1.0000 FPS: 30.67
12:43:05.475 [INFO] (irl_trainer.py:73) 

12:43:36.295 [INFO] (irl_trainer.py:73) Total Epi:  4557 Steps:   70409 Episode Steps:    14 Return:  1.0000 FPS: 32.88
12:43:36.954 [INFO] (irl_trainer.py:73) Total Epi:  4558 Steps:   70431 Episode Steps:    22 Return:  1.0000 FPS: 33.50
12:43:37.319 [INFO] (irl_trainer.py:73) Total Epi:  4559 Steps:   70443 Episode Steps:    12 Return:  1.0000 FPS: 33.05
12:43:37.651 [INFO] (irl_trainer.py:73) Total Epi:  4560 Steps:   70454 Episode Steps:    11 Return:  1.0000 FPS: 33.33
12:43:38.125 [INFO] (irl_trainer.py:73) Total Epi:  4561 Steps:   70469 Episode Steps:    15 Return:  1.0000 FPS: 31.85
12:43:38.624 [INFO] (irl_trainer.py:73) Total Epi:  4562 Steps:   70486 Episode Steps:    17 Return:  1.0000 FPS: 34.20
12:43:39.199 [INFO] (irl_trainer.py:73) Total Epi:  4563 Steps:   70504 Episode Steps:    18 Return:  1.0000 FPS: 31.40
12:43:39.939 [INFO] (irl_trainer.py:73) Total Epi:  4564 Steps:   70523 Episode Steps:    19 Return:  1.0000 FPS: 25.79
12:43:40.395 [INFO] (irl_trainer.py:73) 

12:44:13.387 [INFO] (irl_trainer.py:73) Total Epi:  4625 Steps:   71593 Episode Steps:    14 Return:  1.0000 FPS: 31.89
12:44:14.050 [INFO] (irl_trainer.py:73) Total Epi:  4626 Steps:   71613 Episode Steps:    20 Return:  1.0000 FPS: 30.27
12:44:14.736 [INFO] (irl_trainer.py:73) Total Epi:  4627 Steps:   71636 Episode Steps:    23 Return:  1.0000 FPS: 33.64
12:44:15.271 [INFO] (irl_trainer.py:73) Total Epi:  4628 Steps:   71653 Episode Steps:    17 Return:  1.0000 FPS: 31.83
12:44:15.624 [INFO] (irl_trainer.py:73) Total Epi:  4629 Steps:   71665 Episode Steps:    12 Return:  1.0000 FPS: 34.18
12:44:16.166 [INFO] (irl_trainer.py:73) Total Epi:  4630 Steps:   71683 Episode Steps:    18 Return:  1.0000 FPS: 33.34
12:44:16.830 [INFO] (irl_trainer.py:73) Total Epi:  4631 Steps:   71703 Episode Steps:    20 Return:  1.0000 FPS: 30.25
12:44:17.443 [INFO] (irl_trainer.py:73) Total Epi:  4632 Steps:   71724 Episode Steps:    21 Return:  1.0000 FPS: 34.37
12:44:17.817 [INFO] (irl_trainer.py:73) 

12:44:47.984 [INFO] (irl_trainer.py:73) Total Epi:  4694 Steps:   72721 Episode Steps:    19 Return:  1.0000 FPS: 32.45
12:44:48.548 [INFO] (irl_trainer.py:73) Total Epi:  4695 Steps:   72740 Episode Steps:    19 Return:  1.0000 FPS: 33.77
12:44:48.981 [INFO] (irl_trainer.py:73) Total Epi:  4696 Steps:   72755 Episode Steps:    15 Return:  1.0000 FPS: 34.79
12:44:49.454 [INFO] (irl_trainer.py:73) Total Epi:  4697 Steps:   72771 Episode Steps:    16 Return:  1.0000 FPS: 34.07
12:44:49.955 [INFO] (irl_trainer.py:73) Total Epi:  4698 Steps:   72788 Episode Steps:    17 Return:  1.0000 FPS: 34.02
12:44:50.670 [INFO] (irl_trainer.py:73) Total Epi:  4699 Steps:   72811 Episode Steps:    23 Return:  1.0000 FPS: 32.26
12:44:51.268 [INFO] (irl_trainer.py:73) Total Epi:  4700 Steps:   72831 Episode Steps:    20 Return:  1.0000 FPS: 33.53
12:44:51.499 [INFO] (irl_trainer.py:119) Evaluation Total Steps:   72831 Average Reward  1.0000 / Average Step Count  16.2 over  5 episodes
12:44:51.510 [INFO] 

12:45:23.151 [INFO] (irl_trainer.py:73) Total Epi:  4762 Steps:   73858 Episode Steps:    21 Return:  1.0000 FPS: 33.31
12:45:23.624 [INFO] (irl_trainer.py:73) Total Epi:  4763 Steps:   73874 Episode Steps:    16 Return:  1.0000 FPS: 34.01
12:45:24.325 [INFO] (irl_trainer.py:73) Total Epi:  4764 Steps:   73897 Episode Steps:    23 Return:  1.0000 FPS: 32.87
12:45:24.940 [INFO] (irl_trainer.py:73) Total Epi:  4765 Steps:   73916 Episode Steps:    19 Return:  1.0000 FPS: 30.97
12:45:25.462 [INFO] (irl_trainer.py:73) Total Epi:  4766 Steps:   73934 Episode Steps:    18 Return:  1.0000 FPS: 34.64
12:45:25.937 [INFO] (irl_trainer.py:73) Total Epi:  4767 Steps:   73950 Episode Steps:    16 Return:  1.0000 FPS: 33.84
12:45:26.450 [INFO] (irl_trainer.py:73) Total Epi:  4768 Steps:   73968 Episode Steps:    18 Return:  1.0000 FPS: 35.27
12:45:26.885 [INFO] (irl_trainer.py:73) Total Epi:  4769 Steps:   73983 Episode Steps:    15 Return:  1.0000 FPS: 34.65
12:45:27.483 [INFO] (irl_trainer.py:73) 

12:45:56.722 [INFO] (irl_trainer.py:73) Total Epi:  4830 Steps:   74949 Episode Steps:    13 Return:  1.0000 FPS: 34.55
12:45:57.287 [INFO] (irl_trainer.py:73) Total Epi:  4831 Steps:   74967 Episode Steps:    18 Return:  1.0000 FPS: 31.97
12:45:57.732 [INFO] (irl_trainer.py:73) Total Epi:  4832 Steps:   74982 Episode Steps:    15 Return:  1.0000 FPS: 33.83
12:45:58.231 [INFO] (irl_trainer.py:73) Total Epi:  4833 Steps:   74999 Episode Steps:    17 Return:  1.0000 FPS: 34.18
12:45:58.739 [INFO] (irl_trainer.py:73) Total Epi:  4834 Steps:   75014 Episode Steps:    15 Return:  1.0000 FPS: 29.69
12:45:59.369 [INFO] (irl_trainer.py:73) Total Epi:  4835 Steps:   75036 Episode Steps:    22 Return:  1.0000 FPS: 35.02
12:46:00.076 [INFO] (irl_trainer.py:73) Total Epi:  4836 Steps:   75060 Episode Steps:    24 Return:  1.0000 FPS: 34.05
12:46:00.803 [INFO] (irl_trainer.py:73) Total Epi:  4837 Steps:   75084 Episode Steps:    24 Return:  1.0000 FPS: 33.09
12:46:01.396 [INFO] (irl_trainer.py:73) 

12:46:32.994 [INFO] (irl_trainer.py:73) Total Epi:  4899 Steps:   76132 Episode Steps:    11 Return: -1.0000 FPS: 33.62
12:46:33.691 [INFO] (irl_trainer.py:73) Total Epi:  4900 Steps:   76155 Episode Steps:    23 Return:  1.0000 FPS: 33.11
12:46:33.939 [INFO] (irl_trainer.py:119) Evaluation Total Steps:   76155 Average Reward  1.0000 / Average Step Count  17.8 over  5 episodes
12:46:33.949 [INFO] (irl_trainer.py:73) Total Epi:  4901 Steps:   76156 Episode Steps:     1 Return:  1.0000 FPS:  3.94
12:46:34.291 [INFO] (irl_trainer.py:73) Total Epi:  4902 Steps:   76167 Episode Steps:    11 Return:  1.0000 FPS: 32.33
12:46:34.918 [INFO] (irl_trainer.py:73) Total Epi:  4903 Steps:   76187 Episode Steps:    20 Return:  1.0000 FPS: 32.03
12:46:35.629 [INFO] (irl_trainer.py:73) Total Epi:  4904 Steps:   76209 Episode Steps:    22 Return:  1.0000 FPS: 31.03
12:46:36.258 [INFO] (irl_trainer.py:73) Total Epi:  4905 Steps:   76230 Episode Steps:    21 Return:  1.0000 FPS: 33.45
12:46:36.796 [INFO] 
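
The log excerpts above can be hard to judge by eye. The following is a minimal sketch, not part of the original notebook, that parses per-episode lines of this format and computes the fraction of episodes with a positive return. The helper names `parse_episodes` and `success_rate` are hypothetical, and treating a return greater than zero as a successful episode is an assumption based only on the `1.0000` / `-1.0000` values visible in the log.

```python
import re

# Pattern for the per-episode lines as they appear in the log above.
EPISODE_RE = re.compile(
    r"Total Epi:\s*(\d+)\s+Steps:\s*(\d+)\s+"
    r"Episode Steps:\s*(\d+)\s+Return:\s*(-?\d+\.\d+)"
)


def parse_episodes(log_text):
    """Return a list of (episode, total_steps, episode_steps, ret) tuples."""
    episodes = []
    for line in log_text.splitlines():
        match = EPISODE_RE.search(line)
        if match:  # evaluation lines and truncated lines are skipped
            epi, steps, epi_steps, ret = match.groups()
            episodes.append((int(epi), int(steps), int(epi_steps), float(ret)))
    return episodes


def success_rate(episodes):
    """Fraction of episodes with a positive return (assumed to mean success)."""
    if not episodes:
        return 0.0
    return sum(1 for e in episodes if e[3] > 0) / len(episodes)


# A few lines copied from the log above, including one truncated line.
sample_log = """\
12:31:46.832 [INFO] (irl_trainer.py:73) Total Epi:  3124 Steps:   47691 Episode Steps:    17 Return:  1.0000 FPS: 32.80
12:31:47.506 [INFO] (irl_trainer.py:73) Total Epi:  3125 Steps:   47712 Episode Steps:    21 Return: -1.0000 FPS: 31.30
12:31:48.104 [INFO] (irl_trainer.py:73) Total Epi:  3126 Steps:   47732 Episode Steps:    20 Return:  1.0000 FPS: 33.51
12:31:48.496 [INFO] (irl_trainer.py:73) Total Epi:  3127 Steps:   47745 Episode Steps:    13 Return: -1.0000 FPS: 33.40
12:31:49.828 [INFO] (irl_trainer.py:73)
"""

episodes = parse_episodes(sample_log)
print(len(episodes), success_rate(episodes))
```

Applied to windows of the full log, such a summary would make the trend from mixed returns early on to almost exclusively positive returns later easier to quantify.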

# Visualization of a trained agent

Finally, we show how a trained agent interacts with the environment.
To do so, first set the number of scenarios to visualize in the next cell.

In [None]:
# number of scenarios to visualize
num_scenarios_to_visualize = 10

In [None]:
# load params from the json file to create the parameter server object
params = ParameterServer(filename="data/params/gail_params.json")

# setting the path for the pretrained agent.
params["ML"]["GAILRunner"]["tf2rl"]["model_dir"] = "../../../com_github_gail_4_bark_large_data_store/pretrained_agents/gail/merging"

# customized parameters:
params["ML"]["Settings"]["GPUUse"] = gpu
tf2rl_params = params["ML"]["GAILRunner"]["tf2rl"]
tf2rl_params["max_steps"] = max_steps
tf2rl_params["test_interval"] = test_interval
tf2rl_params["test_episodes"] = test_episodes
params["ML"]["GAILRunner"]["tf2rl"] = tf2rl_params
# cap the warm-up phase at half of the training steps (integer division keeps it a step count)
if params["ML"]["BehaviorGAILAgent"]["WarmUp"] > max_steps // 2:
    params["ML"]["BehaviorGAILAgent"]["WarmUp"] = max_steps // 2

# create environment
bp = ContinuousMergingBlueprint(params,
                                number_of_senarios=500,
                                random_seed=0)
env = SingleAgentRuntime(blueprint=bp,
                         render=False)

# wrapped environment for compatibility with tf2rl
wrapped_env = TF2RLWrapper(env,
                           normalize_features=params["ML"]["Settings"]["NormalizeFeatures"])

# instantiate the GAIL agent
gail_agent = BehaviorGAILAgent(environment=wrapped_env,
                               params=params)

# instantiate a runner; here it is only used to visualize the pretrained agent
runner = GAILRunner(params=params,
                    environment=wrapped_env,
                    agent=gail_agent)

# Visualize the agent
runner.Visualize(num_scenarios_to_visualize, renderer="matplotlib_jupyter")