# Q Table Learning

An example of reinforcement learning using tabular Q-learning on the ["FrozenLake"](https://gym.openai.com/envs/FrozenLake-v0) environment from [OpenAI Gym](https://gym.openai.com/).

## Frozen Lake environment and problem description
In this example, the AI agent is located in a 4x4 grid. The surface of every cell in the grid is one of the following:
* S: starting point (safe)
* F: frozen surface (safe)
* H: hole (unsafe surface; if the agent falls in, the episode ends unsuccessfully)
* G: goal (safe surface; this is where the agent needs to go, and it is rewarded when it gets here)

The objective of the agent is to learn a way to reach the G (goal) state. If it reaches the goal state it gets a reward; otherwise it gets no reward. The 4x4 grid is defined as follows, creating 16 possible states numbered 0 to 15 from left to right, top to bottom:

SFFF

FHFH

FFFH

HFFG

For example:

State 0 is the agent in the initial state (starting point S).

From state 0, if the agent moves right, it reaches state 1 (F, frozen surface).

From state 0, if the agent moves down, it reaches state 4 (F, frozen surface).

From state 4, if the agent moves right, it reaches state 5 (H, hole; it falls in and the episode ends with no reward).
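The state numbering in the examples above can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `GRID`, `MOVES`, `cell_of` and `next_state` are hypothetical helpers, not part of the gym API, and the code assumes the deterministic (non-slippery) variant in which a move that would leave the grid keeps the agent in place.

```python
# The 4x4 map from the text, one string per row.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_COLS = 4

# Action codes (0 = left, 1 = down, 2 = right, 3 = up) as (row delta, column delta).
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def cell_of(state):
    """Return the map letter (S, F, H or G) for a state index."""
    row, col = divmod(state, N_COLS)
    return GRID[row][col]


def next_state(state, action):
    """Deterministic move; the agent stays in place if it would leave the grid."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[action]
    new_row = min(max(row + d_row, 0), len(GRID) - 1)
    new_col = min(max(col + d_col, 0), N_COLS - 1)
    return new_row * N_COLS + new_col


print(next_state(0, 2), cell_of(next_state(0, 2)))  # moving right from state 0
print(next_state(0, 1), cell_of(next_state(0, 1)))  # moving down from state 0
print(next_state(4, 2), cell_of(next_state(4, 2)))  # moving right from state 4
```

The three printed pairs correspond to the three transitions listed above.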



## Frozen lake solution with Q-Table learning

In [1]:
import gym
import numpy as np
from gym.envs.registration import register

### Import Frozen lake environment from OpenAI gym
In the original Frozen Lake environment there is a "slippery" property that makes the state reached as the result of an action in a given state non-deterministic. To make the example simpler, the environment is registered here with the is_slippery flag set to False, so the state reached as the result of an action a in a state s is always the same.

In [2]:
register(
    id='FrozenLakeNotSlippery-v0',
    entry_point='gym.envs.toy_text:FrozenLakeEnv',
    kwargs={'map_name' : '4x4', 'is_slippery': False},
    max_episode_steps=100,
    reward_threshold=0.78, # value copied from the slippery FrozenLake-v0 registration, where optimum = .8196
)

In [3]:
frozenLakeEnv =  gym.make('FrozenLakeNotSlippery-v0')

[2017-08-14 00:48:42,139] Making new env: FrozenLakeNotSlippery-v0


In [4]:
actions_dictionary = {0:"left",1:"down",2:"right",3:"up"}

### Q-table learning reinforcement learning algorithm
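
For reference, the update applied to the Q-table in the training loop of this section can be written as the standard Q-learning rule. Here $\alpha$ stands for learning_rate, $\gamma$ for discount_factor, $s'$ for the state reached after taking action $a$ in state $s$, and $r$ for the reward after the notebook's penalty adjustment (a final state with zero reward is treated as $r = -1$). The worked numbers are an illustration that assumes an all-zero table, not output from the notebook.

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]

% Worked example with \alpha = 0.8, \gamma = 0.95 and an all-zero table:
% stepping from state 14 into the goal at state 15 (r = 1):
Q(14,\text{right}) \leftarrow 0 + 0.8 \left[ 1 + 0.95 \cdot 0 - 0 \right] = 0.8
% stepping from state 4 into the hole at state 5 (r replaced by -1):
Q(4,\text{right}) \leftarrow 0 + 0.8 \left[ -1 + 0.95 \cdot 0 - 0 \right] = -0.8
```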

In [5]:
# Initialize learning parameters
learning_rate = 0.8
discount_factor = 0.95 # the importance of future rewards; 0 means only the current reward matters (future rewards are ignored)
num_episodes = 50000 # number of episodes (each runs from the initial state to a final state or the step limit)
max_steps_per_episode = 100 # maximum number of steps allowed per episode
initial_explore_probability = 0.50 # starting value of e: the agent explores with probability e and exploits the best action (from the Q-table) with probability 1-e
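
The exploration probability in this notebook is not fixed: the training loop scales initial_explore_probability linearly down to zero over the run. A minimal sketch of that schedule and of the explore/exploit choice follows. The helper names `explore_probability_at` and `choose_action`, the use of `np.random.default_rng`, and the sample Q-values are assumptions made for illustration and do not appear in the original notebook; Python 3 division is assumed.

```python
import numpy as np


def explore_probability_at(episode, num_episodes=50000, initial=0.50):
    """Linearly decayed exploration probability for a given episode index."""
    return initial * (1 - episode / num_episodes)


def choose_action(q_row, explore_probability, rng):
    """Epsilon-greedy: a random action with the given probability, else the argmax."""
    if rng.random() < explore_probability:
        return int(rng.integers(len(q_row)))  # explore
    return int(np.argmax(q_row))              # exploit


rng = np.random.default_rng(0)
q_row = np.array([0.0, 0.7, 0.1, 0.0])  # made-up Q-values for one state

print(explore_probability_at(0))       # 0.5
print(explore_probability_at(25000))   # 0.25
print(choose_action(q_row, 0.0, rng))  # 1, the argmax, since exploration is off
```

With this schedule the agent explores half of the time at the start and acts almost purely greedily in the last episodes.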

In [6]:
# Initialize the Q-table with zeros; dimensions: states x actions
Qtable = np.zeros([frozenLakeEnv.observation_space.n,frozenLakeEnv.action_space.n])
# Intended to count how many times each action is taken in each state (allocated here, not updated in the loop below)
taken_action_count = np.zeros([frozenLakeEnv.observation_space.n,frozenLakeEnv.action_space.n])

# Create a list to contain steps per episode, and rewards per episode
steps_per_episode = []
rewards_per_episode = []

for episode in range(num_episodes):
    #print("#### Starting episode "+str(episode))
    # Reset environment and get first state
    state = frozenLakeEnv.reset()
    episode_rewards = 0
    episode_steps = 0
    state_is_final = False
    step_behavior = "" # marks each step as "R" (random action, exploring) or "B" (best action, exploiting)
    episode_state_sequence = str(state) # sequence of states the agent visits in this episode
    explore_probability = initial_explore_probability*(1-episode/num_episodes)
    #print(explore_probability)
    
    for step in range(max_steps_per_episode):
        #frozenLakeEnv.render()
        # Choose an action: explore with probability e, exploit (best action from the Q-table) with probability 1-e
        if np.random.rand(1) < explore_probability:
            # explore
            action = frozenLakeEnv.action_space.sample()
            step_behavior = "R"
        else:
            # exploit best action(selected from Q table)
            action = np.argmax(Qtable[state,:])
            step_behavior = "B"
            
        # Given the current state and the selected action, get the new state, the reward and the "is final" flag
        new_state,reward,state_is_final,_ = frozenLakeEnv.step(action)
        episode_rewards+=reward
        # If the episode ended without reaching the goal, replace the zero reward with a penalty
        reward = -1 if state_is_final and reward == 0 else reward
        # Update the Q-table for the current state and selected action using the Bellman equation
        Qtable[state,action] += learning_rate*(reward+discount_factor*np.max(Qtable[new_state,:])-Qtable[state,action])
            
        # Update the episode step count 
        episode_steps+=1
        
        # The state reached by the selected action becomes the current state for the next step
        state=new_state
        episode_state_sequence = episode_state_sequence+"->"+step_behavior+str(state)
        
        # If the state is final (goal state or failure state), exit the episode
        if state_is_final:
            break
            
    steps_per_episode.append(episode_steps)
    rewards_per_episode.append(episode_rewards)
    
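    # Note: step is zero-based, so the printed step count is one less than the number of moves taken,
    # and the printed average divides the total reward so far by num_episodes (the full run length)
    if state_is_final and reward > 0: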
        print("Reached goal state in episode "+str(episode)+ " after "+ str(step) +" steps,average score in time: "+str(sum(rewards_per_episode)/num_episodes)+ ",stage sequence:"+ episode_state_sequence)

Reached goal state in episode 1295 after 8 steps,average score in time: 2e-05,stage sequence:0->R1->B0->R4->B8->R8->R9->R13->R14->R15
Reached goal state in episode 1387 after 14 steps,average score in time: 4e-05,stage sequence:0->B0->B0->B0->B0->B0->R0->R4->R8->B8->B8->B8->R9->R13->R14->B15
Reached goal state in episode 1389 after 7 steps,average score in time: 6e-05,stage sequence:0->R4->B8->B8->R9->R13->R13->B14->B15
Reached goal state in episode 1390 after 18 steps,average score in time: 8e-05,stage sequence:0->B0->R4->B8->B8->B8->B8->B8->B8->R4->B8->B8->R8->R4->B8->B8->R9->B13->R14->B15
Reached goal state in episode 1393 after 9 steps,average score in time: 0.0001,stage sequence:0->B4->B8->B9->R10->B9->B13->B14->R13->B14->B15
Reached goal state in episode 1394 after 5 steps,average score in time: 0.00012,stage sequence:0->R4->B8->B9->R13->B14->B15
Reached goal state in episode 1396 after 7 steps,average score in time: 0.00014,stage sequence:0->R0->R0->B4->B8->B9->R13->B14->R15
Rea

Reached goal state in episode 2052 after 10 steps,average score in time: 0.00546,stage sequence:0->R0->B4->R0->B4->R0->R1->B2->R6->B10->B14->B15
Reached goal state in episode 2055 after 6 steps,average score in time: 0.00548,stage sequence:0->R0->B4->B8->R9->R10->B14->B15
Reached goal state in episode 2057 after 7 steps,average score in time: 0.0055,stage sequence:0->R0->R1->B2->B6->B10->B14->R14->B15
Reached goal state in episode 2058 after 7 steps,average score in time: 0.00552,stage sequence:0->B4->B8->B9->R10->B14->R13->R14->R15
Reached goal state in episode 2062 after 6 steps,average score in time: 0.00554,stage sequence:0->R0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 2064 after 9 steps,average score in time: 0.00556,stage sequence:0->R0->B4->B8->B9->R8->R8->B9->R10->R14->B15
Reached goal state in episode 2065 after 5 steps,average score in time: 0.00558,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 2070 after 7 steps,average score in tim

Reached goal state in episode 2853 after 10 steps,average score in time: 0.01218,stage sequence:0->B4->B8->R4->R0->B4->R4->B8->R9->R13->R14->B15
Reached goal state in episode 2858 after 6 steps,average score in time: 0.0122,stage sequence:0->R0->B4->R8->R9->B13->B14->B15
Reached goal state in episode 2859 after 7 steps,average score in time: 0.01222,stage sequence:0->B4->B8->R9->B13->R9->B13->B14->R15
Reached goal state in episode 2864 after 7 steps,average score in time: 0.01224,stage sequence:0->B4->R4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 2865 after 10 steps,average score in time: 0.01226,stage sequence:0->R0->R4->R0->R4->B8->B9->B13->R14->R14->R14->B15
Reached goal state in episode 2866 after 17 steps,average score in time: 0.01228,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->R10->B14->R10->R9->R8->R9->B13->R9->R13->R14->B15
Reached goal state in episode 2867 after 5 steps,average score in time: 0.0123,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal s

Reached goal state in episode 3604 after 5 steps,average score in time: 0.01934,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 3615 after 5 steps,average score in time: 0.01936,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 3616 after 8 steps,average score in time: 0.01938,stage sequence:0->R0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 3618 after 5 steps,average score in time: 0.0194,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 3620 after 5 steps,average score in time: 0.01942,stage sequence:0->B4->R8->R9->R10->B14->B15
Reached goal state in episode 3623 after 6 steps,average score in time: 0.01944,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 3624 after 7 steps,average score in time: 0.01946,stage sequence:0->R1->B2->R3->B2->B6->B10->R14->B15
Reached goal state in episode 3625 after 5 steps,average score in time: 0.01948,stage sequence:0->B4->B8->

Reached goal state in episode 4280 after 13 steps,average score in time: 0.02538,stage sequence:0->R4->B8->R8->R8->B9->R8->B9->R10->B14->R10->B14->R10->B14->B15
Reached goal state in episode 4285 after 6 steps,average score in time: 0.0254,stage sequence:0->B4->R4->R8->R9->B13->R14->R15
Reached goal state in episode 4286 after 6 steps,average score in time: 0.02542,stage sequence:0->B4->R8->B9->B13->R13->R14->B15
Reached goal state in episode 4288 after 6 steps,average score in time: 0.02544,stage sequence:0->R4->R4->R8->B9->B13->B14->B15
Reached goal state in episode 4289 after 5 steps,average score in time: 0.02546,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 4290 after 5 steps,average score in time: 0.02548,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 4295 after 7 steps,average score in time: 0.0255,stage sequence:0->R1->B2->B6->B10->B14->R10->B14->B15
Reached goal state in episode 4301 after 5 steps,average score in time: 0.

Reached goal state in episode 4935 after 14 steps,average score in time: 0.0318,stage sequence:0->R4->B8->B9->B13->R9->B13->R14->R10->R9->B13->B14->R14->R13->B14->B15
Reached goal state in episode 4936 after 7 steps,average score in time: 0.03182,stage sequence:0->R0->R0->R4->B8->R9->B13->B14->B15
Reached goal state in episode 4939 after 10 steps,average score in time: 0.03184,stage sequence:0->R0->B4->B8->R4->B8->R9->B13->R9->B13->R14->B15
Reached goal state in episode 4945 after 7 steps,average score in time: 0.03186,stage sequence:0->R0->R0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 4947 after 5 steps,average score in time: 0.03188,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 4948 after 5 steps,average score in time: 0.0319,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 4949 after 9 steps,average score in time: 0.03192,stage sequence:0->B4->B8->R4->B8->R8->B9->B13->R13->B14->R15
Reached goal state in episode 4951 

Reached goal state in episode 5666 after 8 steps,average score in time: 0.03922,stage sequence:0->B4->R8->B9->B13->B14->R10->B14->R14->B15
Reached goal state in episode 5668 after 5 steps,average score in time: 0.03924,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 5670 after 7 steps,average score in time: 0.03926,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 5672 after 7 steps,average score in time: 0.03928,stage sequence:0->R1->B2->R1->B2->B6->B10->B14->B15
Reached goal state in episode 5675 after 9 steps,average score in time: 0.0393,stage sequence:0->B4->B8->B9->B13->R9->B13->R9->B13->B14->B15
Reached goal state in episode 5679 after 14 steps,average score in time: 0.03932,stage sequence:0->R0->R0->R4->B8->B9->B13->B14->R14->R13->B14->R13->B14->R10->B14->B15
Reached goal state in episode 5683 after 7 steps,average score in time: 0.03934,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 56

Reached goal state in episode 6411 after 7 steps,average score in time: 0.04672,stage sequence:0->B4->B8->R8->B9->B13->R13->B14->B15
Reached goal state in episode 6414 after 11 steps,average score in time: 0.04674,stage sequence:0->B4->B8->R9->R8->B9->B13->R9->B13->R9->R13->B14->B15
Reached goal state in episode 6415 after 10 steps,average score in time: 0.04676,stage sequence:0->R4->R4->B8->B9->R8->R4->B8->B9->B13->R14->B15
Reached goal state in episode 6417 after 9 steps,average score in time: 0.04678,stage sequence:0->B4->R0->R1->R0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 6421 after 8 steps,average score in time: 0.0468,stage sequence:0->B4->B8->B9->B13->R13->R9->B13->B14->B15
Reached goal state in episode 6423 after 7 steps,average score in time: 0.04682,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 6424 after 6 steps,average score in time: 0.04684,stage sequence:0->B4->R8->R8->B9->R13->B14->R15
Reached goal state in episode 6425

Reached goal state in episode 7145 after 5 steps,average score in time: 0.05396,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 7146 after 7 steps,average score in time: 0.05398,stage sequence:0->R0->R1->R1->B2->B6->B10->B14->B15
Reached goal state in episode 7147 after 7 steps,average score in time: 0.054,stage sequence:0->B4->R0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 7148 after 5 steps,average score in time: 0.05402,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 7149 after 5 steps,average score in time: 0.05404,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 7151 after 7 steps,average score in time: 0.05406,stage sequence:0->R0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 7152 after 8 steps,average score in time: 0.05408,stage sequence:0->B4->R4->B8->R9->R8->B9->R10->B14->B15
Reached goal state in episode 7155 after 7 steps,average score in time: 0.0541,stage sequence:0->

Reached goal state in episode 7840 after 5 steps,average score in time: 0.06122,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 7842 after 10 steps,average score in time: 0.06124,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->R14->R10->R14->B15
Reached goal state in episode 7844 after 9 steps,average score in time: 0.06126,stage sequence:0->B4->R0->R4->B8->B9->B13->R9->B13->R14->R15
Reached goal state in episode 7845 after 8 steps,average score in time: 0.06128,stage sequence:0->R0->B4->B8->B9->R8->B9->B13->R14->B15
Reached goal state in episode 7846 after 8 steps,average score in time: 0.0613,stage sequence:0->B4->R8->B9->B13->B14->R10->B14->R14->B15
Reached goal state in episode 7847 after 6 steps,average score in time: 0.06132,stage sequence:0->B4->R8->B9->B13->R13->B14->B15
Reached goal state in episode 7848 after 7 steps,average score in time: 0.06134,stage sequence:0->R0->B4->B8->R8->R9->R10->R14->B15
Reached goal state in episode 7849 after 19 steps,av

Reached goal state in episode 8532 after 12 steps,average score in time: 0.0685,stage sequence:0->R0->B4->B8->B9->R10->R6->B10->R6->B10->R9->B13->B14->B15
Reached goal state in episode 8533 after 7 steps,average score in time: 0.06852,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 8534 after 6 steps,average score in time: 0.06854,stage sequence:0->R0->B4->B8->B9->R10->R14->B15
Reached goal state in episode 8535 after 6 steps,average score in time: 0.06856,stage sequence:0->R0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 8536 after 6 steps,average score in time: 0.06858,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 8539 after 5 steps,average score in time: 0.0686,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 8540 after 8 steps,average score in time: 0.06862,stage sequence:0->R4->B8->B9->B13->R13->B14->R13->B14->B15
Reached goal state in episode 8543 after 6 steps,average score in 

Reached goal state in episode 9253 after 5 steps,average score in time: 0.07608,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 9259 after 6 steps,average score in time: 0.0761,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 9260 after 9 steps,average score in time: 0.07612,stage sequence:0->B4->R0->R4->R8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 9264 after 5 steps,average score in time: 0.07614,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 9266 after 10 steps,average score in time: 0.07616,stage sequence:0->R0->B4->R0->B4->B8->B9->R10->B14->R10->B14->B15
Reached goal state in episode 9267 after 7 steps,average score in time: 0.07618,stage sequence:0->B4->B8->B9->R10->B14->R13->B14->R15
Reached goal state in episode 9268 after 6 steps,average score in time: 0.0762,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 9269 after 5 steps,average score in time: 0.07

Reached goal state in episode 9951 after 9 steps,average score in time: 0.0835,stage sequence:0->R0->R0->B4->R0->R1->B2->B6->R10->B14->B15
Reached goal state in episode 9953 after 7 steps,average score in time: 0.08352,stage sequence:0->B4->R8->R9->B13->R9->R10->R14->B15
Reached goal state in episode 9956 after 6 steps,average score in time: 0.08354,stage sequence:0->B4->B8->R8->B9->R10->B14->B15
Reached goal state in episode 9957 after 7 steps,average score in time: 0.08356,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 9960 after 7 steps,average score in time: 0.08358,stage sequence:0->B4->R0->B4->B8->R9->B13->B14->R15
Reached goal state in episode 9962 after 5 steps,average score in time: 0.0836,stage sequence:0->B4->R8->B9->B13->B14->R15
Reached goal state in episode 9964 after 6 steps,average score in time: 0.08362,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 9965 after 6 steps,average score in time: 0.08364,sta

Reached goal state in episode 10669 after 6 steps,average score in time: 0.09144,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 10672 after 7 steps,average score in time: 0.09146,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 10673 after 6 steps,average score in time: 0.09148,stage sequence:0->R0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 10677 after 8 steps,average score in time: 0.0915,stage sequence:0->B4->R0->B4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 10678 after 6 steps,average score in time: 0.09152,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 10683 after 9 steps,average score in time: 0.09154,stage sequence:0->R1->B2->B6->R2->R3->B2->R6->B10->B14->B15
Reached goal state in episode 10684 after 7 steps,average score in time: 0.09156,stage sequence:0->B4->B8->B9->B13->B14->R13->R14->R15
Reached goal state in episode 10685 after 5 steps,average score in

Reached goal state in episode 11357 after 5 steps,average score in time: 0.0993,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 11358 after 7 steps,average score in time: 0.09932,stage sequence:0->R0->B4->R8->B9->B13->R13->B14->B15
Reached goal state in episode 11359 after 9 steps,average score in time: 0.09934,stage sequence:0->R0->R0->R4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 11360 after 5 steps,average score in time: 0.09936,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 11364 after 8 steps,average score in time: 0.09938,stage sequence:0->B4->R4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 11365 after 8 steps,average score in time: 0.0994,stage sequence:0->R4->B8->B9->B13->R9->B13->B14->R14->B15
Reached goal state in episode 11366 after 6 steps,average score in time: 0.09942,stage sequence:0->B4->R8->B9->R13->R13->B14->B15
Reached goal state in episode 11367 after 6 steps,average score in t

Reached goal state in episode 12029 after 5 steps,average score in time: 0.10706,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 12030 after 5 steps,average score in time: 0.10708,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 12034 after 7 steps,average score in time: 0.1071,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 12035 after 5 steps,average score in time: 0.10712,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 12040 after 5 steps,average score in time: 0.10714,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 12041 after 5 steps,average score in time: 0.10716,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 12042 after 7 steps,average score in time: 0.10718,stage sequence:0->R1->B2->B6->B10->B14->R13->R14->B15
Reached goal state in episode 12043 after 9 steps,average score in time: 0.1072,stage sequence:0->B4->R4->B

Reached goal state in episode 12714 after 9 steps,average score in time: 0.11504,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->R13->B14->B15
Reached goal state in episode 12715 after 8 steps,average score in time: 0.11506,stage sequence:0->R0->B4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 12716 after 10 steps,average score in time: 0.11508,stage sequence:0->R0->B4->B8->B9->R8->R9->R8->B9->B13->B14->B15
Reached goal state in episode 12717 after 10 steps,average score in time: 0.1151,stage sequence:0->B4->B8->R8->B9->B13->R13->B14->R13->R13->B14->B15
Reached goal state in episode 12718 after 5 steps,average score in time: 0.11512,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 12719 after 9 steps,average score in time: 0.11514,stage sequence:0->B4->R8->B9->B13->B14->R10->B14->R13->R14->B15
Reached goal state in episode 12720 after 7 steps,average score in time: 0.11516,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state

Reached goal state in episode 13383 after 7 steps,average score in time: 0.12314,stage sequence:0->B4->R0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 13388 after 5 steps,average score in time: 0.12316,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 13389 after 7 steps,average score in time: 0.12318,stage sequence:0->B4->R0->B4->B8->R9->B13->B14->R15
Reached goal state in episode 13390 after 9 steps,average score in time: 0.1232,stage sequence:0->R4->B8->B9->B13->B14->R13->R14->R13->B14->B15
Reached goal state in episode 13391 after 5 steps,average score in time: 0.12322,stage sequence:0->B4->R8->R9->B13->B14->R15
Reached goal state in episode 13392 after 6 steps,average score in time: 0.12324,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 13393 after 5 steps,average score in time: 0.12326,stage sequence:0->R1->B2->B6->B10->B14->R15
Reached goal state in episode 13394 after 5 steps,average score in time: 0.12328,stage

Reached goal state in episode 14088 after 11 steps,average score in time: 0.13136,stage sequence:0->R0->R4->R0->B4->B8->B9->B13->B14->R14->R13->B14->B15
Reached goal state in episode 14092 after 11 steps,average score in time: 0.13138,stage sequence:0->B4->R8->R4->B8->B9->R8->B9->B13->R14->R13->B14->R15
Reached goal state in episode 14096 after 6 steps,average score in time: 0.1314,stage sequence:0->B4->B8->B9->B13->B14->R14->R15
Reached goal state in episode 14100 after 9 steps,average score in time: 0.13142,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->R13->B14->R15
Reached goal state in episode 14103 after 6 steps,average score in time: 0.13144,stage sequence:0->R0->R4->R8->B9->B13->B14->B15
Reached goal state in episode 14105 after 12 steps,average score in time: 0.13146,stage sequence:0->R0->B4->R8->R4->R0->R0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 14107 after 10 steps,average score in time: 0.13148,stage sequence:0->B4->B8->R8->B9->R8->B9->R8->B9->R10->R14

Reached goal state in episode 14747 after 10 steps,average score in time: 0.13944,stage sequence:0->R0->B4->B8->R4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 14748 after 6 steps,average score in time: 0.13946,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 14750 after 5 steps,average score in time: 0.13948,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 14752 after 5 steps,average score in time: 0.1395,stage sequence:0->B4->R8->B9->R10->B14->B15
Reached goal state in episode 14753 after 6 steps,average score in time: 0.13952,stage sequence:0->B4->B8->R8->B9->R10->B14->B15
Reached goal state in episode 14754 after 5 steps,average score in time: 0.13954,stage sequence:0->R1->B2->B6->B10->R14->B15
Reached goal state in episode 14756 after 7 steps,average score in time: 0.13956,stage sequence:0->R1->B2->B6->B10->R9->B13->B14->B15
Reached goal state in episode 14765 after 6 steps,average score in time: 0.13958,stage

Reached goal state in episode 15404 after 12 steps,average score in time: 0.1473,stage sequence:0->B4->B8->R4->R0->R0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 15405 after 5 steps,average score in time: 0.14732,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 15406 after 6 steps,average score in time: 0.14734,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 15408 after 7 steps,average score in time: 0.14736,stage sequence:0->B4->R0->B4->B8->R9->R13->R14->B15
Reached goal state in episode 15410 after 5 steps,average score in time: 0.14738,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 15412 after 6 steps,average score in time: 0.1474,stage sequence:0->R0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 15413 after 6 steps,average score in time: 0.14742,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 15415 after 5 steps,average score in time: 0.14

Reached goal state in episode 16073 after 11 steps,average score in time: 0.15518,stage sequence:0->R0->B4->B8->B9->B13->R9->R8->B9->B13->R14->R14->B15
Reached goal state in episode 16078 after 6 steps,average score in time: 0.1552,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 16079 after 6 steps,average score in time: 0.15522,stage sequence:0->B4->B8->B9->R13->R13->R14->B15
Reached goal state in episode 16081 after 6 steps,average score in time: 0.15524,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 16084 after 9 steps,average score in time: 0.15526,stage sequence:0->B4->R0->R1->B2->B6->B10->B14->R10->B14->B15
Reached goal state in episode 16085 after 5 steps,average score in time: 0.15528,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 16086 after 6 steps,average score in time: 0.1553,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 16087 after 7 steps,average score 

Reached goal state in episode 16730 after 5 steps,average score in time: 0.16354,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 16732 after 7 steps,average score in time: 0.16356,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 16733 after 10 steps,average score in time: 0.16358,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->R13->R14->R14->R15
Reached goal state in episode 16737 after 6 steps,average score in time: 0.1636,stage sequence:0->B4->R4->B8->B9->R10->B14->B15
Reached goal state in episode 16738 after 5 steps,average score in time: 0.16362,stage sequence:0->B4->B8->B9->R10->R14->B15
Reached goal state in episode 16742 after 5 steps,average score in time: 0.16364,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 16743 after 6 steps,average score in time: 0.16366,stage sequence:0->B4->B8->R9->R13->B14->R14->B15
Reached goal state in episode 16744 after 7 steps,average score in time: 0.16368,s

Reached goal state in episode 17377 after 5 steps,average score in time: 0.1721,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 17378 after 11 steps,average score in time: 0.17212,stage sequence:0->B4->R0->B4->R0->B4->B8->R9->R8->R9->B13->B14->B15
Reached goal state in episode 17379 after 11 steps,average score in time: 0.17214,stage sequence:0->B4->R4->R0->B4->B8->R8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 17380 after 5 steps,average score in time: 0.17216,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 17381 after 7 steps,average score in time: 0.17218,stage sequence:0->R0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 17382 after 5 steps,average score in time: 0.1722,stage sequence:0->R1->R2->B6->B10->B14->B15
Reached goal state in episode 17384 after 7 steps,average score in time: 0.17222,stage sequence:0->B4->B8->B9->B13->R14->R10->B14->B15
Reached goal state in episode 17385 after 5 steps,average

Reached goal state in episode 18013 after 7 steps,average score in time: 0.18044,stage sequence:0->R4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 18017 after 6 steps,average score in time: 0.18046,stage sequence:0->B4->R4->B8->R9->B13->B14->B15
Reached goal state in episode 18018 after 9 steps,average score in time: 0.18048,stage sequence:0->B4->B8->R9->R10->R9->R8->B9->B13->R14->B15
Reached goal state in episode 18020 after 5 steps,average score in time: 0.1805,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 18021 after 6 steps,average score in time: 0.18052,stage sequence:0->R0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 18026 after 10 steps,average score in time: 0.18054,stage sequence:0->B4->B8->R4->B8->R8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 18027 after 8 steps,average score in time: 0.18056,stage sequence:0->R0->B4->B8->R9->B13->R9->R10->B14->B15
Reached goal state in episode 18028 after 5 steps,average

Reached goal state in episode 18663 after 5 steps,average score in time: 0.18886,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 18664 after 5 steps,average score in time: 0.18888,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 18665 after 6 steps,average score in time: 0.1889,stage sequence:0->B4->R8->B9->B13->B14->R14->B15
Reached goal state in episode 18667 after 7 steps,average score in time: 0.18892,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 18671 after 7 steps,average score in time: 0.18894,stage sequence:0->B4->R8->R9->B13->B14->R13->B14->B15
Reached goal state in episode 18672 after 5 steps,average score in time: 0.18896,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 18674 after 7 steps,average score in time: 0.18898,stage sequence:0->B4->B8->B9->B13->R13->R13->B14->B15
Reached goal state in episode 18675 after 5 steps,average score in time: 0.189,stage sequence

Reached goal state in episode 19298 after 8 steps,average score in time: 0.19708,stage sequence:0->R4->R4->B8->B9->B13->B14->R10->R14->B15
Reached goal state in episode 19299 after 6 steps,average score in time: 0.1971,stage sequence:0->B4->R4->B8->B9->B13->R14->B15
Reached goal state in episode 19301 after 5 steps,average score in time: 0.19712,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 19302 after 6 steps,average score in time: 0.19714,stage sequence:0->R4->B8->B9->R10->B14->R14->B15
Reached goal state in episode 19304 after 13 steps,average score in time: 0.19716,stage sequence:0->B4->R0->R0->R0->R0->B4->B8->B9->B13->B14->R13->R13->B14->B15
Reached goal state in episode 19306 after 5 steps,average score in time: 0.19718,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 19307 after 5 steps,average score in time: 0.1972,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 19309 after 5 steps,average score in t

Reached goal state in episode 19908 after 7 steps,average score in time: 0.20532,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 19910 after 6 steps,average score in time: 0.20534,stage sequence:0->B4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 19913 after 7 steps,average score in time: 0.20536,stage sequence:0->R0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 19914 after 8 steps,average score in time: 0.20538,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->R14->B15
Reached goal state in episode 19915 after 5 steps,average score in time: 0.2054,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 19916 after 9 steps,average score in time: 0.20542,stage sequence:0->R0->R0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 19917 after 7 steps,average score in time: 0.20544,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->R15
Reached goal state in episode 19918 after 6 steps,average score

Reached goal state in episode 20543 after 6 steps,average score in time: 0.21354,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 20546 after 5 steps,average score in time: 0.21356,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 20547 after 10 steps,average score in time: 0.21358,stage sequence:0->B4->R8->R4->B8->B9->B13->B14->R10->B14->R14->B15
Reached goal state in episode 20548 after 5 steps,average score in time: 0.2136,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 20550 after 5 steps,average score in time: 0.21362,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 20552 after 6 steps,average score in time: 0.21364,stage sequence:0->B4->R4->B8->B9->R10->B14->B15
Reached goal state in episode 20557 after 5 steps,average score in time: 0.21366,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 20559 after 7 steps,average score in time: 0.21368,stage sequen

Reached goal state in episode 21149 after 5 steps,average score in time: 0.22184,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 21150 after 6 steps,average score in time: 0.22186,stage sequence:0->B4->B8->R9->B13->B14->R14->B15
Reached goal state in episode 21152 after 6 steps,average score in time: 0.22188,stage sequence:0->R0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 21153 after 9 steps,average score in time: 0.2219,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 21154 after 7 steps,average score in time: 0.22192,stage sequence:0->R1->B2->R1->B2->B6->B10->B14->B15
Reached goal state in episode 21155 after 6 steps,average score in time: 0.22194,stage sequence:0->B4->B8->R9->B13->B14->R14->B15
Reached goal state in episode 21156 after 5 steps,average score in time: 0.22196,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 21157 after 5 steps,average score in time: 0.22198,stage

Reached goal state in episode 21755 after 6 steps,average score in time: 0.23038,stage sequence:0->R0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 21756 after 5 steps,average score in time: 0.2304,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 21757 after 5 steps,average score in time: 0.23042,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 21759 after 5 steps,average score in time: 0.23044,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 21760 after 5 steps,average score in time: 0.23046,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 21761 after 9 steps,average score in time: 0.23048,stage sequence:0->B4->B8->R4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 21762 after 13 steps,average score in time: 0.2305,stage sequence:0->R4->B8->B9->R8->B9->R8->B9->R13->B14->R13->B14->R13->B14->B15
Reached goal state in episode 21763 after 5 steps,average score in tim

Reached goal state in episode 22383 after 5 steps,average score in time: 0.2389,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 22384 after 8 steps,average score in time: 0.23892,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 22385 after 6 steps,average score in time: 0.23894,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 22386 after 9 steps,average score in time: 0.23896,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 22387 after 5 steps,average score in time: 0.23898,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 22388 after 6 steps,average score in time: 0.239,stage sequence:0->B4->R4->R8->B9->R13->B14->B15
Reached goal state in episode 22389 after 8 steps,average score in time: 0.23902,stage sequence:0->R0->R0->R0->B4->R8->B9->R13->B14->B15
Reached goal state in episode 22390 after 5 steps,average score in time: 0.23

Reached goal state in episode 22992 after 7 steps,average score in time: 0.24778,stage sequence:0->R0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 22993 after 5 steps,average score in time: 0.2478,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 22994 after 7 steps,average score in time: 0.24782,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 22995 after 10 steps,average score in time: 0.24784,stage sequence:0->B4->B8->R4->B8->R4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 22996 after 5 steps,average score in time: 0.24786,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 22997 after 5 steps,average score in time: 0.24788,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 22998 after 5 steps,average score in time: 0.2479,stage sequence:0->R1->B2->R6->B10->B14->R15
Reached goal state in episode 22999 after 5 steps,average score in time: 0.24792,stage 

Reached goal state in episode 23604 after 8 steps,average score in time: 0.25644,stage sequence:0->R1->R0->R1->B2->B6->B10->B14->R14->B15
Reached goal state in episode 23605 after 5 steps,average score in time: 0.25646,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 23606 after 5 steps,average score in time: 0.25648,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 23607 after 7 steps,average score in time: 0.2565,stage sequence:0->R0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 23608 after 8 steps,average score in time: 0.25652,stage sequence:0->B4->R0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 23609 after 9 steps,average score in time: 0.25654,stage sequence:0->R4->R8->B9->R8->B9->R10->B14->R10->B14->B15
Reached goal state in episode 23611 after 5 steps,average score in time: 0.25656,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 23612 after 7 steps,average score in time: 

Reached goal state in episode 24207 after 11 steps,average score in time: 0.265,stage sequence:0->B4->B8->R4->B8->B9->R8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 24208 after 5 steps,average score in time: 0.26502,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 24209 after 5 steps,average score in time: 0.26504,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 24210 after 6 steps,average score in time: 0.26506,stage sequence:0->R0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 24212 after 5 steps,average score in time: 0.26508,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 24214 after 6 steps,average score in time: 0.2651,stage sequence:0->B4->B8->R8->B9->R13->B14->B15
Reached goal state in episode 24215 after 6 steps,average score in time: 0.26512,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 24216 after 5 steps,average score in time: 0.26514,stage seq

Reached goal state in episode 24801 after 5 steps,average score in time: 0.27386,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 24802 after 7 steps,average score in time: 0.27388,stage sequence:0->R0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 24804 after 6 steps,average score in time: 0.2739,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 24808 after 8 steps,average score in time: 0.27392,stage sequence:0->R0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 24809 after 7 steps,average score in time: 0.27394,stage sequence:0->B4->B8->B9->B13->R14->R14->R14->B15
Reached goal state in episode 24810 after 6 steps,average score in time: 0.27396,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 24812 after 7 steps,average score in time: 0.27398,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 24813 after 7 steps,average score in time: 0.274

Reached goal state in episode 25431 after 7 steps,average score in time: 0.28272,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 25432 after 5 steps,average score in time: 0.28274,stage sequence:0->B4->B8->B9->R13->R14->B15
Reached goal state in episode 25433 after 5 steps,average score in time: 0.28276,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 25434 after 5 steps,average score in time: 0.28278,stage sequence:0->R4->B8->R9->B13->B14->B15
Reached goal state in episode 25435 after 6 steps,average score in time: 0.2828,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 25437 after 6 steps,average score in time: 0.28282,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 25438 after 5 steps,average score in time: 0.28284,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 25439 after 6 steps,average score in time: 0.28286,stage sequence:0->B4->R4->

Reached goal state in episode 26030 after 5 steps,average score in time: 0.29142,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26032 after 5 steps,average score in time: 0.29144,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26033 after 7 steps,average score in time: 0.29146,stage sequence:0->R4->B8->B9->R10->B14->R13->B14->B15
Reached goal state in episode 26034 after 7 steps,average score in time: 0.29148,stage sequence:0->B4->R8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 26035 after 7 steps,average score in time: 0.2915,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26037 after 5 steps,average score in time: 0.29152,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 26039 after 5 steps,average score in time: 0.29154,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26040 after 5 steps,average score in time: 0.29156,stage sequence:0->

Reached goal state in episode 26628 after 5 steps,average score in time: 0.30016,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26629 after 6 steps,average score in time: 0.30018,stage sequence:0->B4->R8->B9->B13->R13->B14->R15
Reached goal state in episode 26631 after 6 steps,average score in time: 0.3002,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 26632 after 5 steps,average score in time: 0.30022,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 26635 after 6 steps,average score in time: 0.30024,stage sequence:0->B4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 26637 after 8 steps,average score in time: 0.30026,stage sequence:0->B4->R0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 26638 after 9 steps,average score in time: 0.30028,stage sequence:0->B4->B8->R8->R8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 26643 after 6 steps,average score in time: 0.3003,stag

Reached goal state in episode 27235 after 5 steps,average score in time: 0.30898,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 27236 after 5 steps,average score in time: 0.309,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 27237 after 5 steps,average score in time: 0.30902,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 27238 after 7 steps,average score in time: 0.30904,stage sequence:0->R1->B2->R6->B10->R6->B10->R14->B15
Reached goal state in episode 27239 after 5 steps,average score in time: 0.30906,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 27240 after 5 steps,average score in time: 0.30908,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 27241 after 9 steps,average score in time: 0.3091,stage sequence:0->B4->B8->B9->R8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 27242 after 5 steps,average score in time: 0.30912,stage sequence:0->B4-

Reached goal state in episode 27823 after 5 steps,average score in time: 0.31806,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 27824 after 7 steps,average score in time: 0.31808,stage sequence:0->B4->B8->R9->B13->B14->R10->B14->B15
Reached goal state in episode 27825 after 6 steps,average score in time: 0.3181,stage sequence:0->B4->R8->B9->B13->R13->B14->B15
Reached goal state in episode 27828 after 7 steps,average score in time: 0.31812,stage sequence:0->R0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 27829 after 7 steps,average score in time: 0.31814,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 27830 after 5 steps,average score in time: 0.31816,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 27831 after 9 steps,average score in time: 0.31818,stage sequence:0->B4->B8->B9->B13->B14->R13->R9->R10->B14->B15
Reached goal state in episode 27832 after 5 steps,average score in time: 0

Reached goal state in episode 28427 after 7 steps,average score in time: 0.32702,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 28429 after 5 steps,average score in time: 0.32704,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 28430 after 5 steps,average score in time: 0.32706,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 28431 after 6 steps,average score in time: 0.32708,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 28433 after 9 steps,average score in time: 0.3271,stage sequence:0->R1->B2->B6->B10->R6->R2->B6->B10->B14->B15
Reached goal state in episode 28434 after 7 steps,average score in time: 0.32712,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 28435 after 5 steps,average score in time: 0.32714,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 28436 after 7 steps,average score in time: 0.32716,stage

Reached goal state in episode 28995 after 5 steps,average score in time: 0.33564,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 28996 after 5 steps,average score in time: 0.33566,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 28997 after 8 steps,average score in time: 0.33568,stage sequence:0->B4->B8->B9->B13->R13->B14->R13->B14->B15
Reached goal state in episode 28998 after 10 steps,average score in time: 0.3357,stage sequence:0->R0->B4->B8->B9->B13->R9->R8->B9->B13->B14->B15
Reached goal state in episode 29001 after 7 steps,average score in time: 0.33572,stage sequence:0->R1->B2->B6->B10->R9->B13->B14->B15
Reached goal state in episode 29002 after 5 steps,average score in time: 0.33574,stage sequence:0->R4->B8->B9->R13->B14->B15
Reached goal state in episode 29003 after 5 steps,average score in time: 0.33576,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 29005 after 5 steps,average score in time: 0.33578

Reached goal state in episode 29561 after 5 steps,average score in time: 0.34446,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 29564 after 8 steps,average score in time: 0.34448,stage sequence:0->R0->R4->B8->B9->R8->B9->B13->R14->B15
Reached goal state in episode 29565 after 7 steps,average score in time: 0.3445,stage sequence:0->B4->B8->B9->R13->B14->R14->R14->B15
Reached goal state in episode 29566 after 7 steps,average score in time: 0.34452,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 29567 after 6 steps,average score in time: 0.34454,stage sequence:0->R0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 29568 after 5 steps,average score in time: 0.34456,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 29570 after 5 steps,average score in time: 0.34458,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 29571 after 5 steps,average score in time: 0.3446,stage sequen

Reached goal state in episode 30121 after 5 steps,average score in time: 0.35326,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30122 after 9 steps,average score in time: 0.35328,stage sequence:0->B4->B8->B9->B13->B14->R10->B14->R10->B14->B15
Reached goal state in episode 30123 after 7 steps,average score in time: 0.3533,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 30125 after 7 steps,average score in time: 0.35332,stage sequence:0->B4->R4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 30126 after 5 steps,average score in time: 0.35334,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 30127 after 5 steps,average score in time: 0.35336,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30128 after 6 steps,average score in time: 0.35338,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 30129 after 5 steps,average score in time: 0.3534,stage

Reached goal state in episode 30678 after 5 steps,average score in time: 0.36212,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30679 after 10 steps,average score in time: 0.36214,stage sequence:0->B4->B8->R8->R8->B9->B13->B14->R13->B14->R14->B15
Reached goal state in episode 30680 after 5 steps,average score in time: 0.36216,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30682 after 6 steps,average score in time: 0.36218,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 30683 after 5 steps,average score in time: 0.3622,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30684 after 5 steps,average score in time: 0.36222,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 30685 after 5 steps,average score in time: 0.36224,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 30686 after 5 steps,average score in time: 0.36226,stage sequence:0

Reached goal state in episode 31155 after 7 steps,average score in time: 0.36928,stage sequence:0->B4->B8->B9->R8->B9->B13->B14->B15
Reached goal state in episode 31156 after 5 steps,average score in time: 0.3693,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31157 after 5 steps,average score in time: 0.36932,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 31159 after 5 steps,average score in time: 0.36934,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31160 after 5 steps,average score in time: 0.36936,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31162 after 5 steps,average score in time: 0.36938,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31163 after 6 steps,average score in time: 0.3694,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 31164 after 5 steps,average score in time: 0.36942,stage sequence:0->B4->B8->B9->B13

Reached goal state in episode 31636 after 9 steps,average score in time: 0.377,stage sequence:0->B4->B8->R8->R8->R4->B8->R9->B13->B14->B15
Reached goal state in episode 31637 after 7 steps,average score in time: 0.37702,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31638 after 5 steps,average score in time: 0.37704,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31640 after 5 steps,average score in time: 0.37706,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31641 after 5 steps,average score in time: 0.37708,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31642 after 8 steps,average score in time: 0.3771,stage sequence:0->R0->B4->B8->B9->R13->B14->R13->B14->B15
Reached goal state in episode 31643 after 5 steps,average score in time: 0.37712,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 31644 after 5 steps,average score in time: 0.37714,stage seq

Reached goal state in episode 32191 after 9 steps,average score in time: 0.3859,stage sequence:0->B4->R0->R4->R0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 32192 after 5 steps,average score in time: 0.38592,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 32194 after 9 steps,average score in time: 0.38594,stage sequence:0->R4->B8->B9->B13->B14->R13->R13->B14->R14->B15
Reached goal state in episode 32195 after 5 steps,average score in time: 0.38596,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 32196 after 5 steps,average score in time: 0.38598,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 32197 after 6 steps,average score in time: 0.386,stage sequence:0->R0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 32198 after 7 steps,average score in time: 0.38602,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 32199 after 6 steps,average score in time: 0.38604

Reached goal state in episode 32749 after 9 steps,average score in time: 0.39484,stage sequence:0->B4->B8->B9->B13->R14->R10->B14->R13->B14->B15
Reached goal state in episode 32750 after 7 steps,average score in time: 0.39486,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 32751 after 5 steps,average score in time: 0.39488,stage sequence:0->R1->B2->B6->R10->B14->B15
Reached goal state in episode 32753 after 7 steps,average score in time: 0.3949,stage sequence:0->B4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 32754 after 5 steps,average score in time: 0.39492,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 32755 after 5 steps,average score in time: 0.39494,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 32756 after 8 steps,average score in time: 0.39496,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->R14->B15
Reached goal state in episode 32757 after 5 steps,average score in time:

Reached goal state in episode 33295 after 5 steps,average score in time: 0.40378,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 33296 after 8 steps,average score in time: 0.4038,stage sequence:0->B4->R4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33297 after 5 steps,average score in time: 0.40382,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33298 after 5 steps,average score in time: 0.40384,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33299 after 5 steps,average score in time: 0.40386,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33300 after 5 steps,average score in time: 0.40388,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 33301 after 7 steps,average score in time: 0.4039,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 33302 after 5 steps,average score in time: 0.40392,stage sequence:0->B4->B8-

Reached goal state in episode 33834 after 5 steps,average score in time: 0.41272,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33835 after 5 steps,average score in time: 0.41274,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33836 after 5 steps,average score in time: 0.41276,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 33837 after 7 steps,average score in time: 0.41278,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33839 after 5 steps,average score in time: 0.4128,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 33841 after 6 steps,average score in time: 0.41282,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 33842 after 5 steps,average score in time: 0.41284,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 33843 after 5 steps,average score in time: 0.41286,stage sequence:0->R1->B2->B6->B1

Reached goal state in episode 34368 after 5 steps,average score in time: 0.42194,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34369 after 6 steps,average score in time: 0.42196,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34370 after 5 steps,average score in time: 0.42198,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34371 after 5 steps,average score in time: 0.422,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34372 after 7 steps,average score in time: 0.42202,stage sequence:0->B4->R0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 34373 after 7 steps,average score in time: 0.42204,stage sequence:0->B4->B8->B9->B13->R9->B13->R14->B15
Reached goal state in episode 34374 after 6 steps,average score in time: 0.42206,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 34375 after 5 steps,average score in time: 0.42208,stage sequence:0->B4

Reached goal state in episode 34902 after 5 steps,average score in time: 0.43106,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34903 after 5 steps,average score in time: 0.43108,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34904 after 7 steps,average score in time: 0.4311,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 34905 after 5 steps,average score in time: 0.43112,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34906 after 5 steps,average score in time: 0.43114,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34907 after 5 steps,average score in time: 0.43116,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 34910 after 7 steps,average score in time: 0.43118,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 34911 after 5 steps,average score in time: 0.4312,stage sequence:0->B4->B8->B9-

Reached goal state in episode 35430 after 5 steps,average score in time: 0.44008,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35431 after 5 steps,average score in time: 0.4401,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 35433 after 5 steps,average score in time: 0.44012,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35435 after 5 steps,average score in time: 0.44014,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35436 after 5 steps,average score in time: 0.44016,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35437 after 5 steps,average score in time: 0.44018,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35438 after 5 steps,average score in time: 0.4402,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35439 after 5 steps,average score in time: 0.44022,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 35957 after 5 steps,average score in time: 0.44886,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 35958 after 5 steps,average score in time: 0.44888,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35959 after 5 steps,average score in time: 0.4489,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35960 after 5 steps,average score in time: 0.44892,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35961 after 5 steps,average score in time: 0.44894,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35962 after 5 steps,average score in time: 0.44896,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35963 after 5 steps,average score in time: 0.44898,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 35964 after 5 steps,average score in time: 0.449,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 36473 after 5 steps,average score in time: 0.45784,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36474 after 5 steps,average score in time: 0.45786,stage sequence:0->R4->B8->B9->B13->B14->B15
Reached goal state in episode 36475 after 5 steps,average score in time: 0.45788,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36476 after 8 steps,average score in time: 0.4579,stage sequence:0->B4->R4->B8->B9->B13->R9->B13->B14->B15
Reached goal state in episode 36477 after 6 steps,average score in time: 0.45792,stage sequence:0->B4->B8->R8->B9->B13->R14->B15
Reached goal state in episode 36478 after 5 steps,average score in time: 0.45794,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36479 after 6 steps,average score in time: 0.45796,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 36480 after 5 steps,average score in time: 0.45798,stage sequence:0->B4->B

Reached goal state in episode 36974 after 6 steps,average score in time: 0.46644,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 36975 after 7 steps,average score in time: 0.46646,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36976 after 5 steps,average score in time: 0.46648,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36977 after 5 steps,average score in time: 0.4665,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36978 after 5 steps,average score in time: 0.46652,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36979 after 5 steps,average score in time: 0.46654,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36980 after 5 steps,average score in time: 0.46656,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 36981 after 5 steps,average score in time: 0.46658,stage sequence:0->R1->B2->B6->B10

Reached goal state in episode 37461 after 5 steps,average score in time: 0.47516,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37462 after 5 steps,average score in time: 0.47518,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37463 after 6 steps,average score in time: 0.4752,stage sequence:0->R0->R1->B2->B6->R10->B14->B15
Reached goal state in episode 37464 after 5 steps,average score in time: 0.47522,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37465 after 5 steps,average score in time: 0.47524,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37466 after 5 steps,average score in time: 0.47526,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37468 after 5 steps,average score in time: 0.47528,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 37469 after 5 steps,average score in time: 0.4753,stage sequence:0->B4->B8->B9->B13->B14->B1

Reached goal state in episode 37987 after 7 steps,average score in time: 0.48428,stage sequence:0->B4->R0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 37988 after 7 steps,average score in time: 0.4843,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 37989 after 5 steps,average score in time: 0.48432,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37990 after 5 steps,average score in time: 0.48434,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37991 after 6 steps,average score in time: 0.48436,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 37992 after 5 steps,average score in time: 0.48438,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 37993 after 7 steps,average score in time: 0.4844,stage sequence:0->B4->B8->B9->B13->B14->R13->R14->B15
Reached goal state in episode 37995 after 5 steps,average score in time: 0.48442,stage sequence

Reached goal state in episode 38469 after 5 steps,average score in time: 0.49272,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38470 after 5 steps,average score in time: 0.49274,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38471 after 5 steps,average score in time: 0.49276,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 38472 after 5 steps,average score in time: 0.49278,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38473 after 6 steps,average score in time: 0.4928,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38474 after 5 steps,average score in time: 0.49282,stage sequence:0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 38475 after 5 steps,average score in time: 0.49284,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38476 after 7 steps,average score in time: 0.49286,stage sequence:0->B4->B8->R4->B8->B9->B13

Reached goal state in episode 38894 after 5 steps,average score in time: 0.50024,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38895 after 5 steps,average score in time: 0.50026,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38896 after 5 steps,average score in time: 0.50028,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38897 after 5 steps,average score in time: 0.5003,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38898 after 5 steps,average score in time: 0.50032,stage sequence:0->B4->B8->B9->R10->B14->B15
Reached goal state in episode 38899 after 5 steps,average score in time: 0.50034,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38900 after 9 steps,average score in time: 0.50036,stage sequence:0->B4->B8->R4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 38901 after 5 steps,average score in time: 0.50038,stage sequence:0->B4->B8->B9-

Reached goal state in episode 39387 after 5 steps,average score in time: 0.509,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39388 after 5 steps,average score in time: 0.50902,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39389 after 6 steps,average score in time: 0.50904,stage sequence:0->B4->R4->B8->B9->B13->B14->B15
Reached goal state in episode 39390 after 5 steps,average score in time: 0.50906,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39391 after 6 steps,average score in time: 0.50908,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39392 after 5 steps,average score in time: 0.5091,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39393 after 6 steps,average score in time: 0.50912,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39394 after 5 steps,average score in time: 0.50914,stage sequence:0->B4->B8->R9->B13->

Reached goal state in episode 39892 after 6 steps,average score in time: 0.51784,stage sequence:0->B4->B8->B9->B13->R13->B14->B15
Reached goal state in episode 39894 after 5 steps,average score in time: 0.51786,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39895 after 7 steps,average score in time: 0.51788,stage sequence:0->B4->R0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 39896 after 9 steps,average score in time: 0.5179,stage sequence:0->R0->B4->R0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39897 after 5 steps,average score in time: 0.51792,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 39898 after 7 steps,average score in time: 0.51794,stage sequence:0->B4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 39900 after 8 steps,average score in time: 0.51796,stage sequence:0->B4->B8->R8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 39902 after 5 steps,average score in time: 0.

Reached goal state in episode 40394 after 5 steps,average score in time: 0.52688,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40395 after 5 steps,average score in time: 0.5269,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40396 after 7 steps,average score in time: 0.52692,stage sequence:0->B4->B8->R4->B8->B9->B13->B14->B15
Reached goal state in episode 40397 after 5 steps,average score in time: 0.52694,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40398 after 5 steps,average score in time: 0.52696,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40399 after 5 steps,average score in time: 0.52698,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40400 after 5 steps,average score in time: 0.527,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40401 after 5 steps,average score in time: 0.52702,stage sequence:0->B4->B8->B9->R10->B14-

Reached goal state in episode 40882 after 5 steps,average score in time: 0.53568,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40883 after 5 steps,average score in time: 0.5357,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40884 after 5 steps,average score in time: 0.53572,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40885 after 5 steps,average score in time: 0.53574,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40886 after 7 steps,average score in time: 0.53576,stage sequence:0->B4->B8->B9->R13->R9->B13->B14->B15
Reached goal state in episode 40887 after 5 steps,average score in time: 0.53578,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 40888 after 6 steps,average score in time: 0.5358,stage sequence:0->R0->R1->B2->B6->B10->B14->B15
Reached goal state in episode 40889 after 5 steps,average score in time: 0.53582,stage sequence:0->B4->B8->B9->B13

Reached goal state in episode 41368 after 5 steps,average score in time: 0.54446,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41369 after 5 steps,average score in time: 0.54448,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41371 after 5 steps,average score in time: 0.5445,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41372 after 5 steps,average score in time: 0.54452,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41373 after 5 steps,average score in time: 0.54454,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41374 after 5 steps,average score in time: 0.54456,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41375 after 5 steps,average score in time: 0.54458,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41376 after 5 steps,average score in time: 0.5446,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 41857 after 7 steps,average score in time: 0.55348,stage sequence:0->B4->B8->B9->R10->B14->R10->B14->B15
Reached goal state in episode 41858 after 6 steps,average score in time: 0.5535,stage sequence:0->B4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 41859 after 5 steps,average score in time: 0.55352,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41860 after 5 steps,average score in time: 0.55354,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41861 after 5 steps,average score in time: 0.55356,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41863 after 5 steps,average score in time: 0.55358,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 41864 after 5 steps,average score in time: 0.5536,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 41865 after 5 steps,average score in time: 0.55362,stage sequence:0->B4->B8->B9->B1

Reached goal state in episode 42344 after 5 steps,average score in time: 0.5624,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42345 after 5 steps,average score in time: 0.56242,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42346 after 7 steps,average score in time: 0.56244,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42347 after 7 steps,average score in time: 0.56246,stage sequence:0->B4->B8->B9->B13->B14->R10->B14->B15
Reached goal state in episode 42348 after 7 steps,average score in time: 0.56248,stage sequence:0->B4->B8->B9->B13->B14->R13->B14->B15
Reached goal state in episode 42349 after 6 steps,average score in time: 0.5625,stage sequence:0->B4->B8->B9->B13->B14->R14->B15
Reached goal state in episode 42350 after 5 steps,average score in time: 0.56252,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42351 after 5 steps,average score in time: 0.56254,stage sequenc

Reached goal state in episode 42825 after 5 steps,average score in time: 0.57116,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42826 after 5 steps,average score in time: 0.57118,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42828 after 6 steps,average score in time: 0.5712,stage sequence:0->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42829 after 5 steps,average score in time: 0.57122,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42830 after 5 steps,average score in time: 0.57124,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42831 after 5 steps,average score in time: 0.57126,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 42832 after 5 steps,average score in time: 0.57128,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 42834 after 5 steps,average score in time: 0.5713,stage sequence:0->R1->B2->B6->B10->B14->B1

Reached goal state in episode 43269 after 7 steps,average score in time: 0.57942,stage sequence:0->B4->R0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43270 after 5 steps,average score in time: 0.57944,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43271 after 5 steps,average score in time: 0.57946,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43272 after 5 steps,average score in time: 0.57948,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43273 after 5 steps,average score in time: 0.5795,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43274 after 6 steps,average score in time: 0.57952,stage sequence:0->B4->B8->B9->B13->R14->R14->B15
Reached goal state in episode 43275 after 5 steps,average score in time: 0.57954,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43276 after 5 steps,average score in time: 0.57956,stage sequence:0->B4->B8->B9->B1

Reached goal state in episode 43747 after 5 steps,average score in time: 0.58832,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43748 after 5 steps,average score in time: 0.58834,stage sequence:0->B4->B8->B9->R13->B14->B15
Reached goal state in episode 43749 after 5 steps,average score in time: 0.58836,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43750 after 5 steps,average score in time: 0.58838,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43751 after 5 steps,average score in time: 0.5884,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43752 after 5 steps,average score in time: 0.58842,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43753 after 5 steps,average score in time: 0.58844,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 43754 after 5 steps,average score in time: 0.58846,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 44201 after 5 steps,average score in time: 0.59696,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44202 after 5 steps,average score in time: 0.59698,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44203 after 5 steps,average score in time: 0.597,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44204 after 5 steps,average score in time: 0.59702,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44205 after 5 steps,average score in time: 0.59704,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44206 after 5 steps,average score in time: 0.59706,stage sequence:0->B4->R8->B9->B13->B14->B15
Reached goal state in episode 44207 after 5 steps,average score in time: 0.59708,stage sequence:0->B4->B8->B9->B13->B14->R15
Reached goal state in episode 44208 after 5 steps,average score in time: 0.5971,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 44603 after 5 steps,average score in time: 0.60468,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44604 after 5 steps,average score in time: 0.6047,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44607 after 5 steps,average score in time: 0.60472,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44608 after 5 steps,average score in time: 0.60474,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44609 after 5 steps,average score in time: 0.60476,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44610 after 5 steps,average score in time: 0.60478,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44611 after 5 steps,average score in time: 0.6048,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 44612 after 5 steps,average score in time: 0.60482,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 45068 after 5 steps,average score in time: 0.61342,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45069 after 5 steps,average score in time: 0.61344,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45070 after 5 steps,average score in time: 0.61346,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45071 after 5 steps,average score in time: 0.61348,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45072 after 5 steps,average score in time: 0.6135,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45073 after 5 steps,average score in time: 0.61352,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45074 after 5 steps,average score in time: 0.61354,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45075 after 5 steps,average score in time: 0.61356,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 45512 after 5 steps,average score in time: 0.62186,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45513 after 5 steps,average score in time: 0.62188,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45514 after 5 steps,average score in time: 0.6219,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45515 after 5 steps,average score in time: 0.62192,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45516 after 6 steps,average score in time: 0.62194,stage sequence:0->B4->B8->R8->B9->B13->B14->B15
Reached goal state in episode 45517 after 5 steps,average score in time: 0.62196,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45518 after 5 steps,average score in time: 0.62198,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45519 after 5 steps,average score in time: 0.622,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 45937 after 5 steps,average score in time: 0.63006,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45938 after 5 steps,average score in time: 0.63008,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45939 after 5 steps,average score in time: 0.6301,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45940 after 5 steps,average score in time: 0.63012,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45941 after 5 steps,average score in time: 0.63014,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45942 after 5 steps,average score in time: 0.63016,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45943 after 5 steps,average score in time: 0.63018,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 45944 after 5 steps,average score in time: 0.6302,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 46406 after 5 steps,average score in time: 0.63906,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46407 after 5 steps,average score in time: 0.63908,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46408 after 5 steps,average score in time: 0.6391,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46409 after 5 steps,average score in time: 0.63912,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46410 after 5 steps,average score in time: 0.63914,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46411 after 5 steps,average score in time: 0.63916,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46412 after 5 steps,average score in time: 0.63918,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46413 after 5 steps,average score in time: 0.6392,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 46791 after 9 steps,average score in time: 0.64648,stage sequence:0->B4->B8->B9->B13->R9->B13->B14->R10->B14->B15
Reached goal state in episode 46792 after 5 steps,average score in time: 0.6465,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46793 after 5 steps,average score in time: 0.64652,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46794 after 5 steps,average score in time: 0.64654,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46795 after 5 steps,average score in time: 0.64656,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46796 after 5 steps,average score in time: 0.64658,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46797 after 5 steps,average score in time: 0.6466,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 46798 after 5 steps,average score in time: 0.64662,stage sequence:0->B4->B8->B

Reached goal state in episode 47220 after 5 steps,average score in time: 0.65474,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47221 after 5 steps,average score in time: 0.65476,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47222 after 5 steps,average score in time: 0.65478,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47223 after 5 steps,average score in time: 0.6548,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47224 after 5 steps,average score in time: 0.65482,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47225 after 5 steps,average score in time: 0.65484,stage sequence:0->B4->B8->B9->B13->R14->B15
Reached goal state in episode 47226 after 5 steps,average score in time: 0.65486,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47227 after 5 steps,average score in time: 0.65488,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 47679 after 5 steps,average score in time: 0.66366,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47680 after 5 steps,average score in time: 0.66368,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47681 after 5 steps,average score in time: 0.6637,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47682 after 5 steps,average score in time: 0.66372,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47683 after 5 steps,average score in time: 0.66374,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47684 after 5 steps,average score in time: 0.66376,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47685 after 5 steps,average score in time: 0.66378,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 47686 after 5 steps,average score in time: 0.6638,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 48144 after 5 steps,average score in time: 0.67258,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48145 after 5 steps,average score in time: 0.6726,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48146 after 5 steps,average score in time: 0.67262,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48147 after 5 steps,average score in time: 0.67264,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48148 after 5 steps,average score in time: 0.67266,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48149 after 5 steps,average score in time: 0.67268,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48150 after 5 steps,average score in time: 0.6727,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48151 after 5 steps,average score in time: 0.67272,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 48604 after 5 steps,average score in time: 0.68162,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48605 after 5 steps,average score in time: 0.68164,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48606 after 5 steps,average score in time: 0.68166,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48607 after 5 steps,average score in time: 0.68168,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48609 after 5 steps,average score in time: 0.6817,stage sequence:0->B4->B8->R9->B13->B14->B15
Reached goal state in episode 48610 after 5 steps,average score in time: 0.68172,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48611 after 5 steps,average score in time: 0.68174,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 48612 after 5 steps,average score in time: 0.68176,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 49063 after 5 steps,average score in time: 0.69068,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49064 after 5 steps,average score in time: 0.6907,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49065 after 5 steps,average score in time: 0.69072,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49066 after 5 steps,average score in time: 0.69074,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49067 after 5 steps,average score in time: 0.69076,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49068 after 5 steps,average score in time: 0.69078,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49069 after 5 steps,average score in time: 0.6908,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49070 after 5 steps,average score in time: 0.69082,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 49520 after 5 steps,average score in time: 0.69974,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49521 after 5 steps,average score in time: 0.69976,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49522 after 5 steps,average score in time: 0.69978,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49523 after 5 steps,average score in time: 0.6998,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49524 after 5 steps,average score in time: 0.69982,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49525 after 5 steps,average score in time: 0.69984,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49526 after 5 steps,average score in time: 0.69986,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49527 after 5 steps,average score in time: 0.69988,stage sequence:0->B4->B8->B9->B13->B14->B15

Reached goal state in episode 49970 after 5 steps,average score in time: 0.70874,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49971 after 5 steps,average score in time: 0.70876,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49972 after 5 steps,average score in time: 0.70878,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49973 after 5 steps,average score in time: 0.7088,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49974 after 5 steps,average score in time: 0.70882,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49975 after 5 steps,average score in time: 0.70884,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49976 after 5 steps,average score in time: 0.70886,stage sequence:0->B4->B8->B9->B13->B14->B15
Reached goal state in episode 49977 after 5 steps,average score in time: 0.70888,stage sequence:0->B4->B8->B9->B13->B14->B15

In [7]:
print("Total over time: " + str(sum(rewards_per_episode) / num_episodes))

Total over time: 0.70932
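The figure above is a cumulative average over all episodes, so it also counts the early, mostly exploratory ones. As a complementary view, the sketch below computes the success rate over consecutive blocks of episodes. It assumes `rewards_per_episode` holds one 0/1 reward per episode (consistent with the log, where each success raises the average by a fixed amount). The function name `windowed_success_rate`, the window size, and the synthetic data are illustrative and not part of the original notebook.

```python
import numpy as np

def windowed_success_rate(rewards, window=1000):
    """Mean reward over consecutive, non-overlapping blocks of `window` episodes."""
    rewards = np.asarray(rewards, dtype=float)
    n_blocks = len(rewards) // window          # an incomplete trailing block is dropped
    trimmed = rewards[:n_blocks * window]
    return trimmed.reshape(n_blocks, window).mean(axis=1)

# Synthetic stand-in for rewards_per_episode: 4 successes in the first
# 10 episodes, then 10 successes in the next 10.
demo_rewards = [0] * 6 + [1] * 4 + [1] * 10
rates = windowed_success_rate(demo_rewards, window=10)
print(rates)
```

In the notebook this could be called as `windowed_success_rate(rewards_per_episode, window=1000)` to see how the success rate changes from the first blocks to the last ones.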


### Learned Q-table
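Interpretation: every row of the Q-table is a state and every column is an action. Each value estimates how good it is to take action a in state s, so larger values mark actions that lead to better results (a way to reach the goal state). In the table below, the all-zero rows are the terminal states (the holes and the goal), from which the agent never acts, and the -1 entries are actions that step into a hole.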

In [8]:
np.set_printoptions(precision=6)
np.set_printoptions(suppress=True)
Qtable

array([[ 0.735092,  0.773781,  0.773781,  0.735092],
       [ 0.735092, -1.      ,  0.814506,  0.773781],
       [ 0.773781,  0.857375,  0.773781,  0.814506],
       [ 0.814506, -1.      ,  0.773781,  0.773781],
       [ 0.773781,  0.814506, -1.      ,  0.735092],
       [ 0.      ,  0.      ,  0.      ,  0.      ],
       [-1.      ,  0.9025  , -1.      ,  0.814506],
       [ 0.      ,  0.      ,  0.      ,  0.      ],
       [ 0.814506, -1.      ,  0.857375,  0.773781],
       [ 0.814506,  0.9025  ,  0.9025  , -1.      ],
       [ 0.857375,  0.95    , -1.      ,  0.857375],
       [ 0.      ,  0.      ,  0.      ,  0.      ],
       [ 0.      ,  0.      ,  0.      ,  0.      ],
       [-1.      ,  0.9025  ,  0.95    ,  0.857375],
       [ 0.9025  ,  0.95    ,  1.      ,  0.9025  ],
       [ 0.      ,  0.      ,  0.      ,  0.      ]])
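To make the table easier to read, the sketch below turns it into a grid of arrows by taking the highest-valued action in each row. The Q-values are copied from the output above. The column order (0=left, 1=down, 2=right, 3=up) is assumed to follow Gym's FrozenLake convention, and the helper `greedy_policy_grid` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Q-table copied from the output above (rows: states 0-15, columns: actions 0-3).
Q = np.array([
    [0.735092, 0.773781, 0.773781, 0.735092],
    [0.735092, -1.0, 0.814506, 0.773781],
    [0.773781, 0.857375, 0.773781, 0.814506],
    [0.814506, -1.0, 0.773781, 0.773781],
    [0.773781, 0.814506, -1.0, 0.735092],
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, 0.9025, -1.0, 0.814506],
    [0.0, 0.0, 0.0, 0.0],
    [0.814506, -1.0, 0.857375, 0.773781],
    [0.814506, 0.9025, 0.9025, -1.0],
    [0.857375, 0.95, -1.0, 0.857375],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, 0.9025, 0.95, 0.857375],
    [0.9025, 0.95, 1.0, 0.9025],
    [0.0, 0.0, 0.0, 0.0],
])

# Assumed action order (Gym FrozenLake convention): left, down, right, up.
ARROWS = ["<", "v", ">", "^"]
# The 4x4 map from the problem description, read row by row.
LAKE_MAP = "SFFF" + "FHFH" + "FFFH" + "HFFG"

def greedy_policy_grid(q_table, lake_map, width=4):
    """Return one string per grid row: an arrow for the best action, or H/G for terminal cells."""
    cells = []
    for state, tile in enumerate(lake_map):
        if tile in "HG":
            cells.append(tile)  # terminal cell: no action is ever taken here
        else:
            # np.argmax returns the first maximum, so ties go to the lower action index.
            cells.append(ARROWS[int(np.argmax(q_table[state]))])
    return ["".join(cells[i:i + width]) for i in range(0, len(cells), width)]

grid = greedy_policy_grid(Q, LAKE_MAP)
print("\n".join(grid))
```

Following the arrows from the start cell gives down, down, right, down, right, right. Where two actions tie (for example down and right in state 0), `np.argmax` keeps the first one.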

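### Let's see whether the agent learned to reach the goal state during training:
Place the agent in the initial state and follow the best action in each state to see its route to the goal state. Each rendered grid highlights the agent's current cell.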

In [9]:
state = frozenLakeEnv.reset()
state_sequence = str(state)
action_sequence = ""
max_steps = 100
is_final_state = False

step_count = 0
# Follow the greedy policy (no exploration) until the episode ends or the step limit is hit
while not is_final_state and step_count < max_steps:
    frozenLakeEnv.render()
    # Pick the action with the highest Q-value for the current state
    best_action = np.argmax(Qtable[state, :])
    state, _, is_final_state, _ = frozenLakeEnv.step(best_action)
    state_sequence += "->" + str(state)
    action_sequence += "->" + actions_dictionary[best_action]
    step_count += 1
# Render once more to show the final state
frozenLakeEnv.render()
print("State sequence:" + state_sequence)
print("Action sequence:" + action_sequence)


[41mS[0mFFF
FHFH
FFFH
HFFG
  (Down)
SFFF
[41mF[0mHFH
FFFH
HFFG
  (Down)
SFFF
FHFH
[41mF[0mFFH
HFFG
  (Right)
SFFF
FHFH
F[41mF[0mFH
HFFG
  (Down)
SFFF
FHFH
FFFH
H[41mF[0mFG
  (Right)
SFFF
FHFH
FFFH
HF[41mF[0mG
  (Right)
SFFF
FHFH
FFFH
HFF[41mG[0m
State sequence:0->4->8->9->13->14->15
Action sequence:->down->down->right->down->right->right
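The Q-values along this route give a quick consistency check: each one should equal the discounted value of the final reward of 1. The sketch below compares the values read from the Q-table output with powers of a discount factor. The value 0.95 is inferred from the printed Q-values and should be checked against the training cell, so treat it as an assumption.

```python
gamma = 0.95  # assumed discount factor, inferred from the printed Q-values

# Q-value of the action taken in states 0, 4, 8, 9, 13 and 14 along the route,
# read from the Q-table output (which is rounded to 6 decimals).
route_q_values = [0.773781, 0.814506, 0.857375, 0.9025, 0.95, 1.0]

# Number of further moves still needed after each of those actions: 5, 4, ..., 0.
moves_remaining = [len(route_q_values) - 1 - k for k in range(len(route_q_values))]
expected = [gamma ** m for m in moves_remaining]

for q, m, e in zip(route_q_values, moves_remaining, expected):
    print(f"moves remaining: {m}  table: {q:.6f}  gamma**m: {e:.6f}")

# Largest difference between the table and the powers of gamma.
max_gap = max(abs(q - e) for q, e in zip(route_q_values, expected))
print("max gap:", max_gap)
```

The gap stays below the rounding precision of the printed table, which is what one would expect if the table has converged for this route under that discount factor.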


### The agent learned to find a way to the goal state.
The table method works for small problems, but when there are many states and/or many actions a table becomes impractical. In that case a neural network that approximates the Q-table can be used instead.
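As a bridge between the two approaches, the sketch below replaces the table lookup with the simplest possible function approximator: a single linear layer applied to a one-hot encoding of the state, trained with the Q-learning target. This is not a full neural network and not the notebook's training code. The environment is re-implemented as a small deterministic `step` function, holes give reward 0 rather than a penalty, the discount factor 0.95 and learning rate are assumed values, and every (state, action) pair is swept in turn instead of being explored randomly so that the result is reproducible. All names here are illustrative.

```python
import numpy as np

LAKE_MAP = "SFFF" + "FHFH" + "FFFH" + "HFFG"  # 4x4 map, row by row
N_STATES, N_ACTIONS, WIDTH = 16, 4, 4
# Assumed action order: 0=left, 1=down, 2=right, 3=up, as (row, col) deltas.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def step(state, action):
    """Deterministic (non-slippery) transition: returns (next_state, reward, done)."""
    row, col = divmod(state, WIDTH)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), WIDTH - 1)  # moves off the grid leave the agent in place
    col = min(max(col + d_col, 0), WIDTH - 1)
    next_state = row * WIDTH + col
    tile = LAKE_MAP[next_state]
    return next_state, float(tile == "G"), tile in "HG"

def one_hot(state):
    x = np.zeros(N_STATES)
    x[state] = 1.0
    return x

def train_linear_q(gamma=0.95, lr=0.5, sweeps=200):
    """Fit Q(s, .) = one_hot(s) @ W with semi-gradient Q-learning updates."""
    W = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(sweeps):
        for state in range(N_STATES):
            if LAKE_MAP[state] in "HG":
                continue  # terminal states: no actions to learn
            x = one_hot(state)
            for action in range(N_ACTIONS):
                next_state, reward, done = step(state, action)
                q_next = 0.0 if done else np.max(one_hot(next_state) @ W)
                td_error = reward + gamma * q_next - (x @ W)[action]
                # Gradient of the prediction w.r.t. W[:, action] is x.
                W[:, action] += lr * td_error * x
    return W

def greedy_rollout(W, max_steps=100):
    """Follow the highest-valued action from the start state and return the visited states."""
    state, path = 0, [0]
    for _ in range(max_steps):
        action = int(np.argmax(one_hot(state) @ W))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path

W = train_linear_q()
path = greedy_rollout(W)
print("->".join(map(str, path)))
```

With one-hot inputs the weight matrix `W` is equivalent to a Q-table, which is why this version gains nothing on a 16-state problem. The benefit of an approximator appears when states are described by features and the layer (or a deeper network) has to generalise across states it has not visited.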