# Simple Imitation Learning in MineRL
This tutorial shows a simple example of how to build an imitation-learning agent for the MineRLNavigateDense-v0 environment. The same code also handles MineRLTreechop-v0, which is the environment selected in the first code cell. For more information about the environments, see the [MineRL Docs](http://minerl.io/docs/environments/index.html#minerlnavigatedense-v0).

For more imitation learning algorithms, such as DAgger in TensorFlow, see the GitHub repo [Dagger](https://github.com/zsdonghao/Imitation-Learning-Dagger-Torcs).

Parts of this tutorial are based on code by [Arthur Juliani](https://medium.com/@awjuliani/super-simple-reinforcement-learning-tutorial-part-2-ded33892c724).

In [1]:
from __future__ import division

import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim
%matplotlib inline
import matplotlib.pyplot as plt
import math

# Python 2/3 compatibility: fall back to range when xrange is unavailable.
try:
    xrange = xrange
except NameError:
    xrange = range
    
env_name = 'MineRLTreechop-v0'
data_path = '/media/kimbring2/6224AA7924AA5039/minerl_data'


### Setting up our Neural Network agent
This time we will use a policy network that takes the 64x64 observation, passes it through three convolutional layers followed by a single linear layer, and produces a softmax probability for each discrete action (four for MineRLNavigateDense-v0, six for MineRLTreechop-v0). To learn more about policy networks, see [Andrej Karpathy's blog on Policy Gradient networks](http://karpathy.github.io/2016/05/31/rl/).
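The hidden size `H = 1024` in the next cell has to match the flattened output of the last convolution. The following is a minimal sketch of that arithmetic, assuming the kernel sizes, strides, and `'VALID'` padding used by the three `slim.conv2d` calls in this notebook; the helper names `valid_conv_output` and `flattened_size` are illustrative and not part of the notebook.

```python
def valid_conv_output(size, kernel, stride):
    """Spatial output size of a convolution with 'VALID' padding (no padding)."""
    return (size - kernel) // stride + 1


def flattened_size(input_size, layers, channels):
    """Apply each (kernel, stride) pair in turn and return height * width * channels."""
    size = input_size
    for kernel, stride in layers:
        size = valid_conv_output(size, kernel, stride)
    return size * size * channels


# (kernel, stride) for conv1, conv2, conv3 as configured in the notebook.
LAYERS = [(8, 4), (4, 2), (3, 1)]

# 64x64 input -> 15x15 -> 6x6 -> 4x4, with 64 output channels in conv3.
flat = flattened_size(64, LAYERS, channels=64)
print(flat)
```

If the input resolution or any kernel or stride changes, `H` needs to be recomputed the same way, otherwise the `tf.matmul(convFlat, W)` shapes will not agree.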

In [2]:
H = 1024  # flattened size of the conv3 output (4 x 4 x 64)

tf.reset_default_graph()

if (env_name == 'MineRLTreechop-v0'):
    state = tf.placeholder(shape=[None,64,64,3], dtype=tf.float32)
elif (env_name == 'MineRLNavigateDense-v0'):
    state = tf.placeholder(shape=[None,64,64,4], dtype=tf.float32)

conv1 = slim.conv2d(inputs=state, num_outputs=32, kernel_size=[8,8], stride=[4,4], padding='VALID', 
                    biases_initializer=None, activation_fn=tf.nn.relu)
conv2 = slim.conv2d(inputs=conv1, num_outputs=64, kernel_size=[4,4], stride=[2,2], padding='VALID', 
                    biases_initializer=None, activation_fn=tf.nn.relu)
conv3 = slim.conv2d(inputs=conv2, num_outputs=64, kernel_size=[3,3], stride=[1,1], padding='VALID', 
                    biases_initializer=None, activation_fn=tf.nn.relu)

convFlat = slim.flatten(conv3)
#print("convFlat: " + str(convFlat))

if (env_name == 'MineRLTreechop-v0'):
    W = tf.get_variable("W", shape=[H,6],
               initializer=tf.contrib.layers.xavier_initializer())
    score = tf.matmul(convFlat, W)
    probability = tf.nn.softmax(score)
    real_action = tf.placeholder(shape=[None,6], dtype=tf.int32)
elif (env_name == 'MineRLNavigateDense-v0'):
    W = tf.get_variable("W", shape=[H,4],
               initializer=tf.contrib.layers.xavier_initializer())
    score = tf.matmul(convFlat, W)
    probability = tf.nn.softmax(score)
    real_action = tf.placeholder(shape=[None,4], dtype=tf.int32)
    
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=real_action, 
                                                              logits=score))
tf.summary.scalar('loss', loss)
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss)

merged = tf.summary.merge_all()


# Train
The MineRL package provides a dataset of human gameplay that can make training more efficient. We first train the network on this dataset and then use the pretrained network as the starting point for reinforcement learning, which should reduce training time considerably.

For more information about that dataset, see this [MineRL Dataset Docs](http://minerl.io/docs/tutorials/data_sampling.html).
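Training on the human dataset is a supervised problem: the network's scores are pushed toward the action the human actually took, using softmax cross-entropy as the loss. The following framework-free sketch shows that objective for a single frame. The score values are made up for illustration, and `softmax` and `cross_entropy` are hypothetical helpers rather than the TensorFlow ops the notebook calls.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores (logits)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, one_hot_label):
    """Cross-entropy between softmax(scores) and a one-hot target."""
    probs = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


# Hypothetical scores for the four NavigateDense action classes.
scores = [2.0, 0.5, 0.1, -1.0]

# The loss is small when the human chose the class the network favours (class 0)
# and larger when the human chose a class the network considers unlikely (class 3).
loss_match = cross_entropy(scores, [1, 0, 0, 0])
loss_mismatch = cross_entropy(scores, [0, 0, 0, 1])
print(loss_match, loss_mismatch)
```

Averaging this quantity over all frames in a batch gives the scalar that the optimizer minimizes.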

### Running the Agent and Environment
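Before a frame is fed to the network, the training cell in this section scales pixel values from 0..255 to the range [-0.5, 0.5]. For MineRLNavigateDense-v0 it also appends a fourth channel filled with the compass angle divided by 180. The sketch below reproduces that preprocessing with plain nested lists on a tiny made-up image, so it runs without NumPy; `build_state` is an illustrative name, not a function from the notebook.

```python
def build_state(pov, compass_angle=None):
    """Scale 0..255 pixel values to [-0.5, 0.5]; optionally append a compass channel.

    pov is a nested list shaped [height][width][3]. When compass_angle (degrees)
    is given, every pixel gets a fourth value equal to compass_angle / 180.0.
    """
    extra = [] if compass_angle is None else [compass_angle / 180.0]
    return [
        [[value / 255.0 - 0.5 for value in pixel] + extra for pixel in row]
        for row in pov
    ]


# A tiny 1x2 "image" standing in for the 64x64 frame (illustrative data only).
tiny_pov = [[[0, 255, 0], [255, 0, 255]]]

treechop_state = build_state(tiny_pov)        # 3 channels per pixel
navigate_state = build_state(tiny_pov, 90.0)  # 4 channels per pixel
```

The notebook does the same thing with `np.ones(...) * compass` and `np.concatenate(..., axis=-1)`, which is much faster on full-size frames.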

In [3]:
import minerl
data = minerl.data.make(env_name, data_path)

print("test")

init = tf.global_variables_initializer()
restore = False
with tf.Session() as sess:
    rendering = False
    sess.run(init)
    saver = tf.train.Saver(max_to_keep=5)
    train_writer = tf.summary.FileWriter('/home/kimbring2/MineRL/train_summary/' + env_name, sess.graph)
    
    if restore == True:
        path = '/home/kimbring2/MineRL/model/' + env_name
        ckpt = tf.train.get_checkpoint_state(path)
        saver.restore(sess, ckpt.model_checkpoint_path)
    
    episode_count = 0
    for current_state, action, reward, next_state, done in data.sarsd_iter(num_epochs=500, max_sequence_len=10):
        #print("action: " + str(action))
        
        #print("current_state['equipped_items']: " + str(current_state['equipped_items']))
        length = (current_state['pov'].shape)[0]

        action_list = []
        states_list = []
        for i in range(0, length):
            if (env_name == 'MineRLTreechop-v0'):
                state_concat = current_state['pov'][i].astype(np.float32) / 255.0 - 0.5
            elif (env_name == 'MineRLNavigateDense-v0'):
                pov = current_state['pov'][i].astype(np.float32) / 255.0 - 0.5
                compass = current_state['compassAngle'][i]
                #print("compass: " + str(compass))
                
                #print("np.ones(shape=list(pov.shape[:-1]): " + str(np.ones(shape=list(pov.shape[:-1]))))
                compass_channel = np.ones(shape=list(pov.shape[:-1]) + [1], dtype=np.float32) * compass
                #print("compass_channel: " + str(compass_channel))
                
                compass_channel /= 180.0
                
                state_concat = np.concatenate([pov, compass_channel], axis=-1)

                
            if (env_name == 'MineRLNavigateDense-v0'):
                if (action['camera'][i][1] < 0):
                    action_ = [1, 0, 0, 0]
                elif (action['camera'][i][1] > 0):
                    action_ = [0, 1, 0, 0]
                else:
                    if (action['jump'][i] == 0):
                        action_ = [0, 0, 1, 0]
                    else:
                        action_ = [0, 0, 0, 1]
            elif (env_name == 'MineRLTreechop-v0'):
                if (action['camera'][i][1] < 0):
                    action_ = [1, 0, 0, 0, 0, 0]
                elif (action['camera'][i][1] > 0):
                    action_ = [0, 1, 0, 0, 0, 0]
                elif (action['camera'][i][0] < 0):
                    action_ = [0, 0, 1, 0, 0, 0]
                elif (action['camera'][i][0] > 0):
                    action_ = [0, 0, 0, 1, 0, 0]
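                else:
                    # Camera is still: label by jump/forward. Frames matching
                    # neither case below keep the previous value of action_.
                    if action['jump'][i] == 0 and action['forward'][i] == 0:
                        action_ = [0, 0, 0, 0, 1, 0]
                    elif action['jump'][i] == 1 and action['forward'][i] == 1:
                        action_ = [0, 0, 0, 0, 0, 1]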
                                                       
            states_list.append(state_concat)
            action_list.append(action_)
        
        episode_count = episode_count + 1
        
        #state_train = (np.zeros([1,H]), np.zeros([1,H]))
        feed_dict = {state:np.stack(states_list, 0),
                     real_action:np.stack(action_list, 0)
                    }
        
        # Run one optimization step and record the loss summary for TensorBoard.
        summary, _ = sess.run([merged, train_step], feed_dict=feed_dict)
        train_writer.add_summary(summary, episode_count)
        
        if episode_count % 10 == 0:
            model_path = '/home/kimbring2/MineRL/model/' + env_name
            saver.save(sess, model_path + '/model-' + str(episode_count) + '.cptk')
            print("Saved Model")

test
Saved Model
KeyboardInterrupt: 
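
The training loop above turns each recorded human action into one of a few discrete classes. The sketch below restates the MineRLTreechop-v0 rules as a standalone function so they can be checked in isolation. `encode_treechop_action` and `one_hot` are hypothetical helpers, and returning `None` for a frame that matches no rule is an assumption of this sketch; the notebook assigns no new label in that case, so `action_` keeps whatever value it had.

```python
def encode_treechop_action(camera, jump, forward):
    """Return the class index (0..5) used by the notebook, or None if no rule matches.

    camera is a (first, second) pair as stored in action['camera'][i].
    """
    first, second = camera
    if second < 0:
        return 0
    if second > 0:
        return 1
    if first < 0:
        return 2
    if first > 0:
        return 3
    if jump == 0 and forward == 0:
        return 4
    if jump == 1 and forward == 1:
        return 5
    return None  # e.g. forward without jump: not covered by the original rules


def one_hot(index, num_classes=6):
    """Turn a class index into a one-hot list of length num_classes."""
    return [1 if i == index else 0 for i in range(num_classes)]


# Example: the camera's second component is negative, so the frame is class 0.
label = one_hot(encode_treechop_action((0.0, -3.0), jump=0, forward=1))
```

Counting how often `None` comes back on real data would show how many frames fall outside the six classes.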

In [None]:
'''
import minerl
import gym
env = gym.make('MineRLNavigateDense-v0')

obs  = env.reset()
done = False
net_reward = 0

while not done:
    action = env.action_space.noop()

    action['camera'] = [0, -10]
    action['back'] = 0
    action['forward'] = 1
    action['jump'] = 1
    action['attack'] = 1

    obs, reward, done, info = env.step(
        action)

    net_reward += reward
    print("Total reward: ", net_reward)
'''

In [None]:
#import minerl
#data = minerl.data.make('MineRLNavigateDense-v0', '/home/kimbring2/MineRL/data/')

# Test
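
The evaluation loop in this section picks the highest-probability action most of the time and a uniformly random action with small probability `e = 0.01`. A minimal, framework-free sketch of that epsilon-greedy rule follows. `select_action` and its parameters are illustrative names that do not appear in the notebook, and the probabilities are made-up numbers.

```python
import random


def select_action(probabilities, epsilon, rng=random):
    """Epsilon-greedy choice over a list of action probabilities.

    With probability epsilon return a uniformly random index; otherwise return
    the index of the largest probability.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(probabilities))  # upper bound is exclusive
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Illustrative probabilities for six Treechop classes (made-up numbers).
probs = [0.05, 0.10, 0.60, 0.10, 0.10, 0.05]

greedy = select_action(probs, epsilon=0.0)  # never explores
seeded = random.Random(0)
explored = [select_action(probs, epsilon=1.0, rng=seeded) for _ in range(20)]  # always explores
```

Note that `random.randrange(n)` excludes `n`, while `random.randint(a, b)` includes both endpoints, so a random index over `n` actions needs `randint(0, n - 1)`.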

In [3]:
import minerl
import gym

env = gym.make('MineRLTreechop-v0')
obs = env.reset()

In [5]:
import random

init = tf.global_variables_initializer()
with tf.Session() as sess:
    # Launch the graph
    rendering = False
    sess.run(init)
    saver = tf.train.Saver(max_to_keep=5)
    train_writer = tf.summary.FileWriter('/home/kimbring2/MineRL/train_summary', sess.graph)
    
    #print('Loading Model...')
    #path = '/home/kimbring2/MineRL/model/' + env_name
    #ckpt = tf.train.get_checkpoint_state(path)
    #saver.restore(sess, ckpt.model_checkpoint_path)
    
    env.init()
    obs = env.reset()
    net_reward = 0
    for i in range(0, 500000):
        if (env_name == 'MineRLTreechop-v0'):
            state_concat = obs['pov'].astype(np.float32) / 255.0 - 0.5
        elif (env_name == 'MineRLNavigateDense-v0'):
            pov = obs['pov'].astype(np.float32) / 255.0 - 0.5
            compass = obs['compassAngle']
            compass_channel = np.ones(shape=list(pov.shape[:-1]) + [1], dtype=np.float32) * compass
            compass_channel /= 180.0
            state_concat = np.concatenate([pov, compass_channel], axis=-1)
        
        e = 0.01  # exploration rate: probability of taking a random action
        
        action_probability = sess.run(probability, feed_dict={state:[state_concat]})
        print("action_probability: " + str(action_probability))
        if np.random.rand(1) >= e:
            action_index = np.argmax(action_probability)
        else:
            # random.randint is inclusive on both ends, so the upper bound is
            # the index of the last action branch handled below
            if (env_name == 'MineRLNavigateDense-v0'):
                action_index = random.randint(0,3)
            elif (env_name == 'MineRLTreechop-v0'):
                action_index = random.randint(0,5)
            
        action = env.action_space.noop()
        if (env_name == 'MineRLNavigateDense-v0'):
            if (action_index == 0):
                action['camera'] = [0, -10]
                action['jump'] = 0
                action['forward'] = 1
                action['sprint'] = 1
            elif (action_index == 1):
                action['camera'] = [0, 10]
                action['jump'] = 0
                action['forward'] = 1
                action['sprint'] = 1
            elif (action_index == 2):
                action['camera'] = [0, 0]
                action['jump'] = 0
                action['forward'] = 1
                action['sprint'] = 1
            else:
                action['camera'] = [0, 0]
                action['jump'] = 1
                action['forward'] = 1
                action['sprint'] = 1
        elif (env_name == 'MineRLTreechop-v0'):
            if (action_index == 0):
                action['camera'] = [0, -10]
                action['jump'] = 0
                action['forward'] = 1
                action['attack'] = 1
                action['sprint'] = 0
            elif (action_index == 1):
                action['camera'] = [0, 10]
                action['jump'] = 0
                action['forward'] = 1
                action['attack'] = 1
                action['sprint'] = 0
            elif (action_index == 2):
                action['camera'] = [-10, 0]
                action['jump'] = 0
                action['forward'] = 1
                action['attack'] = 1
                action['sprint'] = 0
            elif (action_index == 3):
                action['camera'] = [10, 0]
                action['jump'] = 0
                action['forward'] = 1
                action['attack'] = 1
                action['sprint'] = 0
            elif (action_index == 4):
                action['camera'] = [0, 0]
                action['jump'] = 0
                action['forward'] = 0
                action['attack'] = 1
                action['sprint'] = 0
            else:
                action['camera'] = [0, 0]
                action['jump'] = 1
                action['forward'] = 1
                action['attack'] = 1
                action['sprint'] = 0

                
            # None of the Treechop actions above use these movement keys
            action['back'] = 0
            action['left'] = 0
            action['right'] = 0

        obs1, reward, done, info = env.step(action)
        
        # Add the reward first so the final step is counted before leaving the loop
        net_reward += reward
        if done:
            break
                
        obs = obs1

    print("Total reward: ", net_reward)

action_probability: [[0.16819865 0.15656818 0.16091618 0.16928844 0.17296357 0.17206502]]
action_probability: [[0.16872552 0.15331563 0.16154423 0.1708176  0.17408825 0.17150868]]
action_probability: [[0.16734146 0.15411998 0.16247398 0.17027286 0.17530881 0.17048292]]
action_probability: [[0.16789693 0.15651992 0.15965329 0.16888022 0.17476052 0.1722891 ]]
action_probability: [[0.16505669 0.15349355 0.1583317  0.17201406 0.17734727 0.17375667]]
action_probability: [[0.16229992 0.15669706 0.15982316 0.16886812 0.17742929 0.17488244]]
action_probability: [[0.1611329  0.16105378 0.15914594 0.16679788 0.17807831 0.1737912 ]]
action_probability: [[0.16068459 0.16122665 0.15962635 0.1682439  0.17779048 0.17242803]]
action_probability: [[0.1632486  0.16335458 0.15693411 0.16762851 0.17220707 0.17662714]]
action_probability: [[0.163821   0.15612723 0.15851974 0.16649486 0.17501163 0.1800255 ]]
action_probability: [[0.16535215 0.15903665 0.15815262 0.16864704 0.16985603 0.1789555 ]]
...

action_probability: [[0.16420613 0.15335457 0.16324818 0.17406099 0.17975003 0.16538005]]
action_probability: [[0.16434678 0.1530629  0.1633109  0.17403336 0.17975405 0.16549201]]
action_probability: [[0.1643278  0.15293074 0.1636372  0.17456193 0.17927355 0.16526875]]
action_probability: [[0.16434892 0.15308835 0.16353793 0.17430405 0.1794558  0.16526498]]
action_probability: [[0.16478096 0.15272814 0.16329159 0.17393872 0.17969069 0.16556984]]
action_probability: [[0.16483858 0.1527243  0.16339494 0.1740981  0.17942813 0.16551597]]
action_probability: [[0.16480356 0.15300511 0.16327825 0.17408071 0.17920269 0.16562966]]
action_probability: [[0.16480638 0.1530976  0.16349214 0.17366652 0.17930776 0.16562957]]
action_probability: [[0.16456789 0.15300345 0.1630592  0.17433964 0.1794784  0.16555144]]
action_probability: [[0.16464876 0.14745213 0.15782638 0.17518091 0.18242583 0.17246598]]
action_probability: [[0.16771026 0.1485743  0.15575823 0.1732672  0.18231776 0.17237231]]
...

action_probability: [[0.1652532  0.1478702  0.16159774 0.17224824 0.18346739 0.16956323]]
action_probability: [[0.16536766 0.14860028 0.16151595 0.17164695 0.18333219 0.16953693]]
action_probability: [[0.16499239 0.14866267 0.16175935 0.17159021 0.18270206 0.17029332]]
action_probability: [[0.16499102 0.14873928 0.16167468 0.17162272 0.18261585 0.17035647]]
action_probability: [[0.16510804 0.14864753 0.16177137 0.17167908 0.18272889 0.17006512]]
action_probability: [[0.16506746 0.14851691 0.16181965 0.17194642 0.18250315 0.17014636]]
action_probability: [[0.16493958 0.1486333  0.16164231 0.17187421 0.18243338 0.17047714]]
action_probability: [[0.16520983 0.1482753  0.16203266 0.17214312 0.1828084  0.16953067]]
action_probability: [[0.16518596 0.14847158 0.16164139 0.17174949 0.18309925 0.16985235]]
action_probability: [[0.16518198 0.14849165 0.16175722 0.17172597 0.18305553 0.16978765]]
action_probability: [[0.16532667 0.14857039 0.16166557 0.17176148 0.18308719 0.16958871]]
...

action_probability: [[0.16263284 0.15408568 0.15846616 0.17648475 0.17911245 0.1692182 ]]
action_probability: [[0.1638969  0.15402825 0.15980619 0.17588623 0.17755584 0.16882661]]
action_probability: [[0.1637851  0.15419884 0.15982188 0.1761286  0.17757653 0.16848901]]
action_probability: [[0.16400892 0.15387249 0.16007608 0.1760774  0.17789467 0.16807047]]
action_probability: [[0.16383201 0.15392464 0.16008745 0.17617683 0.17800304 0.16797602]]
action_probability: [[0.16348656 0.15424705 0.1599754  0.17627609 0.1780741  0.16794084]]
action_probability: [[0.16371664 0.15404373 0.16021912 0.17635033 0.17799538 0.16767474]]
action_probability: [[0.16373074 0.15398468 0.16024078 0.17627351 0.1779856  0.16778466]]
action_probability: [[0.16378576 0.15401615 0.16022171 0.17616658 0.17782544 0.1679844 ]]
action_probability: [[0.16376407 0.15404715 0.16005567 0.17623296 0.1778536  0.16804655]]
action_probability: [[0.16381884 0.15388502 0.1598087  0.17630944 0.17809989 0.16807812]]
...

action_probability: [[0.16531718 0.15478888 0.1554303  0.17410268 0.18046953 0.16989143]]
action_probability: [[0.16525634 0.15479465 0.15543443 0.17418152 0.1808176  0.16951545]]
action_probability: [[0.16539131 0.15434587 0.15577212 0.17423888 0.18040624 0.16984557]]
action_probability: [[0.16538207 0.15470777 0.15576781 0.17391798 0.18022306 0.1700013 ]]
action_probability: [[0.16590714 0.15465066 0.15598513 0.1732145  0.17998305 0.17025946]]
action_probability: [[0.16598967 0.15430824 0.15602036 0.17366934 0.17976996 0.17024244]]
action_probability: [[0.16574377 0.15436609 0.15592153 0.17384958 0.17980959 0.17030947]]
action_probability: [[0.16589795 0.15471701 0.15581985 0.17375578 0.17988862 0.16992077]]
action_probability: [[0.1659695  0.15470387 0.15580966 0.17371504 0.18006447 0.16973746]]
action_probability: [[0.16552922 0.15482196 0.15577915 0.17376871 0.17994435 0.17015652]]
action_probability: [[0.16548383 0.15484181 0.15597777 0.17375103 0.17956713 0.17037845]]
...

action_probability: [[0.16733843 0.15328059 0.15694109 0.17562534 0.17878146 0.16803315]]
action_probability: [[0.16715744 0.15326442 0.15699369 0.17594561 0.17866108 0.16797775]]
action_probability: [[0.16728257 0.15342024 0.15652353 0.1757053  0.17874014 0.16832827]]
action_probability: [[0.1672948  0.1532018  0.15693724 0.17572905 0.1787554  0.1680817 ]]
action_probability: [[0.16721556 0.15355363 0.15704034 0.17564261 0.17874835 0.16779953]]
action_probability: [[0.16757251 0.15327287 0.15719584 0.17556949 0.17868067 0.16770858]]
action_probability: [[0.16736664 0.15352492 0.1572723  0.17552751 0.17866246 0.1676462 ]]
action_probability: [[0.16714697 0.15335415 0.15733276 0.17563412 0.17859478 0.16793723]]
action_probability: [[0.16726074 0.15311895 0.15778807 0.17575489 0.17825888 0.1678185 ]]
action_probability: [[0.16701905 0.15403455 0.1569886  0.17478028 0.17865229 0.16852525]]
action_probability: [[0.16699368 0.15410711 0.15685932 0.17481364 0.17863074 0.16859554]]
...

action_probability: [[0.163931   0.15910923 0.15990114 0.16903722 0.17914551 0.16887586]]
action_probability: [[0.1638816  0.15920159 0.15993223 0.16917636 0.17902742 0.16878083]]
action_probability: [[0.16393612 0.15921095 0.15992026 0.16920991 0.17899959 0.16872318]]
action_probability: [[0.16389582 0.15907887 0.16005814 0.16926591 0.17899503 0.16870625]]
action_probability: [[0.163113   0.15998863 0.1599867  0.16899307 0.17917249 0.16874608]]
action_probability: [[0.1626903  0.16020675 0.15984826 0.16901228 0.17926121 0.16898121]]
action_probability: [[0.16267073 0.16042288 0.15977244 0.16891198 0.17912295 0.169099  ]]
action_probability: [[0.16275908 0.16031122 0.15976271 0.16897844 0.17915352 0.1690351 ]]
action_probability: [[0.16330995 0.1597754  0.16001143 0.16897447 0.17877962 0.16914913]]
action_probability: [[0.1632741  0.15940109 0.16035199 0.16878791 0.17885761 0.16932735]]
action_probability: [[0.16347547 0.15956306 0.16003548 0.16893652 0.17883416 0.1691553 ]]
...

action_probability: [[0.16326381 0.15914452 0.15998305 0.16917221 0.1797426  0.1686938 ]]
action_probability: [[0.16303864 0.15928455 0.15998651 0.16922347 0.17983013 0.16863665]]
action_probability: [[0.16307461 0.15913805 0.15998292 0.1689902  0.17982846 0.16898583]]
action_probability: [[0.16313286 0.15896228 0.15993737 0.16897905 0.17989594 0.1690926 ]]
action_probability: [[0.16313569 0.15896328 0.15993586 0.16898108 0.1798928  0.16909128]]
action_probability: [[0.16313298 0.15879434 0.16000447 0.16897613 0.17995228 0.16913983]]
action_probability: [[0.1629784  0.15929891 0.15976073 0.16907665 0.17964225 0.16924308]]
action_probability: [[0.16311681 0.15954997 0.15982689 0.16889204 0.17971613 0.16889818]]
action_probability: [[0.1628776  0.15950905 0.15979445 0.16910961 0.17971332 0.1689959 ]]
action_probability: [[0.16279559 0.15956733 0.1597254  0.16896468 0.17977192 0.16917503]]
action_probability: [[0.16280767 0.1593958  0.15984134 0.16901648 0.1798879  0.16905078]]
...

action_probability: [[0.16285264 0.15944932 0.15994188 0.16937219 0.17948458 0.16889936]]
action_probability: [[0.16275823 0.15933053 0.15991065 0.16950285 0.17977457 0.16872317]]
action_probability: [[0.16291022 0.15916173 0.15990484 0.16928956 0.17976625 0.16896734]]
action_probability: [[0.16284752 0.15911888 0.16014966 0.16933791 0.17972438 0.16882166]]
action_probability: [[0.16300374 0.15938345 0.1600181  0.16923644 0.17980324 0.16855505]]
action_probability: [[0.1628873  0.15913562 0.15999289 0.16945921 0.17979153 0.16873354]]
action_probability: [[0.16298777 0.1591502  0.16000947 0.16937403 0.17971604 0.16876243]]
action_probability: [[0.16293368 0.159136   0.160031   0.16933982 0.17981595 0.16874354]]
action_probability: [[0.16309886 0.15916441 0.16011156 0.16913658 0.17966214 0.1688264 ]]
action_probability: [[0.16296382 0.15938026 0.15989295 0.16926071 0.17978074 0.16872159]]
action_probability: [[0.16302347 0.15950394 0.15985997 0.16928276 0.17982595 0.16850393]]
...

action_probability: [[0.16454479 0.15876299 0.15796082 0.16989066 0.18039241 0.16844827]]
action_probability: [[0.16437659 0.15914963 0.15746996 0.16987379 0.18073268 0.16839735]]
action_probability: [[0.16413105 0.15920366 0.15776066 0.17011093 0.18058103 0.16821265]]
action_probability: [[0.1644413  0.159299   0.15769069 0.17021754 0.18030964 0.16804184]]
action_probability: [[0.16430362 0.15914926 0.15789913 0.17016959 0.18035744 0.168121  ]]
action_probability: [[0.1645375  0.15918393 0.15755673 0.16986594 0.18045658 0.1683993 ]]
action_probability: [[0.16405667 0.1594316  0.15742911 0.1701068  0.18060236 0.16837339]]
action_probability: [[0.16405667 0.1594316  0.15742911 0.1701068  0.18060236 0.16837339]]
action_probability: [[0.16386402 0.15907073 0.15770988 0.17009753 0.1808512  0.16840667]]
action_probability: [[0.16397674 0.1591229  0.15762661 0.16997804 0.18069227 0.16860351]]
action_probability: [[0.16386452 0.15930523 0.15782487 0.17035586 0.1805115  0.16813807]]
...

action_probability: [[0.16347101 0.1597193  0.15888216 0.17044121 0.18052746 0.16695882]]
action_probability: [[0.1634284  0.15938886 0.15901843 0.17027792 0.18058148 0.16730496]]
action_probability: [[0.16350874 0.15967809 0.15870783 0.1702466  0.18054354 0.16731524]]
action_probability: [[0.16421627 0.15963109 0.1586381  0.17026126 0.18015079 0.16710241]]
action_probability: [[0.1640957  0.15979883 0.15877427 0.1701352  0.18016157 0.16703448]]
action_probability: [[0.16412117 0.15997753 0.15830164 0.16943379 0.18041773 0.1677482 ]]
action_probability: [[0.16406503 0.15989769 0.15847754 0.16953246 0.18048148 0.1675458 ]]
action_probability: [[0.16373132 0.16032168 0.15855305 0.16972381 0.18049234 0.16717789]]
action_probability: [[0.16395663 0.1598778  0.15869144 0.16965581 0.18049358 0.16732469]]
action_probability: [[0.16381781 0.15980548 0.15859804 0.16969123 0.18061414 0.16747327]]
action_probability: [[0.16348732 0.15992749 0.15826382 0.1697534  0.18104875 0.16751923]]
...

action_probability: [[0.16355143 0.15728001 0.1635778  0.17147411 0.17802843 0.16608821]]
action_probability: [[0.16397409 0.15702634 0.16370754 0.17133549 0.17728305 0.16667347]]
action_probability: [[0.16412097 0.15757002 0.16320719 0.17154028 0.17744438 0.1661172 ]]
action_probability: [[0.16395132 0.1573676  0.16327195 0.17152369 0.17754152 0.16634396]]
action_probability: [[0.16394922 0.15717253 0.16326071 0.17183101 0.17769256 0.16609395]]
action_probability: [[0.16358349 0.15755638 0.1633302  0.17199266 0.17757013 0.16596708]]
action_probability: [[0.16351074 0.15764657 0.16315003 0.17171195 0.17769729 0.16628341]]
action_probability: [[0.16370738 0.15763666 0.16308883 0.17162578 0.17758156 0.16635978]]
action_probability: [[0.16367958 0.1572814  0.16336346 0.17181136 0.17729627 0.16656795]]
action_probability: [[0.16389443 0.15723565 0.16326113 0.17190717 0.17736815 0.16633344]]
action_probability: [[0.16367485 0.15747435 0.16317765 0.17179413 0.1774733  0.16640574]]
...

action_probability: [[0.16315576 0.1574238  0.1554581  0.17250377 0.18022412 0.17123446]]
action_probability: [[0.16327393 0.15723161 0.15532622 0.1725508  0.18022107 0.17139637]]
action_probability: [[0.16366018 0.15709478 0.15517186 0.1723519  0.1803994  0.17132184]]
action_probability: [[0.16370761 0.15694097 0.15516403 0.17219175 0.18046291 0.17153274]]
action_probability: [[0.16349418 0.15701005 0.15513057 0.17252173 0.18052046 0.17132306]]
action_probability: [[0.16315314 0.15693739 0.15549225 0.17239723 0.18085964 0.17116033]]
action_probability: [[0.16328885 0.15694268 0.1553484  0.17220265 0.1808503  0.17136708]]
action_probability: [[0.16326897 0.15692168 0.15535618 0.17227772 0.18083926 0.17133616]]
action_probability: [[0.16390875 0.15643579 0.15585949 0.17188457 0.18066074 0.17125063]]
action_probability: [[0.16362184 0.1565835  0.15595755 0.17210148 0.18048353 0.17125209]]
action_probability: [[0.16351885 0.1566548  0.1562154  0.17226231 0.1802512  0.17109743]]
...

action_probability: [[0.16205983 0.15752904 0.15835577 0.17047277 0.1817832  0.16979937]]
action_probability: [[0.1622399  0.15702263 0.1585164  0.1711079  0.18184334 0.1692698 ]]
action_probability: [[0.16225864 0.15703309 0.15847601 0.17122523 0.18182972 0.16917735]]
action_probability: [[0.1617833  0.15747862 0.15837188 0.17125459 0.18180594 0.16930565]]
action_probability: [[0.16184653 0.15738815 0.1584725  0.17120707 0.1817178  0.16936792]]
action_probability: [[0.16197863 0.15734974 0.15836464 0.17096663 0.18188265 0.16945767]]
action_probability: [[0.16222379 0.1571976  0.15829663 0.17103608 0.18196172 0.16928418]]
action_probability: [[0.16201986 0.15722083 0.15837255 0.17110617 0.18186326 0.16941735]]
action_probability: [[0.16212164 0.15714489 0.15835978 0.17103994 0.18197358 0.16936018]]
action_probability: [[0.16146258 0.15728346 0.15885764 0.17147008 0.1813572  0.16956897]]
action_probability: [[0.1628102  0.15840326 0.15787312 0.17336448 0.1788295  0.1687194 ]]
...

action_probability: [[0.16454533 0.15674518 0.158262   0.17257611 0.181664   0.1662074 ]]
action_probability: [[0.1646762  0.15668832 0.15807487 0.17272998 0.18166752 0.16616313]]
action_probability: [[0.16454102 0.15692748 0.15824868 0.17236558 0.18159887 0.16631842]]
action_probability: [[0.16461897 0.15688863 0.15805781 0.17239067 0.18139535 0.16664852]]
action_probability: [[0.16474245 0.15677229 0.15802436 0.17270884 0.18108825 0.16666377]]
action_probability: [[0.16475612 0.15655033 0.1580328  0.17276052 0.18124798 0.16665225]]
action_probability: [[0.16486251 0.15654811 0.15802294 0.17278086 0.18136375 0.16642183]]
action_probability: [[0.16480353 0.15642996 0.15818568 0.17290366 0.18134479 0.16633242]]
action_probability: [[0.16462316 0.15641801 0.1581354  0.17301202 0.18139306 0.16641833]]
action_probability: [[0.1647101  0.15660761 0.15836497 0.17297639 0.18101427 0.1663267 ]]
action_probability: [[0.16484155 0.15630168 0.15838772 0.17313716 0.1810782  0.16625379]]
...

action_probability: [[0.16392308 0.15728068 0.16147226 0.17359556 0.17511559 0.16861276]]
action_probability: [[0.16386355 0.1575397  0.16121021 0.17335522 0.1754401  0.16859126]]
action_probability: [[0.1637481  0.1576494  0.16126573 0.17338501 0.17498365 0.16896808]]
action_probability: [[0.1637365  0.15768157 0.16133694 0.17354318 0.17488654 0.1688152 ]]
action_probability: [[0.16358729 0.15800552 0.16146386 0.17351963 0.17480141 0.1686223 ]]
action_probability: [[0.1635894  0.15800248 0.16146247 0.17352171 0.17480431 0.16861957]]
action_probability: [[0.16359478 0.15813787 0.16142422 0.17337742 0.17478973 0.16867599]]
action_probability: [[0.16350187 0.15796955 0.16124696 0.17361636 0.17511207 0.16855316]]
action_probability: [[0.16360159 0.15798038 0.16150054 0.1738014  0.17464805 0.16846803]]
action_probability: [[0.16797802 0.15754037 0.16077526 0.17059141 0.17490825 0.16820662]]
action_probability: [[0.16526367 0.1554645  0.15928636 0.17340684 0.17877312 0.16780552]]
...

action_probability: [[0.16693278 0.15708497 0.15814035 0.1719893  0.17827626 0.16757642]]
action_probability: [[0.16650957 0.157381   0.15843914 0.17204173 0.17798167 0.16764688]]
action_probability: [[0.16677055 0.15698138 0.15860924 0.17210257 0.17796734 0.16756892]]
action_probability: [[0.166791   0.15705582 0.15865894 0.17206472 0.17796223 0.1674673 ]]
action_probability: [[0.16669504 0.15754174 0.15846589 0.17170784 0.17780949 0.16777992]]
action_probability: [[0.16645148 0.15764862 0.15859154 0.17172    0.17788573 0.16770263]]
action_probability: [[0.16680792 0.15738234 0.15858246 0.1715428  0.17774174 0.1679427 ]]
action_probability: [[0.16702445 0.15699807 0.15873893 0.17173563 0.17754164 0.16796136]]
action_probability: [[0.166601   0.15727851 0.15887488 0.17178291 0.17769116 0.16777149]]
action_probability: [[0.16634877 0.1575684  0.15880607 0.171558   0.1778418  0.167877  ]]
action_probability: [[0.16688003 0.15711313 0.15847014 0.17153783 0.1780509  0.16794798]]
...

action_probability: [[0.1655706  0.15545367 0.15752244 0.16870004 0.18223621 0.17051709]]
action_probability: [[0.16567552 0.15555081 0.15784718 0.16855799 0.18223083 0.17013767]]
action_probability: [[0.16550909 0.15559222 0.1577359  0.1689179  0.1819898  0.17025511]]
action_probability: [[0.16541716 0.15512258 0.15783969 0.16935171 0.18227707 0.16999176]]
action_probability: [[0.16579066 0.15494004 0.15810336 0.16930169 0.18255147 0.16931276]]
action_probability: [[0.16554424 0.15493146 0.15833323 0.16950946 0.18256275 0.16911887]]
action_probability: [[0.16612542 0.15436377 0.15860973 0.16920637 0.18285143 0.16884321]]
action_probability: [[0.16572444 0.15496325 0.15794294 0.16870947 0.18284908 0.16981083]]
action_probability: [[0.16552821 0.15496872 0.158052   0.16907546 0.18245767 0.1699179 ]]
action_probability: [[0.16548617 0.15498695 0.15789752 0.1691514  0.18237105 0.1701069 ]]
action_probability: [[0.1652092  0.15520431 0.15788557 0.16914056 0.18239866 0.17016174]]
...

action_probability: [[0.1649137  0.15557122 0.15812908 0.16825897 0.18321683 0.16991025]]
action_probability: [[0.1649137  0.15557122 0.15812908 0.16825897 0.18321683 0.16991025]]
action_probability: [[0.16518623 0.15537012 0.15827323 0.16818525 0.18306318 0.16992201]]
action_probability: [[0.16525175 0.15525603 0.15830246 0.16836381 0.18315825 0.1696678 ]]
action_probability: [[0.16524509 0.15558408 0.15812004 0.16820933 0.18326837 0.16957302]]
action_probability: [[0.1651842  0.15528508 0.15830766 0.1685389  0.18308367 0.1696005 ]]
action_probability: [[0.16511741 0.15514985 0.15839183 0.16870636 0.1833463  0.16928828]]
action_probability: [[0.16506954 0.15530066 0.15835768 0.16874465 0.1832609  0.16926657]]
action_probability: [[0.16495207 0.15553036 0.15780248 0.16863923 0.18322448 0.16985138]]
action_probability: [[0.16488689 0.15550654 0.15788302 0.16865234 0.18324803 0.16982318]]
action_probability: [[0.16496405 0.15590237 0.15765119 0.16848752 0.18310544 0.16988942]]
...

action_probability: [[0.16570067 0.15512857 0.15797818 0.16763632 0.18340874 0.17014755]]
action_probability: [[0.16581509 0.154914   0.15831831 0.16738805 0.18369178 0.16987272]]
action_probability: [[0.16578446 0.15494113 0.15853055 0.1676961  0.18370953 0.16933818]]
action_probability: [[0.16479062 0.15462165 0.15994023 0.17082042 0.18244945 0.16737762]]
action_probability: [[0.16415331 0.15679803 0.15776701 0.1707916  0.18286777 0.16762234]]
action_probability: [[0.16465527 0.15739079 0.15742454 0.17137778 0.1821909  0.16696067]]
action_probability: [[0.16459279 0.15846354 0.15774255 0.17142604 0.18022074 0.16755435]]
action_probability: [[0.16495112 0.15730217 0.15756321 0.17171878 0.18002596 0.16843876]]
action_probability: [[0.16567904 0.1580174  0.1565452  0.16811645 0.18151127 0.17013066]]
action_probability: [[0.16537836 0.15756634 0.15680921 0.1688866  0.18129328 0.17006625]]
action_probability: [[0.1647762  0.1568746  0.15801716 0.16868052 0.18193267 0.16971883]]
...

action_probability: [[0.1650113  0.15582189 0.1572336  0.16942467 0.18334386 0.16916464]]
action_probability: [[0.16517547 0.15564115 0.15744631 0.16948067 0.1834438  0.16881262]]
action_probability: [[0.16503823 0.1558579  0.15730189 0.16921191 0.18347502 0.16911507]]
action_probability: [[0.16503394 0.15592517 0.15709373 0.16906649 0.18338688 0.16949378]]
action_probability: [[0.16502805 0.1556866  0.15733363 0.16916677 0.18343541 0.1693495 ]]
action_probability: [[0.16499466 0.15570433 0.1573502  0.16914414 0.18351398 0.16929264]]
action_probability: [[0.16510855 0.15553369 0.15729472 0.1693515  0.18377008 0.1689415 ]]
action_probability: [[0.16514406 0.15580861 0.15716965 0.16928622 0.18367805 0.1689134 ]]
action_probability: [[0.16488728 0.15617561 0.1570867  0.16921282 0.18360943 0.16902816]]
action_probability: [[0.1650839  0.15595269 0.15720971 0.16913721 0.18360426 0.16901228]]
action_probability: [[0.16516434 0.15594251 0.15708087 0.16917102 0.18379214 0.16884911]]
...

action_probability: [[0.16610533 0.15593137 0.15807    0.1700156  0.1819528  0.16792494]]
action_probability: [[0.16582988 0.15554903 0.15809982 0.16941763 0.18339285 0.1677108 ]]
action_probability: [[0.16609553 0.1560867  0.15746847 0.1692952  0.18296766 0.16808642]]
action_probability: [[0.16544266 0.15544036 0.15822808 0.16929048 0.1832824  0.16831602]]
action_probability: [[0.16593093 0.15572248 0.15777722 0.16892198 0.18316132 0.16848604]]
action_probability: [[0.16565585 0.15595129 0.15752968 0.16890383 0.18289626 0.16906306]]
action_probability: [[0.16549261 0.1559395  0.15761143 0.16889183 0.18345949 0.16860515]]
action_probability: [[0.16549261 0.1559395  0.15761143 0.16889183 0.18345949 0.16860515]]
action_probability: [[0.16535379 0.15570179 0.15745117 0.1692107  0.18349458 0.16878797]]
action_probability: [[0.16530898 0.15524909 0.15765631 0.16968258 0.18362057 0.16848248]]
action_probability: [[0.16528241 0.15497255 0.15772456 0.16973686 0.18386474 0.16841887]]
...

action_probability: [[0.16454391 0.15689844 0.1563257  0.17013122 0.18301477 0.16908588]]
action_probability: [[0.16460851 0.15681724 0.15627363 0.17023198 0.18294516 0.16912352]]
action_probability: [[0.16477415 0.15669163 0.15617956 0.17015377 0.18306968 0.16913122]]
action_probability: [[0.1647355  0.15686686 0.15599664 0.16976981 0.1832286  0.1694026 ]]
action_probability: [[0.16445522 0.15712439 0.15628994 0.16985835 0.18306313 0.16920893]]
action_probability: [[0.16437827 0.1571107  0.15628639 0.1699583  0.18311267 0.16915366]]
action_probability: [[0.16441031 0.15709788 0.15643866 0.16983265 0.18298118 0.16923934]]
action_probability: [[0.16452004 0.15710661 0.1564808  0.1699164  0.18295592 0.1690202 ]]
action_probability: [[0.16471846 0.15669635 0.15636112 0.16987157 0.18304472 0.16930778]]
action_probability: [[0.16460268 0.15680335 0.1563752  0.16980222 0.18308583 0.16933072]]
action_probability: [[0.16452122 0.15691102 0.15642676 0.16969159 0.18312326 0.1693262 ]]
...

action_probability: [[0.1643203  0.15777938 0.15636696 0.1691243  0.18243627 0.16997278]]
action_probability: [[0.16467993 0.1571914  0.15628472 0.16955763 0.18267135 0.16961497]]
action_probability: [[0.16453259 0.15701665 0.15635662 0.16994806 0.18294424 0.16920178]]
action_probability: [[0.16444597 0.15671586 0.1565066  0.17009465 0.18324074 0.16899617]]
action_probability: [[0.16472504 0.15644665 0.15674146 0.16984569 0.18321902 0.16902219]]
action_probability: [[0.16469537 0.15673676 0.15682988 0.16960803 0.18294124 0.16918874]]
action_probability: [[0.16472498 0.1570238  0.15678659 0.16933367 0.18283609 0.16929486]]
action_probability: [[0.16484423 0.15682557 0.15677842 0.16938041 0.18315865 0.16901271]]
action_probability: [[0.16472632 0.15714127 0.15651177 0.16924116 0.18320677 0.1691727 ]]
action_probability: [[0.16468033 0.15705638 0.15668331 0.16931082 0.18299973 0.16926946]]
action_probability: [[0.16478376 0.15672763 0.15674989 0.1693786  0.18276066 0.16959949]]
...

action_probability: [[0.16424973 0.15739788 0.15688665 0.16893014 0.18255405 0.16998155]]
action_probability: [[0.16437215 0.1571164  0.15682778 0.1690788  0.18295088 0.169654  ]]
action_probability: [[0.16437215 0.1571164  0.15682778 0.1690788  0.18295088 0.169654  ]]
action_probability: [[0.16420828 0.15749122 0.15683544 0.1690485  0.18247508 0.1699415 ]]
action_probability: [[0.16414443 0.15751615 0.15681237 0.16915828 0.1822624  0.17010635]]
action_probability: [[0.1642003  0.15734598 0.15704595 0.16923574 0.18237053 0.16980158]]
action_probability: [[0.16429637 0.15735105 0.15692377 0.16913347 0.18246096 0.1698344 ]]
action_probability: [[0.16385292 0.1576051  0.15669903 0.16923782 0.18256737 0.17003772]]
action_probability: [[0.16402446 0.1577293  0.15663481 0.16908127 0.18246043 0.17006981]]
action_probability: [[0.16390146 0.15789449 0.15677038 0.16900337 0.18236485 0.17006549]]
action_probability: [[0.16424395 0.15750307 0.1566964  0.16914812 0.18265599 0.16975243]]
...

action_probability: [[0.16246854 0.15809886 0.1565075  0.16890213 0.18361367 0.17040935]]
action_probability: [[0.1624311  0.15783778 0.15680787 0.16925353 0.183513   0.1701567 ]]
action_probability: [[0.16186881 0.15861952 0.15654042 0.16939873 0.18274985 0.17082265]]
action_probability: [[0.16215251 0.15867285 0.15761884 0.16824463 0.18263565 0.17067549]]
action_probability: [[0.16263774 0.15794723 0.15630122 0.1684926  0.18407331 0.17054786]]
action_probability: [[0.16241288 0.15842636 0.15588437 0.16812317 0.18432105 0.17083213]]
action_probability: [[0.16219306 0.1583679  0.15490034 0.16794075 0.18515079 0.1714472 ]]
action_probability: [[0.1631097  0.15754299 0.15539862 0.16881843 0.18424882 0.1708814 ]]
action_probability: [[0.16259773 0.15762855 0.15523775 0.16857383 0.18486273 0.17109944]]
action_probability: [[0.16320962 0.15752943 0.15543444 0.16811717 0.18460897 0.17110044]]
action_probability: [[0.16318105 0.15744714 0.15500857 0.16843222 0.18514766 0.1707834 ]]
...

KeyboardInterrupt: 

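As the output shows, the action probabilities stay close to uniform (roughly 1/6 for each of the six Treechop actions), so at this point the agent behaves almost like a random policy. The run was stopped manually (`KeyboardInterrupt`) before the total reward was printed, so this log does not show that the task has been solved.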
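
One way to quantify how far the logged policy is from random is to compare its entropy with that of a uniform distribution over the six actions. The sketch below does this for the first and last probability rows copied from the log above. The helper name `entropy` is illustrative, and the comparison is a rough diagnostic rather than a measure of task performance.

```python
import numpy as np

# First and last probability rows copied from the log above.
first = np.array([0.16819865, 0.15656818, 0.16091618, 0.16928844, 0.17296357, 0.17206502])
last = np.array([0.16318105, 0.15744714, 0.15500857, 0.16843222, 0.18514766, 0.1707834])


def entropy(p):
    # Shannon entropy in nats.
    return float(-np.sum(p * np.log(p)))


# Entropy of a uniform policy over the same number of actions (the maximum possible).
uniform_entropy = float(np.log(len(first)))

for name, p in [('first', first), ('last', last)]:
    print(name, 'entropy:', entropy(p), 'max prob:', p.max(),
          'uniform entropy:', uniform_entropy)
```

An entropy that stays near the uniform value means the policy has not yet committed to any action, which matches the nearly flat probabilities in the log.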