# Module Five Assignment: Cartpole Problem
Review the code in this notebook and in the score_logger.py file in the *scores* folder (directory). Once you have reviewed the code, return to this notebook and select **Cell** and then **Run All** from the menu bar to run this code. The code takes several minutes to run.

In [1]:
import random  
import gym  
import numpy as np  
from collections import deque  
from keras.models import Sequential  
from keras.layers import Dense  
from keras.optimizers import Adam  
  
  
from scores.score_logger import ScoreLogger  
  
ENV_NAME = "CartPole-v1"  
  
GAMMA = 0.95  
LEARNING_RATE = 0.001  
  
MEMORY_SIZE = 1000000  
BATCH_SIZE = 20  
  
EXPLORATION_MAX = 1.0  
EXPLORATION_MIN = 0.01  
EXPLORATION_DECAY = 0.995  
  
  
class DQNSolver:  
  
    def __init__(self, observation_space, action_space):  
        self.exploration_rate = EXPLORATION_MAX  
  
        self.action_space = action_space  
        self.memory = deque(maxlen=MEMORY_SIZE)  
  
        self.model = Sequential()  
        self.model.add(Dense(24, input_shape=(observation_space,), activation="relu"))  
        self.model.add(Dense(24, activation="relu"))  
        self.model.add(Dense(self.action_space, activation="linear"))  
        self.model.compile(loss="mse", optimizer=Adam(lr=LEARNING_RATE))  
  
    def remember(self, state, action, reward, next_state, done):  
        self.memory.append((state, action, reward, next_state, done))  
  
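    def act(self, state):
        # Epsilon-greedy policy: explore with probability exploration_rate.
        if np.random.rand() < self.exploration_rate:
            return random.randrange(self.action_space)
        # Otherwise exploit: take the action with the highest predicted Q-value.
        q_values = self.model.predict(state)
        return np.argmax(q_values[0])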
  
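    def experience_replay(self):
        # Wait until the memory holds enough transitions to sample a batch.
        if len(self.memory) < BATCH_SIZE:
            return
        batch = random.sample(self.memory, BATCH_SIZE)
        for state, action, reward, state_next, terminal in batch:
            # Target is the reward alone on terminal steps, otherwise the
            # reward plus the discounted best Q-value of the next state.
            q_update = reward
            if not terminal:
                q_update = reward + GAMMA * np.amax(self.model.predict(state_next)[0])
            # Overwrite only the Q-value of the action taken, then fit on it.
            q_values = self.model.predict(state)
            q_values[0][action] = q_update
            self.model.fit(state, q_values, verbose=0)
        # Decay the exploration rate once per replay call, down to the floor.
        self.exploration_rate *= EXPLORATION_DECAY
        self.exploration_rate = max(EXPLORATION_MIN, self.exploration_rate)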
  
  
def cartpole():  
    env = gym.make(ENV_NAME)  
    score_logger = ScoreLogger(ENV_NAME)  
    observation_space = env.observation_space.shape[0]  
    action_space = env.action_space.n  
    dqn_solver = DQNSolver(observation_space, action_space)  
    run = 0  
    while True:  
        run += 1  
        state = env.reset()  
        state = np.reshape(state, [1, observation_space])  
        step = 0  
        while True:  
            step += 1  
            #env.render()  
            action = dqn_solver.act(state)  
            state_next, reward, terminal, info = env.step(action)
            # Penalize the step that ends the episode by negating its reward.
            reward = reward if not terminal else -reward
            state_next = np.reshape(state_next, [1, observation_space])  
            dqn_solver.remember(state, action, reward, state_next, terminal)  
            state = state_next  
            if terminal:  
                print("Run: " + str(run) + ", exploration: " + str(dqn_solver.exploration_rate) + ", score: " + str(step))
                score_logger.add_score(step, run)  
                break  
            dqn_solver.experience_replay()  



Using TensorFlow backend.
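
The target that `experience_replay` fits the network to can be looked at in isolation. The sketch below is a minimal illustration, not part of the assignment code: `q_target` is a hypothetical helper, and the array `next_q` stands in for the output of `self.model.predict(state_next)[0]`, so no Keras model is needed to run it.

```python
import numpy as np

GAMMA = 0.95  # same default discount factor as in the notebook


def q_target(reward, next_q_values, terminal, gamma=GAMMA):
    """Return the training target for the action that was taken.

    next_q_values is an assumed stand-in for model.predict(state_next)[0].
    """
    if terminal:
        # Nothing follows a terminal step, so the target is the reward alone.
        return float(reward)
    # Otherwise add the discounted best Q-value of the next state.
    return float(reward + gamma * np.max(next_q_values))


# Hypothetical Q-value predictions for the two CartPole actions.
next_q = np.array([1.0, 2.0])

non_terminal_target = q_target(1.0, next_q, terminal=False)  # 1 + 0.95 * 2
terminal_target = q_target(-1.0, next_q, terminal=True)      # reward only
print(non_terminal_target, terminal_target)
```

Only the entry for the chosen action is replaced with this target before `fit` is called, so the other action's predicted Q-value contributes no error on that sample.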


In [2]:
cartpole()

Run: 1, exploration: 1.0, score: 15
Scores: (min: 15, avg: 15, max: 15)

Run: 2, exploration: 0.9558895783575597, score: 14
Scores: (min: 14, avg: 14.5, max: 15)

Run: 3, exploration: 0.8911090557802088, score: 15
Scores: (min: 14, avg: 14.666666666666666, max: 15)

Run: 4, exploration: 0.7744209942832988, score: 29
Scores: (min: 14, avg: 18.25, max: 29)

Run: 5, exploration: 0.7183288830986236, score: 16
Scores: (min: 14, avg: 17.8, max: 29)

Run: 6, exploration: 0.6730128848950395, score: 14
Scores: (min: 14, avg: 17.166666666666668, max: 29)

Run: 7, exploration: 0.6305556603555866, score: 14
Scores: (min: 14, avg: 16.714285714285715, max: 29)

Run: 8, exploration: 0.5937455908197752, score: 13
Scores: (min: 13, avg: 16.25, max: 29)

Run: 9, exploration: 0.531750826943791, score: 23
Scores: (min: 13, avg: 17, max: 29)

Run: 10, exploration: 0.4907693883854626, score: 17
Scores: (min: 13, avg: 17, max: 29)

Run: 11, exploration: 0.457510005540005, score: 15
Scores: (min: 13, avg: 16.

Run: 89, exploration: 0.01, score: 153
Scores: (min: 10, avg: 133.53932584269663, max: 352)

Run: 90, exploration: 0.01, score: 149
Scores: (min: 10, avg: 133.7111111111111, max: 352)

Run: 91, exploration: 0.01, score: 133
Scores: (min: 10, avg: 133.7032967032967, max: 352)

Run: 92, exploration: 0.01, score: 215
Scores: (min: 10, avg: 134.58695652173913, max: 352)

Run: 93, exploration: 0.01, score: 193
Scores: (min: 10, avg: 135.21505376344086, max: 352)

Run: 94, exploration: 0.01, score: 182
Scores: (min: 10, avg: 135.7127659574468, max: 352)

Run: 95, exploration: 0.01, score: 161
Scores: (min: 10, avg: 135.97894736842105, max: 352)

Run: 96, exploration: 0.01, score: 185
Scores: (min: 10, avg: 136.48958333333334, max: 352)

Run: 97, exploration: 0.01, score: 199
Scores: (min: 10, avg: 137.1340206185567, max: 352)

Run: 98, exploration: 0.01, score: 127
Scores: (min: 10, avg: 137.03061224489795, max: 352)

Run: 99, exploration: 0.01, score: 264
Scores: (min: 10, avg: 138.31313131

NameError: name 'exit' is not defined

Note: If the code is running properly, output will begin to appear above this note. The run takes several minutes, so let it continue in the background while you complete other work. When the code has finished, it will print output saying, "Solved in _ runs, _ total runs."

You may see the error `NameError: name 'exit' is not defined`. This error does not affect the program's functionality and results from the steps taken to convert the code from Python 2.x to Python 3. Please disregard this error.

## Adjusting the Exploration Factor
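
Because `experience_replay` multiplies the exploration rate by `EXPLORATION_DECAY` once per call, the decay value determines how many replay steps pass before the rate reaches `EXPLORATION_MIN`. The sketch below estimates that count for the three decay values used in this notebook. The helper name `replay_steps_to_floor` is made up for illustration.

```python
import math

EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.01


def replay_steps_to_floor(decay, start=EXPLORATION_MAX, floor=EXPLORATION_MIN):
    """Smallest n such that start * decay**n <= floor."""
    return math.ceil(math.log(floor / start) / math.log(decay))


for decay in (0.9, 0.995, 0.999):
    print(decay, replay_steps_to_floor(decay))
```

With a decay of 0.9 the floor is reached after 44 replay steps, compared with 919 for 0.995 and 4603 for 0.999. This is why the 0.9 run is already at an exploration rate of 0.01 within its first few episodes.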

In [7]:
EXPLORATION_DECAY = 0.9

cartpole()

Run: 1, exploration: 1.0, score: 17
Scores: (min: 17, avg: 17, max: 17)

Run: 2, exploration: 0.1500946352969992, score: 21
Scores: (min: 17, avg: 19, max: 21)

Run: 3, exploration: 0.0523347633027361, score: 11
Scores: (min: 11, avg: 16.333333333333332, max: 21)

Run: 4, exploration: 0.020275559590445278, score: 10
Scores: (min: 10, avg: 14.75, max: 21)

Run: 5, exploration: 0.01, score: 10
Scores: (min: 10, avg: 13.8, max: 21)

Run: 6, exploration: 0.01, score: 9
Scores: (min: 9, avg: 13, max: 21)

Run: 7, exploration: 0.01, score: 8
Scores: (min: 8, avg: 12.285714285714286, max: 21)

Run: 8, exploration: 0.01, score: 10
Scores: (min: 8, avg: 12, max: 21)

Run: 9, exploration: 0.01, score: 11
Scores: (min: 8, avg: 11.88888888888889, max: 21)

Run: 10, exploration: 0.01, score: 11
Scores: (min: 8, avg: 11.8, max: 21)

Run: 11, exploration: 0.01, score: 9
Scores: (min: 8, avg: 11.545454545454545, max: 21)

Run: 12, exploration: 0.01, score: 13
Scores: (min: 8, avg: 11.666666666666666, 

Run: 95, exploration: 0.01, score: 212
Scores: (min: 8, avg: 120.70526315789473, max: 384)

Run: 96, exploration: 0.01, score: 212
Scores: (min: 8, avg: 121.65625, max: 384)

Run: 97, exploration: 0.01, score: 239
Scores: (min: 8, avg: 122.8659793814433, max: 384)

Run: 98, exploration: 0.01, score: 294
Scores: (min: 8, avg: 124.61224489795919, max: 384)

Run: 99, exploration: 0.01, score: 265
Scores: (min: 8, avg: 126.03030303030303, max: 384)

Run: 100, exploration: 0.01, score: 186
Scores: (min: 8, avg: 126.63, max: 384)

Run: 101, exploration: 0.01, score: 233
Scores: (min: 8, avg: 128.79, max: 384)

Run: 102, exploration: 0.01, score: 443
Scores: (min: 8, avg: 133.01, max: 443)

Run: 103, exploration: 0.01, score: 175
Scores: (min: 8, avg: 134.65, max: 443)

Run: 104, exploration: 0.01, score: 223
Scores: (min: 8, avg: 136.78, max: 443)

Run: 105, exploration: 0.01, score: 209
Scores: (min: 8, avg: 138.77, max: 443)

Run: 106, exploration: 0.01, score: 341
Scores: (min: 8, avg: 14

NameError: name 'exit' is not defined

In [8]:
EXPLORATION_DECAY = 0.999

cartpole()

Run: 1, exploration: 1.0, score: 11
Scores: (min: 11, avg: 11, max: 11)

Run: 2, exploration: 0.9900448802097482, score: 19
Scores: (min: 11, avg: 15, max: 19)

Run: 3, exploration: 0.9792086759647052, score: 12
Scores: (min: 11, avg: 14, max: 19)

Run: 4, exploration: 0.9240034012385749, score: 59
Scores: (min: 11, avg: 25.25, max: 59)

Run: 5, exploration: 0.8886435861147077, score: 40
Scores: (min: 11, avg: 28.2, max: 59)

Run: 6, exploration: 0.8789172357313328, score: 12
Scores: (min: 11, avg: 25.5, max: 59)

Run: 7, exploration: 0.8615048875706075, score: 21
Scores: (min: 11, avg: 24.857142857142858, max: 59)

Run: 8, exploration: 0.8520755747117399, score: 12
Scores: (min: 11, avg: 23.25, max: 59)

Run: 9, exploration: 0.838544138970058, score: 17
Scores: (min: 11, avg: 22.555555555555557, max: 59)

Run: 10, exploration: 0.8285367691502946, score: 13
Scores: (min: 11, avg: 21.6, max: 59)

Run: 11, exploration: 0.8064546837933355, score: 28
Scores: (min: 11, avg: 22.1818181818181

Run: 87, exploration: 0.01, score: 141
Scores: (min: 11, avg: 141.45977011494253, max: 422)

Run: 88, exploration: 0.01, score: 245
Scores: (min: 11, avg: 142.63636363636363, max: 422)

Run: 89, exploration: 0.01, score: 151
Scores: (min: 11, avg: 142.73033707865167, max: 422)

Run: 90, exploration: 0.01, score: 158
Scores: (min: 11, avg: 142.9, max: 422)

Run: 91, exploration: 0.01, score: 202
Scores: (min: 11, avg: 143.54945054945054, max: 422)

Run: 92, exploration: 0.01, score: 186
Scores: (min: 11, avg: 144.0108695652174, max: 422)

Run: 93, exploration: 0.01, score: 135
Scores: (min: 11, avg: 143.91397849462365, max: 422)

Run: 94, exploration: 0.01, score: 173
Scores: (min: 11, avg: 144.22340425531914, max: 422)

Run: 95, exploration: 0.01, score: 210
Scores: (min: 11, avg: 144.9157894736842, max: 422)

Run: 96, exploration: 0.01, score: 238
Scores: (min: 11, avg: 145.88541666666666, max: 422)

Run: 97, exploration: 0.01, score: 251
Scores: (min: 11, avg: 146.96907216494844, max

NameError: name 'exit' is not defined
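
The cells that follow change `GAMMA`, the discount factor in the Q-learning target. One rough way to read a value of `GAMMA` is as a planning horizon: a constant reward of 1 per step has a discounted sum that approaches 1 / (1 - GAMMA). The sketch below is an illustration under that simplification, with hypothetical helper names, and is not part of the assignment code.

```python
def discounted_return(gamma, steps, reward=1.0):
    """Sum of reward * gamma**t for t = 0 .. steps - 1."""
    return sum(reward * gamma ** t for t in range(steps))


def effective_horizon(gamma):
    """Limit of the discounted sum of a constant reward of 1: 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)


for gamma in (0.5, 0.95, 0.99):
    # 500 is the step cap of CartPole-v1 episodes seen in the logs.
    print(gamma, effective_horizon(gamma), discounted_return(gamma, 500))
```

Under this reading, 0.5 weighs roughly the next 2 steps, 0.95 roughly 20, and 0.99 roughly 100, which is one plausible lens for comparing the score logs of the runs below.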

In [None]:
GAMMA = 0.99

cartpole()

Run: 1, exploration: 0.9811700348643991, score: 39
Scores: (min: 39, avg: 39, max: 39)

Run: 2, exploration: 0.9540649618417361, score: 29
Scores: (min: 29, avg: 34, max: 39)

Run: 3, exploration: 0.9342286880693633, score: 22
Scores: (min: 22, avg: 30, max: 39)

Run: 4, exploration: 0.9102399514140735, score: 27
Scores: (min: 22, avg: 29.25, max: 39)

Run: 5, exploration: 0.9011784036598737, score: 11
Scores: (min: 11, avg: 25.6, max: 39)

Run: 6, exploration: 0.857205969570888, score: 51
Scores: (min: 11, avg: 29.833333333333332, max: 51)

Run: 7, exploration: 0.8419067177676068, score: 19
Scores: (min: 11, avg: 28.285714285714285, max: 51)

Run: 8, exploration: 0.818648829478636, score: 29
Scores: (min: 11, avg: 28.375, max: 51)

Run: 9, exploration: 0.8064546837933355, score: 16
Scores: (min: 11, avg: 27, max: 51)

Run: 10, exploration: 0.759467619567903, score: 61
Scores: (min: 11, avg: 30.4, max: 61)

Run: 11, exploration: 0.7407070321560997, score: 26
Scores: (min: 11, avg: 30, 

Run: 86, exploration: 0.01, score: 500
Scores: (min: 10, avg: 168.53488372093022, max: 500)

Run: 87, exploration: 0.01, score: 225
Scores: (min: 10, avg: 169.183908045977, max: 500)

Run: 88, exploration: 0.01, score: 264
Scores: (min: 10, avg: 170.26136363636363, max: 500)

Run: 89, exploration: 0.01, score: 149
Scores: (min: 10, avg: 170.02247191011236, max: 500)

Run: 90, exploration: 0.01, score: 9
Scores: (min: 9, avg: 168.23333333333332, max: 500)

Run: 91, exploration: 0.01, score: 10
Scores: (min: 9, avg: 166.4945054945055, max: 500)

Run: 92, exploration: 0.01, score: 9
Scores: (min: 9, avg: 164.7826086956522, max: 500)

Run: 93, exploration: 0.01, score: 10
Scores: (min: 9, avg: 163.11827956989248, max: 500)

Run: 94, exploration: 0.01, score: 9
Scores: (min: 9, avg: 161.4787234042553, max: 500)

Run: 95, exploration: 0.01, score: 10
Scores: (min: 9, avg: 159.8842105263158, max: 500)

Run: 96, exploration: 0.01, score: 10
Scores: (min: 9, avg: 158.32291666666666, max: 500)



Run: 188, exploration: 0.01, score: 41
Scores: (min: 8, avg: 11.62, max: 149)

Run: 189, exploration: 0.01, score: 18
Scores: (min: 8, avg: 10.31, max: 41)

Run: 190, exploration: 0.01, score: 95
Scores: (min: 8, avg: 11.17, max: 95)

Run: 191, exploration: 0.01, score: 118
Scores: (min: 8, avg: 12.25, max: 118)

Run: 192, exploration: 0.01, score: 428
Scores: (min: 8, avg: 16.44, max: 428)

Run: 193, exploration: 0.01, score: 203
Scores: (min: 8, avg: 18.37, max: 428)

Run: 194, exploration: 0.01, score: 355
Scores: (min: 8, avg: 21.83, max: 428)

Run: 195, exploration: 0.01, score: 166
Scores: (min: 8, avg: 23.39, max: 428)

Run: 196, exploration: 0.01, score: 157
Scores: (min: 8, avg: 24.86, max: 428)

Run: 197, exploration: 0.01, score: 210
Scores: (min: 8, avg: 26.86, max: 428)

Run: 198, exploration: 0.01, score: 221
Scores: (min: 8, avg: 28.98, max: 428)

Run: 199, exploration: 0.01, score: 218
Scores: (min: 8, avg: 31.07, max: 428)

Run: 200, exploration: 0.01, score: 500
Score

Run: 292, exploration: 0.01, score: 10
Scores: (min: 8, avg: 110.65, max: 500)

Run: 293, exploration: 0.01, score: 9
Scores: (min: 8, avg: 108.71, max: 500)

Run: 294, exploration: 0.01, score: 10
Scores: (min: 8, avg: 105.26, max: 500)

Run: 295, exploration: 0.01, score: 10
Scores: (min: 8, avg: 103.7, max: 500)

Run: 296, exploration: 0.01, score: 11
Scores: (min: 8, avg: 102.24, max: 500)

Run: 297, exploration: 0.01, score: 10
Scores: (min: 8, avg: 100.24, max: 500)

Run: 298, exploration: 0.01, score: 9
Scores: (min: 8, avg: 98.12, max: 500)

Run: 299, exploration: 0.01, score: 9
Scores: (min: 8, avg: 96.03, max: 500)

Run: 300, exploration: 0.01, score: 10
Scores: (min: 8, avg: 91.13, max: 500)

Run: 301, exploration: 0.01, score: 9
Scores: (min: 8, avg: 89.53, max: 500)

Run: 302, exploration: 0.01, score: 8
Scores: (min: 8, avg: 87.64, max: 500)

Run: 303, exploration: 0.01, score: 10
Scores: (min: 8, avg: 82.74, max: 500)

Run: 304, exploration: 0.01, score: 10
Scores: (min:

Run: 397, exploration: 0.01, score: 103
Scores: (min: 8, avg: 23.46, max: 217)

Run: 398, exploration: 0.01, score: 124
Scores: (min: 8, avg: 24.61, max: 217)

Run: 399, exploration: 0.01, score: 92
Scores: (min: 8, avg: 25.44, max: 217)

Run: 400, exploration: 0.01, score: 122
Scores: (min: 8, avg: 26.56, max: 217)

Run: 401, exploration: 0.01, score: 147
Scores: (min: 8, avg: 27.94, max: 217)

Run: 402, exploration: 0.01, score: 133
Scores: (min: 8, avg: 29.19, max: 217)

Run: 403, exploration: 0.01, score: 128
Scores: (min: 8, avg: 30.37, max: 217)

Run: 404, exploration: 0.01, score: 138
Scores: (min: 8, avg: 31.65, max: 217)

Run: 405, exploration: 0.01, score: 261
Scores: (min: 8, avg: 34.15, max: 261)

Run: 406, exploration: 0.01, score: 225
Scores: (min: 8, avg: 36.31, max: 261)

Run: 407, exploration: 0.01, score: 189
Scores: (min: 8, avg: 38.1, max: 261)

Run: 408, exploration: 0.01, score: 310
Scores: (min: 8, avg: 41.1, max: 310)

Run: 409, exploration: 0.01, score: 424
Sco

Run: 502, exploration: 0.01, score: 9
Scores: (min: 8, avg: 35.37, max: 424)

Run: 503, exploration: 0.01, score: 9
Scores: (min: 8, avg: 34.18, max: 424)

Run: 504, exploration: 0.01, score: 9
Scores: (min: 8, avg: 32.89, max: 424)

Run: 505, exploration: 0.01, score: 10
Scores: (min: 8, avg: 30.38, max: 424)

Run: 506, exploration: 0.01, score: 10
Scores: (min: 8, avg: 28.23, max: 424)

Run: 507, exploration: 0.01, score: 9
Scores: (min: 8, avg: 26.43, max: 424)

Run: 508, exploration: 0.01, score: 10
Scores: (min: 8, avg: 23.43, max: 424)

Run: 509, exploration: 0.01, score: 9
Scores: (min: 8, avg: 19.28, max: 219)

Run: 510, exploration: 0.01, score: 8
Scores: (min: 8, avg: 18.09, max: 219)

Run: 511, exploration: 0.01, score: 8
Scores: (min: 8, avg: 16.51, max: 219)

Run: 512, exploration: 0.01, score: 9
Scores: (min: 8, avg: 15.06, max: 219)

Run: 513, exploration: 0.01, score: 10
Scores: (min: 8, avg: 12.97, max: 123)

Run: 514, exploration: 0.01, score: 12
Scores: (min: 8, avg:

Run: 606, exploration: 0.01, score: 258
Scores: (min: 8, avg: 110.21, max: 356)

Run: 607, exploration: 0.01, score: 96
Scores: (min: 8, avg: 111.08, max: 356)

Run: 608, exploration: 0.01, score: 178
Scores: (min: 8, avg: 112.76, max: 356)

Run: 609, exploration: 0.01, score: 202
Scores: (min: 8, avg: 114.69, max: 356)

Run: 610, exploration: 0.01, score: 292
Scores: (min: 8, avg: 117.53, max: 356)

Run: 611, exploration: 0.01, score: 262
Scores: (min: 9, avg: 120.07, max: 356)

Run: 612, exploration: 0.01, score: 117
Scores: (min: 9, avg: 121.15, max: 356)

Run: 613, exploration: 0.01, score: 232
Scores: (min: 9, avg: 123.37, max: 356)

Run: 614, exploration: 0.01, score: 163
Scores: (min: 9, avg: 124.88, max: 356)

Run: 615, exploration: 0.01, score: 138
Scores: (min: 12, avg: 126.17, max: 356)

Run: 616, exploration: 0.01, score: 161
Scores: (min: 12, avg: 127.66, max: 356)

Run: 617, exploration: 0.01, score: 301
Scores: (min: 14, avg: 130.55, max: 356)

Run: 618, exploration: 0.0

In [2]:
GAMMA = 0.5

cartpole()

Run: 1, exploration: 0.9275689688183278, score: 35
Scores: (min: 35, avg: 35, max: 35)

Run: 2, exploration: 0.8603841919146962, score: 16
Scores: (min: 16, avg: 25.5, max: 35)

Run: 3, exploration: 0.7861544476842928, score: 19
Scores: (min: 16, avg: 23.333333333333332, max: 35)

Run: 4, exploration: 0.7292124703704616, score: 16
Scores: (min: 16, avg: 21.5, max: 35)

Run: 5, exploration: 0.6866430931872001, score: 13
Scores: (min: 13, avg: 19.8, max: 35)

Run: 6, exploration: 0.6433260027715241, score: 14
Scores: (min: 13, avg: 18.833333333333332, max: 35)

Run: 7, exploration: 0.5535075230322891, score: 31
Scores: (min: 13, avg: 20.571428571428573, max: 35)

Run: 8, exploration: 0.510849320360386, score: 17
Scores: (min: 13, avg: 20.125, max: 35)

Run: 9, exploration: 0.4159480862733536, score: 42
Scores: (min: 13, avg: 22.555555555555557, max: 42)

Run: 10, exploration: 0.37627099809304654, score: 21
Scores: (min: 13, avg: 22.4, max: 42)

Run: 11, exploration: 0.3455358541129786, s

Run: 90, exploration: 0.01, score: 34
Scores: (min: 13, avg: 86.02222222222223, max: 486)

Run: 91, exploration: 0.01, score: 53
Scores: (min: 13, avg: 85.65934065934066, max: 486)

Run: 92, exploration: 0.01, score: 122
Scores: (min: 13, avg: 86.05434782608695, max: 486)

Run: 93, exploration: 0.01, score: 189
Scores: (min: 13, avg: 87.16129032258064, max: 486)

Run: 94, exploration: 0.01, score: 22
Scores: (min: 13, avg: 86.46808510638297, max: 486)

Run: 95, exploration: 0.01, score: 19
Scores: (min: 13, avg: 85.7578947368421, max: 486)

Run: 96, exploration: 0.01, score: 41
Scores: (min: 13, avg: 85.29166666666667, max: 486)

Run: 97, exploration: 0.01, score: 61
Scores: (min: 13, avg: 85.04123711340206, max: 486)

Run: 98, exploration: 0.01, score: 15
Scores: (min: 13, avg: 84.3265306122449, max: 486)

Run: 99, exploration: 0.01, score: 42
Scores: (min: 13, avg: 83.8989898989899, max: 486)

Run: 100, exploration: 0.01, score: 38
Scores: (min: 13, avg: 83.44, max: 486)

Run: 101, e

Run: 192, exploration: 0.01, score: 67
Scores: (min: 10, avg: 60.25, max: 422)

Run: 193, exploration: 0.01, score: 116
Scores: (min: 10, avg: 59.52, max: 422)

Run: 194, exploration: 0.01, score: 84
Scores: (min: 10, avg: 60.14, max: 422)

Run: 195, exploration: 0.01, score: 12
Scores: (min: 10, avg: 60.07, max: 422)

Run: 196, exploration: 0.01, score: 122
Scores: (min: 10, avg: 60.88, max: 422)

Run: 197, exploration: 0.01, score: 11
Scores: (min: 10, avg: 60.38, max: 422)

Run: 198, exploration: 0.01, score: 44
Scores: (min: 10, avg: 60.67, max: 422)

Run: 199, exploration: 0.01, score: 40
Scores: (min: 10, avg: 60.65, max: 422)

Run: 200, exploration: 0.01, score: 64
Scores: (min: 10, avg: 60.91, max: 422)

Run: 201, exploration: 0.01, score: 11
Scores: (min: 10, avg: 60.13, max: 422)

Run: 202, exploration: 0.01, score: 11
Scores: (min: 10, avg: 60.03, max: 422)

Run: 203, exploration: 0.01, score: 31
Scores: (min: 10, avg: 59.86, max: 422)

Run: 204, exploration: 0.01, score: 11

Run: 295, exploration: 0.01, score: 16
Scores: (min: 10, avg: 52.02, max: 175)

Run: 296, exploration: 0.01, score: 12
Scores: (min: 10, avg: 50.92, max: 175)

Run: 297, exploration: 0.01, score: 32
Scores: (min: 10, avg: 51.13, max: 175)

Run: 298, exploration: 0.01, score: 90
Scores: (min: 10, avg: 51.59, max: 175)

Run: 299, exploration: 0.01, score: 52
Scores: (min: 10, avg: 51.71, max: 175)

Run: 300, exploration: 0.01, score: 35
Scores: (min: 10, avg: 51.42, max: 175)

Run: 301, exploration: 0.01, score: 57
Scores: (min: 10, avg: 51.88, max: 175)

Run: 302, exploration: 0.01, score: 13
Scores: (min: 10, avg: 51.9, max: 175)

Run: 303, exploration: 0.01, score: 30
Scores: (min: 10, avg: 51.89, max: 175)

Run: 304, exploration: 0.01, score: 76
Scores: (min: 10, avg: 51.55, max: 175)

Run: 305, exploration: 0.01, score: 99
Scores: (min: 10, avg: 52.34, max: 175)

Run: 306, exploration: 0.01, score: 94
Scores: (min: 10, avg: 52.4, max: 175)

Run: 307, exploration: 0.01, score: 52
Sco

Run: 398, exploration: 0.01, score: 87
Scores: (min: 10, avg: 49.37, max: 166)

Run: 399, exploration: 0.01, score: 11
Scores: (min: 10, avg: 48.96, max: 166)

Run: 400, exploration: 0.01, score: 65
Scores: (min: 10, avg: 49.26, max: 166)

Run: 401, exploration: 0.01, score: 73
Scores: (min: 10, avg: 49.42, max: 166)

Run: 402, exploration: 0.01, score: 48
Scores: (min: 10, avg: 49.77, max: 166)

Run: 403, exploration: 0.01, score: 18
Scores: (min: 10, avg: 49.65, max: 166)

Run: 404, exploration: 0.01, score: 31
Scores: (min: 10, avg: 49.2, max: 166)

Run: 405, exploration: 0.01, score: 70
Scores: (min: 10, avg: 48.91, max: 166)

Run: 406, exploration: 0.01, score: 31
Scores: (min: 10, avg: 48.28, max: 166)

Run: 407, exploration: 0.01, score: 28
Scores: (min: 10, avg: 48.04, max: 166)

Run: 408, exploration: 0.01, score: 104
Scores: (min: 10, avg: 48.47, max: 166)

Run: 409, exploration: 0.01, score: 27
Scores: (min: 10, avg: 47.39, max: 166)

Run: 410, exploration: 0.01, score: 47
S

Run: 501, exploration: 0.01, score: 36
Scores: (min: 10, avg: 46.56, max: 136)

Run: 502, exploration: 0.01, score: 124
Scores: (min: 10, avg: 47.32, max: 136)

Run: 503, exploration: 0.01, score: 15
Scores: (min: 10, avg: 47.29, max: 136)

Run: 504, exploration: 0.01, score: 44
Scores: (min: 10, avg: 47.42, max: 136)

Run: 505, exploration: 0.01, score: 17
Scores: (min: 10, avg: 46.89, max: 136)

Run: 506, exploration: 0.01, score: 37
Scores: (min: 10, avg: 46.95, max: 136)

Run: 507, exploration: 0.01, score: 11
Scores: (min: 10, avg: 46.78, max: 136)

Run: 508, exploration: 0.01, score: 107
Scores: (min: 10, avg: 46.81, max: 136)

Run: 509, exploration: 0.01, score: 108
Scores: (min: 10, avg: 47.62, max: 136)

Run: 510, exploration: 0.01, score: 12
Scores: (min: 10, avg: 47.27, max: 136)

Run: 511, exploration: 0.01, score: 130
Scores: (min: 10, avg: 47.86, max: 136)

Run: 512, exploration: 0.01, score: 12
Scores: (min: 10, avg: 47.87, max: 136)

Run: 513, exploration: 0.01, score: 

Run: 604, exploration: 0.01, score: 12
Scores: (min: 10, avg: 51.55, max: 149)

Run: 605, exploration: 0.01, score: 19
Scores: (min: 10, avg: 51.57, max: 149)

Run: 606, exploration: 0.01, score: 49
Scores: (min: 10, avg: 51.69, max: 149)

Run: 607, exploration: 0.01, score: 10
Scores: (min: 10, avg: 51.68, max: 149)

Run: 608, exploration: 0.01, score: 50
Scores: (min: 10, avg: 51.11, max: 149)

Run: 609, exploration: 0.01, score: 79
Scores: (min: 10, avg: 50.82, max: 149)

Run: 610, exploration: 0.01, score: 11
Scores: (min: 10, avg: 50.81, max: 149)

Run: 611, exploration: 0.01, score: 27
Scores: (min: 10, avg: 49.78, max: 149)

Run: 612, exploration: 0.01, score: 11
Scores: (min: 10, avg: 49.77, max: 149)

Run: 613, exploration: 0.01, score: 94
Scores: (min: 10, avg: 50.6, max: 149)

Run: 614, exploration: 0.01, score: 13
Scores: (min: 10, avg: 49.81, max: 149)

Run: 615, exploration: 0.01, score: 27
Scores: (min: 10, avg: 49.98, max: 149)

Run: 616, exploration: 0.01, score: 73
Sc

Run: 708, exploration: 0.01, score: 10
Scores: (min: 9, avg: 43.87, max: 163)

Run: 709, exploration: 0.01, score: 70
Scores: (min: 9, avg: 43.78, max: 163)

Run: 710, exploration: 0.01, score: 11
Scores: (min: 9, avg: 43.78, max: 163)

Run: 711, exploration: 0.01, score: 11
Scores: (min: 9, avg: 43.62, max: 163)

Run: 712, exploration: 0.01, score: 40
Scores: (min: 9, avg: 43.91, max: 163)

Run: 713, exploration: 0.01, score: 10
Scores: (min: 9, avg: 43.07, max: 163)

Run: 714, exploration: 0.01, score: 15
Scores: (min: 9, avg: 43.09, max: 163)

Run: 715, exploration: 0.01, score: 58
Scores: (min: 9, avg: 43.4, max: 163)

Run: 716, exploration: 0.01, score: 10
Scores: (min: 9, avg: 42.77, max: 163)

Run: 717, exploration: 0.01, score: 223
Scores: (min: 9, avg: 44.79, max: 223)

Run: 718, exploration: 0.01, score: 45
Scores: (min: 9, avg: 44.9, max: 223)

Run: 719, exploration: 0.01, score: 11
Scores: (min: 9, avg: 44.64, max: 223)

Run: 720, exploration: 0.01, score: 56
Scores: (min: 

Run: 812, exploration: 0.01, score: 11
Scores: (min: 10, avg: 50.07, max: 223)

Run: 813, exploration: 0.01, score: 75
Scores: (min: 10, avg: 50.72, max: 223)

Run: 814, exploration: 0.01, score: 11
Scores: (min: 10, avg: 50.68, max: 223)

Run: 815, exploration: 0.01, score: 38
Scores: (min: 10, avg: 50.48, max: 223)

Run: 816, exploration: 0.01, score: 109
Scores: (min: 10, avg: 51.47, max: 223)

Run: 817, exploration: 0.01, score: 45
Scores: (min: 10, avg: 49.69, max: 204)

Run: 818, exploration: 0.01, score: 65
Scores: (min: 10, avg: 49.89, max: 204)

Run: 819, exploration: 0.01, score: 73
Scores: (min: 10, avg: 50.51, max: 204)

Run: 820, exploration: 0.01, score: 15
Scores: (min: 10, avg: 50.1, max: 204)

Run: 821, exploration: 0.01, score: 17
Scores: (min: 10, avg: 50.15, max: 204)

Run: 822, exploration: 0.01, score: 32
Scores: (min: 10, avg: 50.36, max: 204)

Run: 823, exploration: 0.01, score: 54
Scores: (min: 10, avg: 50.64, max: 204)

Run: 824, exploration: 0.01, score: 16
S

Run: 916, exploration: 0.01, score: 14
Scores: (min: 9, avg: 40.81, max: 145)

Run: 917, exploration: 0.01, score: 12
Scores: (min: 9, avg: 40.48, max: 145)

Run: 918, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.95, max: 145)

Run: 919, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.34, max: 145)

Run: 920, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.31, max: 145)

Run: 921, exploration: 0.01, score: 47
Scores: (min: 9, avg: 39.61, max: 145)

Run: 922, exploration: 0.01, score: 13
Scores: (min: 9, avg: 39.42, max: 145)

Run: 923, exploration: 0.01, score: 35
Scores: (min: 9, avg: 39.23, max: 145)

Run: 924, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.19, max: 145)

Run: 925, exploration: 0.01, score: 36
Scores: (min: 9, avg: 38.43, max: 145)

Run: 926, exploration: 0.01, score: 16
Scores: (min: 9, avg: 38.08, max: 145)

Run: 927, exploration: 0.01, score: 72
Scores: (min: 9, avg: 38.18, max: 145)

Run: 928, exploration: 0.01, score: 16
Scores: (min:

Run: 1019, exploration: 0.01, score: 20
Scores: (min: 10, avg: 39.08, max: 154)

Run: 1020, exploration: 0.01, score: 21
Scores: (min: 10, avg: 39.17, max: 154)

Run: 1021, exploration: 0.01, score: 11
Scores: (min: 10, avg: 38.81, max: 154)

Run: 1022, exploration: 0.01, score: 11
Scores: (min: 10, avg: 38.79, max: 154)

Run: 1023, exploration: 0.01, score: 47
Scores: (min: 10, avg: 38.91, max: 154)

Run: 1024, exploration: 0.01, score: 31
Scores: (min: 10, avg: 39.1, max: 154)

Run: 1025, exploration: 0.01, score: 110
Scores: (min: 10, avg: 39.84, max: 154)

Run: 1026, exploration: 0.01, score: 17
Scores: (min: 10, avg: 39.85, max: 154)

Run: 1027, exploration: 0.01, score: 22
Scores: (min: 10, avg: 39.35, max: 154)

Run: 1028, exploration: 0.01, score: 20
Scores: (min: 10, avg: 39.39, max: 154)

Run: 1029, exploration: 0.01, score: 47
Scores: (min: 10, avg: 39.42, max: 154)

Run: 1030, exploration: 0.01, score: 12
Scores: (min: 10, avg: 38.76, max: 154)

Run: 1031, exploration: 0.01

Run: 1121, exploration: 0.01, score: 11
Scores: (min: 9, avg: 40.11, max: 137)

Run: 1122, exploration: 0.01, score: 21
Scores: (min: 9, avg: 40.21, max: 137)

Run: 1123, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.86, max: 137)

Run: 1124, exploration: 0.01, score: 13
Scores: (min: 9, avg: 39.68, max: 137)

Run: 1125, exploration: 0.01, score: 11
Scores: (min: 9, avg: 38.69, max: 137)

Run: 1126, exploration: 0.01, score: 10
Scores: (min: 9, avg: 38.62, max: 137)

Run: 1127, exploration: 0.01, score: 14
Scores: (min: 9, avg: 38.54, max: 137)

Run: 1128, exploration: 0.01, score: 11
Scores: (min: 9, avg: 38.45, max: 137)

Run: 1129, exploration: 0.01, score: 42
Scores: (min: 9, avg: 38.4, max: 137)

Run: 1130, exploration: 0.01, score: 16
Scores: (min: 9, avg: 38.44, max: 137)

Run: 1131, exploration: 0.01, score: 84
Scores: (min: 9, avg: 39.17, max: 137)

Run: 1132, exploration: 0.01, score: 12
Scores: (min: 9, avg: 38.98, max: 137)

Run: 1133, exploration: 0.01, score: 161


Run: 1224, exploration: 0.01, score: 9
Scores: (min: 9, avg: 37.8, max: 161)

Run: 1225, exploration: 0.01, score: 10
Scores: (min: 9, avg: 37.79, max: 161)

Run: 1226, exploration: 0.01, score: 11
Scores: (min: 9, avg: 37.8, max: 161)

Run: 1227, exploration: 0.01, score: 19
Scores: (min: 9, avg: 37.85, max: 161)

Run: 1228, exploration: 0.01, score: 12
Scores: (min: 9, avg: 37.86, max: 161)

Run: 1229, exploration: 0.01, score: 39
Scores: (min: 9, avg: 37.83, max: 161)

Run: 1230, exploration: 0.01, score: 16
Scores: (min: 9, avg: 37.83, max: 161)

Run: 1231, exploration: 0.01, score: 26
Scores: (min: 9, avg: 37.25, max: 161)

Run: 1232, exploration: 0.01, score: 91
Scores: (min: 9, avg: 38.04, max: 161)

Run: 1233, exploration: 0.01, score: 101
Scores: (min: 9, avg: 37.44, max: 145)

Run: 1234, exploration: 0.01, score: 104
Scores: (min: 9, avg: 38.38, max: 145)

Run: 1235, exploration: 0.01, score: 56
Scores: (min: 9, avg: 38.19, max: 145)

Run: 1236, exploration: 0.01, score: 48
S

Run: 1327, exploration: 0.01, score: 76
Scores: (min: 9, avg: 44.57, max: 129)

Run: 1328, exploration: 0.01, score: 75
Scores: (min: 9, avg: 45.2, max: 129)

Run: 1329, exploration: 0.01, score: 19
Scores: (min: 9, avg: 45, max: 129)

Run: 1330, exploration: 0.01, score: 53
Scores: (min: 9, avg: 45.37, max: 129)

Run: 1331, exploration: 0.01, score: 18
Scores: (min: 9, avg: 45.29, max: 129)

Run: 1332, exploration: 0.01, score: 15
Scores: (min: 9, avg: 44.53, max: 129)

Run: 1333, exploration: 0.01, score: 101
Scores: (min: 9, avg: 44.53, max: 129)

Run: 1334, exploration: 0.01, score: 27
Scores: (min: 9, avg: 43.76, max: 129)

Run: 1335, exploration: 0.01, score: 28
Scores: (min: 9, avg: 43.48, max: 129)

Run: 1336, exploration: 0.01, score: 12
Scores: (min: 9, avg: 43.12, max: 129)

Run: 1337, exploration: 0.01, score: 93
Scores: (min: 9, avg: 43.84, max: 129)

Run: 1338, exploration: 0.01, score: 49
Scores: (min: 9, avg: 43.49, max: 129)

Run: 1339, exploration: 0.01, score: 34
Sco

Run: 1429, exploration: 0.01, score: 15
Scores: (min: 10, avg: 38.17, max: 133)

Run: 1430, exploration: 0.01, score: 18
Scores: (min: 10, avg: 37.82, max: 133)

Run: 1431, exploration: 0.01, score: 14
Scores: (min: 10, avg: 37.78, max: 133)

Run: 1432, exploration: 0.01, score: 16
Scores: (min: 10, avg: 37.79, max: 133)

Run: 1433, exploration: 0.01, score: 56
Scores: (min: 10, avg: 37.34, max: 133)

Run: 1434, exploration: 0.01, score: 13
Scores: (min: 10, avg: 37.2, max: 133)

Run: 1435, exploration: 0.01, score: 172
Scores: (min: 10, avg: 38.64, max: 172)

Run: 1436, exploration: 0.01, score: 33
Scores: (min: 10, avg: 38.85, max: 172)

Run: 1437, exploration: 0.01, score: 62
Scores: (min: 10, avg: 38.54, max: 172)

Run: 1438, exploration: 0.01, score: 10
Scores: (min: 10, avg: 38.15, max: 172)

Run: 1439, exploration: 0.01, score: 13
Scores: (min: 10, avg: 37.94, max: 172)

Run: 1440, exploration: 0.01, score: 12
Scores: (min: 10, avg: 37.5, max: 172)

Run: 1441, exploration: 0.01,

Run: 1531, exploration: 0.01, score: 11
Scores: (min: 10, avg: 37.87, max: 322)

Run: 1532, exploration: 0.01, score: 10
Scores: (min: 10, avg: 37.81, max: 322)

Run: 1533, exploration: 0.01, score: 12
Scores: (min: 10, avg: 37.37, max: 322)

Run: 1534, exploration: 0.01, score: 10
Scores: (min: 10, avg: 37.34, max: 322)

Run: 1535, exploration: 0.01, score: 12
Scores: (min: 10, avg: 35.74, max: 322)

Run: 1536, exploration: 0.01, score: 19
Scores: (min: 10, avg: 35.6, max: 322)

Run: 1537, exploration: 0.01, score: 80
Scores: (min: 10, avg: 35.78, max: 322)

Run: 1538, exploration: 0.01, score: 9
Scores: (min: 9, avg: 35.77, max: 322)

Run: 1539, exploration: 0.01, score: 16
Scores: (min: 9, avg: 35.8, max: 322)

Run: 1540, exploration: 0.01, score: 11
Scores: (min: 9, avg: 35.79, max: 322)

Run: 1541, exploration: 0.01, score: 42
Scores: (min: 9, avg: 35.6, max: 322)

Run: 1542, exploration: 0.01, score: 38
Scores: (min: 9, avg: 35.62, max: 322)

Run: 1543, exploration: 0.01, score: 

Run: 1634, exploration: 0.01, score: 47
Scores: (min: 9, avg: 34.04, max: 112)

Run: 1635, exploration: 0.01, score: 33
Scores: (min: 9, avg: 34.25, max: 112)

Run: 1636, exploration: 0.01, score: 11
Scores: (min: 9, avg: 34.17, max: 112)

Run: 1637, exploration: 0.01, score: 14
Scores: (min: 9, avg: 33.51, max: 112)

Run: 1638, exploration: 0.01, score: 10
Scores: (min: 9, avg: 33.52, max: 112)

Run: 1639, exploration: 0.01, score: 58
Scores: (min: 9, avg: 33.94, max: 112)

Run: 1640, exploration: 0.01, score: 35
Scores: (min: 9, avg: 34.18, max: 112)

Run: 1641, exploration: 0.01, score: 18
Scores: (min: 9, avg: 33.94, max: 112)

Run: 1642, exploration: 0.01, score: 96
Scores: (min: 9, avg: 34.52, max: 112)

Run: 1643, exploration: 0.01, score: 73
Scores: (min: 9, avg: 35.11, max: 112)

Run: 1644, exploration: 0.01, score: 11
Scores: (min: 9, avg: 35.06, max: 112)

Run: 1645, exploration: 0.01, score: 40
Scores: (min: 9, avg: 35.35, max: 112)

Run: 1646, exploration: 0.01, score: 75


Run: 1737, exploration: 0.01, score: 15
Scores: (min: 9, avg: 42.06, max: 182)

Run: 1738, exploration: 0.01, score: 93
Scores: (min: 9, avg: 42.89, max: 182)

Run: 1739, exploration: 0.01, score: 36
Scores: (min: 9, avg: 42.67, max: 182)

Run: 1740, exploration: 0.01, score: 23
Scores: (min: 9, avg: 42.55, max: 182)

Run: 1741, exploration: 0.01, score: 94
Scores: (min: 9, avg: 43.31, max: 182)

Run: 1742, exploration: 0.01, score: 23
Scores: (min: 9, avg: 42.58, max: 182)

Run: 1743, exploration: 0.01, score: 10
Scores: (min: 9, avg: 41.95, max: 182)

Run: 1744, exploration: 0.01, score: 11
Scores: (min: 9, avg: 41.95, max: 182)

Run: 1745, exploration: 0.01, score: 25
Scores: (min: 9, avg: 41.8, max: 182)

Run: 1746, exploration: 0.01, score: 45
Scores: (min: 9, avg: 41.5, max: 182)

Run: 1747, exploration: 0.01, score: 61
Scores: (min: 9, avg: 41.99, max: 182)

Run: 1748, exploration: 0.01, score: 107
Scores: (min: 9, avg: 42.67, max: 182)

Run: 1749, exploration: 0.01, score: 26
S

Run: 1840, exploration: 0.01, score: 10
Scores: (min: 9, avg: 36.9, max: 125)

Run: 1841, exploration: 0.01, score: 13
Scores: (min: 9, avg: 36.09, max: 125)

Run: 1842, exploration: 0.01, score: 71
Scores: (min: 9, avg: 36.57, max: 125)

Run: 1843, exploration: 0.01, score: 27
Scores: (min: 9, avg: 36.74, max: 125)

Run: 1844, exploration: 0.01, score: 94
Scores: (min: 9, avg: 37.57, max: 125)

Run: 1845, exploration: 0.01, score: 11
Scores: (min: 9, avg: 37.43, max: 125)

Run: 1846, exploration: 0.01, score: 18
Scores: (min: 9, avg: 37.16, max: 125)

Run: 1847, exploration: 0.01, score: 30
Scores: (min: 9, avg: 36.85, max: 125)

Run: 1848, exploration: 0.01, score: 41
Scores: (min: 9, avg: 36.19, max: 125)

Run: 1849, exploration: 0.01, score: 11
Scores: (min: 9, avg: 36.04, max: 125)

Run: 1850, exploration: 0.01, score: 29
Scores: (min: 9, avg: 35.75, max: 125)

Run: 1851, exploration: 0.01, score: 16
Scores: (min: 9, avg: 35.13, max: 125)

Run: 1852, exploration: 0.01, score: 17
S

Run: 1943, exploration: 0.01, score: 49
Scores: (min: 9, avg: 31.21, max: 100)

Run: 1944, exploration: 0.01, score: 23
Scores: (min: 9, avg: 30.5, max: 100)

Run: 1945, exploration: 0.01, score: 90
Scores: (min: 9, avg: 31.29, max: 100)

Run: 1946, exploration: 0.01, score: 45
Scores: (min: 9, avg: 31.56, max: 100)

Run: 1947, exploration: 0.01, score: 59
Scores: (min: 9, avg: 31.85, max: 100)

Run: 1948, exploration: 0.01, score: 30
Scores: (min: 9, avg: 31.74, max: 100)

Run: 1949, exploration: 0.01, score: 21
Scores: (min: 9, avg: 31.84, max: 100)

Run: 1950, exploration: 0.01, score: 18
Scores: (min: 9, avg: 31.73, max: 100)

Run: 1951, exploration: 0.01, score: 17
Scores: (min: 9, avg: 31.74, max: 100)

Run: 1952, exploration: 0.01, score: 159
Scores: (min: 9, avg: 33.16, max: 159)

Run: 1953, exploration: 0.01, score: 65
Scores: (min: 9, avg: 32.81, max: 159)

Run: 1954, exploration: 0.01, score: 82
Scores: (min: 9, avg: 32.73, max: 159)

Run: 1955, exploration: 0.01, score: 22


Run: 2046, exploration: 0.01, score: 106
Scores: (min: 9, avg: 38.63, max: 160)

Run: 2047, exploration: 0.01, score: 32
Scores: (min: 9, avg: 38.36, max: 160)

Run: 2048, exploration: 0.01, score: 39
Scores: (min: 9, avg: 38.45, max: 160)

Run: 2049, exploration: 0.01, score: 80
Scores: (min: 9, avg: 39.04, max: 160)

Run: 2050, exploration: 0.01, score: 12
Scores: (min: 9, avg: 38.98, max: 160)

Run: 2051, exploration: 0.01, score: 34
Scores: (min: 9, avg: 39.15, max: 160)

Run: 2052, exploration: 0.01, score: 42
Scores: (min: 9, avg: 37.98, max: 160)

Run: 2053, exploration: 0.01, score: 63
Scores: (min: 9, avg: 37.96, max: 160)

Run: 2054, exploration: 0.01, score: 15
Scores: (min: 9, avg: 37.29, max: 160)

Run: 2055, exploration: 0.01, score: 35
Scores: (min: 9, avg: 37.42, max: 160)

Run: 2056, exploration: 0.01, score: 89
Scores: (min: 9, avg: 38.13, max: 160)

Run: 2057, exploration: 0.01, score: 19
Scores: (min: 9, avg: 38.19, max: 160)

Run: 2058, exploration: 0.01, score: 12

Run: 2148, exploration: 0.01, score: 12
Scores: (min: 10, avg: 41.53, max: 210)

Run: 2149, exploration: 0.01, score: 12
Scores: (min: 10, avg: 40.85, max: 210)

Run: 2150, exploration: 0.01, score: 25
Scores: (min: 10, avg: 40.98, max: 210)

Run: 2151, exploration: 0.01, score: 24
Scores: (min: 10, avg: 40.88, max: 210)

Run: 2152, exploration: 0.01, score: 11
Scores: (min: 10, avg: 40.57, max: 210)

Run: 2153, exploration: 0.01, score: 10
Scores: (min: 10, avg: 40.04, max: 210)

Run: 2154, exploration: 0.01, score: 10
Scores: (min: 10, avg: 39.99, max: 210)

Run: 2155, exploration: 0.01, score: 22
Scores: (min: 10, avg: 39.86, max: 210)

Run: 2156, exploration: 0.01, score: 39
Scores: (min: 10, avg: 39.36, max: 210)

Run: 2157, exploration: 0.01, score: 11
Scores: (min: 10, avg: 39.28, max: 210)

Run: 2158, exploration: 0.01, score: 12
Scores: (min: 10, avg: 38.13, max: 210)

Run: 2159, exploration: 0.01, score: 10
Scores: (min: 10, avg: 37.3, max: 210)

Run: 2160, exploration: 0.01,

Run: 2251, exploration: 0.01, score: 14
Scores: (min: 9, avg: 38.64, max: 144)

Run: 2252, exploration: 0.01, score: 11
Scores: (min: 9, avg: 38.64, max: 144)

Run: 2253, exploration: 0.01, score: 37
Scores: (min: 9, avg: 38.91, max: 144)

Run: 2254, exploration: 0.01, score: 40
Scores: (min: 9, avg: 39.21, max: 144)

Run: 2255, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.11, max: 144)

Run: 2256, exploration: 0.01, score: 11
Scores: (min: 9, avg: 38.83, max: 144)

Run: 2257, exploration: 0.01, score: 33
Scores: (min: 9, avg: 39.05, max: 144)

Run: 2258, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.05, max: 144)

Run: 2259, exploration: 0.01, score: 10
Scores: (min: 9, avg: 39.05, max: 144)

Run: 2260, exploration: 0.01, score: 36
Scores: (min: 9, avg: 39.22, max: 144)

Run: 2261, exploration: 0.01, score: 92
Scores: (min: 9, avg: 39.87, max: 144)

Run: 2262, exploration: 0.01, score: 27
Scores: (min: 9, avg: 40.01, max: 144)

Run: 2263, exploration: 0.01, score: 17


Run: 2353, exploration: 0.01, score: 110
Scores: (min: 9, avg: 44.58, max: 151)

Run: 2354, exploration: 0.01, score: 79
Scores: (min: 9, avg: 44.97, max: 151)

Run: 2355, exploration: 0.01, score: 159
Scores: (min: 9, avg: 46.44, max: 159)

Run: 2356, exploration: 0.01, score: 31
Scores: (min: 9, avg: 46.64, max: 159)

Run: 2357, exploration: 0.01, score: 25
Scores: (min: 9, avg: 46.56, max: 159)

Run: 2358, exploration: 0.01, score: 16
Scores: (min: 9, avg: 46.6, max: 159)

Run: 2359, exploration: 0.01, score: 12
Scores: (min: 9, avg: 46.62, max: 159)

Run: 2360, exploration: 0.01, score: 27
Scores: (min: 9, avg: 46.53, max: 159)

Run: 2361, exploration: 0.01, score: 49
Scores: (min: 9, avg: 46.1, max: 159)

Run: 2362, exploration: 0.01, score: 10
Scores: (min: 9, avg: 45.93, max: 159)

Run: 2363, exploration: 0.01, score: 38
Scores: (min: 9, avg: 46.14, max: 159)

Run: 2364, exploration: 0.01, score: 23
Scores: (min: 9, avg: 46.12, max: 159)

Run: 2365, exploration: 0.01, score: 13


Run: 2456, exploration: 0.01, score: 54
Scores: (min: 9, avg: 42.23, max: 189)

Run: 2457, exploration: 0.01, score: 13
Scores: (min: 9, avg: 42.11, max: 189)

Run: 2458, exploration: 0.01, score: 28
Scores: (min: 9, avg: 42.23, max: 189)

Run: 2459, exploration: 0.01, score: 79
Scores: (min: 9, avg: 42.9, max: 189)

Run: 2460, exploration: 0.01, score: 41
Scores: (min: 9, avg: 43.04, max: 189)

Run: 2461, exploration: 0.01, score: 14
Scores: (min: 9, avg: 42.69, max: 189)

Run: 2462, exploration: 0.01, score: 16
Scores: (min: 9, avg: 42.75, max: 189)

Run: 2463, exploration: 0.01, score: 80
Scores: (min: 9, avg: 43.17, max: 189)

Run: 2464, exploration: 0.01, score: 47
Scores: (min: 9, avg: 43.41, max: 189)

Run: 2465, exploration: 0.01, score: 12
Scores: (min: 9, avg: 43.4, max: 189)

Run: 2466, exploration: 0.01, score: 58
Scores: (min: 9, avg: 43.84, max: 189)

Run: 2467, exploration: 0.01, score: 13
Scores: (min: 9, avg: 43.14, max: 189)

Run: 2468, exploration: 0.01, score: 17
Sc

Run: 2559, exploration: 0.01, score: 100
Scores: (min: 9, avg: 46.65, max: 180)

Run: 2560, exploration: 0.01, score: 10
Scores: (min: 9, avg: 46.34, max: 180)

Run: 2561, exploration: 0.01, score: 13
Scores: (min: 9, avg: 46.33, max: 180)

Run: 2562, exploration: 0.01, score: 59
Scores: (min: 9, avg: 46.76, max: 180)

Run: 2563, exploration: 0.01, score: 90
Scores: (min: 9, avg: 46.86, max: 180)

Run: 2564, exploration: 0.01, score: 81
Scores: (min: 9, avg: 47.2, max: 180)

Run: 2565, exploration: 0.01, score: 12
Scores: (min: 9, avg: 47.2, max: 180)

Run: 2566, exploration: 0.01, score: 43
Scores: (min: 9, avg: 47.05, max: 180)

Run: 2567, exploration: 0.01, score: 13
Scores: (min: 9, avg: 47.05, max: 180)

Run: 2568, exploration: 0.01, score: 29
Scores: (min: 9, avg: 47.17, max: 180)

Run: 2569, exploration: 0.01, score: 33
Scores: (min: 9, avg: 47.37, max: 180)

Run: 2570, exploration: 0.01, score: 39
Scores: (min: 9, avg: 47.12, max: 180)

Run: 2571, exploration: 0.01, score: 56
S

Run: 2662, exploration: 0.01, score: 12
Scores: (min: 9, avg: 46.79, max: 116)

Run: 2663, exploration: 0.01, score: 11
Scores: (min: 9, avg: 46, max: 116)

Run: 2664, exploration: 0.01, score: 55
Scores: (min: 9, avg: 45.74, max: 116)

Run: 2665, exploration: 0.01, score: 11
Scores: (min: 9, avg: 45.73, max: 116)

Run: 2666, exploration: 0.01, score: 21
Scores: (min: 9, avg: 45.51, max: 116)

Run: 2667, exploration: 0.01, score: 17
Scores: (min: 9, avg: 45.55, max: 116)

Run: 2668, exploration: 0.01, score: 13
Scores: (min: 9, avg: 45.39, max: 116)

Run: 2669, exploration: 0.01, score: 96
Scores: (min: 9, avg: 46.02, max: 116)

Run: 2670, exploration: 0.01, score: 22
Scores: (min: 9, avg: 45.85, max: 116)

Run: 2671, exploration: 0.01, score: 47
Scores: (min: 9, avg: 45.76, max: 116)

Run: 2672, exploration: 0.01, score: 39
Scores: (min: 9, avg: 45.46, max: 116)

Run: 2673, exploration: 0.01, score: 62
Scores: (min: 9, avg: 45.87, max: 116)

Run: 2674, exploration: 0.01, score: 14
Sco

Run: 2765, exploration: 0.01, score: 13
Scores: (min: 9, avg: 39.86, max: 142)

Run: 2766, exploration: 0.01, score: 56
Scores: (min: 9, avg: 40.21, max: 142)

Run: 2767, exploration: 0.01, score: 17
Scores: (min: 9, avg: 40.21, max: 142)

Run: 2768, exploration: 0.01, score: 11
Scores: (min: 9, avg: 40.19, max: 142)

Run: 2769, exploration: 0.01, score: 77
Scores: (min: 9, avg: 40, max: 142)

Run: 2770, exploration: 0.01, score: 19
Scores: (min: 9, avg: 39.97, max: 142)

Run: 2771, exploration: 0.01, score: 94
Scores: (min: 9, avg: 40.44, max: 142)

Run: 2772, exploration: 0.01, score: 21
Scores: (min: 9, avg: 40.26, max: 142)

Run: 2773, exploration: 0.01, score: 12
Scores: (min: 9, avg: 39.76, max: 142)

Run: 2774, exploration: 0.01, score: 10
Scores: (min: 9, avg: 39.72, max: 142)

Run: 2775, exploration: 0.01, score: 119
Scores: (min: 9, avg: 40.42, max: 142)

Run: 2776, exploration: 0.01, score: 29
Scores: (min: 9, avg: 40.38, max: 142)

Run: 2777, exploration: 0.01, score: 11
Sc

Run: 2868, exploration: 0.01, score: 69
Scores: (min: 9, avg: 39.91, max: 158)

Run: 2869, exploration: 0.01, score: 26
Scores: (min: 9, avg: 39.4, max: 158)

Run: 2870, exploration: 0.01, score: 69
Scores: (min: 9, avg: 39.9, max: 158)

Run: 2871, exploration: 0.01, score: 93
Scores: (min: 9, avg: 39.89, max: 158)

Run: 2872, exploration: 0.01, score: 25
Scores: (min: 9, avg: 39.93, max: 158)

Run: 2873, exploration: 0.01, score: 69
Scores: (min: 9, avg: 40.5, max: 158)

Run: 2874, exploration: 0.01, score: 11
Scores: (min: 9, avg: 40.51, max: 158)

Run: 2875, exploration: 0.01, score: 98
Scores: (min: 9, avg: 40.3, max: 158)

Run: 2876, exploration: 0.01, score: 98
Scores: (min: 9, avg: 40.99, max: 158)

Run: 2877, exploration: 0.01, score: 15
Scores: (min: 9, avg: 41.03, max: 158)

Run: 2878, exploration: 0.01, score: 92
Scores: (min: 9, avg: 41.33, max: 158)

Run: 2879, exploration: 0.01, score: 18
Scores: (min: 9, avg: 41.32, max: 158)

Run: 2880, exploration: 0.01, score: 66
Scor

KeyboardInterrupt: 

In [2]:
GAMMA = 0.95
LEARNING_RATE = 0.01

cartpole()

Run: 1, exploration: 1.0, score: 20
Scores: (min: 20, avg: 20, max: 20)

Run: 2, exploration: 0.8020760579717637, score: 45
Scores: (min: 20, avg: 32.5, max: 45)

Run: 3, exploration: 0.7111635524897149, score: 25
Scores: (min: 20, avg: 30, max: 45)

Run: 4, exploration: 0.6629680834613705, score: 15
Scores: (min: 15, avg: 26.25, max: 45)

Run: 5, exploration: 0.6180388156137953, score: 15
Scores: (min: 15, avg: 24, max: 45)

Run: 6, exploration: 0.5907768628656763, score: 10
Scores: (min: 10, avg: 21.666666666666668, max: 45)

Run: 7, exploration: 0.5562889678716474, score: 13
Scores: (min: 10, avg: 20.428571428571427, max: 45)

Run: 8, exploration: 0.4907693883854626, score: 26
Scores: (min: 10, avg: 21.125, max: 45)

Run: 9, exploration: 0.47147873742168567, score: 9
Scores: (min: 9, avg: 19.77777777777778, max: 45)

Run: 10, exploration: 0.4506816115185697, score: 10
Scores: (min: 9, avg: 18.8, max: 45)

Run: 11, exploration: 0.4036245882390106, score: 23
Scores: (min: 9, avg: 19.1

Run: 87, exploration: 0.01, score: 20
Scores: (min: 9, avg: 30.448275862068964, max: 161)

Run: 88, exploration: 0.01, score: 59
Scores: (min: 9, avg: 30.772727272727273, max: 161)

Run: 89, exploration: 0.01, score: 20
Scores: (min: 9, avg: 30.651685393258425, max: 161)

Run: 90, exploration: 0.01, score: 65
Scores: (min: 9, avg: 31.033333333333335, max: 161)

Run: 91, exploration: 0.01, score: 58
Scores: (min: 9, avg: 31.32967032967033, max: 161)

Run: 92, exploration: 0.01, score: 74
Scores: (min: 9, avg: 31.793478260869566, max: 161)

Run: 93, exploration: 0.01, score: 74
Scores: (min: 9, avg: 32.24731182795699, max: 161)

Run: 94, exploration: 0.01, score: 19
Scores: (min: 9, avg: 32.1063829787234, max: 161)

Run: 95, exploration: 0.01, score: 99
Scores: (min: 9, avg: 32.810526315789474, max: 161)

Run: 96, exploration: 0.01, score: 59
Scores: (min: 9, avg: 33.083333333333336, max: 161)

Run: 97, exploration: 0.01, score: 10
Scores: (min: 9, avg: 32.845360824742265, max: 161)

Run

Run: 189, exploration: 0.01, score: 29
Scores: (min: 8, avg: 38.73, max: 148)

Run: 190, exploration: 0.01, score: 17
Scores: (min: 8, avg: 38.25, max: 148)

Run: 191, exploration: 0.01, score: 29
Scores: (min: 8, avg: 37.96, max: 148)

Run: 192, exploration: 0.01, score: 26
Scores: (min: 8, avg: 37.48, max: 148)

Run: 193, exploration: 0.01, score: 25
Scores: (min: 8, avg: 36.99, max: 148)

Run: 194, exploration: 0.01, score: 10
Scores: (min: 8, avg: 36.9, max: 148)

Run: 195, exploration: 0.01, score: 58
Scores: (min: 8, avg: 36.49, max: 148)

Run: 196, exploration: 0.01, score: 41
Scores: (min: 8, avg: 36.31, max: 148)

Run: 197, exploration: 0.01, score: 57
Scores: (min: 8, avg: 36.78, max: 148)

Run: 198, exploration: 0.01, score: 20
Scores: (min: 8, avg: 36.18, max: 148)

Run: 199, exploration: 0.01, score: 54
Scores: (min: 8, avg: 36.61, max: 148)

Run: 200, exploration: 0.01, score: 10
Scores: (min: 8, avg: 35.93, max: 148)

Run: 201, exploration: 0.01, score: 70
Scores: (min: 

Run: 293, exploration: 0.01, score: 9
Scores: (min: 8, avg: 38.23, max: 193)

Run: 294, exploration: 0.01, score: 8
Scores: (min: 8, avg: 38.21, max: 193)

Run: 295, exploration: 0.01, score: 10
Scores: (min: 8, avg: 37.73, max: 193)

Run: 296, exploration: 0.01, score: 10
Scores: (min: 8, avg: 37.42, max: 193)

Run: 297, exploration: 0.01, score: 10
Scores: (min: 8, avg: 36.95, max: 193)

Run: 298, exploration: 0.01, score: 9
Scores: (min: 8, avg: 36.84, max: 193)

Run: 299, exploration: 0.01, score: 39
Scores: (min: 8, avg: 36.69, max: 193)

Run: 300, exploration: 0.01, score: 50
Scores: (min: 8, avg: 37.09, max: 193)

Run: 301, exploration: 0.01, score: 10
Scores: (min: 8, avg: 36.49, max: 193)

Run: 302, exploration: 0.01, score: 10
Scores: (min: 8, avg: 35.38, max: 193)

Run: 303, exploration: 0.01, score: 9
Scores: (min: 8, avg: 33.98, max: 193)

Run: 304, exploration: 0.01, score: 115
Scores: (min: 8, avg: 34.89, max: 193)

Run: 305, exploration: 0.01, score: 36
Scores: (min: 8,

Run: 397, exploration: 0.01, score: 15
Scores: (min: 8, avg: 24.85, max: 138)

Run: 398, exploration: 0.01, score: 33
Scores: (min: 8, avg: 25.09, max: 138)

Run: 399, exploration: 0.01, score: 20
Scores: (min: 8, avg: 24.9, max: 138)

Run: 400, exploration: 0.01, score: 16
Scores: (min: 8, avg: 24.56, max: 138)

Run: 401, exploration: 0.01, score: 15
Scores: (min: 8, avg: 24.61, max: 138)

Run: 402, exploration: 0.01, score: 10
Scores: (min: 8, avg: 24.61, max: 138)

Run: 403, exploration: 0.01, score: 21
Scores: (min: 8, avg: 24.73, max: 138)

Run: 404, exploration: 0.01, score: 23
Scores: (min: 8, avg: 23.81, max: 138)

Run: 405, exploration: 0.01, score: 14
Scores: (min: 8, avg: 23.59, max: 138)

Run: 406, exploration: 0.01, score: 11
Scores: (min: 8, avg: 23.6, max: 138)

Run: 407, exploration: 0.01, score: 29
Scores: (min: 8, avg: 23.79, max: 138)

Run: 408, exploration: 0.01, score: 11
Scores: (min: 8, avg: 23.51, max: 138)

Run: 409, exploration: 0.01, score: 16
Scores: (min: 8

Run: 502, exploration: 0.01, score: 15
Scores: (min: 8, avg: 18.9, max: 65)

Run: 503, exploration: 0.01, score: 14
Scores: (min: 8, avg: 18.83, max: 65)

Run: 504, exploration: 0.01, score: 11
Scores: (min: 8, avg: 18.71, max: 65)

Run: 505, exploration: 0.01, score: 8
Scores: (min: 8, avg: 18.65, max: 65)

Run: 506, exploration: 0.01, score: 22
Scores: (min: 8, avg: 18.76, max: 65)

Run: 507, exploration: 0.01, score: 25
Scores: (min: 8, avg: 18.72, max: 65)

Run: 508, exploration: 0.01, score: 42
Scores: (min: 8, avg: 19.03, max: 65)

Run: 509, exploration: 0.01, score: 12
Scores: (min: 8, avg: 18.99, max: 65)

Run: 510, exploration: 0.01, score: 13
Scores: (min: 8, avg: 19.01, max: 65)

Run: 511, exploration: 0.01, score: 21
Scores: (min: 8, avg: 18.96, max: 65)

Run: 512, exploration: 0.01, score: 22
Scores: (min: 8, avg: 19.06, max: 65)

Run: 513, exploration: 0.01, score: 17
Scores: (min: 8, avg: 19.06, max: 65)

Run: 514, exploration: 0.01, score: 8
Scores: (min: 8, avg: 19.03,

Run: 608, exploration: 0.01, score: 10
Scores: (min: 8, avg: 18.25, max: 68)

Run: 609, exploration: 0.01, score: 20
Scores: (min: 8, avg: 18.33, max: 68)

Run: 610, exploration: 0.01, score: 25
Scores: (min: 8, avg: 18.45, max: 68)

Run: 611, exploration: 0.01, score: 10
Scores: (min: 8, avg: 18.34, max: 68)

Run: 612, exploration: 0.01, score: 9
Scores: (min: 8, avg: 18.21, max: 68)

Run: 613, exploration: 0.01, score: 9
Scores: (min: 8, avg: 18.13, max: 68)

Run: 614, exploration: 0.01, score: 11
Scores: (min: 8, avg: 18.16, max: 68)

Run: 615, exploration: 0.01, score: 9
Scores: (min: 8, avg: 18.13, max: 68)

Run: 616, exploration: 0.01, score: 9
Scores: (min: 8, avg: 18.12, max: 68)

Run: 617, exploration: 0.01, score: 34
Scores: (min: 8, avg: 18.31, max: 68)

Run: 618, exploration: 0.01, score: 12
Scores: (min: 8, avg: 18.18, max: 68)

Run: 619, exploration: 0.01, score: 11
Scores: (min: 8, avg: 18.14, max: 68)

Run: 620, exploration: 0.01, score: 10
Scores: (min: 8, avg: 18.09, 

Run: 714, exploration: 0.01, score: 15
Scores: (min: 8, avg: 16.16, max: 57)

Run: 715, exploration: 0.01, score: 10
Scores: (min: 8, avg: 16.17, max: 57)

Run: 716, exploration: 0.01, score: 17
Scores: (min: 8, avg: 16.25, max: 57)

Run: 717, exploration: 0.01, score: 14
Scores: (min: 8, avg: 16.05, max: 57)

Run: 718, exploration: 0.01, score: 17
Scores: (min: 8, avg: 16.1, max: 57)

Run: 719, exploration: 0.01, score: 15
Scores: (min: 8, avg: 16.14, max: 57)

Run: 720, exploration: 0.01, score: 8
Scores: (min: 8, avg: 16.12, max: 57)

Run: 721, exploration: 0.01, score: 14
Scores: (min: 8, avg: 16.13, max: 57)

Run: 722, exploration: 0.01, score: 16
Scores: (min: 8, avg: 16.18, max: 57)

Run: 723, exploration: 0.01, score: 47
Scores: (min: 8, avg: 16.34, max: 57)

Run: 724, exploration: 0.01, score: 19
Scores: (min: 8, avg: 16.39, max: 57)

Run: 725, exploration: 0.01, score: 13
Scores: (min: 8, avg: 16.42, max: 57)

Run: 726, exploration: 0.01, score: 9
Scores: (min: 8, avg: 16.33,

Run: 820, exploration: 0.01, score: 18
Scores: (min: 8, avg: 16.02, max: 47)

Run: 821, exploration: 0.01, score: 9
Scores: (min: 8, avg: 15.97, max: 47)

Run: 822, exploration: 0.01, score: 12
Scores: (min: 8, avg: 15.93, max: 47)

Run: 823, exploration: 0.01, score: 19
Scores: (min: 8, avg: 15.65, max: 35)

Run: 824, exploration: 0.01, score: 9
Scores: (min: 8, avg: 15.55, max: 35)

Run: 825, exploration: 0.01, score: 8
Scores: (min: 8, avg: 15.5, max: 35)

Run: 826, exploration: 0.01, score: 9
Scores: (min: 8, avg: 15.5, max: 35)

Run: 827, exploration: 0.01, score: 20
Scores: (min: 8, avg: 15.41, max: 35)

Run: 828, exploration: 0.01, score: 17
Scores: (min: 8, avg: 15.38, max: 35)

Run: 829, exploration: 0.01, score: 11
Scores: (min: 8, avg: 15.21, max: 35)

Run: 830, exploration: 0.01, score: 33
Scores: (min: 8, avg: 15.4, max: 35)

Run: 831, exploration: 0.01, score: 8
Scores: (min: 8, avg: 15.34, max: 35)

Run: 832, exploration: 0.01, score: 12
Scores: (min: 8, avg: 15.33, max:

Scores: (min: 8, avg: 15.39, max: 41)

Run: 926, exploration: 0.01, score: 43
Scores: (min: 8, avg: 15.73, max: 43)

Run: 927, exploration: 0.01, score: 10
Scores: (min: 8, avg: 15.63, max: 43)

Run: 928, exploration: 0.01, score: 17
Scores: (min: 8, avg: 15.63, max: 43)

Run: 929, exploration: 0.01, score: 10
Scores: (min: 8, avg: 15.62, max: 43)

Run: 930, exploration: 0.01, score: 30
Scores: (min: 8, avg: 15.59, max: 43)

Run: 931, exploration: 0.01, score: 10
Scores: (min: 8, avg: 15.61, max: 43)

Run: 932, exploration: 0.01, score: 10
Scores: (min: 8, avg: 15.59, max: 43)

Run: 933, exploration: 0.01, score: 15
Scores: (min: 8, avg: 15.6, max: 43)

Run: 934, exploration: 0.01, score: 24
Scores: (min: 8, avg: 15.61, max: 43)

Run: 935, exploration: 0.01, score: 19
Scores: (min: 8, avg: 15.61, max: 43)

Run: 936, exploration: 0.01, score: 17
Scores: (min: 8, avg: 15.49, max: 43)

Run: 937, exploration: 0.01, score: 49
Scores: (min: 8, avg: 15.79, max: 49)

Run: 938, exploration: 0.0

Run: 1031, exploration: 0.01, score: 10
Scores: (min: 8, avg: 13.79, max: 55)

Run: 1032, exploration: 0.01, score: 10
Scores: (min: 8, avg: 13.79, max: 55)

Run: 1033, exploration: 0.01, score: 20
Scores: (min: 8, avg: 13.84, max: 55)

Run: 1034, exploration: 0.01, score: 15
Scores: (min: 8, avg: 13.75, max: 55)

Run: 1035, exploration: 0.01, score: 32
Scores: (min: 8, avg: 13.88, max: 55)

Run: 1036, exploration: 0.01, score: 19
Scores: (min: 8, avg: 13.9, max: 55)

Run: 1037, exploration: 0.01, score: 10
Scores: (min: 8, avg: 13.51, max: 55)

Run: 1038, exploration: 0.01, score: 14
Scores: (min: 8, avg: 13.44, max: 55)

Run: 1039, exploration: 0.01, score: 11
Scores: (min: 8, avg: 13.4, max: 55)

Run: 1040, exploration: 0.01, score: 12
Scores: (min: 8, avg: 13.26, max: 55)

Run: 1041, exploration: 0.01, score: 17
Scores: (min: 8, avg: 13.32, max: 55)

Run: 1042, exploration: 0.01, score: 9
Scores: (min: 8, avg: 13.14, max: 55)

Run: 1043, exploration: 0.01, score: 10
Scores: (min: 8

Run: 1136, exploration: 0.01, score: 9
Scores: (min: 8, avg: 9.76, max: 17)

Run: 1137, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.76, max: 17)

Run: 1138, exploration: 0.01, score: 8
Scores: (min: 8, avg: 9.7, max: 17)

Run: 1139, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.69, max: 17)

Run: 1140, exploration: 0.01, score: 8
Scores: (min: 8, avg: 9.65, max: 17)

Run: 1141, exploration: 0.01, score: 9
Scores: (min: 8, avg: 9.57, max: 13)

Run: 1142, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.58, max: 13)

Run: 1143, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.58, max: 13)

Run: 1144, exploration: 0.01, score: 9
Scores: (min: 8, avg: 9.58, max: 13)

Run: 1145, exploration: 0.01, score: 9
Scores: (min: 8, avg: 9.57, max: 13)

Run: 1146, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.57, max: 13)

Run: 1147, exploration: 0.01, score: 10
Scores: (min: 8, avg: 9.58, max: 13)

Run: 1148, exploration: 0.01, score: 9
Scores: (min: 8, avg: 9.57, max:

Run: 1240, exploration: 0.01, score: 193
Scores: (min: 9, avg: 84.02, max: 465)

Run: 1241, exploration: 0.01, score: 149
Scores: (min: 9, avg: 85.42, max: 465)

Run: 1242, exploration: 0.01, score: 358
Scores: (min: 9, avg: 88.9, max: 465)

Run: 1243, exploration: 0.01, score: 199
Scores: (min: 9, avg: 90.79, max: 465)

Run: 1244, exploration: 0.01, score: 168
Scores: (min: 9, avg: 92.38, max: 465)

Run: 1245, exploration: 0.01, score: 180
Scores: (min: 9, avg: 94.09, max: 465)

Run: 1246, exploration: 0.01, score: 273
Scores: (min: 9, avg: 96.72, max: 465)

Run: 1247, exploration: 0.01, score: 195
Scores: (min: 9, avg: 98.57, max: 465)

Run: 1248, exploration: 0.01, score: 236
Scores: (min: 9, avg: 100.84, max: 465)

Run: 1249, exploration: 0.01, score: 211
Scores: (min: 9, avg: 102.85, max: 465)

Run: 1250, exploration: 0.01, score: 241
Scores: (min: 9, avg: 105.17, max: 465)

Run: 1251, exploration: 0.01, score: 244
Scores: (min: 9, avg: 107.52, max: 465)

Run: 1252, exploration: 0

NameError: name 'exit' is not defined

### Reinforcement Learning
The CartPole problem is one in which a pole is attached to a cart by a pivot, and the agent's goal is to keep the pole upright by moving the cart back and forth. The state consists of the cart's position and velocity and the pole's angle and angular velocity (Gymnasium, n.d.). The only possible actions are to push the cart left or right, and the model is trained to choose between them using a Deep Q-Network.
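
The choice between the two actions, and the falling `exploration` value in the logs, can be illustrated with a minimal sketch. It reuses the notebook's exploration constants, but the helper names `epsilon_greedy` and `steps_until_floor` are hypothetical, and it assumes the decay is applied once per training step, which the logged values appear consistent with.

```python
import random

EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.01
EXPLORATION_DECAY = 0.995


def epsilon_greedy(q_values, exploration_rate, rng=random):
    """Pick a random action with probability exploration_rate, else the greedy one."""
    if rng.random() < exploration_rate:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def steps_until_floor(start=EXPLORATION_MAX, floor=EXPLORATION_MIN,
                      decay=EXPLORATION_DECAY):
    """Count multiplicative decay steps until the rate is clamped at the floor."""
    rate, steps = start, 0
    while rate > floor:
        rate = max(floor, rate * decay)
        steps += 1
    return steps
```

With an exploration rate of 0, `epsilon_greedy([0.2, 0.7], 0.0)` always returns action 1, the one with the larger Q-value. Under the stated assumption, `steps_until_floor()` shows that the rate reaches its 0.01 floor after several hundred training steps, which is why almost every logged run reports `exploration: 0.01`.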

### Experience Replay
In this algorithm, past experiences are stored in memory and random batches of them are replayed for training. This helps stabilize the model by ensuring it is not trained only on the most recent action and does not overfit to a narrow slice of the training data. A discount factor is used when calculating future rewards: we need to strike a balance between immediate reward and reward from future states, and the discount factor lets us reduce the weight given to future rewards.
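
As a rough sketch of how replay and discounting fit together, the snippet below samples a batch from memory and computes the training target for one transition. `GAMMA` and `BATCH_SIZE` match the notebook; the function names `td_target` and `sample_batch` are hypothetical, and a plain list of Q-values stands in for the network's prediction.

```python
import random
from collections import deque

GAMMA = 0.95       # discount factor, as in the notebook
BATCH_SIZE = 20    # replay batch size, as in the notebook


def td_target(reward, next_q_values, terminal, gamma=GAMMA):
    """Return the Q-learning target for one stored transition."""
    if terminal:
        # No future reward once the episode has ended.
        return reward
    # Immediate reward plus the discounted best estimate for the next state.
    return reward + gamma * max(next_q_values)


def sample_batch(memory, batch_size=BATCH_SIZE):
    """Draw a random batch, or an empty list until enough experience exists."""
    if len(memory) < batch_size:
        return []
    return random.sample(list(memory), batch_size)
```

A smaller `gamma` shrinks the second term of the target, so the agent weighs the immediate reward more heavily than what it expects to earn later.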

### Neural Networks
This network has two hidden layers of 24 units each and an output layer with one value per action (move left and move right). Tabular Q-learning works well for problems with a fixed, finite number of states but struggles with continuous state spaces like CartPole. This is where neural networks come in: they provide a more generalizable approximation of the Q-table that does not depend on enumerating the states in advance (Beysolow, 2019). The learning rate controls how quickly the neural network changes its weights. With a learning rate that is too high, the model jumps around and may not converge. With a learning rate that is too low, the model takes a very long time to improve and may get stuck in a local optimum.
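
The effect of the learning rate can be seen on a toy problem. The sketch below is not the DQN from this notebook: it runs plain gradient descent on the one-dimensional function f(w) = w², and the function name `descend` and the chosen rates are illustrative assumptions.

```python
def descend(learning_rate, start=1.0, steps=50):
    """Run gradient descent on f(w) = w**2 and return the final weight."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w          # derivative of w**2
        w = w - learning_rate * gradient
    return w


if __name__ == "__main__":
    for rate in (0.001, 0.1, 1.1):
        print(f"learning rate {rate}: final w = {descend(rate):.6g}")
```

On this toy function, the smallest rate barely moves the weight toward the minimum at 0 in 50 steps, the middle rate converges, and the largest rate overshoots on every step so the weight grows instead of shrinking.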


## References

Gymnasium Documentation. (n.d.). Gymnasium.farama.org. https://gymnasium.farama.org/

Lamba, A. (2018, September 3). An introduction to Q-Learning: reinforcement learning. freeCodeCamp. https://medium.com/free-code-camp/an-introduction-to-q-learning-reinforcement-learning-14ac0b4493cc

Patel, R. (2025, June 29). Master CartPole RL: Unlock Fun Learning with Setup, Q-Learning & More! Aigreeks.com. https://aigreeks.com/cartpole-magic-master-reinforcement-learning/

Beysolow II, Taweh. (2019). Applied Reinforcement Learning with Python: With OpenAI Gym, TensorFlow, and Keras. Apress.
