# Module Five Assignment: Cartpole Problem
Review the code in this notebook and in the score_logger.py file in the *scores* folder (directory). Once you have reviewed the code, return to this notebook and select **Cell** and then **Run All** from the menu bar to run this code. The code takes several minutes to run.

In [3]:
import random  
import gym  
import numpy as np  
from collections import deque  
from keras.models import Sequential  
from keras.layers import Dense  
from keras.optimizers import Adam  
  
  
from scores.score_logger import ScoreLogger  
  
ENV_NAME = "CartPole-v1"  
  
GAMMA = 0.95  
LEARNING_RATE = 0.001  
  
MEMORY_SIZE = 1000000  
BATCH_SIZE = 20  
  
EXPLORATION_MAX = 1.0  
EXPLORATION_MIN = 0.01  
EXPLORATION_DECAY = 0.995  
  
  
class DQNSolver:  
  
    def __init__(self, observation_space, action_space):  
        self.exploration_rate = EXPLORATION_MAX  
  
        self.action_space = action_space  
        self.memory = deque(maxlen=MEMORY_SIZE)  
  
        self.model = Sequential()  
        self.model.add(Dense(24, input_shape=(observation_space,), activation="relu"))  
        self.model.add(Dense(24, activation="relu"))  
        self.model.add(Dense(self.action_space, activation="linear"))  
        self.model.compile(loss="mse", optimizer=Adam(lr=LEARNING_RATE))  
  
    def remember(self, state, action, reward, next_state, done):  
        self.memory.append((state, action, reward, next_state, done))  
  
    def act(self, state):  
        if np.random.rand() < self.exploration_rate:  
            return random.randrange(self.action_space)  
        q_values = self.model.predict(state)  
        return np.argmax(q_values[0])  
  
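    def experience_replay(self):  
        # Wait until enough transitions are stored to fill one batch.
        if len(self.memory) < BATCH_SIZE:  
            return  
        batch = random.sample(self.memory, BATCH_SIZE)  
        for state, action, reward, state_next, terminal in batch:  
            # Target: the reward alone at episode end, otherwise the reward
            # plus the discounted best Q-value of the next state.
            q_update = reward  
            if not terminal:  
                q_update = (reward + GAMMA * np.amax(self.model.predict(state_next)[0]))  
            # Overwrite only the taken action's Q-value, then fit toward it.
            q_values = self.model.predict(state)  
            q_values[0][action] = q_update  
            self.model.fit(state, q_values, verbose=0)  
        # Decay exploration after each replay step, but never below the minimum.
        self.exploration_rate *= EXPLORATION_DECAY  
        self.exploration_rate = max(EXPLORATION_MIN, self.exploration_rate)  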
  
  
def cartpole():  
    env = gym.make(ENV_NAME)  
    score_logger = ScoreLogger(ENV_NAME)  
    observation_space = env.observation_space.shape[0]  
    action_space = env.action_space.n  
    dqn_solver = DQNSolver(observation_space, action_space)  
    run = 0  
    while True:  
        run += 1  
        state = env.reset()  
        state = np.reshape(state, [1, observation_space])  
        step = 0  
        while True:  
            step += 1  
            #env.render()  
            action = dqn_solver.act(state)  
            state_next, reward, terminal, info = env.step(action)  
            reward = reward if not terminal else -reward  
            state_next = np.reshape(state_next, [1, observation_space])  
            dqn_solver.remember(state, action, reward, state_next, terminal)  
            state = state_next  
            if terminal:  
                print("Run: " + str(run) + ", exploration: " + str(dqn_solver.exploration_rate) + ", score: " + str(step))  
                score_logger.add_score(step, run)  
                break  
            dqn_solver.experience_replay()  



In [4]:
cartpole()

Run: 1, exploration: 1.0, score: 16
Scores: (min: 16, avg: 16, max: 16)

Run: 2, exploration: 0.9511101304657719, score: 14
Scores: (min: 14, avg: 15, max: 16)

Run: 3, exploration: 0.9000874278732445, score: 12
Scores: (min: 12, avg: 14, max: 16)

Run: 4, exploration: 0.8265651079747222, score: 18
Scores: (min: 12, avg: 15, max: 18)

Run: 5, exploration: 0.7822236754458713, score: 12
Scores: (min: 12, avg: 14.4, max: 18)

Run: 6, exploration: 0.7219385759785162, score: 17
Scores: (min: 12, avg: 14.833333333333334, max: 18)

Run: 7, exploration: 0.6900935609921609, score: 10
Scores: (min: 10, avg: 14.142857142857142, max: 18)

Run: 8, exploration: 0.6596532430440636, score: 10
Scores: (min: 10, avg: 13.625, max: 18)

Run: 9, exploration: 0.6242658676435396, score: 12
Scores: (min: 10, avg: 13.444444444444445, max: 18)

Run: 10, exploration: 0.5907768628656763, score: 12
Scores: (min: 10, avg: 13.3, max: 18)

Run: 11, exploration: 0.5562889678716474, score: 13
Scores: (min: 10, avg: 13.

Run: 89, exploration: 0.01, score: 368
Scores: (min: 9, avg: 185.1573033707865, max: 500)

Run: 90, exploration: 0.01, score: 177
Scores: (min: 9, avg: 185.06666666666666, max: 500)

Run: 91, exploration: 0.01, score: 313
Scores: (min: 9, avg: 186.47252747252747, max: 500)

Run: 92, exploration: 0.01, score: 500
Scores: (min: 9, avg: 189.8804347826087, max: 500)

Run: 93, exploration: 0.01, score: 270
Scores: (min: 9, avg: 190.74193548387098, max: 500)

Run: 94, exploration: 0.01, score: 500
Scores: (min: 9, avg: 194.03191489361703, max: 500)

Run: 95, exploration: 0.01, score: 500
Scores: (min: 9, avg: 197.25263157894736, max: 500)

Run: 96, exploration: 0.01, score: 283
Scores: (min: 9, avg: 198.14583333333334, max: 500)

Run: 97, exploration: 0.01, score: 500
Scores: (min: 9, avg: 201.2577319587629, max: 500)

Run: 98, exploration: 0.01, score: 500
Scores: (min: 9, avg: 204.30612244897958, max: 500)

Run: 99, exploration: 0.01, score: 311
Scores: (min: 9, avg: 205.3838383838384, max

NameError: name 'exit' is not defined

Note: If the code is running properly, you should begin to see output appearing above this code block. It will take several minutes, so it is recommended that you let this code run in the background while completing other work. When the code has finished, it will print output saying, "Solved in _ runs, _ total runs."

You may see an error about not having an exit command. This error does not affect the program's functionality and results from the steps taken to convert the code from Python 2.x to Python 3. Please disregard this error.

In [5]:
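# Second experiment: override the module-level hyperparameters, which
# DQNSolver reads when cartpole() runs.
GAMMA = 0.90  
LEARNING_RATE = 0.01 

# np.random.rand() is always below 1, so a rate of 2.0 acts like 1.0
# (every action is random) until decay brings it under 1.
EXPLORATION_MAX = 2.0  
EXPLORATION_MIN = 0.05  
EXPLORATION_DECAY = 0.8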

cartpole()

Run: 1, exploration: 2.0, score: 18
Scores: (min: 18, avg: 18, max: 18)

Run: 2, exploration: 0.056294995342131254, score: 18
Scores: (min: 18, avg: 18, max: 18)

Run: 3, exploration: 0.05, score: 10
Scores: (min: 10, avg: 15.333333333333334, max: 18)

Run: 4, exploration: 0.05, score: 28
Scores: (min: 10, avg: 18.5, max: 28)

Run: 5, exploration: 0.05, score: 24
Scores: (min: 10, avg: 19.6, max: 28)

Run: 6, exploration: 0.05, score: 84
Scores: (min: 10, avg: 30.333333333333332, max: 84)

Run: 7, exploration: 0.05, score: 43
Scores: (min: 10, avg: 32.142857142857146, max: 84)

Run: 8, exploration: 0.05, score: 58
Scores: (min: 10, avg: 35.375, max: 84)

Run: 9, exploration: 0.05, score: 31
Scores: (min: 10, avg: 34.888888888888886, max: 84)

Run: 10, exploration: 0.05, score: 96
Scores: (min: 10, avg: 41, max: 96)

Run: 11, exploration: 0.05, score: 240
Scores: (min: 10, avg: 59.09090909090909, max: 240)

Run: 12, exploration: 0.05, score: 103
Scores: (min: 10, avg: 62.75, max: 240)



Run: 94, exploration: 0.05, score: 56
Scores: (min: 9, avg: 74.6063829787234, max: 500)

Run: 95, exploration: 0.05, score: 43
Scores: (min: 9, avg: 74.27368421052631, max: 500)

Run: 96, exploration: 0.05, score: 15
Scores: (min: 9, avg: 73.65625, max: 500)

Run: 97, exploration: 0.05, score: 258
Scores: (min: 9, avg: 75.55670103092784, max: 500)

Run: 98, exploration: 0.05, score: 56
Scores: (min: 9, avg: 75.35714285714286, max: 500)

Run: 99, exploration: 0.05, score: 86
Scores: (min: 9, avg: 75.46464646464646, max: 500)

Run: 100, exploration: 0.05, score: 70
Scores: (min: 9, avg: 75.41, max: 500)

Run: 101, exploration: 0.05, score: 33
Scores: (min: 9, avg: 75.56, max: 500)

Run: 102, exploration: 0.05, score: 17
Scores: (min: 9, avg: 75.55, max: 500)

Run: 103, exploration: 0.05, score: 18
Scores: (min: 9, avg: 75.63, max: 500)

Run: 104, exploration: 0.05, score: 44
Scores: (min: 9, avg: 75.79, max: 500)

Run: 105, exploration: 0.05, score: 14
Scores: (min: 9, avg: 75.69, max: 5

Run: 197, exploration: 0.05, score: 71
Scores: (min: 11, avg: 80.19, max: 325)

Run: 198, exploration: 0.05, score: 95
Scores: (min: 11, avg: 80.58, max: 325)

Run: 199, exploration: 0.05, score: 148
Scores: (min: 11, avg: 81.2, max: 325)

Run: 200, exploration: 0.05, score: 27
Scores: (min: 11, avg: 80.77, max: 325)

Run: 201, exploration: 0.05, score: 109
Scores: (min: 11, avg: 81.53, max: 325)

Run: 202, exploration: 0.05, score: 69
Scores: (min: 11, avg: 82.05, max: 325)

Run: 203, exploration: 0.05, score: 78
Scores: (min: 11, avg: 82.65, max: 325)

Run: 204, exploration: 0.05, score: 18
Scores: (min: 11, avg: 82.39, max: 325)

Run: 205, exploration: 0.05, score: 100
Scores: (min: 11, avg: 83.25, max: 325)

Run: 206, exploration: 0.05, score: 34
Scores: (min: 11, avg: 83.09, max: 325)

Run: 207, exploration: 0.05, score: 45
Scores: (min: 11, avg: 83.18, max: 325)

Run: 208, exploration: 0.05, score: 110
Scores: (min: 11, avg: 83.85, max: 325)

Run: 209, exploration: 0.05, score: 1

Run: 300, exploration: 0.05, score: 56
Scores: (min: 12, avg: 77.61, max: 325)

Run: 301, exploration: 0.05, score: 89
Scores: (min: 12, avg: 77.41, max: 325)

Run: 302, exploration: 0.05, score: 49
Scores: (min: 12, avg: 77.21, max: 325)

Run: 303, exploration: 0.05, score: 301
Scores: (min: 12, avg: 79.44, max: 325)

Run: 304, exploration: 0.05, score: 15
Scores: (min: 12, avg: 79.41, max: 325)

Run: 305, exploration: 0.05, score: 101
Scores: (min: 12, avg: 79.42, max: 325)

Run: 306, exploration: 0.05, score: 90
Scores: (min: 12, avg: 79.98, max: 325)

Run: 307, exploration: 0.05, score: 14
Scores: (min: 12, avg: 79.67, max: 325)

Run: 308, exploration: 0.05, score: 45
Scores: (min: 12, avg: 79.02, max: 325)

Run: 309, exploration: 0.05, score: 71
Scores: (min: 12, avg: 78.09, max: 325)

Run: 310, exploration: 0.05, score: 20
Scores: (min: 12, avg: 77.1, max: 325)

Run: 311, exploration: 0.05, score: 96
Scores: (min: 12, avg: 76.85, max: 325)

Run: 312, exploration: 0.05, score: 75


Run: 403, exploration: 0.05, score: 55
Scores: (min: 11, avg: 84.34, max: 307)

Run: 404, exploration: 0.05, score: 36
Scores: (min: 11, avg: 84.55, max: 307)

Run: 405, exploration: 0.05, score: 25
Scores: (min: 11, avg: 83.79, max: 307)

Run: 406, exploration: 0.05, score: 91
Scores: (min: 11, avg: 83.8, max: 307)

Run: 407, exploration: 0.05, score: 19
Scores: (min: 11, avg: 83.85, max: 307)

Run: 408, exploration: 0.05, score: 104
Scores: (min: 11, avg: 84.44, max: 307)

Run: 409, exploration: 0.05, score: 70
Scores: (min: 11, avg: 84.43, max: 307)

Run: 410, exploration: 0.05, score: 17
Scores: (min: 11, avg: 84.4, max: 307)

Run: 411, exploration: 0.05, score: 15
Scores: (min: 11, avg: 83.59, max: 307)

Run: 412, exploration: 0.05, score: 81
Scores: (min: 11, avg: 83.65, max: 307)

Run: 413, exploration: 0.05, score: 10
Scores: (min: 10, avg: 82.78, max: 307)

Run: 414, exploration: 0.05, score: 35
Scores: (min: 10, avg: 82.2, max: 307)

Run: 415, exploration: 0.05, score: 20
Sco

Run: 508, exploration: 0.05, score: 9
Scores: (min: 8, avg: 19.39, max: 188)

Run: 509, exploration: 0.05, score: 10
Scores: (min: 8, avg: 18.79, max: 188)

Run: 510, exploration: 0.05, score: 10
Scores: (min: 8, avg: 18.72, max: 188)

Run: 511, exploration: 0.05, score: 9
Scores: (min: 8, avg: 18.66, max: 188)

Run: 512, exploration: 0.05, score: 9
Scores: (min: 8, avg: 17.94, max: 188)

Run: 513, exploration: 0.05, score: 8
Scores: (min: 8, avg: 17.92, max: 188)

Run: 514, exploration: 0.05, score: 10
Scores: (min: 8, avg: 17.67, max: 188)

Run: 515, exploration: 0.05, score: 11
Scores: (min: 8, avg: 17.58, max: 188)

Run: 516, exploration: 0.05, score: 9
Scores: (min: 8, avg: 16.49, max: 188)

Run: 517, exploration: 0.05, score: 11
Scores: (min: 8, avg: 15.72, max: 188)

Run: 518, exploration: 0.05, score: 8
Scores: (min: 8, avg: 15.5, max: 188)

Run: 519, exploration: 0.05, score: 10
Scores: (min: 8, avg: 15.43, max: 188)

Run: 520, exploration: 0.05, score: 8
Scores: (min: 8, avg:

Run: 613, exploration: 0.05, score: 14
Scores: (min: 8, avg: 19.09, max: 125)

Run: 614, exploration: 0.05, score: 14
Scores: (min: 8, avg: 19.13, max: 125)

Run: 615, exploration: 0.05, score: 12
Scores: (min: 8, avg: 19.14, max: 125)

Run: 616, exploration: 0.05, score: 21
Scores: (min: 8, avg: 19.26, max: 125)

Run: 617, exploration: 0.05, score: 18
Scores: (min: 8, avg: 19.33, max: 125)

Run: 618, exploration: 0.05, score: 10
Scores: (min: 8, avg: 19.35, max: 125)

Run: 619, exploration: 0.05, score: 12
Scores: (min: 8, avg: 19.37, max: 125)

Run: 620, exploration: 0.05, score: 11
Scores: (min: 8, avg: 19.4, max: 125)

Run: 621, exploration: 0.05, score: 13
Scores: (min: 8, avg: 19.43, max: 125)

Run: 622, exploration: 0.05, score: 11
Scores: (min: 8, avg: 19.44, max: 125)

Run: 623, exploration: 0.05, score: 15
Scores: (min: 9, avg: 19.51, max: 125)

Run: 624, exploration: 0.05, score: 15
Scores: (min: 9, avg: 19.57, max: 125)

Run: 625, exploration: 0.05, score: 31
Scores: (min: 

Run: 718, exploration: 0.05, score: 26
Scores: (min: 8, avg: 16.32, max: 80)

Run: 719, exploration: 0.05, score: 11
Scores: (min: 8, avg: 16.31, max: 80)

Run: 720, exploration: 0.05, score: 11
Scores: (min: 8, avg: 16.31, max: 80)

Run: 721, exploration: 0.05, score: 28
Scores: (min: 8, avg: 16.46, max: 80)

Run: 722, exploration: 0.05, score: 37
Scores: (min: 8, avg: 16.72, max: 80)

Run: 723, exploration: 0.05, score: 9
Scores: (min: 8, avg: 16.66, max: 80)

Run: 724, exploration: 0.05, score: 8
Scores: (min: 8, avg: 16.59, max: 80)

Run: 725, exploration: 0.05, score: 15
Scores: (min: 8, avg: 16.43, max: 80)

Run: 726, exploration: 0.05, score: 8
Scores: (min: 8, avg: 16.31, max: 80)

Run: 727, exploration: 0.05, score: 47
Scores: (min: 8, avg: 16.63, max: 80)

Run: 728, exploration: 0.05, score: 10
Scores: (min: 8, avg: 16.57, max: 80)

Run: 729, exploration: 0.05, score: 30
Scores: (min: 8, avg: 16.64, max: 80)

Run: 730, exploration: 0.05, score: 31
Scores: (min: 8, avg: 16.86,

Run: 824, exploration: 0.05, score: 15
Scores: (min: 8, avg: 21.42, max: 73)

Run: 825, exploration: 0.05, score: 16
Scores: (min: 8, avg: 21.43, max: 73)

Run: 826, exploration: 0.05, score: 37
Scores: (min: 9, avg: 21.72, max: 73)

Run: 827, exploration: 0.05, score: 12
Scores: (min: 9, avg: 21.37, max: 73)

Run: 828, exploration: 0.05, score: 10
Scores: (min: 9, avg: 21.37, max: 73)

Run: 829, exploration: 0.05, score: 12
Scores: (min: 9, avg: 21.19, max: 73)

Run: 830, exploration: 0.05, score: 22
Scores: (min: 9, avg: 21.1, max: 73)

Run: 831, exploration: 0.05, score: 13
Scores: (min: 9, avg: 21.14, max: 73)

Run: 832, exploration: 0.05, score: 10
Scores: (min: 9, avg: 20.94, max: 73)

Run: 833, exploration: 0.05, score: 9
Scores: (min: 9, avg: 20.92, max: 73)

Run: 834, exploration: 0.05, score: 9
Scores: (min: 9, avg: 20.84, max: 73)

Run: 835, exploration: 0.05, score: 70
Scores: (min: 9, avg: 20.81, max: 70)

Run: 836, exploration: 0.05, score: 34
Scores: (min: 9, avg: 21.04,

Run: 930, exploration: 0.05, score: 13
Scores: (min: 8, avg: 19.36, max: 70)

Run: 931, exploration: 0.05, score: 18
Scores: (min: 8, avg: 19.41, max: 70)

Run: 932, exploration: 0.05, score: 16
Scores: (min: 8, avg: 19.47, max: 70)

Run: 933, exploration: 0.05, score: 22
Scores: (min: 8, avg: 19.6, max: 70)

Run: 934, exploration: 0.05, score: 45
Scores: (min: 8, avg: 19.96, max: 70)

Run: 935, exploration: 0.05, score: 17
Scores: (min: 8, avg: 19.43, max: 49)

Run: 936, exploration: 0.05, score: 15
Scores: (min: 8, avg: 19.24, max: 49)

Run: 937, exploration: 0.05, score: 69
Scores: (min: 8, avg: 19.65, max: 69)

Run: 938, exploration: 0.05, score: 24
Scores: (min: 8, avg: 19.72, max: 69)

Run: 939, exploration: 0.05, score: 16
Scores: (min: 8, avg: 19.7, max: 69)

Run: 940, exploration: 0.05, score: 26
Scores: (min: 8, avg: 19.79, max: 69)

Run: 941, exploration: 0.05, score: 14
Scores: (min: 8, avg: 19.79, max: 69)

Run: 942, exploration: 0.05, score: 22
Scores: (min: 8, avg: 19.89

Run: 1036, exploration: 0.05, score: 9
Scores: (min: 8, avg: 16.12, max: 69)

Run: 1037, exploration: 0.05, score: 9
Scores: (min: 8, avg: 15.52, max: 65)

Run: 1038, exploration: 0.05, score: 9
Scores: (min: 8, avg: 15.37, max: 65)

Run: 1039, exploration: 0.05, score: 9
Scores: (min: 8, avg: 15.3, max: 65)

Run: 1040, exploration: 0.05, score: 9
Scores: (min: 8, avg: 15.13, max: 65)

Run: 1041, exploration: 0.05, score: 8
Scores: (min: 8, avg: 15.07, max: 65)

Run: 1042, exploration: 0.05, score: 9
Scores: (min: 8, avg: 14.94, max: 65)

Run: 1043, exploration: 0.05, score: 10
Scores: (min: 8, avg: 14.85, max: 65)

Run: 1044, exploration: 0.05, score: 10
Scores: (min: 8, avg: 14.7, max: 65)

Run: 1045, exploration: 0.05, score: 9
Scores: (min: 8, avg: 14.65, max: 65)

Run: 1046, exploration: 0.05, score: 9
Scores: (min: 8, avg: 14.63, max: 65)

Run: 1047, exploration: 0.05, score: 8
Scores: (min: 8, avg: 14.51, max: 65)

Run: 1048, exploration: 0.05, score: 9
Scores: (min: 8, avg: 14.

Run: 1140, exploration: 0.05, score: 15
Scores: (min: 8, avg: 27.41, max: 106)

Run: 1141, exploration: 0.05, score: 20
Scores: (min: 8, avg: 27.53, max: 106)

Run: 1142, exploration: 0.05, score: 43
Scores: (min: 8, avg: 27.87, max: 106)

Run: 1143, exploration: 0.05, score: 15
Scores: (min: 8, avg: 27.92, max: 106)

Run: 1144, exploration: 0.05, score: 17
Scores: (min: 8, avg: 27.99, max: 106)

Run: 1145, exploration: 0.05, score: 25
Scores: (min: 8, avg: 28.15, max: 106)

Run: 1146, exploration: 0.05, score: 13
Scores: (min: 8, avg: 28.19, max: 106)

Run: 1147, exploration: 0.05, score: 22
Scores: (min: 8, avg: 28.33, max: 106)

Run: 1148, exploration: 0.05, score: 26
Scores: (min: 8, avg: 28.5, max: 106)

Run: 1149, exploration: 0.05, score: 38
Scores: (min: 8, avg: 28.78, max: 106)

Run: 1150, exploration: 0.05, score: 16
Scores: (min: 8, avg: 28.85, max: 106)

Run: 1151, exploration: 0.05, score: 16
Scores: (min: 8, avg: 28.92, max: 106)

Run: 1152, exploration: 0.05, score: 18
S

Run: 1242, exploration: 0.05, score: 9
Scores: (min: 8, avg: 48.6, max: 173)

Run: 1243, exploration: 0.05, score: 10
Scores: (min: 8, avg: 48.55, max: 173)

Run: 1244, exploration: 0.05, score: 11
Scores: (min: 8, avg: 48.49, max: 173)

Run: 1245, exploration: 0.05, score: 10
Scores: (min: 8, avg: 48.34, max: 173)

Run: 1246, exploration: 0.05, score: 10
Scores: (min: 8, avg: 48.31, max: 173)

Run: 1247, exploration: 0.05, score: 10
Scores: (min: 8, avg: 48.19, max: 173)

Run: 1248, exploration: 0.05, score: 10
Scores: (min: 8, avg: 48.03, max: 173)

Run: 1249, exploration: 0.05, score: 9
Scores: (min: 8, avg: 47.74, max: 173)

Run: 1250, exploration: 0.05, score: 8
Scores: (min: 8, avg: 47.66, max: 173)

Run: 1251, exploration: 0.05, score: 10
Scores: (min: 8, avg: 47.6, max: 173)

Run: 1252, exploration: 0.05, score: 10
Scores: (min: 8, avg: 47.52, max: 173)

Run: 1253, exploration: 0.05, score: 8
Scores: (min: 8, avg: 47.46, max: 173)

Run: 1254, exploration: 0.05, score: 11
Scores

Run: 1345, exploration: 0.05, score: 46
Scores: (min: 8, avg: 36.42, max: 177)

Run: 1346, exploration: 0.05, score: 18
Scores: (min: 8, avg: 36.5, max: 177)

Run: 1347, exploration: 0.05, score: 17
Scores: (min: 8, avg: 36.57, max: 177)

Run: 1348, exploration: 0.05, score: 21
Scores: (min: 8, avg: 36.68, max: 177)

Run: 1349, exploration: 0.05, score: 24
Scores: (min: 8, avg: 36.83, max: 177)

Run: 1350, exploration: 0.05, score: 16
Scores: (min: 8, avg: 36.91, max: 177)

Run: 1351, exploration: 0.05, score: 88
Scores: (min: 8, avg: 37.69, max: 177)

Run: 1352, exploration: 0.05, score: 59
Scores: (min: 8, avg: 38.18, max: 177)

Run: 1353, exploration: 0.05, score: 53
Scores: (min: 8, avg: 38.63, max: 177)

Run: 1354, exploration: 0.05, score: 79
Scores: (min: 8, avg: 39.31, max: 177)

Run: 1355, exploration: 0.05, score: 14
Scores: (min: 8, avg: 39.35, max: 177)

Run: 1356, exploration: 0.05, score: 30
Scores: (min: 8, avg: 39.57, max: 177)

Run: 1357, exploration: 0.05, score: 78
S

Run: 1447, exploration: 0.05, score: 16
Scores: (min: 9, avg: 41.81, max: 134)

Run: 1448, exploration: 0.05, score: 27
Scores: (min: 9, avg: 41.87, max: 134)

Run: 1449, exploration: 0.05, score: 42
Scores: (min: 9, avg: 42.05, max: 134)

Run: 1450, exploration: 0.05, score: 46
Scores: (min: 9, avg: 42.35, max: 134)

Run: 1451, exploration: 0.05, score: 12
Scores: (min: 9, avg: 41.59, max: 134)

Run: 1452, exploration: 0.05, score: 10
Scores: (min: 9, avg: 41.1, max: 134)

Run: 1453, exploration: 0.05, score: 14
Scores: (min: 9, avg: 40.71, max: 134)

Run: 1454, exploration: 0.05, score: 16
Scores: (min: 9, avg: 40.08, max: 134)

Run: 1455, exploration: 0.05, score: 107
Scores: (min: 9, avg: 41.01, max: 134)

Run: 1456, exploration: 0.05, score: 85
Scores: (min: 9, avg: 41.56, max: 134)

Run: 1457, exploration: 0.05, score: 17
Scores: (min: 9, avg: 40.95, max: 134)

Run: 1458, exploration: 0.05, score: 38
Scores: (min: 9, avg: 40.71, max: 134)

Run: 1459, exploration: 0.05, score: 14


Run: 1549, exploration: 0.05, score: 39
Scores: (min: 9, avg: 54.79, max: 252)

Run: 1550, exploration: 0.05, score: 122
Scores: (min: 9, avg: 55.55, max: 252)

Run: 1551, exploration: 0.05, score: 78
Scores: (min: 9, avg: 56.21, max: 252)

Run: 1552, exploration: 0.05, score: 58
Scores: (min: 9, avg: 56.69, max: 252)

Run: 1553, exploration: 0.05, score: 100
Scores: (min: 9, avg: 57.55, max: 252)

Run: 1554, exploration: 0.05, score: 90
Scores: (min: 9, avg: 58.29, max: 252)

Run: 1555, exploration: 0.05, score: 226
Scores: (min: 9, avg: 59.48, max: 252)

Run: 1556, exploration: 0.05, score: 109
Scores: (min: 9, avg: 59.72, max: 252)

Run: 1557, exploration: 0.05, score: 19
Scores: (min: 9, avg: 59.74, max: 252)

Run: 1558, exploration: 0.05, score: 98
Scores: (min: 9, avg: 60.34, max: 252)

Run: 1559, exploration: 0.05, score: 152
Scores: (min: 9, avg: 61.72, max: 252)

Run: 1560, exploration: 0.05, score: 53
Scores: (min: 9, avg: 61.9, max: 252)

Run: 1561, exploration: 0.05, score:

Run: 1652, exploration: 0.05, score: 96
Scores: (min: 9, avg: 86.95, max: 343)

Run: 1653, exploration: 0.05, score: 109
Scores: (min: 9, avg: 87.04, max: 343)

Run: 1654, exploration: 0.05, score: 68
Scores: (min: 9, avg: 86.82, max: 343)

Run: 1655, exploration: 0.05, score: 100
Scores: (min: 9, avg: 85.56, max: 343)

Run: 1656, exploration: 0.05, score: 67
Scores: (min: 9, avg: 85.14, max: 343)

Run: 1657, exploration: 0.05, score: 81
Scores: (min: 9, avg: 85.76, max: 343)

Run: 1658, exploration: 0.05, score: 127
Scores: (min: 9, avg: 86.05, max: 343)

Run: 1659, exploration: 0.05, score: 93
Scores: (min: 9, avg: 85.46, max: 343)

Run: 1660, exploration: 0.05, score: 31
Scores: (min: 9, avg: 85.24, max: 343)

Run: 1661, exploration: 0.05, score: 41
Scores: (min: 9, avg: 85.54, max: 343)

Run: 1662, exploration: 0.05, score: 10
Scores: (min: 9, avg: 85.52, max: 343)

Run: 1663, exploration: 0.05, score: 16
Scores: (min: 9, avg: 84.76, max: 343)

Run: 1664, exploration: 0.05, score: 

Run: 1754, exploration: 0.05, score: 103
Scores: (min: 10, avg: 73.33, max: 258)

Run: 1755, exploration: 0.05, score: 30
Scores: (min: 10, avg: 72.63, max: 258)

Run: 1756, exploration: 0.05, score: 114
Scores: (min: 10, avg: 73.1, max: 258)

Run: 1757, exploration: 0.05, score: 14
Scores: (min: 10, avg: 72.43, max: 258)

Run: 1758, exploration: 0.05, score: 80
Scores: (min: 10, avg: 71.96, max: 258)

Run: 1759, exploration: 0.05, score: 141
Scores: (min: 10, avg: 72.44, max: 258)

Run: 1760, exploration: 0.05, score: 87
Scores: (min: 10, avg: 73, max: 258)

Run: 1761, exploration: 0.05, score: 38
Scores: (min: 10, avg: 72.97, max: 258)

Run: 1762, exploration: 0.05, score: 49
Scores: (min: 11, avg: 73.36, max: 258)

Run: 1763, exploration: 0.05, score: 14
Scores: (min: 11, avg: 73.34, max: 258)

Run: 1764, exploration: 0.05, score: 221
Scores: (min: 11, avg: 75.03, max: 258)

Run: 1765, exploration: 0.05, score: 28
Scores: (min: 11, avg: 74.36, max: 258)

Run: 1766, exploration: 0.05

Run: 1855, exploration: 0.05, score: 95
Scores: (min: 10, avg: 76.61, max: 221)

Run: 1856, exploration: 0.05, score: 60
Scores: (min: 10, avg: 76.07, max: 221)

Run: 1857, exploration: 0.05, score: 62
Scores: (min: 10, avg: 76.55, max: 221)

Run: 1858, exploration: 0.05, score: 27
Scores: (min: 10, avg: 76.02, max: 221)

Run: 1859, exploration: 0.05, score: 138
Scores: (min: 10, avg: 75.99, max: 221)

Run: 1860, exploration: 0.05, score: 88
Scores: (min: 10, avg: 76, max: 221)

Run: 1861, exploration: 0.05, score: 53
Scores: (min: 10, avg: 76.15, max: 221)

Run: 1862, exploration: 0.05, score: 27
Scores: (min: 10, avg: 75.93, max: 221)

Run: 1863, exploration: 0.05, score: 112
Scores: (min: 10, avg: 76.91, max: 221)

Run: 1864, exploration: 0.05, score: 97
Scores: (min: 10, avg: 75.67, max: 179)

Run: 1865, exploration: 0.05, score: 79
Scores: (min: 10, avg: 76.18, max: 179)

Run: 1866, exploration: 0.05, score: 51
Scores: (min: 10, avg: 75.68, max: 179)

Run: 1867, exploration: 0.05,

Run: 1957, exploration: 0.05, score: 67
Scores: (min: 13, avg: 78.49, max: 245)

Run: 1958, exploration: 0.05, score: 98
Scores: (min: 13, avg: 79.2, max: 245)

Run: 1959, exploration: 0.05, score: 95
Scores: (min: 13, avg: 78.77, max: 245)

Run: 1960, exploration: 0.05, score: 52
Scores: (min: 13, avg: 78.41, max: 245)

Run: 1961, exploration: 0.05, score: 78
Scores: (min: 13, avg: 78.66, max: 245)

Run: 1962, exploration: 0.05, score: 67
Scores: (min: 13, avg: 79.06, max: 245)

Run: 1963, exploration: 0.05, score: 92
Scores: (min: 13, avg: 78.86, max: 245)

Run: 1964, exploration: 0.05, score: 100
Scores: (min: 13, avg: 78.89, max: 245)

Run: 1965, exploration: 0.05, score: 71
Scores: (min: 13, avg: 78.81, max: 245)

Run: 1966, exploration: 0.05, score: 107
Scores: (min: 13, avg: 79.37, max: 245)

Run: 1967, exploration: 0.05, score: 36
Scores: (min: 13, avg: 78.98, max: 245)

Run: 1968, exploration: 0.05, score: 80
Scores: (min: 13, avg: 78.81, max: 245)

Run: 1969, exploration: 0.0

Run: 2058, exploration: 0.05, score: 94
Scores: (min: 12, avg: 83.74, max: 199)

Run: 2059, exploration: 0.05, score: 112
Scores: (min: 12, avg: 83.91, max: 199)

Run: 2060, exploration: 0.05, score: 94
Scores: (min: 12, avg: 84.33, max: 199)

Run: 2061, exploration: 0.05, score: 97
Scores: (min: 12, avg: 84.52, max: 199)

Run: 2062, exploration: 0.05, score: 151
Scores: (min: 12, avg: 85.36, max: 199)

Run: 2063, exploration: 0.05, score: 93
Scores: (min: 12, avg: 85.37, max: 199)

Run: 2064, exploration: 0.05, score: 96
Scores: (min: 12, avg: 85.33, max: 199)

Run: 2065, exploration: 0.05, score: 84
Scores: (min: 12, avg: 85.46, max: 199)

Run: 2066, exploration: 0.05, score: 167
Scores: (min: 12, avg: 86.06, max: 199)

Run: 2067, exploration: 0.05, score: 94
Scores: (min: 12, avg: 86.64, max: 199)

Run: 2068, exploration: 0.05, score: 99
Scores: (min: 12, avg: 86.83, max: 199)

Run: 2069, exploration: 0.05, score: 14
Scores: (min: 12, avg: 86.46, max: 199)

Run: 2070, exploration: 0

Run: 2159, exploration: 0.05, score: 120
Scores: (min: 10, avg: 88.86, max: 263)

Run: 2160, exploration: 0.05, score: 24
Scores: (min: 10, avg: 88.16, max: 263)

Run: 2161, exploration: 0.05, score: 139
Scores: (min: 10, avg: 88.58, max: 263)

Run: 2162, exploration: 0.05, score: 93
Scores: (min: 10, avg: 88, max: 263)

Run: 2163, exploration: 0.05, score: 70
Scores: (min: 10, avg: 87.77, max: 263)

Run: 2164, exploration: 0.05, score: 97
Scores: (min: 10, avg: 87.78, max: 263)

Run: 2165, exploration: 0.05, score: 69
Scores: (min: 10, avg: 87.63, max: 263)

Run: 2166, exploration: 0.05, score: 35
Scores: (min: 10, avg: 86.31, max: 263)

Run: 2167, exploration: 0.05, score: 11
Scores: (min: 10, avg: 85.48, max: 263)

Run: 2168, exploration: 0.05, score: 12
Scores: (min: 10, avg: 84.61, max: 263)

Run: 2169, exploration: 0.05, score: 71
Scores: (min: 10, avg: 85.18, max: 263)

Run: 2170, exploration: 0.05, score: 96
Scores: (min: 10, avg: 85.04, max: 263)

Run: 2171, exploration: 0.05,

Run: 2260, exploration: 0.05, score: 90
Scores: (min: 11, avg: 77.51, max: 200)

Run: 2261, exploration: 0.05, score: 25
Scores: (min: 11, avg: 76.37, max: 200)

Run: 2262, exploration: 0.05, score: 98
Scores: (min: 11, avg: 76.42, max: 200)

Run: 2263, exploration: 0.05, score: 71
Scores: (min: 11, avg: 76.43, max: 200)

Run: 2264, exploration: 0.05, score: 14
Scores: (min: 11, avg: 75.6, max: 200)

Run: 2265, exploration: 0.05, score: 50
Scores: (min: 11, avg: 75.41, max: 200)

Run: 2266, exploration: 0.05, score: 90
Scores: (min: 11, avg: 75.96, max: 200)

Run: 2267, exploration: 0.05, score: 76
Scores: (min: 12, avg: 76.61, max: 200)

Run: 2268, exploration: 0.05, score: 93
Scores: (min: 12, avg: 77.42, max: 200)

Run: 2269, exploration: 0.05, score: 103
Scores: (min: 12, avg: 77.74, max: 200)

Run: 2270, exploration: 0.05, score: 65
Scores: (min: 12, avg: 77.43, max: 200)

Run: 2271, exploration: 0.05, score: 94
Scores: (min: 12, avg: 77.47, max: 200)

Run: 2272, exploration: 0.05

Run: 2361, exploration: 0.05, score: 112
Scores: (min: 10, avg: 74.39, max: 216)

Run: 2362, exploration: 0.05, score: 40
Scores: (min: 10, avg: 73.81, max: 216)

Run: 2363, exploration: 0.05, score: 91
Scores: (min: 10, avg: 74.01, max: 216)

Run: 2364, exploration: 0.05, score: 29
Scores: (min: 10, avg: 74.16, max: 216)

Run: 2365, exploration: 0.05, score: 13
Scores: (min: 10, avg: 73.79, max: 216)

Run: 2366, exploration: 0.05, score: 18
Scores: (min: 10, avg: 73.07, max: 216)

Run: 2367, exploration: 0.05, score: 19
Scores: (min: 10, avg: 72.5, max: 216)

Run: 2368, exploration: 0.05, score: 71
Scores: (min: 10, avg: 72.28, max: 216)

Run: 2369, exploration: 0.05, score: 58
Scores: (min: 10, avg: 71.83, max: 216)

Run: 2370, exploration: 0.05, score: 100
Scores: (min: 10, avg: 72.18, max: 216)

Run: 2371, exploration: 0.05, score: 92
Scores: (min: 10, avg: 72.16, max: 216)

Run: 2372, exploration: 0.05, score: 88
Scores: (min: 10, avg: 72.6, max: 216)

Run: 2373, exploration: 0.05

Run: 2462, exploration: 0.05, score: 131
Scores: (min: 11, avg: 77.43, max: 223)

Run: 2463, exploration: 0.05, score: 96
Scores: (min: 11, avg: 77.48, max: 223)

Run: 2464, exploration: 0.05, score: 41
Scores: (min: 11, avg: 77.6, max: 223)

Run: 2465, exploration: 0.05, score: 44
Scores: (min: 11, avg: 77.91, max: 223)

Run: 2466, exploration: 0.05, score: 111
Scores: (min: 11, avg: 78.84, max: 223)

Run: 2467, exploration: 0.05, score: 13
Scores: (min: 11, avg: 78.78, max: 223)

Run: 2468, exploration: 0.05, score: 67
Scores: (min: 11, avg: 78.74, max: 223)

Run: 2469, exploration: 0.05, score: 93
Scores: (min: 11, avg: 79.09, max: 223)

Run: 2470, exploration: 0.05, score: 105
Scores: (min: 11, avg: 79.14, max: 223)

Run: 2471, exploration: 0.05, score: 89
Scores: (min: 11, avg: 79.11, max: 223)

Run: 2472, exploration: 0.05, score: 92
Scores: (min: 11, avg: 79.15, max: 223)

Run: 2473, exploration: 0.05, score: 79
Scores: (min: 11, avg: 79.7, max: 223)

Run: 2474, exploration: 0.0

Run: 2563, exploration: 0.05, score: 32
Scores: (min: 11, avg: 75.18, max: 190)

Run: 2564, exploration: 0.05, score: 141
Scores: (min: 11, avg: 76.18, max: 190)

Run: 2565, exploration: 0.05, score: 107
Scores: (min: 11, avg: 76.81, max: 190)

Run: 2566, exploration: 0.05, score: 86
Scores: (min: 11, avg: 76.56, max: 190)

Run: 2567, exploration: 0.05, score: 58
Scores: (min: 11, avg: 77.01, max: 190)

Run: 2568, exploration: 0.05, score: 99
Scores: (min: 11, avg: 77.33, max: 190)

Run: 2569, exploration: 0.05, score: 18
Scores: (min: 11, avg: 76.58, max: 190)

Run: 2570, exploration: 0.05, score: 58
Scores: (min: 11, avg: 76.11, max: 190)

Run: 2571, exploration: 0.05, score: 37
Scores: (min: 11, avg: 75.59, max: 190)

Run: 2572, exploration: 0.05, score: 133
Scores: (min: 11, avg: 76, max: 190)

Run: 2573, exploration: 0.05, score: 97
Scores: (min: 11, avg: 76.18, max: 190)

Run: 2574, exploration: 0.05, score: 47
Scores: (min: 11, avg: 75.67, max: 190)

Run: 2575, exploration: 0.05

Run: 2665, exploration: 0.05, score: 129
Scores: (min: 11, avg: 74.4, max: 256)

Run: 2666, exploration: 0.05, score: 53
Scores: (min: 11, avg: 74.07, max: 256)

Run: 2667, exploration: 0.05, score: 124
Scores: (min: 11, avg: 74.73, max: 256)

Run: 2668, exploration: 0.05, score: 113
Scores: (min: 11, avg: 74.87, max: 256)

Run: 2669, exploration: 0.05, score: 18
Scores: (min: 11, avg: 74.87, max: 256)

Run: 2670, exploration: 0.05, score: 61
Scores: (min: 11, avg: 74.9, max: 256)

Run: 2671, exploration: 0.05, score: 78
Scores: (min: 11, avg: 75.31, max: 256)

Run: 2672, exploration: 0.05, score: 94
Scores: (min: 11, avg: 74.92, max: 256)

Run: 2673, exploration: 0.05, score: 17
Scores: (min: 11, avg: 74.12, max: 256)

Run: 2674, exploration: 0.05, score: 108
Scores: (min: 11, avg: 74.73, max: 256)

Run: 2675, exploration: 0.05, score: 90
Scores: (min: 11, avg: 74.6, max: 256)

Run: 2676, exploration: 0.05, score: 11
Scores: (min: 11, avg: 74.36, max: 256)

Run: 2677, exploration: 0.0

KeyboardInterrupt: 

- Explain how reinforcement learning concepts apply to the cartpole problem

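The goal of the agent in the cartpole problem is to keep the pole upright even though it is inherently unstable.  The state is the observation the environment returns after each action: the cart's position and velocity and the pole's angle and angular velocity.  The agent has two actions, pushing the cart left or right, and each push can either help center the pole or contribute to it falling over.  The environment gives a reward of 1 for every time step the pole stays up, so the agent is rewarded for balancing as long as possible.  The learning algorithm used to train the agent to keep the pole upright is deep Q-learning, implemented here as a Deep Q-Network (DQN).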
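As a minimal sketch of how the agent turns a state's Q-values into one of the two actions, the function below mirrors the idea of the notebook's `act` method: explore at random with some probability, otherwise pick the action with the highest Q-value.  The name `choose_action`, the `ACTIONS` mapping, and the sample Q-values are assumptions made for illustration and are not part of the notebook.

```python
import random

import numpy as np

# CartPole has two discrete actions; the labels are for readability only.
ACTIONS = {0: "push left", 1: "push right"}


def choose_action(q_values, exploration_rate, rng=random):
    """Epsilon-greedy choice: explore at random, otherwise exploit the best Q-value.

    q_values is a hypothetical list with one estimated Q-value per action.
    """
    if rng.random() < exploration_rate:
        return rng.randrange(len(q_values))  # explore: random action
    return int(np.argmax(q_values))  # exploit: best known action


# Hypothetical Q-values for one state (not taken from the trained model).
example_q = [0.1, 0.9]
greedy_action = choose_action(example_q, exploration_rate=0.0)
print(ACTIONS[greedy_action])
```

With an exploration rate of 0 the function always exploits, and with a rate of 1 it always explores, which is why the notebook starts near 1 and decays the rate over time.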

- Analyze how experience replay is applied to the cartpole problem

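Experience replay stores each transition the agent experiences (state, action, reward, next state, and whether the episode ended) in memory, then samples a random batch of those transitions and updates the Q-value for each one.  Q-values are measures of quality for taking a given action in a given state.  By trying different actions in different states, the agent moves its Q-values toward the actions that are best on average.  In this notebook the memory holds up to 1,000,000 transitions and each replay samples a batch of 20.  The discount factor (GAMMA, set to 0.95) is woven into the update: for a transition that does not end the episode, the target is the reward plus the discounted highest Q-value of the next state.  It serves as a way to account for uncertainty.  Future rewards are less certain than immediate ones, so they are discounted, since there is a chance the agent may never receive them.  An example of this is why most people will not spend all of their money on a Powerball ticket.  The perceived risk of spending all of your money far outweighs the small likelihood of actually winning, making that action less likely to be chosen.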
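The update described above can be sketched without Keras.  The example below uses the notebook's `GAMMA` and `BATCH_SIZE` values, but the `td_target` helper, the `placeholder_q` function, the smaller memory size, and the randomly generated transitions are all assumptions made for illustration; the notebook uses its neural network where `placeholder_q` appears.

```python
import random
from collections import deque

import numpy as np

GAMMA = 0.95     # discount factor, as in the notebook
BATCH_SIZE = 20  # replay batch size, as in the notebook


def td_target(reward, next_q_values, terminal, gamma=GAMMA):
    """Target Q-value for one transition.

    Terminal transitions use the reward alone; otherwise the discounted
    best Q-value of the next state is added.
    """
    if terminal:
        return reward
    return reward + gamma * float(np.max(next_q_values))


def placeholder_q(state):
    """Hypothetical stand-in for model.predict: one Q-value per action."""
    total = sum(state)
    return np.array([total, -total])


# Fill a small memory with made-up transitions (illustration only).
rng = random.Random(0)
memory = deque(maxlen=1000)
for step in range(50):
    state = [rng.uniform(-0.05, 0.05) for _ in range(4)]
    next_state = [rng.uniform(-0.05, 0.05) for _ in range(4)]
    done = step % 10 == 9
    memory.append((state, rng.randrange(2), 1.0, next_state, done))

# Sample a random batch and compute the target for each transition.
batch = rng.sample(list(memory), BATCH_SIZE)
targets = [
    td_target(reward, placeholder_q(next_state), done)
    for _, _, reward, next_state, done in batch
]
```

Sampling at random rather than replaying the most recent steps breaks up the correlation between consecutive transitions, which is the usual motivation for keeping a replay memory.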

- Analyze how neural networks are used in deep Q-learning.

Deep Q-learning differs from Q-learning in that it approximates the Q-values with a neural network, while Q-learning stores and updates each value explicitly in a Q-table.  The benefit of the neural network is its lower memory requirement when the state space is continuous or very large.  The neural network is set up so that the input is a state and the output is one Q-value per action.  The action with the highest Q-value is the best known action for that state.  The learning rate controls how much the weights in the neural network change on each training update.  Instead of changing a weight by the full amount the error suggests, the update is multiplied by the learning rate before it is applied to the current weight.  Raising or lowering this hyperparameter therefore affects how quickly the weights in the model change.
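To make the input and output shapes concrete, here is a rough NumPy sketch of a forward pass through a network with the same layer sizes as the notebook's model (4 inputs, two hidden layers of 24 units with ReLU, and 2 linear outputs).  The weights are random placeholders rather than trained values, the sample state is made up, and `sgd_step` shows a plain gradient-descent update only as a simplification; the notebook's Adam optimizer adapts the step size further.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes follow the notebook's model: 4 -> 24 -> 24 -> 2.
sizes = [4, 24, 24, 2]
weights = [rng.normal(0.0, 0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]


def q_values(state):
    """Forward pass: ReLU hidden layers, linear output with one Q-value per action."""
    x = np.asarray(state, dtype=float).reshape(1, -1)
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ w + b)
    return x @ weights[-1] + biases[-1]


def sgd_step(weight, gradient, learning_rate=0.001):
    """Plain gradient-descent update: move against the gradient, scaled by the learning rate."""
    return weight - learning_rate * gradient


# Hypothetical observation: cart position, cart velocity, pole angle, pole angular velocity.
state = [0.02, -0.01, 0.03, 0.04]
q = q_values(state)
best_action = int(np.argmax(q[0]))
```

The output has shape `(1, 2)`, one row for the single input state and one column per action, so taking the index of the largest value in `q[0]` selects the action.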

Sources:

1. Beysolow II, T. (2019). Applied reinforcement learning with Python: With OpenAI Gym, TensorFlow, and Keras. Apress.

2. Surma, G. (2019, November 10). Cartpole - introduction to reinforcement learning (DQN - deep Q-learning). Medium. Retrieved November 21, 2021, from https://gsurma.medium.com/cartpole-introduction-to-reinforcement-learning-ed0eb5b58288.

3. Wang, M. (2021, October 3). Deep Q-learning tutorial: minDQN. Medium. Retrieved November 21, 2021, from https://towardsdatascience.com/deep-q-learning-tutorial-mindqn-2a4c855abffc.
