## Q-LEARNING FOR INTELLIGENT GAME PLAYING

<h3> 1. Introduction </h3>

Reinforcement learning is an area of machine learning concerned with taking suitable actions to maximize reward in a given situation. It differs from supervised learning in that supervised training data comes with an answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key and the agent itself decides what to do to perform the given task. In the absence of a training dataset, the agent must learn from its own experience.

In [1]:
import numpy as np
import pickle

A state class is defined to record the board state, the available board positions, the winner, the rewards and the game state. It is initialized with two agents, an empty board and the player symbol (1 for player1 and -1 for player2). After every move it checks whether the game has ended and, if it has, determines the winner and rewards the agents: 1 for the winner and 0 for the loser, or 0.1 for player1 and 0.5 for player2 on a tie.

In [6]:
class state:
    def __init__(self,a1,a2):
        self.state = '         '
        self.a1 = a1
        self.a2 = a2
        self.board = np.zeros((rows, cols)).astype(int)
        self.gameEnd = False
        self.playerSymbol = 1
        
    def getBoardState(self):
        self.BoardState = str(self.board.reshape(cols * rows))
        return self.BoardState
        
    def printBoard(self,pos):
        s = pos
        s= s.replace('0', '*')
        s= s.replace('-1', 'O')
        s= s.replace('1', 'X')
        s= s.replace('[', '')
        s= s.replace(']', '')
        s = s.replace(" ", "")
        #print(s)

        print('    -------------')
        print('    | {} | {} | {} |'.format(s[0],s[1],s[2]))
        print('    -------------')
        print('    | {} | {} | {} | '.format(s[3],s[4],s[5]))
        print('    -------------')
        print('    | {} | {} | {} | '.format(s[6],s[7],s[8]))
        print('    -------------')
        
    def availablePositions(self):
        positions = []
        for i in range(rows):
            for j in range(cols):
                if self.board[i, j] == 0:
                    positions.append((i, j))  # need to be tuple
        return positions
     
    def winner(self):
         # row
        for i in range(rows):
            if sum(self.board[i, :]) == 3:
                self.gameEnd = True
                return 1
            if sum(self.board[i, :]) == -3:
                self.gameEnd = True
                return -1
        # col
        for i in range(cols):
            if sum(self.board[:, i]) == 3:
                self.gameEnd = True
                return 1
            if sum(self.board[:, i]) == -3:
                self.gameEnd = True
                return -1
        # diagonal
        diag_sum1 = sum([self.board[i, i] for i in range(cols)])
        diag_sum2 = sum([self.board[i, cols - i - 1] for i in range(cols)])
        diag_sum = max(abs(diag_sum1), abs(diag_sum2))
        if diag_sum == 3:
            self.gameEnd = True
            if diag_sum1 == 3 or diag_sum2 == 3:
                return 1
            else:
                return -1

        # tie
        # no available positions
        if len(self.availablePositions()) == 0:
            self.gameEnd = True
            return 0
        # not end
        self.gameEnd = False
        return None

    def updateState(self, position):
        self.board[position] = self.playerSymbol
        # switch to another player
        self.playerSymbol = -1 if self.playerSymbol == 1 else 1    
    
    # only when game ends
    def Reward(self):
        result = self.winner()
        # backpropagate reward
        if result == 1:
            self.a1.giveReward(1)
            self.a2.giveReward(0)
        elif result == -1:
            self.a1.giveReward(0)
            self.a2.giveReward(1)
        else:
            self.a1.giveReward(0.1)
            self.a2.giveReward(0.5)

    # board reset
    def reset(self):
        self.board = np.zeros((rows, cols)).astype(int)
        self.BoardState = None
        self.gameEnd = False
        self.playerSymbol = 1
    
    def print_winner(self, results):
        if results == 1:
            print('Winner is Agent1 (X)' )
        elif results == -1:
            print('Winner is Agent2 (O)' )
        else:
            print('Tie')
    
    def game_state(self,result):
        alist.append(result)
        from collections import Counter 
        c= Counter(alist)
        percent= [(i, c[i] / len(alist) * 100.0) for i in c]
        #print(c)
        print("Percent of game won by each agent:",percent)
    

    def play(self, rounds=100):
        
        for i in range(rounds):
            print("Episode {}".format(i))
            if i % 1000 == 0:
                print("Rounds {}".format(i))
            while not self.gameEnd:
                # Player 1
                positions = self.availablePositions()
                a1_action = self.a1.chooseAction(positions, self.board, self.playerSymbol)
                # take action and update board state
                self.updateState(a1_action)
                board_state = self.getBoardState()
                #self.printBoard(board_state)
                self.a1.addState(board_state)
                # check board status if it is end

                win = self.winner()
                if win is not None:
                    # self.showBoard()
                    # ended with p1 either win or draw
                    self.Reward()
                    self.a1.reset()
                    self.a2.reset()
                    self.reset()
                    break

                else:
                    # Player 2
                    positions = self.availablePositions()
                    a2_action = self.a2.chooseAction(positions, self.board, self.playerSymbol)
                    self.updateState(a2_action)
                    board_state = self.getBoardState()
                    #self.printBoard(board_state)
                    self.a2.addState(board_state)

                    win = self.winner()
                    if win is not None:
                        # self.showBoard()
                        # ended with p2 either win or draw
                        self.Reward()
                        self.a1.reset()
                        self.a2.reset()
                        self.reset()
                        break
            self.print_winner(win)
            self.game_state(win)
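
The sum-based win check used by `winner` can be tried in isolation. The following is a minimal sketch, not the notebook's own code: `check_winner` is a hypothetical helper that takes a square NumPy board holding 1, -1 and 0, and returns the same four outcomes (1, -1, 0 for a tie, `None` for an unfinished game).

```python
import numpy as np

def check_winner(board):
    """Return 1 or -1 for a win, 0 for a tie, None if the game is still open."""
    n = board.shape[0]
    # Collect the sums of every row, every column and both diagonals.
    lines = list(board.sum(axis=1)) + list(board.sum(axis=0))
    lines.append(np.trace(board))
    lines.append(np.trace(np.fliplr(board)))
    if n in lines:        # a line filled entirely with 1s
        return 1
    if -n in lines:       # a line filled entirely with -1s
        return -1
    if not (board == 0).any():  # no empty cell left and nobody won
        return 0
    return None

# Player 1 (X) holds the main diagonal.
board = np.array([[ 1, -1,  0],
                  [-1,  1,  0],
                  [ 0, -1,  1]])
print(check_winner(board))
```

Because a line can only sum to 3 or -3 when all three cells belong to the same player, comparing sums is enough and no cell-by-cell comparison is needed.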
        
        

The Agent class is defined next. Its basic steps are as follows:

1. The agent starts in a state (s1), takes an action (a1) and receives a reward (r1).
2. The agent selects an action either by looking up the highest value (max) in the Q-table or at random (with probability epsilon, ε).
3. The Q-values are updated.

Q-Learning steps:

1. Create a Q-table:
When Q-learning is performed, we create a Q-table, a matrix of shape [state, action], and initialize its values to zero. We then update and store the Q-values after an episode. The Q-table becomes a reference table from which the agent selects the best action based on the Q-value.

2. Q-learning and making updates:
The next step is for the agent to interact with the environment and update the state-action pairs in the Q-table, Q[state, action]. The agent can act in two ways. The first is to exploit: the Q-table is used as a reference to view all possible actions for a given state, and the agent selects the action with the maximum value. The second is to explore: instead of selecting the action with the maximum expected future reward, the agent selects an action at random. Acting randomly is important because it lets the agent explore and discover new states that might otherwise never be selected during exploitation.

3. Updating the Q-table:
Updates occur after each step or action, and an episode ends when the agent reaches a terminal state. A terminal state can be, for example, landing on a checkout page, reaching the end of a game, or completing some desired objective. The agent will not learn much from a single episode, but with enough exploration (steps and episodes) it will eventually converge to the optimal Q-values, Q-star (Q∗).
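
The three steps above can be sketched as textbook tabular Q-learning. This is an illustrative sketch under stated assumptions rather than the Agent class of this notebook, which stores one value per board state instead of a [state, action] table. The helper names `make_q_table`, `choose_action` and `q_update`, and the toy states `"s0"`, `"s1"`, `"s2"`, are hypothetical.

```python
import random
from collections import defaultdict

def make_q_table(n_actions):
    # Step 1: unseen states default to a row of zeros, one entry per action.
    return defaultdict(lambda: [0.0] * n_actions)

def choose_action(q_table, state, n_actions, epsilon, rng=random):
    # Step 2: explore with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    row = q_table[state]
    return max(range(n_actions), key=lambda a: row[a])

def q_update(q_table, state, action, reward, next_state, alpha, gamma, done):
    # Step 3: one-step Q-learning target, r + gamma * max_a' Q(s', a').
    # A terminal transition has no future value.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])

q = make_q_table(n_actions=2)
# A terminal transition with reward 1 moves Q("s1", 0) halfway to the target.
q_update(q, "s1", 0, 1.0, "s2", alpha=0.5, gamma=0.9, done=True)
print(q["s1"])  # [0.5, 0.0]
```

With `epsilon=0` the agent always exploits. The notebook's default `exp_rate=0.3` corresponds to exploring on roughly 30% of moves.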

In [3]:
class Agent:
    def __init__(self, name, exp_rate=0.3):
        self.name = name
        self.states = []  # record all positions taken
        self.learn_rate = 0.5
        self.exp_rate = exp_rate
        self.gamma = 0.9
        self.states_value = {}  # state -> value
        
    def getBoardState(self, board):
        boardState = str(board.reshape(cols * rows))
        return boardState
        
    def chooseAction(self, positions, current_board, symbol):
        if np.random.uniform(0, 1) <= self.exp_rate:
            # take random action
            idx = np.random.choice(len(positions))
            action = positions[idx]
        else:
            value_max = -999
            for p in positions:
                next_board = current_board.copy()
                next_board[p] = symbol
                next_boardState = self.getBoardState(next_board)
                
                value = 0 if self.states_value.get(next_boardState) is None else self.states_value.get(next_boardState)

                # print("value", value)
                if value >= value_max:
                    value_max = value
                    action = p
        print("{} takes action {}".format(self.name, action))
        return action

    # append a hash state
    def addState(self, state):
        self.states.append(state)
        
    # Rewards are distributed by state.Reward(), which calls giveReward()
    # on each agent once a game has ended.

    def giveReward(self, reward):
        for st in reversed(self.states):
            if self.states_value.get(st) is None:
                self.states_value[st] = 0
            self.states_value[st] += self.learn_rate * (self.gamma * reward - self.states_value[st])
            #print(self.states_value)
            reward = self.states_value[st]

    def reset(self):
        self.states = []

    def savePolicy(self):
        # Persist the learned state values to a file named after the agent.
        with open('policy_' + str(self.name), 'wb') as fw:
            pickle.dump(self.states_value, fw)

    def loadPolicy(self, file):
        # Restore previously saved state values.
        with open(file, 'rb') as fr:
            self.states_value = pickle.load(fr)
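
To see what `giveReward` does numerically, the backward pass can be replayed on a toy episode. The sketch below copies the update rule into a standalone function. The name `back_up` and the state labels `"s1"` to `"s3"` are hypothetical, and the defaults mirror the notebook's `learn_rate=0.5` and `gamma=0.9`.

```python
def back_up(values, visited, reward, learn_rate=0.5, gamma=0.9):
    """Propagate a final reward backwards through the visited states."""
    for st in reversed(visited):
        v = values.get(st, 0.0)
        # Move the stored value toward the discounted reward of its successor.
        v += learn_rate * (gamma * reward - v)
        values[st] = v
        # The updated value becomes the "reward" seen by the previous state.
        reward = v
    return values

values = back_up({}, ["s1", "s2", "s3"], reward=1)
for st in ["s3", "s2", "s1"]:
    print(st, round(values[st], 6))
```

Starting from all-zero values and a final reward of 1, the last state visited ends up at about 0.45, the one before it at about 0.2025 and the first at about 0.091125, so states closer to the win receive more credit.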

<h3> Result and Analysis </h3>

With a discount factor of 0.9 and a learning rate of 0.5, about 95% of games ended in a draw after training, which suggests the agents had become very hard to beat. Before learning, ‘player1’ won 100% of the games, ‘player2’ won 0% and 0% were ties. After 20000 learning games, ‘player1’ won 0.6%, ‘player2’ won 4% and 95% were ties. Since this 3x3 game is a draw when both sides play optimally, the high tie rate indicates that both agents have learnt to play close to optimally.
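
Percentages of this kind can be computed from a list of per-game results. The sketch below assumes the notebook's encoding (1 for agent1, -1 for agent2, 0 for a tie). The function `outcome_percentages` and the `LABELS` mapping are hypothetical names introduced for illustration, and the sample results are made up.

```python
from collections import Counter

# Result codes as used by the game loop: 1, -1 and 0.
LABELS = {1: "agent1 (X)", -1: "agent2 (O)", 0: "tie"}

def outcome_percentages(results):
    """Map each outcome label to its share of the games, in percent."""
    total = len(results)
    if total == 0:
        return {label: 0.0 for label in LABELS.values()}
    counts = Counter(results)
    return {LABELS[k]: 100.0 * counts[k] / total for k in LABELS}

print(outcome_percentages([1, 1, -1, 0]))
```

For the four sample results this gives 50% for agent1 and 25% each for agent2 and ties. Unlike a running tally, it reports an outcome that never occurred as 0.0 instead of omitting it.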

### Training with 1000 rounds

In [4]:
rows=3
cols=3
alist = []

if __name__ == "__main__":
    # training
    a1 = Agent("agent1")
    a2 = Agent("agent2")

    st = state(a1, a2)
    #print("training...")

    print(st.availablePositions())
    st.play(1000)
    a1.savePolicy()
    a2.savePolicy()

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
Episode 0
Rounds 0
agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent1 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | X | O | X | 
    -------------
agent2 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | O | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | O | 
    -------------
    | X | O | X | 
    -------------
agent2 takes action (0, 0)
    -------------
    | O | * | * |
    -------------
    | * | X | O | 
    -------------
    | X | O | X

agent2 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | X | O | 
    -------------
agent2 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | X | O | * | 
    -------------
    | * | X | O | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | * | * |
    -------------
    | X | O | * | 
    -------------
    | * | X | O | 
    -------------
agent2 takes action (2, 0)
    -------------
    | X | * | * |
    -------------
    | X | O | * | 
    -------------
    | O | X | O | 
    -------------
agent1 takes action (1, 2)
    -------------
    | X | * | * |
    -------------
    | X | O | X | 
    -------------
    | O | X | O | 
    -------------
agent2 takes action (0, 2)
    -------------
    | X | * | O |
    ---

    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | X | * | O | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | X | O | O | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | * | * |
    -------------
    | X | * | * | 
    -------------
    | X | O | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 16.43835616438356), (-1, 34.24657534246575), (1, 49.31506849315068)]
Episode 73
agent1 takes action (1, 0)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------

agent1 takes action (2, 1)
    -------------
    | * | * | O |
    -------------
    | X | X | * | 
    -------------
    | * | X | O | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | * | O |
    -------------
    | X | X | * | 
    -------------
    | O | X | O | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | * | O |
    -------------
    | X | X | X | 
    -------------
    | O | X | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 13.88888888888889), (-1, 31.48148148148148), (1, 54.629629629629626)]
Episode 108
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | O | * |
    -------------
 

agent1 takes action (1, 0)
    -------------
    | * | O | * |
    -------------
    | X | X | X | 
    -------------
    | O | X | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 14.788732394366196), (-1, 29.577464788732392), (1, 55.633802816901415)]
Episode 142
agent1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (2, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | X | O | 
    -------------
agent1 takes action (2, 0)
    -------------
    | * | O | X |
    -------------

agent2 takes action (2, 1)
    -------------
    | * | O | * |
    -------------
    | * | X | X | 
    -------------
    | X | O | O | 
    -------------
agent1 takes action (1, 0)
    -------------
    | * | O | * |
    -------------
    | X | X | X | 
    -------------
    | X | O | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 14.52513966480447), (-1, 30.726256983240223), (1, 54.7486033519553)]
Episode 179
agent1 takes action (1, 0)
    -------------
    | * | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | X | * | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | * | O |
    -------------
    | X | * | X | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | O |
    -------------
  

    -------------
    | * | * | O | 
    -------------
    | X | * | X | 
    -------------
agent1 takes action (1, 0)
    -------------
    | O | * | * |
    -------------
    | X | * | O | 
    -------------
    | X | * | X | 
    -------------
agent2 takes action (2, 1)
    -------------
    | O | * | * |
    -------------
    | X | * | O | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | O | * | * |
    -------------
    | X | X | O | 
    -------------
    | X | O | X | 
    -------------
agent2 takes action (0, 2)
    -------------
    | O | * | O |
    -------------
    | X | X | O | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (0, 1)
    -------------
    | O | X | O |
    -------------
    | X | X | O | 
    -------------
    | X | O | X | 
    -------------
Tie
Percent of game won by each agent: [(0, 13.82488479262673), (-1, 28.57142857142857), (1, 57.6036866359447)]
Episode 217
agent1 ta

    -------------
agent2 takes action (2, 1)
    -------------
    | X | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | O | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | X | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | X | * | * |
    -------------
    | * | * | * | 
    -------------
    | O | O | X | 
    -------------
agent1 takes action (1, 2)
    -------------
    | X | * | * |
    -------------
    | * | * | X | 
    -------------
    | O | O | X | 
    -------------
agent2 takes action (0, 2)
    -------------
    | X | * | O |
    -------------
    | * | * | X | 
    -------------
    | O | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | X | * | O |
    -------------
    | * | X | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each age

agent2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | O | O | X | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | O | X |
    -------------
    | * | X | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 12.88135593220339), (-1, 26.101694915254235), (1, 61.016949152542374)]
Episode 295
agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------


agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | * | X | O | 
    -------------
    | X | X | O | 
    -------------
Winner is Agent2 (O)
Percent of game won by each agent: [(0, 13.939393939393941), (-1, 24.848484848484848), (1, 61.212121212121204)]
Episode 330
agent1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | O | X |
    -------------

    -------------
    | O | O | O | 
    -------------
Winner is Agent2 (O)
Percent of game won by each agent: [(0, 12.398921832884097), (-1, 24.528301886792452), (1, 63.07277628032345)]
Episode 371
agent1 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | X | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | * | O |
    -------------
    | * | * | * | 
    -------------
    | O | X | X | 
    -------------
agent1 takes action (0, 1)
    -------------
    | * | X | O |
    -------------
    | * | * | * | 
    -------------
    | O | X | X | 
    -------------
agent2 takes action (1, 0)

    -------------
    | X | * | * | 
    -------------
agent2 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | O | 
    -------------
    | X | * | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | O | 
    -------------
    | X | * | X | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | O | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | O | 
    -------------
    | X | O | X | 
    -------------
agent2 takes action (0, 0)
    -------------
    | O | * | * |
    -------------
    | * | X | O | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (0, 2)
    -------------
    | O | * | X |
    -------------
    | * | X | O | 
    -------------
    | X | O | X | 
    -------------
Winner is Agent

    | O | * | * | 
    -------------
    | * | X | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | O | X | * |
    -------------
    | O | * | * | 
    -------------
    | O | X | X | 
    -------------
Winner is Agent2 (O)
Percent of game won by each agent: [(0, 12.582781456953644), (-1, 24.282560706401764), (1, 63.13465783664459)]
Episode 453
agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (1, 0)
    -------------
    | * | * | * |
    -------------
    | O | * | * | 
    -------------
    | * | * | X | 
    -------------
agent1 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | O | * | * | 
    -------------
    | X | * | X | 
    -------------
agent2 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | O | O | * | 
    -------------
    | X | * | X | 
    -------------
agent1 

agent2 takes action (2, 2)
    -------------
    | * | O | * |
    -------------
    | X | X | * | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (0, 2)
    -------------
    | * | O | X |
    -------------
    | X | X | * | 
    -------------
    | * | * | O | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | X | X | * | 
    -------------
    | O | * | O | 
    -------------
agent1 takes action (2, 1)
    -------------
    | * | O | X |
    -------------
    | X | X | * | 
    -------------
    | O | X | O | 
    -------------
agent2 takes action (1, 2)
    -------------
    | * | O | X |
    -------------
    | X | X | O | 
    -------------
    | O | X | O | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | O | X |
    -------------
    | X | X | O | 
    -------------
    | O | X | O | 
    -------------
Tie
Percent of game won by each agent: [(0, 12.525252525252526), (-1, 

    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | O | * | * | 
    -------------
agent1 takes action (1, 0)
    -------------
    | * | * | * |
    -------------
    | X | X | * | 
    -------------
    | O | * | * | 
    -------------
agent2 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | X | X | O | 
    -------------
    | O | * | * | 
    -------------
agent1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | X | X | O | 
    -------------
    | O | * | * | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | X |
    -------------
    | X | X | O | 
    -------------
    | O | O | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | X | X | O | 
    -------------
    | O | O | X

    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | O | * | * | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | X | X | 
    -------------
    | O | * | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | X | X | 
    -------------
    | O | * | O | 
    -------------
agent1 takes action (1, 0)
    -------------
    | * | * | * |
    -------------
    | X | X | X | 
    -------------
    | O | * | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 13.297872340425531), (-1, 23.581560283687946), (1, 63.12056737588653)]
Episode 564
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    ----

    | O | O | X | 
    -------------
agent1 takes action (1, 2)
    -------------
    | * | X | X |
    -------------
    | * | O | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 13.388429752066116), (-1, 22.975206611570247), (1, 63.63636363636363)]
Episode 605
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | O | * | 
    -------------
agent1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | O | 
    -------------
agent1 takes action (2, 0)
    -------------

    | * | X | * | 
    -------------
agent1 takes action (1, 0)
    -------------
    | O | O | X |
    -------------
    | X | X | O | 
    -------------
    | * | X | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | O | O | X |
    -------------
    | X | X | O | 
    -------------
    | * | X | O | 
    -------------
agent1 takes action (2, 0)
    -------------
    | O | O | X |
    -------------
    | X | X | O | 
    -------------
    | X | X | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 13.622291021671826), (-1, 22.60061919504644), (1, 63.77708978328174)]
Episode 646
agent1 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent1 takes action (1, 2)
    -------------


    -------------
    | * | X | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | O | * | * | 
    -------------
    | * | X | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | * | * | * |
    -------------
    | O | * | * | 
    -------------
    | O | X | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | O | X | * | 
    -------------
    | O | X | X | 
    -------------
agent2 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | O | X | O | 
    -------------
    | O | X | X | 
    -------------
agent1 takes action (0, 1)
    -------------
    | * | X | * |
    -------------
    | O | X | O | 
    -------------
    | O | X | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 13.828238719068414), (-1, 21.97962154294032), (1, 64.19213973799127)]
Episode 687
agent1 takes action (2, 1)


Percent of game won by each agent: [(0, 14.758620689655173), (-1, 21.793103448275865), (1, 63.44827586206897)]
Episode 725
agent1 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | X | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | * |
    -------------
    | * | * | X | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (1, 0)
    -------------
    | * | O | * |
    -------------
    | X | * | X | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | * | O | * |
    -------------
    | X | * | X | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (0, 2)
    -------------
    | * | O | X |
    -------------
    | X | * | X | 
    -------------
    | * | * | O | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | O | X |
    -------------
    | X | * | X | 
  

    -------------
    | * | * | X |
    -------------
    | * | X | O | 
    -------------
    | O | X | O | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | * | X |
    -------------
    | * | X | O | 
    -------------
    | O | X | O | 
    -------------
agent2 takes action (0, 1)
    -------------
    | X | O | X |
    -------------
    | * | X | O | 
    -------------
    | O | X | O | 
    -------------
agent1 takes action (1, 0)
    -------------
    | X | O | X |
    -------------
    | X | X | O | 
    -------------
    | O | X | O | 
    -------------
Tie
Percent of game won by each agent: [(0, 15.111695137976348), (-1, 21.28777923784494), (1, 63.600525624178715)]
Episode 761
agent1 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | X | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | X | 
    -------------
    | *

agent1 takes action (2, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (0, 0)
    -------------
    | O | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent1 takes action (1, 0)
    -------------
    | O | * | * |
    -------------
    | X | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (1, 1)
    -------------
    | O | * | * |
    -------------
    | X | O | * | 
    -------------
    | * | * | X | 
    -------------
agent1 takes action (2, 1)
    -------------
    | O | * | * |
    -------------
    | X | O | * | 
    -------------
    | * | X | X | 
    -------------
agent2 takes action (2, 0)
    -------------
    | O | * | * |
    -------------
    | X | O | * | 
    -------------
    | O | X | X | 
    -------------
agent1 takes action (1, 2)
    -------------
    | O | * | * |
    ---

agent1 takes action (2, 2)
    -------------
    | * | * | O |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | O |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | O | O |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 15.393794749403341), (-1, 21.121718377088307), (1, 63.48448687350835)]
Episode 838
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (2, 0)
    -------------
    | * | * | O |
    -------------


    | * | * | X | 
    -------------
agent2 takes action (0, 2)
    -------------
    | * | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | * | X | 
    -------------
agent2 takes action (2, 1)
    -------------
    | X | * | O |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | X | * | O |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 15.358361774744028), (-1, 20.819112627986346), (1, 63.82252559726962)]
Episode 879
agent1 takes action (2, 1)
    -------------
    | * | * | * |
    -------------
    | * | * | * | 
    -------------
    | * | X | * | 
    -------------
agent2 takes action (1, 1)
    -------------

    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 15.376226826608505), (-1, 20.93784078516903), (1, 63.68593238822247)]
Episode 917
agent1 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (0, 1)
    -------------
    | * | O | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent1 takes action (0, 0)
    -------------
    | X | O | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 2)
    -------------
    | X | O | * |
    -------------
    | * | X | * | 
    -------------
    | * | * | O | 
    -------------
agent1 takes action (2, 0)
    -------------
    | X | O | * |
    -------------
    | * | X | * | 
    -------------
    | X | * | O | 
    -------------
agent2 takes action (2, 1)
    -------------
    | X | O | * |
 

agent2 takes action (2, 2)
    -------------
    | X | * | * |
    -------------
    | * | O | * | 
    -------------
    | X | * | O | 
    -------------
agent1 takes action (0, 1)
    -------------
    | X | X | * |
    -------------
    | * | O | * | 
    -------------
    | X | * | O | 
    -------------
agent2 takes action (2, 1)
    -------------
    | X | X | * |
    -------------
    | * | O | * | 
    -------------
    | X | O | O | 
    -------------
agent1 takes action (0, 2)
    -------------
    | X | X | X |
    -------------
    | * | O | * | 
    -------------
    | X | O | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 15.328467153284672), (-1, 20.5422314911366), (1, 64.12930135557873)]
Episode 959
agent1 takes action (1, 2)
    -------------
    | * | * | * |
    -------------
    | * | * | X | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (1, 1)
    -------------
    | * | * | * |
    -------------
  

    -------------
    | O | X | O | 
    -------------
Tie
Percent of game won by each agent: [(0, 15.531062124248496), (-1, 20.440881763527056), (1, 64.02805611222445)]
Episode 998
agent1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
agent2 takes action (2, 1)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | * | 
    -------------
agent1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent2 takes action (0, 0)
    -------------
    | O | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
agent1 takes action (1, 1)
    -------------
    | O | * | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
    -------------
agent2 takes action (2, 0)
    ------------

### Testing
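
The evaluation cell that follows constructs both agents with `exp_rate=0`. Assuming `exp_rate` is the exploration probability of an epsilon-greedy policy (its usual meaning; the `Agent` class is not shown in this part of the notebook), a value of 0 makes each agent always pick the move it currently rates highest. The sketch below illustrates that idea only. The function name `choose_action` and the `value_of` callable are hypothetical and are not taken from the notebook's `Agent` class.

```python
import random


def choose_action(positions, value_of, exp_rate=0.3, rng=random):
    """Epsilon-greedy choice over the available board positions.

    positions: list of (row, col) tuples that are still empty
    value_of:  callable mapping a position to its estimated value (hypothetical)
    exp_rate:  probability of picking a random position instead of the best one
    """
    if not positions:
        raise ValueError("no available positions")
    # Explore: with probability exp_rate, pick any legal move at random.
    if rng.random() < exp_rate:
        return rng.choice(positions)
    # Exploit: otherwise pick the move with the highest estimated value.
    return max(positions, key=value_of)


# Illustrative values only; they are not read from a trained policy file.
values = {(0, 2): 0.9, (1, 1): 0.4, (2, 0): 0.1}
print(choose_action(list(values), values.get, exp_rate=0))  # (0, 2)
```

Because `rng.random()` never returns a value below 0, `exp_rate=0` always takes the greedy branch. That would be consistent with the test log, where Player1 opens with `(0, 2)` in every episode shown.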

In [5]:
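alist = []

# Rebuild both agents with exploration switched off (exp_rate=0)
# and load the policies saved after training.
p1 = Agent("Player1", exp_rate=0)
p1.loadPolicy("policy_agent1")

p2 = Agent("Player2", exp_rate=0)
p2.loadPolicy("policy_agent2")

# Play 500 evaluation games between the two trained agents.
st = state(p1, p2)
st.play(500)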

Episode 0
Rounds 0
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 1)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
    -------------
Player2 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | O | O | X | 
    -------------
Player1 takes action (1, 2)
    -------------

    | * | O | X | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
    -------------
Player2 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | * | X | * | 
    -------------
    | O | O | X | 
    -------------
Player1 takes action (1, 2)
    -------------
    | * | O | X |
    -------------
    | * | X | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 43
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
  

    -------------
    | * | X | X |
    -------------
    | * | O | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 78
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (1, 0)
    -------------
    | * | * | X |
    -------------
    | O | * | * | 
    -------------
    | * | * | * | 
    -------------
Player1 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | O | X | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | O | X | * | 
    -------------
    | * | * | O | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | O | X | * | 
    -------------
    | X | * | O | 
    -------------
Winner is Ag

    -------------
    | * | * | X |
    -------------
    | * | X | O | 
    -------------
    | X | * | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 114
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player1 takes action (0, 0)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player2 tak

Player1 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player1 takes action (1, 0)
    -------------
    | * | O | X |
    -------------
    | X | * | * | 
    -------------
    | X | * | O | 
    -------------
Player2 takes action (2, 1)
    -------------
    | * | O | X |
    -------------
    | X | * | * | 
    -------------
    | X | O | O | 
    -------------
Player1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | X | X | * | 
    -------------
    | X | O | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 153
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
 

    | X | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | O | O | X |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 192
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | O | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player1 takes action (1, 0)
    -------------
    | * | O | X |
    -------------
    | X | * | * | 
 

    -------------
    | * | * | * | 
    -------------
    | X | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | O | O | X |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 230
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (1, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | O | 
    -------------
    | * | * | * | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | O | 
    -------------
    | X | * | * | 
    -------------
Player2 takes action (2, 2)
    -------------
 

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 267
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | * | 
    -------------
Player2 takes action (0, 0)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | O | X | 
 

    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player1 takes action (0, 0)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | O | X | 
    -------------
Player1 takes action (1, 2)
    -------------
    | X | * | X |
    -------------
    | * | O | X | 
    -----

Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player1 takes action (0, 0)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | O | X | 
    -------------
Player1 takes action (1, 2)
    -------------
    | X | * | X |
    -------------
    | * | O | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 342
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
 

Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | * | 
    -------------
Player2 takes action (0, 0)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | O | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | O | O | X |
    -------------
    | * | X | * | 
    -------------
    | X | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100

    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | O | 
    -------------
Player1 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player2 takes action (0, 1)
    -------------
    | * | O | X |
    -------------
    | * | * | * | 
    -------------
    | X | * | O | 
    -------------
Player1 takes action (1, 0)
    -------------
    | * | O | X |
    -------------
    | X | * | * | 
    -------------
    | X | * | O | 
    -------------
Player2 takes action (2, 1)
    -------------
    | * | O | X |
    -------------
    | X | * | * | 
    -------------
    | X | O | O | 
    -------------
Player1 takes action (1, 1)
    -------------
    | * | O | X |
    -------------
    | X | X | * | 
    -------------
    | X | O | O | 
    -------------
Winner is

Player2 takes action (2, 0)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player1 takes action (0, 0)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | * | X | 
    -------------
Player2 takes action (2, 1)
    -------------
    | X | * | X |
    -------------
    | * | O | * | 
    -------------
    | O | O | X | 
    -------------
Player1 takes action (1, 2)
    -------------
    | X | * | X |
    -------------
    | * | O | X | 
    -------------
    | O | O | X | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100

Player1 takes action (1, 1)
    -------------
    | * | * | X |
    -------------
    | * | X | O | 
    -------------
    | X | * | O | 
    -------------
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 492
Player1 takes action (0, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | * | * | 
    -------------
Player2 takes action (2, 1)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | * | 
    -------------
Player1 takes action (2, 2)
    -------------
    | * | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
Player2 takes action (0, 0)
    -------------
    | O | * | X |
    -------------
    | * | * | * | 
    -------------
    | * | O | X | 
    -------------
Player1 takes action (1, 1)
    -------------
    | O | * | X |
    -------------
    | * | X | * | 
    -------------
    | * | O | X | 
 

### Training with 10000 rounds
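
Each episode in the training log ends with a line of running percentages keyed by outcome: `1` for an Agent1 (X) win, `-1` for an Agent2 (O) win and `0` for a tie. A minimal sketch of how such a tally could be computed is given below, assuming the outcomes are collected in a plain list, one entry per finished game. The helper name `win_percentages` is hypothetical, and the notebook's own bookkeeping (for example how `alist` is used) may differ.

```python
from collections import Counter


def win_percentages(results):
    """Return [(outcome, percent), ...] for a list of game outcomes.

    results: one entry per finished game, using 1 for an Agent1 (X) win,
             -1 for an Agent2 (O) win and 0 for a tie (assumed encoding).
    """
    total = len(results)
    if total == 0:
        return []
    counts = Counter(results)
    # Counter preserves first-seen order, so the outcome that
    # occurred first in the list is reported first.
    return [(outcome, count / total * 100) for outcome, count in counts.items()]


# Tallies consistent with the percentages logged after Episode 799
# of the training run (800 games: 446 X wins, 256 O wins, 98 ties).
results = [1] * 446 + [-1] * 256 + [0] * 98
print(win_percentages(results))
```

Computing `count / total * 100` in floating point can produce values such as `55.00000000000001`, which resembles the rounding noise visible in the logged percentages.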

In [7]:
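# Board dimensions for 3x3 tic-tac-toe
rows = 3
cols = 3
alist = []

if __name__ == "__main__":
    # Training: two fresh agents play 10000 games against each other
    a1 = Agent("agent1")
    a2 = Agent("agent2")

    st = state(a1, a2)
    st.play(10000)

    # Save both learned policies so they can be reloaded for testing
    a1.savePolicy()
    a2.savePolicy()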

Episode 0
Rounds 0
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 1
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 2
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 100.0)]
Episode 3
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes ac

agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.17977528089888), (-1, 33.70786516853933), (0, 10.112359550561797)]
Episode 89
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 55.55555555555556), (-1, 33.33333333333333), (0, 11.11111111111111)]
Episode 90
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.94505494505495), (-1, 

Percent of game won by each agent: [(1, 55.69620253164557), (-1, 30.37974683544304), (0, 13.924050632911392)]
Episode 158
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.34591194968554), (-1, 30.81761006289308), (0, 13.836477987421384)]
Episode 159
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.00000000000001), (-1, 31.25), (0, 13.750000000000002)]
Episode 160
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 tak

agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.84647302904564), (-1, 31.950207468879665), (0, 11.20331950207469)]
Episode 241
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.611570247933884), (-1, 32.231404958677686), (0, 11.15702479338843)]
Episode 242
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.79012345679012), (-1, 32.098765432098766), (0, 11.11111111111111)]
Episode 243
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)

agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.3208722741433), (-1, 31.464174454828658), (0, 11.214953271028037)]
Episode 321
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.453416149068325), (-1, 31.366459627329192), (0, 11.180124223602485)]
Episode 322
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.585139318885446), (-1, 31.269349845201237), (0, 11.145510835913312)]
Episode 323
agent1 takes action (2, 2)
agent2 takes action (0, 1)


agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.5), (-1, 28.999999999999996), (0, 12.5)]
Episode 400
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 58.35411471321695), (-1, 28.92768079800499), (0, 12.718204488778055)]
Episode 401
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.457711442786064), (-1, 28.855721393034827), (0, 12.686567164179104)]
Episode 402
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes actio

agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.36820083682008), (-1, 28.661087866108787), (0, 12.97071129707113)]
Episode 478
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.455114822546975), (-1, 28.60125260960334), (0, 12.943632567849686)]
Episode 479
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.54166666666667), (-1, 28.541666666666664), (0, 12.916666666666668)]
Episode 480
agent1 takes action (0, 2)

agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.93226381461676), (-1, 29.590017825311943), (0, 12.4777183600713)]
Episode 561
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.00711743772242), (-1, 29.537366548042705), (0, 12.455516014234876)]
Episode 562
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.90

agent2 takes action (2, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.41029641185648), (-1, 29.95319812792512), (0, 12.636505460218409)]
Episode 641
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.3208722741433), (-1, 30.062305295950154), (0, 12.616822429906541)]
Episode 642
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.38724727838258), (-1, 30.015552099533437), (0, 12.597200622083982)]
Episode 643
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)

agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.84647302904564), (-1, 30.843706777316736), (0, 12.309820193637622)]
Episode 723
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 56.767955801104975), (-1, 30.801104972375693), (0, 12.430939226519337)]
Episode 724
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.827586206896555), (-1, 30.758620689655174), (0, 12.413793103448276)]

agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 55.819774718397994), (-1, 31.914893617021278), (0, 12.265331664580724)]
Episode 799
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.75), (-1, 32.0), (0, 12.25)]
Episode 800
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 55.68039950062422), (-1, 31.960049937578027), (0, 12.359550561797752)]
Episode 801
agent1 takes action (0, 2)
agent2 takes action (2, 0)
...
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.710556186152104), (-1, 32.2360953461975), (0, 13.053348467650396)]
Episode 881
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.761904761904766), (-1, 32.19954648526077), (0, 13.038548752834467)]
Episode 882
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.699886749716875), (-1, 32.16308040770102), (0, 13.137032842582105)]
Episode 883
agent1 takes action (2, 0)
agent2 takes action (2, 2)
...
Percent of game won by each agent: [(1, 55.91286307053942), (-1, 31.12033195020747), (0, 12.966804979253114)]
Episode 964
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.95854922279793), (-1, 31.088082901554404), (0, 12.953367875647666)]
Episode 965
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.90062111801242), (-1, 31.15942028985507), (0, 12.939958592132506)]
Episode 966
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, ...
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.16045845272206), (-1, 31.23209169054441), (0, 12.607449856733524)]
Episode 1047
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.10687022900763), (-1, 31.297709923664126), (0, 12.595419847328243)]
Episode 1048
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.05338417540515), (-1, 31.3...
agent2 takes action (1, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.08888888888889), (-1, 31.377777777777776), (0, 12.533333333333333)]
Episode 1125
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.03907637655418), (-1, 31.438721136767317), (0, 12.522202486678507)]
Episode 1126
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 55.98935226264419), (-1, 31.410825199645075), (0, 12.599822537710736)]
Episode 1127
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
...
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.74043261231281), (-1, 31.61397670549085), (0, 12.645590682196339)]
Episode 1202
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.69409808811305), (-1, 31.67082294264339), (0, 12.635078969243557)]
Episode 1203
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.730897009966775), (-1, 31.64451827242525), (0, 12.624584717607974)]
Episode 1204
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
...
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.25), (-1, 31.171875), (0, 12.578125000000002)]
Episode 1280
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.284153005464475), (-1, 31.147540983606557), (0, 12.568306010928962)]
Episode 1281
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.240249609984396), (-1, 31.201248049921997), (0, 12.558502340093602)]
Episode 1282
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
...
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.28214548126378), (-1, 31.00661278471712), (0, 12.711241734019104)]
Episode 1361
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.314243759177685), (-1, 30.983847283406757), (0, 12.701908957415565)]
Episode 1362
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.346294937637566), (-1, 30.96111518708731), (0, 12.692589875275129)]
Episode 1363
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
...
agent2 takes action (0, 1)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.16343490304709), (-1, 31.024930747922436), (0, 12.81163434903047)]
Episode 1444
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.193771626297575), (-1, 31.003460207612456), (0, 12.802768166089965)]
Episode 1445
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 56.15491009681881), (-1, 30.982019363762102), (0, 12.863070539419086)]
Episode 1446
agent1 takes action (0, 1)
agent2 takes action (0, 2)
...
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.484529295589205), (-1, 30.74391046741277), (0, 12.771560236998026)]
Episode 1519
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.51315789473684), (-1, 30.723684210526315), (0, 12.763157894736842)]
Episode 1520
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.541748849441156), (-1, 30.703484549638393), (0, 12.754766600920448)]
Episode 1521
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, ...
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.214241099312936), (-1, 29.981261711430356), (0, 12.804497189256715)]
Episode 1601
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.24094881398252), (-1, 29.962546816479403), (0, 12.796504369538079)]
Episode 1602
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.26762320648784), (-1, 29.94385527136619), (0, 12.788521522145976)]
Episode 1603
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
...
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.60998810939358), (-1, 29.72651605231867), (0, 12.663495838287753)]
Episode 1682
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.63517528223411), (-1, 29.708853238265004), (0, 12.655971479500892)]
Episode 1683
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 57.60095011876485), (-1, 29.69121140142518), ...
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.15096481271282), (-1, 29.96594778660613), (0, 12.883087400681045)]
Episode 1762
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.11854792966534), (-1, 30.005672149744754), (0, 12.875779920589903)]
Episode 1763
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.08616780045352), (-1, 30.045351473922903), (0, 12.868480725623582)]
Episode 1764
...
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.305812058663776), (-1, 29.766431287343835), (0, 12.927756653992395)]
Episode 1841
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.274701411509234), (-1, 29.80456026058632), (0, 12.920738327904452)]
Episode 1842
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.297883884970155), (-1, 2...
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.26941115164148), (-1, 29.91141219385096), (0, 12.819176654507555)]
Episode 1919
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 57.239583333333336), (-1, 29.895833333333332), (0, 12.864583333333332)]
Episode 1920
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.261842790213436), (-1, 29.88027069234774), (0, 12.857886517438835)]
Episode 1921
agent1 takes action (2, 0)
...
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.607607607607605), (-1, 29.57957957957958), (0, 12.812812812812812)]
Episode 1998
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.57878939469735), (-1, 29.61480740370185), (0, 12.8064032016008)]
Episode 1999
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
Tie
...
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.46628131021194), (-1, 29.43159922928709), (0, 13.102119460500964)]
Episode 2076
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.48675974963891), (-1, 29.417428984111698), (0, 13.095811266249399)]
Episode 2077
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.50721847930702), (-1, 29.403272377285848), (0, 13.089509143407122)]
Episode 2078
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
...
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.209647495361786), (-1, 28.803339517625233), (0, 12.987012987012985)]
Episode 2156
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 58.182661103384326), (-1, 28.789986091794155), (0, 13.027352804821513)]
Episode 2157
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.15569972196478), (-1, 28.82298424467099), (0, 13.021316033364227)]
Episode 2158
agent1 takes action (2, 0)
...
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.04289544235925), (-1, 29.088471849865954), (0, 12.868632707774799)]
Episode 2238
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.016971862438595), (-1, 29.12014292094685), (0, 12.862885216614561)]
Episode 2239
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.99107142857143), (-1, 29.151785714285715), (0, 12.857142857142856)]
Episode 2240
agent1 takes action (0, 1)
agent2 takes action (2, 2)
...
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.96536796536797), (-1, 29.09090909090909), (0, 12.943722943722943)]
Episode 2310
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.940285590653396), (-1, 29.121592384249244), (0, 12.938122025097359)]
Episode 2311
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.91522491349481), (-1, 29.1522491349481), (0, 12.9325...
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.315877670716375), (-1, 28.739002932551323), (0, 12.945119396732299)]
Episode 2387
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.291457286432156), (-1, 28.768844221105528), (0, 12.939698492462313)]
Episode 2388
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.3089158643784), (-1, 28.756802009208876), (0, 12.9...
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.585445625511035), (-1, 28.536385936222402), (0, 12.878168438266558)]
Episode 2446
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.60237024928484), (-1, 28.524724152022884), (0, 12.872905598692277)]
Episode 2447
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.619281045751634), (-1, 28.513071895424837), (0, 12.867647058823529)]
Episode 2448
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (1, ...
Percent of game won by each agent: [(1, 58.43898573692552), (-1, 28.526148969889064), (0, 13.034865293185419)]
Episode 2524
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.45544554455445), (-1, 28.51485148514851), (0, 13.02970297029703)]
Episode 2525
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.432304038004744), (-1, 28.54315122723674), (0, 13.02454473475851)]
Episode 2526
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
Winner is Agent2 (O)
...
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.240491740299646), (-1, 28.73607376104495), (0, 13.023434498655398)]
Episode 2603
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.25652841781874), (-1, 28.72503840245776), (0, 13.018433179723502)]
Episode 2604
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.27255278310941), (-1, 28.71401151631478), (0, 13.013435700575815)]
Episode 2605
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
...
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.22454308093995), (-1, 28.646027601641176), (0, 13.129429317418872)]
Episode 2681
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.24011931394482), (-1, 28.635346756152124), (0, 13.124533929903057)]
Episode 2682
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.21841222512113), (-1, 28.661945583302273), (0, 13.119642191576594)]
...
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.73982558139535), (-1, 28.815406976744185), (0, 13.444767441860463)]
Episode 2752
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 57.7188521612786), (-1, 28.804940065383217), (0, 13.476207773338174)]
Episode 2753
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.73420479302832), (-1, 28.794480755265067), (0, 13.47131445170661)]
Episode 2754
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
...
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.90590732224974), (-1, 28.58153519632119), (0, 13.512557481429077)]
Episode 2827
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.88543140028288), (-1, 28.60678925035361), (0, 13.507779349363508)]
Episode 2828
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 57.86496995404736), (-1, 28.596677271120534), (0, 13.538352774832097)]
Episode 2829
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
...
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 57.931034482758626), (-1, 28.6551724137931), (0, 13.413793103448276)]
Episode 2900
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.91106514994829), (-1, 28.67976559806963), (0, 13.409169251982075)]
Episode 2901
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.925568573397655), (-1, 28.669882839421092), (0, 13.404548587181253)]
Episode 2902
agent1 takes action (0, 0)
agent2 takes action (1, 2)
...
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 57.926624032312354), (-1, 28.6772130595759), (0, 13.396162908111748)]
Episode 2971
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.907133243607), (-1, 28.70121130551817), (0, 13.391655450874831)]
Episode 2972
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 57.887655566...
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.79092702169625), (-1, 28.66535174227482), (0, 13.543721236028928)]
Episode 3042
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.80479789681235), (-1, 28.655931646401577), (0, 13.539270456786065)]
Episode 3043
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.81865965834429), (-1, 28.64651773981603), (0, 13.534822601839686)]
Episode 3044
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
...
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.10073788899584), (-1, 28.360603144048763), (0, 13.538658966955404)]
Episode 3117
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.11417575368826), (-1, 28.35150737652341), (0, 13.534316869788327)]
Episode 3118
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.12760500160308), (-1, 28.342417441487655), (0, 13.529977556909264)]
Episode 3119
agent1 takes action (1, 1)
...
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.223889931207005), (-1, 28.298936835522206), (0, 13.477173233270795)]
Episode 3198
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.23694904657706), (-1, 28.290090653329163), (0, 13.472960300093778)]
Episode 3199
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.25), (-1, 28.281250000000004), (0, 13.468749999999998)]
Episode 3200
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
...
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.38167938931298), (-1, 28.18320610687023), (0, 13.435114503816795)]
Episode 3275
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.36385836385837), (-1, 28.205128205128204), (0, 13.431013431013431)]
Episode 3276
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 58.34604821483064), (-1, 28.196521208422336), (0, 13.457430576747026)]
Episode 3277
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes act

agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.3134684147795), (-1, 28.277711561382596), (0, 13.408820023837903)]
Episode 3356
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.32588620792374), (-1, 28.269288054810843), (0, 13.404825737265416)]
Episode 3357
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.338296605122096), (-1, 28.26086956521739), (0, 13.400833829660513)]
Episode 3358
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 0)

agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.26771653543307), (-1, 28.258967629046367), (0, 13.47331583552056)]
Episode 3429
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.2798833819242), (-1, 28.250728862973762), (0, 13.46938775510204)]
Episode 3430
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.26289711454387), (-1, 28.271640921014278), (0, 13.465461964441856)]
Episode 3431
agent1 takes action (1, 0)
agent2 takes action (0, 1)
a

agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.08383233532935), (-1, 28.371827773025377), (0, 13.54433989164528)]
Episode 3507
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 58.09578107183581), (-1, 28.36374002280502), (0, 13.54047890535918)]
Episode 3508
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 58.079224850384726), (-1, 28.38

agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.937395192845166), (-1, 28.200111794298486), (0, 13.862493012856344)]
Episode 3578
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 57.92120704107293), (-1, 28.192232467169596), (0, 13.886560491757475)]
Episode 3579
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.9050279329609), (-1, 28.212290502793298), (0, 13.88268156424581)]
Episode 3580
agent1 takes action (2, 2)
agent2 takes act

Percent of game won by each agent: [(1, 57.43631881676253), (-1, 28.512736236647495), (0, 14.050944946589974)]
Episode 3651
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.44797371303395), (-1, 28.504928806133627), (0, 14.047097480832422)]
Episode 3652
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.459622228305506), (-1, 28.49712565015056), (0, 14.043252121543937)]
Episode 3653
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.471264367

agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 57.28467936678293), (-1, 28.52159914140059), (0, 14.193721491816474)]
Episode 3727
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.2961373390558), (-1, 28.513948497854074), (0, 14.189914163090128)]
Episode 3728
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.28077232502011), (-1, 28.53311879860552), (0, 14.186108876374362)]
Episode 3729
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes actio

agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.28947368421052), (-1, 28.447368421052634), (0, 14.26315789473684)]
Episode 3800
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.30071033938437), (-1, 28.439884240989215), (0, 14.259405419626415)]
Episode 3801
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.31194108364019), (-1, 28.432403997895843), (0, 14.255654918463968)]
Episode 3802
agent1 takes action (0, 2)

Episode 3873
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.17604543107899), (-1, 28.368611254517294), (0, 14.455343314403718)]
Episode 3874
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.18709677419355), (-1, 28.361290322580647), (0, 14.451612903225808)]
Episode 3875
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.19814241486068), (-1, 28.35397316821465), (0, 14.447884416924664)]

agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 57.42849911414831), (-1, 28.170083523158695), (0, 14.401417362692989)]
Episode 3951
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.43927125506073), (-1, 28.162955465587043), (0, 14.397773279352228)]
Episode 3952
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.4500379458639), (-1, 28.15583101441943), (0, 14.39413103971667)]
Episode 3953
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes actio

Percent of game won by each agent: [(1, 57.288557213930346), (-1, 28.2089552238806), (0, 14.502487562189053)]
Episode 4020
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.274309873165876), (-1, 28.22680925142999), (0, 14.498880875404128)]
Episode 4021
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.2849328692193), (-1, 28.219791148682248), (0, 14.495275982098457)]
Episode 4022
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.29555058414119), (-1, 28.2127765349241

agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.08353688324377), (-1, 28.285295554469958), (0, 14.631167562286274)]
Episode 4094
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 57.069597069597066), (-1, 28.278388278388277), (0, 14.652014652014653)]
Episode 4095
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.0556640625), (-1, 28.2958984375), (0, 14.6484375)]
Episode 4096
agent1 takes action (1, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agen

agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.01248799231509), (-1, 28.31412103746398), (0, 14.673390970220943)]
Episode 4164
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.022809123649466), (-1, 28.30732292917167), (0, 14.669867947178872)]
Episode 4165
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 57.00912145943351), (-1, 28.30052808449352), (0, 14.69035045607297)]
Epis

Percent of game won by each agent: [(1, 57.02498821310702), (-1, 28.28854314002829), (0, 14.686468646864686)]
Episode 4242
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.03511666273863), (-1, 28.281876031110066), (0, 14.683007306151307)]
Episode 4243
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.04524033930255), (-1, 28.275212064090482), (0, 14.679547596606973)]
Episode 4244
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 57.031802120141336), (-1, 28.29210836277

agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.984016678248786), (-1, 28.28353022932592), (0, 14.732453092425295)]
Episode 4317
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.99397869383974), (-1, 28.27698008337193), (0, 14.729041222788327)]
Episode 4318
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 57.0039360963186), (-1, 28.270432970595046), (0, 14.72563093308636)]
Episode 4319
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)

agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.79490097882996), (-1, 28.431595720464376), (0, 14.773503300705668)]
Episode 4393
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.80473372781065), (-1, 28.4251251706873), (0, 14.770141101502048)]
Episode 4394
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.814562002275316), (-1, 28.418657565415245), (0, 14.766780432309442)]
Episode 4395

agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.79426908439669), (-1, 28.38594134766062), (0, 14.81978956794269)]
Episode 4467
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.80393912264996), (-1, 28.379588182632048), (0, 14.816472694717994)]
Episode 4468
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.81360483329604), (-1, 28.37323786081897), (0, 14.813157305884985)]
Episode 4469
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent:

agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.61748513543272), (-1, 28.451882845188287), (0, 14.930632019378992)]
Episode 4541
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.627036547776314), (-1, 28.445618670189344), (0, 14.927344782034346)]
Episode 4542
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.61457186880916), (-1, 28.46136913933524), (0, 14.9240589918556)]
Episode 4543
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percen

agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.56412478336221), (-1, 28.444540727902947), (0, 14.991334488734836)]
Episode 4616
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.573532596924416), (-1, 28.438379900368204), (0, 14.988087502707387)]
Episode 4617
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.58293633607623), (-1, 28.432221741013425), (0, 14.98484192291035)]
Episode 4618
agent1 takes action (2, 0)
agent2 takes action (2, 2)

agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 56.624706635374444), (-1, 28.376360145082142), (0, 14.998933219543417)]
Episode 4687
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.633959044368595), (-1, 28.37030716723549), (0, 14.995733788395905)]
Episode 4688
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.62188099808061), (-1, 28.385583280017062), (0, 14.992535721902325)]
Episode 4689
agent1 takes a

agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.6470340765671), (-1, 28.24989482541018), (0, 15.103071098022719)]
Episode 4754
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.635120925341745), (-1, 28.264984227129336), (0, 15.099894847528915)]
Episode 4755
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.64423885618166), (-1, 28.259041211101767), (0, 15.096719932716567)]
Episode 4756
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)

agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.64871582435791), (-1, 28.272576636288317), (0, 15.07870753935377)]
Episode 4828
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.636984882998554), (-1, 28.28743010975357), (0, 15.075585007247877)]
Episode 4829
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.64596273291925), (-1, 28.2815734989648), (0, 15.072463768115943)]
Episode 4830
a

Percent of game won by each agent: [(1, 56.58914728682171), (-1, 28.396572827417383), (0, 15.014279885760914)]
Episode 4902
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 56.5776055476239), (-1, 28.390781154395267), (0, 15.031613297980828)]
Episode 4903
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.566068515497555), (-1, 28.405383360522023), (0, 15.028548123980423)]
Episode 4904
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56

agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 56.433453960595095), (-1, 28.327301970245276), (0, 15.23924406915963)]
Episode 4974
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.42211055276382), (-1, 28.34170854271357), (0, 15.236180904522614)]
Episode 4975
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.430868167202576), (-1, 28.336012861736336), (0, 15.233118971061094)]
Episode 4976
agent1 takes action (2, 1)
agent2 takes ac

agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 56.33551457465794), (-1, 28.316478286734085), (0, 15.348007138607972)]
Episode 5043
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.32434575733545), (-1, 28.330689928628072), (0, 15.34496431403648)]
Episode 5044
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56

agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 56.01643513989435), (-1, 28.487575816865586), (0, 15.49598904324007)]
Episode 5111
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 56.0054773082942), (-1, 28.501564945226914), (0, 15.492957746478872)]
Episode 5112
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 55.99452376295717), (-1, 28.495990612165066), (0, 15.509485624877762)]
Episode 5113
agent1 takes action (2, 1)
agent2 

agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.91356357322014), (-1, 28.5355971445109), (0, 15.550839282268955)]
Episode 5183
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.90277777777778), (-1, 28.549382716049383), (0, 15.547839506172838)]
Episode 5184
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 55.89199614271938), (-1, 28.54387656702025), (0, 15.564127290260366)]
Episode 5185
agent1 takes action (1, 1)
agent2 takes actio

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.86613424605438), (-1, 28.52253280091272), (0, 15.611332953032896)]
Episode 5259
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.8745247148289), (-1, 28.517110266159694), (0, 15.608365019011408)]
Episode 5260
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 55.86390420072229), (-1, 28.511689792815055), (0, 15.62440600646265)]
Episode 5261
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)

agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.77861163227017), (-1, 28.517823639774857), (0, 15.703564727954971)]
Episode 5330
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.786906771712616), (-1, 28.512474207465765), (0, 15.70061902082161)]
Episode 5331
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.79519879969993), (-1, 28.507126781695426), (0, 15.69767441860465)]
Episode 5332
agent1 takes action (1, 1)
agent2 takes action (1, 0)

agent2 takes action (2, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.82560296846011), (-1, 28.330241187384043), (0, 15.844155844155845)]
Episode 5390
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.833797069189394), (-1, 28.32498608792432), (0, 15.841216842886292)]
Episode 5391
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 55.823442136498514), (-1, 28.31973293768546), (0, 15.856824925816024)]
Episode 5392
agent1 takes action (0, 0)
agent2 takes ac

agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.82631000366435), (-1, 28.21546353975815), (0, 15.958226456577501)]
Episode 5458
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.81608353178238), (-1, 28.228613299139038), (0, 15.955303169078586)]
Episode 5459
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.824175824175825), (-1, 28.223443223443223), (0, 15.95238095238095)]
Episode 5460
agent1 takes action (0, 0)

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.557570262919306), (-1, 28.43155031731641), (0, 16.01087941976428)]
Episode 5515
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.565627266134875), (-1, 28.426395939086298), (0, 16.007976794778823)]
Episode 5516
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.55555555555556), (-1, 28.43936922240348), (0, 16.005075222040965)]
Episode 5517
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 0)

agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 55.53366762177651), (-1, 28.456303724928368), (0, 16.01002865329513)]
Episode 5584
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.5237242614145), (-1, 28.46911369740376), (0, 16.007162041181736)]
Episode 5585
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.53168635875403), (-1, 28.4640171858217), (0, 16.004296455424274)]
Episode 5586
agent1 takes action (1, 0)
agent2 takes action 

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.46322489391796), (-1, 28.394625176803395), (0, 16.14214992927864)]
Episode 5656
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.45342054092275), (-1, 28.407283012197276), (0, 16.13929644687997)]
Episode 5657
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.46129374337222), (-1, 28.4022622834924), (0, 16.136443973135385)]
Episode 5658
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)

agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.49153134276236), (-1, 28.409289331237996), (0, 16.099179325999653)]
Episode 5727
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 55.481843575419), (-1, 28.40432960893855), (0, 16.113826815642458)]
Episode 5728
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 55.472159190085534), (-1, 28.399371618083435), (0, 16.128469191831034)]
Episode 5729
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.46160483175151), (-1, 28.403796376186367), (0, 16.13459879206212)]
Episode 5795
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.46928916494134), (-1, 28.39889579020014), (0, 16.13181504485852)]
Episode 5796
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 55.45972054510953), (-1, 28.393996894945662), (0, 16.1462825599448)]
Episod

agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.45888775162061), (-1, 28.38621630842716), (0, 16.154895939952237)]
Episode 5862
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.466484734777424), (-1, 28.38137472283814), (0, 16.152140542384444)]
Episode 5863
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.47407912687585), (-1, 28.376534788540248), (0, 16.149386084583902)]
Episode 5864
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.47125273984152), (-1, 28.410048895633118), (0, 16.118698364525375)]
Episode 5931
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.46190155091032), (-1, 28.422117329737016), (0, 16.11598111935266)]
Episode 5932
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.46940839372998), (-1, 28.417326816113263), (0, 16.11326479015675)]
Episode 5933
agent1 takes action (1, 1)

agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.40923487247874), (-1, 28.371395232538752), (0, 16.219369894982496)]
Episode 5999
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.400000000000006), (-1, 28.383333333333333), (0, 16.216666666666665)]
Episode 6000
Rounds 6000
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.407432094650886), (-1, 28.378603566072318), (0, 16.21396433927679)]
Episode 6001
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes

agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.26445872466633), (-1, 28.50551985500082), (0, 16.23002142033284)]
Episode 6069
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.25535420098847), (-1, 28.517298187808898), (0, 16.227347611202635)]
Episode 6070
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.262724427606656), (-1, 28.512600889474548), (0, 16.224674682918796)]
Episode 6071
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)

agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.22631064799739), (-1, 28.573754477368933), (0, 16.19993487463367)]
Episode 6142
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.21732052742959), (-1, 28.58538173530848), (0, 16.197297737261923)]
Episode 6143
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.208333333333336), (-1, 28.597005208333332), (0, 16.194661458333336)]
Episode 6144
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Perce

agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.249597423510465), (-1, 28.550724637681157), (0, 16.199677938808374)]
Episode 6210
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.25680244727097), (-1, 28.546127837707292), (0, 16.197069715021737)]
Episode 6211
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 55.24790727623954), (-1, 28.54153251770766), (0, 16.2105602060528)]
Episode 6212
agent1 takes action (2, 0)
agent2 takes acti

agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.284941101560015), (-1, 28.462273161413563), (0, 16.252785737026425)]
Episode 6282
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.27614197039631), (-1, 28.473659080057296), (0, 16.250198949546395)]
Episode 6283
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.26734563971992), (-1, 28.485041374920435), (0, 16.2

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.17404315640258), (-1, 28.5399275476453), (0, 16.28602929595212)]
Episode 6349
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.16535433070866), (-1, 28.551181102362204), (0, 16.283464566929133)]
Episode 6350
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 55.15666824122185), (-1, 28.54668556132893), (0, 16.29664619744922)]
Episode 6351
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action 

agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.04830165160486), (-1, 28.52913680274229), (0, 16.42256154565285)]
Episode 6418
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 55.03972581398971), (-1, 28.52469231967596), (0, 16.43558186633432)]
Episode 6419
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
Tie
Percent of game won by each agent: [(1, 55.03115264797508), (-1, 28.520249221183803), (0, 16.448598130841123)]
Episode 6420
agent1 takes action (2, 0)
agent2 ta

agent2 takes action (2, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.99383477188656), (-1, 28.606658446362516), (0, 16.399506781750926)]
Episode 6488
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.98535983972877), (-1, 28.61766065649561), (0, 16.396979503775622)]
Episode 6489
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.9768875192604), (-1, 28.6286594761171), (0, 16.3944530046225)]
Episode 6490
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agen

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 54.96113397347965), (-1, 28.57796067672611), (0, 16.46090534979424)]
Episode 6561
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.9527583053947), (-1, 28.588844864370618), (0, 16.458396830234683)]
Episode 6562
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.95962212402865), (-1, 28.58448880085327), (0, 16.455889075118087)]
Episo

agent1 takes action (0, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.92681454655198), (-1, 28.580051305266334), (0, 16.49313414818168)]
Episode 6627
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.918527459263736), (-1, 28.575739287869645), (0, 16.505733252866627)]
Episode 6628
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.9253281037864), (-1, 28.57142857142857), (0, 16.503243324785036)]
Episode 6629
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes acti

agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.8714883442917), (-1, 28.556485355648537), (0, 16.572026300059772)]
Episode 6692
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.87823098759898), (-1, 28.55221873599283), (0, 16.569550276408187)]
Episode 6693
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.87003286525246), (-1, 28.547953391096502), (0, 16.582013743651032)]
Episode 6694
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 

agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.89355410999409), (-1, 28.577764636309876), (0, 16.528681253696035)]
Episode 6764
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.88543976348854), (-1, 28.588322246858834), (0, 16.52623798965262)]
Episode 6765
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.89210759680757), (-1, 28.584096955365062), (0, 16.52379544782737)]
Episode 6766
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)

agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.88875878220141), (-1, 28.52751756440281), (0, 16.583723653395786)]
Episode 6832
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.88072588906776), (-1, 28.537977462315233), (0, 16.581296648617005)]
Episode 6833
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.87269534679543), (-1, 28.548434299092772), (0, 16.578870354111793)]
Episode 6834
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
agent2 takes act

agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.918151528321026), (-1, 28.480370853252207), (0, 16.60147761842677)]
Episode 6903
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 54.91019698725377), (-1, 28.47624565469293), (0, 16.613557358053303)]
Episode 6904
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.902244

Percent of game won by each agent: [(1, 54.83500717360115), (-1, 28.450502152080343), (0, 16.714490674318508)]
Episode 6970
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.84148615693588), (-1, 28.44642088652991), (0, 16.712092956534214)]
Episode 6971
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.84796328169822), (-1, 28.44234079173838), (0, 16.7096959265634)]
Episode 6972
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1,

agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.9680170575693), (-1, 28.30135039090263), (0, 16.730632551528075)]
Episode 7035
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.96020466173962), (-1, 28.311540648095505), (0, 16.728254690164867)]
Episode 7036
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.966605087395195), (-1, 28.30751740798636), (0, 16.725877504618445)]
Episode 7037
agent1 takes action (2, 0)
agent2 takes action (0, 0)

agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 55.011261261261254), (-1, 28.322072072072075), (0, 16.666666666666664)]
Episode 7104
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 55.00351864883884), (-1, 28.33216045038705), (0, 16.6643209007741)]
Episode 7105
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.99577821559246), (-1, 28.328173374613), (0, 16.67604840979454)]
Episode 7106
agent1 takes action (

Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.962364092556456), (-1, 28.352383607471427), (0, 16.685252299972124)]
Episode 7174
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.968641114982574), (-1, 28.34843205574913), (0, 16.682926829268293)]
Episode 7175
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.97491638795987), (-1, 28.34448160535117), (0, 16.680602006688964)]
Episode 7176
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each age

agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.937163375224415), (-1, 28.28338627261428), (0, 16.779450352161305)]
Episode 7241
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 54.929577464788736), (-1, 28.27948080640707), (0, 16.7909417288042)]
Episode 7242
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.92199364904046), (-1, 28.289382852409222), (0, 16.788623498550326)]
Episode 7243
agent1 takes action (2, 2)
agent2 

Percent of game won by each agent: [(1, 54.81795784286887), (-1, 28.278127566383795), (0, 16.90391459074733)]
Episode 7306
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
Tie
Percent of game won by each agent: [(1, 54.8104557273847), (-1, 28.274257561242642), (0, 16.91528671137266)]
Episode 7307
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.80295566502463), (-1, 28.284072249589492), (0, 16.912972085385878)]
Episode 7308
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes

agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.71928397070789), (-1, 28.247898020070515), (0, 17.03281800922159)]
Episode 7374
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.72542372881356), (-1, 28.244067796610167), (0, 17.030508474576273)]
Episode 7375
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.7180043383948), (-1, 28.253796095444685), (0, 17.02819956616052)]
Episode 7376
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent:

agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 54.687080311576686), (-1, 28.270212194466826), (0, 17.042707493956485)]
Episode 7446
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.69316503289915), (-1, 28.26641600644555), (0, 17.0404189606553)]
Episode 7447
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.68582169709989), (-1, 28.27604726100967), (0, 17.038131041890438)]
Episode 7448
agent1 takes actio

agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.73908413205538), (-1, 28.23482428115016), (0, 17.02609158679446)]
Episode 7512
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.74510847863703), (-1, 28.231066152003194), (0, 17.023825369359777)]
Episode 7513
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.75113122171946), (-1, 28.22730902315677), (0, 17.021559755123768)]
Episode 7514
agent1 takes action (1, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
a

agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.86422356973372), (-1, 28.17031373582916), (0, 16.96546269443712)]
Episode 7586
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 54.85699222354027), (-1, 28.16660076446553), (0, 16.9764070119942)]
Episode 7587
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.8629414865

Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.81045751633987), (-1, 28.13071895424837), (0, 17.058823529411764)]
Episode 7650
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.81636387400339), (-1, 28.1270422167037), (0, 17.056593909292904)]
Episode 7651
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.822268687924726), (-1, 28.12336644014637), (0, 17.054364871928907)]
Episode 7652
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent:

agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.818652849740936), (-1, 28.0699481865285), (0, 17.11139896373057)]
Episode 7720
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.82450459785002), (-1, 28.066312653801322), (0, 17.10918274834866)]
Episode 7721
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.830354830354835), (-1, 28.062678062678064), (0, 17.106967106967108)]
Episode 7722
agent1 takes action (2, 1)
agent2 takes acti

agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.728861475199174), (-1, 28.10331534309946), (0, 17.167823181701362)]
Episode 7782
agent1 takes action (0, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 54.72182962867789), (-1, 28.099704484132083), (0, 17.17846588719003)]
Episode 7783
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.727646454265155), (-1, 28.096094552929085), (0, 17.176258992805753)]
Episode 7784
agent1 takes action (0, 0)
agent2 takes ac

agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.73429336051995), (-1, 28.07442334650185), (0, 17.191283292978206)]
Episode 7847
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.727319062181444), (-1, 28.083588175331293), (0, 17.189092762487256)]
Episode 7848
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.72034654096063), (-1, 28.092750668875016), (0, 17.186902790164353)]
Episode 7849
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 2

agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.77513895907023), (-1, 28.08236483072259), (0, 17.142496210207174)]
Episode 7916
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 54.768220285461666), (-1, 28.078817733990146), (0, 17.152961980548188)]
Episode 7917
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Tie
Percent of 

agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.666332162568985), (-1, 28.073256397390868), (0, 17.26041144004014)]
Episode 7972
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 54.65947573059074), (-1, 28.0697353568293), (0, 17.270788912579956)]
Episode 7973
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.66516177577125), (-1, 28.06621519939804), (0, 17.268623024830703)]
Episode 7974
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes actio

agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.62352209085252), (-1, 28.02738021157436), (0, 17.349097697573118)]
Episode 8035
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.629168740667), (-1, 28.0238924838228), (0, 17.346938775510203)]
Episode 8036
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
Tie
Percent of game won by each agent: [(1, 54.62237153166605), (-1, 28.020405623989053), (0, 17.357222844344903)]
Episode 8037
agent1 takes action 

agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.53422288114653), (-1, 28.082530269335308), (0, 17.383246849518162)]
Episode 8094
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.53983940704138), (-1, 28.07906114885732), (0, 17.3810994441013)]
Episode 8095
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.54545454545454), (-1, 28.075592885375496), (0, 17.37895256916996)]
Episode 8096
agent1 takes action (1, 2)
ag

agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
Tie
Percent of game won by each agent: [(1, 54.47084762371387), (-1, 28.123468887800097), (0, 17.405683488486037)]
Episode 8164
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.46417636252296), (-1, 28.132271892222903), (0, 17.403551745254134)]
Episode 8165
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 54.45750673524369), (-1, 28.128826843007595), (0, 17.413666421748715)]
Episode 8166
agent

agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.42945679912504), (-1, 28.132215336006805), (0, 17.43832786486815)]
Episode 8229
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.4228432563791), (-1, 28.140947752126365), (0, 17.43620899149453)]
Episode 8230
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 54.41623132061718), (-1, 28.13752885433119), (0, 17.446239825051634)]
Episode 8231
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 2)
agent2 takes action

agent2 takes action (2, 1)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.369049053874896), (-1, 28.178859828853803), (0, 17.452091117271305)]
Episode 8297
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.37454808387563), (-1, 28.175463967221013), (0, 17.44998794890335)]
Episode 8298
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.367996144113754), (-1, 28.17206892396674), (0, 17.459934931919506)]
Episode 8299
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes ac

agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.2822966507177), (-1, 28.193779904306222), (0, 17.523923444976077)]
Episode 8360
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
Tie
Percent of game won by each agent: [(1, 54.27580432962564), (-1, 28.190407845951444), (0, 17.533787824422916)]
Episode 8361
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.28127242286534), (-1, 28.18703659411624), (0, 17.531690983018418)]
Episode 8362
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes acti

agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.156769596199524), (-1, 28.313539192399052), (0, 17.529691211401424)]
Episode 8420
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.16221351383446), (-1, 28.31017693860587), (0, 17.52760954755967)]
Episode 8421
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.16765613868439), (-1, 28.306815483258134), (0, 17.525528378057466)]
Episode 8422
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agen

agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.13622436954984), (-1, 28.340796606174877), (0, 17.522979024275276)]
Episode 8486
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.14162837280547), (-1, 28.337457287616353), (0, 17.520914339578177)]
Episode 8487
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.13524976437323), (-1, 28.34590009425071), (0, 17.51885014137606)]
Episode 8488
agent1 takes action (1, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 2)


agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.13498654813429), (-1, 28.31910164931571), (0, 17.54591180255001)]
Episode 8549
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 54.12865497076024), (-1, 28.315789473684212), (0, 17.555555555555554)]
Episode 8550
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 54.12232487428371), (-1, 28.312478072

agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.067540907508416), (-1, 28.339329232911687), (0, 17.5931298595799)]
Episode 8617
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.06126711533999), (-1, 28.347644465073103), (0, 17.59108841958691)]
Episode 8618
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.054994778976685), (-1, 28.355957767722472), (0, 17.589047453300847)]
Episode 8619
agent1 takes action (0, 2)
agent2 takes action (0, 1)

agent2 takes action (2, 2)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 54.02073732718894), (-1, 28.352534562211986), (0, 17.62672811059908)]
Episode 8680
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.01451445685981), (-1, 28.360787927658105), (0, 17.624697615482088)]
Episode 8681
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.008293020041464), (-1, 28.369039391845195), (0, 17.622667588113337)]
Episode 8682
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes ac

agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 54.00091449474166), (-1, 28.337905807041608), (0, 17.661179698216735)]
Episode 8748
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.00617213395817), (-1, 28.334666819065035), (0, 17.6591610469768)]
Episode 8749
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 54.011428571428574), (-1, 28.331428571428575), (0, 17.65714285714286)]
Episode 8750
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)


agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.9596097118221), (-1, 28.28454731109598), (0, 17.755842977081915)]
Episode 8814
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 53.95348837209303), (-1, 28.281338627339764), (0, 17.765173000567213)]
Episode 8815
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.958711433756804), (-1, 28.278130671506354), 

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.86867890528213), (-1, 28.313999324248222), (0, 17.817321770469647)]
Episode 8879
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 53.862612612612615), (-1, 28.31081081081081), (0, 17.826576576576578)]
Episode 8880
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (2, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 53.85654768607139), (-1, 28.3076230

agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.819483279275246), (-1, 28.307795548596353), (0, 17.872721172128397)]
Episode 8941
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.81346454931782), (-1, 28.315813017222098), (0, 17.870722433460077)]
Episode 8942
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.818629095381866), (-1, 28.312646762831267), (0, 17.868724141786874)]
Episode 8943
agent1 takes action (1, 1)
agent2 takes action (0,

agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 53.79149550349728), (-1, 28.311313422893303), (0, 17.897191073609413)]
Episode 9007
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.796625222024865), (-1, 28.308170515097693), (0, 17.895204262877442)]
Episode 9008
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 53.7906

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.81649961449499), (-1, 28.274038991078314), (0, 17.9094613944267)]
Episode 9079
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.81057268722466), (-1, 28.28193832599119), (0, 17.90748898678414)]
Episode 9080
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.815659068384534), (-1, 28.278823918070696), (0, 17.905517013544763)]
Episode 9081
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
a

agent1 takes action (0, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 2)
Tie
Percent of game won by each agent: [(1, 53.751914241960186), (-1, 28.297965434259464), (0, 17.950120323780354)]
Episode 9142
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.75697254730395), (-1, 28.29487039265011), (0, 17.948157060045936)]
Episode 9143
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 53.75109361329834), (-1, 28.2917760279965), (0, 17.95713035870

agent2 takes action (0, 0)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.78680864935347), (-1, 28.219059002499186), (0, 17.994132348147343)]
Episode 9203
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 53.78096479791395), (-1, 28.21599304650152), (0, 18.003042155584527)]
Episode 9204
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.78598587724063), (-1, 28.212927756653993), (0, 18.00108636610538)]
Episode 9205
agent1 takes acti

agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (1, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.838678328474245), (-1, 28.161105712126123), (0, 18.000215959399632)]
Episode 9261
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.83286547182035), (-1, 28.168862016843015), (0, 17.998272511336644)]
Episode 9262
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.83784950879844), (-1, 28.16582100831264), (0, 17.99632948288891)]
Episode 9263
agent1 takes action (1, 1)

agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 0)
Tie
Percent of game won by each agent: [(1, 53.86595174262735), (-1, 28.128686327077745), (0, 18.005361930294907)]
Episode 9325
agent1 takes action (1, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.870898563156764), (-1, 28.125670169418832), (0, 18.003431267424403)]
Episode 9326
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.865122761874126), (-1, 28.13337621957757)

agent2 takes action (1, 2)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 53.862546616941934), (-1, 28.034096963239215), (0, 18.103356419818862)]
Episode 9385
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (1, 0)
agent1 takes action (2, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.86746217771149), (-1, 28.031110164074153), (0, 18.101427658214362)]
Episode 9386
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.87237669116863), (-1, 28.028124001278364), (0, 18.099499307553)]
Episode 9387
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 1)
agent1 takes action (1, 2)
agent2 takes acti

agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.90393567498942), (-1, 28.004655099449856), (0, 18.091409225560728)]
Episode 9452
agent1 takes action (0, 2)
agent2 takes action (1, 1)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 1)
agent1 takes action (2, 0)
Tie
Percent of game won by each agent: [(1, 53.898233365069295), (-1, 28.00169258436475), (0, 18.100074050565958)]
Episode 9453
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (2, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 53.892532261476624), (-1, 27.998730696001694), (0, 18.108737042521682)]
Episode 9454
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
agen

agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 53.785488958990534), (-1, 28.054679284963196), (0, 18.159831756046266)]
Episode 9510
agent1 takes action (2, 1)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.77983387656398), (-1, 28.06224371780044), (0, 18.157922405635578)]
Episode 9511
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action (0, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.77417998317914), (-1, 28.06980656013457), 

agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (1, 2)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.72962808190556), (-1, 28.050564145424158), (0, 18.219807772670286)]
Episode 9572
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.72401546014834), (-1, 28.05808001671367), (0, 18.217904523137992)]
Episode 9573
agent1 takes action (2, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (1, 0)
agent1 takes action (1, 2)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
agent2 takes action (0, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.718404010862756), (-1, 28.06559431794443), (0, 18.216001671192814)]
Episode 9574
agent1 takes action (1, 1)

agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (2, 1)
agent1 takes action (0, 1)
Tie
Percent of game won by each agent: [(1, 53.690439115540336), (-1, 27.987127582269284), (0, 18.32243330219039)]
Episode 9633
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.68486609923189), (-1, 27.994602449657464), (0, 18.32053145111065)]
Episode 9634
agent1 takes action (2, 2)
agent2 takes action (2, 1)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.689673066943435), (-1, 27.99169693824598), (0, 18.318629994810586)]
Episode 9635
agent1 takes action (2, 2)
agent2 takes act

Percent of game won by each agent: [(1, 53.65174334639984), (-1, 27.98638332989478), (0, 18.361873323705385)]
Episode 9694
agent1 takes action (1, 1)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 0)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.65652398143372), (-1, 27.983496647756578), (0, 18.359979370809697)]
Episode 9695
agent1 takes action (2, 1)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
agent1 takes action (0, 0)
agent2 takes action (1, 2)
agent1 takes action (1, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.66130363036304), (-1, 27.980610561056107), (0, 18.358085808580856)]
Episode 9696
agent1 takes action (2, 1)
agent2 takes action (0, 2)
agent1 takes action (2, 2)
agent2 takes action 

agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (2, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.5670356703567), (-1, 27.9930299302993), (0, 18.439934399343993)]
Episode 9756
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 2)
agent2 takes action (1, 1)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.56154555703597), (-1, 28.00040996207851), (0, 18.43804448088552)]
Episode 9757
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.566304570608736), (-1, 27.997540479606474), (0, 18.436154949784793)]
Episode 9758
agent1 takes action (0, 0)
agent2 takes action (0, 2)
agent1 takes action (2, 0)
agent2 takes action (0, 1)
agent1 takes action (2, 2)
ag

agent2 takes action (2, 1)
agent1 takes action (0, 1)
agent2 takes action (1, 1)
agent1 takes action (1, 0)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 53.56415478615071), (-1, 28.0142566191446), (0, 18.421588594704687)]
Episode 9820
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 0)
agent1 takes action (1, 0)
agent2 takes action (0, 1)
agent1 takes action (1, 1)
agent2 takes action (0, 0)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 53.55870074330517), (-1, 28.011404133998575), (0, 18.429895122696262)]
Episode 9821
agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 2)
agent2 takes action (1, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.55324781103646), (-1, 28.018733455508045), (0, 18.42801873345551)]
Episode 9822
agent1 takes action (2, 2)
agent2 t

agent1 takes action (2, 2)
agent2 takes action (1, 2)
agent1 takes action (0, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 2)
Tie
Percent of game won by each agent: [(1, 53.49919093851133), (-1, 27.993527508090615), (0, 18.50728155339806)]
Episode 9888
agent1 takes action (1, 2)
agent2 takes action (0, 2)
agent1 takes action (2, 1)
agent2 takes action (2, 2)
agent1 takes action (1, 0)
agent2 takes action (1, 1)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.49378096875316), (-1, 28.000808979674385), (0, 18.505410051572454)]
Episode 9889
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 1)
agent1 takes action (0, 0)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.49848331648129), (-1, 27.99797775530839), (0, 18.503538928210315)]
Episode 9890
agent1 takes action (2, 2)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes acti

agent1 takes action (0, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 1)
agent2 takes action (0, 0)
agent1 takes action (1, 1)
agent2 takes action (2, 0)
Winner is Agent2 (O)
Percent of game won by each agent: [(1, 53.46663987138264), (-1, 28.044614147909968), (0, 18.488745980707396)]
Episode 9952
agent1 takes action (2, 1)
agent2 takes action (0, 0)
agent1 takes action (2, 0)
agent2 takes action (2, 2)
agent1 takes action (1, 1)
agent2 takes action (0, 2)
agent1 takes action (0, 1)
Winner is Agent1 (X)
Percent of game won by each agent: [(1, 53.47131518135235), (-1, 28.041796443283435), (0, 18.48688837536421)]
Episode 9953
agent1 takes action (2, 1)
agent2 takes action (1, 1)
agent1 takes action (1, 2)
agent2 takes action (2, 2)
agent1 takes action (0, 0)
agent2 takes action (2, 0)
agent1 takes action (0, 2)
agent2 takes action (0, 1)
agent1 takes action (1, 0)
Tie
Percent of game won by each agent: [(1, 53.46594333936107), (-1, 28.03897930480209), (0, 18.49507735583685)]
Epis

### Testing

In [8]:
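alist = []

# Load the trained policies. exp_rate=0 should turn off random exploration,
# so each agent picks its highest-valued move.
p1 = Agent("Player1", exp_rate=0)
p1.loadPolicy("policy_agent1")

p2 = Agent("Player2", exp_rate=0)
p2.loadPolicy("policy_agent2")

# Play 500 test games between the two trained agents.
st = state(p1, p2)
st.play(500)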

Episode 0
Rounds 0
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 100.0)]
Episode 1
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 100.0)]
Episode 2
Player1 takes action (2, 2)
Player2 takes action (1, 1)
Player1 takes action (2, 1)
Player2 takes action (2, 0)
Player1 takes action (0, 2)
Player2 takes action (1, 0)
Player1 takes action (1, 2)
Winner is Agent1 (X)
Percent of game won by each agent: [(0, 66.66666666666666), (1, 33.33333333333333)]
Episode 3
Player1 takes action (2, 2)
Player2 takes 

Player2 takes action (1, 1)
Player1 takes action (1, 0)
Player2 takes action (0, 0)
Player1 takes action (2, 2)
Player2 takes action (2, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(0, 79.36507936507937), (1, 1.5873015873015872), (-1, 19.047619047619047)]
Episode 63
Player1 takes action (2, 0)
Player2 takes action (1, 1)
Player1 takes action (1, 0)
Player2 takes action (0, 0)
Player1 takes action (2, 2)
Player2 takes action (2, 1)
Player1 takes action (0, 1)
Player2 takes action (1, 2)
Player1 takes action (0, 2)
Tie
Percent of game won by each agent: [(0, 79.6875), (1, 1.5625), (-1, 18.75)]
Episode 64
Player1 takes action (1, 0)
Player2 takes action (1, 1)
Player1 takes action (2, 2)
Player2 takes action (2, 1)
Player1 takes action (0, 0)
Player2 takes action (2, 0)
Player1 takes action (0, 1)
Player2 takes action (0, 2)
Winner is Agent2 (O)
Percent of game won by each agent: [(0, 78.46153846153847), (1, 1.5384615

Player2 takes action (2, 0)
Player1 takes action (0, 2)
Player2 takes action (1, 2)
Player1 takes action (1, 0)
Player2 takes action (0, 1)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 84.29752066115702), (1, 2.479338842975207), (-1, 13.223140495867769)]
Episode 121
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 84.42622950819673), (1, 2.459016393442623), (-1, 13.114754098360656)]
Episode 122
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 84.5528455284553), (1, 2.4390243

Player1 takes action (0, 0)
Player2 takes action (0, 2)
Player1 takes action (2, 0)
Player2 takes action (1, 0)
Player1 takes action (1, 2)
Tie
Percent of game won by each agent: [(0, 87.35632183908046), (1, 1.7241379310344827), (-1, 10.919540229885058)]
Episode 174
Player1 takes action (2, 2)
Player2 takes action (1, 1)
Player1 takes action (1, 2)
Player2 takes action (0, 2)
Player1 takes action (2, 0)
Player2 takes action (2, 1)
Player1 takes action (0, 1)
Player2 takes action (0, 0)
Player1 takes action (1, 0)
Tie
Percent of game won by each agent: [(0, 87.42857142857143), (1, 1.7142857142857144), (-1, 10.857142857142858)]
Episode 175
Player1 takes action (2, 0)
Player2 takes action (1, 1)
Player1 takes action (1, 2)
Player2 takes action (0, 2)
Player1 takes action (0, 1)
Player2 takes action (2, 1)
Player1 takes action (0, 0)
Player2 takes action (1, 0)
Player1 takes action (2, 2)
Tie
Percent of game won by each agent: [(0, 87.5), (1, 1.7045454545454544), (-1, 10.795454545454545)]


Player1 takes action (0, 1)
Player2 takes action (2, 1)
Player1 takes action (2, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 90.67796610169492), (1, 1.2711864406779663), (-1, 8.050847457627118)]
Episode 236
Player1 takes action (1, 2)
Player2 takes action (1, 1)
Player1 takes action (2, 1)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Player2 takes action (2, 2)
Player1 takes action (0, 2)
Player2 takes action (0, 1)
Player1 takes action (1, 0)
Tie
Percent of game won by each agent: [(0, 90.71729957805907), (1, 1.2658227848101267), (-1, 8.016877637130802)]
Episode 237
Player1 takes action (1, 2)
Player2 takes action (1, 1)
Player1 takes action (2, 1)
Player2 takes action (2, 2)
Player1 takes action (0, 0)
Player2 takes action (2, 0)
Player1 takes action (0, 2)
Player2 takes action (0, 1)
Player1 takes action (1, 0)
Tie
Percent of game won by each agent: [(0, 90.7

Player1 takes action (0, 0)
Player2 takes action (2, 2)
Player1 takes action (0, 2)
Player2 takes action (0, 1)
Player1 takes action (1, 0)
Tie
Percent of game won by each agent: [(0, 92.46575342465754), (1, 1.0273972602739725), (-1, 6.506849315068493)]
Episode 292
Player1 takes action (1, 2)
Player2 takes action (1, 1)
Player1 takes action (2, 1)
Player2 takes action (2, 2)
Player1 takes action (0, 0)
Player2 takes action (2, 0)
Player1 takes action (0, 2)
Player2 takes action (0, 1)
Player1 takes action (1, 0)
Tie
Percent of game won by each agent: [(0, 92.4914675767918), (1, 1.023890784982935), (-1, 6.484641638225256)]
Episode 293
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 92.51700680272108), (1, 1.0204081632653061), (-1, 6.4625850340

Player2 takes action (0, 2)
Player1 takes action (2, 0)
Player2 takes action (1, 0)
Player1 takes action (1, 2)
Tie
Percent of game won by each agent: [(0, 93.73219373219374), (1, 0.8547008547008548), (-1, 5.413105413105413)]
Episode 351
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 93.75), (1, 0.8522727272727272), (-1, 5.3977272727272725)]
Episode 352
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 93.76770538243626), (1, 0.84985835694051), (-1, 5.382436260623229)]
Episode 353
Player1 takes actio

Player2 takes action (0, 0)
Player1 takes action (2, 2)
Player2 takes action (1, 2)
Player1 takes action (0, 1)
Tie
Percent of game won by each agent: [(0, 94.58128078817734), (1, 0.7389162561576355), (-1, 4.679802955665025)]
Episode 406
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 94.5945945945946), (1, 0.7371007371007371), (-1, 4.668304668304668)]
Episode 407
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (1, 0)
Player2 takes action (1, 2)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 94.6078431372549), (1, 0.7352941176470588), (-1, 4.6568627450980395)]
Episode 408
Player1

Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 95.26881720430107), (1, 0.6451612903225806), (-1, 4.086021505376344)]
Episode 465
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (0, 2)
Player2 takes action (2, 0)
Player1 takes action (0, 0)
Tie
Percent of game won by each agent: [(0, 95.27896995708154), (1, 0.6437768240343348), (-1, 4.07725321888412)]
Episode 466
Player1 takes action (1, 1)
Player2 takes action (2, 2)
Player1 takes action (0, 0)
Player2 takes action (0, 2)
Player1 takes action (1, 2)
Player2 takes action (1, 0)
Player1 takes action (2, 1)
Player2 takes action (0, 1)
Player1 takes action (2, 0)
Tie
Percent of game won by each agent: [(0, 95.28907922912205), (1, 0.6423982869379015), (-1, 4.068522483940043)]
Episode 467
Player1 takes action (0, 1)
Player2 takes action (1, 1)
Player1 takes action (2, 2)
Player2

<h3> Conclusion </h3>

After 10,000 training games of self-play, the Q-learning agents play much closer to optimally. In the 500-game test above, a few early games are still won or lost, but nearly all later games end in a tie, and the cumulative share of ties reaches about 95% by episode 466. It would be instructive to check this result by pitting the agent against a player that follows the minimax algorithm over many games. As described above, the agents were trained with Q-learning by playing against each other repeatedly. With each game they update their value estimates from the outcome, and so they gradually learn to play well.
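
The minimax check suggested above could start from something like the following. This is only a minimal sketch, not part of the notebook's code: the names `winner`, `minimax`, and `minimax_action` are hypothetical, and the board is assumed to be a 3x3 grid with 1 for X, -1 for O, and 0 for an empty cell, matching the symbols used earlier. Wiring it into the `state` and `Agent` classes is left open.

```python
from functools import lru_cache

# The eight winning lines, as indices into a flattened 3x3 board.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 if X has a line, -1 if O has a line, otherwise 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (value, move) for `player` (1 or -1) to move.

    `board` is a tuple of 9 ints. The value is from X's point of view:
    +1 means X wins, -1 means O wins, 0 means a tie under best play.
    """
    w = winner(board)
    if w != 0:
        return w, None
    moves = [i for i, v in enumerate(board) if v == 0]
    if not moves:
        return 0, None  # board is full: tie
    best_value, best_move = None, None
    for m in moves:
        child = board[:m] + (player,) + board[m + 1:]
        value, _ = minimax(child, -player)
        # X (1) maximizes the value, O (-1) minimizes it.
        better = (best_value is None
                  or (player == 1 and value > best_value)
                  or (player == -1 and value < best_value))
        if better:
            best_value, best_move = value, m
    return best_value, best_move


def minimax_action(board_3x3, player):
    """Return the (row, col) minimax move; assumes the game is not over."""
    flat = tuple(int(v) for row in board_3x3 for v in row)
    _, move = minimax(flat, player)
    return divmod(move, 3)
```

For example, `minimax_action(st.board, -1)` would give a move for O in the same `(row, col)` form that the logs above print, assuming `st.board` is the 3x3 array defined in the `state` class. A trained agent that never loses to this player over many games would support the claim that it plays optimally.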

<h3> References </h3>

1. https://github.com/MJeremy2017/Reinforcement-Learning-Implementation/blob/master/TicTacToe/ticTacToe.py
2. https://deeplizard.com/learn/video/qhRNvCVVJaA