## Monte Carlo

The Monte Carlo method solves complex problems approximately by drawing random samples and aggregating the results.

### Estimating the coin-flip probability with Monte Carlo

In [None]:
import math
import random
import numpy as np
import matplotlib.pyplot as plt

In [None]:
def coin_flip():
    """Simulate one coin flip: return 0 or 1 with equal probability."""
    return random.randint(0, 1)

In [None]:
def monte_carlo(n):
    results = 0
    running_probs = []  # running estimate after each flip (local, so repeated calls do not accumulate)
    for i in range(n):
        flip_result = coin_flip()
        results += flip_result

        prob_value = results / (i + 1)  # running estimate of the probability
        running_probs.append(prob_value)

    plt.axhline(y=0.5, color='r', linestyle='-')
    plt.xlabel("Iterations")
    plt.ylabel("probability")
    plt.plot(running_probs)

    return results / n

In [None]:
answer = monte_carlo(5000)
print("final value is {}".format(answer))
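
The plot shows the running estimate settling near 0.5 as the number of flips grows. A minimal sketch of why, assuming the usual binomial standard error $\sqrt{\hat{p}(1-\hat{p})/n}$: the helper name `coin_flip_estimate` and its `seed` parameter are hypothetical and not part of the notebook above.

```python
import math
import random

def coin_flip_estimate(n, seed=None):
    """Return (estimated probability of 1, standard error) after n flips."""
    rng = random.Random(seed)  # private generator so runs are reproducible
    heads = sum(rng.randint(0, 1) for _ in range(n))
    p_hat = heads / n
    # binomial standard error of the sample proportion
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, std_err

for n in (100, 10_000, 100_000):
    p_hat, std_err = coin_flip_estimate(n, seed=0)
    print(f"n={n:>7}: estimate={p_hat:.4f}, standard error={std_err:.4f}")
```

The standard error shrinks in proportion to $1/\sqrt{n}$, so 100 times more flips buy roughly one extra decimal digit of accuracy.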

### Estimating Pi

For a circle inscribed in a square, the ratio of the square's area to the circle's area is:

$$
\frac{\text{Area of square}}{\text{Area of circle}} = \frac{(2r)^{2}}{\pi r^{2}} = \frac{4}{\pi}
$$
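
This ratio is what turns random sampling into an estimate of $\pi$. If points are drawn uniformly from the square, the fraction that lands inside the circle approximates the inverse of the ratio above. The symbols $N_{\text{in}}$ and $N_{\text{total}}$ are introduced here for illustration; they correspond to `in_circle` and `in_circle + out_circle` in the code that follows.

$$
\frac{N_{\text{in}}}{N_{\text{total}}} \approx \frac{\pi r^{2}}{(2r)^{2}} = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{N_{\text{in}}}{N_{\text{total}}}
$$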

In [None]:
in_circle = 0  # number of points inside the circle
out_circle = 0  # number of points outside the circle
pi_values = []  # running estimates of pi

In [None]:
for i in range(5):
    for j in range(1000):
        # sample a point uniformly from the square [-100, 100] x [-100, 100]
        x = random.uniform(-100, 100)
        y = random.uniform(-100, 100)

        if x**2 + y**2 > 100**2:
            out_circle += 1
        else:
            in_circle += 1

        # running estimate: 4 * (points inside the circle) / (all points)
        pi_estimate = 4 * in_circle / (in_circle + out_circle)
        pi_values.append(pi_estimate)

print(pi_values[-1])

plt.axhline(y=math.pi, color='g', linestyle='-')
plt.plot(pi_values)
plt.xlabel('Iterations')
plt.ylabel("Value of PI")
plt.show()
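
The cell above keeps its counters in global variables, so re-running it continues from the previous counts. A minimal sketch of the same estimate wrapped in a function, assuming a unit-radius circle for simplicity: the name `estimate_pi` and its `seed` parameter are hypothetical.

```python
import math
import random

def estimate_pi(n_points, seed=None):
    """Estimate pi from n_points uniform samples in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # private generator, independent of global state
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # point falls inside the unit circle
            inside += 1
    return 4 * inside / n_points

estimate = estimate_pi(100_000, seed=0)
print(f"estimate={estimate:.5f}, absolute error={abs(estimate - math.pi):.5f}")
```

Because all state is local, each call starts from zero and two calls with the same seed return the same value.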

## Monte Carlo Tree Search

Traditional reinforcement learning without MCTS runs into some difficulties; adding MCTS can resolve some of them.

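If Q-values were used to play Go, the state space would be far too large to cover. MCTS instead combines a tree strategy with a rollout strategy: it randomly samples playouts and uses their outcomes as value estimates.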
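
To give a sense of scale, here is a rough calculation. It assumes the simple upper bound in which each board point is empty, black, or white, which overcounts because it includes illegal positions.

```python
# Upper bound on board configurations: 3 choices per point.
tictactoe_upper_bound = 3 ** (3 * 3)   # 3x3 board
go_upper_bound = 3 ** (19 * 19)        # 19x19 board

print(tictactoe_upper_bound)           # 19683
print(len(str(go_upper_bound)))        # 173 decimal digits
```

A table with one Q-value per state is feasible for a 3x3 board but not for a number with 173 digits, which is why sampling-based estimates are used instead.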

Monte Carlo tree search has four steps:

1. Tree Traversal

2. Node expansion

3. Rollout (Random Simulation)

4. Backpropagation

<img src="../images/MCTS.png" width="100%">

In the figure, the numerators of a node's children sum to the numerator of the parent, and the same holds for the denominators.
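
The four steps listed above can be sketched on a toy game. This is an illustrative sketch, not the implementation used later in this notebook. It assumes a single-pile Nim game (take 1 to 3 stones, whoever takes the last stone wins) and UCB1 as the tree strategy; `Node`, `ucb1`, `rollout`, and `mcts` are hypothetical names.

```python
import math
import random

class Node:
    """One game state in the search tree (hypothetical helper for this sketch)."""
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def ucb1(child, parent_visits, c=1.4):
    """UCB1 score: average win rate plus an exploration bonus (assumed tree strategy)."""
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones, rng):
    """Play uniformly random moves; return True if the player to move wins."""
    starter_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return starter_to_move
        starter_to_move = not starter_to_move
    return False  # no stones left: the previous player already won

def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Tree traversal: descend while the node is fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Node expansion: add one untried child, if any
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Rollout: random simulation from the new node
        to_move_wins = rollout(node.stones, rng)
        # 4. Backpropagation: flip the result at each level going up
        mover_won = not to_move_wins
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_won else 0.0
            mover_won = not mover_won
            node = node.parent
    # recommend the most visited move at the root
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(10))
```

In this Nim variant the winning strategy is to leave the opponent a multiple of four stones. With enough iterations the search should tend toward that move, although a sketch like this gives no guarantee for a fixed iteration budget.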

## Tic Tac Toe

In [1]:
from copy import deepcopy

In [18]:
#
# AI that learns to play Tic Tac Toe using
#        reinforcement learning
#                (MCTS)
#

# packages
from copy import deepcopy

# Tic Tac Toe board class
class Board():
    # create constructor (init board class instance)
    def __init__(self, board=None):
        # define players
        self.player_1 = 'x'
        self.player_2 = 'o'
        self.empty_square = '.'
        
        # define board position
        self.position = {}
        
        # init (reset) board
        self.init_board()
        
        # create a copy of a previous board state if available
        if board is not None:
            self.__dict__ = deepcopy(board.__dict__)
    
    # init (reset) board
    def init_board(self):
        # loop over board rows
        for row in range(3):
            # loop over board columns
            for col in range(3):
                # set every board square to empty square
                self.position[row, col] = self.empty_square
    
    # make move
    def make_move(self, row, col):
        # place the mark of the side to move on this board (in place)
        self.position[row, col] = self.player_1

        # swap players
        (self.player_1, self.player_2) = (self.player_2, self.player_1)

        # return the updated board state
        return self
        
    # print board state
    def __str__(self):
        # define board string representation
        board_string = ''
        
        # loop over board rows
        for row in range(3):
            # loop over board columns
            for col in range(3):
                board_string += ' %s' % self.position[row, col]
            
            # print new line every row
            board_string += '\n'
        
        # prepend side to move
        if self.player_1 == 'x':
            board_string = '\n--------------\n "x" to move:\n--------------\n\n' + board_string
        
        elif self.player_1 == 'o':
            board_string = '\n--------------\n "o" to move:\n--------------\n\n' + board_string
                        
        # return board string
        return board_string

In [19]:
board = Board()
print(board)

board.make_move(0, 0)
print(board)

board.position[2,2] = 'x'
print(board)

board_1 = Board(board)
print(board_1)


--------------
 "x" to move:
--------------

 . . .
 . . .
 . . .


--------------
 "o" to move:
--------------

 x . .
 . . .
 . . .


--------------
 "o" to move:
--------------

 x . .
 . . .
 . . x


--------------
 "o" to move:
--------------

 x . .
 . . .
 . . x
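
To run a search on this board, two more pieces are typically needed: the list of legal moves and a win check. A minimal sketch, assuming the same `position` layout as the `Board` class above (a dict keyed by `(row, col)` with `'.'` for empty squares): `legal_moves` and `has_won` are hypothetical helpers, not part of the class.

```python
EMPTY = '.'

def legal_moves(position):
    """Return the (row, col) keys of all empty squares."""
    return [square for square, mark in position.items() if mark == EMPTY]

def has_won(position, player):
    """Return True if `player` fills a complete row, column, or diagonal."""
    lines = (
        [[(r, c) for c in range(3)] for r in range(3)]        # rows
        + [[(r, c) for r in range(3)] for c in range(3)]      # columns
        + [[(i, i) for i in range(3)],                        # main diagonal
           [(i, 2 - i) for i in range(3)]]                    # anti-diagonal
    )
    return any(all(position[sq] == player for sq in line) for line in lines)

# build an empty 3x3 position and fill the main diagonal with 'x'
position = {(r, c): EMPTY for r in range(3) for c in range(3)}
position[0, 0] = position[1, 1] = position[2, 2] = 'x'

print(len(legal_moves(position)))  # 6
print(has_won(position, 'x'))      # True
```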

