## Minimax + Pruning

* Plus a state-evaluation function, which limits the maximum search depth
* Plus alpha-beta pruning, which limits the search width at each level
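
The two ingredients above can be sketched on a toy game. This is a minimal illustration, not the `AlphaBetaAgent` implementation from `dlgobang`: the take-away game and the names `legal_moves`, `evaluate`, `alpha_beta` and `best_move` are all hypothetical, chosen only for this example. Scores are always taken from the point of view of the player to move (negamax form).

```python
import math


def legal_moves(stones):
    """Toy game: remove 1, 2 or 3 stones; whoever takes the last stone wins."""
    return [n for n in (1, 2, 3) if n <= stones]


def evaluate(stones):
    """Placeholder static evaluation used at the depth limit (0 = unknown)."""
    return 0


def alpha_beta(stones, depth, alpha=-math.inf, beta=math.inf):
    """Depth-limited negamax value for the player to move, with alpha-beta cutoffs."""
    if stones == 0:
        return -1  # the opponent just took the last stone, so the mover has lost
    if depth == 0:
        return evaluate(stones)  # depth limit reached: fall back to the evaluation
    best = -math.inf
    for take in legal_moves(stones):
        # The opponent's best score, negated, is our score for this move.
        score = -alpha_beta(stones - take, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # the opponent would never allow this line: stop searching it
    return best


def best_move(stones, depth):
    """Pick the move with the highest depth-limited score."""
    best_score, best_take = -math.inf, None
    for take in legal_moves(stones):
        score = -alpha_beta(stones - take, depth - 1)
        if score > best_score:
            best_score, best_take = score, take
    return best_take
```

The depth limit bounds how far ahead the search looks, and the `alpha >= beta` test skips the remaining siblings once a branch is already known to be worse than an alternative found earlier.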


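Note:
* From the initial position the search space is large, so the search is slow.
* After a few rounds of play, each search finishes quickly.
* The state-evaluation function is too simple, but the play already feels somewhat intelligent.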
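
A rough count shows why the search speeds up as the board fills. The sketch below assumes that every empty point is a legal move and ignores pruning and early wins, so it gives an upper bound on the number of move sequences rather than a measurement of the agent; `sequence_count` is a hypothetical helper name.

```python
import math


def sequence_count(empty_points, depth):
    """Upper bound on move sequences of length `depth` when each move fills one empty point."""
    return math.perm(empty_points, min(depth, empty_points))


board_size = 5
total_points = board_size * board_size

# Compare an empty board with progressively fuller boards at search depth 3.
for stones_on_board in (0, 1, 9, 17):
    empty = total_points - stones_on_board
    print(stones_on_board, empty, sequence_count(empty, 3))
```

On a 5x5 board at depth 3 the bound falls from 25 * 24 * 23 = 13800 sequences on an empty board to 8 * 7 * 6 = 336 once 17 stones have been placed.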

Todo:
* Improve capture_diff_v2

In [1]:
# Environment setup
%cd ../../
import sys
sys.path.append('./python')

d:\sgd-代码库\torch2.0-paly\sgd_deep_learning\sgd_rl\gobang


In [2]:
from dlgobang.game import GameState, Player, Move, Point
from dlgobang.agent import AlphaBetaAgent
from dlgobang.utils import print_board, print_move, point_from_coords

In [3]:
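# Minimal board-state evaluation that simplifies the win/loss judgment
def capture_diff_v1(game_state: GameState):
    black_score = 0
    white_score = 0

    # Only count lines of 1, 2, 3 or 4 connected stones; threat scores are 1, 3, 10 and 20
    smap = {1: 1, 2: 3, 3: 10, 4: 20}

    for line in game_state.board.lines.values():
        # NOTE: each line is credited to the side to move, not to the line's owner
        if game_state.next_player == Player.BLACK:
            black_score += smap[len(line)]
        else:
            white_score += smap[len(line)]

    diff = black_score - white_score

    # Return the score from the point of view of the player to move
    if game_state.next_player == Player.BLACK:
        return diff
    return -1 * diff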
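
A variant of the v1 idea that credits each line to the player who owns it might look like the sketch below. It does not use the real `dlgobang` data structures: the `Color` enum stands in for `Player`, and representing a line as an `(owner, length)` pair is an assumption made purely for illustration.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical stand-in for dlgobang's Player."""
    BLACK = 1
    WHITE = 2


# Threat score by line length, same values as in capture_diff_v1.
THREAT_SCORE = {1: 1, 2: 3, 3: 10, 4: 20}


def line_score_diff(lines, next_player):
    """Score difference from the point of view of `next_player`.

    `lines` is an iterable of (owner, length) pairs (an assumed representation).
    """
    black_score = 0
    white_score = 0
    for owner, length in lines:
        score = THREAT_SCORE.get(length, 0)  # lengths outside the table score 0
        if owner == Color.BLACK:
            black_score += score
        else:
            white_score += score
    diff = black_score - white_score
    return diff if next_player == Color.BLACK else -diff


lines = [(Color.BLACK, 3), (Color.WHITE, 2), (Color.WHITE, 1)]
print(line_score_diff(lines, Color.BLACK))  # 10 - (3 + 1) = 6
print(line_score_diff(lines, Color.WHITE))  # -6
```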

In [4]:
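# Building on v1, refine the threat score of each line
def capture_diff_v2(game_state):
    black_score = 0
    white_score = 0

    '''
    open-2: neither end is blocked
    open-1: one end is blocked
    dead:   both ends are blocked

    1 stone,
        open-2:
        open-1:
        dead:
    2 stones,
        open-2:
        open-1:
        dead:
        # two stones close together (a fairly large threat)
    3 stones,
        open-2: can be scored as a win
        open-1: a fairly large threat
        dead:   some threat
    4 stones,
        open-2: can be scored as a win
        open-1: can be scored as a win (depends on which side the line belongs to)
        dead:   some threat
    '''

    diff = black_score - white_score

    if game_state.next_player == Player.BLACK:
        return diff
    return -1 * diff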
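
The plan in the docstring could be turned into a lookup table keyed by line length and number of open ends. The sketch below is one possible shape for that table, not the author's scoring: every numeric value, the `WIN` constant, and the `threat_score` helper are illustrative assumptions.

```python
WIN = 10_000  # illustrative score for a line that can be treated as a win

# (stones in line, open ends) -> threat score; all values are placeholders.
THREAT_TABLE = {
    (1, 2): 2, (1, 1): 1, (1, 0): 0,
    (2, 2): 8, (2, 1): 3, (2, 0): 1,
    (3, 2): WIN, (3, 1): 50, (3, 0): 5,
    (4, 2): WIN, (4, 0): 5,
}


def threat_score(length, open_ends, owner_to_move):
    """Threat score of one line.

    `owner_to_move` is True when the line belongs to the player whose turn it is.
    """
    if length == 4 and open_ends == 1:
        # Four stones with one open end win only if their owner moves next;
        # otherwise the opponent can still block, so score it as a large threat.
        return WIN if owner_to_move else 200
    return THREAT_TABLE.get((length, open_ends), 0)


print(threat_score(3, 2, owner_to_move=False))  # 10000
print(threat_score(4, 1, owner_to_move=False))  # 200
```

Keeping the scores in a table separates tuning the values from the code that detects lines and their open ends.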

### Human vs. AI
With a search depth of 3 and early cutoff of poor branches, the AI bot selects its moves faster.

In [5]:
BOARD_SIZE = 5
game = GameState.new_game(BOARD_SIZE)
bot = AlphaBetaAgent(3, capture_diff_v1)

while not game.is_over():
    print_board(game.board)
    if game.next_player == Player.BLACK:
        human_move = input('Enter a move [A1-]: ')
        point = point_from_coords(human_move.strip())
        move = Move.play(point)
    else:
        move = bot.select_move(game)
    print_move(game.next_player, move)
    game = game.apply_move(move)

 5  .  .  .  .  . 
 4  .  .  .  .  . 
 3  .  .  .  .  . 
 2  .  .  .  .  . 
 1  .  .  .  .  . 
    A  B  C  D  E
Player.BLACK C2
 5  .  .  .  .  . 
 4  .  .  .  .  . 
 3  .  .  .  .  . 
 2  .  .  x  .  . 
 1  .  .  .  .  . 
    A  B  C  D  E
AlphaBetaAgent is search Move (r 1, c 1)
AlphaBetaAgent is search Move (r 1, c 2)
AlphaBetaAgent is search Move (r 1, c 3)
AlphaBetaAgent is search Move (r 1, c 4)
AlphaBetaAgent is search Move (r 1, c 5)
AlphaBetaAgent is search Move (r 2, c 1)
AlphaBetaAgent is search Move (r 2, c 2)
AlphaBetaAgent is search Move (r 2, c 4)
AlphaBetaAgent is search Move (r 2, c 5)
AlphaBetaAgent is search Move (r 3, c 1)
AlphaBetaAgent is search Move (r 3, c 2)
AlphaBetaAgent is search Move (r 3, c 3)
AlphaBetaAgent is search Move (r 3, c 4)
AlphaBetaAgent is search Move (r 3, c 5)
AlphaBetaAgent is search Move (r 4, c 1)
AlphaBetaAgent is search Move (r 4, c 2)
AlphaBetaAgent is search Move (r 4, c 3)
AlphaBetaAgent is search Move (r 4, c 4)
AlphaBetaAgent is se

IndexError: string index out of range
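
The traceback above is truncated, so its cause cannot be confirmed from the output. `IndexError: string index out of range` is the error Python raises when code indexes into an empty or too-short string, which can happen if the prompt receives empty input. A defensive parser, sketched below under that assumption, returns `None` instead of raising; `parse_coords` is a hypothetical helper and does not reproduce `point_from_coords` from `dlgobang`.

```python
import string


def parse_coords(text, board_size):
    """Parse input such as 'C2' into a 1-based (row, col) pair, or return None if invalid."""
    text = text.strip().upper()
    if len(text) < 2:
        return None  # empty or one-character input cannot hold a column and a row
    col_letter, row_digits = text[0], text[1:]
    columns = string.ascii_uppercase[:board_size]  # 'ABCDE' for a 5x5 board
    if col_letter not in columns or not row_digits.isdecimal():
        return None
    row = int(row_digits)
    if not 1 <= row <= board_size:
        return None
    return row, columns.index(col_letter) + 1


print(parse_coords('C2', 5))  # (2, 3)
print(parse_coords('', 5))    # None
```

In the game loop, the prompt could be repeated until `parse_coords` returns a value, and only then would the text be handed to `point_from_coords`.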