# PathChess: An Educational AlphaZero-Style Chess Engine Demo

## Introduction
This notebook demonstrates PathChess, an educational chess engine modeled on AlphaZero. It gives an overview of the engine's architecture, walks through its training process, and supports interactive gameplay.

## Background and Theory
### Monte Carlo Tree Search (MCTS)
Monte Carlo Tree Search (MCTS) is a heuristic search algorithm used in decision processes, such as game playing. It balances exploration of unexplored moves with exploitation of known rewarding moves.
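The exploration/exploitation balance can be sketched with the UCT selection rule. The scoring expression below is the one that appears in the self-play output later in this notebook; the `Node` class, the default `c_param` value, and the handling of unvisited children are assumptions made for illustration, not necessarily how the engine implements them. Note that this is plain UCT; the AlphaZero paper uses a variant (PUCT) that also weights each move by the network's prior probability.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical search-tree node (illustrative, not the engine's class)."""
    visits: int = 0
    value: float = 0.0
    children: list = field(default_factory=list)


def uct_score(parent: Node, child: Node, c_param: float = 1.4) -> float:
    """Average value (exploitation) plus a visit-count bonus (exploration)."""
    if child.visits == 0:
        # Unvisited children are tried first; this also avoids dividing by zero.
        return math.inf
    exploit = child.value / child.visits
    explore = c_param * math.sqrt(2 * math.log(parent.visits) / child.visits)
    return exploit + explore


def select_child(parent: Node) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(parent, ch))


root = Node(visits=10)
root.children = [Node(visits=6, value=3.0), Node(visits=4, value=2.4), Node(visits=0)]
best = select_child(root)  # the unvisited third child is selected first
```

Returning infinity for unvisited children is one simple way to guarantee every move is tried at least once before the visit-count bonus starts to matter.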

### Neural Network Architecture
AlphaZero uses a deep convolutional neural network with two outputs: a policy output that assigns a probability to each possible move, and a value output that evaluates the board position.
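To make the two outputs concrete, here is a minimal sketch in plain Python of how raw network outputs are commonly turned into move probabilities and a position score. The numbers are made up, and masking illegal moves is an assumption about how such an engine might post-process the policy, not something this notebook confirms.

```python
import math


def softmax(logits):
    """Turn raw policy scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_and_normalize(probs, legal):
    """Zero out illegal moves and renormalize over the legal ones."""
    masked = [p if ok else 0.0 for p, ok in zip(probs, legal)]
    total = sum(masked)
    return [p / total for p in masked]


# Hypothetical raw outputs for a position with four candidate moves.
policy_logits = [2.0, 1.0, 0.5, -1.0]
legal_moves = [True, True, False, True]
raw_value = 0.3

policy = mask_and_normalize(softmax(policy_logits), legal_moves)
value = math.tanh(raw_value)  # squashed into [-1, 1]: -1 loss, 0 draw, +1 win
```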

## Setup Environment
To run the examples in this notebook, ensure that the required libraries are installed and import them as follows:


In [1]:
import os

In [2]:
# Import necessary libraries
from src.model import build_alpha_zero_model
from main import load_model

# Load a pre-trained model
#model = load_model('models/alpha-path_model')

model = build_alpha_zero_model()
model.compile(optimizer='adam',
              loss={'policy_output': 'categorical_crossentropy', 'value_output': 'mean_squared_error'},
              metrics={'policy_output': 'accuracy', 'value_output': 'mse'})




## Model Architecture


In [3]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 4, 8, 8, 14  0           []                               
                                )]                                                                
                                                                                                  
 reshape (Reshape)              (None, 8, 8, 56)     0           ['input_1[0][0]']                
                                                                                                  
 conv2d (Conv2D)                (None, 8, 8, 256)    129280      ['reshape[0][0]']                
                                                                                                  
 batch_normalization (BatchNorm  (None, 8, 8, 256)   1024        ['conv2d[0][0]']             
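
The parameter counts in the summary can be checked by hand. The sketch below assumes a 3x3 convolution kernel with a bias term, which is not shown in the summary but is consistent with the reported count; the helper functions are written only for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable weights in a 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def batch_norm_params(channels):
    """gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels


# The (4, 8, 8, 14) input is reshaped to 8 x 8 squares with 4 * 14 = 56 channels.
in_channels = 4 * 14

print(conv2d_params(3, 3, in_channels, 256))  # 129280, matches conv2d
print(batch_norm_params(256))                 # 1024, matches batch_normalization
```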

## Demonstration of Supervised Learning
Here we simulate a training session using pre-loaded data to show how the model can learn from historical game data. Enough chess games are available that you never need to train on the same dataset twice, although you can run multiple passes over the same dataset if you want to.
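
One common way to build supervised targets from a recorded game is sketched below: the policy target marks the move that was actually played, and the value target is the final result seen from the side to move. Whether `begin_supervised_learning` does exactly this is not shown in this notebook, so treat the helper names and the encoding as assumptions.

```python
RESULT_TO_WHITE_SCORE = {"1-0": 1.0, "0-1": -1.0, "1/2-1/2": 0.0}


def value_target(result: str, white_to_move: bool) -> float:
    """Game outcome from the perspective of the side to move."""
    score = RESULT_TO_WHITE_SCORE[result]
    return score if white_to_move else -score


def policy_target(played_index: int, num_moves: int) -> list:
    """One-hot vector marking the move that was played in the game."""
    target = [0.0] * num_moves
    target[played_index] = 1.0
    return target


# Hypothetical position: Black to move in a game White went on to win,
# and the move played is index 2 out of 4 encoded moves.
v = value_target("1-0", white_to_move=False)
p = policy_target(2, 4)
```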


In [4]:
from main import begin_supervised_learning
training_history = begin_supervised_learning(model, 'games_database/lichess_elite_2020-05.pgn', 1500)
# save_model(model, 'path_to_my_model.h5')
# model = load_model_from_path('path_to_my_model.h5')
# human_vs_ai(model)



## Demonstration of Self-Play and Unsupervised Learning
The engine improves its play by competing against itself. Below, we run a few iterations of self-play to demonstrate how the model updates its strategy. Self-play is far less efficient than learning from master games, so here it mainly serves to show the model's current skill level. In this example the model has not been trained enough to recognize a winning position, so it will play questionable moves.
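
In AlphaZero-style self-play, the policy training target for a position is derived from how often the search visited each move. The sketch below shows that normalization with a temperature parameter; the function name and the visit counts are hypothetical, and this notebook does not show whether `begin_unsupervised_learning` uses this exact scheme.

```python
def visits_to_policy(visit_counts, temperature=1.0):
    """Normalize N^(1/temperature) so the targets sum to 1."""
    scaled = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical visit counts for three root moves after 150 search iterations.
counts = [90, 45, 15]
print(visits_to_policy(counts))  # [0.6, 0.3, 0.1]

# A lower temperature sharpens the distribution toward the most-visited move.
sharp = visits_to_policy(counts, temperature=0.5)
```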


In [5]:
from main import begin_unsupervised_learning
begin_unsupervised_learning(model, cycles=1, games_per_cycle=2, iters=150)
# save_model(model, 'path_to_my_model.h5')
# Note: this is extremely slow at the moment. AlphaZero runs 800 iterations per move,
# but for a quick run you can use just 10 iterations.
# The quality of the game will be poor regardless, since the model hasn't been trained on enough games.

NEW GAME STARTED


  (child.value / child.visits) + c_param * np.sqrt((2 * np.log(self.visits) / child.visits))

[Event "Self-play training session"]
[Site "Local"]
[Date "2024.04.26"]
[Round "1"]
[White "Model"]
[Black "Model"]
[Result "1/2-1/2"]

1. b3 d5 2. Bb2 e5 3. c4 Nc6 4. e3 Bd6 5. d4 Nge7 6. Nf3 O-O 7. c5 exd4 8. Bc4 Bxc5 9. Nbd2 a5 10. a3 dxc4 11. Nxc4 Bg4 12. Qc1 b5 13. h3 Bh5 14. Rg1 f5 15. Nfd2 Bg6 16. f3 Qd7 17. Kd1 Bb6 18. exd4 Rfe8 19. a4 bxa4 20. d5 Nb4 21. Rb1 Nd3 22. Bxg7 h5 23. Bf8 Nc6 24. Kc2 Red8 25. Na3 Rac8 26. Re1 Nd4+ 27. Kxd3 axb3 28. Rg1 Ra8 29. Ndc4 Qe8 30. Nb5 Kxf8 31. Nxd4 Bf7 32. Nxf5 Bg6 33. Rh1 Ba7 34. Qc2 b2 35. Rhd1 Bf7 36. Nfe3 Kg8 37. Rdc1 Bxd5 38. Nxd5 Rd6 39. f4 c5 40. Qc3 Rb8 41. Qb4 Ra8 42. Qd2 Kh7 43. Qxb2 Qd7 44. Rc3 Rf8 45. Ke2 Qe7+ 46. Kd2 Qd7 47. Ra1 Qb7 48. Qc1 Qa8 49. Kd3 Qb7 50. Rxa5 Rb8 51. Raa3 Qf7 52. Kc2 Qe6 53. Ra1 Rg8 54. f5 Qf6 55. Na5 Kh8 56. Ne7 Qg7 57. Kb2 Qxe7 58. Rxc5 Qf6+ 59. Kb3 Qg7 60. Rc8 Rf8 61. Qc3 h4 62. Qg3 Rxc8 63. Qf2 Qg6 64. Ra2 Rc1 65. Kb4 Qf6 66. Nc4 Be3 67. Na5 Bc5+ 68. Ka4 Rd7 69. Kb3 Rh7 70. Qb2 Qe5 71. Qc2 Qc3+ 72. Qxc

NameError: name 'os' is not defined

The model plays questionable chess. It has a reasonable grasp of positions and typical moves, but when an opponent blunders a queen, for example, it won't capture the queen, because the training data (master games) very rarely contains such outright winning moves. A better approach may be to start training on games of all levels, where piece blunders are more common, and gradually narrow the data down to master games.

## Evaluation and Analysis
We evaluate the model's performance by displaying its accuracy and loss metrics, and compare it to a baseline if available.


In [None]:
import matplotlib.pyplot as plt

total_loss = []
policy_output_loss = []
value_output_loss = []
policy_output_accuracy = []
value_output_mse = []


for history in training_history:
    total_loss.extend(history['loss'])
    policy_output_loss.extend(history['policy_output_loss'])
    value_output_loss.extend(history['value_output_loss'])
    policy_output_accuracy.extend(history['policy_output_accuracy'])
    value_output_mse.extend(history['value_output_mse'])


plt.figure(figsize=(10, 5))  # Optional: You can specify the size of the plot

plt.plot(total_loss, label='Total loss')
plt.plot(policy_output_loss, label='Policy loss')
plt.plot(value_output_loss, label='Value loss')

plt.title('Model Loss by Epoch')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.figure(figsize=(10, 5))

plt.plot(policy_output_accuracy, label='Policy output accuracy')
plt.plot(value_output_mse, label='Value output MSE')

plt.title('Model Accuracy/MSE by Epoch')
plt.xlabel('Epoch')
plt.ylabel('Metrics')
plt.legend()
plt.show()


## Conclusion and Further Research
In this notebook, we have explored the functionality of this AlphaZero-style chess engine, covering its training and gameplay capabilities. For further research, one could explore enhancements to the neural network architecture or improvements to the MCTS algorithm, especially its training speed, or implement an NNUE approach similar to Stockfish, which retains the benefits of pure calculation in a chess engine. This code produces an engine that plays logically but very poorly because of limited hardware. The project is educational rather than practical: many optimizations could make the code more efficient (such as writing it directly in C++), but it serves as a useful reference for anyone trying to understand how AlphaZero works.

## References
- [Link to the original AlphaZero paper](https://arxiv.org/pdf/1712.01815.pdf)

- [Link to code for board encodings](https://www.freecodecamp.org/news/create-a-self-playing-ai-chess-engine-from-scratch/)

- [Assistance for Monte Carlo Tree Search](https://www.youtube.com/watch?v=wuSQpLinRB4&t=7691s)
