# PathChess: An Educational AlphaZero-Style Chess Engine Demo

## Introduction
This notebook demonstrates PathChess, an educational AlphaZero-style chess engine. It gives an overview of the engine's architecture and training process, and supports interactive play against the model.

## Background and Theory
### Monte Carlo Tree Search (MCTS)
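Monte Carlo Tree Search (MCTS) is a heuristic search algorithm used in decision processes such as game playing. It balances exploration of rarely visited moves against exploitation of moves that have already scored well. In AlphaZero, the neural network's policy and value outputs guide the search in place of random rollouts.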
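
The exploration/exploitation trade-off can be made concrete with a PUCT-style selection rule of the kind described in the AlphaZero paper. The following is a minimal sketch, not PathChess's actual implementation: the node layout (`visits`, `value_sum`, `prior`), the function names, and the constant `c_puct=1.5` are all assumptions chosen for illustration.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value q plus an exploration bonus that shrinks as the child is visited."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration

def select_child(children, c_puct=1.5):
    """Return the move whose child has the highest PUCT score.

    `children` maps a move label to a dict with the hypothetical keys
    'visits', 'value_sum', and 'prior'.
    """
    parent_visits = sum(c["visits"] for c in children.values())
    best_move, best_score = None, -math.inf
    for move, c in children.items():
        # Unvisited children have no mean value yet; treat it as 0.
        q = c["value_sum"] / c["visits"] if c["visits"] else 0.0
        score = puct_score(q, c["prior"], parent_visits, c["visits"], c_puct)
        if score > best_score:
            best_move, best_score = move, score
    return best_move

# Illustrative numbers only: one well-explored move and one unvisited move.
children = {
    "e2e4": {"visits": 10, "value_sum": 6.0, "prior": 0.5},
    "d2d4": {"visits": 0, "value_sum": 0.0, "prior": 0.3},
    "g1f3": {"visits": 2, "value_sum": 0.4, "prior": 0.2},
}
selected = select_child(children)
print(selected)
```

In this example the unvisited move `d2d4` is selected even though `e2e4` has the best average value so far, because the exploration bonus dominates until a move has been tried.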

### Neural Network Architecture
AlphaZero uses a deep convolutional neural network with two output heads: a policy head that assigns a probability to each possible move, and a value head that estimates the expected outcome of the position.
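
To illustrate what the two outputs mean, the sketch below turns raw policy scores into a probability distribution over legal moves and squashes a raw value score into the range [-1, 1]. It is an assumption-laden illustration: whether PathChess masks illegal moves this way is not stated in this notebook, and `masked_softmax` is a hypothetical helper, not a function from the project.

```python
import math

def masked_softmax(logits, legal_mask):
    """Softmax restricted to legal moves; illegal moves receive probability 0."""
    legal = [x for x, ok in zip(logits, legal_mask) if ok]
    if not legal:
        raise ValueError("at least one move must be legal")
    peak = max(legal)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) if ok else 0.0 for x, ok in zip(logits, legal_mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Four candidate moves with made-up scores; the second one is illegal.
probs = masked_softmax([2.0, 1.0, 0.5, -1.0], [True, False, True, True])

# A tanh keeps the position evaluation between -1 (loss) and +1 (win).
value = math.tanh(0.8)

print(probs, value)
```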

## Setup Environment
To run the examples in this notebook, ensure that the required libraries are installed and import them as follows:


In [1]:
import os

In [2]:
# Import necessary libraries
from src.model import build_alpha_zero_model
from main import load_model

# Load a pre-trained model
model = load_model('models/alpha-path_model')

# model = build_alpha_zero_model()
# model.compile(optimizer='adam',
#               loss={'policy_output': 'categorical_crossentropy', 'value_output': 'mean_squared_error'},
#               metrics={'policy_output': 'accuracy', 'value_output': 'mse'})




## Model Architecture


In [11]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 4, 8, 8, 14  0           []                               
                                )]                                                                
                                                                                                  
 reshape (Reshape)              (None, 8, 8, 56)     0           ['input_1[0][0]']                
                                                                                                  
 conv2d (Conv2D)                (None, 8, 8, 256)    129280      ['reshape[0][0]']                
                                                                                                  
 batch_normalization (BatchNorm  (None, 8, 8, 256)   1024        ['conv2d[0][0]']             
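
The parameter counts in the summary above can be checked by hand. The sketch below assumes a 3×3 convolution kernel with a bias term; the kernel size is inferred from the reported count of 129280, not read from the model source, and both helper functions are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights of a 2D convolution: one kernel per (input channel, filter) pair, plus biases."""
    return kernel_h * kernel_w * in_channels * filters + (filters if use_bias else 0)

def batchnorm_params(channels):
    """Batch normalization keeps gamma, beta, moving mean, and moving variance per channel."""
    return 4 * channels

# The Reshape layer folds the leading 4 and trailing 14 dimensions into 4 * 14 = 56 channels.
reshaped_channels = 4 * 14

conv_count = conv2d_params(3, 3, reshaped_channels, 256)
bn_count = batchnorm_params(256)
print(reshaped_channels, conv_count, bn_count)
```

Both results agree with the `Param #` column for `conv2d` and `batch_normalization`.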

## Demonstration of Supervised Learning
Here we run a training session on games loaded from a PGN file (a Lichess elite database export). This section is meant to show how the model can learn from historical game data.
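
One common way to turn a finished game into value-head training targets is to label every position with the final result as seen by the side to move. The sketch below shows that idea only; it is not taken from `process_single_game`, whose internals are not shown in this notebook, and it assumes the game starts with White to move.

```python
def value_targets(result, num_positions):
    """Label each position with the game result from the side-to-move's perspective.

    `result` is a PGN result string. Position 0 is assumed to be White to move,
    position 1 Black to move, and so on.
    """
    outcome_for_white = {"1-0": 1, "0-1": -1, "1/2-1/2": 0}[result]
    return [
        outcome_for_white if ply % 2 == 0 else -outcome_for_white
        for ply in range(num_positions)
    ]

white_win = value_targets("1-0", 4)
draw = value_targets("1/2-1/2", 3)
print(white_win, draw)
```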


In [12]:
from main import begin_supervised_learning
training_history = begin_supervised_learning(model, 'games_database/lichess_elite_2020-05.pgn', 100)
# save_model(model, 'path_to_my_model.h5')
# model = load_model_from_path('path_to_my_model.h5')
# human_vs_ai(model)


Traceback (most recent call last):
  File "C:\Users\Shaur\Anaconda3\envs\turozero\lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "C:\Users\Shaur\AppData\Local\Temp\ipykernel_6912\182678050.py", line 2, in <module>
    training_history = begin_supervised_learning(model, 'games_database/lichess_elite_2020-05.pgn', 100)
  File "C:\Users\Shaur\Desktop\PathChess\main.py", line 21, in begin_supervised_learning
    game_data = process_single_game(game)
  File "C:\Users\Shaur\Desktop\PathChess\src\training.py", line 51, in process_single_game
KeyboardInterrupt


## Demonstration of Self-Play and Unsupervised Learning
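The engine improves its play by competing against itself. The project calls this stage unsupervised learning; strictly speaking, it is reinforcement learning from self-play. Below, we run a few iterations of self-play to demonstrate how the model updates its strategy.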
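
In AlphaZero-style self-play, the policy target for each position is derived from the MCTS visit counts at the root. The sketch below shows that conversion with a temperature parameter; the function name and the example counts are hypothetical, and PathChess may compute its targets differently.

```python
def visits_to_policy(visit_counts, temperature=1.0):
    """Convert root visit counts into a move distribution.

    Each count is raised to the power 1/temperature and the results are
    normalised; lower temperatures sharpen the distribution toward the
    most-visited move.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = {move: n ** (1.0 / temperature) for move, n in visit_counts.items()}
    total = sum(weights.values())
    return {move: w / total for move, w in weights.items()}

counts = {"e2e4": 600, "d2d4": 300, "g1f3": 100}
policy = visits_to_policy(counts)
sharper = visits_to_policy(counts, temperature=0.5)
print(policy)
print(sharper)
```

With a temperature of 1 the distribution is proportional to the visit counts; halving the temperature concentrates more probability on `e2e4`.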


In [4]:
from main import begin_unsupervised_learning
begin_unsupervised_learning(model, cycles=1, games_per_cycle=2, iters=1000)
# save_model(model, 'path_to_my_model.h5')
# Note: this is extremely slow at the moment. AlphaZero used 800 search iterations per move;
# to see a quick run, set iters to about 10.

NEW GAME STARTED

KeyboardInterrupt: 

## Evaluation and Analysis
We evaluate the model's performance by plotting the loss and metric curves recorded in `training_history` during supervised training, which can then be compared to a baseline if one is available.


In [None]:
import matplotlib.pyplot as plt

total_loss = []
policy_output_loss = []
value_output_loss = []
policy_output_accuracy = []
value_output_mse = []


for history in training_history:
    total_loss.extend(history['loss'])
    policy_output_loss.extend(history['policy_output_loss'])
    value_output_loss.extend(history['value_output_loss'])
    policy_output_accuracy.extend(history['policy_output_accuracy'])
    value_output_mse.extend(history['value_output_mse'])


plt.figure(figsize=(10, 5))  # Optional: You can specify the size of the plot

plt.plot(total_loss, label='Total loss')
plt.plot(policy_output_loss, label='Policy loss')
plt.plot(value_output_loss, label='Value loss')

plt.title('Model Loss by Epoch')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.figure(figsize=(10, 5))

plt.plot(policy_output_accuracy, label='Policy output accuracy')
plt.plot(value_output_mse, label='Value output MSE')

plt.title('Model Accuracy/MSE by Epoch')
plt.xlabel('Epoch')
plt.ylabel('Metrics')
plt.legend()
plt.show()


## Conclusion and Further Research
In this notebook, we explored the PathChess engine, covering its training and gameplay capabilities. Further research could explore enhancements to the neural network architecture or improvements to the MCTS algorithm, especially its speed during training, or an NNUE approach similar to Stockfish's, which retains the benefits of pure calculation in a chess engine.

This code produces an engine that plays logically but very poorly, owing to limited hardware. The project is intended for educational rather than practical use: many optimizations remain possible (such as using C++ directly) to make the code more efficient. Even so, it serves as a useful reference for anyone trying to understand how AlphaZero works.

## References
- [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (the original AlphaZero paper)](https://arxiv.org/pdf/1712.01815.pdf)
- [Code for the board encodings (freeCodeCamp)](https://www.freecodecamp.org/news/create-a-self-playing-ai-chess-engine-from-scratch/)
