<a href="https://colab.research.google.com/github/chanind/lc0_colab_notebooks/blob/main/lc0_rescoring.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Leela Chess Zero Rescoring

This Colab runs the rescorer to improve the training data used to train Leela Chess Zero models. Rescoring makes use of [Syzygy tablebases](https://syzygy-tables.info/), which store the perfect-play result for every position with at most 5 (or 6 or 7) pieces remaining on the board. If a training game reaches a position that is in the tablebase, the rescorer rewrites the game using the known perfect play from the tablebase, so that Leela can learn from a perfect endgame rather than from the potential mistakes in the training game.
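
To make the piece-count condition concrete, here is a minimal sketch that counts the pieces in a FEN string and checks whether the position is small enough for a given tablebase size. The helper names `piece_count` and `in_tablebase_range` are made up for illustration and are not part of the rescorer, which does considerably more than this check.

```python
def piece_count(fen: str) -> int:
    """Count the pieces (kings included) in the board field of a FEN string."""
    board_field = fen.split()[0]
    # letters are pieces; digits are runs of empty squares; '/' separates ranks
    return sum(ch.isalpha() for ch in board_field)


def in_tablebase_range(fen: str, max_pieces: int = 5) -> bool:
    """True if the position has few enough pieces for a max_pieces tablebase."""
    return piece_count(fen) <= max_pieces


# King and rook vs king: small enough for the 3-4-5 tablebases
print(in_tablebase_range("8/8/8/8/8/4k3/8/R3K3 w - - 0 1"))
# Starting position: far too many pieces for any tablebase
print(in_tablebase_range("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"))
```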

Rescoring is used in the training of all the best nets for Leela, and should be run before training for best results. Rescoring is CPU-only.

In [None]:
# Download and untar some training data to rescore

!mkdir -p /content/raw_training_data
%cd /content/raw_training_data

# change this to download whichever training runs you want, or load from google drive
! wget https://storage.lczero.org/files/training_data/test78/training-run3-test78-20220217-0717.tar
! wget https://storage.lczero.org/files/training_data/test78/training-run3-test78-20220216-1817.tar

# untar the files we just downloaded and delete the original tar files
! for f in *.tar; do tar -xf "$f"; rm "$f"; done
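
Before moving on, it can be worth checking that the archives unpacked as expected. The sketch below counts the files in each extracted folder. It assumes that each tar unpacks into its own folder and that the chunk files end in `.gz`; adjust `suffix` if your data looks different. `count_chunks` is a hypothetical helper written for this notebook.

```python
from pathlib import Path


def count_chunks(root: str, suffix: str = ".gz") -> dict:
    """Map each sub-directory of root to the number of files ending in suffix."""
    counts = {}
    for data_dir in sorted(Path(root).iterdir()):
        if data_dir.is_dir():
            counts[data_dir.name] = sum(
                1 for f in data_dir.iterdir() if f.name.endswith(suffix)
            )
    return counts


raw_root = "/content/raw_training_data"
if Path(raw_root).is_dir():  # skip quietly when run outside this Colab
    for name, n in count_chunks(raw_root).items():
        print(f"{name}: {n} chunk files")
```

An empty folder, or a count of zero, would suggest that a download or extraction failed.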

Check out the rescorer from GitHub, which is in https://github.com/Tilps/lc0 in the `rescore_tb` branch, and build from source.

In [None]:
# install build deps
! apt install -y ninja-build
! pip3 install meson

# check out the repo from GitHub
%cd /content
!rm -rf lc0
!git clone --recurse-submodules https://github.com/Tilps/lc0.git
%cd lc0
!git checkout rescore_tb
!git pull

# start building
! ./build.sh

Download the Syzygy 3-4-5 tablebases for use by the rescorer.

In [None]:
! mkdir -p /content/syzygy-3-4-5
%cd /content/syzygy-3-4-5

from bs4 import BeautifulSoup
import requests
import urllib.parse

# we'll scrape https://tablebase.lichess.ovh/tables/standard/3-4-5/ 
# and download each file listed there
TABLEBASE_HOME_URL = 'https://tablebase.lichess.ovh/tables/standard/3-4-5/'

soup = BeautifulSoup(requests.get(TABLEBASE_HOME_URL).content, "html.parser")
# the attribute value must be quoted, otherwise the CSS selector is invalid
tablebase_links = soup.select('a[href*=".rt"]')
for tablebase_link in tablebase_links:
  filename = tablebase_link.attrs['href']
  url = urllib.parse.urljoin(TABLEBASE_HOME_URL, filename)
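  tablebase_file = requests.get(url)
  # stop early on a failed download instead of saving an error page as a table
  tablebase_file.raise_for_status()
  with open(filename, "wb") as file:
    file.write(tablebase_file.content)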
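
Syzygy tables come in pairs: a `.rtbw` file (win/draw/loss) and a `.rtbz` file (distance to zeroing) for each material combination. As a rough sanity check on the download, the sketch below lists any table that has only one of the two files. Whether the rescorer needs both file types is not something this notebook establishes, so treat a mismatch as a hint that a download was interrupted rather than as a hard error. `unpaired_tables` is a hypothetical helper.

```python
from pathlib import Path


def unpaired_tables(tb_dir: str) -> list:
    """Return table names that have only one of the .rtbw / .rtbz files."""
    wdl = {p.stem for p in Path(tb_dir).glob("*.rtbw")}
    dtz = {p.stem for p in Path(tb_dir).glob("*.rtbz")}
    # symmetric difference: names present in exactly one of the two sets
    return sorted(wdl ^ dtz)


tb_dir = "/content/syzygy-3-4-5"
if Path(tb_dir).is_dir():  # skip quietly when run outside this Colab
    missing = unpaired_tables(tb_dir)
    print("all tables paired" if not missing else f"unpaired tables: {missing}")
```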

Next, run rescoring and save the output to `/content/rescored_training_data`.

NOTE: The rescorer will delete the original data files as it runs.

In [None]:
! mkdir -p /content/rescored_training_data/
%cd /content/raw_training_data/

# The rescorer can't handle file globs like /content/raw_training_data/*
# So we need to loop in bash and run it individually on each folder
! for data_dir in *; do \
  mkdir -p "/content/rescored_training_data/$data_dir"; \
  /content/lc0/build/release/rescorer rescore \
    --threads=2 \
    --syzygy-paths=/content/syzygy-3-4-5/ \
    --input="/content/raw_training_data/$data_dir/" \
    --output="/content/rescored_training_data/$data_dir"; \
done

And we're done! The rescored training data is now in the `rescored_training_data` folder. You can move this into Google Drive to save it for later, or use it in training runs immediately.
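
One way to save the result is to pack it into a single tar file in a Drive folder, since one archive tends to be easier to move around than many small files. This is only a sketch: `archive_rescored` is a hypothetical helper, and the Drive destination shown in the comments is an example path, not something the rescorer requires.

```python
import shutil
from pathlib import Path


def archive_rescored(src: str, dest_dir: str,
                     name: str = "rescored_training_data") -> str:
    """Pack the contents of src into dest_dir/name.tar and return its path."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    return shutil.make_archive(str(Path(dest_dir) / name), "tar", root_dir=src)


# On Colab, mount Drive first (this import only exists inside Colab):
#   from google.colab import drive
#   drive.mount("/content/drive")
# then archive into a folder of your choice (example path):
#   archive_rescored("/content/rescored_training_data",
#                    "/content/drive/MyDrive/lc0")
```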

For even better results, you can use larger tablebases. There are 6- and 7-piece tablebases available, but they take up dramatically more disk space than the 3-4-5 tablebases used here.
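
Before downloading a larger set, it may help to check how much disk space is free. The sketch below uses `shutil.disk_usage` from the standard library. `has_free_space` is a hypothetical helper, and `needed_gb` is a placeholder that you should replace with the real size of the tablebase set you plan to download.

```python
import shutil


def has_free_space(path: str, needed_gb: float) -> bool:
    """True if the filesystem holding path has at least needed_gb GiB free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= needed_gb


# needed_gb=1 is only a placeholder; look up the real size of the set first
print(has_free_space("/", needed_gb=1))
```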