**NOTE:** For the best experience, run this notebook locally, since it contains JavaScript code that cannot be run on Colab, GitHub, etc.

# Query a pre-built database of chess positions

This demo shows you how to search a pre-built database of chess positions for positions that are similar to your own query position.

### Enter your query

First you need to specify the position for which you want to find similar positions. Use the widget in the following cell to set up a board and then press the *Export Position* button.

In [1]:
'''RUN THIS CELL ONLY ONCE to create the widget and a query queue'''
from IPython.display import HTML
query = []
HTML('input-widget.html')



Run the next cell to add your position to the query queue.

In [3]:
'''ALWAYS RUN THIS CELL AFTER YOU ENTER A POSITION IN THE PREVIOUS WIDGET'''
print(f"The fen string of your position is: {fen}.")
query.append(fen)
print(f"The fen string was added to the query, which now has {len(query)} positions stored.")

The fen string of your position is: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1.
The fen string was added to the query, which now has 1 positions stored.


After executing the last two cells you can **view the current query queue with the command below**. You can also **add more positions to the query queue by using the widget again and then executing the previous cell again**.

In [4]:
print(query)

['rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1']


**Alternatively**, if the widget above does not work, you can visit [lichess.org](https://lichess.org/editor), retrieve the FEN string of your position there, and then paste it into the list below.

In [None]:
'''
EXECUTE THIS CELL ONLY IF YOU DON'T WANT TO USE THE WIDGET ABOVE.
Replace the example fen string with your own.
'''
query = [
    "rnbq1rk1/pp2bppp/2p1pn2/4N1B1/2pP4/2N3P1/PP2PPBP/R2Q1RK1 b Qq - 0 1",  # list of FEN strings separated by commas
    "rnbq1rk1/pp2bppp/2p1pn2/4N1B1/2pP4/2N3P1/PP2PPBP/R2Q1RK1 b Qq - 0 1"
]
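
Before searching, it can help to sanity-check FEN strings that were pasted by hand. The following is a minimal sketch, not part of `chesspos`: the helper name `check_fen` is hypothetical, and it only checks the overall shape of the string (six fields, eight ranks of eight squares, side to move, castling letters), not whether the position is legal.

```python
PIECES = "pnbrqkPNBRQK"
DIGITS = "12345678"

def check_fen(fen):
    """Return a list of problems found in a FEN string (empty if none)."""
    fields = fen.split()
    if len(fields) != 6:
        return [f"expected 6 space-separated fields, got {len(fields)}"]
    placement, side, castling = fields[0], fields[1], fields[2]
    problems = []
    ranks = placement.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, got {len(ranks)}")
    for rank in ranks:
        # Digits stand for runs of empty squares, letters for single pieces.
        valid = all(c in PIECES or c in DIGITS for c in rank)
        width = sum(int(c) if c in DIGITS else 1 for c in rank)
        if not valid or width != 8:
            problems.append(f"bad rank: {rank!r}")
    if side not in ("w", "b"):
        problems.append(f"side to move must be 'w' or 'b', got {side!r}")
    if castling != "-" and any(c not in "KQkq" for c in castling):
        problems.append(f"bad castling field: {castling!r}")
    return problems

print(check_fen("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"))  # []
```

In the notebook you could run `[check_fen(f) for f in query]` and fix any entry that reports a problem.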

### Load the database of chess positions

In [5]:
from chesspos.binary_index import index_load, index_query_positions

Let's download a small precompiled database from Google Drive and uncompress it.

In [None]:
!curl -L -o '../data/index_2013.faiss.bz2' 'https://docs.google.com/uc?export=download&id=1MQKJ6KSmYRyPbIP1ldsNBo-0dGhi-CpQ'
!bzip2 -d ../data/index_2013.faiss.bz2

Now load the database into memory. *You need at least 256MB of RAM.*

In [6]:
filepath = "../data/index_2013.faiss"
index = index_load(filepath, is_binary=True)
print(f"The database you loaded contains {round(index.ntotal/1.e6,3)} million positions")

The database you loaded contains 1.766 million positions


Specify the **expected number of results per query**, then search the database and retrieve the most similar positions.

In [7]:
search_results = 10
dist, reconstructed = index_query_positions(query, index, input_format='fen', output_format='fen',
                                            num_results=search_results)
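
If the widget in the next section does not render (see the note at the top), the results can also be listed as plain text. This is a minimal sketch with a hypothetical helper, `summarize_results`. It assumes only what the widget code relies on: `dist[i][j]` and `reconstructed[i][j]` hold the distance and FEN string of the j-th hit for query i. The demo values are made up to show the nesting.

```python
def summarize_results(query, dist, reconstructed):
    """Build one text line per query and one per retrieved position."""
    lines = []
    for i, fen in enumerate(query):
        lines.append(f"Query {i}: {fen}")
        for j, (d, hit) in enumerate(zip(dist[i], reconstructed[i])):
            lines.append(f"  {j}: distance {d}  {hit}")
    return lines

# Made-up stand-in values, only to show the expected nesting.
demo_query = ["8/8/8/8/8/8/8/K6k w - - 0 1"]
demo_dist = [[0, 2]]
demo_reconstructed = [[
    "8/8/8/8/8/8/8/K6k w - - 0 1",
    "8/8/8/8/8/8/K7/7k w - - 0 1",
]]
demo_lines = summarize_results(demo_query, demo_dist, demo_reconstructed)
print("\n".join(demo_lines))
```

In the notebook, `print("\n".join(summarize_results(query, dist, reconstructed)))` would list the actual hits.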

### Inspect the retrieved results

Execute the cell below. This will generate a widget that lets you inspect the retrieved positions.

In [8]:
from IPython.display import HTML
html = '''<link rel="stylesheet" href="https://unpkg.com/@chrisoakman/chessboardjs@1.0.0/dist/chessboard-1.0.0.min.css" integrity="sha384-q94+BZtLrkL1/ohfjR8c6L+A6qzNH9R2hBLwyoAfu3i/WCvQjzL2RQJ3uNHDISdU" crossorigin="anonymous"><table>'''
for i in range(len(reconstructed)):
    html += f'''<tr><td>Your Query Position {i}</td><td><span>The (Hamming) distance between query and </span><select id="mySelect{i}" onchange="myFunction{i}()">'''
    for j in range(search_results):
        html += f'''<option value='{reconstructed[i][j]}|{dist[i][j]}'>Similar Position {j}</option>'''
    html += f'''</select><span> is </span><span id="dist{i}">0</span><span>.</span></td></tr><tr><td><div id="query{i}" style="width: 400px"></div></td><td><div id="myBoard{i}" style="width: 400px"></div></td></tr>'''
html += '''</table><script src="https://unpkg.com/@chrisoakman/chessboardjs@1.0.0/dist/chessboard-1.0.0.min.js" integrity="sha384-8Vi8VHwn3vjQ9eUHUxex3JSN/NFqUg3QbPyX8kWyb93+8AC/pPWTzj+nHtbC5bxD" crossorigin="anonymous"></script><script>'''
for i in range(len(reconstructed)):
    html += f'''var pos{i} = document.getElementById("mySelect{i}").value;var board{i} = Chessboard('myBoard{i}',{{showNotation: false}});var query{i} = Chessboard('query{i}',{{position: '{query[i]}',showNotation: false}});function myFunction{i}() {{var infos = document.getElementById("mySelect{i}").value;var position = infos.split("|")[0];var distance = infos.split("|")[1];board{i}.position(position);document.getElementById("dist{i}").innerHTML = distance;}}'''
html += '''</script>'''
HTML(html)



## How are similar positions calculated?

Positions are internally represented as bitboards: each (square, piece) combination is assigned the value 'true' if a piece of that type occupies the square and 'false' otherwise.

Therefore the distance between positions is the [Hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) between boolean vectors.
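
To make this concrete, here is a minimal, self-contained sketch that encodes the piece placement of a FEN string as 12 × 64 booleans and counts the differing entries. The bit ordering and the helper names `fen_to_bits` and `hamming` are assumptions for illustration and need not match what `chesspos` does internally. The notebook's own vectors have 773 entries (see the shape printed further down), so they hold a few bits beyond these 768 that the sketch omits.

```python
PIECES = "PNBRQKpnbrqk"  # 12 piece types: white first, then black

def fen_to_bits(fen):
    """Encode the piece placement of a FEN string as 12 * 64 booleans."""
    bits = [False] * (len(PIECES) * 64)
    placement = fen.split()[0]
    for rank_idx, rank in enumerate(placement.split("/")):
        file_idx = 0
        for c in rank:
            if c.isdigit():
                file_idx += int(c)  # run of empty squares
            else:
                square = rank_idx * 8 + file_idx
                bits[PIECES.index(c) * 64 + square] = True
                file_idx += 1
    return bits

def hamming(a, b):
    """Count the positions at which two equally long vectors differ."""
    return sum(x != y for x, y in zip(a, b))

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
after_b4 = "rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1"
print(hamming(fen_to_bits(start), fen_to_bits(after_b4)))  # 2
```

Moving one pawn switches one bit off (b2) and one bit on (b4), hence a distance of 2. Every non-capturing move of a single piece has this same cost, no matter how much it changes the character of the position.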

This distance measure is, however, not very useful, as you can find out by experimenting with the demo above. For other ways of measuring position similarity, check out the metric learning part of this repo.

## Now use an embedding for search

First we need to download an embedding model and the index files. Then we prepare everything for search and retrieval.

In [None]:
# download an embedding model
!curl -L -o '../data/deep64.tar.bz2' 'https://docs.google.com/uc?export=download&id=1MHBTMx7yCJTL_l-BD72Nr3EEcwLa1myq'
# download an index for search
!curl -L -o '../data/PCA32,SQ6.faiss' 'https://docs.google.com/uc?export=download&id=1C70LuT3NGHdwqmPSrz4ZplwskakV769W'
!curl -L -o '../data/PCA32,SQ6.json' 'https://docs.google.com/uc?export=download&id=1f94oEH6aMEFASQs1jowdiklCktZt3gy1'
# download the bitboards that were used to create the index
!curl -L -o '../data/2013_bitboards.tar.bz2' 'https://docs.google.com/uc?export=download&id=1i00hmYvjPn4LmNHj71fm_Kx9ETBv_cwa'
# uncompress and clean up
!tar -xjf ../data/deep64.tar.bz2 -C ../data/
!mv ../data/deep64 ../data/model_deep64
!rm ../data/deep64.tar.bz2
!tar -xjf ../data/2013_bitboards.tar.bz2 -C ../data/
!mv ../data/content/2013_bitboards ../data/2013_bitboards
!rm -r ../data/content

In [9]:
import json
import faiss
import numpy as np
import tensorflow as tf

from chesspos.convert import bitboard_to_board
from chesspos.binary_index import board_to_bitboard
from chesspos.utils import files_from_directory
import chesspos.embedding_index as iemb

In [10]:
# prepare everything for search
encoder_path = "../data/model_deep64/model_encoder.h5"
decoder_path = "../data/model_deep64/model_decoder.h5"
embedding_path = "../data/2013_bitboards"
index_file_without_ending = "../data/PCA32,SQ6"
# load the index
with open(f"{index_file_without_ending}.json") as table_file:
    table_dict = json.load(table_file)
index = faiss.read_index(f"{index_file_without_ending}.faiss")
# query same as above
#query = np.asarray(query)
search_results = 10

### Now search the index and retrieve the positions of the nearest neighbors

In [11]:
D, I, E = iemb.index_query_positions(query, index, encoder_path,
                                     input_format='fen', num_results=search_results)
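
Conceptually, an embedding search maps each position to a vector of floats and ranks the database entries by their distance to the query vector. The brute-force sketch below illustrates that idea on toy data. It assumes squared Euclidean distance and assumes that `D` and `I` play the usual faiss roles of distances and database row indices. The function `nearest` and the toy vectors are hypothetical. The index loaded above stores compressed vectors, so it does not work exactly like this.

```python
import heapq

def nearest(query_vec, database, k):
    """Return (distances, indices) of the k rows closest to query_vec."""
    scored = (
        (sum((q - x) ** 2 for q, x in zip(query_vec, row)), idx)
        for idx, row in enumerate(database)
    )
    best = heapq.nsmallest(k, scored)  # sorted by distance, then by index
    return [d for d, _ in best], [i for _, i in best]

# Toy two-dimensional "embeddings"; real ones come from the encoder model.
toy_db = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
D_toy, I_toy = nearest([0.9, 0.1], toy_db, k=2)
print(I_toy)  # [1, 0]
```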



**And retrieve the corresponding bitboards**

In [12]:
# retrieve the corresponding bitboards
file, table, offset = iemb.location_from_index(I, table_dict)
bb_table = iemb.manipulate_prefix(table, "position")
bb_file = iemb.manipulate_prefix(file, f"{embedding_path}/lichess_db_standard_rated")

embedding_bitboards = iemb.retrieve_elements_from_file(bb_file, bb_table, offset)
bb_shape = embedding_bitboards.shape
print(embedding_bitboards.shape, embedding_bitboards.dtype)

[  49366  209910  237040  637847  699750  877884 1163538 1266212 1449998
 1634451 1693104 1765739]
(1, 10, 773) bool
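
The array printed above has twelve increasing entries ending at 1765739, which matches the roughly 1.766 million positions reported for the database earlier. It therefore reads like running totals of positions per table. Assuming that interpretation (it is not confirmed by the source), a global index id can be mapped to a table number and an offset within that table with a binary search. The function `locate` is a hypothetical stand-in and not the implementation of `iemb.location_from_index`.

```python
from bisect import bisect_right

# Running totals copied from the output above (assumed meaning: the number
# of positions contained in all tables up to and including each one).
cumulative = [49366, 209910, 237040, 637847, 699750, 877884,
              1163538, 1266212, 1449998, 1634451, 1693104, 1765739]

def locate(global_id):
    """Map a global index id to (table number, offset inside that table)."""
    if not 0 <= global_id < cumulative[-1]:
        raise IndexError(f"id {global_id} is outside the index")
    table = bisect_right(cumulative, global_id)
    start = cumulative[table - 1] if table > 0 else 0
    return table, global_id - start

print(locate(0))        # (0, 0)
print(locate(49366))    # (1, 0)
print(locate(1765738))  # (11, 72634)
```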


In [13]:
# convert bitboards to fen
def fen_converter(bb):
    bb = bb.astype(bool)
    board = bitboard_to_board(bb) 
    return board.fen()

bitboards_fen = [[None for _ in range(bb_shape[1])] for _ in range(bb_shape[0])]
for i in range(bb_shape[0]):
    for j in range(bb_shape[1]):
        bitboards_fen[i][j] = fen_converter(embedding_bitboards[i][j])
print(len(bitboards_fen),len(bitboards_fen[0]), bitboards_fen)

1 10 [['rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1', 'rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1']]
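
All ten hits in the output above are the same FEN string. If you only care about distinct positions, repeated entries can be collapsed while keeping the ranking order. This is a minimal sketch with a hypothetical helper name, `unique_in_order`.

```python
def unique_in_order(fens):
    """Drop repeated FEN strings, keeping the first occurrence of each."""
    return list(dict.fromkeys(fens))

hits = [
    "rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1",
    "rnbqkbnr/pppppppp/8/8/1P6/8/P1PPPPPP/RNBQKBNR b KQkq - 0 1",
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
]
distinct = unique_in_order(hits)
print(len(distinct))  # 2
```

In the notebook, `[unique_in_order(row) for row in bitboards_fen]` would apply this per query. Note that the rows may then be shorter than `search_results`.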


### Inspect the retrieved results

In [16]:
from IPython.display import HTML
html = '''<link rel="stylesheet" href="https://unpkg.com/@chrisoakman/chessboardjs@1.0.0/dist/chessboard-1.0.0.min.css" integrity="sha384-q94+BZtLrkL1/ohfjR8c6L+A6qzNH9R2hBLwyoAfu3i/WCvQjzL2RQJ3uNHDISdU" crossorigin="anonymous"><table>'''
for i in range(len(bitboards_fen)):
    html += f'''<tr><td>Your Query Position {i}</td><td><span>Distance between query and </span><select id="mySelectEmb{i}" onchange="myFunctionEmb{i}()">'''
    for j in range(search_results):
        html += f'''<option value='{bitboards_fen[i][j]}|{D[i][j]}'>Similar Position {j}</option>'''
    html += f'''</select><span> is </span><span id="distEmb{i}">0</span><span>.</span></td></tr><tr><td><div id="queryEmb{i}" style="width: 400px"></div></td><td><div id="myBoardEmb{i}" style="width: 400px"></div></td></tr>'''
html += '''</table><script src="https://unpkg.com/@chrisoakman/chessboardjs@1.0.0/dist/chessboard-1.0.0.min.js" integrity="sha384-8Vi8VHwn3vjQ9eUHUxex3JSN/NFqUg3QbPyX8kWyb93+8AC/pPWTzj+nHtbC5bxD" crossorigin="anonymous"></script><script>'''
for i in range(len(bitboards_fen)):
    # JS names carry the "Emb" suffix so they do not overwrite the globals of the first widget
    html += f'''var posEmb{i} = document.getElementById("mySelectEmb{i}").value;var boardEmb{i} = Chessboard('myBoardEmb{i}',{{showNotation: false}});var queryEmb{i} = Chessboard('queryEmb{i}',{{position: '{query[i]}',showNotation: false}});function myFunctionEmb{i}() {{var infos = document.getElementById("mySelectEmb{i}").value;var position = infos.split("|")[0];var distance = parseFloat(infos.split("|")[1]).toFixed(2);boardEmb{i}.position(position);document.getElementById("distEmb{i}").innerHTML = distance;}}'''
html += '''</script>'''
HTML(html)



## Why is the result worse than I expected?

If you execute the code above, you might be disappointed by the retrieved results. This is most likely because you are only searching a database of 1.7 million positions. After all, a similar position can only be retrieved if it is in the database. You might want to download one of the bigger databases I provide (see section 4.3 of the readme).

Another reason for poor performance could be that you selected a position from the opening or the endgame. Since the embedding models that I provide are trained on middlegame positions, the embeddings may be of lower quality in that case.