My journey to achieving a 2,500 puzzle rating on Lichess.org and some general exploration of the Lichess puzzle database.
This repository contains some exploration and investigation into the Lichess database of chess puzzles as well as my own chess puzzle progress.
This project was undertaken partly out of curiosity and partly to practice some Python, R, and Tableau. As such, I did most of the data manipulation in Python using pandas, with initial visualizations in R using ggplot2. I worked primarily in Google Colab, since I wanted to familiarize myself with that platform as well. I used rpy2 to run Python and R in the same Colab notebook via R magic. Occasionally, the Colab runtime would run out of RAM and crash when using R magic; in those cases, I ran the R code locally. Advanced visualizations and dashboards were done in Tableau.
This is certainly not the most efficient way to do things, but it worked out well enough overall and I learned a lot along the way.
Lichess.org is a free/libre, open-source chess server powered by volunteers and donations.
With humble beginnings, Lichess.org (pronounced "lee-chess") is now one of the most popular chess websites in the world, with over 3 million games played daily.
Lichess.org also has an extensive database of chess tactics puzzles, all of which were generated from user games on the website. This database contains millions of puzzles available to play for free!
We collected data from the full Lichess puzzle database as well as data on my personal rating history and puzzle activity via the Lichess API. Details can be found in puzzle_journey_data_collection_processing.ipynb.
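As a rough illustration of the API step, here is a minimal sketch of pulling a rating history and extracting the puzzle ratings. It is not the code from the notebook. The endpoint URL, the "Puzzles" category name, and the `[year, month, day, rating]` point layout with a zero-based month are assumptions based on my reading of the Lichess API, and the function names and sample payload are made up for the example.

```python
import json
from datetime import date
from urllib.request import urlopen

# Assumed endpoint; check the Lichess API documentation before relying on it.
RATING_HISTORY_URL = "https://lichess.org/api/user/{username}/rating-history"


def fetch_rating_history(username):
    """Download the raw rating history for one user (performs a network request)."""
    with urlopen(RATING_HISTORY_URL.format(username=username)) as response:
        return json.load(response)


def puzzle_rating_points(history):
    """Return (date, rating) pairs for the category named 'Puzzles'.

    Assumes each point is [year, month, day, rating] with a zero-based month.
    """
    for category in history:
        if category.get("name") == "Puzzles":
            return [
                (date(year, month + 1, day), rating)
                for year, month, day, rating in category.get("points", [])
            ]
    return []


# Made-up sample payload in the assumed shape, so the parsing can be tried offline.
sample_history = [
    {"name": "Blitz", "points": [[2022, 0, 15, 1500]]},
    {"name": "Puzzles", "points": [[2022, 1, 3, 1800], [2023, 1, 24, 2510]]},
]

points = puzzle_rating_points(sample_history)
peak_date, peak_rating = max(points, key=lambda point: point[1])
print(peak_date, peak_rating)
```

Keeping the download and the parsing in separate functions means the parsing can be checked against a saved response without hitting the API again.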
At the onset of the COVID-19 pandemic in early 2020, I, like many others, inexplicably found myself interested in learning to play chess. I created an account on Lichess.org (the clearly superior, open-source alternative to Chess.com) in May of 2020, but didn't do much with it aside from the occasional correspondence game with friends and family.
About two years later, I found myself practicing more frequently with the tactics puzzles available on Lichess. Eventually, I noticed that my puzzle performance was being evaluated with a puzzle rating. From https://database.lichess.org/#puzzles:
"To determine the [puzzle] rating, each attempt to solve [a puzzle] is considered as a Glicko2 rated game between the player and the puzzle."
Basically, the higher-rated the puzzle, the more difficult it is to solve, and the higher a player's puzzle rating, the better they are at solving puzzles.
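To make the "player versus puzzle" idea concrete, the sketch below computes the expected score from the published Glicko-2 formulas, treating the puzzle as the opponent. It only shows the expected-score step, not the full rating update, and it says nothing about how Lichess configures its own implementation. The ratings and rating deviation in the usage lines are hypothetical.

```python
import math

# Scale factor from the Glicko-2 description (rating points per internal unit).
GLICKO2_SCALE = 173.7178


def g(phi):
    """Dampen the rating difference according to the opponent's deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def expected_score(player_rating, puzzle_rating, puzzle_rd):
    """Expected score (between 0 and 1) of a player attempting a puzzle.

    The puzzle plays the role of the opponent in a Glicko-2 rated game.
    """
    mu = (player_rating - 1500.0) / GLICKO2_SCALE
    mu_puzzle = (puzzle_rating - 1500.0) / GLICKO2_SCALE
    phi_puzzle = puzzle_rd / GLICKO2_SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_puzzle) * (mu - mu_puzzle)))


# Hypothetical numbers: a 2,500-rated solver against easier and harder puzzles.
easier = expected_score(2500, 2000, puzzle_rd=80)
harder = expected_score(2500, 2800, puzzle_rd=80)
print(f"easier puzzle: {easier:.2f}, harder puzzle: {harder:.2f}")
```

When the player and the puzzle have the same rating, the expected score is 0.5, which matches the intuition that a puzzle at your rating is one you solve about half the time.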
Now that a metric of performance was involved, what started as a way to pass time between classes and meetings turned into a personal challenge to see how high I could push my puzzle rating.
After about a year of taking puzzles far too seriously, on February 24, 2023, I reached my all-time high puzzle rating of 2,510 (putting me in the 96th percentile for puzzle ratings).
You'll find details behind the data collection and analysis in puzzle_journey_data_collection_processing.ipynb and lichess_puzzle_activity_history_eda.ipynb, respectively.
We perform some basic processing and analysis of the Lichess.org puzzle database. The database itself was obtained from https://database.lichess.org/#puzzles on March 22, 2023. The database contains information on Lichess puzzles including:
- Puzzle ID
- FEN (board position 1 move prior to the beginning of the puzzle)
- Moves (i.e. solution to the puzzle)
- Rating
- Rating Deviation
- Popularity
- Number of Plays
- Themes
- Game URL
- Opening Tags.
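The fields listed above can be read with a few lines of Python. This is a sketch using only the standard library rather than the pandas code in the notebooks. The column names and their order are my assumption about the CSV header, the `Puzzle` class and `parse_row` helper are hypothetical, and the sample row is invented for illustration rather than taken from the database.

```python
import csv
import io
from dataclasses import dataclass

# Assumed column order of the puzzle CSV; verify against the file's header row.
FIELDS = [
    "PuzzleId", "FEN", "Moves", "Rating", "RatingDeviation",
    "Popularity", "NbPlays", "Themes", "GameUrl", "OpeningTags",
]


@dataclass
class Puzzle:
    puzzle_id: str
    fen: str
    setup_move: str      # move played from the FEN before the puzzle begins
    solution: list       # remaining moves, starting with the solver's first move
    rating: int
    rating_deviation: int
    popularity: int
    plays: int
    themes: list
    game_url: str
    opening_tags: list


def parse_row(row):
    """Convert one CSV row (a dict keyed by FIELDS) into a Puzzle."""
    moves = row["Moves"].split()
    return Puzzle(
        puzzle_id=row["PuzzleId"],
        fen=row["FEN"],
        setup_move=moves[0],
        solution=moves[1:],
        rating=int(row["Rating"]),
        rating_deviation=int(row["RatingDeviation"]),
        popularity=int(row["Popularity"]),
        plays=int(row["NbPlays"]),
        themes=row["Themes"].split(),
        game_url=row["GameUrl"],
        opening_tags=row["OpeningTags"].split(),
    )


# Invented row for illustration only (a back-rank mate in one).
sample_csv = (
    "abc12,6k1/1p3ppp/8/8/8/8/5PPP/R5K1 b - - 0 1,b7b6 a1a8,900,80,90,1000,"
    "backRankMate endgame mate mateIn1 oneMove,https://lichess.org/examplegame,\n"
)

reader = csv.DictReader(io.StringIO(sample_csv), fieldnames=FIELDS)
puzzle = parse_row(next(reader))
print(puzzle.setup_move, puzzle.solution, puzzle.themes)
```

Because the FEN is the position one move before the puzzle starts, the first entry in the moves field is split off as the setup move and the rest is treated as the solution.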
From the documentation:
"Generating these chess puzzles took more than 50 years of CPU time. We went through 300,000,000 analysed games from the Lichess database, and re-analyzed interesting positions with Stockfish 12/13/14/15 NNUE at 40 meganodes. The resulting puzzles were then automatically tagged. To determine the rating, each attempt to solve is considered as a Glicko2 rated game between the player and the puzzle. Finally, player votes refine the tags and define popularity."
You'll find the details behind the data collection and analysis in puzzle_journey_data_collection_processing.ipynb and the Exploratory Analysis folder, respectively.
- This project would obviously not be possible if it weren't for the amazing people working on Lichess.org. You can view all of the source code for Lichess here, you can explore their open database here, and you can make a donation here.
- The chess piece image(s) used here were sourced from https://github.com/lichess-org/lila/pull/10842/commits/410d54efdd43b5c90ca8d8d8af4c588b9268f38f, created by https://github.com/caderek, distributed under license CC BY-NC-SA 4.0.