Data collection of rating and ranking of chess players over time

JGravier/chessplayers

Chess players rating and ranking

License: CC-BY

Objectives

This repo aims to compile different datasets related to chess players’ ratings and rankings over time. The data are extracted from several sources:

  1. From 1851 to September 2001 (annual, biannual, quarterly, monthly and weekly snapshots): scraped from the old Chessmetrics website created by Jeff Sonas. Ratings are computed with the Chessmetrics method. The output is stored in .csv format.

  2. From September 2001 to December 2004 (monthly snapshots): scraped from the new Chessmetrics website created by Jeff Sonas. Ratings are computed with the Chessmetrics method. The output is stored in .csv format.

  3. From January 2001 to December 2019 (quarterly and monthly snapshots): forked from FIDE Data Pull, created by Anuj Dahiya in 2022 and based on International Chess Federation (FIDE) ratings. Ratings are computed with the Elo rating system. The .csv files of players' standard ratings are compiled into .parquet format by the compilationcsv.R script (the .parquet output is larger than 500 MB and is not stored in the git repository).

Scraping details for 1851-2001

For each rating list page, the second table is selected, then the date of the list is added together with a ranking r = 1, 2, 3, …, n derived from the ratings at that date. Example: for the December 31, 1851 list, the table is scraped with the CSS selector:

body > font:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(2) > div:nth-child(5) > center:nth-child(1) > table:nth-child(4) > tbody:nth-child(1)

The output looks like:

## # A tibble: 241,118 × 5
##    Player             Rating   Age dateranking    ranking
##    <chr>               <int> <dbl> <chr>            <int>
##  1 Kasparov, Garry K    2884  37.0 April 10, 2000       1
##  2 Anand, Viswanathan   2796  30.3 April 10, 2000       2
##  3 Kramnik, Vladimir    2793  24.8 April 10, 2000       3
##  4 Shirov, Alexei       2778  27.8 April 10, 2000       4
##  5 Leko, Peter          2765  20.6 April 10, 2000       5
##  6 Topalov, Veselin     2746  25.1 April 10, 2000       6
##  7 Ivanchuk, Vassily    2738  31.1 April 10, 2000       7
##  8 Adams, Michael       2736  28.4 April 10, 2000       8
##  9 Gelfand, Boris       2731  31.8 April 10, 2000       9
## 10 Kamsky, Gata         2716  25.9 April 10, 2000      10
## # … with 241,108 more rows
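
The ranking step described above (r = 1, 2, 3, …, n from the ratings of each list date) can be illustrated with a minimal sketch. The repository's own scripts are written in R; the Python version below is only an illustration, and the function name `add_ranking` and the dict-based row layout are assumptions, not part of the repo. It assumes each row carries the `Player`, `Rating` and `dateranking` columns shown in the output.

```python
from collections import defaultdict


def add_ranking(rows):
    """Return copies of rows with a 'ranking' key: 1..n per date, highest rating first."""
    # Group the rows by list date so each date gets its own 1..n ranking.
    by_date = defaultdict(list)
    for row in rows:
        by_date[row["dateranking"]].append(row)

    ranked = []
    for group in by_date.values():
        # sorted() is stable, so players with equal ratings keep their source order.
        ordered = sorted(group, key=lambda r: r["Rating"], reverse=True)
        for position, row in enumerate(ordered, start=1):
            ranked.append({**row, "ranking": position})
    return ranked


# Three rows taken from the sample output, deliberately shuffled.
sample = [
    {"Player": "Kramnik, Vladimir", "Rating": 2793, "dateranking": "April 10, 2000"},
    {"Player": "Kasparov, Garry K", "Rating": 2884, "dateranking": "April 10, 2000"},
    {"Player": "Anand, Viswanathan", "Rating": 2796, "dateranking": "April 10, 2000"},
]
ranked = add_ranking(sample)
for row in ranked:
    print(row["ranking"], row["Player"], row["Rating"])
```

How ties are broken in the actual dataset is not stated in this README; the sketch simply keeps the order in which tied players appear in the source.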

Scraping details for 2001-2004

For each rating list page, the table is selected, then the date of the list is added together with a ranking r = 1, 2, 3, …, n derived from the ratings at that date. Example: for the January 2001 list, the table is scraped with the CSS selector:

body > form:nth-child(1) > table:nth-child(4)

The output looks like:

## # A tibble: 4,800 × 5
##    Player               Rating Age    dateranking ranking
##    <chr>                 <int> <chr>        <int>   <int>
##  1 Garry Kasparov         2850 37y9m       200101       1
##  2 Viswanathan Anand      2820 31y1m       200101       2
##  3 Vladimir Kramnik       2815 25y7m       200101       3
##  4 Peter Leko             2768 21y4m       200101       4
##  5 Alexander Morozevich   2757 23y6m       200101       5
##  6 Alexei Shirov          2750 28y6m       200101       6
##  7 Vassily Ivanchuk       2749 31y10m      200101       7
##  8 Michael Adams          2743 29y2m       200101       8
##  9 Evgeny Bareev          2739 34y2m       200101       9
## 10 Boris Gelfand          2738 32y7m       200101      10
## # … with 4,790 more rows
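
The two scraped outputs encode dates and ages differently: `dateranking` is text such as `April 10, 2000` in the first dataset and an integer such as `200101` in the second, while `Age` is a decimal number in the first and text such as `37y9m` in the second. The sketch below shows one possible way to bring them to a common form. It is written in Python for illustration only; the helper names are hypothetical, and mapping a monthly list to the first day of its month is an assumption, not something the repo specifies.

```python
import re
from datetime import date, datetime


def parse_long_date(text):
    """Parse a date such as 'April 10, 2000' (1851-2001 dataset)."""
    # %B matches English month names under Python's default "C" locale.
    return datetime.strptime(text, "%B %d, %Y").date()


def parse_yyyymm(code):
    """Parse an integer such as 200101 (2001-2004 dataset)."""
    year, month = divmod(int(code), 100)
    # Assumption: a monthly list is represented by the first day of its month.
    return date(year, month, 1)


def age_to_years(text):
    """Convert an age such as '37y9m' to decimal years."""
    match = re.fullmatch(r"(\d+)y(\d+)m", text)
    if match is None:
        raise ValueError(f"unexpected age format: {text!r}")
    years, months = map(int, match.groups())
    return years + months / 12


print(parse_long_date("April 10, 2000"))
print(parse_yyyymm(200101))
print(age_to_years("37y9m"))
```

Raising an error on an unexpected age string is a deliberate choice in this sketch, so that malformed values are noticed instead of being silently converted.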
