# The chessable dataset
One of the most interesting platforms for learning chess interactively is [Chessable](https://www.chessable.com). The platform offers courses in which learners get direct feedback on what they are doing, either as text or as video.

The platform has recently been bought by chess world champion Magnus Carlsen to support the ongoing quest to digitalize and professionalize chess. As of 30-07-2020 there are 376 courses available on Chessable. I scraped the details of all these courses using Selenium; the raw file loaded below has 380 rows because it also contains 4 bundles, which are removed later.

In [1]:
import numpy as np 
import pandas as pd

pd.set_option('display.max_columns', 500)
chessable = pd.read_csv("chessable_raw_30072020.csv")
print(chessable.shape)
chessable.head(10)

(380, 24)


Unnamed: 0.1,Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
0,0,https://www.chessable.com/short-sweet-legendar...,Short & Sweet: Legendary Tactics,MoveTrainer™ Tactics/Strategy course,GM GM Pascal Charbonneau,0,,4.7,30.0,14,both pieces,1.0,1.0,1.0,1.0,1.0,English,1212,1 hour and 59 minutes,24.0,9.29,"Jul 24, 2020",High,🎓 Titled player
1,1,https://www.chessable.com/chessable-masters-to...,Chessable Masters Tournament,MoveTrainer™ Tactics/Strategy course,Chessable,0,,4.5,88.0,59,both pieces,0.0,1.0,1.0,1.0,0.0,English,126,4 hours and 31 minutes,65.0,16.36,"Jun 21, 2020",High,🎓 Titled player
2,2,https://www.chessable.com/short-sweet-the-blac...,Short & Sweet: The Black Lion,MoveTrainer™ Opening course,GM GingerGM,0,,4.7,198.0,119,black pieces,1.0,1.0,1.0,0.0,0.0,English,6210,1 hour and 5 minutes,24.0,13.63,"Mar 11, 2020",High,🎓 Titled player
3,3,https://www.chessable.com/short-and-sweet-jan-...,Short and Sweet: Jan Gustafsson's 1. e4 e5,MoveTrainer™ Opening course,Chessable,0,,4.6,242.0,123,black pieces,1.0,1.0,1.0,0.0,0.0,English,9820,,37.0,17.0,"Dec 13, 2019",High,🌎 Community
4,4,https://www.chessable.com/short-sweet-queens-g...,Short & Sweet: Queen's Gambit Declined,MoveTrainer™ Opening course,GM Alex Colovic,0,,4.4,73.0,44,black pieces,1.0,1.0,1.0,0.0,0.0,English,3932,38 minutes,20.0,15.83,"Jun 19, 2020",High,🎓 Titled player
5,5,https://www.chessable.com/short-sweet-1d4/cour...,Short & Sweet: 1.d4,MoveTrainer™ Opening course,FM Daniel Barrish,0,,4.4,128.0,56,white pieces,1.0,1.0,1.0,0.0,0.0,English,20566,1 hour and 28 minutes,31.0,16.1,"May 06, 2020",Low,🎓 Titled player
6,6,https://www.chessable.com/short-sweet-accelera...,Short & Sweet: Accelerated Queen's Indian Defense,MoveTrainer™ Opening course,"FM Yuriy Krykun (2506 USCF, 2382 FIDE)",0,,4.5,82.0,45,black pieces,1.0,1.0,1.0,0.0,0.0,English,8420,1 hour and 7 minutes,20.0,9.92,"May 13, 2020",High,🎓 Titled player
7,7,https://www.chessable.com/endgame-bootcamp-wit...,Endgame Bootcamp with John Bartholomew,MoveTrainer™ Endgame course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0,,4.6,389.0,236,both pieces,1.0,1.0,1.0,0.0,0.0,English,2860,39 minutes,19.0,5.05,"Jan 21, 2020",High,🎓 Titled player
8,8,https://www.chessable.com/basic-checkmate-patt...,Basic Checkmate Patterns,MoveTrainer™ Tactics course,"CraftyRaf & IM John Bartholomew (2534 USCF, 24...",0,,4.5,120.0,41,both pieces,1.0,1.0,1.0,0.0,0.0,English,658,1 hour and 14 minutes,34.0,2.76,"Jun 30, 2020",High,🎓 Titled player
9,9,https://www.chessable.com/short-sweet-the-caro...,Short & Sweet: The Caro-Kann,MoveTrainer™ Opening course,GM erwinlami (2620 FIDE),0,,4.6,36.0,38,black pieces,1.0,1.0,1.0,0.0,0.0,English,10833,1 hour and 18 minutes,25.0,14.85,"Jul 08, 2020",High,🎓 Titled player


In [2]:
chessable.course_type.value_counts()

MoveTrainer™ Opening course             202
MoveTrainer™ Tactics course              86
MoveTrainer™ Tactics/Strategy course     21
MoveTrainer™ Strategy course             15
MoveTrainer™ Endgame course              15
MoveTrainer™ Strategy/Tactics course     14
MoveTrainer™ Opening/Tactics course      10
MoveTrainer™ Tactics/Endgame course       8
MoveTrainer™ Bundle                       4
MoveTrainer™ Endgame/Tactics course       2
MoveTrainer™ Strategy/Opening course      1
MoveTrainer™ Strategy/Endgame course      1
MoveTrainer™ Tactics/Opening course       1
Name: course_type, dtype: int64

In [3]:
chessable.price.max()  # This column needs cleaning: prices are strings, so max() compares them as text

'€9.50'
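
The maximum above is the result of a string comparison, not a numeric one: `'9'` sorts after `'1'` or `'8'`, so `'€9.50'` beats any two-digit price. A minimal sketch on a made-up sample (the values are illustrative, and the assumption that free courses are stored as a plain `"0"` is inferred from the preview above):

```python
import pandas as pd

# Illustrative sample, not the real column
raw_prices = pd.Series(["0", "€9.50", "€13.99", "€87.98"])

# As strings, max() compares character by character
string_max = raw_prices.max()

# Strip the currency symbol literally (no regex), then convert to numbers
numeric_prices = pd.to_numeric(raw_prices.str.replace("€", "", regex=False))
numeric_max = numeric_prices.max()

print(string_max, numeric_max)
```

Once the column is numeric, `max()` and other aggregations behave as expected.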

In [4]:
chessable.section.unique()

array(['🎓 Titled player', '🌎 Community', '📚 Publisher', nan], dtype=object)

In [5]:
chessable.language.value_counts(dropna=False)

English    328
NaN         50
Polish       2
Name: language, dtype: int64

In [6]:
# Strip the "https://www.chessable.com/" prefix: 25 characters plus the trailing slash
chessable.course_link = chessable.course_link.str[26:]

In [7]:
len("https://www.chessable.com")

25
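
The slice `str[26:]` depends on the base URL being 25 characters long plus one trailing slash. A sketch of a variant that derives the offset from the prefix itself and leaves values without the prefix untouched; `BASE_URL` and `strip_prefix` are names introduced here for illustration only:

```python
import pandas as pd

BASE_URL = "https://www.chessable.com/"  # 25 characters plus the trailing slash


def strip_prefix(links: pd.Series, prefix: str = BASE_URL) -> pd.Series:
    """Remove `prefix` from values that start with it; keep other values as they are."""
    has_prefix = links.str.startswith(prefix, na=False)
    # where() keeps the original value when the condition is True, else takes the slice
    return links.where(~has_prefix, links.str[len(prefix):])


links = pd.Series([
    "https://www.chessable.com/evans-gambit/course/6938/",
    "evans-gambit/course/6938/",  # already stripped
])
stripped = strip_prefix(links)
print(stripped.tolist())
```

This makes the cell safe to run twice, which a fixed slice is not.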

In [8]:
chessable = chessable.drop('Unnamed: 0',axis=1)

In [9]:
chessable.price = chessable.price.str.replace('€', '', regex=False).astype('float')

In [10]:
chessable.price_with_video = chessable.price_with_video.str.replace('€', '', regex=False).astype('float')

In [11]:
chessable.course_rating = chessable.course_rating.astype('float')

In [12]:
float_cols = ["course_rating_count", "trainable_variations", "avg_line_depth"]
chessable[float_cols] = chessable[float_cols].astype('float')

In [13]:
# Run only after fillna: integer columns cannot hold NaN, so this cast fails while values are missing.
# chessable[["beginning","casual","intermediate","advanced","expert", "instruction_word_count", "trainable_variations"]] = chessable[["beginning","casual","intermediate","advanced","expert", "instruction_word_count", "trainable_variations"]].astype('int')
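
The integer cast has to wait because NumPy integer columns cannot represent `NaN`. A small sketch on a toy column (not the real data) showing the failure and two ways around it: filling first, as done further down, or pandas' nullable `Int64` dtype, which keeps the gaps as `<NA>`:

```python
import pandas as pd

flags = pd.Series([1.0, 0.0, None])  # toy stand-in for a skill-level column

try:
    flags.astype("int")
    cast_failed = False
except ValueError:
    # pandas refuses to convert NaN to a plain integer
    cast_failed = True

filled = flags.fillna(0).astype("int")  # missing values become 0
nullable = flags.astype("Int64")        # missing values stay <NA>

print(cast_failed, filled.tolist(), nullable.isna().tolist())
```

Filling with 0 is simpler to work with downstream, while `Int64` preserves the information that a value was never recorded.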

In [14]:
# Drop the leading emoji and the space after it, e.g. '🎓 Titled player' -> 'Titled player'
chessable.section = chessable.section.str[2:]
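
Slicing two characters works here because each label starts with a single-code-point emoji followed by one space. A hedged alternative that does not depend on the emoji's length is to split on the first space; the sample below reuses the labels shown by `unique()` above:

```python
import pandas as pd

sections = pd.Series(["🎓 Titled player", "🌎 Community", "📚 Publisher", None])

# Split once on the first space and keep the part after it; missing values stay missing
labels = sections.str.split(" ", n=1).str[1]

print(labels.tolist())
```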

In [15]:
chessable[chessable.beginning.isnull()]

Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
54,evans-gambit/course/6938/,Evans Gambit,MoveTrainer™ Opening course,Frank_S,0.0,,4.2,65.0,54,white pieces,,,,,,English,63.0,,29.0,8.81,"Jun 29, 2018",Unspecified,Community
56,moving-chess-pieces/course/3361/,Moving Chess Pieces,MoveTrainer™ Strategy course,simplydt,0.0,,3.9,29.0,22,white pieces,,,,,,,0.0,,4.0,3.0,,Unspecified,Community
70,sicilian-sacrifices-20-annotated-games/course/...,Sicilian Sacrifices: 20 annotated games,MoveTrainer™ Opening course,"NM logozar (2588 Online Rating, 2226 USCF, 205...",0.0,,4.6,51.0,73,white pieces,,,,,,English,9225.0,,29.0,11.67,"Jan 24, 2018",High,Titled player
101,chessable-tutorial-openings/course/3041/,Chessable Tutorial Openings,MoveTrainer™ Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.5,1486.0,86,white pieces,,,,,,,0.0,,3.0,6.0,,Unspecified,Titled player
375,the-mad-mans-1-d4-bundle/course/38268/,The Mad Man's 1. d4 Bundle,MoveTrainer™ Bundle,"FM Kamil Plichta (3129 Online Rating, 2361 FIDE)",54.98,49.99,,,1,,,,,,,,,,,,,,
376,the-complete-najdorf-bundle/course/38045/,The Complete Najdorf Bundle,MoveTrainer™ Bundle,GM Alex Colovic,55.98,49.99,,,21,,,,,,,,,,,,,,
378,the-kings-indian-attack-defense-bundle/course/...,The King's Indian Attack & Defense Bundle,MoveTrainer™ Bundle,"FM Kamil Plichta (3129 Online Rating, 2361 FIDE)",64.98,59.99,,,1,,,,,,,,,,,,,,
379,the-complete-1-d4-bundle/course/38039/,The Complete 1. d4 Bundle,MoveTrainer™ Bundle,& FM Daniel Barrish IM Chessexplained,87.98,397.96,,,2,,,,,,,,,,,,,,


In [16]:
len("MoveTrainer™")

12

In [17]:
# Drop the "MoveTrainer™ " prefix: 12 characters plus the space that follows
chessable.course_type = chessable.course_type.str[13:]
chessable.course_type.value_counts()

Opening course             202
Tactics course              86
Tactics/Strategy course     21
Strategy course             15
Endgame course              15
Strategy/Tactics course     14
Opening/Tactics course      10
Tactics/Endgame course       8
Bundle                       4
Endgame/Tactics course       2
Strategy/Opening course      1
Tactics/Opening course       1
Strategy/Endgame course      1
Name: course_type, dtype: int64

In [18]:
# Remove the 4 bundles; .copy() avoids SettingWithCopyWarning in the assignments below
chessable = chessable[chessable['course_type'] != 'Bundle'].copy()

In [19]:
chessable.language = chessable.language.fillna('No language')

In [20]:
chessable.shape

(376, 23)

In [21]:
chessable.released_on.value_counts(dropna=False)

NaN             10
Jun 24, 2020     5
Apr 06, 2020     5
Oct 30, 2019     4
Sep 25, 2019     4
                ..
Feb 09, 2018     1
Aug 14, 2016     1
Sep 19, 2018     1
May 04, 2020     1
Oct 15, 2018     1
Name: released_on, Length: 303, dtype: int64

In [22]:
chessable.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 376 entries, 0 to 377
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   course_link             376 non-null    object 
 1   course_title            376 non-null    object 
 2   course_type             376 non-null    object 
 3   author                  376 non-null    object 
 4   price                   376 non-null    float64
 5   price_with_video        264 non-null    float64
 6   course_rating           365 non-null    float64
 7   course_rating_count     365 non-null    float64
 8   rubies                  376 non-null    object 
 9   target_color            376 non-null    object 
 10  beginning               372 non-null    float64
 11  casual                  372 non-null    float64
 12  intermediate            372 non-null    float64
 13  advanced                372 non-null    float64
 14  expert                  372 non-null    float64

In [23]:
chessable[chessable.beginning.isnull()]

Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
54,evans-gambit/course/6938/,Evans Gambit,Opening course,Frank_S,0.0,,4.2,65.0,54,white pieces,,,,,,English,63,,29.0,8.81,"Jun 29, 2018",Unspecified,Community
56,moving-chess-pieces/course/3361/,Moving Chess Pieces,Strategy course,simplydt,0.0,,3.9,29.0,22,white pieces,,,,,,No language,0,,4.0,3.0,,Unspecified,Community
70,sicilian-sacrifices-20-annotated-games/course/...,Sicilian Sacrifices: 20 annotated games,Opening course,"NM logozar (2588 Online Rating, 2226 USCF, 205...",0.0,,4.6,51.0,73,white pieces,,,,,,English,9225,,29.0,11.67,"Jan 24, 2018",High,Titled player
101,chessable-tutorial-openings/course/3041/,Chessable Tutorial Openings,Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.5,1486.0,86,white pieces,,,,,,No language,0,,3.0,6.0,,Unspecified,Titled player


In [24]:
chessable[["beginning","casual","intermediate","advanced","expert"]] = chessable[["beginning","casual","intermediate","advanced","expert"]].fillna(0).astype('int')

In [25]:
chessable[chessable.trainable_variations.isnull()]

Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
194,visualise-1/course/25695/,Visualise 1,Opening course,Benedictine (135 ECF),13.99,13.99,4.7,170.0,219,both pieces,1,1,1,0,0,No language,2809,,,5.24,"Jul 31, 2019",High,Community
195,visualise-3/course/27269/,Visualise 3,Opening course,Benedictine (135 ECF),13.99,13.99,4.8,14.0,10,both pieces,1,1,1,1,0,No language,9568,,,11.08,"Oct 30, 2019",High,Community
196,visualise-4/course/32910/,Visualise 4,Opening course,Benedictine (135 ECF),13.99,13.99,4.5,15.0,36,both pieces,1,1,1,1,0,No language,6209,,,5.3,"Feb 24, 2020",High,Community
197,visualise-2/course/26550/,Visualise 2,Opening course,Benedictine (135 ECF),13.99,13.99,4.5,37.0,69,both pieces,0,1,1,1,0,No language,9955,,,5.76,"Aug 28, 2019",High,Community
198,visualise-5/course/35551/,Visualise 5,Opening course,Benedictine (135 ECF),13.99,13.99,,,0,both pieces,1,1,1,1,0,English,6677,,,5.06,"Jul 13, 2020",High,Community
287,the-100-endgames-you-must-know-workbook-practi...,The 100 Endgames You Must Know Workbook: Pract...,Endgame course,GM Jesús de la Villa & New in Chess,27.99,27.99,4.7,51.0,265,both pieces,0,1,1,1,1,English,57954,,,6.59,"Jul 01, 2019",Community,Publisher
292,endgame-virtuoso-magnus-carlsen/course/25442/,Endgame Virtuoso Magnus Carlsen,Endgame course,IM Tibor Karolyi & New in Chess,26.5,26.5,4.6,12.0,35,both pieces,0,0,1,1,0,English,93748,,,11.2,"Jul 29, 2019",Community,Publisher


In [26]:
chessable.trainable_variations = chessable.trainable_variations.fillna(0)

In [27]:
chessable[chessable.released_on.isnull()]  # Keep these NaNs: any release date filled in here would be arbitrary

Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
48,im-john-bartholomews-1d4-repertoire-for-white/...,IM John Bartholomew's 1.d4 Repertoire for White,Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.8,719.0,948,white pieces,0,0,1,0,0,English,6049,,45.0,10.2,,Unspecified,Titled player
50,essential-rp-vs-r-endings/course/90/,Essential R+P vs. R Endings,Endgame course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.7,525.0,499,white pieces,1,1,1,1,0,English,291,,11.0,6.0,,Unspecified,Titled player
56,moving-chess-pieces/course/3361/,Moving Chess Pieces,Strategy course,simplydt,0.0,,3.9,29.0,22,white pieces,0,0,0,0,0,No language,0,,4.0,3.0,,Unspecified,Community
65,11-opening-traps-with-1e4/course/570/,11 Opening Traps with 1.e4,Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.6,269.0,219,white pieces,0,0,1,0,0,English,2362,,10.0,8.82,,Unspecified,Titled player
67,tutorial-fight-like-magnus/course/30270/,Tutorial: Fight Like Magnus,Opening course,IM Chessexplained,0.0,,4.5,99.0,12,black pieces,1,0,0,0,0,English,1155,,4.0,6.0,,High,Community
76,im-john-bartholomews-scandinavian-free-version...,IM John Bartholomew's Scandinavian (FREE version),Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.8,475.0,525,black pieces,0,0,1,1,0,English,3255,,56.0,10.26,,Unspecified,Titled player
77,centre-game/course/872/,Centre game,Opening course,"Evilgenius0070 (2230 Online Rating, 2050 USCF,...",0.0,,3.8,28.0,20,black pieces,0,0,1,0,0,English,430,,5.0,10.8,,Unspecified,Community
101,chessable-tutorial-openings/course/3041/,Chessable Tutorial Openings,Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",0.0,,4.5,1486.0,86,white pieces,0,0,0,0,0,No language,0,,3.0,6.0,,Unspecified,Titled player
106,chessable-tutorial-openings/course/3317/,Chessable Tutorial Openings,Opening course,simplydt,0.0,,3.8,17.0,8,white pieces,1,0,0,0,0,No language,0,,3.0,2.67,,Unspecified,Community
208,im-john-bartholomews-scandinavian/course/79/,IM John Bartholomew's Scandinavian,Opening course,"IM John Bartholomew (2534 USCF, 2446 FIDE)",15.75,15.75,4.8,191.0,285,black pieces,0,0,1,1,0,English,8208,,142.0,12.33,,High,Titled player
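
Keeping the missing release dates does not prevent turning the column into real datetimes: `pd.to_datetime` maps missing values to `NaT`. A sketch on a few values in the same style as `"Jul 24, 2020"`; the `format` string is an assumption based on the values shown above, and `%b` expects English month abbreviations:

```python
import pandas as pd

released = pd.Series(["Jul 24, 2020", "Aug 14, 2016", None])

# %b = abbreviated month name, %d = day of month, %Y = four-digit year
parsed = pd.to_datetime(released, format="%b %d, %Y")

# Aggregations such as min() skip NaT by default
earliest = parsed.min()

print(parsed.isna().tolist(), earliest)
```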


In [28]:
chessable[chessable.course_rating.isnull()]

Unnamed: 0,course_link,course_title,course_type,author,price,price_with_video,course_rating,course_rating_count,rubies,target_color,beginning,casual,intermediate,advanced,expert,language,instruction_word_count,free_video,trainable_variations,avg_line_depth,released_on,support_level,section
116,abrahams-advanced-rauzer/course/3026/,Abraham's (Advanced) Rauzer,Opening course,Mr.Abraham (2054 Online Rating),3.25,3.25,,,0,black pieces,0,0,1,0,0,English,5373,,23.0,19.78,"Sep 06, 2016",Unspecified,Community
192,szybki-kurs-obrony-francuskiej/course/15651/,Szybki kurs obrony francuskiej,Opening course,"FM dczerw (2358 Online Rating, 2286 FIDE)",12.75,12.75,,,0,black pieces,1,1,1,0,0,Polish,3811,,62.0,13.93,"Oct 09, 2018",High,Titled player
198,visualise-5/course/35551/,Visualise 5,Opening course,Benedictine (135 ECF),13.99,13.99,,,0,both pieces,1,1,1,1,0,English,6677,,0.0,5.06,"Jul 13, 2020",High,Community
205,conquer-the-caro-kann-the-shirov-attack/course...,Conquer the Caro-Kann: The Shirov Attack,Opening course,DrErykHargrove (2110 USCF),13.99,13.99,,,0,white pieces,0,1,1,1,1,English,19306,,79.0,15.16,"Jun 24, 2020",High,Community
223,antidotes-to-anti-dutch-systems/course/35481/,Antidotes to anti-Dutch systems,Opening course,"till (2495 Online Rating, 2205 FIDE)",21.99,16.99,,,90,black pieces,0,0,1,1,1,English,50008,,301.0,11.49,"Jun 24, 2020",High,Community
242,chess-tests/course/35171/,Chess Tests,Tactics course,Russell Enterprises,25.95,17.95,,,8,both pieces,0,0,1,1,1,No language,65863,,226.0,6.89,"Jul 27, 2020",Community,Publisher
243,the-complete-open-sicilian-for-white-vol-1/cou...,The Complete Open Sicilian for White - Vol. 1,Opening course,"NM mn79 (2215 USCF, 2115 FIDE)",25.99,17.99,,,0,white pieces,0,0,1,1,1,No language,42023,,252.0,14.67,"Jul 29, 2020",High,Titled player
298,chess-lessons-solving-problems-avoiding-mistak...,Chess Lessons: Solving Problems & Avoiding Mis...,Tactics course,Russell Enterprises,26.99,26.99,,,46,both pieces,0,0,1,1,1,English,189541,,298.0,8.95,"Jun 12, 2020",High,Publisher
312,sabotage-the-slav/course/37224/,Sabotage the Slav,Opening course,GM Boris Avrukh,26.99,149.98,,,7,white pieces,0,0,1,1,1,English,26265,1 hour and 20 minutes,329.0,16.8,"Jul 17, 2020",High,Titled player
332,gambit-killer/course/24718/,Gambit Killer,Opening/Tactics course,GM Ivan Salgado Lopez & Thinkers Publishing,31.5,31.5,,,10,both pieces,0,1,1,1,0,English,52811,,221.0,14.4,"Jul 15, 2019",Community,Publisher


In [29]:
chessable[['course_rating','course_rating_count']] = chessable[['course_rating','course_rating_count']].fillna(0)

In [30]:
chessable[['course_rating_count','trainable_variations']] = chessable[['course_rating_count','trainable_variations']].astype('int')

In [31]:
chessable.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 376 entries, 0 to 377
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   course_link             376 non-null    object 
 1   course_title            376 non-null    object 
 2   course_type             376 non-null    object 
 3   author                  376 non-null    object 
 4   price                   376 non-null    float64
 5   price_with_video        264 non-null    float64
 6   course_rating           376 non-null    float64
 7   course_rating_count     376 non-null    int32  
 8   rubies                  376 non-null    object 
 9   target_color            376 non-null    object 
 10  beginning               376 non-null    int32  
 11  casual                  376 non-null    int32  
 12  intermediate            376 non-null    int32  
 13  advanced                376 non-null    int32  
 14  expert                  376 non-null    int32  

In [32]:
chessable.to_csv('chessable_clean_30072020.csv')
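
By default `to_csv` also writes the index as an unnamed first column, which is the likely origin of the `Unnamed: 0` column dropped earlier in this notebook. A sketch with a toy frame and in-memory buffers that compares the default with `index=False`:

```python
import io

import pandas as pd

toy = pd.DataFrame({"course_title": ["Evans Gambit"], "price": [0.0]})

with_index = io.StringIO()
toy.to_csv(with_index)  # index is written as an unnamed first column
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()

without_index = io.StringIO()
toy.to_csv(without_index, index=False)  # only the data columns are written
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()

print(cols_with_index)
print(cols_without_index)
```

Passing `index=False` when saving, or `index_col=0` when loading, keeps the extra column out of later analysis.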