## Typeracer Data Analyzer:

A simple script that uses Pandas and Plotly to extract and visualize Typeracer.com race data

(Work in progress--more comments/explanations to come)


By Kenneth Burchfiel
Released under the MIT License


In [None]:
import plotly.express as px
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import time
from datetime import timedelta
import kaleido
import os
from IPython.display import Image
# Note regarding kaleido: the most recent version didn't work for me. However,
# specifying an older version by entering conda install python-kaleido=0.1.0
# worked great. See https://github.com/plotly/Kaleido/issues/120
from generate_screenshot import generate_screenshot

In [None]:
import plotly.io as pio
# Disable MathJax in Kaleido's image export, since these charts contain no LaTeX
pio.kaleido.scope.mathjax = None

In [None]:
# current_time = time.time() # Retrieves the current time in Unix seconds, which the API also uses. Adding 1 to this value would help ensure that races that were just completed are also included.
# current_time

The following cell reads Typeracer race data (which I downloaded from my Typeracer account) into a Pandas DataFrame, then calculates rolling and cumulative averages.

In [None]:
df_race_data = pd.read_csv('race_data.csv')
df_race_data['Last 10 Avg'] = df_race_data['WPM'].rolling(10).mean()
df_race_data['Last 100 Avg'] = df_race_data['WPM'].rolling(100).mean()
df_race_data['Last 1000 Avg'] = df_race_data['WPM'].rolling(1000).mean()
# The following line uses a list comprehension to generate a cumulative average
# of all WPM scores up to and including the current race. For each row i, the
# slice .iloc[0:i+1] selects rows 0 through i, so the current row is included.
df_race_data['cumulative_avg'] = [round(np.mean(df_race_data.iloc[0:i+1]['WPM']),3) for i in range(len(df_race_data))]
df_race_data
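The rolling columns above stay empty (NaN) until a full window of races is available, and the list comprehension recomputes the mean from scratch for every row. As a possible alternative, pandas' built-in `expanding()` should give the same cumulative average without re-slicing the DataFrame each time. The sketch below uses a small made-up DataFrame (`df_demo`) rather than the real race data, and a window of 3 instead of 10 so the effect is easy to see:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for race_data.csv
df_demo = pd.DataFrame({'Race #': [1, 2, 3, 4, 5],
                        'WPM': [90.0, 100.0, 110.0, 95.0, 105.0]})

# Rolling mean over the last 3 races. The first 2 rows are NaN because
# a full window isn't available yet.
df_demo['Last 3 Avg'] = df_demo['WPM'].rolling(3).mean()

# Cumulative average via the list comprehension approach used above
df_demo['cumulative_avg'] = [
    round(np.mean(df_demo.iloc[0:i+1]['WPM']), 3)
    for i in range(len(df_demo))]

# The same calculation using expanding(), which grows the window by
# one row at a time instead of re-slicing the DataFrame for every row
df_demo['cumulative_avg_expanding'] = (
    df_demo['WPM'].expanding().mean().round(3))
df_demo
```

On a dataset with thousands of races, the `expanding()` version should be noticeably faster, although both approaches are fine for small tables.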

Top 20 races by WPM:

In [None]:
df_race_data.sort_values('WPM', ascending = False).head(20)

Top 20 'Last 10 Avg' values:

In [None]:
df_race_data.sort_values('Last 10 Avg', ascending = False).head(20)
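For top-N lookups like the two above, `DataFrame.nlargest()` is an alternative to sorting the whole table and taking the head. A minimal sketch, again using made-up data (`df_demo`) rather than the real race file:

```python
import pandas as pd

# Hypothetical sample data standing in for race_data.csv
df_demo = pd.DataFrame({'Race #': [1, 2, 3, 4, 5],
                        'WPM': [90, 120, 110, 95, 105]})

# Retrieve the 3 rows with the highest WPM values, highest first
top_3 = df_demo.nlargest(3, 'WPM')
top_3
```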

Creating interactive charts using Plotly Express: (Note: because these charts are saved as HTML files, they won't display on GitHub; to view them, download the Jupyter notebook and run it on your computer.)

In [None]:
race_line_plot = px.line(df_race_data, x = 'Race #', y = ['WPM', 'Last 10 Avg', 'Last 100 Avg', 'Last 1000 Avg', 'cumulative_avg'])
race_line_plot.write_html('html_output/race_line_plot.html')
race_line_plot
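The cell above saves to `html_output/`, and later cells save to `png_output/`. If those folders don't exist yet, the save calls will likely fail, so it may help to create them first. A minimal sketch (the folder names come from this notebook; the temporary base folder is an assumption made only so that this example doesn't touch your project directory):

```python
import os
import tempfile

# Hypothetical base folder; in the notebook this would be os.getcwd()
base_dir = tempfile.mkdtemp()

for folder in ['html_output', 'png_output']:
    # exist_ok=True makes this safe to rerun when the folder already exists
    os.makedirs(os.path.join(base_dir, folder), exist_ok=True)

created = sorted(os.listdir(base_dir))
print(created)
```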

Generating a static version of this chart using Kaleido:

In [None]:
image_width = 3000 # Interestingly, when I tried setting the image width to 3840 (i.e. UHD resolution), the x axis did not line up properly with the chart.
image_height = image_width * 9/16
race_line_plot.write_image('png_output/race_line_plot_using_kaleido.png', width = image_width, height = image_height, engine = 'kaleido')
# See https://plotly.com/python/static-image-export/
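Note that `image_width * 9/16` produces a float (1687.5 for a width of 3000). If whole-pixel dimensions are preferred, one option is to round the result explicitly. The helper name `height_for_16_9` below is hypothetical and not part of Plotly or Kaleido:

```python
def height_for_16_9(width):
    """Return a whole-pixel height that gives `width` a 16:9 aspect ratio."""
    return round(width * 9 / 16)

image_width = 3000
image_height = height_for_16_9(image_width)
print(image_width, image_height)
```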

Here's a copy of the image:

In [None]:
Image(filename = 'png_output/race_line_plot_using_kaleido.png')

An alternate method of generating the screenshot, using Selenium and a web browser (the output is quite similar):

In [None]:
generate_screenshot(
    path_to_html = os.path.join(os.getcwd(), 'html_output'),
    html_name = 'race_line_plot.html',
    path_to_image = os.path.join(os.getcwd(), 'png_output'),
    image_name = 'race_line_plot_using_generate_screenshot',
    image_extension = '.png',
    window_width = 3000)
# See https://docs.python.org/3/library/os.path.html for the use of os.path.join().