# Type Through The Bible

By Kenneth Burchfiel

Code is released under the MIT license; Bible verses are from the World English Bible (Catholic Edition)* and are in the public domain.

\* Genesis was not found within the original WEB Catholic Edition folder, so I copied in files from another World English Bible edition instead. I imagine, but am not certain, that these files are the same as the actual Catholic Edition Genesis files.

## More documentation to come!

Next steps: (Not necessarily in order of importance)

* See if there's a way to alert the user (perhaps via a sound?) when a character is typed incorrectly.
* Improve chart formatting (e.g. add titles, legend names, etc.)
* Add in more documentation

In [1]:
import pandas as pd
pd.set_option('display.max_columns', 1000)
import time
import plotly.express as px
from getch import getch # Installed this library using pip install py-getch, not
# pip install getch. See https://github.com/joeyespo/py-getch
import numpy as np
from datetime import datetime, date, timezone # Based on 
# https://docs.python.org/3/library/datetime.html

In [2]:
extra_analyses = False

Checking whether the program is currently running within a Jupyter notebook:

(The program normally uses getch() to begin typing tests; however, I wasn't able to enter input after getch() got called within a Jupyter notebook and thus couldn't begin a typing test in that situation. Therefore, the program will use input() instead of getch() to start tests when running within a notebook.)

In [3]:
# The following method of determining whether the code is running
# within a Jupyter notebook is based on Gustavo Bezerra's response
# at https://stackoverflow.com/a/39662359/13097194 . I found that
# just calling get_ipython() was sufficient, at least on Windows and within
# Visual Studio Code; his answer is more complex.

try: 
    get_ipython()
    run_on_notebook = True
except NameError: # get_ipython() isn't defined outside of IPython/Jupyter
    run_on_notebook = False

# print(run_on_notebook)
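
For reference, the fuller check described in the linked Stack Overflow answer inspects the shell's class name instead of only confirming that `get_ipython()` exists. Below is a minimal sketch of that idea. The function name `running_in_notebook` is hypothetical and not part of this program, and the class name `ZMQInteractiveShell` is an assumption based on what notebook kernels commonly report.

```python
def running_in_notebook():
    """Return True only when the code appears to run in a Jupyter kernel.

    Assumption: notebook kernels report the shell class name
    'ZMQInteractiveShell', whereas a terminal IPython session reports
    a different name and plain Python doesn't define get_ipython() at all.
    """
    try:
        shell_name = get_ipython().__class__.__name__
    except NameError:  # Plain Python interpreter
        return False
    return shell_name == 'ZMQInteractiveShell'


run_on_notebook_check = running_in_notebook()
```

This variant would let a terminal IPython session keep using `getch()`, since only notebook kernels would be flagged.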

In [4]:
df_Bible = pd.read_csv('WEB_Catholic_Version_for_game_updated.csv')
df_Bible

Unnamed: 0,Book_Order,Book_Name,Chapter_Name,Book_and_Chapter,Chapter_Order,Verse_#,Verse_Order,Verse,Characters,Typed,Tests,Fastest_WPM,Characters_Typed,Total_Characters_Typed,Count
0,1,GEN,1,GEN 1,1,1,1,"In the beginning, God created the heavens and ...",56,1,8,193.251509,56,448,1
1,1,GEN,1,GEN 1,1,2,2,The earth was formless and empty. Darkness was...,135,1,2,140.461827,135,270,1
2,1,GEN,1,GEN 1,1,3,3,"God said, ""Let there be light,"" and there was ...",52,1,1,84.008493,52,52,1
3,1,GEN,1,GEN 1,1,4,4,"God saw the light, and saw that it was good. G...",85,1,1,123.198378,85,85,1
4,1,GEN,1,GEN 1,1,5,5,"God called the light ""day"", and the darkness h...",119,1,1,135.664272,119,119,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35374,95,REV,22,REV 22,1328,17,35375,"The Spirit and the bride say, ""Come!"" He who h...",160,0,0,,0,0,1
35375,95,REV,22,REV 22,1328,18,35376,I testify to everyone who hears the words of t...,159,0,0,,0,0,1
35376,95,REV,22,REV 22,1328,19,35377,If anyone takes away from the words of the boo...,174,0,0,,0,0,1
35377,95,REV,22,REV 22,1328,20,35378,"He who testifies these things says, ""Yes, I am...",89,0,0,,0,0,1


In [5]:
df_results = pd.read_csv('results.csv', index_col='Test_Number')
df_results

Test_Number,Unix_Start_Time,Local_Start_Time,UTC_Start_Time,Characters,Seconds,CPS,WPM,Book,Chapter,Verse_Order,Verse,Verse #,Last 10 Avg,Last 100 Avg,Last 1000 Avg,cumulative_avg,Local_Year,Local_Month,Local_Hour,Count,Local_Date
1,1.698121e+09,2023-10-24T00:17:38.017017,2023-10-24T04:17:38.017017+00:00,56,6.571754,8.521317,102.255799,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,102.256,2023,10,0,1,2023-10-24
2,1.698121e+09,2023-10-24T00:18:06.165502,2023-10-24T04:18:06.165502+00:00,135,13.997536,9.644554,115.734653,GEN,1,2,The earth was formless and empty. Darkness was...,,,,,108.995,2023,10,0,1,2023-10-24
3,1.698121e+09,2023-10-24T00:18:24.381290,2023-10-24T04:18:24.381290+00:00,56,4.714291,11.878774,142.545292,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,120.179,2023,10,0,1,2023-10-24
4,1.698121e+09,2023-10-24T00:21:24.331389,2023-10-24T04:21:24.331389+00:00,56,4.878704,11.478458,137.741497,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,124.569,2023,10,0,1,2023-10-24
5,1.698121e+09,2023-10-24T00:22:03.749152,2023-10-24T04:22:03.749152+00:00,135,19.078445,7.076048,84.912580,GEN,1,2,The earth was formless and empty. Darkness was...,,,,,116.638,2023,10,0,1,2023-10-24
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
455,1.699071e+09,2023-11-04T00:05:58.617988,2023-11-04T04:05:58.617988+00:00,118,11.562374,10.205517,122.466203,GEN,12,305,Abram passed through the land to the place of ...,6.0,132.666257,132.210892,,135.337,2023,11,0,1,2023-11-04
456,1.699071e+09,2023-11-04T00:06:11.680379,2023-11-04T04:06:11.680379+00:00,137,11.562335,11.848818,142.185816,GEN,12,306,"Yahweh appeared to Abram and said, ""I will giv...",7.0,136.273657,132.008614,,135.352,2023,11,0,1,2023-11-04
457,1.699071e+09,2023-11-04T00:06:24.081392,2023-11-04T04:06:24.081392+00:00,194,16.993672,11.416014,136.992171,GEN,12,307,He left from there to go to the mountain on th...,8.0,134.820701,131.732027,,135.355,2023,11,0,1,2023-11-04
458,1.699071e+09,2023-11-04T00:06:49.969858,2023-11-04T04:06:49.969858+00:00,48,3.751017,12.796529,153.558351,GEN,12,308,"Abram traveled, still going on toward the South.",9.0,137.840247,131.875672,,135.395,2023,11,0,1,2023-11-04


In [6]:
# If you ever need to drop a particular result,
# you can do so as follows:
# df_results.drop(17, inplace = True)
# df_results.to_csv('results.csv') # We want to preserve the index so as not
# to lose our 'Test_Number' values
# df_results

In [7]:
# Creating an RNG seed:
# In order to make the RNG values a bit more random, the following code will
# derive the RNG seed from the decimal component of the current timestamp.
# This seed will change 1 million times each second.

# Using the decimal component of time.time() to select an RNG seed:
current_time = time.time()
decimal_component = current_time - int(current_time) # This 
# line retrieves the decimal component of current_time. int() is used instead
# of np.round() so that the code won't ever round current_time up prior
# to the subtraction operation, which would return a different value.
# A fixed-width type such as np.int64 isn't needed here, as Python's
# built-in int has arbitrary precision.
decimal_component
random_seed = round(decimal_component * 1000000)
decimal_component, random_seed

(0.7508842945098877, 750884)

In [8]:
rng = np.random.default_rng(random_seed) # Based on
# https://numpy.org/doc/stable/reference/random/index.html?highlight=random#module-numpy.random
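
The seeding steps above can be sketched as a small helper. This is only an illustration: `seed_from_timestamp` is a hypothetical name, and the fixed timestamp of 12.25 is used in place of `time.time()` so that the result is repeatable. The sketch also shows that two generators built from the same seed produce the same draws.

```python
import numpy as np


def seed_from_timestamp(timestamp):
    """Build an integer seed from the fractional part of a timestamp."""
    decimal_component = timestamp - int(timestamp)
    return round(decimal_component * 1000000)


# A fixed example timestamp (fractional part 0.25) instead of time.time():
example_seed = seed_from_timestamp(12.25)

# Two generators created from the same seed yield identical values:
rng_a = np.random.default_rng(example_seed)
rng_b = np.random.default_rng(example_seed)
draws_a = rng_a.integers(1, 35380, size=5)  # Upper bound is excluded
draws_b = rng_b.integers(1, 35380, size=5)
```

Calling `np.random.default_rng()` with no argument is another option, as NumPy then draws fresh entropy from the operating system.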

In [9]:
df_Bible

Unnamed: 0,Book_Order,Book_Name,Chapter_Name,Book_and_Chapter,Chapter_Order,Verse_#,Verse_Order,Verse,Characters,Typed,Tests,Fastest_WPM,Characters_Typed,Total_Characters_Typed,Count
0,1,GEN,1,GEN 1,1,1,1,"In the beginning, God created the heavens and ...",56,1,8,193.251509,56,448,1
1,1,GEN,1,GEN 1,1,2,2,The earth was formless and empty. Darkness was...,135,1,2,140.461827,135,270,1
2,1,GEN,1,GEN 1,1,3,3,"God said, ""Let there be light,"" and there was ...",52,1,1,84.008493,52,52,1
3,1,GEN,1,GEN 1,1,4,4,"God saw the light, and saw that it was good. G...",85,1,1,123.198378,85,85,1
4,1,GEN,1,GEN 1,1,5,5,"God called the light ""day"", and the darkness h...",119,1,1,135.664272,119,119,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35374,95,REV,22,REV 22,1328,17,35375,"The Spirit and the bride say, ""Come!"" He who h...",160,0,0,,0,0,1
35375,95,REV,22,REV 22,1328,18,35376,I testify to everyone who hears the words of t...,159,0,0,,0,0,1
35376,95,REV,22,REV 22,1328,19,35377,If anyone takes away from the words of the boo...,174,0,0,,0,0,1
35377,95,REV,22,REV 22,1328,20,35378,"He who testifies these things says, ""Yes, I am...",89,0,0,,0,0,1


[This fantastic answer](https://stackoverflow.com/a/23294659/13097194) by Kevin at Stack Overflow proved helpful in implementing input validation code within this program.

In [10]:
def select_verse():
    print("Select a verse to type! Enter 0 to receive a random verse\n\
or enter a verse number (see the 'Verse_Order' column of\n\
the WEB_Catholic_Version.csv spreadsheet for a list of numbers to enter)\n\
to select a specific verse.\n\
You can also enter -2 to receive a random verse that you haven't yet typed\n\
or -3 to choose the first Bible verse that hasn't yet been typed.")
    while True:
        try:
            response = int(input())
        except ValueError: # The input couldn't be converted to an integer.
            print("Please enter an integer corresponding to a particular Bible \
verse or 0 for a randomly selected verse.")
            continue # Allows the user to retry entering a number

        if response == 0:
            return rng.integers(1, 35380) # Selects any verse within the Bible.
            # There are 35,379 verses present, so we'll pass 1 (the first verse)
            # and 35,380 (1 more than the last verse, as rng.integers() excludes
            # the upper bound by default) to rng.integers().
        # The next two elif statements will require us to determine which 
        # verses haven't yet been typed. We can do so by filtering df_Bible
        # to include only untyped verses.
        elif response == -2:
            verses_not_yet_typed = list(
                df_Bible.query("Typed == 0")['Verse_Order'].copy())
            if len(verses_not_yet_typed) == 0:
                print("Congratulations! You have typed all verses from \
the Bible, so there are no new verses to type! Try selecting another option \
instead.")
                continue
            print(f"{len(verses_not_yet_typed)} verses have not yet \
been typed.")
            return rng.choice(verses_not_yet_typed) # Chooses one of these
            # untyped verses at random
        elif response == -3:
            verses_not_yet_typed = list(
                df_Bible.query("Typed == 0")['Verse_Order'].copy())
            if len(verses_not_yet_typed) == 0:
                print("Congratulations! You have typed all verses from \
the Bible, so there are no new verses to type! Try selecting another option \
instead.")
                continue
            print(f"{len(verses_not_yet_typed)} verses have not yet \
been typed.")
            verses_not_yet_typed.sort() # Probably not necessary, as df_Bible
            # is already sorted from the first to the last verse.
            return verses_not_yet_typed[0]
        
        else:
            if 1 <= response <= 35379: # Making sure that the response is 
                # an integer between 1 and 35,379 (inclusive) so that it 
                # matches one of the Bible verse numbers present:                    
                return response
            else: # Will be called if the integer didn't correspond to a 
                # Bible verse number. (Non-integer entries are already
                # handled by the try/except block above.)
                print("Please enter an integer between 1 and 35,379.") # Since
                # we're still within a while loop, the user will be returned
                # to the initial try/except block.
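
The untyped-verse filter appears in several places within this program, so it could be factored into one helper. Here is a minimal sketch, assuming a DataFrame with the same `Typed` and `Verse_Order` columns as `df_Bible`. The helper name `get_untyped_verses` and the four-row sample table are hypothetical.

```python
import pandas as pd


def get_untyped_verses(df_bible):
    """Return a sorted list of Verse_Order values whose Typed flag is 0."""
    untyped = df_bible.loc[df_bible['Typed'] == 0, 'Verse_Order']
    return sorted(untyped.tolist())


# A tiny stand-in for df_Bible (made-up values for illustration only):
sample_bible = pd.DataFrame(
    {'Verse_Order': [1, 2, 3, 4], 'Typed': [1, 0, 1, 0]})

untyped_verses = get_untyped_verses(sample_bible)
first_untyped = untyped_verses[0] if untyped_verses else None
```

An empty list signals that every verse has been typed, which each caller can then report in its own way.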


In [11]:
def run_typing_test(verse_number, results_table):
    '''This function calculates how quickly the user types the Bible verse
    represented by verse_number, then saves those results to the DataFrame
    passed to results_table.'''

    # Retrieving the verse to be typed:
    # The index begins at 0 whereas the list of verse numbers begins at 1,
    # so we'll need to subtract 1 from verse_number in order to obtain
    # the verse's index.
    verse = df_Bible.iloc[verse_number-1]['Verse']
    book = df_Bible.iloc[verse_number-1]['Book_Name']
    chapter = df_Bible.iloc[verse_number-1]['Chapter_Name']
    verse_number_within_chapter = df_Bible.iloc[verse_number-1]['Verse_#']
    verse_number_within_Bible = df_Bible.iloc[
        verse_number-1]['Verse_Order']
    
    # I moved these introductory comments out of the following while loop
    # in order to simplify the dialogue presented to users during retest
    # attempts.
    print("Welcome to the typing test! Note that you can exit a test in \
progress by entering 'exit.'")
    print(f"\nYour verse to type is {book} \
{chapter}:{verse_number_within_chapter} (verse {verse_number_within_Bible} \
within the Bible .csv file).\n")
    if run_on_notebook == False:
        print("Press any key to begin typing!")
    else:
        print("Press Enter to begin the test!")
    
    complete_flag = 0
    while complete_flag == 0:
        print(f"Here is the verse:\n\n{verse}") 

        if run_on_notebook == False: # In this case, we can use getch()
            # to begin the test.
            # time.sleep(3) # I realized that players could actually begin
            # typing during this sleep period, thus allowing them to complete
            # the test faster than intended. Therefore, I'm now having the
            # test start after the player hits a character of his/her choice.
            # getch() accomplishes this task well.
            # A simpler approach would be to add in an additional input block
            # and have the player begin after he/she presses Enter, but that
            # would cause the player's right hand to leave the default home
            # row position, which could end up slowing him/her down. getch()
            # allows any character to be pressed (such as the space bar) and
            # thus avoids this issue.

            start_character = getch() # See https://github.com/joeyespo/py-getch
        
        else: # When running the program within a Jupyter notebook, I wasn't
            # able to enter input after getch() was called, so I created
            # an alternative start method below that simply uses input().
            input()

        print("Start!")
        local_start_time = datetime.now().isoformat()
        utc_start_time = datetime.now(timezone.utc).isoformat()
        typing_start_time = time.time()
        verse_response = input() 
        # The following code will execute once the player finishes typing and
        # hits Enter. (Having the program evaluate the player's entry only after
        # 'Enter' is pressed isn't the best option, as the time required to
        # hit Enter will reduce the player's reported WPM. In the future,
        # I might revise this code so that the text can get evaluated
        # immediately when the player has typed all characters of the text.
        # Counting the characters as the player types would be one way
        # to implement this revision.)

        typing_end_time = time.time()
        typing_time = typing_end_time - typing_start_time
        if verse_response == verse:
            print(f"Well done! You typed the verse correctly.")
            complete_flag = 1 # Setting this flag to 1 allows the player to exit
            # out of the while statement.
        elif verse_response.lower() == 'exit':
            print("Exiting typing test.")
            return results_table # Exits the function without saving the 
            # current test to results_table or df_Bible
        else:
            print("Sorry, that wasn't the correct input.")   
            # Identifying incorrectly typed words:
            verse_words = verse.split(' ')
            verse_response_words = verse_response.split(' ')[0:len(verse_words)]
            # I added in the [0:len(verse_words)] filter so that the following
            # for loop would not attempt to access more words than were 
            # present in the original verse (which would cause the game
            # to crash with an IndexError).
            for i in range(len(verse_response_words)):
                if verse_response_words[i] != verse_words[i]:
                    print(f"Word number {i} ('{verse_words[i]}') \
was typed '{verse_response_words[i]}'.")
                    # If the response has more or fewer words than the original
                    # verse, some correctly typed words might appear within
                    # this list also.
            print("Try again!")

    # Calculating typing statistics and storing them within a single-row
    # DataFrame:

    cps = len(verse) / typing_time # Calculating characters per second
    wpm = cps * 12 # Multiplying by 60 to convert from seconds to minutes, 
    # then dividing by 5 to convert from characters to words (60 / 5 = 12).

    print(f"Your CPS and WPM were {round(cps, 3)} and {round(wpm, 3)}, \
respectively.")

    # Creating a single-row DataFrame that stores the player's results:
    df_latest_result = pd.DataFrame(index = [
        len(results_table)+1], data = {'Unix_Start_Time':typing_start_time, 
    'Local_Start_Time':local_start_time,
    'UTC_Start_Time':utc_start_time,
    'Characters':len(verse),
    'Seconds':typing_time, 
    'CPS': cps,
    'WPM':wpm,
    'Book': book,
    'Chapter': chapter,
    'Verse #': verse_number_within_chapter,
    'Verse':verse, 
    'Verse_Order':verse_number_within_Bible})
    df_latest_result.index.name = 'Test_Number'
    df_latest_result

    # Adding this new row to results_table:
    results_table = pd.concat([results_table, df_latest_result])
    
    # Note: I could also have used df.at or df.loc to add a new row
    # to results_table, but I chose a pd.concat() setup in order to ensure
    # that the latest result would never overwrite an earlier result.
    

    # Updating df_Bible to store the player's results: (This will allow the
    # player to track how much of the Bible he/she has typed so far)
    df_Bible.at[verse_number-1, 'Typed'] = 1 # Denotes that this verse
    # has now been typed
    df_Bible.at[verse_number-1, 'Tests'] += 1 # Keeps track of how 
    # many times this verse has been typed
    fastest_wpm = df_Bible.at[verse_number-1, 'Fastest_WPM']
    if pd.isna(fastest_wpm) or (wpm > fastest_wpm): 
        # In these cases, we should replace the pre-existing Fastest_WPM value
        # with the WPM the player just achieved.
        # I found that 5 > np.nan returned False, so if I only checked for
        # wpm > fastest_wpm, blank fastest_wpm values would never get overwritten.
        # Therefore, I chose to also check for NaN values 
        # in the above if statement.
        df_Bible.at[verse_number-1, 'Fastest_WPM'] = wpm

    return results_table
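
As a worked example of the CPS and WPM arithmetic above, the sketch below plugs in the character count and duration shown for test 1 in results.csv (56 characters in about 6.57 seconds). The helper name `calculate_cps_and_wpm` is hypothetical; it assumes, as the program does, that one word equals 5 characters.

```python
def calculate_cps_and_wpm(characters, seconds):
    """Convert a character count and a duration into CPS and WPM.

    Assumes the standard typing-test convention of 5 characters per word.
    """
    cps = characters / seconds  # Characters per second
    wpm = cps * 60 / 5  # 60 seconds per minute; 5 characters per word
    return cps, wpm


# Values taken from the first row of the results table shown earlier:
cps, wpm = calculate_cps_and_wpm(characters=56, seconds=6.571754)
print(f"CPS: {round(cps, 3)}; WPM: {round(wpm, 3)}")
```

The result should land very close to the WPM recorded for that test; tiny differences can arise because the table displays a rounded duration.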


In [12]:
# run_typing_test(1, results_table=df_results)
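
The word-by-word comparison inside `run_typing_test()` can also be expressed with `zip()`, which stops at the shorter of the two word lists and therefore avoids an IndexError without any slicing. This is a sketch only: `find_mismatched_words` is a hypothetical name, and, unlike the program's own message, it numbers words starting from 1.

```python
def find_mismatched_words(verse, response):
    """Return (position, expected_word, typed_word) tuples for each
    position at which the typed word differs from the verse's word.
    Positions start at 1. Extra or missing words at the end are ignored
    because zip() stops at the shorter list."""
    verse_words = verse.split(' ')
    response_words = response.split(' ')
    mismatches = []
    for position, (expected, typed) in enumerate(
            zip(verse_words, response_words), start=1):
        if expected != typed:
            mismatches.append((position, expected, typed))
    return mismatches


mismatches = find_mismatched_words(
    verse="In the beginning, God created",
    response="In teh beginning, God created")
for position, expected, typed in mismatches:
    print(f"Word number {position} ('{expected}') was typed '{typed}'.")
```

As with the original loop, a skipped or added word shifts every later comparison, so some correctly typed words may still be reported.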

In [13]:
def select_subsequent_verse(previous_verse_number):
    '''This function allows the player to specify which verse to
    type next, or, alternatively, to exit the game.'''
    print("Enter 0 to retry the verse you just typed; \
1 to type the next verse; 2 to type the next verse that hasn't yet been typed; \
3 to select a different verse; \
or -1 to save your results and exit.")
    while True: 
            try:
                response = int(input())
            except ValueError: # The user didn't enter an integer.
                print("Please enter a number.")      
                continue
            if response == 0:
                return previous_verse_number
            elif response == 1:
                if previous_verse_number == 35379: # The verse order value
                    # corresponding to the final verse of Revelation
                    print("You just typed the last verse in the Bible, so \
there's no next verse to type! Please enter an option other than 1.\n")
                    continue
                else:
                    return previous_verse_number + 1
            elif response == 2:
                # In this case, we'll retrieve a list of verses that haven't
                # yet been typed; filter that list to include only verses
                # greater than previous_verse_number; and then select
                # the first verse within that list (i.e. the next 
                # untyped verse).
                verses_not_yet_typed = list(df_Bible.query(
                    "Typed == 0")['Verse_Order'].copy())
                if len(verses_not_yet_typed) == 0:
                    print("Congratulations! You have typed all verses from \
the Bible, so there are no new verses to type! Try selecting another option \
instead.")
                    continue
                print(f"{len(verses_not_yet_typed)} verses have not yet \
been typed.")
                verses_not_yet_typed.sort() 
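                next_untyped_verses = [verse for verse in verses_not_yet_typed 
                if verse > previous_verse_number]
                if len(next_untyped_verses) == 0: # Prevents an IndexError
                    # when all untyped verses precede the current verse
                    print("There are no untyped verses after the verse you \
just typed. Try selecting another option instead.\n")
                    continue
                return next_untyped_verses[0]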
            elif response == 3:
                return select_verse()
            elif response == -1:
                return response
            else: # A number other than -1, 0, 1, 2, or 3 was passed.
                print("Please enter either -1, 0, 1, 2, or 3.\n")  
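
One case worth keeping in mind when looking up the next untyped verse is that untyped verses may exist only before the current verse. A minimal sketch of a lookup that returns `None` in that situation is shown below; `next_untyped_after` is a hypothetical helper, not a function from this program.

```python
def next_untyped_after(untyped_verses, previous_verse_number):
    """Return the smallest untyped verse number greater than
    previous_verse_number, or None if no such verse exists."""
    return next(
        (verse for verse in sorted(untyped_verses)
         if verse > previous_verse_number),
        None)


following_verse = next_untyped_after([2, 5, 9], previous_verse_number=5)
no_verse_left = next_untyped_after([2, 5], previous_verse_number=5)
```

The caller can then check for `None` and ask the player to choose another option.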

In [14]:
def calculate_current_day_results(df):
    ''' This function reports the number of characters, total verses, and 
    unique verses that the player has typed so far today.'''
    df_current_day_results = df[pd.to_datetime(
        df['Local_Start_Time']).dt.date == datetime.today().date()].copy()
    if len(df_current_day_results) == 0:
        result_string = "You haven't typed any Bible verses yet today."
    else:
        characters_typed_today = df_current_day_results['Characters'].sum()
        total_verses_typed_today = len(df_current_day_results)

        # Allowing for both singular and plural versions of 'verse' to 
        # be displayed:
        if total_verses_typed_today == 1:
            total_verses_string = 'verse'
        else:
            total_verses_string = 'verses'

        unique_verses_typed_today = len(df_current_day_results[
            'Verse_Order'].unique())

        if unique_verses_typed_today == 1:
            unique_verses_string = 'verse'
        else:
            unique_verses_string = 'verses'

        average_wpm_today = round(df_current_day_results['WPM'].mean(), 3)
        median_wpm_today = round(df_current_day_results['WPM'].median(), 3)
        result_string = f"So far today, you have typed \
{characters_typed_today} characters from {total_verses_typed_today} Bible \
{total_verses_string} (including {unique_verses_typed_today} unique \
{unique_verses_string}). Your mean and median WPM today are \
{average_wpm_today} and {median_wpm_today}, respectively."
    return result_string
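
The date filter at the top of `calculate_current_day_results()` can be tried out on a small table. The sketch below uses a made-up two-row DataFrame (one test from three days ago and one from today) with the same `Local_Start_Time` format that the program stores.

```python
from datetime import datetime, timedelta

import pandas as pd

now = datetime.now()
# Hypothetical results: one older test and one from the current day
sample_results = pd.DataFrame({
    'Local_Start_Time': [(now - timedelta(days=3)).isoformat(),
                         now.isoformat()],
    'Characters': [56, 135],
    'WPM': [100.0, 120.0]})

# Comparing each row's date with today's date yields a Boolean mask:
is_today = pd.to_datetime(
    sample_results['Local_Start_Time']).dt.date == now.date()
today_results = sample_results[is_today].copy()

characters_typed_today = today_results['Characters'].sum()
mean_wpm_today = round(today_results['WPM'].mean(), 3)
```

Only the second row passes the filter, so the daily totals reflect that single test.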

In [15]:
def run_game(results_table):
    '''This function runs Type Through the Bible by 
    calling various other functions. It allows users to select
    verses to type, then runs typing tests and stores the results in
    the DataFrame passed to results_table.'''
    
    print("Welcome to Type Through the Bible!")
    # The game will now share the player's progress for the current day:
    print(calculate_current_day_results(results_table))
    verse_number = select_verse()
    
    while True: # Allows the game to continue until the user exits
        results_table = run_typing_test(verse_number=verse_number, 
        results_table=results_table)
        # The game will next share an updated progress report:
        print(calculate_current_day_results(results_table))
        
        # The player will now be prompted to select a new verse number 
        # (or to save and quit). This verse_number, provided it is not -1,
        # will then be passed back to run_typing_test().
        verse_number = select_subsequent_verse(
            previous_verse_number=verse_number)
        if verse_number == -1: # In this case, the game will quit and the 
            # user's new test results will be saved to results_table.
            return results_table 

In [16]:
df_results = run_game(results_table = df_results)

Welcome to Type Through the Bible!
So far today, you have typed 2082 characters from 17 Bible verses (including 16 unique verses). Your mean and median WPM today are 133.905 and 136.992, respectively.
Select a verse to type! Enter 0 to receive a random verse
or enter a verse number (see 'Verse_Order column of
the WEB_Catholic_Version.csv spreadsheet for a list of numbers to enter
to select a specific verse.
You can also enter -2 to receive a random verse that you haven't yet typed
or -3 to choose the first Bible verse that hasn't yet been typed.
34936 verses have not yet been typed.
Welcome to the typing test! Note that you can exit a test in progress by entering 'exit.'

Your verse to type is GEN 12:11 (verse 310 within the Bible .csv file).

Press Enter to begin the test!
Here is the verse:

When he had come near to enter Egypt, he said to Sarai his wife, "See now, I know that you are a beautiful woman to look at.
Start!
Sorry, that wasn't the correct input.
Word number 0 ('When') wa

In [17]:
# Updating certain df_Bible columns to reflect new results:

In [18]:
df_Bible['Characters_Typed'] = df_Bible['Characters'] * df_Bible['Typed']
df_Bible['Total_Characters_Typed'] = df_Bible['Characters'] * df_Bible['Tests']
df_Bible

Unnamed: 0,Book_Order,Book_Name,Chapter_Name,Book_and_Chapter,Chapter_Order,Verse_#,Verse_Order,Verse,Characters,Typed,Tests,Fastest_WPM,Characters_Typed,Total_Characters_Typed,Count
0,1,GEN,1,GEN 1,1,1,1,"In the beginning, God created the heavens and ...",56,1,8,193.251509,56,448,1
1,1,GEN,1,GEN 1,1,2,2,The earth was formless and empty. Darkness was...,135,1,2,140.461827,135,270,1
2,1,GEN,1,GEN 1,1,3,3,"God said, ""Let there be light,"" and there was ...",52,1,1,84.008493,52,52,1
3,1,GEN,1,GEN 1,1,4,4,"God saw the light, and saw that it was good. G...",85,1,1,123.198378,85,85,1
4,1,GEN,1,GEN 1,1,5,5,"God called the light ""day"", and the darkness h...",119,1,1,135.664272,119,119,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35374,95,REV,22,REV 22,1328,17,35375,"The Spirit and the bride say, ""Come!"" He who h...",160,0,0,,0,0,1
35375,95,REV,22,REV 22,1328,18,35376,I testify to everyone who hears the words of t...,159,0,0,,0,0,1
35376,95,REV,22,REV 22,1328,19,35377,If anyone takes away from the words of the boo...,174,0,0,,0,0,1
35377,95,REV,22,REV 22,1328,20,35378,"He who testifies these things says, ""Yes, I am...",89,0,0,,0,0,1


In [19]:
characters_typed_sum = df_Bible['Characters_Typed'].sum()
proportion_of_Bible_typed = characters_typed_sum / df_Bible['Characters'].sum()

print(f"You have typed {characters_typed_sum} characters so far, which represents \
{round(100*proportion_of_Bible_typed, 5)}% of the Bible.")



You have typed 50531 characters so far, which represents 1.12088% of the Bible.
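
The same proportion can be broken down by book with a groupby. The following sketch uses a made-up three-verse table with `df_Bible`'s column names; the `Proportion_Typed` column is a hypothetical addition that the program itself doesn't create.

```python
import pandas as pd

# A tiny stand-in for df_Bible (values chosen for illustration only):
sample_bible = pd.DataFrame({
    'Book_Name': ['GEN', 'GEN', 'EXO'],
    'Characters': [56, 135, 100],
    'Typed': [1, 0, 0]})

sample_bible['Characters_Typed'] = (
    sample_bible['Characters'] * sample_bible['Typed'])

# Overall share of characters typed at least once:
overall_proportion = (sample_bible['Characters_Typed'].sum()
                      / sample_bible['Characters'].sum())

# Share of each book's characters typed at least once:
by_book = sample_bible.groupby('Book_Name')[
    ['Characters_Typed', 'Characters']].sum()
by_book['Proportion_Typed'] = (
    by_book['Characters_Typed'] / by_book['Characters'])
```

Because the proportion is weighted by characters, a long untyped verse lowers it more than a short one would.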


# Adding in additional values and statistics to df_results:

(The following cell was derived from [this script](https://github.com/kburchfiel/typeracer_data_analyzer/blob/master/typeracer_data_analyzer_v2.ipynb) that I wrote.)

These statistics will get recreated whenever the script is run; this approach allows for the results to be revised as needed (e.g. if certain rows are removed from the dataset).

In [20]:
df_results['Last 10 Avg'] = df_results['WPM'].rolling(10).mean()
df_results['Last 100 Avg'] = df_results['WPM'].rolling(100).mean()
df_results['Last 1000 Avg'] = df_results['WPM'].rolling(1000).mean()

df_results['Local_Year'] = pd.to_datetime(df_results['Local_Start_Time']).dt.year
df_results['Local_Month'] = pd.to_datetime(df_results['Local_Start_Time']).dt.month
df_results['Local_Date'] = pd.to_datetime(df_results['Local_Start_Time']).dt.date
df_results['Local_Hour'] = pd.to_datetime(df_results['Local_Start_Time']).dt.hour
df_results['Count'] = 1 # Useful for pivot tables that analyze test counts
# by book, month, etc.

# The following line uses a list comprehension to generate a cumulative average
# of all WPM scores up until the current test. .iloc selects rows 0 through i
# (the slice 0:i+1 excludes row i+1) so that the current row is included in
# the calculation.
df_results['cumulative_avg'] = [round(np.mean(df_results.iloc[0:i+1]['WPM']),
3) for i in range(len(df_results))]
df_results

Test_Number,Unix_Start_Time,Local_Start_Time,UTC_Start_Time,Characters,Seconds,CPS,WPM,Book,Chapter,Verse_Order,Verse,Verse #,Last 10 Avg,Last 100 Avg,Last 1000 Avg,cumulative_avg,Local_Year,Local_Month,Local_Hour,Count,Local_Date
1,1.698121e+09,2023-10-24T00:17:38.017017,2023-10-24T04:17:38.017017+00:00,56,6.571754,8.521317,102.255799,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,102.256,2023,10,0,1,2023-10-24
2,1.698121e+09,2023-10-24T00:18:06.165502,2023-10-24T04:18:06.165502+00:00,135,13.997536,9.644554,115.734653,GEN,1,2,The earth was formless and empty. Darkness was...,,,,,108.995,2023,10,0,1,2023-10-24
3,1.698121e+09,2023-10-24T00:18:24.381290,2023-10-24T04:18:24.381290+00:00,56,4.714291,11.878774,142.545292,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,120.179,2023,10,0,1,2023-10-24
4,1.698121e+09,2023-10-24T00:21:24.331389,2023-10-24T04:21:24.331389+00:00,56,4.878704,11.478458,137.741497,GEN,1,1,"In the beginning, God created the heavens and ...",,,,,124.569,2023,10,0,1,2023-10-24
5,1.698121e+09,2023-10-24T00:22:03.749152,2023-10-24T04:22:03.749152+00:00,135,19.078445,7.076048,84.912580,GEN,1,2,The earth was formless and empty. Darkness was...,,,,,116.638,2023,10,0,1,2023-10-24
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
457,1.699071e+09,2023-11-04T00:06:24.081392,2023-11-04T04:06:24.081392+00:00,194,16.993672,11.416014,136.992171,GEN,12,307,He left from there to go to the mountain on th...,8.0,134.820701,131.732027,,135.355,2023,11,0,1,2023-11-04
458,1.699071e+09,2023-11-04T00:06:49.969858,2023-11-04T04:06:49.969858+00:00,48,3.751017,12.796529,153.558351,GEN,12,308,"Abram traveled, still going on toward the South.",9.0,137.840247,131.875672,,135.395,2023,11,0,1,2023-11-04
459,1.699071e+09,2023-11-04T00:11:38.511409,2023-11-04T04:11:38.511409+00:00,127,10.637932,11.938411,143.260926,GEN,12,309,There was a famine in the land. Abram went dow...,10.0,136.687821,132.145708,,135.412,2023,11,0,1,2023-11-04
460,1.699076e+09,2023-11-04T01:41:06.191332,2023-11-04T05:41:06.191332+00:00,124,10.944782,11.329600,135.955200,GEN,12,310,"When he had come near to enter Egypt, he said ...",11.0,140.421645,131.915465,,135.413,2023,11,1,1,2023-11-04
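
For comparison, pandas' `expanding()` method offers another way to compute a cumulative average without slicing the DataFrame once per row. The sketch below applies both approaches to the first five WPM values from the table above; it also uses a 3-test window (an arbitrary choice for this small example) to show why the rolling averages start out blank.

```python
import numpy as np
import pandas as pd

# The first five WPM values from the results table:
wpm = pd.Series([102.255799, 115.734653, 142.545292, 137.741497, 84.912580])

# The list comprehension approach used in this notebook:
loop_avg = [round(np.mean(wpm.iloc[0:i+1]), 3) for i in range(len(wpm))]

# An alternative that relies on an expanding window:
expanding_avg = wpm.expanding().mean().round(3)

# A rolling average stays NaN until its window has been filled:
last_3_avg = wpm.rolling(3).mean()
```

Both cumulative averages should agree with the cumulative_avg column for tests 1 through 5, and `last_3_avg` is NaN for the first two tests, just as 'Last 10 Avg' is blank for the first nine.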


In [21]:
print("Saving results:")

Saving results:


In [22]:
def attempt_save(df, filename, index):
    '''This function attempts to save the DataFrame passed to df to the file
    specified by filename. It allows players to retry the save operation
    if it wasn't initially successful (e.g. because the file was open at 
    the time), thus preventing them from losing their latest progress.
    The index parameter determines whether or not the DataFrame's index
    will be included in the .csv export. Set to True for results.csv
    but False for WEB_Catholic_Version_for_game_updated.csv.'''
    while True:
        try: 
            df.to_csv(filename, index = index)
            return
        except OSError: # Includes errors such as PermissionError
            print("File could not be saved, likely because it is currently open. \
Close the file, then press Enter to retry.")
            input()
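
As a variation on attempt_save(), the following sketch catches only OSError (the exception family that includes PermissionError and FileNotFoundError) and gives up after a set number of attempts. The function name and the max_attempts and wait_for_user parameters are hypothetical additions for illustration; they are not part of the program above.

```python
import pandas as pd


def attempt_save_with_limit(df, filename, index, max_attempts = 3,
                            wait_for_user = input):
    '''Tries to save df to filename up to max_attempts times. Returns True
    if the save succeeded and False otherwise. (Hypothetical variant of
    attempt_save(); max_attempts and wait_for_user are assumptions.)'''
    for attempt in range(1, max_attempts + 1):
        try:
            df.to_csv(filename, index = index)
            return True
        except OSError as error:
            # OSError does not include KeyboardInterrupt, so Ctrl+C
            # can still end the program while this loop is running.
            print(f"Save attempt {attempt} failed: {error}")
            if attempt < max_attempts:
                print("Try closing the file, then press Enter to retry.")
                wait_for_user()
    return False
```

Passing a different function to wait_for_user (e.g. lambda: None) makes it possible to exercise the retry loop without keyboard input.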

In [23]:
attempt_save(df_results, 'results.csv', index = True)

In [24]:
attempt_save(df_Bible, 'WEB_Catholic_Version_for_game_updated.csv', index = False)

In [25]:
print("Successfully saved updated copies of the Results and Bible .csv files.")

Successfully saved updated copies of the Results and Bible .csv files.


# Visualizing the player's progress in typing the entire Bible:

In [26]:
analysis_start_time = time.time() # Allows us to determine how long the
# analyses took
print("Updating analyses:")

Updating analyses:


In [27]:
df_Bible['Count'] = 1

### Creating a tree map within Plotly that visualizes the player's progress in typing the entire Bible:

In [28]:
# This code is based on https://plotly.com/python/treemaps/
# It's pretty amazing that such a complex visualization can be created using
# just one line of code. Thanks Plotly!
fig_tree_map_books_chapters_verses = px.treemap(
    df_Bible, path = ['Book_Name', 'Chapter_Name', 'Verse_#'], 
    values = 'Characters', color = 'Typed')
# fig_tree_map_books_chapters_verses

In [29]:
fig_tree_map_books_chapters_verses.write_html(
    'Analyses/tree_map_books_chapters_verses.html')

In [30]:
# # A similar chart that doesn't use the Typed column for color coding:
# (This chart, unlike fig_tree_map_books_chapters_verses above, won't change 
# unless edits are made to the code itself, so it can be 
# commented out after being run once.)
# fig_Bible_verses = px.treemap(df_Bible, path = ['Book_Name', 
# 'Chapter_Name', 'Verse_#'], values = 'Characters')
# fig_Bible_verses.write_html('Bible_tree_map.html')
# fig_Bible_verses

In [31]:
df_Bible

Unnamed: 0,Book_Order,Book_Name,Chapter_Name,Book_and_Chapter,Chapter_Order,Verse_#,Verse_Order,Verse,Characters,Typed,Tests,Fastest_WPM,Characters_Typed,Total_Characters_Typed,Count
0,1,GEN,1,GEN 1,1,1,1,"In the beginning, God created the heavens and ...",56,1,8,193.251509,56,448,1
1,1,GEN,1,GEN 1,1,2,2,The earth was formless and empty. Darkness was...,135,1,2,140.461827,135,270,1
2,1,GEN,1,GEN 1,1,3,3,"God said, ""Let there be light,"" and there was ...",52,1,1,84.008493,52,52,1
3,1,GEN,1,GEN 1,1,4,4,"God saw the light, and saw that it was good. G...",85,1,1,123.198378,85,85,1
4,1,GEN,1,GEN 1,1,5,5,"God called the light ""day"", and the darkness h...",119,1,1,135.664272,119,119,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35374,95,REV,22,REV 22,1328,17,35375,"The Spirit and the bride say, ""Come!"" He who h...",160,0,0,,0,0,1
35375,95,REV,22,REV 22,1328,18,35376,I testify to everyone who hears the words of t...,159,0,0,,0,0,1
35376,95,REV,22,REV 22,1328,19,35377,If anyone takes away from the words of the boo...,174,0,0,,0,0,1
35377,95,REV,22,REV 22,1328,20,35378,"He who testifies these things says, ""Yes, I am...",89,0,0,,0,0,1


In [32]:
# This variant of the treemap shows chapters and verses rather than books,
# chapters, and verses.
if run_on_notebook and extra_analyses:
    fig_tree_map_chapters_verses = px.treemap(df_Bible, path = [
        'Book_and_Chapter', 'Verse_#'], values = 'Characters', color = 'Typed')
    fig_tree_map_chapters_verses.write_html(
        'Analyses/tree_map_chapters_verses.html')
    fig_tree_map_chapters_verses.write_image(
        'Analyses/tree_map_chapters_verses.png', width = 7680, height = 4320)

In [33]:
# This variant of the treemap shows each verse as its own box, which results in 
# a very busy graph that takes a while to load within a web browser
# (if it even loads at all).

if run_on_notebook and extra_analyses:
    fig_tree_map_verses = px.treemap(df_Bible, path = ['Verse_Order'], 
                                     values = 'Characters', color = 'Typed')
    fig_tree_map_verses.write_html('Analyses/tree_map_verses.html')
    fig_tree_map_verses.write_image('Analyses/tree_map_verses_8K.png', 
                                    width = 7680, height = 4320) 
    fig_tree_map_verses.write_image('Analyses/tree_map_verses_16K.png', 
                                    width = 15360, height = 8640) 
# fig_tree_map_verses.write_image('Analyses/tree_map_verses.png', width = 30720, 
# height = 17280) # Didn't end up rendering successfully, probably 
# because the dimensions were absurdly large!

### Creating a bar chart that shows the proportion of each book that has been typed so far:

In [34]:
df_characters_typed_by_book = df_Bible.pivot_table(index = ['Book_Order', 
'Book_Name'], values = ['Characters', 'Characters_Typed'], 
aggfunc = 'sum').reset_index()
# Adding 'Book_Order' as the first index value allows for the pivot tables
# and bars to be ordered by that value.
df_characters_typed_by_book['proportion_typed'] = df_characters_typed_by_book[
    'Characters_Typed'] / df_characters_typed_by_book['Characters']
df_characters_typed_by_book.to_csv(
    'Analyses/characters_typed_by_book.csv')
df_characters_typed_by_book

Unnamed: 0,Book_Order,Book_Name,Characters,Characters_Typed,proportion_typed
0,1,GEN,185293,34861,0.188140
1,2,EXO,159492,507,0.003179
2,3,LEV,120367,0,0.000000
3,4,NUM,165525,202,0.001220
4,5,DEU,139344,0,0.000000
...,...,...,...,...,...
68,91,1JN,12363,12363,1.000000
69,92,2JN,1536,453,0.294922
70,93,3JN,1525,1525,1.000000
71,94,JUD,3461,136,0.039295
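
Note that proportion_typed is based on character totals rather than verse counts, so long verses count for more than short ones. The following sketch applies the same pivot table steps to a toy DataFrame (the values are made up) in which half of the GEN verses, but only a quarter of the GEN characters, have been typed.

```python
import pandas as pd

# Toy stand-in for df_Bible (made-up values)
df_toy = pd.DataFrame({
    'Book_Order': [1, 1, 2],
    'Book_Name': ['GEN', 'GEN', 'EXO'],
    'Characters': [100, 300, 200],
    'Characters_Typed': [100, 0, 200]})

# Summing both character columns within each book:
df_toy_by_book = df_toy.pivot_table(
    index = ['Book_Order', 'Book_Name'],
    values = ['Characters', 'Characters_Typed'],
    aggfunc = 'sum').reset_index()

# Dividing the summed columns gives a character-weighted proportion:
df_toy_by_book['proportion_typed'] = (
    df_toy_by_book['Characters_Typed'] / df_toy_by_book['Characters'])
print(df_toy_by_book)
```

Here GEN's proportion_typed is 0.25 (100 of 400 characters) and EXO's is 1.0.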


In [35]:
fig_proportion_of_each_book_typed = px.bar(df_characters_typed_by_book, 
x = 'Book_Name', y = 'proportion_typed')
fig_proportion_of_each_book_typed.update_yaxes(range = [0, 1]) # Setting
# the maximum y value as 1 better demonstrates how much of the Bible
# has been typed so far
fig_proportion_of_each_book_typed.write_html(
    'Analyses/proportion_of_each_book_typed.html')
fig_proportion_of_each_book_typed.write_image(
    'Analyses/proportion_of_each_book_typed.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_proportion_of_each_book_typed

### Creating a chart that compares the number of characters in each book with the number that have been typed:

This provides a clearer view of the player's progress in typing the Bible, as each bar's height is based on the number of characters. (In contrast, bars for fully typed small books will be just as high in fig_proportion_of_each_book_typed as those for fully typed large books.)

In [36]:
fig_characters_typed_in_each_book = px.bar(df_characters_typed_by_book, 
x = 'Book_Name', y = ['Characters', 'Characters_Typed'], barmode = 'overlay')
fig_characters_typed_in_each_book.write_html(
    'Analyses/characters_typed_by_book.html')
fig_characters_typed_in_each_book.write_image(
    'Analyses/characters_typed_by_book.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_characters_typed_in_each_book

## Creating charts that show both book- and chapter-level data:

In [37]:
df_characters_typed_by_book_and_chapter = df_Bible.pivot_table(index = [
'Book_Order', 'Book_Name', 'Book_and_Chapter'], values = [
    'Characters', 'Characters_Typed'], aggfunc = 'sum').reset_index()
df_characters_typed_by_book_and_chapter[
'proportion_typed'] = df_characters_typed_by_book_and_chapter[
'Characters_Typed'] / df_characters_typed_by_book_and_chapter['Characters']
df_characters_typed_by_book_and_chapter.to_csv(
    'Analyses/characters_typed_by_book_and_chapter.csv')
df_characters_typed_by_book_and_chapter

Unnamed: 0,Book_Order,Book_Name,Book_and_Chapter,Characters,Characters_Typed,proportion_typed
0,1,GEN,GEN 1,3831,3831,1.000000
1,1,GEN,GEN 10,2582,2582,1.000000
2,1,GEN,GEN 11,3435,3435,1.000000
3,1,GEN,GEN 12,2473,1601,0.647392
4,1,GEN,GEN 13,2169,0,0.000000
...,...,...,...,...,...,...
1323,95,REV,REV 5,2054,0,0.000000
1324,95,REV,REV 6,2503,0,0.000000
1325,95,REV,REV 7,2452,0,0.000000
1326,95,REV,REV 8,1900,0,0.000000
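
In the table above, 'GEN 10' and 'GEN 11' come directly after 'GEN 1' because the Book_and_Chapter values are sorted as text. If numeric chapter order is preferred (which may also affect the order in which chapter sections are stacked within a bar), one option is to add Chapter_Order to the pivot table's index. The sketch below compares both orderings on a toy three-chapter DataFrame; the values and variable names are illustrative.

```python
import pandas as pd

# Toy stand-in for df_Bible with one row per chapter (illustrative values)
df_toy = pd.DataFrame({
    'Book_Order': [1, 1, 1],
    'Book_Name': ['GEN', 'GEN', 'GEN'],
    'Book_and_Chapter': ['GEN 1', 'GEN 2', 'GEN 10'],
    'Chapter_Order': [1, 2, 10],
    'Characters': [3831, 3000, 2582],
    'Characters_Typed': [3831, 0, 2582]})

# Index without Chapter_Order: chapters are sorted as text
df_text_order = df_toy.pivot_table(
    index = ['Book_Order', 'Book_Name', 'Book_and_Chapter'],
    values = ['Characters', 'Characters_Typed'],
    aggfunc = 'sum').reset_index()

# Index with Chapter_Order placed before Book_and_Chapter: chapters
# are sorted numerically
df_chapter_order = df_toy.pivot_table(
    index = ['Book_Order', 'Book_Name', 'Chapter_Order', 'Book_and_Chapter'],
    values = ['Characters', 'Characters_Typed'],
    aggfunc = 'sum').reset_index()

print(df_text_order['Book_and_Chapter'].tolist())
print(df_chapter_order['Book_and_Chapter'].tolist())
```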


The following chart shows both books (as bars) and chapters (as sections of these bars). These sections are also color coded by the proportion of each chapter that has been typed.

In [38]:
fig_characters_typed_in_each_book_and_chapter = px.bar(
df_characters_typed_by_book_and_chapter, x = 'Book_Name', y = [
    'Characters'], color = 'proportion_typed')
fig_characters_typed_in_each_book_and_chapter.write_html(
    'Analyses/characters_typed_by_book_and_chapter.html')
fig_characters_typed_in_each_book_and_chapter.write_image(
    'Analyses/characters_typed_by_book_and_chapter.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_characters_typed_in_each_book_and_chapter

## Creating similar charts at the chapter level:

These proved difficult to interpret due to the narrowness of the bars, so I'm commenting this code out for now.

In [39]:
# fig_proportion_of_each_chapter_typed = px.bar(
# df_characters_typed_by_book_and_chapter, 
# x = 'Book_and_Chapter', y = 'proportion_typed')
# fig_proportion_of_each_chapter_typed.update_yaxes(range = [0, 1]) # Setting
# # the maximum y value as 1 better demonstrates how much of the Bible
# # has been typed so far
# fig_proportion_of_each_chapter_typed.write_html(
# 'Analyses/proportion_of_each_chapter_typed.html')
# fig_proportion_of_each_chapter_typed

# fig_characters_typed_in_each_chapter = px.bar(
# df_characters_typed_by_book_and_chapter, 
# x = 'Book_and_Chapter', y = ['Characters', 'Characters_Typed'], 
# barmode = 'overlay')
# fig_characters_typed_in_each_chapter.write_html(
# 'Analyses/characters_typed_by_chapter.html')
# fig_characters_typed_in_each_chapter

## Calculating the dates with the most characters and verses typed:

In [40]:
df_top_dates_by_characters = df_results.pivot_table(
    index = 'Local_Date', values = 'Characters', aggfunc = 'sum').reset_index(
    ).sort_values('Characters', ascending = False).head(50)
df_top_dates_by_characters['Rank'] = df_top_dates_by_characters[
    'Characters'].rank(ascending = False, method = 'min').astype('int')
# Creating a column that shows both the rank and date: (This also prevents
# Plotly from converting the x axis to a date range, which would interfere
# with the order of the chart items)
df_top_dates_by_characters['Rank and Date'] = '#'+df_top_dates_by_characters[
    'Rank'].astype('str') + ': ' + df_top_dates_by_characters[
        'Local_Date'].astype('str')
df_top_dates_by_characters.reset_index(drop=True,inplace=True)

In [41]:
fig_top_dates_by_characters = px.bar(df_top_dates_by_characters, 
x = 'Rank and Date', y = 'Characters')
fig_top_dates_by_characters.update_xaxes(tickangle = 90)
fig_top_dates_by_characters.write_html('Analyses/top_dates_by_characters.html')
fig_top_dates_by_characters.write_image(
    'Analyses/top_dates_by_characters.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_top_dates_by_characters

In [42]:
df_top_dates_by_verses = df_results.pivot_table(
    index = 'Local_Date', values = 'Count', aggfunc = 'sum').reset_index(
    ).rename(columns = {'Count':'Verses'}).sort_values(
        'Verses', ascending = False).head(50)


df_top_dates_by_verses['Rank'] = df_top_dates_by_verses['Verses'].rank(
    ascending = False, method = 'min').astype('int')
df_top_dates_by_verses['Rank and Date'] = '#'+df_top_dates_by_verses[
    'Rank'].astype('str') + ': ' + df_top_dates_by_verses[
        'Local_Date'].astype('str')
df_top_dates_by_verses.reset_index(drop=True,inplace=True)
df_top_dates_by_verses

Unnamed: 0,Local_Date,Verses,Rank,Rank and Date
0,2023-10-29,251,1,#1: 2023-10-29
1,2023-11-03,103,2,#2: 2023-11-03
2,2023-10-31,39,3,#3: 2023-10-31
3,2023-10-24,21,4,#4: 2023-10-24
4,2023-11-04,19,5,#5: 2023-11-04
5,2023-10-25,12,6,#6: 2023-10-25
6,2023-10-28,8,7,#7: 2023-10-28
7,2023-11-02,6,8,#8: 2023-11-02
8,2023-10-27,1,9,#9: 2023-10-27
9,2023-11-01,1,9,#9: 2023-11-01
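
The last two rows above share rank 9 because both dates have one verse and rank() was called with method = 'min'. The sketch below reproduces that behavior using four of the verse counts from the table and shows how the 'Rank and Date' label is assembled. (df_toy is a hypothetical name.)

```python
import pandas as pd

df_toy = pd.DataFrame({
    'Local_Date': ['2023-10-29', '2023-11-03', '2023-10-27', '2023-11-01'],
    'Verses': [251, 103, 1, 1]})

# method = 'min' gives tied rows the lowest (best) rank in their group
df_toy['Rank'] = df_toy['Verses'].rank(
    ascending = False, method = 'min').astype('int')

# Combining the rank and date into a single string label
df_toy['Rank and Date'] = '#' + df_toy['Rank'].astype('str') + ': ' + df_toy[
    'Local_Date'].astype('str')
print(df_toy['Rank and Date'].tolist())
# ['#1: 2023-10-29', '#2: 2023-11-03', '#3: 2023-10-27', '#3: 2023-11-01']
```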


In [43]:
fig_top_dates_by_verses = px.bar(df_top_dates_by_verses, 
x = 'Rank and Date', y = 'Verses')
fig_top_dates_by_verses.update_xaxes(tickangle = 90)
fig_top_dates_by_verses.write_html('Analyses/top_dates_by_verses.html')
fig_top_dates_by_verses.write_image(
    'Analyses/top_dates_by_verses.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_top_dates_by_verses

## Performing similar analyses by month:

In [44]:
df_top_months_by_characters = df_results.pivot_table(
    index = ['Local_Year', 'Local_Month'], 
    values = 'Characters', aggfunc = 'sum').reset_index(
    ).sort_values('Characters', ascending = False).head(50)

df_top_months_by_characters['Rank'] = df_top_months_by_characters[
'Characters'].rank(ascending = False, method = 'min').astype('int')
df_top_months_by_characters['Rank and Month'] = '#'+df_top_months_by_characters[
    'Rank'].astype('str') + ': ' + df_top_months_by_characters[
        'Local_Year'].astype('str') + '-' + df_top_months_by_characters[
            'Local_Month'].astype('str')
df_top_months_by_characters.reset_index(drop=True,inplace=True)

df_top_months_by_characters

Unnamed: 0,Local_Year,Local_Month,Characters,Rank,Rank and Month
0,2023,10,38076,1,#1: 2023-10
1,2023,11,13580,2,#2: 2023-11


In [45]:
fig_top_months_by_characters = px.bar(df_top_months_by_characters, 
x = 'Rank and Month', y = 'Characters')
fig_top_months_by_characters.update_xaxes(tickangle = 90)
fig_top_months_by_characters.write_html(
    'Analyses/top_months_by_characters.html')
fig_top_months_by_characters.write_image(
    'Analyses/top_months_by_characters.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_top_months_by_characters

In [46]:
df_top_months_by_verses = df_results.pivot_table(
    index = ['Local_Year', 'Local_Month'], 
    values = 'Count', aggfunc = 'sum').reset_index(
    ).rename(columns={'Count':'Verses'}).sort_values(
        'Verses', ascending = False).head(50)

df_top_months_by_verses['Rank'] = df_top_months_by_verses['Verses'].rank(
    ascending = False, method = 'min').astype('int')
df_top_months_by_verses['Rank and Month'] = '#'+df_top_months_by_verses[
    'Rank'].astype('str') + ': ' + df_top_months_by_verses[
        'Local_Year'].astype('str') + '-' + df_top_months_by_verses[
            'Local_Month'].astype('str')
df_top_months_by_verses.reset_index(drop=True,inplace=True)

df_top_months_by_verses

Unnamed: 0,Local_Year,Local_Month,Verses,Rank,Rank and Month
0,2023,10,332,1,#1: 2023-10
1,2023,11,129,2,#2: 2023-11


In [47]:
fig_top_months_by_verses = px.bar(df_top_months_by_verses, 
x = 'Rank and Month', y = 'Verses')
fig_top_months_by_verses.update_xaxes(tickangle = 90)
fig_top_months_by_verses.write_html('Analyses/top_months_by_verses.html')
fig_top_months_by_verses.write_image(
    'Analyses/top_months_by_verses.png', 
    width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_top_months_by_verses

# Analyzing WPM data:

(Some of this section's code derives from my work in [this script](https://github.com/kburchfiel/typeracer_data_analyzer/blob/master/typeracer_data_analyzer_v2.ipynb).)


Top 100 WPM results:

In [48]:
df_top_100_wpm = df_results.sort_values('WPM', ascending = False).head(
    100).copy()
df_top_100_wpm.insert(0, 'Rank', df_top_100_wpm['WPM'].rank(
    ascending = False, method = 'min').astype('int'))
# method = 'min' assigns the lowest rank to any rows that happen to have
# the same WPM. See 
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rank.html
df_top_100_wpm.to_csv('Analyses/top_100_wpm.csv')
df_top_100_wpm

Test_Number,Rank,Unix_Start_Time,Local_Start_Time,UTC_Start_Time,Characters,Seconds,CPS,WPM,Book,Chapter,Verse_Order,Verse,Verse #,Last 10 Avg,Last 100 Avg,Last 1000 Avg,cumulative_avg,Local_Year,Local_Month,Local_Hour,Count,Local_Date
29,1,1.698284e+09,2023-10-25T21:29:51.817669,2023-10-26T01:29:51.817669+00:00,56,3.477334,16.104292,193.251509,GEN,1,1,"In the beginning, God created the heavens and ...",1.0,148.772233,,,129.881,2023,10,21,1,2023-10-25
26,2,1.698284e+09,2023-10-25T21:28:14.539133,2023-10-26T01:28:14.539133+00:00,56,3.647492,15.353016,184.236198,GEN,1,1,"In the beginning, God created the heavens and ...",1.0,139.936067,,,127.120,2023,10,21,1,2023-10-25
237,3,1.698623e+09,2023-10-29T19:47:07.670908,2023-10-29T23:47:07.670908+00:00,116,7.586424,15.290472,183.485667,GEN,4,91,"Now you are cursed because of the ground, whic...",11.0,138.576366,135.941792,,136.777,2023,10,19,1,2023-10-29
341,4,1.699066e+09,2023-11-03T22:48:34.354652,2023-11-04T02:48:34.354652+00:00,77,5.072950,15.178545,182.142544,GEN,8,194,He waited yet another seven days; and again he...,10.0,149.265970,134.422066,,136.074,2023,11,22,1,2023-11-03
414,5,1.699068e+09,2023-11-03T23:23:14.681466,2023-11-04T03:23:14.681466+00:00,154,10.185597,15.119389,181.432669,GEN,10,267,"These are the families of the sons of Noah, by...",32.0,128.663067,136.017588,,135.546,2023,11,23,1,2023-11-03
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
132,96,1.698621e+09,2023-10-29T19:10:15.549816,2023-10-29T23:10:15.549816+00:00,114,8.990945,12.679424,152.153085,1JN,4,34893,No one has seen God at any time. If we love on...,12.0,148.056368,140.098196,,137.324,2023,10,19,1,2023-10-29
451,97,1.699071e+09,2023-11-04T00:04:46.928265,2023-11-04T04:04:46.928265+00:00,62,4.901897,12.648165,151.777984,SNG,2,17404,"My beloved is mine, and I am his. He browses a...",16.0,129.676966,132.304817,,135.338,2023,11,0,1,2023-11-04
447,98,1.699071e+09,2023-11-04T00:02:25.805547,2023-11-04T04:02:25.805547+00:00,132,10.453946,12.626811,151.521727,GEN,12,300,"Now Yahweh said to Abram, ""Leave your country,...",1.0,122.833221,132.649711,,135.367,2023,11,0,1,2023-11-04
81,99,1.698555e+09,2023-10-29T00:55:38.738309,2023-10-29T04:55:38.738309+00:00,209,16.590364,12.597674,151.172086,1JN,2,34842,"I have written to you, fathers, because you kn...",14.0,144.234653,,,138.224,2023,10,0,1,2023-10-29


In [49]:
fig_top_100_wpm = px.bar(df_top_100_wpm, x = 'Rank', y = 'WPM')
fig_top_100_wpm.write_html('Analyses/top_100_wpm.html')
fig_top_100_wpm.write_image('Analyses/top_100_wpm.png', 
width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_top_100_wpm

Top 20 'Last 10 Average' values:

In [50]:
df_top_20_last_10_avg_results = df_results.sort_values(
    'Last 10 Avg', ascending = False).head(20).copy()
df_top_20_last_10_avg_results

Test_Number,Unix_Start_Time,Local_Start_Time,UTC_Start_Time,Characters,Seconds,CPS,WPM,Book,Chapter,Verse_Order,Verse,Verse #,Last 10 Avg,Last 100 Avg,Last 1000 Avg,cumulative_avg,Local_Year,Local_Month,Local_Hour,Count,Local_Date
375,1699067000.0,2023-11-03T22:57:11.140166,2023-11-04T02:57:11.140166+00:00,94,6.98114,13.46485,161.578203,GEN,9,228,"Ham, the father of Canaan, saw the nakedness o...",22.0,155.189199,140.005204,,136.999,2023,11,22,1,2023-11-03
346,1699066000.0,2023-11-03T22:50:09.696543,2023-11-04T02:50:09.696543+00:00,26,2.098202,12.391561,148.698727,GEN,8,199,"God spoke to Noah, saying,",15.0,154.501316,135.941614,,136.214,2023,11,22,1,2023-11-03
345,1699066000.0,2023-11-03T22:49:56.583535,2023-11-04T02:49:56.583535+00:00,79,6.649562,11.880482,142.565789,GEN,8,198,"In the second month, on the twenty-seventh day...",14.0,154.17114,135.389865,,136.178,2023,11,22,1,2023-11-03
369,1699067000.0,2023-11-03T22:56:04.797137,2023-11-04T02:56:04.797137+00:00,171,13.39499,12.765967,153.191607,GEN,9,222,The rainbow will be in the cloud. I will look ...,16.0,154.131763,139.205292,,136.805,2023,11,22,1,2023-11-03
376,1699067000.0,2023-11-03T22:57:19.430587,2023-11-04T02:57:19.430587+00:00,205,16.778885,12.217737,146.612842,GEN,9,229,"Shem and Japheth took a garment, and laid it o...",23.0,152.531903,140.209026,,137.025,2023,11,22,1,2023-11-03
377,1699067000.0,2023-11-03T22:57:39.277346,2023-11-04T02:57:39.277346+00:00,73,5.463816,13.360626,160.327516,GEN,9,230,"Noah awoke from his wine, and knew what his yo...",24.0,151.801505,140.348312,,137.087,2023,11,22,1,2023-11-03
374,1699067000.0,2023-11-03T22:57:05.278503,2023-11-04T02:57:05.278503+00:00,69,4.693784,14.700292,176.403507,GEN,9,227,He drank of the wine and got drunk. He was unc...,21.0,151.231641,139.896429,,136.933,2023,11,22,1,2023-11-03
344,1699066000.0,2023-11-03T22:49:38.177286,2023-11-04T02:49:38.177286+00:00,217,15.961097,13.595557,163.146679,GEN,8,197,"In the six hundred first year, in the first mo...",13.0,151.184573,134.886354,,136.159,2023,11,22,1,2023-11-03
368,1699067000.0,2023-11-03T22:55:51.536428,2023-11-04T02:55:51.536428+00:00,161,11.776451,13.671351,164.056215,GEN,9,221,"I will remember my covenant, which is between ...",15.0,150.438336,138.923141,,136.76,2023,11,22,1,2023-11-03
343,1699066000.0,2023-11-03T22:49:24.201749,2023-11-04T02:49:24.201749+00:00,95,7.094975,13.389757,160.677088,GEN,8,196,"He waited yet another seven days, and sent out...",12.0,149.969348,134.253127,,136.08,2023,11,22,1,2023-11-03


In [51]:
fig_df_results_by_test_number = px.line(df_results, x = df_results.index, 
y = ['WPM', 'Last 10 Avg', 'Last 100 Avg', 'Last 1000 Avg', 'cumulative_avg'])
fig_df_results_by_test_number.write_html('Analyses/results_by_test_number.html')
fig_df_results_by_test_number.write_image('Analyses/results_by_test_number.png', 
width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_df_results_by_test_number

## Evaluating average results by month:

In [52]:
df_results_by_month = df_results.pivot_table(
    index = ['Local_Year', 'Local_Month'], values = ['Count', 'WPM'], 
    aggfunc = {'Count':'sum', 'WPM':'mean'}).reset_index()
# Combining the year and month into a single string column (e.g. '2023-10')
# that can serve as an x axis label:
df_results_by_month['Year/Month'] = df_results_by_month[
    'Local_Year'].astype('str') + '-' + df_results_by_month[
    'Local_Month'].astype('str')
df_results_by_month.to_csv('Analyses/results_by_month.csv')
df_results_by_month

Unnamed: 0,Local_Year,Local_Month,Count,WPM,Year/Month
0,2023,10,332,135.65783,2023-10
1,2023,11,129,134.602562,2023-11
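
Because Local_Month is converted directly to a string, single-digit months would produce labels such as '2023-9'. If zero-padded labels are preferred (so that, for instance, the labels also sort correctly as text), str.zfill() offers one way to add the padding. The rows in this sketch are made up for illustration.

```python
import pandas as pd

# Toy stand-in for df_results_by_month (made-up rows)
df_toy = pd.DataFrame({'Local_Year': [2023, 2023], 'Local_Month': [9, 10]})

# zfill(2) pads single-digit months with a leading zero
df_toy['Year/Month'] = df_toy['Local_Year'].astype('str') + '-' + df_toy[
    'Local_Month'].astype('str').str.zfill(2)
print(df_toy['Year/Month'].tolist())
# ['2023-09', '2023-10']
```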


In [53]:
fig_results_by_month = px.bar(df_results_by_month, x = 'Year/Month', 
y = 'WPM', color = 'Count')
fig_results_by_month.update_xaxes(type = 'category') # This line, based on
# Pracheta's response at https://stackoverflow.com/a/64424308/13097194,
# updates the axes to show the year-month pairs as strings rather than 
# as Plotly-formatted date values. This will also prevent missing
# months from appearing in the graph.
fig_results_by_month.write_html('Analyses/results_by_month.html')
fig_results_by_month.write_image('Analyses/results_by_month.png', 
width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_results_by_month

## Evaluating average results by hour of day:

In [54]:
df_results_by_hour = df_results.pivot_table(index = ['Local_Hour'], 
values = ['Count', 'WPM'], aggfunc = {'Count':'sum', 'WPM':'mean'}).reset_index()
df_results_by_hour

Unnamed: 0,Local_Hour,Count,WPM
0,0,79,138.946476
1,1,17,133.438729
2,18,39,138.736296
3,19,192,134.519461
4,20,6,116.627468
5,21,16,135.802421
6,22,46,145.962938
7,23,66,126.235594
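
Only hours with at least one test appear in the table above. If a table (or chart) covering all 24 hours is wanted, one possible approach is to reindex the pivot table, filling Count with 0 and leaving WPM missing for hours without tests. The sketch uses three rows from the table above with rounded WPM values; the variable names are hypothetical.

```python
import pandas as pd

# Three rows from the table above (WPM rounded)
df_toy_by_hour = pd.DataFrame({
    'Local_Hour': [0, 19, 22],
    'Count': [79, 192, 46],
    'WPM': [138.9, 134.5, 146.0]})

# Reindexing on hours 0 through 23 adds a row for each missing hour
all_hours = pd.Index(range(24), name = 'Local_Hour')
df_all_hours = df_toy_by_hour.set_index('Local_Hour').reindex(all_hours)

# Hours without tests get a Count of 0; their WPM values remain NaN
df_all_hours['Count'] = df_all_hours['Count'].fillna(0).astype('int')
df_all_hours = df_all_hours.reset_index()
print(df_all_hours.head())
```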


In [55]:
fig_results_by_hour = px.bar(df_results_by_hour, x = 'Local_Hour', 
y = 'WPM', color = 'Count')
fig_results_by_hour.write_html('Analyses/results_by_hour.html')
fig_results_by_hour.write_image('Analyses/results_by_hour.png', 
width = 1920, height = 1080, engine = 'kaleido', scale = 2)
fig_results_by_hour

# Comparing mean WPMs by Bible books:

In [56]:
df_wpm_by_book = df_results.pivot_table(index = 'Book', values = 'WPM', 
aggfunc = ['count', 'mean'], margins = True, 
margins_name = 'Total').reset_index()
df_wpm_by_book.columns = 'Book', 'Tests', 'Mean WPM'
df_wpm_by_book.sort_values('Mean WPM', ascending = False, inplace = True)
df_wpm_by_book.reset_index(drop=True,inplace=True)
df_wpm_by_book.to_csv('Analyses/mean_wpm_by_book.csv')

df_wpm_by_book


Unnamed: 0,Book,Tests,Mean WPM
0,REV,1,161.093106
1,2JN,4,151.783862
2,JUD,1,148.672335
3,3JN,15,147.63011
4,SNG,3,144.0843
5,2KI,1,143.531468
6,1JN,105,138.061187
7,Total,461,135.362538
8,GEN,325,134.003192
9,NUM,2,123.507441


In [57]:
# The following chart will display a bar for each book for which at least one 
# test has been taken. It will also show a line that corresponds to the player's
# overall WPM across all books. The bars are colored by test count, making
# it easier to identify which bars might be skewed by a low number of results.
# The 'Total' value in df_wpm_by_book is displayed as a line instead of as
# a color so as not to interfere with the color gradient.

# Retrieving the total mean WPM value in df_wpm_by_book:
total_mean_wpm = df_wpm_by_book.query("Book == 'Total'").iloc[0]['Mean WPM']
total_mean_wpm

fig_mean_wpm_by_book = px.bar(df_wpm_by_book.query("Book != 'Total'"), 
x = 'Book', y = 'Mean WPM', color = 'Tests')
fig_mean_wpm_by_book.add_shape(type = 'line', x0 = -0.5, 
x1 = len(df_wpm_by_book) -1.5, y0 = total_mean_wpm, y1 = total_mean_wpm)
# See https://plotly.com/python/shapes/ for the add_shape() code.
# The use of -0.5 and len() - 1.5 is based on gleasocd's answer at 
# https://stackoverflow.com/a/40408960/13097194 . len(df) - 0.5 would normally
# work, except that I reduced the size of the DataFrame by 1 when excluding
# the 'Total' book.
fig_mean_wpm_by_book.write_html('Analyses/mean_wpm_by_book.html')
fig_mean_wpm_by_book.write_image('Analyses/mean_wpm_by_book.png', width = 1920, 
height = 1080, engine = 'kaleido', scale = 2)
fig_mean_wpm_by_book


In [58]:
analysis_end_time = time.time()
analysis_time = analysis_end_time - analysis_start_time
print(f"Finished updating analyses in {round(analysis_time, 3)} seconds. \
Enter any key to exit.") # Allows the console to stay open when the
# .py version of the program is run

input()

Finished updating analyses in 17.952 seconds. Enter any key to exit.


''