Commit

Clean up simulation notebooks and streamlit page
jbreffle committed Feb 18, 2024
1 parent 0bd9a2f commit 972628b
Showing 3 changed files with 64 additions and 46 deletions.
33 changes: 9 additions & 24 deletions notebooks/3a_sim_simple.ipynb
@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Simulation by drawing mistake counts\n"
"# Simple speed typing simulation\n"
]
},
{
@@ -14,11 +14,6 @@
"## Set up\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": 1,
@@ -42,7 +37,6 @@
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import scipy\n",
"\n",
"from src import util\n",
"from src import plot"
@@ -54,22 +48,13 @@
"source": [
"## Simulate\n",
"\n",
"Note: Need to fix the dependent parameter calculations\n",
"\n",
"For each trial it draws a random number of mistakes and then simulates the typing\n",
"For each trial, draw a random number of mistakes and then simulate the typing\n",
"speed and accuracy for that trial.\n",
"The number of mistakes is a Poisson distribution with some mean mistakeLambda\n",
"Each mistake is assumed to take a certain amount of time to correct (normal\n",
"distribution)\n",
"The total amount of time to correct all mistakes reduces the final wpm\n",
"\n",
"Simulates a typing test accuracy and speed\n",
"There is an underlying average wpm and accuracy\n",
"Mistakes are generated randomly\n",
"\n",
"For each test, simulate the number of mistakes and the resulting wpm due to mistake\n",
"delays\n",
"This is a simple simulation: it does not take into account...\n"
"The number of mistakes follows a Poisson distribution with some mean $\\lambda$.\n",
"Each mistake is assumed to take a certain amount of time to correct (log-normal\n",
"distribution).\n",
"The total amount of time to correct all mistakes reduces the final wpm."
]
},
{
@@ -123,7 +108,7 @@
}
],
"source": [
"# Plot a histogram of the number of mistakes per trial\n",
"# Histogram of the number of mistakes per trial\n",
"fig = plt.figure(figsize=(6, 2))\n",
"ax = plot.sim_n_mistakes(n_mistakes)\n",
"plt.show()"
@@ -146,7 +131,7 @@
}
],
"source": [
"# Plot scatter_hist of wpm and acc\n",
"# scatter_hist of wpm and acc\n",
"fig = plt.figure(figsize=(6, 4))\n",
"ax, ax_histx, ax_histy = plot.sim_scatter_hist(wpm, acc, fig=fig)\n",
"ax.axvline(avg_wpm, color=\"k\", linestyle=\"--\", alpha=0.5)\n",
@@ -159,7 +144,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## For streamlit"
"## Different parameters"
]
},
{
Expand Down
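Reviewer note: the markdown cell above describes the simple simulation only in prose, and the notebook's own simulation code is not visible in this diff. The following is a minimal sketch of that description under stated assumptions, not the repository's implementation. The function name `simple_sim_sketch`, the 60-second trial length, the 5-characters-per-word convention, and the accuracy formula are all assumptions made for illustration; the 0.5 s mean and 0.45 s standard deviation are the correction-time values quoted in the Streamlit text of this commit.

```python
import numpy as np


def simple_sim_sketch(avg_wpm=60.0, avg_acc=0.95, n_trials=1000,
                      trial_seconds=60.0, fix_mean=0.5, fix_sd=0.45, seed=0):
    """Hypothetical stand-in for the notebook's simple simulation."""
    rng = np.random.default_rng(seed)
    chars_per_word = 5  # assumed WPM convention

    # Characters typed in a mistake-free trial, and expected mistakes per trial
    n_chars = avg_wpm * chars_per_word * trial_seconds / 60.0
    lam = n_chars * (1.0 - avg_acc)
    n_mistakes = rng.poisson(lam, size=n_trials)

    # Log-normal parameters chosen so correction times have the requested mean and sd
    sigma2 = np.log(1.0 + (fix_sd / fix_mean) ** 2)
    mu = np.log(fix_mean) - sigma2 / 2.0

    # Total correction time per trial, capped at the trial length
    lost = np.array(
        [rng.lognormal(mu, np.sqrt(sigma2), size=k).sum() for k in n_mistakes]
    )
    lost = np.minimum(lost, trial_seconds)

    # Time spent correcting mistakes is time not spent typing
    chars_typed = n_chars * (trial_seconds - lost) / trial_seconds
    wpm = chars_typed / chars_per_word / (trial_seconds / 60.0)
    # Assumed accuracy definition: correct characters over all characters typed
    acc = chars_typed / np.maximum(chars_typed + n_mistakes, 1.0)
    return wpm, acc, n_mistakes
```

Calling `wpm, acc, n_mistakes = simple_sim_sketch()` returns one value per trial. Because every mistake both lowers accuracy and removes typing time, the sketch yields a positive correlation between `wpm` and `acc` without any causal link from carefulness to speed, which is the hypothesis the Streamlit page describes.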
17 changes: 12 additions & 5 deletions notebooks/3b_sim_poisson.ipynb
@@ -49,15 +49,22 @@
"## Run simulation\n",
"\n",
"Simulates typing as a Poisson process in discrete time, where in each time bin there\n",
"is a probability of either typing a correct letter, a wrong letter, or no letter at all.\n",
"is a probability of typing a correct letter, a wrong letter, or no letter at all.\n",
"\n",
"Error of approximating Poisson as Bernoulli is determined by ratio of time step (dt)\n",
"to characters per second (avg_correct_cps)\n",
"Error leads to slightly lower wpm than expected, but is negligible for reasonable\n",
"The error of simulating a Poisson process in discrete time is determined by the ratio of time step (dt)\n",
"to characters per second (avg_correct_cps). \n",
"This error leads to slightly lower wpm than expected, but is negligible for reasonable\n",
"values of dt and avg_correct_cps\n",
"\n",
"Incorporates an error time cost (each error causes a delay in typing, in order to fix it).\n",
"Set error_cost=0 to assume no cost to correcting errors.\n"
"Set error_cost=0 to assume no cost to correcting errors."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 0 Cost simulation"
]
},
{
Expand Down
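Reviewer note: the discrete-time scheme described in the markdown cell above can be sketched as follows. This is an illustration under assumptions, not the notebook's code, which is not shown in this diff. The function name `poisson_sim_sketch`, the 30-second trial, the 5-characters-per-word convention, and the way the wrong-character rate is derived from `avg_acc` are hypothetical; `dt`, `avg_correct_cps`, and `error_cost` follow the names used in the cell. The sketch does not attempt to reproduce the small downward wpm bias the cell mentions.

```python
import numpy as np


def poisson_sim_sketch(avg_correct_cps=5.0, avg_acc=0.95, trial_seconds=30.0,
                       dt=0.001, error_cost=0.0, seed=0):
    """Hypothetical single-trial simulation with one Bernoulli draw per time bin."""
    rng = np.random.default_rng(seed)

    # Wrong-character rate chosen so correct / (correct + wrong) equals avg_acc
    avg_wrong_cps = avg_correct_cps * (1.0 - avg_acc) / avg_acc
    p_correct = avg_correct_cps * dt
    p_wrong = avg_wrong_cps * dt
    if p_correct + p_wrong > 1.0:
        raise ValueError("dt is too large for the requested typing rate")

    n_bins = int(round(trial_seconds / dt))
    cost_bins = int(round(error_cost / dt))
    u = rng.random(n_bins)  # one uniform draw per time bin

    n_correct = 0
    n_wrong = 0
    i = 0
    while i < n_bins:
        if u[i] < p_correct:
            n_correct += 1
        elif u[i] < p_correct + p_wrong:
            n_wrong += 1
            i += cost_bins  # bins spent fixing the error produce no characters
        i += 1

    wpm = n_correct / 5.0 / (trial_seconds / 60.0)  # assumed 5 characters per word
    typed = n_correct + n_wrong
    acc = n_correct / typed if typed else float("nan")
    return wpm, acc, n_wrong
```

With `error_cost=0` the expected wpm in this sketch is `avg_correct_cps * 60 / 5`. With a positive `error_cost` and the same seed, the run can only visit a subset of the same time bins, so its wpm is never higher than the zero-cost run.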
60 changes: 43 additions & 17 deletions streamlit/pages/2_Simulated_typing.py
@@ -33,66 +33,92 @@ def main():
Home.configure_page(page_title="Simulated typing")

# Data set up
data_df, _ = Home.load_data()
_, _ = Home.load_data()

# Page introduction
st.title("Simulated typing")
iid_url = "https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables"
st.write(
"""
f"""
One important question when practicing typing is,
"how carefully should one try to avoid mistakes?"
In the raw data we can see that performance (wpm) is strongly correlated with
accuracy. But is this correlation causal?
One hypothesis is that mistakes are i.i.d. random,
One hypothesis is that mistakes are [i.i.d.](<{iid_url}>),
and each mistake requires a fixed time to correct
(time that otherwise would have been spent typing).
This alone might cause the degree of correlation observed in the data.
Is the best approach for practicing typing to balance the probability of making
a mistake with the time it takes to correct it?
We can study the causal relationship between accuracy and wpm through
simulations.
"""
)
st.divider()

st.subheader("Simulated typing: random mistake draws")
st.write(
f"""
One method to simulate typing is to randomly draw mistakes.
"""
A simple method of simulating a typing session is to draw a random
number of mistakes from a Poisson distribution and then assume each of those
mistakes takes some random amount of time to correct.
The WPM and accuracy can then be calculated based on those random values.
Here we see results from 1000 such simulated trials.
If we assume an average performance of 60 WPM and 95% accuracy
then we can reproduce the $R^2$ that is observed in the actual data
when we assume each mistake takes an average of 0.5 seconds to correct
with a standard deviation of 0.45.
"""
)
# TODO
avg_wpm = 60
avg_acc = 0.95
n_trials = 1000
wpm, acc, n_mistakes = run_simple_sim(
avg_wpm=avg_wpm, avg_acc=avg_acc, n_trials=n_trials
)
wpm, acc, _ = run_simple_sim(avg_wpm=avg_wpm, avg_acc=avg_acc, n_trials=n_trials)
# Plot scatter_hist of wpm and acc
fig = plt.figure(figsize=(6, 4))
ax, ax_histx, ax_histy = plot.sim_scatter_hist(wpm, acc, fig=fig)
ax, _, _ = plot.sim_scatter_hist(wpm, acc, fig=fig)
ax.axvline(avg_wpm, color="grey", linestyle="--", alpha=0.5)
ax.axhline(avg_acc, color="grey", linestyle="--", alpha=0.5)
ax.plot(np.mean(wpm), np.mean(acc), "ro")
st.pyplot(fig, use_container_width=True, transparent=True)
# TODO
st.write(
"""
The dashed grey lines show the target WPM and accuracy.
The red dot is the mean WPM and accuracy over all trial simulations.
The red line is the linear regression.
"""
)
st.divider()

st.subheader("Simulated typing: Poisson process")
st.write(
f"""
An alternative simulation method is to use a Poisson process.
"""
A more complicated but more realistic simulation approach would be to simulate
typing behavior across time within each trial using a Poisson process.
We see that we reproduce similar results to the simple method.
Here we model mistakes as a Poisson process and
assume each mistake takes 0.75 seconds to fix.
"""
)
avg_wpm = 60
avg_acc = 0.95
wpm, acc, n_mistakes = run_poisson_sim(avg_wpm=avg_wpm, avg_acc=avg_acc)
wpm, acc, _ = run_poisson_sim(avg_wpm=avg_wpm, avg_acc=avg_acc)
# Plot scatter_hist of wpm and acc
fig = plt.figure(figsize=(6, 4))
ax, ax_histx, ax_histy = plot.sim_scatter_hist(wpm, acc, fig=fig)
ax, _, _ = plot.sim_scatter_hist(wpm, acc, fig=fig)
ax.axvline(avg_wpm, color="grey", linestyle="--", alpha=0.5)
ax.axhline(avg_acc, color="grey", linestyle="--", alpha=0.5)
ax.plot(np.mean(wpm), np.mean(acc), "ro")
st.pyplot(fig, use_container_width=True, transparent=True)
st.write(
"""
The dashed grey lines show the target WPM and accuracy.
The red dot is the mean WPM and accuracy over all trial simulations.
The red line is the linear regression.
"""
)
st.divider()

nb_url_1 = "https://github.com/jbreffle/monkeytype-analysis/blob/main/notebooks/3a_sim_simple.ipynb"
@@ -101,7 +127,7 @@ def main():
f"""
Click here
[./notebooks/3a_sim_simple.ipynb]({nb_url_1})
for the simple simulation method notebook.
for the simple simulation notebook.
Click here
[./notebooks/3b_sim_poisson.ipynb]({nb_url_2})
Expand Down
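Reviewer note: the page text refers to the $R^2$ between WPM and accuracy and to a red regression line, but the regression code inside `plot.sim_scatter_hist` is not part of this diff. A minimal sketch of how such a fit and its $R^2$ could be computed with NumPy is given below; the helper name `linear_fit_r2` is hypothetical and is not a function from the repository.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R^2 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation around the mean
    return slope, intercept, 1.0 - ss_res / ss_tot
```

Passing the simulated `wpm` and `acc` arrays to this helper gives a value that can be compared against the same statistic computed on the real typing data. For a simple linear regression the $R^2$ is the same whichever variable is placed on the x-axis.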
