Data from the online game Axon and code for analysing it

Tracing the Trajectory of Skill Learning With a Very Large Sample of Online Game Players

Tom Stafford & Mike Dewar

This code accompanies our paper

Stafford, T. & Dewar, M. "Tracing the Trajectory of Skill Learning With a Very Large Sample of Online Game Players"

which is in press at Psychological Science (expected January 2014). Previously this work was presented at the Cognitive Science Society conference in Berlin in August 2013 under the title "Testing theories of skill learning using a very large sample of online game players".


In the present study, we analyzed data from a very large sample (N= 854,064) of players of an online game involving rapid perception, decision making, and motor responding. Use of game data allowed us to connect, for the first time, rich details of training history with measures of performance from participants engaged for a sustained amount of time in effortful practice. We showed that lawful relations exist between practice amount and subsequent performance, and between practice spacing and subsequent performance. Our methodology allowed an in situ confirmation of results long established in the experimental literature on skill acquisition. Additionally, we showed that greater initial variation in performance is linked to higher subsequent performance, a result we link to the exploration/exploitation trade-off from the computational framework of reinforcement learning. We discuss the benefits and opportunities of behavioral data sets with very large sample sizes and suggest that this approach could be particularly fecund for studies of skill acquisition.

Files in this Repository


  • stafford_and_dewar_revision.pdf - the final submitted version of the paper
  • Psychscience__Response_to_Review.pdf - the accompanying response to the reviewers
  • data_by_cookie.json - the raw data upon which all the results are based


The following are all analysis files, written in Python, that generate the results reported in the paper and in the response to reviewers.

  • - make Figure 2 from the Psych Science paper. Control analyses follow.
  • - equates players on first one/two scores. Also allows you to calculate maximum score on Nth play rather than on any (unspecified) play
  • - normalises plots to an initial score of zero
  • - compares learning curves for aggregate score vs average score (rather than attempt number vs average score)
  • - make Figure 3 from the Psych Science paper.
  • - extract observed data (required for
  • - bootstrap confidence intervals (required for
  • - make Figure 4 from the Psych Science paper
  • - supplementary analysis, show attempt number vs average score
  • - supplementary graph, performs analysis of the explore exploit result (reported in the paper, page 5)
  • - the observed data
  • - CIs on the correlation
  • - shows that for players with a high number of attempts the learning curve regularity doesn't hold
  • - graph for resters vs goers result (reported in the paper, page 4, column 2)

See \Figures for graphs produced by this lot
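
For readers unfamiliar with the bootstrap confidence intervals mentioned in the file list, the following is a minimal sketch of a percentile bootstrap for a correlation coefficient. It is not the code from this repository. The function names (`pearson_r`, `bootstrap_ci`), the parameters, and the synthetic data are assumptions made for illustration, and the analysis scripts listed above may use a different procedure.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation.

    Hypothetical helper: resamples (x, y) pairs with replacement,
    recomputes the correlation each time, and returns the alpha/2 and
    1 - alpha/2 percentiles of the resampled correlations.
    """
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        # Draw n pair indices with replacement so x and y stay matched.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data (NOT the Axon data): one value per player for
# early score variability and one for later performance.
rng = random.Random(1)
early_variability = [rng.gauss(0, 1) for _ in range(200)]
later_score = [v + rng.gauss(0, 1) for v in early_variability]

r = pearson_r(early_variability, later_score)
lower, upper = bootstrap_ci(early_variability, later_score)
print(f"r = {r:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Resampling whole (x, y) pairs, rather than each variable separately, preserves the dependence between the two measures, which is what the interval is meant to describe.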