# Exploration and Skill Acquisition in a Major Online Game

* Tom Stafford; University of Sheffield, t.stafford@shef.ac.uk
* Sam Devlin; Digital Creativity Labs, University of York
* Anders Drachen; Digital Creativity Labs, University of York
* Rafet Sifa; Fraunhofer IAIS, Germany

### Tracking skill acquisition across 21,543 players of the first-person shooter game Destiny shows that variability in some – but not all – dimensions of practice enhances learning

### CogSci17: 39th Annual Meeting of the Cognitive Science Society, London, UK, July 26th – July 29th, 2017
### Poster 140, Poster Session 1, Monarch Suite, Thursday, July 27, 1:20pm – 2:50pm

Paper: https://mindmodeling.org/cogsci2017/papers/0615/index.html  
Full scripts & sample data: https://osf.io/c59n9/  
Poster: http://tomstafford.staff.shef.ac.uk/docs/CogSci17_TomStafford.jpg  
This notebook: https://github.com/tomstafford/destiny/blob/master/variability.ipynb  

## Initialise

In [40]:
#libraries
# - standard
import os #directory and file functions
import socket #machine id
import pandas as pd #data munging 
import sys #for getting Python version
#Using arrow instead of datetime because @treycausey told me to
#http://crsmithdev.com/arrow/
import arrow

# - bespoke
from destiny_funcs import maketime #row -> timestamp; it uses arrow, so destiny_funcs.py needs its own `import arrow`


import numpy as np #number functions
import pylab as plt #graphing functions
#from rpy2.robjects.packages import importr #obviously requires R to be installed
#psychometric=importr('psychometric') #first run install.packages("psychometric") in R

print("Python version = " + sys.version)
print("working directory = " + os.getcwd())
print("Machine = " + socket.gethostname())

Python version = 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609]
working directory = /home/tom/Dropbox/university/expts/destiny_public/destiny
Machine = tom-xps


In [41]:
# Parameters
logtransform = False

#all graphs to have same x y range
xmin=0.5 
xmax=1.1

cRmin=60;cRmax=130
if logtransform:
    cRmin=np.log(60);cRmax=np.log(130)

perfvar='combatRating' #'killsDeathsRatio'
if logtransform:
    perfvar='Ln_combatRating' #'killsDeathsRatio'

#display format for numbers
#pd.set_option('display.float_format', lambda x: '%.3f' % x)
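
The two `if logtransform:` branches above have to stay in sync: the plotting range and the column name both switch to log units together. A minimal sketch of bundling them in one helper follows. `performance_settings` is a hypothetical name that does not appear in the notebook, and the sketch assumes the `Ln_combatRating` column holds the natural log of `combatRating`.

```python
import math

def performance_settings(logtransform, lo=60, hi=130):
    """Return (column name, axis minimum, axis maximum) for the performance measure.

    Hypothetical helper mirroring the parameter cell: when logtransform is True
    the axis limits are converted to natural-log units and the column name changes.
    """
    if logtransform:
        return 'Ln_combatRating', math.log(lo), math.log(hi)
    return 'combatRating', lo, hi

perfvar, cRmin, cRmax = performance_settings(logtransform=False)
print(perfvar, cRmin, cRmax)
```

Keeping the three values in one return statement makes it harder to change the column name without also changing the axis limits.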


## The Data

In [42]:
#----------------- full data not publicly available (sorry!), but we'll load summary data if we're not running locally

local=True
if local:
    filepath='../../destiny/data/validation_dataset/' #location of full dataset on Tom's machine
    dfilename='validation-dataset-rldat.csv' 
    gfilename='grimScore_validationPlayers.csv'
    pfilename='validation-dataset-entropyAcrossEventTypesForFirst25Days.csv'

df=pd.read_csv(filepath+dfilename) #main player data
    
gf=pd.read_csv(filepath+gfilename) #final grimoire score of each player
gf.columns= ['userid','grimscore']

pf=pd.read_csv(filepath+pfilename) #entropy across playmodes
pf.columns= ['userid','eventEntropy25']
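
The notebook cells shown here do not say how the per-player tables `gf` and `pf` are combined with the play-level data in `df`. Renaming their key column to `userid` suggests a join on player ID, so here is a minimal sketch of one way to do it, assuming `userid` corresponds to `destinyMembershipId`. The toy frames and all values in them are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for df, gf and pf (values are invented, not from the dataset)
plays = pd.DataFrame({'destinyMembershipId': [1, 1, 2, 3],
                      'combatRating': [90.0, 95.0, 110.0, 80.0]})
grim = pd.DataFrame({'userid': [1, 2], 'grimscore': [3000, 4500]})
entropy = pd.DataFrame({'userid': [1, 2, 3], 'eventEntropy25': [0.5, 1.2, 0.9]})

# One row per player, keeping players that appear in either table
players = grim.merge(entropy, on='userid', how='outer')

# Attach the per-player values to every play; a left join keeps all plays,
# leaving NaN where a player has no score
merged = plays.merge(players, left_on='destinyMembershipId',
                     right_on='userid', how='left')
print(merged)
```

A left join is used so that no plays are dropped when a player is missing from one of the summary files; whether the original analysis keeps or excludes such players is not stated.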

In [43]:
df.head() #example of what the data look like (nb not all columns shown)

| Unnamed: 0 | destinyMembershipId | date | PvPEventCount | totalDeathDistance | activitiesWon | totalKillDistance | deaths | averageLifespan | objectivesCompleted | averageKillDistance | ... | assists | resurrectionsReceived | activitiesEntered | score | averageScorePerLife | maximumWeaponLevel | allParticipantsCount | maximumPowerLevel | highestCharacterLevel | winLossRatio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4611686018428705658 | 2015-12-01T00:00:00Z | 1 | 0 | 22 | 8023 | 216 | 75.723502 | 32 | 17.868597 | ... | 90 | 16 | 31 | 93182 | 429.410138 | 0 | 245 | 0 | 40 | 2.444444 |
| 1 | 4611686018428705658 | 2015-12-02T00:00:00Z | 2 | 0 | 0 | 275 | 17 | 39.555556 | 3 | 16.176471 | ... | 6 | 0 | 1 | 3695 | 205.277778 | 0 | 16 | 0 | 40 | 0.0 |
| 2 | 4611686018428705658 | 2015-12-03T00:00:00Z | 3 | 0 | 1 | 254 | 1 | 193.5 | 0 | 13.368421 | ... | 1 | 0 | 1 | 2505 | 1252.5 | 0 | 6 | 0 | 40 | -1.0 |
| 3 | 4611686018428705658 | 2015-12-04T00:00:00Z | 4 | 0 | 18 | 5873 | 106 | 124.130841 | 0 | 30.273196 | ... | 33 | 42 | 25 | 103 | 0.962617 | 0 | 150 | 0 | 40 | 2.571429 |
| 4 | 4611686018428705658 | 2015-12-08T00:00:00Z | 5 | 0 | 7 | 2781 | 104 | 62.038095 | 1 | 17.490566 | ... | 22 | 3 | 10 | 23625 | 225.0 | 0 | 85 | 0 | 40 | 2.333333 |


## Data Munging

In [44]:
#our headings
cols=df.columns.values


print("Rows in dataset   = " + str(len(df)))
print("Unique player IDs = " + str(df['destinyMembershipId'].nunique())) #name the ID column explicitly rather than relying on column order

#create a unique timestamp for each play
df['time'] =df.apply(maketime, axis=1)

#use this to create a sequential count of each game, for each player
#(ranker and tagger are not defined in the cells shown here; they must be defined or imported before this cell runs)
df.sort_values('time', ascending=True, inplace=True)
df=df.groupby('destinyMembershipId').apply(ranker)
#now find some by-player stats
df=df.groupby('destinyMembershipId').apply(tagger)

Rows in dataset   = 703602
Unique player IDs = 12861


NameError: ("name 'arrow' is not defined", 'occurred at index 0')
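
The `NameError` above is raised while `maketime` is applied row by row. Since `arrow` is imported in the notebook itself, the most likely cause is that `destiny_funcs` uses `arrow` without importing it; an import in the notebook is not visible inside another module. The sketch below shows the two steps this cell describes, a timestamp per row and a sequential play count per player, using pandas alone. It is not the authors' `maketime` or `ranker`: `pd.to_datetime` is swapped in for `arrow`, `add_play_number` and the `playNumber` column are hypothetical names, and the toy rows are made up for illustration.

```python
import pandas as pd

# Toy rows in the same shape as the real data (values are invented)
toy = pd.DataFrame({
    'destinyMembershipId': [1, 2, 1, 2, 1],
    'date': ['2015-12-03T00:00:00Z', '2015-12-01T00:00:00Z',
             '2015-12-01T00:00:00Z', '2015-12-02T00:00:00Z',
             '2015-12-02T00:00:00Z'],
})

# Parse the ISO 8601 strings into timezone-aware timestamps (stand-in for maketime)
toy['time'] = pd.to_datetime(toy['date'], utc=True)

def add_play_number(frame):
    """Hypothetical stand-in for ranker: number each player's plays from 1 in time order."""
    out = frame.sort_values('time').copy()
    # cumcount numbers rows within each group from 0, following the current row order
    out['playNumber'] = out.groupby('destinyMembershipId').cumcount() + 1
    return out

ranked = add_play_number(toy)
print(ranked[['destinyMembershipId', 'date', 'playNumber']])
```

Parsing the whole column at once and using `cumcount` avoids calling a Python function once per row or per group, which matters on a table of roughly 700,000 rows.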