# Skill estimation using Stan

## Potential Ideas 

    1. Compare the ability of your model to predict the winner of new (unseen) games to simple approaches, such as fraction of games won, number of games played, etc.
        - figure out ways to evaluate model performance: 
            (a) predict the winner of games in validation (Done)
            (b) predict the fraction of games won (Done)
            (c) predict game result: [? - ?] 
    
    2. Try evaluating how many games are required to accurately predict the players' skill levels / win probability by decreasing the amount of training data available and observing the performance. 
        - prune the data during preprocessing:
            (a) vary the number of most recent games kept per player --> look at the distribution of dates in train and valid first
            (b) vary the max number of opponents counted per player
            (c) vary the max number of games counted per opponent pair
    
    3. Try evaluating how quickly you can determine a new player's skill by either random game choices or carefully chosen games (matched based on estimated skill level). You can leave a player out of the inference process entirely, then slowly add their games in and see how quickly you are able to learn their relative position.
        - may need to build a new model: 
            * input: skill levels for old players, plus the games involving the new player(s).
            * output: the estimated skill level for the new player(s).
    
    4. Experiment with learning a more complex model, for example taking into account game features (player's selected character) or additional latent scores (such as offensive and defensive skill) along with a correspondingly more elaborate probability of win function.
        - Add weights for match date (more recent matches count more) -> related to Idea 2.
        - # plays
        - race
        - addon
        - tournament-type
        * These new features may need a hypothesis test later to verify that they significantly affect the results.
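
As a reference point for Idea 1, here is a minimal sketch of a "fraction of games won" baseline. The `(a, b, a_won)` tuple format, the function name `win_fraction_baseline`, and the `smoothing` pseudo-count are assumptions made for illustration; they are not part of the notebook's data pipeline.

```python
from collections import defaultdict

def win_fraction_baseline(train_games, smoothing=1.0):
    """Return a function predicting P(a beats b) from smoothed win fractions.

    train_games: iterable of (a, b, a_won) tuples (hypothetical format).
    smoothing:   pseudo-count of wins and of losses added per player (assumption).
    """
    wins = defaultdict(float)
    played = defaultdict(float)
    for a, b, a_won in train_games:
        played[a] += 1
        played[b] += 1
        wins[a if a_won else b] += 1

    def rate(p):
        # Smoothed win fraction; a player with no games gets 0.5.
        return (wins[p] + smoothing) / (played[p] + 2 * smoothing)

    def predict(a, b):
        # Normalise the two win fractions so predict(a, b) + predict(b, a) == 1.
        ra, rb = rate(a), rate(b)
        return ra / (ra + rb)

    return predict

games = [(1, 2, True), (1, 3, True), (2, 3, False), (1, 2, False)]
predict = win_fraction_baseline(games)
print(predict(1, 2))
```

The same held-out games can then be scored with both this baseline and the Stan model, which makes any gain from the skill model easier to interpret.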

In [26]:
import numpy as np
np.random.seed(66)
import pystan
import matplotlib.pyplot as plt
%matplotlib inline

import pickle

## Starting Point: Use the sample model

In [27]:
skill_model = """
data {
  int<lower=1> N;                  // total number of players
  int<lower=1> E;                  // number of games
  real<lower=0> scale;             // scale value for probability computation
  int<lower=0,upper=1> win[E];     // 1 if PA[i] beat PB[i], else 0
  int<lower=1,upper=N> PA[E];      // first player in each game (1-based ID)
  int<lower=1,upper=N> PB[E];      // second player in each game (1-based ID)
}
parameters {
  vector[N] skill;                 // skill value for each player
}

model {
  skill ~ normal(0, 3);            // prior on every player's skill
  // win probability is a logistic function of the scaled skill difference
  win ~ bernoulli_logit(scale * (skill[PA] - skill[PB]));
}
"""
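
To get a feel for the likelihood in the model block, the following sketch evaluates the same win probability in NumPy. The helper name `win_probability` is hypothetical, and `scale=0.3` is the value passed in `skill_data` later in this notebook.

```python
import numpy as np

def win_probability(skill_a, skill_b, scale=0.3):
    """P(A beats B): logistic function of the scaled skill difference."""
    return 1.0 / (1.0 + np.exp(-scale * (skill_a - skill_b)))

# How the win probability grows with the skill gap (B fixed at skill 0).
for diff in [0.0, 1.0, 3.0, 6.0]:
    print(diff, round(float(win_probability(diff, 0.0)), 3))
```

Equal skills give a probability of 0.5, and swapping the two players gives the complementary probability, so the model is symmetric in how it treats PA and PB.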

Now, compile the model.  

In [28]:
try:     # load the model if it has already been compiled
    with open('skill_model.pkl', 'rb') as f:
        sm = pickle.load(f)
except FileNotFoundError:  # otherwise, compile it and cache the result
    sm = pystan.StanModel(model_code=skill_model)
    with open('skill_model.pkl', 'wb') as f:
        pickle.dump(sm, f)

## Processing data

In [47]:
def load_data(dir='data/', pKeep=1.0, nEdge=3, nKeep=5, opt='train'):
    with open(dir+opt+'.csv', encoding='utf-8') as f:
        lines = f.read().split('\n')

    p = 0
    playerid = {}
    for i in range(len(lines)):
        csv = lines[i].split(',')
        if len(csv) != 10: 
            continue   # parse error or blank line
        player0,player1 = csv[1],csv[4]
        if player0 not in playerid:
            playerid[player0]=p
            p+=1
        if player1 not in playerid:
            playerid[player1]=p
            p+=1

    
    # Sparsifying parameters (discard some training examples):
    # pKeep = 1.0   # fraction of edges to consider (immed. throw out 1-p edges)
    # nEdge = 3     # try to keep nEdge opponents per player (may be more; asymmetric)
    # nKeep = 5     # keep at most nKeep games per opponent pairs (play each other multiple times)

    wins = []
    playerA, playerB = [], []
    nplayers = len(playerid)
    nplays = np.zeros( (nplayers,nplayers) )
    
#     for i in np.random.permutation(len(lines)):
    for i in range(len(lines)):
        csv = lines[i].split(',')
        if len(csv) != 10:
            continue   # parse error or blank line
        a,b = playerid[csv[1]],playerid[csv[4]]
        aw,bw = csv[2]=='[winner]',csv[5]=='[winner]'
        
        if (np.random.rand() < pKeep):
            if (nplays[a,b] < nKeep) and ( ((nplays[a,:]>0).sum() < nEdge) or ((nplays[:,b]>0).sum() < nEdge) ):
                nplays[a,b] += 1
                nplays[b,a]+=1
                
                playerA.append(a+1)
                playerB.append(b+1)
                wins.append(1 if aw else 0) 

    return nplayers,playerA,playerB,wins
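
The sparsifying rule inside `load_data` can be isolated into a small function to see what `nEdge` and `nKeep` actually do. This is a sketch on in-memory `(a, b, a_won)` tuples with 0-based IDs; the function name `prune_games` and the tuple format are assumptions, and the random `pKeep` step is left out so the result is deterministic.

```python
import numpy as np

def prune_games(games, n_players, n_edge=3, n_keep=5):
    """Apply the nEdge / nKeep filter to (a, b, a_won) tuples with 0-based IDs."""
    nplays = np.zeros((n_players, n_players))
    kept = []
    for a, b, a_won in games:
        under_pair_cap = nplays[a, b] < n_keep           # pair has played < n_keep games
        a_has_room = (nplays[a, :] > 0).sum() < n_edge   # a has < n_edge distinct opponents
        b_has_room = (nplays[:, b] > 0).sum() < n_edge   # b has < n_edge distinct opponents
        if under_pair_cap and (a_has_room or b_has_room):
            nplays[a, b] += 1
            nplays[b, a] += 1
            kept.append((a, b, a_won))
    return kept

# Player 0 meets four different opponents, none of whom has other games.
games = [(0, 1, True), (0, 2, False), (0, 3, True), (0, 4, True)]
print(len(prune_games(games, n_players=5, n_edge=2)))
```

Because the condition uses `or`, a game is kept whenever either player still has room, so a player can end up with more than `nEdge` opponents. In the example all four games survive even with `n_edge=2`.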

In [48]:
nplayers,playerA,playerB,wins = load_data()

In [49]:
print('summary: ')
print('# players', nplayers)
print('# games', len(wins))
print('player A', playerA[:10])
print('player B', playerB[:10])
print('wins', wins[:10])

summary: 
# players 999
# games 4677
player A [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
player B [2, 3, 4, 5, 6, 7, 8, 9, 10, 9]
wins [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]


We also need the observed data: number of players and games, which pairs played each game, and who won:

In [50]:
skill_data = {
    'N': nplayers,
    'E': len(wins),
    'scale': 0.3,
    'win':wins,
    'PA': playerA,
    'PB': playerB
}
# win[i] == 1 means PA[i] beat PB[i]; player IDs are 1-based to match Stan's indexing.

Now, we can perform MCMC on the model, and extract the samples:

In [51]:
fit = sm.sampling(data=skill_data, iter=1000, chains=4)

In [52]:
samples = fit.extract()

If we only want the mean estimate of each player's skill level, take the empirical average over the samples:

In [53]:
samples['skill'].shape  # (4 chains x 500 post-warmup draws, 999 players)

(2000, 999)

In [54]:
samples['skill'].mean(0)

array([ 4.48795467e+00,  4.60195088e+00,  6.50548668e+00,  7.92398680e+00,
        6.41166045e+00,  1.65989466e+00,  4.75076496e+00,  1.53405384e+00,
        4.93458909e+00,  6.13651338e+00,  2.72881283e-01,  2.24016544e+00,
        2.47912344e+00, -6.05355852e-01,  4.06058978e+00,  2.70026219e+00,
       -1.29230142e+00, -1.69399285e+00, -5.79801658e-01,  4.76320883e+00,
       -4.79595955e-01,  2.49739128e+00,  3.80551489e+00,  4.78848791e+00,
        5.91823840e+00,  2.03005243e+00,  6.59481822e+00,  3.07139844e-01,
        4.81771908e+00,  2.78312656e+00,  4.58836406e+00, -1.23819814e+00,
        8.67036061e-01,  6.32593681e+00,  3.80825951e-01,  6.38655908e+00,
       -8.23134025e-01,  8.40923040e-01,  5.53547400e+00,  3.95674832e+00,
        9.26210774e-01,  4.89395028e+00,  2.53673594e+00,  4.13053448e+00,
       -1.63441915e-01,  3.97474593e+00,  4.97191505e+00,  5.10961947e+00,
        4.26992355e+00,  2.93224059e+00,  5.45718552e-02,  1.04171844e+00,
        2.25452699e+00, -

If we want to predict which player will win, we might use a direct estimator of that quantity based on the sample values:

In [55]:
# Prediction for columns 0 and 1 (players 1 and 2 in Stan's 1-based IDs):
def sigmoid(z):
    """Logistic function, the inverse of the logit."""
    return 1./(1.+np.exp(-z))

# Use our model's win probability function (logistic of scaled difference)
#  using the predicted skill difference for each sample:
prob = sigmoid( skill_data['scale']*(samples['skill'][:,0]-samples['skill'][:,1]) ).mean()

print(prob)

0.49186551789567473
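
The estimate above averages the win probability over posterior samples, which is not the same as plugging the mean skills into the win function. The sketch below contrasts the two on synthetic draws; the normal draws are a stand-in for two columns of `samples['skill']` and are not taken from the fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)
scale = 0.3

# Synthetic stand-ins for samples['skill'][:, i] and samples['skill'][:, j].
skill_i = rng.normal(2.0, 3.0, size=2000)
skill_j = rng.normal(0.0, 3.0, size=2000)

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Average the probability over draws (what the cell above does).
posterior_mean_prob = sigmoid(scale * (skill_i - skill_j)).mean()

# Plug the mean skills into the win function instead.
plug_in_prob = sigmoid(scale * (skill_i.mean() - skill_j.mean()))

print(posterior_mean_prob, plug_in_prob)
```

With this much spread in the draws, averaging over samples gives a probability closer to 0.5 than the plug-in value, because it accounts for the uncertainty in both skills.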


Remember to save the posterior skill samples!

In [56]:
with open('skill_hat.pkl', 'wb') as f: 
    pickle.dump(samples['skill'], f)

## Sample Model Evaluation

In [57]:
with open('skill_hat.pkl', 'rb') as f:
    skill_hat = pickle.load(f)

In [59]:
def load_valid_data(dir='data/', pKeep=1.0, nEdge=3, nKeep=5, opt='valid'):
    with open(dir+opt+'.csv', encoding='utf-8') as f:
        lines = f.read().split('\n')

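    # IDs follow the order of first appearance in this file, so they line up with
    # the training IDs only if both files introduce players in the same order.
    p = 0
    playerid = {}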
    for i in range(len(lines)):
        csv = lines[i].split(',')
        if len(csv) != 10: 
            continue   # parse error or blank line
        player0,player1 = csv[1],csv[4]
        if player0 not in playerid:
            playerid[player0]=p
            p+=1
        if player1 not in playerid:
            playerid[player1]=p
            p+=1

    nplayers = len(playerid)
    playername = ['']*nplayers
    for player in playerid:
        playername[ playerid[player] ]=player  # id to name lookup


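    # NOTE: the pruning below (pKeep, nEdge, nKeep) still applies to validation
    # games; pass large nEdge and nKeep to use every datapoint.
    
    games = []   # never filled below, so it is returned empty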
    nplays, nwins = np.zeros( (nplayers,nplayers) ), np.zeros( (nplayers,nplayers) )
    for i in range(len(lines)):
        csv = lines[i].split(',')
        if len(csv) != 10:
            continue   # parse error or blank line
            
        a,b = playerid[csv[1]],playerid[csv[4]]
        aw,bw = csv[2]=='[winner]',csv[5]=='[winner]' 
        
        if (np.random.rand() < pKeep):
            if (nplays[a,b] < nKeep) and ( ((nplays[a,:]>0).sum() < nEdge) or ((nplays[:,b]>0).sum() < nEdge) ):
            
                nplays[a,b] += 1
                nplays[b,a]+=1
                nwins[a,b] += aw
                nwins[b,a] += bw
    
    return nplayers, nplays, nwins, games
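
Both loaders assign IDs by order of first appearance in their own file, so a column of `skill_hat` and a row of `nplays_val` refer to the same player only if the two files introduce players in the same order. Whether that holds for this dataset is not checked here. A minimal sketch of one shared mapping follows; the helper name `build_player_ids` and the example names are hypothetical.

```python
def build_player_ids(*name_lists):
    """Assign one 0-based ID per player name, in order of first appearance
    across all the given name lists (e.g. train first, then validation)."""
    player_id = {}
    for names in name_lists:
        for name in names:
            if name not in player_id:
                player_id[name] = len(player_id)
    return player_id

train_names = ["ann", "bob", "cat"]   # hypothetical player names
valid_names = ["cat", "ann", "dan"]   # "dan" never appears in training
ids = build_player_ids(train_names, valid_names)
print(ids)
```

Passing the training names first keeps the training IDs unchanged, and any player that only appears in validation gets an ID beyond the range of the fitted skill vector, which makes such players easy to detect and handle separately.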


In [60]:
nplayers_val, nplays_val, nwins_val, games_val = load_valid_data()

In [61]:
print('summary: ', nplayers_val)
print(nplays_val.shape, nplays_val.sum())
print(nwins_val.shape, nwins_val.sum())
print('games', len(games_val))

summary:  999
(999, 999) 9536.0
(999, 999) 4785.0
games 0


In [62]:
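def sigmoid(z):
    """Logistic function, the inverse of the logit."""
    return 1./(1.+np.exp(-z))

def prediction_loss(skill, nplayers, nplays, nwins, games):
    """Mean absolute error and winner misclassification rate over player pairs.

    `skill` holds posterior samples (draws x players); `games` is unused.
    Relies on the global skill_data['scale'].
    """
    loss = 0.
    binary_loss = 0.
    npairs = 0
    for i in range(nplayers):
        for j in range(i+1, nplayers):
            if nplays[i, j] == 0:
                continue
            npairs += 1
            prob = nwins[i,j] / nplays[i,j]   # observed win fraction of i over j
            prob_hat = sigmoid( skill_data['scale']*(skill[:,i]-skill[:,j]) ).mean()
            loss += np.abs(prob_hat - prob)
            binary_loss += np.logical_xor(prob_hat >= 0.5, prob >= 0.5)
    
    loss /= npairs
    binary_loss /= npairs
    
    return loss, binary_loss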


In [63]:
loss, binary_loss = prediction_loss(skill_hat, nplayers_val, nplays_val, nwins_val, games_val)

In [64]:
loss, binary_loss

(0.42691740432366615, 0.3907351460221551)
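
The two numbers above weight every pair equally, however many games the pair played. As one possible complement, the sketch below computes game-weighted accuracy and log loss from the same `nplays` / `nwins` matrices. The function name `per_game_metrics` and the `prob_hat` matrix of predicted win probabilities are assumptions for illustration.

```python
import numpy as np

def per_game_metrics(prob_hat, nplays, nwins, eps=1e-12):
    """Game-weighted accuracy and log loss.

    prob_hat[i, j]: predicted P(i beats j) (hypothetical input).
    nplays, nwins:  symmetric play counts and directed win counts.
    """
    i, j = np.nonzero(np.triu(nplays, k=1))    # visit each pair once
    p = np.clip(prob_hat[i, j], eps, 1 - eps)  # avoid log(0)
    w, l = nwins[i, j], nwins[j, i]            # wins of i and of j
    total = (w + l).sum()
    correct = np.where(p >= 0.5, w, l).sum()   # games won by the predicted favourite
    log_loss = -(w * np.log(p) + l * np.log(1 - p)).sum() / total
    return correct / total, log_loss

# Two players, three games: player 0 won two of them.
nplays = np.array([[0., 3.], [3., 0.]])
nwins = np.array([[0., 2.], [1., 0.]])
prob_hat = np.array([[0.5, 0.8], [0.2, 0.5]])
accuracy, log_loss = per_game_metrics(prob_hat, nplays, nwins)
print(accuracy, log_loss)
```

Log loss also penalises overconfident predictions, which the binary loss cannot distinguish from cautious ones.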

In [65]:
np.argsort(skill_hat.mean(axis=0))

array([977, 954, 996, 982, 988, 388, 923, 912, 980, 995, 922, 994, 249,
       789, 958, 965, 942, 951, 959, 992, 566, 851, 933, 974, 973, 814,
       407, 518, 544, 986, 611, 908, 517, 489, 966, 862, 850, 941, 707,
       978, 701, 944, 864, 841, 938, 970, 793, 171,  65, 152, 907, 852,
       899, 112, 963, 389, 163, 553, 768, 807, 585, 673, 987, 881, 630,
       365, 893, 967, 363, 913, 755, 708, 868, 964, 706, 766, 110, 575,
       961, 854, 929, 776, 758, 643, 644, 917, 470, 834, 928, 950, 477,
       837, 687, 879, 552, 587, 505, 946, 955, 930, 129, 745, 651, 395,
       523, 971, 263, 921, 546, 686, 925, 972, 549, 763, 787, 836, 926,
       867, 329, 873, 991, 960, 654, 952, 534, 512, 386, 843, 819, 423,
       981, 919, 538, 883, 535,  17, 760, 715, 361, 195,  77, 653, 555,
       934, 592, 584, 387, 835, 937, 725, 740, 940, 859, 713, 623, 476,
       650, 187, 522, 285, 471, 190, 390, 797, 728, 353, 914, 113, 718,
       894, 430, 916, 260, 666, 831, 800, 375, 610, 632, 634, 91