Now that we have a dataset, we can start to do some actual analysis. I'm going to be attempting to replicate the methodology of this paper:

Sapienza, Anna and Goyal, Palash and Ferrara, Emilio. Deep Neural Networks for Optimal Team Composition. Frontiers in Big Data, vol 2. Jun 2019. https://arxiv.org/abs/1805.03285 

While roller derby and esports games like League of Legends obviously are very different, in many ways, they can be treated similarly- each League match and individual jam of a derby bout consists of a team of 5 players with different defined roles attempting to achieve an objective while slowing the opposing team's attempt to achieve theirs.

A derby bout (game) consists of a series of many individual jams. Each team forwards a defensive line of four "blockers" and an offensive line of one "jammer". The jammer scores points by passing through the "pack" of blockers- one initial non-scoring pass through the pack is required, and then one point is earned for each of the opposing team's blockers that the jammer passes on subsequent laps. Each jam can run for a set amount of time, but the jammer that is the first to complete the non-scoring pass ("lead jammer") can choose to end the jam early. In addition, the jammer can hand off their jammer status to one special blocker on each team called a "pivot" by passing the special helmet cover that the jammer wears. This is the general gist of the sport- in many ways, it's similar to the playground game "Red Rover", but on wheels.

Naturally, when the blockers try to stop the jammer, things can get scrappy! Various penalties are given when a player shoves another in an illegal manner, when a blocker strays too far from the pack, when a player goes out of bounds, when a blocker makes an illegal formation (such as linking arms with another blocker), etc. It's general "derby wisdom" that certain penalties are more common "new-skater" penalties, while the distribution of penalties changes with skill. We can test this!


Let's pick a team. I'll use the Kalamazoo Derby Darlins, the team I've announced for for the past few years. 

In this analysis, I'm going to make some assumptions.
-First, that the fundamental unit of derby is not the bout, but the jam. Each jam is unique, and may have starting conditions determined by the preceding jam, but ultimately, for the purposes of this analysis, the only influence jam 1 may have on a jam like jam 20 is player stamina (N.B.: sometimes players can still be in the penalty box from previous jams, so this is not strictly correct! but it's probably correct enough for what we'd like to test here). This means that I will update a player's "rating" each jam rather than each bout.

-Second, that the "figure of merit" to determine the performance of a jammer is the total number of points they score in a jam, but that the "figure of merit" to determine the performance of a blocker line is the difference between their jammer's score and the opposing jammer's score. A good blocker line is able to slow the opposing jammer substantially while also letting their own through.

-Third: the rules of roller derby change often, as the sport is still relatively new. For instance- at one point, jammers scored an additional point for passing the opposing team's jammer as well as blockers.
    

In [15]:
import requests
import pandas as pd
import numpy as np
import trueskill
from bs4 import BeautifulSoup
from itertools import product
from urllib.request import urlopen
import networkx as nx
from networkx.drawing.nx_agraph import to_agraph 
import matplotlib.pyplot as plt
import pylab

import nbimporter
import Webscraper as wsc


teamID=str(3637)
teamName='Killamazoo'

In [16]:
#First, get the lineups for each jam KDD has stats available for.
AllLineups = wsc.GetAllLineups(teamID, teamName)

# Also, get expanding average of score differentials for each jam. We'll use a player's
# average score differential after a given jam as a proxy for their skill ranking as measured
# after playing that jam.

AllAvgs = wsc.ExpandingAverages(teamID, teamName)
badjams,badblockers = wsc.GetBadJamsAndBlockers(teamID, teamName,12)
print(badjams)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 39, 41, 42, 45, 47, 49, 54, 56, 59, 60, 64, 68, 72, 80, 84, 86, 87, 88, 89, 90, 91, 92, 94, 95, 96, 97, 98, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 113, 114, 115, 116, 117, 119, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 155, 156, 157, 158, 159, 160, 161, 163, 167, 168, 169, 171, 172, 173, 174, 175, 176, 178, 180, 181, 182, 183, 184, 185, 186, 187, 188, 191, 192, 193, 195, 199, 212, 214, 215, 216, 217, 218, 219, 220, 222, 223, 224, 225, 226, 228, 230, 232, 234, 236, 261, 262, 296, 302, 306, 355, 357, 359, 361, 363, 364, 365, 366, 369, 373, 374, 375, 395, 397, 398, 399, 403, 405, 407, 409, 411, 414, 415, 417, 418, 419, 420, 421, 422, 424, 425, 427, 428, 430, 432, 434, 435, 440, 442, 444, 450, 452, 456, 457, 458

Let's only look at blockers for now, since they interact most closely with each other. Matching jammers to blocker lines is a different question than composing the lines themselves, since interplay is different.

In [17]:
print(AllLineups, AllAvgs)

              Jammer     Jstats              B1              B2  \
0         Beaver Jam      Lead   Aly-Kate Co...  Wreck Keene...   
1       Buns N Roses          0  Painbow Con...  Smash Bandi...   
2        Weers Waldo          0  Painbow Con...  Aly-Kate Co...   
3      Hill-De-Beast          0  Smash Bandi...   Beverly Hells   
4         Beaver Jam          0  Smash Bandi...  Wreck Keene...   
...              ...        ...             ...             ...   
1063   Beverly Hells          0  Sparkills (...  Rosie Feroc...   
1064  Delilah Danger      Lead          Javelin  Ivanna O'Bl...   
1065  Rosie Feroc...          0       Lady Hawk   Noam Stompsky   
1066  Sparkills (...  LeadLoss   Ivanna O'Bl...  Ramona D. F...   
1067   Beverly Hells          0         Javelin  Delilah Danger   

                  B3              B4  jamscore  runscore  ScoreDiff  
0            Mustang         Javelin        10        10       10.0  
1     Maggie Walters  Ophelia Plenty         0        1

Next, let's build the short term play network described in the paper. 

In [30]:
blockerlines = AllLineups[['B1', 'B2', 'B3', 'B4']]
#print(blockerlines)

STjams=[]
for jamnum in range(len((blockerlines.index))):
    
    if (jamnum in badjams): continue
    G = nx.complete_graph(4, nx.DiGraph())
    blockers = blockerlines.iloc[jamnum].to_list()
    mapping = dict(zip(G, blockers))
    G = nx.relabel_nodes(G, mapping)
    
    for edge in G.edges():
        weight = AllAvgs.iloc[jamnum][edge[0]]-AllAvgs.iloc[jamnum-1][edge[0]]
        #print(weight)
        G[edge[0]][edge[1]]['weight'] = weight
        STjams.append(G)
        nx.write_weighted_edgelist(G, "Data/"+teamID+str(jamnum)+".edgelist")

In [25]:
#use this function to combine the graphs
def combined_graphs_edges(G, H):
    for u,v,hdata in H.edges(data=True):
        # get data from G or use empty dict if no edge in G
        gdata = G[u].get(v,{})
        # add data from g
        # sum shared items
        shared = set(gdata) & set(hdata)
        attr.update(dict((key, attr[key] + gdata[key]) for key in shared))
        # non shared items
        non_shared = set(gdata) - set(hdata)
        attr.update(dict((key, gdata[key]) for key in non_shared))
        yield u,v,attr
    return

In [31]:
STGraph = nx.null_graph()
for jam in STjams:
    tempgraph = nx.null_graph()
    tempgraph.add_edges_from(combined_graphs_edges(jam, STGraph))
    STGraph = tempgraph

print(STGraph.edges())    


[('Wreck Keene Ball', 'Ophelia Plenty'), ('Wreck Keene Ball', 'Maggie Walters'), ('Wreck Keene Ball', 'Smash Bandicute'), ('Ophelia Plenty', 'Wreck Keene Ball'), ('Ophelia Plenty', 'Maggie Walters'), ('Ophelia Plenty', 'Smash Bandicute'), ('Maggie Walters', 'Wreck Keene Ball'), ('Maggie Walters', 'Ophelia Plenty'), ('Maggie Walters', 'Smash Bandicute'), ('Smash Bandicute', 'Wreck Keene Ball'), ('Smash Bandicute', 'Ophelia Plenty'), ('Smash Bandicute', 'Maggie Walters')]
[('Wreck Keene Ball', 'Ophelia Plenty'), ('Wreck Keene Ball', 'Maggie Walters'), ('Wreck Keene Ball', 'Smash Bandicute'), ('Ophelia Plenty', 'Wreck Keene Ball'), ('Ophelia Plenty', 'Maggie Walters'), ('Ophelia Plenty', 'Smash Bandicute'), ('Maggie Walters', 'Wreck Keene Ball'), ('Maggie Walters', 'Ophelia Plenty'), ('Maggie Walters', 'Smash Bandicute'), ('Smash Bandicute', 'Wreck Keene Ball'), ('Smash Bandicute', 'Ophelia Plenty'), ('Smash Bandicute', 'Maggie Walters')]
[('Wreck Keene Ball', 'Ophelia Plenty'), ('Wreck K

[('Crashive Aggressiv...', 'Lily St. Smear'), ('Crashive Aggressiv...', 'Javelin'), ('Crashive Aggressiv...', 'Neva'), ('Lily St. Smear', 'Crashive Aggressiv...'), ('Lily St. Smear', 'Javelin'), ('Lily St. Smear', 'Neva'), ('Javelin', 'Crashive Aggressiv...'), ('Javelin', 'Lily St. Smear'), ('Javelin', 'Neva'), ('Neva', 'Crashive Aggressiv...'), ('Neva', 'Lily St. Smear'), ('Neva', 'Javelin')]
[('Crashive Aggressiv...', 'Lily St. Smear'), ('Crashive Aggressiv...', 'Javelin'), ('Crashive Aggressiv...', 'Neva'), ('Lily St. Smear', 'Crashive Aggressiv...'), ('Lily St. Smear', 'Javelin'), ('Lily St. Smear', 'Neva'), ('Javelin', 'Crashive Aggressiv...'), ('Javelin', 'Lily St. Smear'), ('Javelin', 'Neva'), ('Neva', 'Crashive Aggressiv...'), ('Neva', 'Lily St. Smear'), ('Neva', 'Javelin')]
[('Crashive Aggressiv...', 'Lily St. Smear'), ('Crashive Aggressiv...', 'Javelin'), ('Crashive Aggressiv...', 'Neva'), ('Lily St. Smear', 'Crashive Aggressiv...'), ('Lily St. Smear', 'Javelin'), ('Lily St. 

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)

