## Predicting Final Round Team Compositions in Rainbow 6: Siege

#### Description of the dataset
The data is an official data dump from Ubisoft, the developer of the game. It was released after the fifth competitive season of gameplay. (In keeping with the game's setting as a counter-terrorism first-person shooter, seasons carry operation names; this one was "Operation Velvet Shell".) The project uses the full data dump, which gives a detailed round-by-round breakdown of matches. Each round is recorded from the perspective of each player involved and includes the operator used, that operator's exact loadout for the round, general measures of the player's skill, and match performance statistics such as the number of kills and whether the player was killed in the round.

First, a subset of approximately 200,000 entries will be drawn from the data, sampled so that round and match groupings are preserved. The per-player entries will then be aggregated and reshaped into a single entry per round, detailing the choices and statistics of all players involved. These per-round entries will be further aggregated into per-match entries covering all players in every round of that match. This per-match form is the final data used for training and prediction.
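One way to sample while keeping round/match groupings intact is to pick whole matches rather than individual rows. The sketch below is an illustration only, not necessarily how the downsampled file was produced: the helper name `sample_whole_matches` and the toy data are hypothetical, and only the `matchid` and `roundnumber` column names come from the dataset.

```python
import pandas as pd

def sample_whole_matches(df, target_rows, seed=0):
    """Keep randomly chosen matches, whole, up to roughly target_rows rows."""
    # Rows per match, shuffled into a random order.
    match_sizes = df["matchid"].value_counts().sample(frac=1.0, random_state=seed)
    # Keep matches while the running row total stays within the target.
    within_budget = (match_sizes.cumsum() <= target_rows).to_numpy()
    keep = match_sizes.index[within_budget]
    return df[df["matchid"].isin(keep)]

# Toy stand-in for the full dump: three matches of different sizes.
toy = pd.DataFrame({
    "matchid":     [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "roundnumber": [1, 1, 2, 1, 1, 1, 2, 2, 3],
})
subset = sample_whole_matches(toy, target_rows=6)
```

Because selection happens at the match level, every match that appears in `subset` appears with all of its rows, at the cost of only approximately hitting the requested row count.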

#### Description of the project
Using the aggregated subset described above, a Generative Adversarial Network (GAN) will be trained to predict the team compositions of the Blue and Orange teams in the final round of a match. A player's choice of operator depends on many interacting factors, such as personal preference, skill with specific operators, the choices of teammates, and counterpicks to opponents' favourites. A GAN is a promising candidate for this role because it learns the joint distribution of these choices from the data rather than requiring each factor to be modelled explicitly.

#### Description of GANs
TODO, also cite one of the neat articles on the subject to impress Dr. Miller more when the network inevitably fails to work properly.
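
As a starting point for this section, the standard GAN objective from the original formulation (Goodfellow et al., 2014) is a two-player minimax game between a generator $G$ and a discriminator $D$. Mapping it onto this project is an assumption at this stage: $x$ would be a real final-round team composition drawn from the data, and $G(z)$ a generated one.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

Here $D(x)$ is the discriminator's estimated probability that $x$ is real, and $z$ is noise drawn from a prior $p_z$.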

In [11]:
import pandas as pd
import dask.dataframe as dd
import numpy as np

import matplotlib.pyplot as plt

In [12]:
data = pd.read_csv("downsampled_datadump.csv")

In [13]:
# Check size of downsampled dataset. ~217,000 is adequately reduced.
data.shape

(217269, 31)

In [26]:
data.iloc[:5,:18]

| Unnamed: 0 | dateid | platform | gamemode | mapname | matchid | roundnumber | objectivelocation | winrole | endroundreason | roundduration | clearancelevel | skillrank | role | team | haswon | operator | nbkills | isdead |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20170212 | PC | PvP – HOSTAGE | CLUB HOUSE | 1522380841 | 1 | STRIP CLUB | Defender | AttackersKilledHostage | 124 | 64 | Gold | Defender | 1 | 1 | SWAT-CASTLE | 0 | 0 |
| 1 | 20170212 | PC | PvP – HOSTAGE | CLUB HOUSE | 1522380841 | 4 | CHURCH | Defender | AttackersEliminated | 217 | 81 | Gold | Defender | 0 | 1 | GSG9-JAGER | 0 | 1 |
| 2 | 20170212 | PC | PvP – HOSTAGE | CLUB HOUSE | 1522380841 | 3 | CHURCH | Defender | AttackersEliminated | 160 | 150 | Gold | Defender | 1 | 1 | JTF2-FROST | 0 | 0 |
| 3 | 20170212 | PC | PvP – HOSTAGE | CLUB HOUSE | 1522380841 | 4 | CHURCH | Defender | AttackersEliminated | 217 | 94 | Gold | Defender | 0 | 1 | BOPE-CAVEIRA | 3 | 0 |
| 4 | 20170212 | PC | PvP – HOSTAGE | CLUB HOUSE | 1522380841 | 6 | BEDROOM | Attacker | DefendersEliminated | 143 | 81 | Gold | Defender | 0 | 0 | GSG9-JAGER | 0 | 1 |
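
The `Unnamed: 0` header in the preview usually appears when a CSV was written with its index and then read back without `index_col`. Assuming that is what happened here, passing `index_col=0` to `pd.read_csv` avoids carrying the redundant column. The small in-memory CSV below is made up for illustration.

```python
import io
import pandas as pd

# A tiny CSV whose first column has an empty header, as produced by
# DataFrame.to_csv() with the default index=True.
csv_text = (
    ",matchid,operator\n"
    "0,1522380841,SWAT-CASTLE\n"
    "1,1522380841,GSG9-JAGER\n"
)

# Default read: the unnamed first column becomes a data column.
default = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the index instead.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default.columns))
print(list(indexed.columns))
```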


### Restructuring the Data

A pandas multi-level index proved awkward for this reshaping, so the data is restructured into per-match rows by constructing a new dataframe.
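
The reshaping can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the notebook's actual implementation: the helper `to_match_rows` and the `slot` column are hypothetical, only two fields are carried over, and teams have two players instead of five. Player slots are assigned by row order within each round and team, so a slot number does not necessarily track the same person across rounds.

```python
import pandas as pd

# Toy long-format data: one row per player per round.
long_df = pd.DataFrame({
    "matchid":     [10, 10, 10, 10, 10, 10, 10, 10],
    "roundnumber": [1, 1, 1, 1, 2, 2, 2, 2],
    "team":        [0, 0, 1, 1, 0, 0, 1, 1],
    "operator":    ["SWAT-CASTLE", "GSG9-JAGER", "JTF2-FROST", "BOPE-CAVEIRA",
                    "GSG9-JAGER", "SWAT-CASTLE", "BOPE-CAVEIRA", "JTF2-FROST"],
    "nbkills":     [0, 2, 1, 3, 1, 0, 0, 2],
})

# Same arbitrary mapping the notebook settles on: Blue is 0, Orange is 1.
TEAM_NAMES = {0: "blue", 1: "orange"}

def to_match_rows(df):
    """Collapse per-player rows into one wide row per match."""
    df = df.copy()
    # Number players 1..n within each (match, round, team), in row order.
    df["slot"] = df.groupby(["matchid", "roundnumber", "team"]).cumcount() + 1
    rows = []
    for matchid, match in df.groupby("matchid"):
        row = {"matchid": matchid}
        for rec in match.itertuples(index=False):
            prefix = f"round{rec.roundnumber}{TEAM_NAMES[rec.team]}{rec.slot}"
            row[prefix + "op"] = rec.operator
            row[prefix + "kills"] = rec.nbkills
        rows.append(row)
    return pd.DataFrame(rows)

wide = to_match_rows(long_df)
```

Building plain dictionaries and converting once at the end keeps the column naming explicit, which is the main thing a multi-level pivot would otherwise obscure.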

In [30]:
# The intermediate step of aggregating by round prior to aggregating into matches is no longer being used,
# but setting up the data dictionary for it was very helpful in figuring out how to structure the new
# dataframe of aggregated data.

# per_round = {"gamemode": [], "mapname": [], "matchid": [], "roundnumber": [], "objectivelocation": [],
#              "winrole": [], "endroundreason": [], "roundduration": [], "bluerole": [], "orangerole": [],
#              "teamwin": [], "blue1skill": [], "blue1level": [], "blue1kills": [], "blue1dead": [],
#              "blue1op": [], "blue1primary": [], "blue1secondary": [], "blue2skill": [],
#              "blue2level": [], "blue2kills": [], "blue2dead": [], "blue2op": [], "blue2primary": [],
#              "blue2secondary": [], "blue3skill": [], "blue3level": [], "blue3kills": [],
#              "blue3dead": [], "blue3op": [], "blue3primary": [], "blue3secondary": [],
#              "blue4skill": [], "blue4level": [], "blue4kills": [], "blue4dead": [], "blue4op": [],
#              "blue4primary": [], "blue4secondary": [], "blue5skill": [], "blue5level": [],
#              "blue5kills": [], "blue5dead": [], "blue5op": [], "blue5primary": [], "blue5secondary": [],
#              "orange1skill": [], "orange1level": [], "orange1kills": [], "orange1dead": [],
#              "orange1op": [], "orange1primary": [], "orange1secondary": [], "orange2skill": [],
#              "orange2level": [], "orange2kills": [], "orange2dead": [], "orange2op": [],
#              "orange2primary": [], "orange2secondary": [], "orange3skill": [], "orange3level": [],
#              "orange3kills": [], "orange3dead": [], "orange3op": [], "orange3primary": [],
#              "orange3secondary": [], "orange4skill": [], "orange4level": [], "orange4kills": [],
#              "orange4dead": [], "orange4op": [], "orange4primary": [], "orange4secondary": [],
#              "orange5skill": [], "orange5level": [], "orange5kills": [], "orange5dead": [],
#              "orange5op": [], "orange5primary": [], "orange5secondary": [],
#             }

# Arbitrary decision: Blue is 0, Orange is 1.

In [32]:
# Match-level columns, shared by every round.
per_match = {"gamemode": [], "mapname": [], "matchid": []}
# NOTE: only rounds 1-5 get columns, although the preview above includes a row with roundnumber 6.
for r_num in range(1, 6):
    # Round-level details.
    for detail in ["objective", "endreason", "duration", "bluerole", "orangerole", "winner", "winrole"]:
        per_match[f"round{r_num}{detail}"] = []
    # Per-player details: five players on each of the two teams.
    for team in ["blue", "orange"]:
        for player in range(1, 6):
            for field in ["level", "skill", "kills", "dead", "op", "primary", "secondary"]:
                per_match[f"round{r_num}{team}{player}{field}"] = []
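
The nested loops above generate a fixed schema whose size can be checked by hand: 3 match-level columns plus, for each of 5 rounds, 7 round-level columns and 2 teams × 5 players × 7 fields, giving 3 + 5 × (7 + 70) = 388 columns. The sketch below rebuilds the same names as a flat list and verifies that count. The constant names are introduced here for illustration only.

```python
import pandas as pd

ROUND_FIELDS = ["objective", "endreason", "duration", "bluerole",
                "orangerole", "winner", "winrole"]
PLAYER_FIELDS = ["level", "skill", "kills", "dead", "op", "primary", "secondary"]

columns = ["gamemode", "mapname", "matchid"]
for r_num in range(1, 6):
    columns += [f"round{r_num}{detail}" for detail in ROUND_FIELDS]
    columns += [f"round{r_num}{team}{player}{field}"
                for team in ("blue", "orange")
                for player in range(1, 6)
                for field in PLAYER_FIELDS]

expected = 3 + 5 * (len(ROUND_FIELDS) + 2 * 5 * len(PLAYER_FIELDS))
assert len(columns) == expected == 388
assert len(set(columns)) == len(columns)  # no accidental name collisions

# An empty frame with this schema, ready to be filled one match at a time.
empty_frame = pd.DataFrame(columns=columns)
```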

In [33]:
per_match

{'gamemode': [],
 'mapname': [],
 'matchid': [],
 'round1blue1dead': [],
 'round1blue1kills': [],
 'round1blue1level': [],
 'round1blue1op': [],
 'round1blue1primary': [],
 'round1blue1secondary': [],
 'round1blue1skill': [],
 'round1blue2dead': [],
 'round1blue2kills': [],
 'round1blue2level': [],
 'round1blue2op': [],
 'round1blue2primary': [],
 'round1blue2secondary': [],
 'round1blue2skill': [],
 'round1blue3dead': [],
 'round1blue3kills': [],
 'round1blue3level': [],
 'round1blue3op': [],
 'round1blue3primary': [],
 'round1blue3secondary': [],
 'round1blue3skill': [],
 'round1blue4dead': [],
 'round1blue4kills': [],
 'round1blue4level': [],
 'round1blue4op': [],
 'round1blue4primary': [],
 'round1blue4secondary': [],
 'round1blue4skill': [],
 'round1blue5dead': [],
 'round1blue5kills': [],
 'round1blue5level': [],
 'round1blue5op': [],
 'round1blue5primary': [],
 'round1blue5secondary': [],
 'round1blue5skill': [],
 'round1bluerole': [],
 'round1duration': [],
 'round1endreason': 