<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Dependencies" data-toc-modified-id="Dependencies-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Dependencies</a></span></li><li><span><a href="#Reading-the-csv-file" data-toc-modified-id="Reading-the-csv-file-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Reading the csv file</a></span></li><li><span><a href="#Calculating-the-features" data-toc-modified-id="Calculating-the-features-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Calculating the features</a></span><ul class="toc-item"><li><span><a href="#Features" data-toc-modified-id="Features-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Features</a></span></li><li><span><a href="#Graph-structure" data-toc-modified-id="Graph-structure-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Graph structure</a></span></li><li><span><a href="#Construct-features-DataFrame" data-toc-modified-id="Construct-features-DataFrame-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Construct features DataFrame</a></span></li></ul></li><li><span><a href="#Split-Training-and-test-data" data-toc-modified-id="Split-Training-and-test-data-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Split Training and test data</a></span></li><li><span><a href="#Train" data-toc-modified-id="Train-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Train</a></span><ul class="toc-item"><li><span><a href="#Normal-equation-to-get-theta" data-toc-modified-id="Normal-equation-to-get-theta-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Normal equation to get theta</a></span></li></ul></li><li><span><a href="#Implementing-theta-to-calculate-predictions" data-toc-modified-id="Implementing-theta-to-calculate-predictions-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Implementing theta to calculate predictions</a></span></li></ul></div>

# Machine learning prediction algorithm for CS:GO matches

## Dependencies

In [49]:
import pandas as pd
import numpy as np
from datetime import date
from collections import deque


## Reading the csv file
The file `results14_10_2018_csgo.csv` contains the results of all professional CS:GO matches played from 30-06-2017 to 14-10-2018.

In [3]:
resultsdf = pd.read_csv('../Datafiles/csv/results14_10_2018_csgo.csv', index_col=0, encoding='latin1')
resultsdf.sort_values(by='date', ascending=True, inplace=True)
resultsdf.reset_index(drop=True, inplace=True)

## Calculating the features
The features are stored in a pandas DataFrame. Results are loaded into a graph one at a time, in chronological order. The features for a match are calculated before its result is added to the graph, so they depend only on earlier matches.
### Features
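 - x<sub>1</sub>: signed score difference of the last shared (head-to-head) match, positive if team1 won
 - x<sub>2</sub>: sum of the signed score differences of the last 3 shared matches
 - x<sub>3</sub>: Relative momentum = (wins in team1's last ten results) - (wins in team2's last ten results)
 - x<sub>4</sub>: Less Simple Algorithm score, an indirect comparison of the two teams through opponents that both have played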
 - x<sub>5</sub>: x<sub>1</sub>x<sub>2</sub>
 - x<sub>6</sub>: x<sub>1</sub>x<sub>3</sub>
 - x<sub>7</sub>: x<sub>1</sub>x<sub>4</sub>
 - x<sub>8</sub>: x<sub>2</sub>x<sub>3</sub>
 - x<sub>9</sub>: x<sub>2</sub>x<sub>4</sub>
 - x<sub>10</sub>: x<sub>3</sub>x<sub>4</sub>
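
As a worked illustration of the list above, the six interaction terms x<sub>5</sub> to x<sub>10</sub> are the pairwise products of the four base features. The sketch below builds them with `itertools.combinations`. The helper name `interaction_features` is hypothetical and not part of the notebook; the input values are those of row 0 in the feature table displayed further down.

```python
from itertools import combinations

def interaction_features(base):
    """Return the pairwise products of the base features, in the order
    x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4 (hypothetical helper)."""
    return [a * b for a, b in combinations(base, 2)]

base = [3, -5, -1, 0]          # x1, x2, x3, x4
extra = interaction_features(base)
print(extra)                   # [-15, -3, 0, 5, 0, 0]
```

The result matches columns x5 to x10 of that row.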

### Graph structure

In [94]:
class Graph:
    def __init__(self):
        #dictionary of {teamname: Vertex object}
        self.vertices = dict()
        
    # team = string
    def addTeam(self, team):
        if team not in self.vertices:
            self.vertices[team] = Vertex(team)
        return self.vertices[team]
    
    #team = string
    def getTeam(self, team):
        if team in self.vertices:
            return self.vertices[team]
        
        

class Vertex:
    def __init__(self, name):
        self.name = name
        #dictionary of {teamname: Edge object}
        self.edges = dict()
        self.lastten = []
        
    def getEdges(self):
        return self.edges
    
    #team = string
    def hasPlayed(self, team):
        return team in self.edges
    
    #result = Result object
    def addToLastTen(self, result):
        if len(self.lastten) == 10:
            self.lastten.pop(0)
        self.lastten.append(result)       
    
    # Will only be called on the winner of a result
    def addResult(self, result):
        opponent = result.getLoser() # Vertex object
        if opponent.toString() in self.edges:
            self.edges[opponent.toString()].addResult(result)
        else:
            newEdge = Edge()
            newEdge.addResult(result)
            self.edges[opponent.toString()] = newEdge
            opponent.addEdge(self.toString(), newEdge)
        self.addToLastTen(result)
        opponent.addToLastTen(result)
    
    #team = string, edge = Edge object
    def addEdge(self,team, edge):
        self.edges[team] = edge
    
    #opponent = string
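    def getSimpleAlgorithmScore(self, opponent):
        # Average signed score difference against opponent, scaled by 100.
        # Returns 0 when the two teams have never played each other
        # (previously this case raised a KeyError).
        if not self.hasPlayed(opponent):
            return 0
        score = 0
        results = self.edges[opponent].getResults()
        for result in results:
            if result.isWinner(self.toString()):
                score = score + result.getDif()
            else:
                score = score - result.getDif()
        return int((score/len(results))*100)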
    
    
    #opponent = Vertex object      // Feature
    def getRelativeMomentum(self, opponent):
        return self.getMomentum() - opponent.getMomentum()
    
    #opponent = string      // Feature & Feature
    def getSharedResults(self, opponent):
        lastOne = None
        lastThree = None
        if opponent in self.edges:
            lastOne = self.edges[opponent].getLastResultDif(self.toString())
            lastThree = self.edges[opponent].getLastThreeResultsDif(self.toString())
        return (lastOne, lastThree)
        
    def getMomentum(self):
        momentum = 0
        for result in self.lastten:
            if result.isWinner(self.toString()):
                momentum = momentum + 1
        return momentum
    
    def toString(self):
        return self.name
        
        
class Edge:
    #holds all results between one pair of teams; shared by both vertices
    def __init__(self):
        self.results = []
        
    def addResult(self, result):
        self.results.append(result)
        
    def getResults(self):
        return self.results
    
    def getLastResultDif(self, team1):
        if self.results[-1].isWinner(team1):
            return self.results[-1].getDif()
        else:
            return -self.results[-1].getDif()
    
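    def getLastThreeResultsDif(self, team1):
        # Returns None when the teams share fewer than 3 results.
        # NOTE: the running sum starts at 1 rather than 0, so the returned
        # value is the signed sum of the last three differences plus 1.
        returnable = 1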
        if (len(self.results) > 2):
            lastThree = self.results[-3:]
            for result in lastThree:
                if result.isWinner(team1):
                    returnable = returnable + result.getDif()
                else:
                    returnable = returnable - result.getDif()
            return returnable
        else:
            return None
        
        
class Result:
    #dateResult = date object, winner & loser = Vertex object, dif = positive int, playedMap = string
    def __init__(self, winner, loser, dif, dateResult, playedMap):
        self.winner = winner
        self.loser = loser
        self.dif = dif
        self.dateResult = dateResult
        self.playedMap = playedMap
    
    def getDif(self):
        return self.dif
    
    def getDate(self):
        return self.dateResult
    
    #return Vertex object
    def getLoser(self):
        return self.loser
    
    
    #team = string
    def isWinner(self, team):
        return team == self.winner.toString()
    
    def __ge__(self, other):
        # >= to match the operator this method implements
        return self.dateResult >= other.getDate()

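
The `lastten` list in `Vertex` acts as a fixed-size sliding window over a team's most recent results. As a hedged alternative sketch (not the notebook's implementation), the same window can be kept with `collections.deque(maxlen=10)`, which discards the oldest entry automatically. The class `MomentumTracker` and its boolean win/loss entries are assumptions made for this illustration.

```python
from collections import deque

class MomentumTracker:
    """Keeps the outcomes of a team's most recent matches (hypothetical helper)."""

    def __init__(self, window=10):
        # A deque with maxlen drops the oldest outcome once the window is full
        self.outcomes = deque(maxlen=window)

    def add(self, won):
        self.outcomes.append(bool(won))

    def momentum(self):
        # Number of wins inside the window
        return sum(self.outcomes)

a, b = MomentumTracker(), MomentumTracker()
for won in [True] * 12:          # 12 wins, only the last 10 are kept
    a.add(won)
for won in [True, False, True]:  # 2 wins out of 3 matches
    b.add(won)

relative_momentum = a.momentum() - b.momentum()
print(relative_momentum)         # 8
```

This mirrors feature x<sub>3</sub>: the momentum of one team minus the momentum of its opponent.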
### Construct features DataFrame

In [95]:
#team1 = Vertex object, team2 = Vertex object       // Feature
def getLessSimpleAlgorithmScore(team1, team2, graph):
    score = 0
    divider = 1
    for key in team1.getEdges():
        sharedOpponent = graph.getTeam(key)
        if sharedOpponent.hasPlayed(team2.toString()):
            score = score + team1.getSimpleAlgorithmScore(sharedOpponent.toString())
            score = score + sharedOpponent.getSimpleAlgorithmScore(team2.toString())
            divider = divider + 1
    return int((score/divider))

def getDate(string):
    year = int(string[:4])
    month = int(string[5:7])
    day = int(string[-2:])
    dateObject = date(year, month, day)
    return dateObject

In [98]:
# Running time about 20s
graph = Graph()
columns = ['matchcode', 'x0', 'x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9', 'x10', 'y']
#Collect the rows in a list first; DataFrame.append was removed in pandas 2.0
featureRows = []
for index, row in resultsdf.iterrows():
    
    #Get Vertex objects
    team1 = graph.addTeam(row['team1'])
    team2 = graph.addTeam(row['team2'])
    
    #Get Result object
    #y = True if team1 is winner, False if team2 is winner
    y = row['score1'] > row['score2']
    winner = team1 if y else team2
    loser = team2 if y else team1
    dif = abs(row['score1'] - row['score2'])
    dateResult = getDate(row['date'])
    result = Result(winner=winner, loser=loser, dateResult=dateResult, dif=dif, playedMap=row['map'])
    
    #Get Features (only for pairs with at least 3 shared matches)
    (x1, x2) = team1.getSharedResults(team2.toString())
    if(x1 is not None and x2 is not None):
        x3 = team1.getRelativeMomentum(team2)
        x4 = getLessSimpleAlgorithmScore(team1, team2, graph)
        x5 = x1*x2
        x6 = x1*x3
        x7 = x1*x4
        x8 = x2*x3
        x9 = x2*x4
        x10 = x3*x4
        featureRows.append({'matchcode': row['matchcode'], 'x0':1, 'x1':x1,'x2':x2, 'x3':x3,'x4':x4, 'x5':x5,
                            'x6':x6, 'x7':x7, 'x8':x8, 'x9':x9, 'x10':x10, 'y':int(y)*100})
    #Add result to edge
    winner.addResult(result)

featureFrame = pd.DataFrame(featureRows, columns=columns)

In [99]:
display(featureFrame)

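| | matchcode | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2312194 | 1 | 3 | -5 | -1 | 0 | -15 | -3 | 0 | 5 | 0 | 0 | 0 |
| 1 | 2312030 | 1 | -2 | -20 | -2 | -16666 | 40 | 4 | 33332 | 40 | 333320 | 33332 | 100 |
| 2 | 2312030 | 1 | 2 | -8 | -2 | -16666 | -16 | -4 | -33332 | 16 | 133328 | 33332 | 100 |
| 3 | 2312275 | 1 | 14 | 36 | 5 | 0 | 504 | 70 | 0 | 180 | 0 | 0 | 100 |
| 4 | 2312249 | 1 | -4 | -26 | -5 | -57766 | 104 | 20 | 231064 | 130 | 1501916 | 288830 | 0 |
| 5 | 2312405 | 1 | 4 | 17 | 10 | 0 | 68 | 40 | 0 | 170 | 0 | 0 | 0 |
| 6 | 2312160 | 1 | -10 | -11 | -4 | 63333 | 110 | 40 | -633330 | 44 | -696663 | -253332 | 100 |
| 7 | 2312451 | 1 | -6 | -25 | -1 | 0 | 150 | 6 | 0 | 25 | 0 | 0 | 0 |
| 8 | 2312512 | 1 | 4 | -10 | -5 | 0 | -40 | -20 | 0 | 50 | 0 | 0 | 0 |
| 9 | 2312492 | 1 | -12 | -10 | -2 | 0 | 120 | 24 | 0 | 20 | 0 | 0 | 100 |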


## Split Training and test data
Convert the feature DataFrame into training and test matrices and their result vectors.

In [106]:
#Decide length of train & test matrices and result vectors
trainingSplit = 85
n = len(featureFrame['x0'])
nTraining = int((n/100)*trainingSplit)
nTest = n - nTraining

#Split featureFrame in training and test frames
trainingFrame = featureFrame.head(nTraining)
testFrame = featureFrame.tail(nTest)

#Get result vectors
testY = testFrame['y'].values.tolist()
trainY = trainingFrame['y'].values.tolist()

#Drop columns 'matchcode' and 'y' so that the resulting dataframe
#can be converted to a matrix with a simple command: values.tolist()
#drop() returns a new frame here, which avoids the SettingWithCopyWarning
#that an in-place drop on a slice of featureFrame raises
testFrame = testFrame.drop(['matchcode', 'y'], axis=1)
trainingFrame = trainingFrame.drop(['matchcode', 'y'], axis=1)

#Get training and test matrices
testX = testFrame.values.tolist()
trainX = trainingFrame.values.tolist()

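
Because `resultsdf` is sorted by date before the features are built, taking the first 85% of the rows for training and the remainder for testing keeps every test match later than every training match. Below is a minimal sketch of that split on a toy frame. The function name `chronological_split` and the toy data are assumptions for illustration only.

```python
import pandas as pd

def chronological_split(frame, train_percent=85):
    """Split an already time-ordered frame into (train, test) copies
    (hypothetical helper, not part of the notebook)."""
    n_train = int(len(frame) * train_percent / 100)
    # .copy() yields independent frames, so later edits do not touch `frame`
    return frame.iloc[:n_train].copy(), frame.iloc[n_train:].copy()

# Toy data: 20 time-ordered rows with a bias column, one feature and a label
toy = pd.DataFrame({"x0": [1] * 20, "x1": range(20), "y": [0, 100] * 10})
train, test = chronological_split(toy)

train_X = train.drop(columns=["y"]).values.tolist()
train_y = train["y"].tolist()
print(len(train), len(test))   # 17 3
```

Working on explicit copies is one way to sidestep warnings about modifying a slice of the original frame.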

## Train

### Normal equation to get theta

In [107]:
def normalEquation(X, y):
    #theta = (X^T X)^-1 X^T y
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xT = X.T
    xTx_inv = np.linalg.inv(xT.dot(X))
    theta = xTx_inv.dot(xT).dot(y)
    return theta

In [108]:
theta = normalEquation(trainX, trainY)
print(theta)

[  5.04750724e+01   4.56107058e-01   1.15378138e+00   1.03075957e+01
  -6.27122558e-03   1.48626648e-01  -1.73527224e+00  -6.36985047e-04
  -4.24093132e-01  -5.97138718e-04   6.90101745e-03]
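
The cell above evaluates the normal equation, theta = (X^T X)^-1 X^T y, with an explicit matrix inverse. When features are products of other features, as x5 to x10 are here, X^T X can be poorly conditioned, and a least-squares solver is a common alternative. The sketch below compares the two on synthetic data invented for this example. `np.linalg.lstsq` is an existing NumPy function; the data and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic design matrix: bias column x0 = 1 plus two random features
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_theta = np.array([50.0, 2.0, -3.0])
y = X @ true_theta            # noise-free targets, so theta is recoverable

# Normal equation with an explicit inverse, as in the notebook
theta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# Least-squares solver: same minimiser without forming the inverse
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta_inv, true_theta), np.allclose(theta_lstsq, true_theta))
```

On well-conditioned data both routes should agree; the solver is mainly of interest when the inverse becomes numerically unstable.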


## Implementing theta to calculate predictions

In [109]:
#Print the predicted value next to the actual result (0 or 100)
for i in range(nTest):
    print(theta.dot(testX[i]), testY[i])

-4.46647025276 100
1.40564148632 0
-182.498783199 0
667.457354996 100
645.743207793 100
188.316322028 0
88.4353435142 0
46.6284269411 100
104.072638862 100
604.428880528 0
81.8219892733 0
13.9640282636 0
74.1865027479 0
47.1772736998 0
14.1532749081 100
-1300.89816093 100
-33.9912691421 100
-15.3379853905 100
-592.268486433 100
408.77458663 100
123.152379685 0
38.730030328 100
121.185510368 0
12.1154151484 0
-18.2313766798 0
-554.605953477 0
-665.757847899 0
-56.6649376381 0
476.29462116 0
433.720142917 100
572.617193659 0
-113.661661946 0
-199.335787457 0
249.903821902 100
46.3390722212 100
80.2451655485 100
1088.68759317 0
1182.14467478 100
1204.9287294 100
595.632008632 100
198.380504262 100
289.350531089 100
555.365406504 100
2627.24220286 0
-1594.55689159 100
-1211.77770246 100
325.005794819 0
-210.167323867 100
-1011.73149707 100
-1044.98248834 0
-304.402093093 0
-160.078581058 0
-352.777828589 100
1237.44751653 0
-1053.19170424 0
-328.245139596 0
213.435260783 100
-497.405064148
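
The raw outputs above are unbounded regression values, while the labels are either 0 or 100. One way to read them as predictions is to assume a cut-off of 50, halfway between the two labels (the notebook itself does not state a threshold), and to predict a team1 win whenever the output exceeds it. The sketch below counts how often that agrees with the label. The function `accuracy_at_threshold` is a hypothetical helper, and the sample pairs are the first five rows printed above.

```python
def accuracy_at_threshold(predictions, labels, threshold=50):
    """Fraction of matches where (prediction > threshold) agrees with the label.
    The threshold of 50 is an assumption, not taken from the notebook."""
    hits = sum((p > threshold) == (y > threshold)
               for p, y in zip(predictions, labels))
    return hits / len(labels)

# First five (prediction, label) pairs from the output above
preds = [-4.46647025276, 1.40564148632, -182.498783199, 667.457354996, 645.743207793]
labels = [100, 0, 0, 100, 100]
print(accuracy_at_threshold(preds, labels))   # 0.8
```

Five rows are far too few to judge the model; the same function can be applied to the full `testX` predictions and `testY`.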