In [1]:
import DataFunctions
import numpy as np

Overall idea of simulation: Do pairings taking into consideration current score (win-lose rate) and their geographical location. It will use an adaptation of the Swiss Pairing System, the main difference is that instead of participants having a ELO score or rating and being paired by it; we consider an objective function on the total distance travelled and find proper pairings that minimize it. 

Tentative Objective function: $\sum (d_i)^2 + max(d_i)^2$. This function uses L2 norm and a L2 regularization term. We have to consider a regularization term as it helps us to avoid having multiple low distances and 1 large distance. For simplicity we can remove the regularization to trial the algorithm, and then regularization.

Rules per each round:
1. Split teams by their current score (#wins)
2. Pair in groups from the highest scoring group to lowest scoring group.
3. Per each group, find the group of pairs that minimize the objective function.
4. If there are odd # of teams, demote unpaired team into next scoring group. If it has been demoted before, use the next possible pairing that includes that team.
5. If there exist unpaired teams, demote them into the next scoring group. 
6. After making all possible pairings, set distance of teams that are playing together to inf.
7. For each pairing, if both teams have the same # of home games, pick one to be home and one to be away randomly. If one team has more # home games than the other, pick the team with the least # home games to be home, the other to be the away.

Things to store in db:
1. Each team 2022 stats.
2. Each team # (from 0 to 131).
3. Simulated season pairings, w-l record and h-a count.
4. Simulated games and scores.

Simulating each pairing:
1. Use 2022 stats to predict a spread and totalpts. We use those as means.
2. Using a normal distribution on the spread and totalpts with the standard deviation found on the model. We randomly generate a result with the simulated spread and totalpts.
3. We use obtained values to create an score.

In [2]:
#creating random distance matrix of size = 8
nteams = 8

m_dist = np.around(np.random.uniform(1,10,size=(nteams,nteams)),decimals=3)

for i in range(nteams):
    for j in range(nteams):
        m_dist[j,i]=m_dist[i,j]
    m_dist[i,i]=0

m_dist

array([[0.   , 1.078, 6.913, 5.406, 2.767, 4.544, 1.206, 6.488],
       [1.078, 0.   , 2.796, 2.132, 1.478, 9.621, 7.083, 5.133],
       [6.913, 2.796, 0.   , 7.162, 9.279, 6.092, 7.117, 1.085],
       [5.406, 2.132, 7.162, 0.   , 7.261, 8.916, 3.25 , 9.203],
       [2.767, 1.478, 9.279, 7.261, 0.   , 1.818, 1.688, 2.682],
       [4.544, 9.621, 6.092, 8.916, 1.818, 0.   , 3.154, 4.136],
       [1.206, 7.083, 7.117, 3.25 , 1.688, 3.154, 0.   , 6.024],
       [6.488, 5.133, 1.085, 9.203, 2.682, 4.136, 6.024, 0.   ]])

In [3]:
#storing wins per team.
curr_data = np.zeros(shape=(nteams,3),dtype=int)
curr_data[:,0] = np.arange(nteams)
curr_data

array([[0, 0, 0],
       [1, 0, 0],
       [2, 0, 0],
       [3, 0, 0],
       [4, 0, 0],
       [5, 0, 0],
       [6, 0, 0],
       [7, 0, 0]])

In [4]:
#splitting in groups (need function here)
sim1group0 = np.array([x[0].astype(int) for x in curr_data if x[1]==0])
# sim1group1 = [[x[0].astype(int),0] for x in curr_data if x[1]==1]
sim1group0

array([0, 1, 2, 3, 4, 5, 6, 7])

We cannot find all pairs and run for all of them. That operation has time complexity O(n!) and for 130ish teams is not feasible as it is about $10^{220}$. We need to use another approach.<br>
Fastest approach is done by sorting all possible distances, then picking the suitable one and remove unsuitables from the list. We only need to sort once since at the beginning all teams have to play. We should also count the # of possible matches and use that to do the sorting as well.<br><br>

Algorithm procedure using networkx and Blossom algorithm (also known as Edmonds' algorithm)

In [5]:
import networkx as nx

In [6]:
L = []
G = nx.Graph()

for i in sim1group0:
    for j in sim1group0:
        if i<j: 
            L.append((i,j,m_dist[i,j]))

In [7]:
G.add_weighted_edges_from(L)

In [8]:
G.edges.data("weight")

EdgeDataView([(0, 1, 1.078), (0, 2, 6.913), (0, 3, 5.406), (0, 4, 2.767), (0, 5, 4.544), (0, 6, 1.206), (0, 7, 6.488), (1, 2, 2.796), (1, 3, 2.132), (1, 4, 1.478), (1, 5, 9.621), (1, 6, 7.083), (1, 7, 5.133), (2, 3, 7.162), (2, 4, 9.279), (2, 5, 6.092), (2, 6, 7.117), (2, 7, 1.085), (3, 4, 7.261), (3, 5, 8.916), (3, 6, 3.25), (3, 7, 9.203), (4, 5, 1.818), (4, 6, 1.688), (4, 7, 2.682), (5, 6, 3.154), (5, 7, 4.136), (6, 7, 6.024)])

In [9]:
matchings1g0 = nx.algorithms.matching.min_weight_matching(G)

In [10]:
list(matchings1g0)

[(4, 5), (1, 3), (2, 7), (0, 6)]

In [11]:
#checking 
m_dist[3,7],m_dist[1,2],m_dist[0,4],m_dist[5,6]

(9.203, 2.796, 2.767, 3.154)

In [12]:
LS = sorted(np.array(L)[:,2])
print(LS)

[1.078, 1.085, 1.206, 1.478, 1.688, 1.818, 2.132, 2.682, 2.767, 2.796, 3.154, 3.25, 4.136, 4.544, 5.133, 5.406, 6.024, 6.092, 6.488, 6.913, 7.083, 7.117, 7.162, 7.261, 8.916, 9.203, 9.279, 9.621]


In [13]:
set(sim1group0)-set(np.array(list(matchings1g0)).flatten()) 
# to find which one goes to the next group in odd cases 
# or those unpairable teams (team has played every other team)

set()

In [14]:
#merging all pairings
pairingssim1 = []
pairingssim1 +=matchings1g0
pairingssim1

#simulating matches
for g in pairingssim1:
    #home and away status
    if curr_data[g[0],2] == curr_data[g[1],2]:
        curr_data[g[np.random.randint(2)],2]+=1
    elif curr_data[g[0],2] > curr_data[g[1],2]:
        curr_data[g[1],2]+=1
    else:
        curr_data[g[0],2]+=1
    
    #simulating match
    if(np.random.randint(2)):
        curr_data[g[0],1] += 1
        print(g[0],"won")
        print(g[1],"lose")
    else:
        curr_data[g[1],1] += 1
        print(g[1],"won")
        print(g[0],"lose")
    m_dist[g[0],g[1]]=0
    m_dist[g[1],g[0]]=0

4 won
5 lose
1 won
3 lose
7 won
2 lose
0 won
6 lose


In [15]:
curr_data

array([[0, 1, 1],
       [1, 1, 0],
       [2, 0, 0],
       [3, 0, 1],
       [4, 1, 0],
       [5, 0, 1],
       [6, 0, 0],
       [7, 1, 1]])

In [16]:
#week 2
sim2 = [np.array([x[0].astype(int) for x in curr_data if x[1]==1]),
        np.array([x[0].astype(int) for x in curr_data if x[1]==0])]
sim2

[array([0, 1, 4, 7]), array([2, 3, 5, 6])]

In [17]:
Gs2g1 = G.subgraph(sim2[1])
Gs2g0 = G.subgraph(sim2[0])

In [18]:
matchings2g1 = nx.algorithms.matching.min_weight_matching(Gs2g1)
matchings2g0 = nx.algorithms.matching.min_weight_matching(Gs2g0)

In [19]:
list(matchings2g1),list(matchings2g0)

([(6, 3), (5, 2)], [(1, 0), (4, 7)])

In [20]:
#merging all pairings
pairingssim2 = []
pairingssim2 += matchings2g1
pairingssim2 += matchings2g0

#simulating matches
for g in pairingssim2:
    #home and away status
    if curr_data[g[0],2] == curr_data[g[1],2]:
        curr_data[g[np.random.randint(2)],2]+=1
    elif curr_data[g[0],2] > curr_data[g[1],2]:
        curr_data[g[1],2]+=1
    else:
        curr_data[g[0],2]+=1
    
    #simulating match
    if(np.random.randint(2)):
        curr_data[g[0],1] += 1
        print(g[0],"won")
        print(g[1],"lose")
    else:
        curr_data[g[1],1] += 1
        print(g[1],"won")
        print(g[0],"lose")
    m_dist[g[0],g[1]]=0
    m_dist[g[1],g[0]]=0

3 won
6 lose
2 won
5 lose
1 won
0 lose
7 won
4 lose


In [21]:
curr_data

array([[0, 1, 1],
       [1, 2, 1],
       [2, 1, 1],
       [3, 1, 1],
       [4, 1, 1],
       [5, 0, 1],
       [6, 0, 1],
       [7, 2, 1]])

In [23]:
#week 3
sim3 = [np.array([x[0].astype(int) for x in curr_data if x[1]==2]),
        np.array([x[0].astype(int) for x in curr_data if x[1]==1]),
        np.array([x[0].astype(int) for x in curr_data if x[1]==0])]
sim3

[array([1, 7]), array([0, 2, 3, 4]), array([5, 6])]

In [24]:
Gs3g2 = G.subgraph(sim3[2])
Gs3g1 = G.subgraph(sim3[1])
Gs3g0 = G.subgraph(sim3[0])

In [25]:
matchings3g2 = nx.algorithms.matching.min_weight_matching(Gs3g2)
matchings3g1 = nx.algorithms.matching.min_weight_matching(Gs3g1)
matchings3g0 = nx.algorithms.matching.min_weight_matching(Gs3g0)

In [26]:
#merging all pairings
pairingssim3 = []
pairingssim3 += matchings3g2
pairingssim3 += matchings3g1
pairingssim3 += matchings3g0

#simulating matches
for g in pairingssim2:
    #home and away status
    if curr_data[g[0],2] == curr_data[g[1],2]:
        curr_data[g[np.random.randint(2)],2]+=1
    elif curr_data[g[0],2] > curr_data[g[1],2]:
        curr_data[g[1],2]+=1
    else:
        curr_data[g[0],2]+=1
    
    #simulating match
    if(np.random.randint(2)):
        curr_data[g[0],1] += 1
        print(g[0],"won")
        print(g[1],"lose")
    else:
        curr_data[g[1],1] += 1
        print(g[1],"won")
        print(g[0],"lose")
    m_dist[g[0],g[1]]=0
    m_dist[g[1],g[0]]=0

6 won
3 lose
2 won
5 lose
1 won
0 lose
4 won
7 lose


In [27]:
curr_data

array([[0, 1, 2],
       [1, 3, 1],
       [2, 2, 1],
       [3, 1, 2],
       [4, 2, 2],
       [5, 0, 2],
       [6, 1, 1],
       [7, 2, 1]])

In [None]:
# now here we need the use of functions to be able to add TEAM 1 to group 2 and TEAM 5 to group 1 as they cannot be paired by themselves.
# for the first group, if there are issues then demote them, for the last group they have to ascend as the groups in the middle are larger.

In [22]:
# pending: make everything to functions.
# pending: Think on how to store games and their outcomes
# either database or simulated games on db and team record or game id.