# Use decision optimization to help a sports league schedule its games

## The business problem:  Games Scheduling in the National Football League 


#### Use decision optimization

### Step 1: Import the library

Run the following code to import the Decision Optimization CPLEX Modeling library.  The *DOcplex* library contains the two modeling packages, Mathematical Programming (docplex.mp) and Constraint Programming (docplex.cp).

In [1]:
import sys
#import cplex
import docplex.mp
import pandas as pd

url = 'https://api-oaas.docloud.ibmcloud.com/job_manager/rest/v1/'
key = None

### Step 2: Model the data

In [2]:
# Teams in 1st division
team_div1 = ["Baltimore Ravens","Cincinnati Bengals", "Cleveland Browns","Pittsburgh Steelers","Houston Texans",
                "Indianapolis Colts","Jacksonville Jaguars","Tennessee Titans","Buffalo Bills","Miami Dolphins",
                "New England Patriots","New York Jets","Denver Broncos","Kansas City Chiefs","Oakland Raiders",
                "San Diego Chargers"]

# Teams in 2nd division
team_div2 = ["Chicago Bears","Detroit Lions","Green Bay Packers","Minnesota Vikings","Atlanta Falcons",
                "Carolina Panthers","New Orleans Saints","Tampa Bay Buccaneers","Dallas Cowboys","New York Giants",
                "Philadelphia Eagles","Washington Redskins","Arizona Cardinals","San Francisco 49ers",
                "Seattle Seahawks","St. Louis Rams"]

In [3]:
#number_of_matches_to_play = 2  # Number of match to play between two teams on the league


# Schedule parameters
nb_teams_in_division = 16
#max_teams_in_division = 16
number_of_matches_inside_division = 1
number_of_matches_outside_division = 1

In [4]:


team1 = pd.DataFrame(team_div1)
team2 = pd.DataFrame(team_div2)
team1.columns = ["AFC"]
team2.columns = ["NFC"]

teams = pd.concat([team1,team2], axis=1)
teams.head(16)

Unnamed: 0,AFC,NFC
0,Baltimore Ravens,Chicago Bears
1,Cincinnati Bengals,Detroit Lions
2,Cleveland Browns,Green Bay Packers
3,Pittsburgh Steelers,Minnesota Vikings
4,Houston Texans,Atlanta Falcons
5,Indianapolis Colts,Carolina Panthers
6,Jacksonville Jaguars,New Orleans Saints
7,Tennessee Titans,Tampa Bay Buccaneers
8,Buffalo Bills,Dallas Cowboys
9,Miami Dolphins,New York Giants


### Step 3: Prepare the data


In [5]:
import numpy as np
    
nb_teams = 2 * nb_teams_in_division
teams = range(nb_teams)

# Calculate the number of weeks necessary
nb_inside_div = (nb_teams_in_division - 1) * number_of_matches_inside_division
nb_outside_div = nb_teams_in_division * number_of_matches_outside_division
nb_weeks = nb_inside_div + nb_outside_div


# Weeks to schedule
weeks = range(nb_weeks)

# Season is split into two halves
first_half_weeks = range(int(np.floor(nb_weeks / 2)))
nb_first_half_games = int(np.floor(nb_weeks / 3))


In [6]:
print('Number of Weeks for the tournamnet: ',nb_weeks)
print('No. of Weeks in First Half: ',first_half_weeks[-1] +1)
print('No. of Weeks in Second Half: ',nb_weeks-first_half_weeks[-1] -1)
print('No. of Games in First Half: ',nb_first_half_games)

Number of Weeks for the tournamnet:  31
No. of Weeks in First Half:  15
No. of Weeks in Second Half:  16
No. of Games in First Half:  10


In [7]:
from collections import namedtuple

match = namedtuple("match",["team1","team2","is_divisional"])

matches = {match(t1,t2, 1 if ( t2 <= nb_teams_in_division or t1 > nb_teams_in_division) else 0)  
           for t1 in teams for t2 in teams if t1 < t2}


Number of games to play between pairs depends on whether the pairing is intradivisional or not.

In [8]:
nb_play = { m :  number_of_matches_inside_division if m.is_divisional==1 
                                                   else number_of_matches_outside_division
                   for m in matches}

### Step 4: Set up the prescriptive model

In [9]:
from docplex.mp.environment import Environment
env = Environment()
env.print_information()

* system is: Windows 64bit
* Python version 3.6.9, located at: C:\ProgramData\Anaconda3\python.exe
* docplex is present, version is (2, 10, 155)
* CPLEX library is present, version is 12.9.0.1, located at: C:\ProgramData\Anaconda3\lib\site-packages
* pandas is present, version is 0.25.1


#### Create the DOcplex model

In [10]:
from docplex.mp.model import Model

mdl = Model("sports-scheduling")

#### Defining the decision variables

In [11]:
plays = mdl.binary_var_matrix(matches, weeks, lambda ij: "x_%s_%d" %(str(ij[0]), ij[1]))

#### Constraints

##### Each pair of teams must play the correct number of games.

In [12]:
mdl.add_constraints( mdl.sum(plays[m,w]  for w in weeks) == nb_play[m]
                   for m in matches)
mdl.print_information()

Model: sports-scheduling
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 496
   - linear=496
 - parameters: defaults
 - problem type is: MILP


##### Each team must play exactly once in a week.	 

In [13]:
mdl.add_constraints( mdl.sum(plays[m,w] for m in matches if (m.team1 == t or m.team2 == t) )  == 1
                   for w in weeks for t in teams)
mdl.print_information()

Model: sports-scheduling
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 1488
   - linear=1488
 - parameters: defaults
 - problem type is: MILP


##### Games between the same teams cannot be on successive weeks.

In [14]:
mdl.add_constraints( plays[m,w] + plays[m,w+1] <= 1 
                   for w in weeks
                   for m in matches
                   if w < nb_weeks-1)
mdl.print_information()

Model: sports-scheduling
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16368
   - linear=16368
 - parameters: defaults
 - problem type is: MILP


##### Some intradivisional games should be in the first half.    

In [15]:
mdl.add_constraints( mdl.sum(plays[m,w]  for w in first_half_weeks for  m in matches 
                            if (((m.team1 == t or m.team2 == t) and m.is_divisional == 1 )))
                    >= nb_first_half_games
                   for t in teams)
mdl.print_information()

Model: sports-scheduling
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16400
   - linear=16400
 - parameters: defaults
 - problem type is: MILP


#### Express the objective
The objective function for this example is designed to force intradivisional games to occur as late in the season as possible. The incentive for intradivisional games increases by week. There is no incentive for interdivisional games. 

In [16]:
gain = { w : w*w for w in weeks}

# If an intradivisional pair plays in week w, Gain[w] is added to the objective.
mdl.maximize( mdl.sum (m.is_divisional * gain[w] * plays[m,w] for m in matches for w in weeks) )

### Solve with Decision Optimization 

You will get the best solution found after n seconds, due to a time limit parameter.


In [17]:
%%time
mdl.print_information()
assert mdl.solve(url=url,key=key)
mdl.report()

Model: sports-scheduling
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16400
   - linear=16400
 - parameters: defaults
 - problem type is: MILP
* model sports-scheduling solved with objective = 71665
Wall time: 1min 51s


### Step 5: Investigating the solution

In [18]:
team_league = dict(list({t : team_div1[t] for t in range(nb_teams_in_division) }.items()) +  \
                list({t+nb_teams_in_division : team_div2[t] for t in range(nb_teams_in_division) }.items()))

In [19]:
sol = namedtuple("solution",["week","is_divisional", "team1", "team2"])

solution = [sol(w, m.is_divisional, team_league[m.team1], team_league[m.team2]) for m in matches for w in weeks if plays[m,w].solution_value == 1]

In [20]:
list_solution = []
for sol in solution:
    list_solution.append(list(sol))
df = pd.DataFrame(data=list_solution,columns=['Week','is_divisional','team_div1','team_div2'])
df.sort_values(['Week'],inplace=True)
df['Week'] = df['Week'] +1

In [21]:
df.head(20)

Unnamed: 0,Week,is_divisional,team_div1,team_div2
388,1,0,Oakland Raiders,Minnesota Vikings
19,1,0,New England Patriots,Dallas Cowboys
175,1,0,Tennessee Titans,Seattle Seahawks
27,1,0,Indianapolis Colts,Arizona Cardinals
455,1,0,Miami Dolphins,St. Louis Rams
155,1,0,San Diego Chargers,San Francisco 49ers
40,1,0,New York Jets,New Orleans Saints
330,1,0,Chicago Bears,Carolina Panthers
60,1,0,Pittsburgh Steelers,Philadelphia Eagles
221,1,0,Houston Texans,Tampa Bay Buccaneers


In [22]:
df.tail(20)

Unnamed: 0,Week,is_divisional,team_div1,team_div2
29,30,1,Green Bay Packers,New Orleans Saints
435,30,0,Jacksonville Jaguars,Minnesota Vikings
187,30,1,New York Jets,Kansas City Chiefs
255,30,1,Houston Texans,New England Patriots
206,31,1,Green Bay Packers,Seattle Seahawks
125,31,1,Miami Dolphins,Kansas City Chiefs
245,31,1,Indianapolis Colts,Denver Broncos
108,31,1,Pittsburgh Steelers,New York Jets
458,31,1,Atlanta Falcons,Arizona Cardinals
49,31,1,Tennessee Titans,New England Patriots


In [25]:
df.shape

(496, 4)

#### References
* [Decision Optimization CPLEX Modeling for Python documentation](http://ibmdecisionoptimization.github.io/docplex-doc/)
* [Decision Optimization on Cloud](https://developer.ibm.com/docloud/)

