# Use decision optimization to help a sports league schedule its games

This tutorial includes everything you need to set up decision optimization engines, build mathematical programming models, and arrive at a good working schedule for a sports league's games.


When you finish this tutorial, you'll have a foundational knowledge of _Prescriptive Analytics_.

>This notebook is part of the [Prescriptive Analytics for Python](http://ibmdecisionoptimization.github.io/docplex-doc/)

>Running the sample requires the installation of
    [CPLEX Optimization studio](https://www.ibm.com/products/ilog-cplex-optimization-studio)
    (Commercial or free 
    [CPLEX Community edition](https://www.ibm.com/account/reg/us-en/signup?formid=urx-20028>`)).
    This sample automatically installs *CPLEX CE* if needed.


Table of contents:

-  [The business problem](#The-business-problem:--Games-Scheduling-in-the-National-Football-League)
*  [How decision optimization (prescriptive analytics) can help](#How--decision-optimization-can-help)
*  [Use decision optimization](#Use-decision-optimization)
    *  [Step 1: Import the library](#Step-1:-Import-the-library)
    -  [Step 2: Model the Data](#Step-2:-Model-the-data)
    *  [Step 3: Prepare the data](#Step-3:-Prepare-the-data)
    -  [Step 4: Set up the prescriptive model](#Step-4:-Set-up-the-prescriptive-model)
        * [Define the decision variables](#Define-the-decision-variables)
        * [Express the business constraints](#Express-the-business-constraints)
        * [Express the objective](#Express-the-objective)
        * [Solve with Decision Optimization](#Solve-with-Decision-Optimization)
    *  [Step 5: Investigate the solution and run an example analysis](#Step-5:-Investigate-the-solution-and-then-run-an-example-analysis)
*  [Summary](#Summary)


## The business problem:  Games Scheduling in the National Football League 


* A sports league with two divisions must schedule games so that each team plays every team within its division a given number of times,  and each team plays teams in the other division a given number of times.
* A team plays exactly one game each week. 
* A pair of teams cannot play each other on consecutive weeks.
* While a third of a team's intradivisional games must be played in the first half of the season, the preference is for intradivisional games to be held as late as possible in the season.
    * To model this preference, there is an incentive for intradivisional games that increases each week as a square of the week. 
    * An opponent must be assigned to each team each week to maximize the total of the incentives..
 

 
This is a type of discrete optimization problem that can be solved by using either **Integer Programming** (IP) or **Constraint Programming** (CP). 

>  **Integer Programming** is the class of problems defined as the optimization of a linear function, subject to linear constraints over integer variables. 

>  **Constraint Programming** problems generally have discrete decision variables, but the constraints can be logical, and the arithmetic expressions are not restricted to being linear. 

For the purposes of this tutorial, we will illustrate a solution with mathematical programming (MIP).  


## How  decision optimization can help

* Prescriptive analytics (decision optimization) technology recommends actions that are based on desired outcomes.  It takes into account specific scenarios, resources, and knowledge of past and current events. With this insight, your organization can make better decisions and have greater control of business outcomes.  

* Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.  

* Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.  
<br/>

<u>With prescriptive analytics, you can:</u> 

* Automate the complex decisions and trade-offs to better manage your limited resources.
* Take advantage of a future opportunity or mitigate a future risk.
* Proactively update recommendations based on changing events.
* Meet operational goals, increase customer loyalty, prevent threats and fraud, and optimize business processes.



## Use decision optimization

### Step 1: Import the library

Run the following code to import the Decision Optimization CPLEX Modeling library.  The *DOcplex* library contains the two modeling packages, Mathematical Programming (docplex.mp) and Constraint Programming (docplex.cp).

In [38]:
import sys
try:
    import docplex.mp
except:
    raise Exception('Please install docplex. See https://pypi.org/project/docplex/')

If *CPLEX* is not installed, install CPLEX Community edition.

In [39]:
try:
    import cplex
except:
    raise Exception('Please install CPLEX. See https://pypi.org/project/cplex/')

### Step 2: Model the data
In this scenario, the data is simple. There are eight teams in each division, and the teams must play each team in the division once and each team outside the division once.

Use a Python module, *Collections*, which implements some data structures that will help solve some problems.  *Named tuples* helps to define meaning of each position in a tuple.  This helps the code be more readable and self-documenting. You can use named tuples in any place where you use tuples. 

In this example, you create a *namedtuple* to contain information for points.  You are also defining some of the parameters.

In [40]:
# Teams in 1st division
team_div1 = ["Baltimore Ravens","Cincinnati Bengals", "Cleveland Browns","Pittsburgh Steelers","Houston Texans",
                "Indianapolis Colts","Jacksonville Jaguars","Tennessee Titans","Buffalo Bills","Miami Dolphins",
                "New England Patriots","New York Jets","Denver Broncos","Kansas City Chiefs","Oakland Raiders",
                "San Diego Chargers"]

# Teams in 2nd division
team_div2 = ["Chicago Bears","Detroit Lions","Green Bay Packers","Minnesota Vikings","Atlanta Falcons",
                "Carolina Panthers","New Orleans Saints","Tampa Bay Buccaneers","Dallas Cowboys","New York Giants",
                "Philadelphia Eagles","Washington Redskins","Arizona Cardinals","San Francisco 49ers",
                "Seattle Seahawks","St. Louis Rams"]

In [67]:
#number_of_matches_to_play = 1  # Number of match to play between two teams on the league
# Schedule parameters
nb_teams_in_division = 5
max_teams_in_division = 10
number_of_matches_inside_division = 1
number_of_matches_outside_division = 1

Use basic HTML and a stylesheet to format the data.

In [42]:
CSS = """
body {
    margin: 0;
    font-family: Helvetica;
}
table.dataframe {
    border-collapse: collapse;
    border: none;
}
table.dataframe tr {
    border: none;
}
table.dataframe td, table.dataframe th {
    margin: 0;
    border: 1px solid white;
    padding-left: 0.25em;
    padding-right: 0.25em;
}
table.dataframe th:not(:empty) {
    background-color: #fec;
    text-align: left;
    font-weight: normal;
}
table.dataframe tr:nth-child(2) th:empty {
    border-left: none;
    border-right: 1px dashed #888;
}
table.dataframe td {
    border: 2px solid #ccf;
    background-color: #f4f4ff;
}
    table.dataframe thead th:first-child {
        display: none;
    }
    table.dataframe tbody th {
        display: none;
    }
"""

from IPython.core.display import HTML
HTML('<style>{}</style>'.format(CSS))

Now you will import the *pandas* library. Pandas is an open source Python library for data analysis. It uses two data structures, *Series* and *DataFrame*, which are built on top of *NumPy*.

A **Series** is a one-dimensional object similar to an array, list, or column in a table. It will assign a labeled index to each item in the series. By default, each item receives an index label from 0 to N, where N is the length of the series minus one.

A **DataFrame** is a tabular data structure comprised of rows and columns, similar to a spreadsheet, database table, or R's data.frame object. Think of a DataFrame as a group of Series objects that share an index (the column names).

In the example, each division (the AFC and the NFC) is part of a DataFrame.

In [43]:
import pandas as pd

team1 = pd.DataFrame(team_div1)
team2 = pd.DataFrame(team_div2)
team1.columns = ["AFC"]
team2.columns = ["NFC"]

teams = pd.concat([team1,team2], axis=1)

The following *display* function is a tool to show different representations of objects. When you issue the  *display(teams)* command, you are sending the output to the notebook so that the result is stored in the document.

In [44]:
from IPython.display import display

display(teams)

Unnamed: 0,AFC,NFC
0,Baltimore Ravens,Chicago Bears
1,Cincinnati Bengals,Detroit Lions
2,Cleveland Browns,Green Bay Packers
3,Pittsburgh Steelers,Minnesota Vikings
4,Houston Texans,Atlanta Falcons
5,Indianapolis Colts,Carolina Panthers
6,Jacksonville Jaguars,New Orleans Saints
7,Tennessee Titans,Tampa Bay Buccaneers
8,Buffalo Bills,Dallas Cowboys
9,Miami Dolphins,New York Giants


### Step 3: Prepare the data

Given the number of teams in each division and the number of intradivisional and interdivisional games to be played, you can calculate the total number of teams and the number of weeks in the schedule, assuming every team plays exactly one game per week. 


The season is split into halves, and the number of the intradivisional games that each team must play in the first half of the season is calculated.

In [68]:
import numpy as np
    
nb_teams = 2 * nb_teams_in_division
teams = range(nb_teams)

# Calculate the number of weeks necessary
nb_inside_div = (nb_teams_in_division - 1) * number_of_matches_inside_division
nb_outside_div = nb_teams_in_division * number_of_matches_outside_division
nb_weeks = nb_inside_div + nb_outside_div


# Weeks to schedule
weeks = range(nb_weeks)

# Season is split into two halves
first_half_weeks = range(int(np.floor(nb_weeks / 2)))
nb_first_half_games = int(np.floor(nb_weeks / 3))


In [69]:
nb_inside_div

4

In [70]:
from collections import namedtuple

match = namedtuple("match",["team1","team2","is_divisional"])

matches = {match(t1,t2, 1 if ( t2 <= nb_teams_in_division or t1 > nb_teams_in_division) else 0)  
           for t1 in teams for t2 in teams if t1 < t2}

Number of games to play between pairs depends on whether the pairing is intradivisional or not.

In [71]:
nb_play = { m :  number_of_matches_inside_division if m.is_divisional==1 
                                                   else number_of_matches_outside_division
                   for m in matches}

In [72]:
matches

{match(team1=0, team2=1, is_divisional=1),
 match(team1=0, team2=2, is_divisional=1),
 match(team1=0, team2=3, is_divisional=1),
 match(team1=0, team2=4, is_divisional=1),
 match(team1=0, team2=5, is_divisional=1),
 match(team1=0, team2=6, is_divisional=0),
 match(team1=0, team2=7, is_divisional=0),
 match(team1=0, team2=8, is_divisional=0),
 match(team1=0, team2=9, is_divisional=0),
 match(team1=1, team2=2, is_divisional=1),
 match(team1=1, team2=3, is_divisional=1),
 match(team1=1, team2=4, is_divisional=1),
 match(team1=1, team2=5, is_divisional=1),
 match(team1=1, team2=6, is_divisional=0),
 match(team1=1, team2=7, is_divisional=0),
 match(team1=1, team2=8, is_divisional=0),
 match(team1=1, team2=9, is_divisional=0),
 match(team1=2, team2=3, is_divisional=1),
 match(team1=2, team2=4, is_divisional=1),
 match(team1=2, team2=5, is_divisional=1),
 match(team1=2, team2=6, is_divisional=0),
 match(team1=2, team2=7, is_divisional=0),
 match(team1=2, team2=8, is_divisional=0),
 match(team

### Step 4: Set up the prescriptive model

In [50]:
from docplex.mp.environment import Environment
env = Environment()
env.print_information()

* system is: Windows 64bit
* Python is present, version is 3.6.8
* docplex is present, version is (2, 4, 61)
* CPLEX wrapper is present, version is 12.8.0.0, located at: C:\Miniconda3\lib\site-packages


#### Create the DOcplex model
The model contains all the business constraints and defines the objective.

In [51]:
from docplex.mp.model import Model

mdl = Model("sports")

#### Define the decision variables

In [52]:
plays = mdl.binary_var_matrix(matches, weeks, lambda ij: "x_%s_%d" %(str(ij[0]), ij[1]))

#### Express the business constraints

##### Each pair of teams must play the correct number of games.

In [53]:
mdl.add_constraints( mdl.sum(plays[m,w]  for w in weeks) == nb_play[m]
                   for m in matches)
mdl.print_information()

Model: sports
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 496
   - linear=496
 - parameters: defaults


##### Each team must play exactly once in a week.	 

In [54]:
mdl.add_constraints( mdl.sum(plays[m,w] for m in matches if (m.team1 == t or m.team2 == t) )  == 1
                   for w in weeks for t in teams)
mdl.print_information()

Model: sports
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 1488
   - linear=1488
 - parameters: defaults


##### Games between the same teams cannot be on successive weeks.

In [55]:
mdl.add_constraints( plays[m,w] + plays[m,w+1] <= 1 
                   for w in weeks
                   for m in matches
                   if w < nb_weeks-1)
mdl.print_information()

Model: sports
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16368
   - linear=16368
 - parameters: defaults


##### Some intradivisional games should be in the first half.    

In [56]:
mdl.add_constraints( mdl.sum(plays[m,w]  for w in first_half_weeks for  m in matches 
                            if (((m.team1 == t or m.team2 == t) and m.is_divisional == 1 )))
                    >= nb_first_half_games
                   for t in teams)
mdl.print_information()

Model: sports
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16400
   - linear=16400
 - parameters: defaults


#### Express the objective
The objective function for this example is designed to force intradivisional games to occur as late in the season as possible. The incentive for intradivisional games increases by week. There is no incentive for interdivisional games. 

In [57]:
gain = { w : w*w for w in weeks}

# If an intradivisional pair plays in week w, Gain[w] is added to the objective.
mdl.maximize( mdl.sum (m.is_divisional * gain[w] * plays[m,w] for m in matches for w in weeks) )

### Solve with Decision Optimization 

You will get the best solution found after n seconds, due to a time limit parameter.


In [58]:
mdl.print_information()

assert mdl.solve(), "!!! Solve of the model fails"
mdl.report()

Model: sports
 - number of variables: 15376
   - binary=15376, integer=0, continuous=0
 - number of constraints: 16400
   - linear=16400
 - parameters: defaults
* model sports solved with objective = 71665


### Step 5: Investigate the solution and then run an example analysis

Determine which of the scheduled games will be a replay of one of the last 10 Super Bowls.<br>
We start by creating a pandas DataFrame that contains the year and teams who played the last 10 Super Bowls.

In [59]:
try: # Python 2
    team_league = dict({t : team_div1[t] for t in range(nb_teams_in_division) }.items() +  \
                {t+nb_teams_in_division : team_div2[t] for t in range(nb_teams_in_division) }.items()
            )
except: # Python 3
    team_league = dict(list({t : team_div1[t] for t in range(nb_teams_in_division) }.items()) +  \
                list({t+nb_teams_in_division : team_div2[t] for t in range(nb_teams_in_division) }.items()))

In [60]:
sol = namedtuple("solution",["week","is_divisional", "team1", "team2"])

solution = [sol(w, m.is_divisional, team_league[m.team1], team_league[m.team2]) for m in matches for w in weeks if plays[m,w].solution_value == 1]

In [61]:
solution

[solution(week=28, is_divisional=1, team1='Tampa Bay Buccaneers', team2='Philadelphia Eagles'),
 solution(week=21, is_divisional=0, team1='Tennessee Titans', team2='New Orleans Saints'),
 solution(week=18, is_divisional=0, team1='Denver Broncos', team2='Seattle Seahawks'),
 solution(week=24, is_divisional=0, team1='Cincinnati Bengals', team2='Seattle Seahawks'),
 solution(week=30, is_divisional=1, team1='Dallas Cowboys', team2='Arizona Cardinals'),
 solution(week=14, is_divisional=1, team1='Baltimore Ravens', team2='Cleveland Browns'),
 solution(week=0, is_divisional=0, team1='New York Jets', team2='Tampa Bay Buccaneers'),
 solution(week=24, is_divisional=0, team1='Chicago Bears', team2='San Francisco 49ers'),
 solution(week=15, is_divisional=0, team1='Houston Texans', team2='Seattle Seahawks'),
 solution(week=15, is_divisional=0, team1='Miami Dolphins', team2='Green Bay Packers'),
 solution(week=20, is_divisional=0, team1='Oakland Raiders', team2='New Orleans Saints'),
 solution(week=

In [62]:
nfl_finals = [("2016", "Carolina Panthers", "Denver Broncos"),
              ("2015", "New England Patriots", "Seattle Seahawks"),
              ("2014", "Seattle Seahawks", "Denver Broncos"),
              ("2013", "Baltimore Ravens", "San Francisco 49ers"),
              ("2012", "New York Giants", "New England Patriots "),
              ("2011", "Green Bay Packers", "Pittsburgh Steelers"),
              ("2010", "New Orleans Saints", "Indianapolis Colts"),
              ("2009", "Pittsburgh Steelers", "Arizona Cardinals"),
              ("2008", "New York Giants", "New England Patriots"),
              ("2007", "Indianapolis Colts", "Chicago Bears")
             ]
nfl_meetings = {(t[1], t[2]) for t in nfl_finals}
winners_bd = pd.DataFrame(nfl_finals)
winners_bd.columns = ["year", "team1", "team2"]

In [63]:
display(winners_bd)

Unnamed: 0,year,team1,team2
0,2016,Carolina Panthers,Denver Broncos
1,2015,New England Patriots,Seattle Seahawks
2,2014,Seattle Seahawks,Denver Broncos
3,2013,Baltimore Ravens,San Francisco 49ers
4,2012,New York Giants,New England Patriots
5,2011,Green Bay Packers,Pittsburgh Steelers
6,2010,New Orleans Saints,Indianapolis Colts
7,2009,Pittsburgh Steelers,Arizona Cardinals
8,2008,New York Giants,New England Patriots
9,2007,Indianapolis Colts,Chicago Bears


We now look for the games in our solution that are replays of one of the past 10 Super Bowls.

In [64]:
months = ["January", "February", "March", "April", "May", "June", 
          "July", "August", "September", "October", "November", "December"]
report = []
for m in solution:
    if (m.team1, m.team2) in nfl_meetings:
        report.append((m.week, months[m.week//4], m.team1, m.team2))
    if (m.team2, m.team1) in nfl_meetings: 
        report.append((m.week, months[m.week//4], m.team2, m.team1))

print(report)
matches_bd = pd.DataFrame(report)
matches_bd.columns = ["week", "Month", "Team1", "Team2"]

[(18, 'May', 'Seattle Seahawks', 'Denver Broncos'), (18, 'May', 'Baltimore Ravens', 'San Francisco 49ers'), (15, 'April', 'New York Giants', 'New England Patriots'), (25, 'July', 'New Orleans Saints', 'Indianapolis Colts'), (23, 'June', 'Pittsburgh Steelers', 'Arizona Cardinals'), (3, 'January', 'Green Bay Packers', 'Pittsburgh Steelers'), (23, 'June', 'New England Patriots', 'Seattle Seahawks'), (4, 'February', 'Carolina Panthers', 'Denver Broncos'), (7, 'February', 'Indianapolis Colts', 'Chicago Bears')]


In [65]:
try: #pandas >= 0.17
    display(matches_bd.sort_values(by='week'))
except:
    display(matches_bd.sort('week'))

Unnamed: 0,week,Month,Team1,Team2
5,3,January,Green Bay Packers,Pittsburgh Steelers
7,4,February,Carolina Panthers,Denver Broncos
8,7,February,Indianapolis Colts,Chicago Bears
2,15,April,New York Giants,New England Patriots
0,18,May,Seattle Seahawks,Denver Broncos
1,18,May,Baltimore Ravens,San Francisco 49ers
4,23,June,Pittsburgh Steelers,Arizona Cardinals
6,23,June,New England Patriots,Seattle Seahawks
3,25,July,New Orleans Saints,Indianapolis Colts


## Summary


You learned how to set up and use IBM Decision Optimization CPLEX Modeling for Python to formulate a Constraint Programming model and solve it with CPLEX.

#### References
* [Decision Optimization CPLEX Modeling for Python documentation](http://ibmdecisionoptimization.github.io/docplex-doc/)
* [Decision Optimization on Cloud](https://developer.ibm.com/docloud/)
* Need help with DOcplex or to report a bug? Please go [here](https://developer.ibm.com/answers/smartspace/docloud).
* Contact us at dofeedback@wwpdl.vnet.ibm.com.


Copyright © 2017 IBM. IPLA licensed Sample Materials.