### ECE/CS/ISyE 524 &mdash; Introduction to Optimization &mdash; Spring 2022 ###

# Fantasy Football Team Drafting Opt #

#### Patrick Schmidt (pwschmidt@wisc.edu), Margaret Shen (mshen42@wisc.edu), Xingpeng Sun (xsun273@wisc.edu)

*****

### Table of Contents

1. [Introduction](#1.-Introduction)
1. [Mathematical Model](#2.-Mathematical-model)
1. [Data Collected](#3.-Data-Collected)
1. [Solution](#4.-Solution)
1. [Results and Discussion](#5.-Results-and-discussion)
1. [Conclusion](#5.-Conclusion)

## 1. Introduction ##

Fantasy football is a widely played game in America in which a set of league “owners” draft actual National Football League players to their fantasy teams and accrue points each week based on the on-field performance of those players. A common drafting format is the snake draft: each owner is randomly assigned a pick number, which sets the order they draft in, and this order reverses every round (so the owner who picked last in one round picks first in the next). Our group uses an optimization model to draft the highest-scoring fantasy football team possible, based on the number of points each player is projected to score at the beginning of the year. The model is designed to determine the highest-value player available in each round of a snake draft, given the players still available and the players the team has already drafted.
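
As a small illustration of the snake order described above, the following sketch lists which owner picks at each overall pick. The function name `snake_order` and its arguments are our own choices for illustration; they are not part of the model code later in this report.

```julia
# Return the owner index that picks at each overall pick of a snake draft.
# Owners are numbered 1..n_owners by their randomly assigned pick number.
function snake_order(n_owners::Int, n_rounds::Int)
    order = Int[]
    for r in 1:n_rounds
        # odd rounds run 1, 2, ..., n; even rounds run in reverse
        picks = isodd(r) ? (1:n_owners) : (n_owners:-1:1)
        append!(order, picks)
    end
    return order
end

println(snake_order(3, 2))  # prints [1, 2, 3, 3, 2, 1]
```

For a 10-team league with a 16-player roster, `snake_order(10, 16)` would give all 160 picks; the owner with pick number 10 picks 10th and 11th overall.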

![fixit flowchart][flow]

[flow]: https://www.washingtonpost.com/wp-apps/imrs.php?src=https://arc-anglerfish-washpost-prod-washpost.s3.amazonaws.com/public/LLVPM5QCO5GFTB645NOBPMHXPM.jpg&w=1440

## 2. Mathematical model ##

We are trying to decide which players to draft in order to obtain the team with the highest total value (the objective). The decision variables are binary: one variable per player, indicating whether or not we draft that player. The constraints fix the roster composition: 1 kicker, 2 quarterbacks, 5 running backs, 5 wide receivers, 2 tight ends, and 1 defense, for 16 players in total.


Binary integer programming model:

Parameters and variables:

$K$: the points for all kickers, with $K(i)$ the points of kicker $i$;

$n_k$: the total number of kickers;

$X_k$: the choice for each kicker, 0 if the player is not selected, 1 if the player is selected;

$Q$: the points for all quarterbacks;

$n_q$: the total number of quarterbacks;

$X_q$: the choice for each quarterback, 0 if the player is not selected, 1 if the player is selected;

$R$: the points for all running backs;

$n_r$: the total number of running backs;

$X_r$: the choice for each running back, 0 if the player is not selected, 1 if the player is selected;

$W$: the points for all wide receivers;

$n_w$: the total number of wide receivers;

$X_w$: the choice for each wide receiver, 0 if the player is not selected, 1 if the player is selected;

$T$: the points for all tight ends;

$n_t$: the total number of tight ends;

$X_t$: the choice for each tight end, 0 if the player is not selected, 1 if the player is selected;

$D$: the points for all defenses;

$n_d$: the total number of defenses;

$X_d$: the choice for each defense, 0 if the defense is not selected, 1 if the defense is selected;



$$
\begin{aligned}
\underset{X_k,X_q,X_r,X_w,X_t,X_d \in \{0,1\}}{\text{max}}\qquad& \sum_{i=1}^{i=n_k} K(i)*X_k(i)+ \sum_{i=1}^{i=n_q}Q(i)*X_q(i)+\sum_{i=1}^{i=n_r}R(i)*X_r(i)+\sum_{i=1}^{i=n_w}W(i)*X_w(i)+\sum_{i=1}^{i=n_t}T(i)*X_t(i)+\sum_{i=1}^{i=n_d}D(i)*X_d(i)
\\
\text{subject to:}\qquad& \sum_{i=1}^{i=n_k} X_k(i) = 1 \\
& \sum_{i=1}^{i=n_q} X_q(i) = 2 \\
& \sum_{i=1}^{i=n_r} X_r(i) = 5 \\
& \sum_{i=1}^{i=n_w} X_w(i) = 5 \\
& \sum_{i=1}^{i=n_t} X_t(i) = 2 \\
& \sum_{i=1}^{i=n_d} X_d(i) = 1 \\
\end{aligned}
$$
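
The same model can be written more compactly by indexing over positions. The notation here is introduced only for this restatement and is not used elsewhere in the report: $P = \{k, q, r, w, t, d\}$ is the set of positions, $c_p(i)$ is the points of player $i$ at position $p$ (so $c_k = K$, $c_q = Q$, and so on), $x_p(i)$ is the corresponding binary choice, and $m_p$ is the required number of players at position $p$, i.e. $(m_k, m_q, m_r, m_w, m_t, m_d) = (1, 2, 5, 5, 2, 1)$.

$$
\begin{aligned}
\underset{x}{\text{max}}\qquad& \sum_{p \in P} \sum_{i=1}^{n_p} c_p(i)\, x_p(i) \\
\text{subject to:}\qquad& \sum_{i=1}^{n_p} x_p(i) = m_p \qquad \forall p \in P \\
& x_p(i) \in \{0,1\} \qquad \forall p \in P,\ i = 1,\dots,n_p
\end{aligned}
$$

This form mirrors a loop over position keys, with one count constraint per position.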


### Variation-1

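Pick 5 substitute players so that we have a backup plan if some players are injured. The position requirements become lower bounds, and the total roster size is fixed at 16 + 5 = 21 players. This makes our model more realistic.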

$$
\begin{aligned}
\underset{X_k,X_q,X_r,X_w,X_t,X_d \in \{0,1\}}{\text{max}}\qquad& \sum_{i=1}^{i=n_k} K(i)*X_k(i)+ \sum_{i=1}^{i=n_q}Q(i)*X_q(i)+\sum_{i=1}^{i=n_r}R(i)*X_r(i)+\sum_{i=1}^{i=n_w}W(i)*X_w(i)+\sum_{i=1}^{i=n_t}T(i)*X_t(i)+\sum_{i=1}^{i=n_d}D(i)*X_d(i)
\\
\text{subject to:}\qquad& \sum_{i=1}^{i=n_k} X_k(i) \geq 1 \\
& \sum_{i=1}^{i=n_q} X_q(i) \geq 2 \\
& \sum_{i=1}^{i=n_r} X_r(i) \geq 5 \\
& \sum_{i=1}^{i=n_w} X_w(i) \geq 5 \\
& \sum_{i=1}^{i=n_t} X_t(i) \geq 2 \\
& \sum_{i=1}^{i=n_d} X_d(i) \geq 1 \\
& \sum_{i=1}^{i=n_k} X_k(i)+\sum_{i=1}^{i=n_q} X_q(i)+\sum_{i=1}^{i=n_r} X_r(i)+ \sum_{i=1}^{i=n_w} X_w(i)+ \sum_{i=1}^{i=n_t} X_t(i)+\sum_{i=1}^{i=n_d} X_d(i)=16+5 \\
\end{aligned}
$$

## 3. Data Collected ##

The data we used in our model was found on Kaggle. It contains ESPN's 2019 fantasy football statistics and 2020 fantasy football player projections. The data is divided into position groups. We used player name and 2020 point projections for our model.

To determine each player's value more accurately, we used a metric called value over replacement player (VORP). This metric measures how many points a particular player is predicted to contribute to the team over a replacement-level player at the same position. For this application, we used the player predicted to be picked last in each position group as the replacement player. Since we assume a 10-team league, we chose the 10th-ranked kicker and defense, the 20th-ranked quarterback and tight end, and the 50th-ranked running back and wide receiver as our replacement players (10 teams times the number of roster spots at each position). To get each player's VORP, we take their 2020 points projection from ESPN and subtract the points projection of the replacement-level player in their position group.
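
The VORP calculation described above can be sketched as follows. This is a minimal illustration, not the script used to prepare our CSV files: the function name `vorp` and the four quarterback projections are made up for the example.

```julia
# Compute value over replacement player (VORP) for one position group.
# `projections` holds projected points; `replacement_rank` is the rank
# (1 = best) of the player treated as replacement level.
function vorp(projections::AbstractVector{<:Real}, replacement_rank::Int)
    ranked = sort(projections; rev=true)      # best projection first
    baseline = ranked[replacement_rank]       # replacement-level projection
    return projections .- baseline            # VORP in the original player order
end

# Hypothetical projections (not ESPN data), with the 3rd-ranked player as replacement.
qb_points = [350.0, 340.0, 300.0, 280.0]
qb_vorp = vorp(qb_points, 3)
println(qb_vorp)  # prints [50.0, 40.0, 0.0, -20.0]
```

The replacement player always has a VORP of 0, and players ranked below the replacement have negative VORP.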

## 4. Solution ##

### 4.1 Basic Model Solution

In [1]:
using CSV, DataFrames, GLPK, JuMP

In [7]:
# identify all .csv files in the current directory (readdir returns names in sorted order)
csv_files = filter(f -> endswith(f, ".csv"), readdir())

# position keys; this assumes the sorted file list lines up with this order
# (defense, kicker, quarterback, running back, tight end, wide receiver)
key = ['D', 'K', 'Q', 'R', 'T', 'W']

# dictionaries keyed by position
data = Dict()  # stores point (VORP) data
names = Dict() # stores name data
for (name, file) in zip(key, csv_files)
    df = CSV.read(file, DataFrame) # read each file once
    data[name] = df[:,"2020 VORP"]
    names[name] = df[:,"PLAYER NAME"]
end

n = Dict() # total number of active players in each position 
for i in key
    n[i] = length(data[i])
end

# required number of players at each position
num_players = Dict('K' => 1, 'Q' => 2, 'R' => 5, 'W' => 5, 'T' => 2, 'D' => 1)
;

In [8]:
m = Model(GLPK.Optimizer) # current JuMP syntax; with_optimizer was removed in later JuMP releases

# define variables: 
X = Dict() 
points = Dict()
for i in key
    points[i] = @variable(m) # total points for position
    for j in 1:n[i]
        X[i,j] = @variable(m, binary=true) # binary decision variable (1 = pick player, 0 = don't pick player)
    end
end

# define constraints
@constraint(m, [i in key], sum(X[i,j] for j in 1:n[i]) == num_players[i]) # exactly the required number of players for each position
@constraint(m, [i in key], points[i] == sum(X[i,j]*data[i][j] for j = 1:n[i])) # total points for each position

# objective - Maximize total points 
@objective(m, Max, sum(points[i] for i in key))
;

In [9]:
optimize!(m)

In [10]:
objective_value(m)

1925.1299999999997

In [11]:
println("position, name, points")

for i in key
    for j in 1:n[i]
        if value(X[i,j]) > 0.5 # selected player (tolerant of solver round-off)
            println(i, ",  ", names[i][j], ",  ", data[i][j])
        end
    end
end

println("\ntotal points: " * string(objective_value(m)))

position, name, points
D,  Steelers D/ST,  29.67
K,  Harrison Butker,  14.3
Q,  Lamar Jackson,  105.41
Q,  Patrick Mahomes,  99.84
R,  Christian McCaffrey,  210.6
R,  Ezekiel Elliott,  177.41
R,  Saquon Barkley,  168.98
R,  Dalvin Cook,  168.37
R,  Alvin Kamara,  167.04
T,  Travis Kelce,  136.71
T,  George Kittle,  117.14
W,  Michael Thomas,  144.99
W,  DeAndre Hopkins,  98.85
W,  Julio Jones,  97.5
W,  Chris Godwin,  95.38
W,  Davante Adams,  92.94

total points: 1925.1299999999997


### 4.2 Variation-1 Solution 

In [12]:
m = Model(GLPK.Optimizer) # current JuMP syntax; with_optimizer was removed in later JuMP releases
substitute = 5 # number of substitute players
# define variables: 
X = Dict() 
points = Dict()
for i in key
    points[i] = @variable(m) # total points for position
    for j in 1:n[i]
        X[i,j] = @variable(m, binary=true) # binary decision variable (1 = pick player, 0 = don't pick player)
    end
end

# define constraints
@constraint(m, [i in key], sum(X[i,j] for j in 1:n[i]) >= num_players[i]) # at least the required number of players for each position
@constraint(m, sum(sum(X[i,j] for j in 1:n[i]) for i in key) == 16+substitute) # total roster size: 16 base players plus substitutes
@constraint(m, [i in key], points[i] == sum(X[i,j]*data[i][j] for j = 1:n[i])) # total points for each position

# objective - Maximize total points 
@objective(m, Max, sum(points[i] for i in key))
;

In [13]:
optimize!(m)
objective_value(m)

2537.94

In [15]:
println("------------------------------")
println("position,   name,     points")
println("------------------------------")
for i in key
    for j in 1:n[i]
        if value(X[i,j]) > 0.5 # selected player (tolerant of solver round-off)
            println(i, ",  ", names[i][j], ",  ", data[i][j])
        end
    end
end

println("\ntotal points: " * string(objective_value(m)))

------------------------------
position,   name,     points
------------------------------
D,  Steelers D/ST,  29.67
K,  Harrison Butker,  14.3
Q,  Lamar Jackson,  105.41
Q,  Patrick Mahomes,  99.84
R,  Christian McCaffrey,  210.6
R,  Ezekiel Elliott,  177.41
R,  Saquon Barkley,  168.98
R,  Dalvin Cook,  168.37
R,  Alvin Kamara,  167.04
R,  Kenyan Drake,  129.88
R,  Derrick Henry,  127.35
R,  Aaron Jones,  125.88
R,  Miles Sanders,  115.43
R,  Austin Ekeler,  114.27
T,  Travis Kelce,  136.71
T,  George Kittle,  117.14
W,  Michael Thomas,  144.99
W,  DeAndre Hopkins,  98.85
W,  Julio Jones,  97.5
W,  Chris Godwin,  95.38
W,  Davante Adams,  92.94

total points: 2537.94
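
As a quick arithmetic check on the two outputs above, all five substitute slots in Variation-1 went to running backs, and their VORP values account for the whole difference between the two objective values. The sketch below uses only numbers copied from the printed results; the variable names are ours.

```julia
# VORP of the five extra running backs selected in Variation-1
# (Drake, Henry, Jones, Sanders, Ekeler), copied from the output above.
extra_rb_vorp = [129.88, 127.35, 125.88, 115.43, 114.27]

base_total = 1925.13                 # objective of the basic model (rounded)
extra_total = sum(extra_rb_vorp)     # contribution of the five substitutes
variation_total = base_total + extra_total

println("substitutes add: ", round(extra_total; digits=2))       # 612.81
println("Variation-1 total: ", round(variation_total; digits=2)) # 2537.94
```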


## 5. Results and discussion ##

Based on the basic model above, we are able to pick the initial optimal fantasy football team (16 members). The results are:


| Kicker         | QBs            | RBs                 | WRs            | TEs          | D             |
| -------------- |:--------------:|:-------------------:|:--------------:| ------------:| -------------:|
| Harrison Butker| Lamar Jackson  | Christian McCaffrey | Michael Thomas | Travis Kelce | Steelers D/ST |
|                | Patrick Mahomes| Ezekiel Elliott     | DeAndre Hopkins| George Kittle|               |
|                |                | Saquon Barkley      | Julio Jones    |              |               |
|                |                | Dalvin Cook         | Chris Godwin   |              |               |
|                |                | Alvin Kamara        | Davante Adams  |              |               |

From our evaluation, we see that these are the players with the highest VORP in each position group of our dataset. The team contains 16 players: 1 kicker, 2 QBs, 5 RBs, 5 WRs, 2 TEs, and 1 defense. This is expected, because the only constraints in the basic model are the per-position counts. The result therefore matches our objective and hypothesis well.
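
Because the basic model only constrains the number of players at each position, its optimum can be cross-checked without a solver by taking the top-valued players in each position group. The sketch below shows this on toy data. The function name `top_k_roster`, the two position groups, and all values are invented for illustration and are not taken from our dataset.

```julia
# Toy VORP values per position (made up for illustration).
toy_vorp = Dict(
    'Q' => [10.0, 40.0, 25.0],
    'R' => [30.0, 5.0, 50.0, 45.0],
)
# Required number of players per position in this toy example.
required = Dict('Q' => 1, 'R' => 2)

# For each position, return the indices of the `required[pos]` largest values.
function top_k_roster(vorp, required)
    roster = Dict{Char,Vector{Int}}()
    for (pos, k) in required
        order = sortperm(vorp[pos]; rev=true)  # player indices from best to worst
        roster[pos] = sort(order[1:k])         # keep the k best, in index order
    end
    return roster
end

roster = top_k_roster(toy_vorp, required)
total = sum(sum(toy_vorp[pos][idx]) for (pos, idx) in roster)
println("Q picks: ", roster['Q'])   # prints Q picks: [2]
println("R picks: ", roster['R'])   # prints R picks: [3, 4]
println("total VORP: ", total)      # prints total VORP: 135.0
```

This shortcut only applies while positions do not interact. Once constraints couple the positions or depend on which players are still available in a given draft round, the integer program is needed.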

We will add more constraints and variations to build our new model in the next step.