# Logistics Analysis

In this example, we will pick an optimal fantasy basketball lineup with linear programming. The example selects NBA players to maximize fantasy points under a salary cap and per-position roster limits, and it is based on the blog post [How to Solve Optimization Problems with Python](https://towardsdatascience.com/how-to-solve-optimization-problems-with-python-9088bf8d48e5).

In [1]:
import pandas as pd
import numpy as np

## Loading the data

We will get the data from the HTML tables available from [Rotoguru](http://rotoguru.net), which gathers stats from fantasy leagues. We will focus on the data of NBA players.

In [2]:
data_url = 'http://rotoguru1.com/cgi-bin/hyday.pl?game=fd'
dfs = (
    pd
    .read_html(data_url)
)

There are a few HTML tables in the page, and after some scrolling we can see that the table with the players' data is the 6th one (index=5):
* Start with the dataframe at index 5 of the list of tables from the HTML page above
* Rename the columns to: Position, Name, FD Points, Salary, Team, Opp., Score, Min, Stats
* Filter out rows with a Position string longer than 2 characters (header lines)
* Convert the FD Points column to float and store it in a new Points column
* Remove the dollar sign ($) and commas from the Salary column and convert it to integer
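
The cleaning steps above can be tried offline before touching the scraped page. The sketch below applies them to a small hand-made frame; the rows (including the stray header row) are hypothetical stand-ins for the real table, and a boolean mask is used for the filter as an assumption-free alternative to `query`.

```python
import pandas as pd

# Hypothetical rows imitating the scraped table, with one header line mixed into the data
raw = pd.DataFrame(
    [
        ["SG", "VanVleet, Fred^", "77.6", "$8,800"],
        ["Guards", "Player", "FD Points", "Salary"],  # header line inside the data
        ["PG", "Curry, Stephen^", "69.2", "$9,400"],
    ]
)

cleaned = (
    raw
    .rename(columns={0: "Position", 1: "Name", 2: "FD Points", 3: "Salary"})
    # Keep only real player rows: position codes have at most 2 characters
    .loc[lambda x: x["Position"].str.len() <= 2]
    # Numeric copy of the fantasy points
    .assign(Points=lambda x: x["FD Points"].astype(float))
    # Strip "$" and "," so the salary can be parsed as an integer
    .assign(Salary=lambda x: x["Salary"].str.replace("[$,]", "", regex=True).astype(int))
)
print(cleaned[["Name", "Points", "Salary"]])
```

Filtering before the type conversions matters here: the header row would otherwise fail the `astype(float)` call.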


In [22]:
all_players = (
    dfs[5]
    .rename(
        columns={
            0:'Position',
            1:'Name',
            2:'FD Points',
            3:'Salary',
            4:'Team',
            5:'Opp.',
            6:'Score',
            7:'Min',
            8:'Stats'
            }
    )
    .query('Position.str.len() <= 2')
    .assign(Points = lambda x : x['FD Points'].astype(float))
    .assign(Salary = lambda x : x.Salary.str.replace('[$,]','', regex=True).astype(int))
)
all_players

Unnamed: 0,Position,Name,FD Points,Salary,Team,Opp.,Score,Min,Stats,Points
2,SG,"VanVleet, Fred^",77.6,8800,tor,@ orl,123-108,37:06,54pt 3rb 2as 3st 3bl 1to 11trey 17-23fg 9-9ft,77.6
3,PG,"Curry, Stephen^",69.2,9400,gsw,v bos,107-111,38:10,38pt 11rb 8as 3st 3to 7trey 12-21fg 7-7ft,69.2
4,SG,"Harden, James^",57.2,10500,bkn,v lac,124-120,43:04,23pt 11rb 14as 1st 3to 1trey 7-15fg 8-8ft,57.2
5,PG,"Irving, Kyrie^",56,9500,bkn,v lac,124-120,36:37,39pt 5rb 2as 1st 2bl 1to 6trey 15-23fg 3-3ft,56.0
6,PG,"Lowry, Kyle^",50,7500,tor,@ orl,123-108,34:25,14pt 10rb 10as 4st 3to 3trey 4-11fg 3-5ft,50.0
...,...,...,...,...,...,...,...,...,...,...
199,C,"Valanciunas, Jonas",0,6800,mem,@ ind,116-134,DNP,,0.0
200,C,"Azubuike, Udoka",0,3500,uta,v det,117-105,,,0.0
201,C,"Okafor, Jahlil",0,3500,det,@ uta,105-117,,,0.0
202,C,"Bryant, Thomas",0,6400,was,v por,121-132,,,0.0


In [4]:
pip install pulp

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [5]:
# Get a list of players
players = list(all_players['Name'])

In [6]:
from pulp import LpMaximize, LpProblem, LpVariable, lpSum

# One 0/1 decision variable per player: 1 if the player is in the lineup, 0 otherwise
player_vars = LpVariable.dicts("Player", players, lowBound=0, upBound=1, cat='Integer')

In [7]:
total_score = LpProblem("Fantasy_Points_Problem", LpMaximize)

In [23]:
points = (
    all_players
    .set_index('Name')
    .to_dict()
    ['Points']
)

In [24]:
positions = (
    all_players
    .set_index('Name')
    .to_dict()
    ['Position']
)

In [25]:
salaries = (
    all_players
    .set_index('Name')
    .to_dict()
    ['Salary']
)
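
The three lookups above follow the same pattern, so they can also be built from one indexed frame. The sketch below shows this on a hypothetical three-player frame (the names and numbers are made up); it also checks that names are unique, since a name-keyed dictionary silently keeps only one entry per duplicated name.

```python
import pandas as pd

# Hypothetical stand-in for all_players
toy_players = pd.DataFrame(
    {
        "Name": ["Player A", "Player B", "Player C"],
        "Position": ["PG", "SG", "C"],
        "Salary": [9000, 7000, 5000],
        "Points": [50.0, 30.0, 25.5],
    }
)

# A dict keyed by name only works if every name appears once
names_are_unique = toy_players["Name"].is_unique

# Index once, then turn each column into a name -> value dictionary
by_name = toy_players.set_index("Name")
toy_points = by_name["Points"].to_dict()
toy_positions = by_name["Position"].to_dict()
toy_salaries = by_name["Salary"].to_dict()

print(names_are_unique, toy_points, toy_positions, toy_salaries)
```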

In [26]:
# Objective: maximize the total fantasy points of the selected players
total_score += lpSum([points[i] * player_vars[i] for i in player_vars])

In [27]:
# Constraint: the total salary of the selected players must not exceed 60,000
total_score += lpSum([salaries[i] * player_vars[i] for i in player_vars]) <= 60000

In [28]:
# Get the names of the players for each position
pg = [p for p in positions.keys() if positions[p] == 'PG']
sg = [p for p in positions.keys() if positions[p] == 'SG']
sf = [p for p in positions.keys() if positions[p] == 'SF']
pf = [p for p in positions.keys() if positions[p] == 'PF']
c = [p for p in positions.keys() if positions[p] == 'C']
# Roster constraints: exactly 2 PG, 2 SG, 2 SF, 2 PF and 1 C
total_score += lpSum([player_vars[i] for i in pg]) == 2
total_score += lpSum([player_vars[i] for i in sg]) == 2
total_score += lpSum([player_vars[i] for i in sf]) == 2
total_score += lpSum([player_vars[i] for i in pf]) == 2
total_score += lpSum([player_vars[i] for i in c]) == 1
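
Taken together, the objective and constraints added to `total_score` describe a small integer program. It can be written out as below, where the symbols are notation introduced here for illustration: $x_i$ is the 0/1 variable for player $i$, $p_i$ the player's points, and $s_i$ the player's salary.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{i} p_i \, x_i \\
\text{s.t.} \quad & \sum_{i} s_i \, x_i \le 60000, \\
& \sum_{i \in \mathrm{PG}} x_i = 2, \quad
  \sum_{i \in \mathrm{SG}} x_i = 2, \quad
  \sum_{i \in \mathrm{SF}} x_i = 2, \quad
  \sum_{i \in \mathrm{PF}} x_i = 2, \\
& \sum_{i \in \mathrm{C}} x_i = 1, \\
& x_i \in \{0, 1\} \quad \text{for every player } i.
\end{aligned}
```

The position constraints sum to nine selected players, which matches the size of the lineup printed at the end.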

In [29]:
total_score.solve()

1
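
The `1` returned by `solve()` is PuLP's status code for an optimal solution. To build intuition for what "optimal" means here, the sketch below solves a toy problem of the same shape (maximize points under a salary cap with fixed per-position counts) by plain enumeration. This is a brute-force check, not how the solver works, and the pool, cap, and roster are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini pool: name -> (position, salary, points)
pool = {
    "A": ("PG", 9000, 50.0),
    "B": ("PG", 6000, 35.0),
    "C": ("PG", 4000, 20.0),
    "D": ("C", 8000, 40.0),
    "E": ("C", 5000, 30.0),
}
CAP = 20000                  # toy salary cap
NEEDED = {"PG": 2, "C": 1}   # toy roster: 2 point guards, 1 center


def best_lineup(pool, needed, cap):
    """Try every lineup of the right size and keep the best feasible one."""
    size = sum(needed.values())
    best, best_points = None, float("-inf")
    for lineup in combinations(sorted(pool), size):
        # Position counts must match the roster exactly
        if dict(Counter(pool[n][0] for n in lineup)) != needed:
            continue
        # Total salary must stay within the cap
        if sum(pool[n][1] for n in lineup) > cap:
            continue
        total = sum(pool[n][2] for n in lineup)
        if total > best_points:
            best, best_points = lineup, total
    return best, best_points


lineup, total_points = best_lineup(pool, NEEDED, CAP)
print(lineup, total_points)
```

In this toy pool the highest-scoring center does not fit under the cap next to the two best guards, so the best feasible lineup takes the cheaper center. Enumeration is only practical for tiny pools, which is why the notebook hands the real problem to a solver.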

In [30]:
for v in total_score.variables():
    if v.varValue > 0:
        print(v.name)

Player_Brooks,_Dillon^
Player_Covington,_Robert^
Player_Curry,_Stephen^
Player_Jackson,_Josh
Player_Leonard,_Kawhi^
Player_Lowry,_Kyle^
Player_O'Neale,_Royce^
Player_Plumlee,_Mason^
Player_VanVleet,_Fred^
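
A quick way to sanity-check a result like the one above is to total the salary and points of the selected players. The helper below is a hypothetical sketch that works on plain dictionaries. With the notebook's objects, the `chosen` mapping could be built as `{name: var.value() for name, var in player_vars.items()}`, which is an assumption about usage, not part of the original notebook. The sample values are made up.

```python
def lineup_summary(chosen, salaries, points, threshold=0.5):
    """Return (names, total_salary, total_points) for the selected players.

    chosen maps player name -> solver value (0 or 1, possibly a float or None).
    """
    names = sorted(
        name
        for name, value in chosen.items()
        # Skip unset variables and treat values above the threshold as selected
        if value is not None and value > threshold
    )
    total_salary = sum(salaries[n] for n in names)
    total_points = sum(points[n] for n in names)
    return names, total_salary, total_points


# Hypothetical solver values and lookups
chosen = {"Player A": 1.0, "Player B": 0.0, "Player C": 1.0, "Player D": None}
toy_salaries = {"Player A": 9000, "Player B": 7000, "Player C": 5000, "Player D": 3500}
toy_points = {"Player A": 50.0, "Player B": 30.0, "Player C": 25.5, "Player D": 0.0}

names, total_salary, total_pts = lineup_summary(chosen, toy_salaries, toy_points)
print(names, total_salary, total_pts)
```

Comparing against a threshold of 0.5 instead of testing `> 0` guards against solver values that come back as floats slightly off 0 or 1.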
