# First Homework. Optimization and Analytics

#### Done by Javier Alzuaz & Jaime Lobato

### Motivation for the Optimization problem: improving the Indiana Pacers’ performance through optimal shot allocation and lineup selection

The Indiana Pacers reached the NBA Finals last season, establishing themselves as one of the strongest teams in the league. However, their performance this year has worsened significantly, mainly due to a combination of injuries, inconsistent rotations, and the need to rely more heavily on G-League call-ups to maintain roster depth.

This sudden transition from contending for a championship to sitting in last place in the NBA standings creates a realistic decision-making scenario for applying optimization techniques.

The coaching staff faces some important questions:

- Given the current roster situation, how should the team allocate its offensive shots among the available players to maximize expected scoring?

- Given the current roster performances, injuries, and the mix of NBA and G-League players, what starting lineup would maximize the team’s on-court effectiveness?

This is a strategic problem with limited resources (healthy players, budget) and clear performance metrics, making it ideal for formulation as a Linear Optimization and later a Mixed-Integer Optimization model.

## Part a. Linear Optimization Model - Shot Allocation

### Mathematical Formulation of the Problem

### Sets

  - $P$: set of all players available.
  - $T = \{2P, 3P\}$: set of shot types.

### Parameters

For each player $i \in P$ and $t \in T$:

- $e_{i, t}$: expected points per shot of type $t$ taken by player $i$.

For example: $e_{i, 2P} = 2 \cdot 2P\%_{i}, \quad e_{i, 3P} = 3 \cdot 3P\%_{i}$

- $S_{i, 2}^{\max}: \text{ max 2-point shots for player } i$ 
- $S_{i, 3}^{\max}: \text{ max 3-point shots for player } i$


With these caps, we ensure that the shots are distributed across the entire roster rather than concentrated on a single player.

Team parameter:

- $S^{team}$: total number of shot attempts taken by the team in a game.

### Decision Variables

$$s_{i, t} \geq 0$$

- $s_{i, t}$: number of shots of type $t$ that player $i$ takes.


### Objective Function

We want to maximize the expected total points.

\begin{alignat*}{2}
\max \quad & \sum_{i \in P} \sum_{t \in T} e_{i,t} s_{i,t} \\
\text{s.t.} \quad & \sum_{i \in P} \sum_{t \in T} s_{i,t} = S^{\text{team}} & \quad & \text{(total shots)} \\
& s_{i,2P} \leq S_{i,2}^{\max} & \quad & \forall i \in P \quad \text{(player max 2-point shots)} \\
& s_{i,3P} \leq S_{i,3}^{\max} & \quad & \forall i \in P \quad \text{(player max 3-point shots)} \\
& s_{i,t} \geq 0 & \quad & \forall i \in P, t \in T \quad \text{(non-negativity)}
\end{alignat*}
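Because this LP has a single coupling constraint with unit coefficients plus simple upper bounds, its optimum can also be obtained by filling the most efficient (player, shot type) pairs first. The following is a minimal sketch of that idea for cross-checking a solver result; the function name `greedy_allocation` and the toy data are hypothetical and are not part of the notebook.

```python
def greedy_allocation(e, caps, s_team):
    """Fill the most efficient (player, shot type) pairs first, up to each cap."""
    if sum(caps.values()) < s_team:
        raise ValueError("caps cannot absorb s_team shots: the LP is infeasible")
    shots = {key: 0.0 for key in e}
    remaining = float(s_team)
    # Visit pairs from highest to lowest expected points per shot.
    for key in sorted(e, key=e.get, reverse=True):
        take = min(caps[key], remaining)
        shots[key] = take
        remaining -= take
        if remaining <= 0:
            break
    points = sum(e[k] * shots[k] for k in e)
    return shots, points


# Hypothetical toy data: expected points per shot and caps for two players.
e = {("A", "2P"): 1.10, ("A", "3P"): 1.20, ("B", "2P"): 0.90, ("B", "3P"): 1.05}
caps = {("A", "2P"): 10, ("A", "3P"): 5, ("B", "2P"): 10, ("B", "3P"): 5}

shots, points = greedy_allocation(e, caps, s_team=22)
print(shots)
print(round(points, 2))
```

In this toy case the three most efficient options are filled to their caps and the least efficient one only absorbs the two leftover shots, which is the same pattern the solver produces on the real data.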


In [16]:
import pandas as pd
from pyomo.environ import *
import numpy as np

# We need per game stats for making some calculations easier
data_per_game = pd.read_csv("pacers_stats_per_game.csv" , sep=";", encoding="utf-8-sig")

# We delete the last row, which contains totals/averages for the team
data_per_game = data_per_game.drop(data_per_game.index[-1])

data_per_game

Unnamed: 0,Rk,Player,Age,Pos,G,GS,MP,FG,FGA,FG%,...,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,Awards,Player-additional
0,1.0,Bennedict Mathurin,23.0,SF,2,2,36.5,8.5,15.5,0.548,...,5.5,7.0,2.5,0.0,0.0,2.5,3.0,31.0,,mathube01
1,2.0,Pascal Siakam,31.0,PF,11,11,34.9,8.6,19.4,0.446,...,5.7,7.4,5.3,1.1,0.4,2.6,3.3,24.1,,siakapa01
2,3.0,Aaron Nesmith,26.0,SF,11,11,30.5,4.9,13.4,0.367,...,2.9,4.5,1.5,0.8,0.3,0.7,2.5,15.5,,nesmiaa01
3,4.0,Jarace Walker,22.0,PF,12,7,29.1,3.4,11.5,0.297,...,4.4,5.2,3.3,0.4,0.6,2.1,2.1,10.3,,walkeja02
4,5.0,Andrew Nembhard,26.0,PG,5,5,27.4,5.2,14.2,0.366,...,1.2,1.4,6.8,0.4,0.0,2.2,2.6,17.2,,nembhan01
5,6.0,Obi Toppin,27.0,PF,3,0,27.3,5.0,12.0,0.417,...,5.7,6.7,1.7,1.0,0.0,2.0,2.3,14.0,,toppiob01
6,7.0,Ben Sheppard,24.0,SG,12,5,25.4,2.5,7.8,0.323,...,3.8,4.7,1.7,0.4,0.2,0.7,2.6,6.8,,sheppbe01
7,8.0,Quenton Jackson,27.0,PG,5,3,20.2,4.0,7.4,0.541,...,2.4,3.4,3.6,1.0,0.2,1.2,2.0,11.8,,jacksqu01
8,9.0,James Wiseman,24.0,C,1,1,20.0,2.0,3.0,0.667,...,0.0,4.0,0.0,0.0,1.0,3.0,2.0,4.0,,wisemja01
9,10.0,Jeremiah Robinson-Earl,25.0,PF,7,2,19.4,2.0,5.6,0.359,...,3.4,6.4,1.0,0.7,0.0,0.3,1.1,5.4,,robinje02


We create parameters for the model

In [17]:
# Index for players
players_idx = list(data_per_game.index)

# Shot types
shot_types = ["2P", "3P"]

# Build expected points per shot: e_{i,2P} = 2 * 2P%, e_{i,3P} = 3 * 3P%
e = {}
for i in players_idx:
    two_p_pct = data_per_game.loc[i, "2P%"]
    three_p_pct = data_per_game.loc[i, "3P%"]
    if pd.isna(three_p_pct):
        three_p_pct = 0.0
    e[(i, "2P")] = 2 * two_p_pct
    e[(i, "3P")] = 3 * three_p_pct


""" Max shots per player: 1.25 * their average attempts per game of each type (2PA and 3PA, from the stats file). 
We use 1.25 as a scaling factor to allow for some flexibility, but we want to limit it based on their usual attempts """

S2_max = {}
S3_max = {}

for i in players_idx:
    two_pa = data_per_game.loc[i, "2PA"]   # average 2-point attempts per game
    three_pa = data_per_game.loc[i, "3PA"] # average 3-point attempts per game

    # Max 2P = 1.25 * 2PA
    S2_max[i] = 1.25 * two_pa

    # Max 3P = 1.25 * 3PA
    S3_max[i] = 1.25 * three_pa


""" Total team shots in a game. According to the stats file, the Pacers attempt around 95 shots per game.
We want to include some flexibility, so we set a slightly higher limit. 
When being more efficient, the style of play may be quicker, which would lead to more shot attempts. """

S_team = 100
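Since the total-shots constraint is an equality, the model is only feasible when the per-player caps add up to at least `S_team`. A small sanity check along these lines could be run before building the model; the helper name `shot_budget_is_feasible` and the attempt figures below are hypothetical.

```python
def shot_budget_is_feasible(s2_max, s3_max, s_team):
    """Return (total capacity, True if the caps can absorb s_team shots)."""
    total_cap = sum(s2_max.values()) + sum(s3_max.values())
    return total_cap, total_cap >= s_team


# Hypothetical per-game attempts for three players (not taken from the stats file).
two_pa = {0: 10.5, 1: 14.7, 2: 6.0}
three_pa = {0: 5.0, 1: 4.6, 2: 7.5}

# Same 1.25 scaling factor as in the notebook.
S2_max = {i: 1.25 * a for i, a in two_pa.items()}
S3_max = {i: 1.25 * a for i, a in three_pa.items()}

total_cap, ok_50 = shot_budget_is_feasible(S2_max, S3_max, s_team=50)
_, ok_65 = shot_budget_is_feasible(S2_max, S3_max, s_team=65)
print(round(total_cap, 3), ok_50, ok_65)
```

With these toy numbers a budget of 50 shots fits under the caps while 65 does not, in which case the solver would report infeasibility.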


We build the model

In [18]:
from itertools import combinations  

model = ConcreteModel()

# Sets
model.PLAYERS = Set(initialize=players_idx)
model.TYPES = Set(initialize=shot_types)

# Parameters
model.e = Param(model.PLAYERS, model.TYPES, initialize=e)
model.S2_max = Param(model.PLAYERS, initialize=S2_max)
model.S3_max = Param(model.PLAYERS, initialize=S3_max)
model.S_team = Param(initialize=S_team)

# Decision variables: s_{i,t} >= 0
model.s = Var(model.PLAYERS, model.TYPES, domain=NonNegativeReals)


In [19]:
# We define the function to maximize: total expected points

def objective_rule(m):
    return sum(m.e[i, t] * m.s[i, t] for i in m.PLAYERS for t in m.TYPES)

model.OBJ = Objective(rule=objective_rule, sense=maximize)


In [20]:
# We define the constraints

# Total shots constraint
def total_shots_rule(m):
    return sum(m.s[i, t] for i in m.PLAYERS for t in m.TYPES) == m.S_team

model.TotalShots = Constraint(rule=total_shots_rule)

# Max 2P attempts per player
def max_2p_rule(m, i):
    return m.s[i, "2P"] <= m.S2_max[i]

model.Max2PShots = Constraint(model.PLAYERS, rule=max_2p_rule)

# Max 3P attempts per player
def max_3p_rule(m, i):
    return m.s[i, "3P"] <= m.S3_max[i]

model.Max3PShots = Constraint(model.PLAYERS, rule=max_3p_rule)

# We define the dual suffix to access dual values after solving
model.dual = Suffix(direction=Suffix.IMPORT_EXPORT)



In [21]:
# We solve the model

solver = SolverFactory("glpk")

result = solver.solve(model, tee=True)

print(result.solver.termination_condition)


GLPSOL--GLPK LP/MIP Solver 5.0
Parameter(s) specified in the command line:
 --write C:\Users\ralzu\AppData\Local\Temp\tmp72dn1gds.glpk.raw --wglp C:\Users\ralzu\AppData\Local\Temp\tmpepvdz4q5.glpk.glp
 --cpxlp C:\Users\ralzu\AppData\Local\Temp\tmp5d8317_d.pyomo.lp
Reading problem data from 'C:\Users\ralzu\AppData\Local\Temp\tmp5d8317_d.pyomo.lp'...
41 rows, 40 columns, 80 non-zeros
287 lines were read
Writing problem data to 'C:\Users\ralzu\AppData\Local\Temp\tmpepvdz4q5.glpk.glp'...
240 lines were written
GLPK Simplex Optimizer 5.0
41 rows, 40 columns, 80 non-zeros
Preprocessing...
1 row, 37 columns, 37 non-zeros
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 1
      0: obj =   1.142000000e+02 inf =   8.688e+01 (1)
     10: obj =   9.869887500e+01 inf =   0.000e+00 (0)
*    24: obj =   1.154958750e+02 inf =   0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Time used:  

In [22]:
# We display the solution

solution_rows = []

for i in model.PLAYERS:
    s_2p = value(model.s[i, "2P"])
    s_3p = value(model.s[i, "3P"])
    total_shots_i = s_2p + s_3p
    contrib = value(model.e[i, "2P"]) * s_2p + value(model.e[i, "3P"]) * s_3p
    
    # Only include players who actually shoot
    if total_shots_i > 1e-6:
        solution_rows.append({
            "Player": data_per_game.loc[i, "Player"] if "Player" in data_per_game.columns else i,
            "2P_shots": s_2p,
            "3P_shots": s_3p,
            "Total_shots": total_shots_i,
            "Expected_points": contrib
        })

solution_df = pd.DataFrame(solution_rows)
solution_df = solution_df.sort_values(by="Expected_points", ascending=False)

print("\nOptimal shot allocation:")
print(solution_df)

total_expected_points = solution_df["Expected_points"].sum()
print(f"\nTotal expected points: {total_expected_points:.2f}")
print(f"Total shots used: {solution_df['Total_shots'].sum():.0f} (should be {S_team})")



Optimal shot allocation:
                Player  2P_shots  3P_shots  Total_shots  Expected_points
0   Bennedict Mathurin    13.125     6.250       19.375        24.363750
1        Pascal Siakam    18.375     5.750       24.125        23.421000
5      Quenton Jackson     6.250     3.000        9.250        11.500000
2        Aaron Nesmith     0.000     9.375        9.375        10.490625
4           Obi Toppin     7.875     0.000        7.875         9.954000
7       Isaiah Jackson     7.500     0.000        7.500         8.340000
10        Tony Bradley     4.750     0.000        4.750         6.194000
6        James Wiseman     3.750     0.000        3.750         5.002500
12         Mac McClung     3.375     0.000        3.375         4.218750
8          RayJ Dennis     0.000     4.375        4.375         4.121250
3      Andrew Nembhard     3.125     0.000        3.125         2.887500
11       Johnny Furphy     0.000     1.250        1.250         2.501250
9             Jay Huff   

### Interpretation of the Optimal Shot Allocation

The optimal shot distribution concentrates attempts on the players with the highest shooting efficiency. Pascal Siakam and Bennedict Mathurin receive by far the most shots, and efficient scorers such as Aaron Nesmith, Quenton Jackson and James Wiseman reach their maximum allowed attempts for at least one shot type, because each of their shots contributes more to the expected score than those of other teammates. As a result, they become the primary offensive options in the team strategy. Players whose 3-point attempts are worth more than their 2-point attempts, such as Nesmith, RayJ Dennis and Johnny Furphy, are assigned only 3-point attempts, demonstrating that three-point shots offer more expected value when taken by efficient shooters.

In contrast, players with lower shooting efficiencies, limited offensive roles, or missing 3-point data receive only a small number of attempts. The model does not give them more shots because doing so would not increase expected scoring.

Overall, the allocation shows a realistic offensive structure: the most efficient scorers take a larger proportion of the shots, role players contribute modestly, and inefficient scorers are barely used.

Regarding expected points, the model raises the team from 108.5 points per game to 115.5 expected points per game. This is a substantial increase that could help the Indiana Pacers win more games. Part of the gain comes from raising the shot budget from roughly 95 attempts per game to 100, which suggests that the team would benefit from a quicker style of play that generates more shot attempts.

## Part c. Sensitivities associated with each constraint

In [23]:
print("\nDual values for all active constraints:\n")

for c in model.component_objects(Constraint, active=True):
    print(f"Constraint block: {c.name}")
    for index in c:
        constr = c[index]
        dual_val = model.dual.get(constr, 0.0)
        print(f"  {constr.name}: dual = {dual_val:.4f}")
    print()


Dual values for all active constraints:

Constraint block: TotalShots
  TotalShots: dual = 0.9240

Constraint block: Max2PShots
  Max2PShots[0]: dual = 0.2180
  Max2PShots[1]: dual = 0.0380
  Max2PShots[2]: dual = 0.0000
  Max2PShots[3]: dual = 0.0000
  Max2PShots[4]: dual = 0.0000
  Max2PShots[5]: dual = 0.3400
  Max2PShots[6]: dual = 0.0000
  Max2PShots[7]: dual = 0.1960
  Max2PShots[8]: dual = 0.4100
  Max2PShots[9]: dual = 0.0000
  Max2PShots[10]: dual = 0.1880
  Max2PShots[11]: dual = 0.0000
  Max2PShots[12]: dual = 0.4100
  Max2PShots[13]: dual = 0.0000
  Max2PShots[14]: dual = 0.3800
  Max2PShots[15]: dual = 0.0000
  Max2PShots[16]: dual = 0.0000
  Max2PShots[17]: dual = 0.0000
  Max2PShots[18]: dual = 0.3260
  Max2PShots[19]: dual = 0.0000

Constraint block: Max3PShots
  Max3PShots[0]: dual = 0.5760
  Max3PShots[1]: dual = 0.0750
  Max3PShots[2]: dual = 0.1950
  Max3PShots[3]: dual = 0.0000
  Max3PShots[4]: dual = 0.0000
  Max3PShots[5]: dual = 0.0000
  Max3PShots[6]: dual = 0

### Interpretation of Sensitivity Results

In a linear optimization model, each constraint has an associated dual value that measures how the optimal objective would change if the right-hand side of that constraint were relaxed by one unit.

- A non-zero dual value indicates that the corresponding constraint is 'binding': it is 'active' at the optimum and has a direct effect on the solution. In our context, allowing one additional shot would increase the team’s expected points by approximately the value of the dual.

- Conversely, a zero dual value means the constraint is 'non-binding': the player is not using up their full shooting capacity in the optimal solution, so increasing their shot limit would not change the objective. Zero duals often occur for less efficient shooters, role players, or players whose shot type (2P or 3P) is not used by the optimizer.

In summary, non-zero dual values identify the constraints that limit the team’s performance, while zero dual values correspond to constraints that could be relaxed without changing the optimal solution.

#### Total Shots Constraint:

This constraint has a dual value of 0.924, which represents the marginal value of one additional team shot within the current optimal solution. In other words, if the Pacers took one additional shot, the expected points would increase by approximately 0.924 points.

This constraint is binding: the team uses up all available shot attempts, and having more would improve performance.
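The meaning of this dual can be illustrated with a finite difference: solve the allocation twice, with budgets that differ by one shot, and compare the objectives. The sketch below does this on hypothetical toy data with a greedy solver, which is optimal for this particular LP structure; `best_points` is an illustrative helper, not part of the notebook.

```python
def best_points(e, caps, s_team):
    """Optimal objective of the shot-allocation LP, obtained by greedy filling."""
    remaining, points = float(s_team), 0.0
    # Most efficient options first; each one is filled up to its cap.
    for key in sorted(e, key=e.get, reverse=True):
        take = min(caps[key], remaining)
        points += e[key] * take
        remaining -= take
    if remaining > 1e-9:
        raise ValueError("caps cannot absorb s_team shots")
    return points


# Hypothetical shooting options: expected points per shot and caps.
e = {"A": 1.20, "B": 1.05, "C": 0.90}
caps = {"A": 5, "B": 5, "C": 10}

base = best_points(e, caps, 12)
bumped = best_points(e, caps, 13)
marginal = bumped - base
print(round(marginal, 3))
```

The extra shot goes to option C, the only one with spare capacity, so the marginal value equals C's efficiency of 0.90. In the notebook's solution the value 0.924 plays the same role: it is the efficiency of the shot that would be added next.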


#### 2-point Shots Constraint:

The dual value associated with each player’s 2-point constraint tells us how valuable it would be to allow that player to take one additional 2-point attempt.

Several players have a zero dual value: Aaron Nesmith, Jarace Walker, Andrew Nembhard, Ben Sheppard, Jeremiah Robinson-Earl, RayJ Dennis, Monte Morris, Cody Martin, T.J. McConnell, Johnny Furphy and Taelon Peter. These are players to whom the optimizer does not assign all their allowed 2-point shots because doing so would not increase the expected points. This may be because they are less efficient scorers or low-usage offensive players.

The remaining players have a positive dual value, meaning that each of them is already taking as many 2-pointers as allowed, and giving them one more shot would increase the team's expected points by approximately the dual value.


#### 3-point Shots Constraint:

The interpretation is analogous to that of the 2-point constraints.

We can see that only six players would increase the team's expected points by taking more threes. These players are Bennedict Mathurin, Pascal Siakam, Aaron Nesmith, Quenton Jackson, RayJ Dennis and Johnny Furphy.
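The reported duals can be cross-checked against the shot efficiencies. As a sketch, assuming a strictly positive cap and a non-degenerate optimum, and writing $\lambda$ for the dual of the total-shots constraint and $\mu_{i,t}$ for the dual of player $i$'s cap on shot type $t$ (both symbols are introduced here for illustration only):

$$\mu_{i,t} = \max\{0,\ e_{i,t} - \lambda\}, \qquad \lambda = 0.924$$

For Bennedict Mathurin (index 0), the reported duals give $e_{0,2P} = 0.924 + 0.218 = 1.142$ and $e_{0,3P} = 0.924 + 0.576 = 1.500$. With his allocation of 13.125 two-point and 6.25 three-point attempts:

$$13.125 \cdot 1.142 + 6.25 \cdot 1.500 = 14.98875 + 9.375 = 24.36375$$

which matches the expected points reported for him in the optimal allocation table.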


## Part d. Mixed-Integer Optimization Problem - Optimal Lineup 

### Mathematical Formulation of the Problem

### Sets

  - $P$: set of all players available.
  - $G^{GL} \subseteq P$: set of G-League players. The G-League team is the Indiana Pacers' second team.
  - $\{i_1, i_2\} \subseteq P$: pair of incompatible players.
  - $s \in P$: star player.
  - $D \subseteq P$: set of defender players.

Position subsets:
  - $P^{PG} \subseteq P$ represents the set of point guards (PGs).
  - $P^{SG} \subseteq P$ represents the set of shooting guards (SGs).
  - $P^{F} \subseteq P$ represents the set of forwards (SFs or PFs).
  - $P^{C} \subseteq P$ represents the set of centers (Cs).

### Parameters

For each player $i \in P$:

- $v_{i}$: valuation (performance score) for each player $i$.

We are going to compute this as follows:

- $v_{i} = \mathbf{FG\%}_{i} + \mathbf{FT\%}_{i} + \mathbf{3P\%}_{i} + \mathbf{PTS}_{i} + \mathbf{STL}_{i} + \mathbf{TRB}_{i} + \mathbf{AST}_{i} + \mathbf{BLK}_{i} - \mathbf{TOV}_{i} - \mathbf{PF}_{i}$

We define a linear performance index that rewards scoring efficiency (shooting percentages), points, rebounds, assists, steals and blocks, and penalizes negative aspects of the game such as turnovers and personal fouls.
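As a small illustration of the index, the sketch below evaluates it for a single hypothetical stat line, treating a missing statistic as 0. The names `POSITIVE`, `NEGATIVE` and `valuation` and the player's numbers are assumptions made for this example.

```python
# Statistics that add to the index and statistics that subtract from it.
POSITIVE = ["FG%", "FT%", "3P%", "PTS", "STL", "TRB", "AST", "BLK"]
NEGATIVE = ["TOV", "PF"]


def valuation(stats):
    """Linear performance index; missing or None statistics count as 0."""
    gain = sum(stats.get(k) or 0.0 for k in POSITIVE)
    loss = sum(stats.get(k) or 0.0 for k in NEGATIVE)
    return gain - loss


# Hypothetical per-game stat line; percentages are fractions between 0 and 1.
player = {
    "FG%": 0.450, "FT%": 0.800, "3P%": None,
    "PTS": 12.0, "STL": 1.0, "TRB": 5.0, "AST": 3.0, "BLK": 0.5,
    "TOV": 1.5, "PF": 2.0,
}

score = valuation(player)
print(round(score, 3))
```

Because the percentages are fractions, the three of them together can add at most 3 to the index, so counting statistics such as points and rebounds carry most of the weight.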

### Decision Variables

- $x_{i} = 
\begin{cases}
    1, & \text{if player } i \text{ is selected in the starting lineup,} \\
    0, & \text{otherwise.}
\end{cases}$

  $x_{i} \in \{0, 1\} \quad \forall i \in P$.


### Objective Function

We want to maximize the total valuation of the selected lineup.

\begin{aligned}
\max \quad &
  \sum_{i \in P} v_i \, x_i \\[0.4cm]
\text{s.t.} \quad
& \sum_{i \in P} x_i = 5
  && \text{(lineup size)} \\[0.3cm]
& \sum_{i \in P^{PG}} x_i \ge 1
  && \text{(at least 1 PG)} \\[0.2cm]
& \sum_{i \in P^{SG}} x_i \ge 1
  && \text{(at least 1 SG)} \\[0.2cm]
& \sum_{i \in P^{C}}  x_i \ge 1
  && \text{(at least 1 C)} \\[0.2cm]
& \sum_{i \in P^{F}}  x_i \ge 2
  && \text{(at least 2 forwards)} \\[0.4cm]
& \sum_{i \in G^{GL}} x_i \le 2
  && \text{(at most 2 G{-}League players)} \\[0.3cm]
& x_{i_1} + x_{i_2} \le 1
  && \text{(incompatible pair: Jackson / Bradley)} \\[0.3cm]
& x_s \le \sum_{j \in D} x_j
  && \text{(if the star plays, at least one defender too)} \\[0.4cm]
& x_i \in \{0,1\}
  && \forall i \in P \quad \text{(binary decisions)}
\end{aligned}
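With a roster of this size, the formulation can also be checked by brute force, since the number of five-player combinations is small. The sketch below enumerates every lineup of a hypothetical nine-player roster and keeps the best one that satisfies the same constraints; all names, positions and valuations are made up for illustration.

```python
from itertools import combinations

# Hypothetical roster: name -> (position, valuation, is_g_league)
roster = {
    "P1": ("PG", 22.0, False),
    "P2": ("PG", 18.0, True),
    "P3": ("SG", 10.0, True),
    "P4": ("SG", 12.0, False),
    "P5": ("SF", 35.0, False),
    "P6": ("PF", 33.0, False),
    "P7": ("PF", 20.0, False),
    "P8": ("C", 13.0, False),
    "P9": ("C", 12.0, False),
}
incompatible = ("P8", "P9")      # cannot start together
star, defenders = "P6", {"P3"}   # the star needs at least one defender


def feasible(lineup):
    """Check the positional and logical constraints of the formulation."""
    pos = [roster[p][0] for p in lineup]
    forwards = sum(q in ("SF", "PF") for q in pos)
    g_league = sum(roster[p][2] for p in lineup)
    return (
        pos.count("PG") >= 1 and pos.count("SG") >= 1 and pos.count("C") >= 1
        and forwards >= 2
        and g_league <= 2
        and not (incompatible[0] in lineup and incompatible[1] in lineup)
        and (star not in lineup or any(d in lineup for d in defenders))
    )


def total(lineup):
    """Total valuation of a lineup."""
    return sum(roster[p][1] for p in lineup)


best = max((l for l in combinations(roster, 5) if feasible(l)), key=total)
print(best, total(best))
```

In this toy roster the lower-rated shooting guard P3 is selected over P4 because the star P6 can only play alongside a designated defender, which shows how a logical constraint can override the raw valuations.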


In [24]:
import pandas as pd
from pyomo.environ import *
import numpy as np

In [26]:
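data = pd.read_csv("pacers_stats_per_game.csv", sep=";", encoding="utf-8-sig")

# As in Part a, we delete the last row, which contains totals/averages for the team,
# so that it is not treated as a player
data = data.drop(data.index[-1])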

# These are the columns we need to convert to numeric
num_cols = [
    "Age", "G", "GS", "MP", "FG", "FGA", "FG%", "3P", "3PA", "3P%",
    "2P", "2PA", "2P%", "eFG%", "FT", "FTA", "FT%", "ORB", "DRB",
    "TRB", "AST", "STL", "BLK", "TOV", "PF", "PTS"
]
for c in num_cols:
    data[c] = pd.to_numeric(data[c], errors="coerce")

data["Pos"] = data["Pos"].fillna("").astype(str)


# These are the columns for valuation metrics
cols_valuation = ["FG%", "FT%", "3P%", "PTS", "STL", "TRB", "AST", "BLK", "TOV", "PF"]

for c in cols_valuation:
    data[c] = pd.to_numeric(data[c], errors="coerce")
    data[c] = data[c].fillna(0.0)

In [27]:
# We compute each player's valuation as explained before

data["valuation"] = (
    data["FG%"] + data["FT%"] + data["3P%"] +
    data["PTS"] + data["STL"] + data["TRB"] +
    data["AST"] + data["BLK"] -
    data["TOV"] - data["PF"]
)
data["valuation"] = data["valuation"].fillna(0.0)
data

Unnamed: 0,Rk,Player,Age,Pos,G,GS,MP,FG,FGA,FG%,...,TRB,AST,STL,BLK,TOV,PF,PTS,Awards,Player-additional,valuation
0,1.0,Bennedict Mathurin,23.0,SF,2,2,36.5,8.5,15.5,0.548,...,7.0,2.5,0.0,0.0,2.5,3.0,31.0,,mathube01,36.933
1,2.0,Pascal Siakam,31.0,PF,11,11,34.9,8.6,19.4,0.446,...,7.4,5.3,1.1,0.4,2.6,3.3,24.1,,siakapa01,33.823
2,3.0,Aaron Nesmith,26.0,SF,11,11,30.5,4.9,13.4,0.367,...,4.5,1.5,0.8,0.3,0.7,2.5,15.5,,nesmiaa01,20.935
3,4.0,Jarace Walker,22.0,PF,12,7,29.1,3.4,11.5,0.297,...,5.2,3.3,0.4,0.6,2.1,2.1,10.3,,walkeja02,17.002
4,5.0,Andrew Nembhard,26.0,PG,5,5,27.4,5.2,14.2,0.366,...,1.4,6.8,0.4,0.0,2.2,2.6,17.2,,nembhan01,22.579
5,6.0,Obi Toppin,27.0,PF,3,0,27.3,5.0,12.0,0.417,...,6.7,1.7,1.0,0.0,2.0,2.3,14.0,,toppiob01,20.693
6,7.0,Ben Sheppard,24.0,SG,12,5,25.4,2.5,7.8,0.323,...,4.7,1.7,0.4,0.2,0.7,2.6,6.8,,sheppbe01,11.827
7,8.0,Quenton Jackson,27.0,PG,5,3,20.2,4.0,7.4,0.541,...,3.4,3.6,1.0,0.2,1.2,2.0,11.8,,jacksqu01,18.563
8,9.0,James Wiseman,24.0,C,1,1,20.0,2.0,3.0,0.667,...,4.0,0.0,0.0,1.0,3.0,2.0,4.0,,wisemja01,4.667
9,10.0,Jeremiah Robinson-Earl,25.0,PF,7,2,19.4,2.0,5.6,0.359,...,6.4,1.0,0.7,0.0,0.3,1.1,5.4,,robinje02,13.53


In [28]:
players_idx = list(data.index)

# We define position-based indices
PG_idx = [i for i in players_idx if "PG" in data.loc[i, "Pos"]]
SG_idx = [i for i in players_idx if "SG" in data.loc[i, "Pos"]]
C_idx  = [i for i in players_idx if data.loc[i, "Pos"] == "C" or " C" in data.loc[i,"Pos"]]
F_idx  = [i for i in players_idx if any(p in data.loc[i, "Pos"] for p in ["SF", "PF"])]


valuation_dict = data["valuation"].to_dict()

We start building the model

In [29]:
from itertools import combinations  


model = ConcreteModel()
model.PLAYERS = Set(initialize=players_idx)

model.valuation = Param(
    model.PLAYERS,
    initialize=valuation_dict
)

model.x = Var(model.PLAYERS, domain=Binary)


In [30]:
# We define the objective function: maximize total valuation
def objective_rule(m):
    return sum(m.valuation[i] * m.x[i] for i in m.PLAYERS)

model.OBJ = Objective(rule=objective_rule, sense=maximize)

# We start now defining the constraints

# The starting lineup must have exactly 5 players
def lineup_size_rule(m):
    return sum(m.x[i] for i in m.PLAYERS) == 5

model.LineupSize = Constraint(rule=lineup_size_rule)

# At least one Point Guard (PG)
def pg_rule(m):
    return sum(m.x[i] for i in PG_idx) >= 1

model.MinPG = Constraint(rule=pg_rule)

# At least one Shooting Guard (SG)
def sg_rule(m):
    return sum(m.x[i] for i in SG_idx) >= 1

model.MinSG = Constraint(rule=sg_rule)

# At least one Center (C)
def c_rule(m):
    return sum(m.x[i] for i in C_idx) >= 1

model.MinC = Constraint(rule=c_rule)

# At least two Forwards (SF or PF)
def f_rule(m):
    return sum(m.x[i] for i in F_idx) >= 2

model.MinF = Constraint(rule=f_rule)


# Two G-League players at most. We define the G-League players set by choosing their names in the dataset
gleague_names = [
    "Mac McClung",
    "Johnny Furphy",
    "Taelon Peter",
    "RayJ Dennis",
    "Ben Sheppard",
    "Quenton Jackson"
]

G_idx = [i for i in players_idx if data.loc[i, "Player"] in gleague_names]

if len(G_idx) > 0:
    model.MaxGLeague = Constraint(
        expr=sum(model.x[i] for i in G_idx) <= 2
    )

# Incompatibility constraints. For example, two traditional centers that we do not want to play together (Isaiah Jackson and Tony Bradley)
incompat_pair = ("Isaiah Jackson", "Tony Bradley")
incompat_idx = [i for i in players_idx if data.loc[i, "Player"] in incompat_pair]

if len(incompat_idx) == 2:
    i1, i2 = incompat_idx
    model.Incompatibility = Constraint(expr=model.x[i1] + model.x[i2] <= 1)

# If the star player is selected, at least one of the designated defenders must also be selected
star_name = "Pascal Siakam"
defenders = ["Cody Martin", "Ben Sheppard"]

star_idx = [i for i in players_idx if data.loc[i, "Player"] == star_name]
def_idx  = [i for i in players_idx if data.loc[i, "Player"] in defenders]

if len(star_idx) == 1 and len(def_idx) >= 1:
    i_star = star_idx[0]
    # x_star <= sum x_def
    model.StarNeedsDefender = Constraint(
        expr=model.x[i_star] <= sum(model.x[j] for j in def_idx)
    )
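The inequality `x_star <= sum(x_def)` is a linear way of writing the implication "if the star plays, at least one defender plays". A quick way to convince oneself is to enumerate all binary assignments and compare the two conditions. The sketch below does so for two defenders; the function names are illustrative only.

```python
from itertools import product


def implication_holds(x_star, x_defs):
    """Logical rule: star selected implies at least one defender selected."""
    return x_star == 0 or sum(x_defs) >= 1


def linear_constraint_holds(x_star, x_defs):
    """Linearized rule used in the model: x_star <= sum of defender variables."""
    return x_star <= sum(x_defs)


# Collect every binary assignment on which the two rules disagree.
mismatches = [
    (x_star, d1, d2)
    for x_star, d1, d2 in product((0, 1), repeat=3)
    if implication_holds(x_star, (d1, d2)) != linear_constraint_holds(x_star, (d1, d2))
]
print(len(mismatches))
```

An empty list of mismatches means the linear constraint excludes exactly the assignments that violate the logical rule.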

In [31]:
solver = SolverFactory("glpk")  
results = solver.solve(model, tee=True)

print("\nThe optimization was", results.solver.status)
print("The result is", results.solver.termination_condition)

GLPSOL--GLPK LP/MIP Solver 5.0
Parameter(s) specified in the command line:
 --write C:\Users\ralzu\AppData\Local\Temp\tmpm4hjytah.glpk.raw --wglp C:\Users\ralzu\AppData\Local\Temp\tmpn7xdpstw.glpk.glp
 --cpxlp C:\Users\ralzu\AppData\Local\Temp\tmp81yt1aas.pyomo.lp
Reading problem data from 'C:\Users\ralzu\AppData\Local\Temp\tmp81yt1aas.pyomo.lp'...
8 rows, 21 columns, 52 non-zeros
21 integer variables, all of which are binary
149 lines were read
Writing problem data to 'C:\Users\ralzu\AppData\Local\Temp\tmpn7xdpstw.glpk.glp'...
113 lines were written
GLPK Integer Optimizer 5.0
8 rows, 21 columns, 52 non-zeros
21 integer variables, all of which are binary
Preprocessing...
1 hidden covering inequaliti(es) were detected
8 rows, 21 columns, 52 non-zeros
21 integer variables, all of which are binary
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 8
Solving LP relaxati

We display the resulting starting lineup

In [32]:
selected_idx = [i for i in model.PLAYERS if value(model.x[i]) > 0.5]

best_lineup = data.loc[selected_idx].copy()
total_val = best_lineup["valuation"].sum()

print("\nThis is the best starting lineup found:\n")
print(f"Total valuation = {total_val:.2f}\n")

cols_show = ["Player", "Pos", "PTS", "TRB", "AST", "STL", "BLK", "TOV", "PF", "valuation"]
print(best_lineup[cols_show].to_string(index=False))

print("\nPosition distribution in the lineup:")
print(best_lineup["Pos"].value_counts())


This is the best starting lineup found:

Total valuation = 118.44

            Player Pos  PTS  TRB  AST  STL  BLK  TOV  PF  valuation
Bennedict Mathurin  SF 31.0  7.0  2.5  0.0  0.0  2.5 3.0     36.933
     Pascal Siakam  PF 24.1  7.4  5.3  1.1  0.4  2.6 3.3     33.823
   Andrew Nembhard  PG 17.2  1.4  6.8  0.4  0.0  2.2 2.6     22.579
      Ben Sheppard  SG  6.8  4.7  1.7  0.4  0.2  0.7 2.6     11.827
    Isaiah Jackson   C  8.3  6.1  0.9  0.7  0.3  1.2 3.2     13.282

Position distribution in the lineup:
Pos
SF    1
PF    1
PG    1
SG    1
C     1
Name: count, dtype: int64
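As a sanity check, the reported lineup can be verified against each constraint by hand. The sketch below hard-codes the five selected players, with positions and valuations copied from the output above, together with the G-League, incompatibility and defender lists from the model cell. The `checks` dictionary is just an illustrative way to organize the verification.

```python
# Selected lineup: name -> (position, valuation), copied from the solver output.
lineup = {
    "Bennedict Mathurin": ("SF", 36.933),
    "Pascal Siakam": ("PF", 33.823),
    "Andrew Nembhard": ("PG", 22.579),
    "Ben Sheppard": ("SG", 11.827),
    "Isaiah Jackson": ("C", 13.282),
}

# Lists taken from the model definition.
gleague_names = {"Mac McClung", "Johnny Furphy", "Taelon Peter",
                 "RayJ Dennis", "Ben Sheppard", "Quenton Jackson"}
incompat_pair = ("Isaiah Jackson", "Tony Bradley")
star_name, defenders = "Pascal Siakam", {"Cody Martin", "Ben Sheppard"}

positions = [pos for pos, _ in lineup.values()]
checks = {
    "lineup size": len(lineup) == 5,
    "at least 1 PG": positions.count("PG") >= 1,
    "at least 1 SG": positions.count("SG") >= 1,
    "at least 1 C": positions.count("C") >= 1,
    "at least 2 forwards": sum(p in ("SF", "PF") for p in positions) >= 2,
    "at most 2 G-League": len(gleague_names & set(lineup)) <= 2,
    "incompatible pair": not all(p in lineup for p in incompat_pair),
    "star needs defender": star_name not in lineup or bool(defenders & set(lineup)),
}

total = sum(val for _, val in lineup.values())
for name, ok in checks.items():
    print(f"{name}: {ok}")
print(f"Total valuation = {total:.2f}")
```

Every check passes and the valuations add up to the reported total of 118.44.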


### Interpretation of Optimal Lineup:

The mixed-integer model picks the five players who give the highest total valuation while respecting all the positional and logical constraints we added. The resulting lineup (Bennedict Mathurin, Pascal Siakam, Andrew Nembhard, Ben Sheppard and Isaiah Jackson) makes sense when looking at the individual stats. Mathurin and Siakam stand out as the main offensive options, since they contribute a lot in scoring and overall impact. Nembhard is chosen as the point guard because he combines scoring with good playmaking numbers. Ben Sheppard appears as the shooting guard partly because he fits the defensive requirement linked to Siakam, and Isaiah Jackson covers the center spot while also satisfying the rule that prevents him from being paired with Tony Bradley.

Taken together, the lineup looks like a realistic NBA starting five: two strong scorers (Mathurin and Siakam), a solid ball-handler (Nembhard), a 3-and-D wing (Sheppard), and an athletic big man (Jackson). This shows how the optimization model integrates statistical performance with basketball logic to produce a strategically consistent starting lineup.