In [1]:
import pandas as pd
from pulp import LpInteger, LpMaximize, LpProblem, LpStatus, LpVariable, lpSum

# POC
To start, I will assume I have already run the simulations and translated the results into a set of expected values (EVs). I will boil the problem down even further and begin with some dummy data.
## Dummy Data
Let's say we can choose from 10 drivers but may pick only 3, and that they have the following prices and EVs.

In [2]:
df = pd.DataFrame(
    {
        "driver": ["AAAAAA", "BBBBBB", "CCCCCC", "DDDDDD", "EEEEEE", "FFFFFF", "GGGGGG", "HHHHHH", "IIIIII", "JJJJJJ"],
        "EV": [1,2,1,2,3,1,2,3,1.5,1],
        "price": [5, 8, 4, 9, 13, 5, 6, 10, 7, 2]
    }
)

In [3]:
df["value"] = df["EV"] / df["price"]

Here we add the metric `value`, the EV per unit price. This just helps us visualize the best-value drivers, which appear at the top of the following table.

In [4]:
df.sort_values("value", ascending=False)

index,driver,EV,price,value
9,JJJJJJ,1.0,2,0.5
6,GGGGGG,2.0,6,0.333333
7,HHHHHH,3.0,10,0.3
1,BBBBBB,2.0,8,0.25
2,CCCCCC,1.0,4,0.25
4,EEEEEE,3.0,13,0.230769
3,DDDDDD,2.0,9,0.222222
8,IIIIII,1.5,7,0.214286
0,AAAAAA,1.0,5,0.2
5,FFFFFF,1.0,5,0.2


Let's say our budget is 15 bucks. What team should we choose? For example, **{J, G, C} gives an EV of 4** for a total price of 12. At first sight this seems to be the best. Let's formulate the problem mathematically and see whether an optimizer can reproduce this.

## Formulate the Problem
### Decision Variables
The choice of drivers $ d_i \in \{0,1\} $ for each driver $ i=0,...,9 $, where $ d_i = 1 $ means driver $ i $ is picked and $ d_i = 0 $ means they are not.
### Objective Function
For each boolean driver choice $ d_i $, and driver expected value $  e_i $, maximize the function $$ \sum_{i=0}^{9} d_i e_i $$
### Constraints
For each driver price $ p_i $, the team must fit within the budget
$$ \sum_{i=0}^{9} d_i p_i \le 15 $$
and contain exactly three drivers
$$ \sum_{i=0}^{9} d_i = 3 $$
### Data
All required data, $ e_i $ and $ p_i $ for $ i=0,...,9 $, is contained in `df`.
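
As a worked instance of the formulation, the hand-picked team {J, G, C} corresponds to $ d_2 = d_6 = d_9 = 1 $ and $ d_i = 0 $ otherwise (using the row indices of `df`). Plugging in the dummy data gives the objective value
$$ \sum_{i=0}^{9} d_i e_i = e_2 + e_6 + e_9 = 1 + 2 + 1 = 4 $$
and the total price
$$ \sum_{i=0}^{9} d_i p_i = p_2 + p_6 + p_9 = 4 + 6 + 2 = 12 $$
which is within the budget of 15, while $ \sum_{i=0}^{9} d_i = 3 $ as required. So this team is feasible, but nothing here shows it is optimal.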

## Solution
This is an integer programming problem, since the decision variables may only take integer values (specifically 0 or 1) rather than real values. Finding the solution is straightforward with `pulp`.

In [36]:
drivers = df["driver"].to_list()
# Map each driver name to its EV and price for easy lookup
EVs = df.set_index("driver")["EV"].to_dict()
prices = df.set_index("driver")["price"].to_dict()

In [37]:
prob = LpProblem("F1-Fantasy", LpMaximize)

In [38]:
driver_choices = LpVariable.dicts(
    "driver_choices",
    drivers,
    lowBound=0,
    upBound=1,
    cat=LpInteger
)

In [39]:
prob += lpSum([EVs[d]*driver_choices[d] for d in drivers]), "Total EV"

In [40]:
prob += lpSum([prices[d]*driver_choices[d] for d in drivers]) <= 15.0, "Total Price"

In [41]:
prob += lpSum([driver_choices[d] for d in drivers]) == 3.0, "Number of Drivers"

In [43]:
prob.solve()

1

In [45]:
print(LpStatus[prob.status])

Optimal


In [44]:
for v in prob.variables():
    print(f"{v.name} = {v.varValue}")

driver_choices_AAAAAA = 0.0
driver_choices_BBBBBB = 0.0
driver_choices_CCCCCC = 0.0
driver_choices_DDDDDD = 0.0
driver_choices_EEEEEE = 0.0
driver_choices_FFFFFF = 0.0
driver_choices_GGGGGG = 1.0
driver_choices_HHHHHH = 0.0
driver_choices_IIIIII = 1.0
driver_choices_JJJJJJ = 1.0


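The optimization spotted a better solution! The team {G, I, J} has an EV of 4.5 and spends the full budget of 15, beating the EV of 4 from the hand-picked {J, G, C}.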
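
With only 10 drivers and teams of 3 there are just 120 possible teams, so the solver's answer can be cross-checked by brute force. The following is a minimal sketch that does not use `pulp` at all; the `data` dictionary simply restates the dummy values from `df`, and the names `BUDGET`, `TEAM_SIZE` and `total` are introduced here for illustration only.

```python
from itertools import combinations

# Same dummy data as in `df`: driver -> (EV, price)
data = {
    "AAAAAA": (1, 5), "BBBBBB": (2, 8), "CCCCCC": (1, 4), "DDDDDD": (2, 9),
    "EEEEEE": (3, 13), "FFFFFF": (1, 5), "GGGGGG": (2, 6), "HHHHHH": (3, 10),
    "IIIIII": (1.5, 7), "JJJJJJ": (1, 2),
}
BUDGET = 15
TEAM_SIZE = 3


def total(team, idx):
    """Sum the EV (idx=0) or the price (idx=1) over the drivers in a team."""
    return sum(data[d][idx] for d in team)


# Keep only the teams that fit within the budget, then take the highest EV
feasible = [t for t in combinations(data, TEAM_SIZE) if total(t, 1) <= BUDGET]
best = max(feasible, key=lambda t: total(t, 0))

print(best, total(best, 0), total(best, 1))
```

This approach only works because the dummy problem is tiny. The number of candidate teams grows quickly with more drivers and larger teams, which is one reason to prefer the integer programming formulation for the full case.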

### Extending
Extending this to the full case is a trivial exercise (up to the availability of data). We just need to get the data for all 20 drivers and modify the constraints. This includes:

 - Budget is now 100
 - We must choose 5 drivers
 - We must choose a single constructor
 
Constructors didn't feature in this example but the principle is exactly the same. The additional code will be along the lines of:

``` python
constructors = ["Mercedes", "Ferrari", ..., "Williams"]

constructor_choices = LpVariable.dicts(
    "constructor_choices",
    constructors,
    lowBound=0,
    upBound=1,
    cat=LpInteger
)

prob += lpSum([constructor_choices[c] for c in constructors]) == 1.0, "Number of Constructors"

```

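The Total EV objective should also be modified to sum over both drivers and constructors (and likewise the Total Price constraint, assuming constructors count against the same budget)... Then we are done!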
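
To make the extension concrete, here is a rough sketch of how the full model might be put together. It is not the final implementation: the driver and constructor names, EVs and prices are hypothetical placeholders rather than real data, there are only 7 drivers and 3 constructors instead of the full grid, and it assumes that constructors have their own EV and price and share the budget of 100 with the drivers. The function name `solve_team` is also just for illustration.

```python
from pulp import PULP_CBC_CMD, LpMaximize, LpProblem, LpStatus, LpVariable, lpSum

# Hypothetical placeholder data (NOT real figures): name -> (EV, price)
driver_data = {
    "driver_1": (12.0, 30.0),
    "driver_2": (10.0, 25.0),
    "driver_3": (7.0, 20.0),
    "driver_4": (6.0, 15.0),
    "driver_5": (4.0, 12.0),
    "driver_6": (3.5, 10.0),
    "driver_7": (2.0, 8.0),
}
constructor_data = {
    "constructor_1": (10.0, 30.0),
    "constructor_2": (6.0, 20.0),
    "constructor_3": (3.0, 10.0),
}


def solve_team(driver_data, constructor_data, budget=100.0, n_drivers=5):
    """Pick n_drivers drivers and one constructor maximizing total EV within budget."""
    prob = LpProblem("F1_Fantasy_Full", LpMaximize)
    d = LpVariable.dicts("driver_choices", list(driver_data), cat="Binary")
    c = LpVariable.dicts("constructor_choices", list(constructor_data), cat="Binary")

    # Objective: total EV over drivers and constructors
    total_ev = lpSum(driver_data[k][0] * d[k] for k in driver_data) + lpSum(
        constructor_data[k][0] * c[k] for k in constructor_data
    )
    prob += total_ev, "Total_EV"

    # Assumption: drivers and constructors are paid for from the same budget
    total_price = lpSum(driver_data[k][1] * d[k] for k in driver_data) + lpSum(
        constructor_data[k][1] * c[k] for k in constructor_data
    )
    prob += total_price <= budget, "Total_Price"

    prob += lpSum(d[k] for k in driver_data) == n_drivers, "Number_of_Drivers"
    prob += lpSum(c[k] for k in constructor_data) == 1, "Number_of_Constructors"

    prob.solve(PULP_CBC_CMD(msg=False))  # msg=False silences the solver log

    # Compare against 0.5 rather than 1 to be robust to floating point noise
    chosen_drivers = [k for k in driver_data if d[k].varValue > 0.5]
    chosen_constructors = [k for k in constructor_data if c[k].varValue > 0.5]
    return LpStatus[prob.status], chosen_drivers, chosen_constructors


status, team_drivers, team_constructors = solve_team(driver_data, constructor_data)
print(status, team_drivers, team_constructors)
```

Using `cat="Binary"` is equivalent to the `lowBound=0`, `upBound=1`, `cat=LpInteger` combination used earlier, just a little shorter.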