In [22]:
import pandas as pd
import numpy as np

In [23]:
df = pd.read_csv('../Combine_Data/women/top_women.csv')

In [24]:
df

Unnamed: 0,Name,Average Total,apparatus
0,simone biles,14.65,bb
1,konnor mcclain,14.525,bb
2,sunisa lee,13.925,bb
3,skye blakely,13.543,bb
4,shilese jones,13.3125,bb
5,tiana sumanasekera,13.0875,bb
6,shilese jones,14.5746,ub
7,zoe miller,13.8366,ub
8,skye blakely,13.320222,ub
9,simone biles,15.1,fx


In [37]:
import pulp
import itertools

# Create a Linear Programming problem
prob = pulp.LpProblem("Athlete_Selection", pulp.LpMaximize)

# Define the decision variables (binary variables for each athlete)
athletes = df['Name'].unique()
athlete_vars = pulp.LpVariable.dicts("Select", athletes, 0, 1, pulp.LpInteger)

# Define the objective function (counts the selected athletes; note that it ignores scores)
objective = pulp.lpSum(athlete_vars[athlete] for athlete in athletes)
prob += objective

# Constraint: Select exactly 5 athletes
prob += pulp.lpSum(athlete_vars[athlete] for athlete in athletes) == 5

# Constraint: exactly 3 of the selected athletes must have a score on each apparatus
apparatus = df['apparatus'].unique()
for event in apparatus:
    athletes_for_event = [athlete_vars[athlete] for athlete in athletes if any(df[(df['Name'] == athlete) & (df['apparatus'] == event)]['Average Total'].notna())]
    prob += pulp.lpSum(athletes_for_event) == 3

# Solve the optimization problem using the CBC solver
prob.solve(pulp.PULP_CBC_CMD())

# Extract the selected athletes
selected_athletes = [athlete for athlete in athletes if pulp.value(athlete_vars[athlete]) == 1]

# Print the selected athletes
for athlete in selected_athletes:
    print(f'{athlete} will compete on the following apparatus: {", ".join(df[(df["Name"] == athlete)]["apparatus"].tolist())}')


Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - /Users/ryantalbot/opt/anaconda3/envs/tf2/lib/python3.9/site-packages/pulp/solverdir/cbc/osx/64/cbc /var/folders/j_/555m2zps099832fjh_m8jjnc0000gn/T/dc8cbcaa52ca481a802a905afe9c23af-pulp.mps max timeMode elapsed branch printingOptions all solution /var/folders/j_/555m2zps099832fjh_m8jjnc0000gn/T/dc8cbcaa52ca481a802a905afe9c23af-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 10 COLUMNS
At line 76 RHS
At line 82 BOUNDS
At line 94 ENDATA
Problem MODEL has 5 rows, 11 columns and 32 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 5 - 0.00 seconds
Cgl0004I processed model has 0 rows, 0 columns (0 integer (0 of which binary)) and 0 elements
Cbc3007W No integer variables - nothing to do
Cuts at root node changed objective from -5 to -1.79769e+308
Probing was tried 0 times and created 0 cuts of whic

In [38]:
import pandas as pd


# Define the athletes and their selected apparatus
selected_athletes = {
    "simone biles": ["bb", "fx", "vt"],
    "skye blakely": ["bb", "ub"],
    "shilese jones": ["bb", "ub", "fx", "vt"],
    "zoe miller": ["ub"],
    "joscelyn roberson": ["fx", "vt"],
}

# Calculate the total score for the selected athletes and apparatus
total_score = 0
for athlete, apparatus_list in selected_athletes.items():
    for app in apparatus_list:
        total_score += df[(df['Name'] == athlete) & (df['apparatus'] == app)]['Average Total'].values[0]

print("Total Score for Selected Athletes and Apparatus:", total_score)


Total Score for Selected Athletes and Apparatus: 169.37484255189258


This code is a Python script that uses the PuLP library to model athlete selection for a competition as an integer linear program. I'll explain it line by line:

1. `import pulp`: This line imports the PuLP library, which is used for modeling and solving linear programming problems.

2. `import itertools`: This line imports the itertools module, which provides functions for working with iterators. Nothing in this cell actually uses it.

3. `prob = pulp.LpProblem("Athlete_Selection", pulp.LpMaximize)`: This line creates a Linear Programming problem named "Athlete_Selection" to be solved with the goal of maximizing an objective function. The `prob` object is an instance of the `LpProblem` class.

4. `athletes = df['Name'].unique()`: This line retrieves a NumPy array of the unique athlete names from the DataFrame `df` loaded earlier and assigns it to the `athletes` variable.

5. `athlete_vars = pulp.LpVariable.dicts("Select", athletes, 0, 1, pulp.LpInteger)`: This line defines a set of binary decision variables using the athlete names as keys. These variables represent whether an athlete is selected (1) or not selected (0) for the competition. The variables are created with a lower bound of 0 and an upper bound of 1, and they are of integer type.

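6. `objective = pulp.lpSum(athlete_vars[athlete] for athlete in athletes)`: This line creates the objective function. The objective is to maximize the sum of the decision variables, that is, the number of selected athletes. Because the constraint in item 8 fixes that number at 5, the objective has the same value for every feasible team and the scores in `Average Total` play no part in it.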

7. `prob += objective`: This line adds the objective function to the Linear Programming problem (`prob`) that was defined earlier.

8. `prob += pulp.lpSum(athlete_vars[athlete] for athlete in athletes) == 5`: This line adds a constraint to the problem. It specifies that exactly 5 athletes must be selected, and this constraint is added to the `prob` problem.

9. `apparatus = df['apparatus'].unique()`: This line retrieves a list of unique apparatus names from the DataFrame and assigns it to the `apparatus` variable.

10. `for event in apparatus:`: This line starts a loop that iterates over each apparatus in the `apparatus` list.

11. `athletes_for_event = [athlete_vars[athlete] for athlete in athletes if any(df[(df['Name'] == athlete) & (df['apparatus'] == event)]['Average Total'].notna())]`: For each apparatus, this line creates a list of decision variables representing athletes who can compete in the current apparatus. It checks if an athlete has a non-null value in the 'Average Total' column for that apparatus in the DataFrame.

12. `prob += pulp.lpSum(athletes_for_event) == 3`: This line adds a constraint requiring that exactly 3 of the selected athletes have a score on the current apparatus. It does not assign athletes to apparatus; it only counts how many selected athletes are eligible for each one.

13. `prob.solve(pulp.PULP_CBC_CMD())`: This line runs the CBC solver bundled with PuLP, which searches for a selection that satisfies the constraints and maximizes the objective. The solver log shown above is truncated before the final status line, so it is worth checking `pulp.LpStatus[prob.status]` before trusting the result.

14. `selected_athletes = [athlete for athlete in athletes if pulp.value(athlete_vars[athlete]) == 1]`: This line creates a list of athlete names who have been selected for the competition. It checks the value of the decision variables (athlete_vars) to determine which athletes were selected (variables with a value of 1).

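15. The final loop iterates over the selected athletes and prints each name together with every apparatus that athlete has a row for in the DataFrame. The model does not decide who competes on which apparatus, so this is a list of available apparatus rather than an assignment.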
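
To make items 11 and 12 concrete, the sketch below counts, for a hand-picked team, how many members have a score on each apparatus. That count is the quantity the constraint compares with 3. It uses only the ten rows shown in the DataFrame preview (the full CSV may contain more), and `eligible_counts` and `team` are illustrative names, not part of the notebook.

```python
# Rows copied from the DataFrame preview (only the ten displayed rows).
rows = [
    ("simone biles", 14.65, "bb"),
    ("konnor mcclain", 14.525, "bb"),
    ("sunisa lee", 13.925, "bb"),
    ("skye blakely", 13.543, "bb"),
    ("shilese jones", 13.3125, "bb"),
    ("tiana sumanasekera", 13.0875, "bb"),
    ("shilese jones", 14.5746, "ub"),
    ("zoe miller", 13.8366, "ub"),
    ("skye blakely", 13.320222, "ub"),
    ("simone biles", 15.1, "fx"),
]

# Map each apparatus to the set of athletes who have a score on it.
eligible = {}
for name, _score, event in rows:
    eligible.setdefault(event, set()).add(name)


def eligible_counts(team):
    """Count, per apparatus, how many team members have a score on it."""
    members = set(team)
    return {event: len(names & members) for event, names in eligible.items()}


# A hand-picked team of five, chosen only for illustration.
team = [
    "simone biles",
    "konnor mcclain",
    "sunisa lee",
    "skye blakely",
    "shilese jones",
]
counts = eligible_counts(team)
print(counts)  # {'bb': 5, 'ub': 2, 'fx': 1}
```

With only these rows, this team has five scorers on `bb`, two on `ub`, and one on `fx`, so it would violate the "exactly 3" constraint on every apparatus.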

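Overall, this code models athlete selection as an integer linear program and solves it with PuLP. Because the team size is fixed at 5, the objective (the number of selected athletes) is the same for every feasible team, so the solver only looks for a team that satisfies the constraints; the `Average Total` scores are used only to decide who has a score on each apparatus.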
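
One possible way to let the scores drive the selection is to add a second set of binary variables, one per (athlete, apparatus) pair, and maximize the total score of the routines chosen. The sketch below is an assumption about what such a model could look like, not the notebook's method. The score table is hypothetical (it is not taken from `top_women.csv`), and `TEAM_SIZE`, `PER_EVENT`, `on_team`, and `competes` are illustrative names. A toy team of 3 with 2 competitors per apparatus keeps it small.

```python
import pulp

# Hypothetical scores, keyed by (athlete, apparatus); not taken from top_women.csv.
scores = {
    ("ana", "bb"): 14.0, ("ana", "ub"): 13.0,
    ("bea", "bb"): 13.5, ("bea", "ub"): 14.5,
    ("cam", "bb"): 13.0,
    ("dee", "ub"): 14.0,
    ("eve", "bb"): 12.0, ("eve", "ub"): 12.5,
}
TEAM_SIZE = 3   # assumed team size for this toy example
PER_EVENT = 2   # assumed number of competitors per apparatus

athletes = sorted({name for name, _ in scores})
events = sorted({event for _, event in scores})

prob = pulp.LpProblem("Team_Selection_Sketch", pulp.LpMaximize)

# on_team[a] = 1 if athlete a is on the team.
on_team = {a: pulp.LpVariable(f"OnTeam_{a}", cat=pulp.LpBinary) for a in athletes}
# competes[a, e] = 1 if athlete a performs on apparatus e.
competes = {
    (a, e): pulp.LpVariable(f"Competes_{a}_{e}", cat=pulp.LpBinary)
    for (a, e) in scores
}

# Objective: total score of the routines actually performed.
prob += pulp.lpSum(scores[key] * competes[key] for key in scores)

# Exactly TEAM_SIZE athletes on the team.
prob += pulp.lpSum(on_team[a] for a in athletes) == TEAM_SIZE

# Exactly PER_EVENT competitors on each apparatus.
for event in events:
    prob += pulp.lpSum(competes[a, e] for (a, e) in scores if e == event) == PER_EVENT

# An athlete can only compete if they are on the team.
for (a, e) in scores:
    prob += competes[a, e] <= on_team[a]

prob.solve(pulp.PULP_CBC_CMD(msg=False))

# Check the status before reading variable values.
status = pulp.LpStatus[prob.status]
team = sorted(a for a in athletes if pulp.value(on_team[a]) > 0.5)
total = pulp.value(prob.objective)
print(status, team, total)
```

Reading `pulp.LpStatus[prob.status]` first guards against interpreting variable values from an infeasible or unsolved model.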

At a high level, this code solves an athlete selection problem with linear programming: it selects a set of athletes to compete, subject to certain constraints. Here's the big picture of what the code is doing:

1. **Problem Setup**:
   - It imports PuLP, along with itertools, which the cell never uses.
   - Creates a Linear Programming problem with the goal of maximizing an objective function. The problem is named "Athlete_Selection."

2. **Decision Variables**:
   - Defines binary decision variables for each athlete. A binary variable is used for each athlete, indicating whether they are selected (1) or not selected (0) for the competition.

3. **Objective Function**:
   - Defines an objective function that maximizes the number of selected athletes, that is, the sum of the binary decision variables. Since the team size is fixed by a constraint, this objective does not distinguish one feasible team from another.

4. **Constraints**:
   - Ensures that exactly 5 athletes are selected by requiring the sum of the decision variables to equal 5.

   - For each apparatus (event), it requires that exactly 3 of the selected athletes are eligible for it. An athlete counts as eligible for an apparatus if they have a non-null `Average Total` for it in the data.

5. **Solving the Problem**:
   - Runs the CBC solver on the problem to look for a combination of athletes that satisfies all constraints. The solver status should be checked afterwards, since no such combination may exist.

6. **Extracting and Printing the Results**:
   - After solving the problem, it identifies the athletes that were selected (decision variables with a value of 1) and prints their names along with every apparatus they have a score on.

In summary, this code uses integer linear programming to pick a team of exactly 5 athletes such that exactly 3 of them have a score on each apparatus. The objective counts selected athletes, which the team-size constraint already fixes, so scores do not influence which team is chosen. The final output lists the selected athletes and the apparatus each has a score on.
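
For a roster this small, a brute-force search is a simple way to sanity-check any optimization result. The sketch below swaps linear programming for plain enumeration with `itertools.combinations`: it tries every possible team and, on each apparatus, counts the best `PER_EVENT` scores among the team members. The score table is hypothetical (not taken from `top_women.csv`), and `TEAM_SIZE`, `PER_EVENT`, and `team_total` are illustrative names.

```python
from itertools import combinations

# Hypothetical scores, keyed by (athlete, apparatus); not taken from top_women.csv.
scores = {
    ("ana", "bb"): 14.0, ("ana", "ub"): 13.0,
    ("bea", "bb"): 13.5, ("bea", "ub"): 14.5,
    ("cam", "bb"): 13.0,
    ("dee", "ub"): 14.0,
    ("eve", "bb"): 12.0, ("eve", "ub"): 12.5,
}
TEAM_SIZE = 3   # assumed team size for this toy example
PER_EVENT = 2   # assumed number of competitors per apparatus

athletes = sorted({name for name, _ in scores})
events = sorted({event for _, event in scores})


def team_total(team):
    """Sum the best PER_EVENT scores on each apparatus, or None if the team is short."""
    total = 0.0
    for event in events:
        event_scores = sorted(
            (scores[(name, event)] for name in team if (name, event) in scores),
            reverse=True,
        )
        if len(event_scores) < PER_EVENT:
            return None  # not enough team members have a score on this apparatus
        total += sum(event_scores[:PER_EVENT])
    return total


# Try every possible team and keep the one with the highest total.
best_team, best_total = None, float("-inf")
for team in combinations(athletes, TEAM_SIZE):
    total = team_total(team)
    if total is not None and total > best_total:
        best_team, best_total = team, total

print(best_team, best_total)  # ('ana', 'bea', 'dee') 56.0
```

The number of teams grows combinatorially with the roster size, so this is practical only as a cross-check on small inputs.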