# **Soccer Location Problem**

&copy; Copyright 2025 Fair Isaac Corporation

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.
 
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This example uses FICO&reg; Xpress software. By running it, you agree to the Community License terms of the [Xpress Shrinkwrap License Agreement](https://community.fico.com/s/contentdocument/06980000002h0i5AAA) with respect to the FICO&reg; Xpress software. See the [licensing options](https://www.fico.com/en/fico-xpress-trial-and-licensing-options) overview for additional details and information about obtaining a paid license.

***Context***

Every late spring, the city sets up soccer camps at local parks for students to attend after school. Interested students sign up on each day of the camp, and a camp monitor is sent to the schools to pick them up and accompany them to the camp, where only a limited number of attendees can participate.

Because registration occurs on the same day, interest in attending is highly correlated to the weather.

You're brought on to this project to support the city in their decision of where to locate soccer camps while accounting for costs, including camp setup costs and student pickup cost, as well as ensuring as few interested students as possible are excluded considering the variability of demand.

Surveys performed by experts have provided estimates of each school's distribution of daily interest in attending soccer camp.
Your goal is to leverage all this information to create an application that helps the city select where to locate soccer camps for the season.

## Potential to-dos and enhancements

- Should we consider a transition matrix of distances for transferring from one place to another? Should we model dependence between time periods?

# Solution Approach



We formulate and solve the problem of choosing in which parks to set up camps using a combination of a **parametrized deterministic optimization model** that is solved daily and a **simulation** process that learns the best values of the parameter $\Theta^t_{j}$.


## Simulation

The stochasticity of demand, and the resulting missed demand (excluded participants), is captured by the simulation: we sample the demand of each school and try to satisfy all of it with the installed soccer camps.

The **parameter $\Theta^t_{j}$** adjusts the capacity of site $j$ in the capacity constraint of the optimization model. We learn it during the simulation process, and **$\alpha$** is the step size (learning rate) used in that learning rule.

After each simulated day $t$, we record the number of excluded participants (missed demand) at each site, $\delta^{t}_j$, defined as the excess demand that the soccer camp's capacity could not accommodate.

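Leveraging this information, we **update the parameter** by accumulating the excluded participants, scaled by the step size: $\Theta^{t+1}_{j}=\Theta^{t}_{j} - \alpha \cdot \delta^{t}_{j}$. Since $\delta^{t}_{j} \geq 0$, the parameter is non-positive and reduces the capacity the optimizer may plan for at site $j$; with $\alpha = 0$ the parameter stays fixed.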
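
As an illustration, here is a minimal sketch of an accumulating update of this kind. It is consistent with the penalties logged later in this notebook (for example, a step size of 0.5 moves `C5` from -27 to -34.5), but it is not the package's `adjust_opti_penalties`: the helper name `update_penalties`, its signature, and the sample values are assumptions made for this example.

```python
def update_penalties(penalties, remaining_capacity, step_size):
    """Return new penalties after one simulated day.

    Hypothetical helper (an assumption, not the notebook's adjust_opti_penalties).
    remaining_capacity[j] = capacity of site j minus realized demand at site j,
    so a negative value means that many participants were excluded.
    """
    updated = {}
    for site, theta in penalties.items():
        # Only a shortfall (negative remaining capacity) moves the penalty.
        shortfall = min(0, remaining_capacity.get(site, 0))
        updated[site] = theta + step_size * shortfall
    return updated


# Illustrative values: C5 overflowed by 15 participants, C6 had 4 spare places.
penalties = {"C5": -27.0, "C6": -9.0}
remaining = {"C5": -15, "C6": 4}
print(update_penalties(penalties, remaining, step_size=0.5))
# {'C5': -34.5, 'C6': -9.0}
```

With `step_size=0` the penalties are returned unchanged, which is how a trained parameter set can be evaluated without further learning.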

**Note**: To resemble reality more closely, the samples of the simulation are drawn from distributions with **different parameters** than those estimated. This reflects the fact that stochastic estimates are imperfect in practice.

## Parametrized Optimization Model

The parametrized model seeks to minimize the sum of the following two costs:

* setup costs of all opened camps;
* estimated per unit travelling costs from each school to the selected camp.

The **binary variables $\text{build}_j$** indicate if a certain candidate site $j$ is selected for setting up a camp (=1) or not (=0). 

The **continuous variables $\text{serves}_{i,j}$** indicate the fraction (between 0 and 1) of school $i \in SCHOOLS$'s interested population served by the candidate site $j \in SITES$.



$$\min _{(serves, build)| \Theta} \sum_{j \in SITES} \text{Cost}_{j} \cdot \text{build}^t_{j} + \sum_{i \in SCHOOLS} \sum_{j \in SITES} \text{Distance}_{i,j} \cdot (\text{Estimated Demand})^t_i \cdot \text{serves}^t_{i,j}$$

Subject to:

* All demand from each school must be served by a soccer camp:
$$\sum_{j \in SITES} \text{serves}^t_{i,j} = 1, \qquad \forall i \in SCHOOLS$$

* Students can only be assigned to camps that have been installed and must respect the capacity:
$$\sum_{i \in SCHOOLS} (\text{Estimated Demand})^t_{i} \cdot \text{serves}^t_{i,j} \leq (\text{Capacity}_{j} + \Theta^t_{j}) \cdot \text{build}^t_j, \qquad \forall j \in SITES$$
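
To make the role of $\Theta$ concrete, the following sketch evaluates the objective and checks both constraints for a given candidate plan. It does not solve the model (the notebook does that with Xpress inside `optimize`); the function name `evaluate_plan`, the data layout, and the tiny two-school instance are assumptions made purely for illustration.

```python
def evaluate_plan(build, serves, setup_cost, distance, est_demand, capacity, theta, tol=1e-9):
    """Return (objective, feasible) for a candidate plan of the parametrized model.

    Hypothetical helper for illustration only; all argument names are assumptions.
    """
    setup = sum(setup_cost[j] for j in build if build[j])
    travel = sum(distance[i, j] * est_demand[i] * share for (i, j), share in serves.items())

    # Constraint 1: every school's demand is fully assigned (shares sum to 1).
    covered = all(
        abs(sum(share for (i, _), share in serves.items() if i == school) - 1.0) <= tol
        for school in est_demand
    )
    # Constraint 2: planned load at each site fits within (Capacity + Theta) * build.
    within_capacity = all(
        sum(est_demand[i] * share for (i, j), share in serves.items() if j == site)
        <= (capacity[site] + theta[site]) * build[site] + tol
        for site in build
    )
    return setup + travel, covered and within_capacity


# Made-up instance: two schools, two candidate sites, both sites opened.
build = {"C0": 1, "C1": 1}
serves = {("S0", "C0"): 1.0, ("S1", "C1"): 1.0}
setup_cost = {"C0": 100, "C1": 120}
distance = {("S0", "C0"): 1.0, ("S0", "C1"): 3.0, ("S1", "C0"): 2.0, ("S1", "C1"): 1.0}
est_demand = {"S0": 40, "S1": 30}
capacity = {"C0": 50, "C1": 60}

print(evaluate_plan(build, serves, setup_cost, distance, est_demand, capacity, {"C0": 0, "C1": 0}))
# (290.0, True)
print(evaluate_plan(build, serves, setup_cost, distance, est_demand, capacity, {"C0": -15, "C1": 0}))
# (290.0, False)
```

In the second call, a learned penalty of -15 at `C0` lowers its planning capacity from 50 to 35, so sending all 40 students of `S0` there is no longer allowed and the optimizer would have to spread the demand differently.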

## Metrics 

Because our goal is to evaluate the performance of our model over time, both in terms of costs and excluded participants, we track the daily component costs and number of excluded participants and aggregate over the entire time horizon.
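
A small sketch of this bookkeeping, assuming (as in the `cfa` function of this notebook) that the simulation reports the remaining capacity per site and that a negative value counts as excluded participants. The helper name `missed_demand` and the sample numbers are made up for illustration.

```python
from statistics import mean


def missed_demand(remaining_capacity):
    """Total excluded participants for one day (hypothetical helper).

    Negative remaining capacity at a site counts as missed demand;
    unused capacity (positive values) contributes nothing.
    """
    return sum(-min(0, left) for left in remaining_capacity.values())


# Made-up remaining capacities for two simulated days.
daily_remaining = [
    {"C3": 5, "C5": -12, "C6": 0},
    {"C3": -2, "C5": -3, "C6": 1},
]
daily_missed = [missed_demand(day) for day in daily_remaining]
print(daily_missed)                           # [12, 5]
print(sum(daily_missed), mean(daily_missed))  # 17 8.5
```

The same pattern applies to the cost components: record one value per day, then sum or average over the horizon.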

# Code Overview

- Data files: note that we have two sets of distribution parameters to sample from: Estimated, used in the optimization model, and Actual, used in the simulation.
- Simulator file:
    - `optimize`: builds and solves the facility location model; this is where the trained parameter enters.
    - `simulate_demand`: samples demand from the Actual distributions and deducts it from the available capacity to obtain the final (remaining) capacities.
    - `adjust_opti_penalties`: our rule to update the penalties for the next iteration.
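
To give a feel for the simulation step, here is a minimal, self-contained sketch of sampling demand and deducting it from capacity. It is not the package's `simulate_demand`: the function name, its arguments, and in particular the choice of a normal distribution for the Actual demand are assumptions made for this example.

```python
import random


def simulate_remaining_capacity(serves, capacity, actual_stats, rng):
    """Sample one day of demand and return the remaining capacity per open site.

    Hypothetical helper. Assumptions: demand per school is normal with
    (mean, std) taken from actual_stats, rounded and clipped at zero, and each
    school's realized demand is split across sites by the planned shares.
    """
    realized = {
        school: max(0, round(rng.gauss(mu, sigma)))
        for school, (mu, sigma) in actual_stats.items()
    }
    remaining = dict(capacity)  # copy so the input is not modified
    for (school, site), share in serves.items():
        remaining[site] -= share * realized[school]
    return remaining


# Made-up plan and parameters; a seeded generator keeps the run reproducible.
serves = {("S0", "C3"): 1.0, ("S1", "C3"): 0.5, ("S1", "C6"): 0.5}
capacity = {"C3": 50, "C6": 20}
actual_stats = {"S0": (40, 8), "S1": (30, 5)}
print(simulate_remaining_capacity(serves, capacity, actual_stats, random.Random(10)))
```

A negative entry in the result means the site could not take everyone who showed up, which is the signal the penalty update reacts to.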

In [1]:
# Install our python code as a package
%pip install SoccerCampLocationApp/python_source/


In [2]:
from SoccerCampLocationApp.python_source.instance import FixedInstance
from SoccerCampLocationApp.python_source.plotting import draw_sol
from SoccerCampLocationApp.python_source.simulator import optimize, simulate_demand, adjust_opti_penalties
import os
import pandas as pd
import random

# Fix random seed for reproducibility
rndseed = 10
random.seed(rndseed)

# Data folder containing all input files
attachments_root = os.path.join("SoccerCampLocationApp", "attachments")

# Reading data input files
schools_df = pd.read_csv(os.path.join(attachments_root, "schools.csv"), index_col='school_id')
sites_df = pd.read_csv(os.path.join(attachments_root, "sites.csv"), index_col='site_id')
distances_df = pd.read_csv(os.path.join(attachments_root, "distances.csv"))

# Instance
instance = FixedInstance(
    schools_df,
    sites_df,
    os.path.join(attachments_root, "distances.csv")
)

# Folder Structure
results_path = os.path.join("results")
image_path = os.path.join(results_path, "images")

# Create subfolder for diagrams
if os.path.exists(image_path):
    for file in os.listdir(image_path):
        os.remove(os.path.join(image_path, file))
else:
    os.makedirs(image_path)

# Main execution flow

In [3]:
# Don't show the plots inline
%matplotlib svg

def cfa(number_of_simulation_days: int, step_size: float, penalties: dict, train_flag: bool) -> tuple[pd.DataFrame, list]:
    """Run the optimize/simulate/update loop; return (daily results, penalty history).

    Note: `penalties` is updated in place by adjust_opti_penalties.
    """
    results = []
    # Penalty history, starting with the initial values
    all_penalties = [penalties.copy()]

    print(f" Number of simulation days {number_of_simulation_days}")

    # Simulate the requested number of days
    for day in range(number_of_simulation_days):
        print(f"******************ITERATION {day+1}")
        # First solve the location model with the current penalties
        served, built, E = optimize(instance, penalties)
        if served:
            draw_sol(image_path, instance, built, E, day)
        else:
            print("Infeasible: not enough capacities in instance.")
            break
        # Execute a simulation and adjust parameters of optimization model
        capacities = simulate_demand(instance, built, E, train_flag)
        print(f"Penalties before simulation and adjustments: {penalties}")
        adjust_opti_penalties(capacities, penalties, step_size)
        all_penalties.append(penalties.copy())
        print(f"Penalties after simulation and adjustments: {penalties}")

        # Record solution and simulation information
        open_trucks = [truck for truck, value in built.items() if value > 0.5]
        installation_cost = sum(instance.truck_cost.get(truck) for truck in open_trucks)
        # Serving cost: distance * share served * the school's demand estimate
        serving_cost = sum(instance.dist[s, c] * ratio * instance.school_stats[s][0] for (s, c), ratio in E.items())
        entry = {
            "iteration": day + 1,
            "objective_value": round(installation_cost + serving_cost, 2),
            "open_trucks": ''.join(str(x) for x in open_trucks),
            "truck_installation_value": installation_cost,
            "total_serving_cost": round(serving_cost, 2),
            # Negative remaining capacity counts as excluded participants
            "total_missed_demand": sum(-1 * min(0, value) for value in capacities.values())
        }
        results.append(entry)

    # What information to use from the training set for the parameter
    return pd.DataFrame(results), all_penalties

In [4]:
# Do the training
penalties = {site: 0 for site in instance.sites}
# High-level Parameters defined by the user
simulation_days = 20
stepsize = 1
# Train the penalties
results_initial, all_penalties = cfa(simulation_days, stepsize, penalties, True)

 Number of simulation days 20
******************ITERATION 1




Demand for S0, C3 is 32
Demand for S0, C5 is 10
Demand for S1, C3 is 12
Demand for S1, C6 is 29
Demand for S2, C5 is 49
Demand for S3, C6 is 43
Demand for S4, C5 is 50
Penalties before simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': 0, 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': np.float64(-12.0), 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
******************ITERATION 2
Demand for S0, C3 is 39
Demand for S1, C3 is 13
Demand for S1, C6 is 31
Demand for S2, C3 is 2
Demand for S2, C5 is 52
Demand for S3, C6 is 44
Demand for S4, C5 is 47
Penalties before simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': np.float64(-12.0), 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': np.float64(-14.0), 'C6': np.float64(-3.0), 'C7': 0, 'C8': 0, 'C9': 0, 

In [5]:
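# Test CFA using the final trained penalties, with updates disabled (stepsize = 0)
test_days = 50
stepsize = 0
# Copy the last entry instead of popping it, so the penalty history stays intact
penalties = all_penalties[-1].copy()
results_new, _ = cfa(test_days, stepsize, penalties, train_flag=False)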

 Number of simulation days 50
******************ITERATION 1
Demand for S0, C10 is 55
Demand for S1, C6 is 5
Demand for S1, C10 is 62
Demand for S2, C5 is 38
Demand for S3, C6 is 15
Demand for S4, C5 is 28
Demand for S4, C6 is 19
Penalties before simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-27.0), 'C6': np.float64(-9.0), 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-27.0), 'C6': np.float64(-9.0), 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
******************ITERATION 2
Demand for S0, C10 is 41
Demand for S1, C6 is 5
Demand for S1, C10 is 60
Demand for S2, C5 is 41
Demand for S3, C6 is 51
Demand for S4, C5 is 34
Demand for S4, C6 is 23
Penalties before simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-27.0), 'C6': np.floa

In [6]:
# Baseline: static optimization problem with all penalties fixed at zero (no learned parameters)
test_days = 50
stepsize = 0
penalties = {site: 0 for site in instance.sites}
results_baseline, _ = cfa(test_days, stepsize, penalties, train_flag=False)

 Number of simulation days 50
******************ITERATION 1
Demand for S0, C3 is 33
Demand for S0, C5 is 11
Demand for S1, C3 is 25
Demand for S1, C6 is 59
Demand for S2, C5 is 92
Demand for S3, C6 is 23
Demand for S4, C5 is 61
Penalties before simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': 0, 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': 0, 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
******************ITERATION 2
Demand for S0, C3 is 30
Demand for S0, C5 is 10
Demand for S1, C3 is 18
Demand for S1, C6 is 43
Demand for S2, C5 is 57
Demand for S3, C6 is 66
Demand for S4, C5 is 67
Penalties before simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': 0, 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': 0, 'C2': 0, 'C3': 0, 'C4': 0, 'C5': 0, 'C6': 0, 'C7': 0, 'C8': 0, 'C9': 0, 'C

## Review Metrics

In [7]:
print(f"KPI Averages of baseline optimization:\n{results_baseline[['objective_value', 'truck_installation_value', 'total_serving_cost', 'total_missed_demand']].mean()}")
print()
print(f"KPI Averages of CFA:\n{results_new[['objective_value', 'truck_installation_value', 'total_serving_cost', 'total_missed_demand']].mean()}")

KPI Averages of baseline optimization:
objective_value             1729.24
truck_installation_value    1200.00
total_serving_cost           529.24
total_missed_demand           53.94
dtype: float64

KPI Averages of CFA:
objective_value             1846.05
truck_installation_value    1200.00
total_serving_cost           646.05
total_missed_demand           25.96
dtype: float64


# Test with a different update policy

In [None]:
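# CFA test that keeps updating the penalties during the test (stepsize = 0.5)
test_days = 50
stepsize = 0.5
# Start from a copy of the most recent trained penalties; cfa modifies them in place
penalties = all_penalties[-1].copy()
results_new_2, _ = cfa(test_days, stepsize, penalties, train_flag=False)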

 Number of simulation days 50
******************ITERATION 1
Demand for S0, C10 is 13
Demand for S1, C6 is 7
Demand for S1, C10 is 82
Demand for S2, C5 is 96
Demand for S3, C6 is 41
Demand for S4, C5 is 16
Demand for S4, C6 is 11
Penalties before simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-27.0), 'C6': np.float64(-9.0), 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
Penalties after simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-34.5), 'C6': np.float64(-9.0), 'C7': 0, 'C8': 0, 'C9': 0, 'C10': 0}
******************ITERATION 2
Demand for S0, C3 is 34
Demand for S1, C6 is 71
Demand for S2, C1 is 20
Demand for S2, C3 is 86
Demand for S3, C1 is 36
Demand for S3, C6 is 36
Demand for S4, C1 is 32
Penalties before simulation and adjustments: {'C0': 0, 'C1': np.float64(-2.0), 'C2': 0, 'C3': np.float64(-4.0), 'C4': 0, 'C5': np.float64(-34.5), 'C6': np.float

In [None]:
print(f"KPI Averages of CFA:\n{results_new_2[['objective_value', 'truck_installation_value', 'total_serving_cost', 'total_missed_demand']].mean()}")