# Final Project Notebook Template

This template is designed for students to document their final project. Each team should ensure that the following information is included:

- **Student IDs and Full Names**: List the student IDs and full names of all team members.
- **Contribution of each Member**: Describe clearly the contribution of each team member in the project.
- **Studied Problem**: Provide a clear description of the integer programming model being studied. Ensure that the problem is relevant and has been validated by the teacher.
- **Real Data**: Use real data for the project. Describe the source and nature of the data being used.

This notebook is structured to guide you through the process of defining, analyzing, and solving an integer programming problem. Follow the sections provided to document your workflow and findings comprehensively.

In [1]:
import sympy as sp
import numpy as np
import pandas as pd
# add here the necessary imports

## Introduction

This final project is about planning **Community Centers for Digital Innovation (CCDI)** in the Metropolitan Area of Monterrey, Mexico. The main goal is to decide in which municipalities the government should open centers, and which center will serve the population of each municipality.

In the course we studied different topics from **linear programming** and **integer programming**, and some solution methods like the simplex method, two‑phase simplex method, and branch‑and‑bound. In this notebook we use these ideas in a small but realistic case study: a *location–allocation* problem where we must choose where to open facilities (the CCDI) and how to assign demand (people) to those facilities.

The data used in the project combine real information (population of the municipalities from the 2020 census) with reasonable assumptions (potential users, distances and capacities). The problem is formulated as an integer linear programming model in which all variables are binary, a special case of **mixed‑integer linear programming** (MILP). Its objective function represents the total travel effort of the users, measured as demand times distance. We then solve the model with Python tools and interpret the solution from the point of view of public policy planning.


## Description of the case study

The case study is located in the **Monterrey Metropolitan Area (MMA)**, in the state of Nuevo León, Mexico. The state government wants to reduce the digital gap and give students, teachers and the general population access to the internet, computers and basic training in digital skills. For that purpose, it plans to create **Community Centers for Digital Innovation (CCDI)** in different municipalities of the MMA.

In this project we consider six municipalities of the MMA:

- Monterrey  
- Guadalupe  
- Apodaca  
- San Nicolás de los Garza  
- Santa Catarina  
- General Escobedo  

Each municipality has a different total population and, as a consequence, a different **potential demand** for the CCDI. We assume that about 15% of the population could use the centers regularly (for study, online procedures, homework, etc.).

Because the budget is limited, the government can open **at most three CCDI** in this first stage. Every center has a **maximum capacity** of 200,000 potential users. In addition, the plan should respect some *equity conditions* between different zones of the MMA:

- At least one CCDI must be located in the **east / north‑east** zone (municipalities of Guadalupe or Apodaca).  
- At least one CCDI must be located in the **north / west** zone (San Nicolás de los Garza, General Escobedo or Santa Catarina).  

The main questions of the case study are:

1. In which municipalities should the government open the CCDI?  
2. How should the demand (potential users) from every municipality be assigned to the open centers?  

The decisions must satisfy the capacity and equity restrictions, and they should minimize a measure of **total distance** (or travel effort) between the users and the CCDI that serve them.


## Data Description

In this section we describe the data that we will use for the model.

We work with six municipalities of the Monterrey Metropolitan Area. For each municipality we have:

- **Municipality name**.  
- **Region inside the MMA** (Center, East, North‑east, North, West).  
- **Total population in 2020** (real data from the Mexican census 2020).  
- **Estimated demand of users**, defined as 15% of the total population.  

The total population is based on official values (INEGI 2020 census and public municipal profiles). The 15% factor is an assumption that represents the fraction of inhabitants who may use the CCDI frequently. This demand is the parameter that appears in the objective function and in the capacity constraints.

Additionally, we need an approximation of the **distance between municipalities**, which we model as a symmetric matrix of distances in kilometers between the centers of the municipalities. These distances are rough estimates rather than measured values, but they are reasonable enough to create a realistic optimization model.

All this information is stored in Python structures (a `DataFrame` for the population and demand, and a dictionary for the distances). We will use them later to define the objective function and the constraints of the integer programming model.


In [None]:
import pandas as pd

# Real population in 2020 (approximate values from official statistics)
population = {
    "Monterrey": 1142994,
    "Guadalupe": 643143,
    "Apodaca": 656464,
    "San Nicolás de los Garza": 412199,
    "Santa Catarina": 306322,
    "General Escobedo": 454957
}

# Demand: 15% of the population (potential active users of CCDI)
demand = {m: int(round(p * 0.15)) for m, p in population.items()}

# Region inside the Metropolitan Area (simple classification)
region = {
    "Monterrey": "Center",
    "Guadalupe": "East",
    "Apodaca": "North‑east",
    "San Nicolás de los Garza": "North",
    "Santa Catarina": "West",
    "General Escobedo": "North"
}

df = pd.DataFrame({
    "municipality": list(population.keys()),
    "population_2020": list(population.values()),
    "demand_users": [demand[m] for m in population.keys()],
    "region": [region[m] for m in population.keys()]
})

df

## Problem Definition

We formulate the case study as a **location–allocation** problem with binary decision variables. The idea is to decide where to open the CCDI (location part) and how to assign the demand of each municipality to the open centers (allocation part).

### Decision variables

We define two kinds of binary variables:

1. **Location variables**

For each municipality \( j \) in the set of municipalities \( J \):

$$
y_j = \begin{cases}
1 & \text{if a CCDI is opened in municipality } j \\
0 & \text{otherwise}
\end{cases}
$$

2. **Assignment variables**

For each pair of municipalities \( i \) and \( j \):

$$
x_{ij} = \begin{cases}
1 & \text{if the demand of municipality } i \text{ is assigned to the CCDI located in } j \\
0 & \text{otherwise}
\end{cases}
$$

### Parameters

- \( D_i \): demand of users in municipality \( i \) (15% of its population).  
- \( c_{ij} \): distance in km between municipality \( i \) and municipality \( j \).  
- \( C \): maximum capacity of one CCDI (200,000 users).  
- \( B \): maximum number of centers that can be opened (here, 3).  
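Before building the model, a quick arithmetic check on these parameters shows whether the instance can be feasible at all. The snippet below is a minimal sketch that repeats the population values from the data cell so it can run on its own. The name `min_centers` is introduced here only for illustration.

```python
import math

# Population values copied from the data cell of this notebook.
population = {
    "Monterrey": 1142994,
    "Guadalupe": 643143,
    "Apodaca": 656464,
    "San Nicolás de los Garza": 412199,
    "Santa Catarina": 306322,
    "General Escobedo": 454957,
}
C = 200_000  # capacity of one CCDI
B = 3        # maximum number of centers

# Demand D_i: 15% of the population, rounded to whole users.
demand = {m: int(round(p * 0.15)) for m, p in population.items()}

total_demand = sum(demand.values())
# Lower bound on the number of centers implied by capacity alone.
min_centers = math.ceil(total_demand / C)

print("Total demand:", total_demand)
print("Total capacity with B centers:", B * C)
print("Minimum number of centers needed:", min_centers)
print("Largest single demand:", max(demand.values()))
```

With these data the total demand is 542,412 users against a total capacity of 600,000, so at least three centers are needed and the limit \( B = 3 \) is effectively binding. This is only a necessary condition: whether the demands can actually be grouped within the capacity of each center is decided by the full model.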

### Objective function

We want to **minimize the total travel cost** of the users, which we model as the sum of distance times demand for all assignments:

$$
\min Z = \sum_{i} \sum_{j} c_{ij} \, D_i \, x_{ij}
$$
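As a quick illustration of how a single term of \( Z \) is built, suppose Apodaca were assigned to a center in Guadalupe. This is a sketch using values from the data cells. The variable names are introduced here only for this example, and the assignment is hypothetical, not the solver's result.

```python
# Illustrative values taken from this notebook's data cells.
population_apodaca = 656464
demand_apodaca = int(round(population_apodaca * 0.15))  # D_i, in users
distance_km = 12  # c_ij between Apodaca and Guadalupe

# Contribution of x[Apodaca][Guadalupe] = 1 to the objective Z.
term = distance_km * demand_apodaca
print(f"{demand_apodaca} users x {distance_km} km = {term} user-km")
```

The objective is therefore measured in user‑kilometers. Assigning a municipality to a center located in the same municipality contributes zero, because \( c_{ii} = 0 \).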

### Constraints

1. **Unique assignment of municipalities**  

Each municipality must be served by exactly one CCDI:

$$
\sum_{j} x_{ij} = 1 \quad \forall i
$$

2. **Assignment only to open centers**  

A municipality \( i \) can be assigned to municipality \( j \) only if a CCDI is opened in \( j \) (this includes the case \( i = j \)):

$$
x_{ij} \le y_j \quad \forall i, j
$$

3. **Capacity of each CCDI**  

The total demand assigned to a center \( j \) cannot exceed its capacity:

$$
\sum_{i} D_i \, x_{ij} \le C \, y_j \quad \forall j
$$

4. **Maximum number of centers**  

We can open at most \( B \) CCDI in total:

$$
\sum_{j} y_j \le B
$$

5. **Equity constraint east / north‑east**  

At least one center must be opened in Guadalupe or Apodaca:

$$
y_{\text{Guadalupe}} + y_{\text{Apodaca}} \ge 1
$$

6. **Equity constraint north / west**  

At least one center must be opened in San Nicolás de los Garza, General Escobedo or Santa Catarina:

$$
y_{\text{San Nicolás}} + y_{\text{General Escobedo}} + y_{\text{Santa Catarina}} \ge 1
$$

7. **Integrality conditions**  

All decision variables are binary:

$$
x_{ij} \in \{0,1\}, \quad y_j \in \{0,1\} \quad \forall i,j
$$
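A side remark on how constraints 2 and 3 interact. This observation is added for illustration and assumes \( 0 < D_i < C \) for every municipality, which holds for the data used here. When \( y_j = 0 \), the capacity constraint alone already forbids any assignment to \( j \):

$$
y_j = 0 \;\Rightarrow\; \sum_{i} D_i \, x_{ij} \le 0 \;\Rightarrow\; x_{ij} = 0 \quad \forall i
$$

So for binary \( y_j \), constraint 2 is implied by constraint 3. It is still worth keeping, because in the linear relaxation (where \( 0 \le y_j \le 1 \)) the capacity constraint alone gives only a weaker bound:

$$
x_{ij} \le \frac{C}{D_i} \, y_j \quad \text{(from constraint 3 alone)} \qquad \text{versus} \qquad x_{ij} \le y_j \quad \text{(constraint 2)}
$$

Since \( C / D_i > 1 \), constraint 2 tightens the relaxation, which generally helps a branch‑and‑bound search.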

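Because every decision variable is binary, this is a pure 0‑1 integer linear program, a special case of mixed‑integer linear programming (MILP). The capacity and equity constraints interact, so the best plan is not obvious by inspection, and we use an optimisation solver (which internally relies on techniques such as branch‑and‑bound).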


In [None]:
# We define the list of municipalities and the distance matrix between them

municipalities = list(population.keys())

# Symmetric distance matrix (in km) between municipality centers (approximate values)
distances = {
    ("Monterrey", "Monterrey"): 0,
    ("Monterrey", "Guadalupe"): 10,
    ("Monterrey", "Apodaca"): 18,
    ("Monterrey", "San Nicolás de los Garza"): 8,
    ("Monterrey", "Santa Catarina"): 14,
    ("Monterrey", "General Escobedo"): 15,

    ("Guadalupe", "Guadalupe"): 0,
    ("Guadalupe", "Apodaca"): 12,
    ("Guadalupe", "San Nicolás de los Garza"): 9,
    ("Guadalupe", "Santa Catarina"): 20,
    ("Guadalupe", "General Escobedo"): 18,

    ("Apodaca", "Apodaca"): 0,
    ("Apodaca", "San Nicolás de los Garza"): 14,
    ("Apodaca", "Santa Catarina"): 26,
    ("Apodaca", "General Escobedo"): 20,

    ("San Nicolás de los Garza", "San Nicolás de los Garza"): 0,
    ("San Nicolás de los Garza", "Santa Catarina"): 16,
    ("San Nicolás de los Garza", "General Escobedo"): 10,

    ("Santa Catarina", "Santa Catarina"): 0,
    ("Santa Catarina", "General Escobedo"): 22,

    ("General Escobedo", "General Escobedo"): 0,
}

# Complete the symmetric distances
for i in municipalities:
    for j in municipalities:
        if (i, j) not in distances and (j, i) in distances:
            distances[(i, j)] = distances[(j, i)]

# Show a small sample to check
list(distances.items())[:10]

## Problem Solution

To solve the integer programming model we use the `PuLP` library for Python, which lets us define linear and integer programming problems and pass them to an external solver (by default, usually the CBC solver distributed with PuLP). Such solvers typically solve the linear relaxations with a simplex‑type method and enforce the integrality of the binary variables with a **branch‑and‑bound** strategy, often strengthened with cutting planes.

The steps are:

1. Define the problem as a minimization model.  
2. Create the binary variables \( y_j \) (location) and \( x_{ij} \) (assignment).  
3. Add the objective function with the distance and demand parameters.  
4. Add all the constraints: unique assignment, assignment only to open centers, capacity, maximum number of centers and equity constraints.  
5. Call the solver and read the optimal solution.  

Below we implement the full model in Python.


In [None]:
# Install PuLP if it is not available in the environment
!pip install pulp -q

import pulp as lp

# Parameters
D = demand                 # demand of users per municipality
C = 200_000                # maximum capacity per CCDI
B = 3                      # maximum number of centers we can open

# 1) Create the minimization problem
prob = lp.LpProblem("Location_CCDI_Monterrey", lp.LpMinimize)

# 2) Decision variables
# y_j = 1 if we open a CCDI in municipality j
y = lp.LpVariable.dicts("y", municipalities, lowBound=0, upBound=1, cat=lp.LpBinary)

# x_ij = 1 if demand of municipality i is assigned to CCDI in municipality j
x = lp.LpVariable.dicts("x",
                        (municipalities, municipalities),
                        lowBound=0, upBound=1,
                        cat=lp.LpBinary)

# 3) Objective function: minimize total travel cost
prob += lp.lpSum(distances[(i, j)] * D[i] * x[i][j]
                 for i in municipalities for j in municipalities), "Total_travel_cost"

# 4) Each municipality must be assigned to exactly one CCDI
for i in municipalities:
    prob += lp.lpSum(x[i][j] for j in municipalities) == 1, f"Unique_assignment_{i}"

# 5) Assignment is only possible if the CCDI in j is open
for i in municipalities:
    for j in municipalities:
        prob += x[i][j] <= y[j], f"Assign_only_if_open_{i}_{j}"

# 6) Capacity constraint: total demand assigned to j can not exceed its capacity
for j in municipalities:
    prob += lp.lpSum(D[i] * x[i][j] for i in municipalities) <= C * y[j], f"Capacity_{j}"

# 7) Maximum number of centers
prob += lp.lpSum(y[j] for j in municipalities) <= B, "Max_number_of_centers"

# 8) Equity: at least one center in East / North‑east (Guadalupe or Apodaca)
prob += y["Guadalupe"] + y["Apodaca"] >= 1, "Equity_east_northeast"

# 9) Equity: at least one center in North / West (San Nicolás, General Escobedo or Santa Catarina)
prob += (y["San Nicolás de los Garza"] +
         y["General Escobedo"] +
         y["Santa Catarina"]) >= 1, "Equity_north_west"

# 10) Solve the model
prob.solve()

print("Status of the solution:", lp.LpStatus[prob.status])
print("\nOpened centers (y_j = 1):")
for j in municipalities:
    # round() instead of int(): a solver value such as 0.9999999 would be truncated to 0
    print(f"{j:25s} -> {round(y[j].varValue)}")

print("\nAssignments x_ij = 1 (municipality i is served by CCDI in j):")
for i in municipalities:
    for j in municipalities:
        if x[i][j].varValue > 0.5:
            print(f"{i:25s} is served by {j}")

print("\nOptimal value of the objective function (total travel cost):",
      lp.value(prob.objective))

## Solution Validation and Analysis

In this section we check if the obtained solution is consistent with the constraints and we interpret the results in the context of the case study.

First, we can verify manually that:

- Exactly one **assignment** is made for each municipality (each municipality appears exactly once in the list of assignments).  
- No center receives more total demand than its capacity of 200,000 users.  
- The number of opened centers is less than or equal to 3.  
- At least one of the opened centers is located in Guadalupe or Apodaca (east / north‑east zone).  
- At least one of the opened centers is located in San Nicolás de los Garza, General Escobedo or Santa Catarina (north / west zone).  

If all these checks are satisfied, the solution is **feasible**. Feasibility alone does not prove optimality: the solution is **optimal** for the model we defined only if the solver also reports the status `Optimal`, which means that its branch‑and‑bound search on top of the linear programming relaxation finished without finding a better plan.

From a practical point of view, the pattern of opened centers usually reflects a trade‑off between:

- Placing centers close to the municipalities with larger demand (for example Monterrey or Apodaca).  
- Covering other municipalities in a way that the distances do not increase too much.  
- Respecting the equity restrictions between different zones of the metropolitan area.  

We can also compare different scenarios by changing parameters like the capacity \(C\), the maximum number of centers \(B\), or the distance matrix, to see how sensitive the solution is. This type of analysis is very useful for public decision makers, because it shows how robust the proposed location plan is.
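Because the instance has only six municipalities, the model can also be cross‑checked by plain enumeration, independently of the solver, and the same routine can compare scenarios. The sketch below is an illustration, not part of the solved model. `best_plan`, `EAST` and `NORTH_WEST` are names introduced here, and the data are copied from the cells above. The alternative scenarios (\(C = 250{,}000\) or \(B = 4\)) are hypothetical values chosen only to show the comparison.

```python
from itertools import combinations, product

# Data copied from the cells above (demand = 15% of the 2020 population).
names = ["Monterrey", "Guadalupe", "Apodaca", "San Nicolás de los Garza",
         "Santa Catarina", "General Escobedo"]
pop = [1142994, 643143, 656464, 412199, 306322, 454957]
D = {m: int(round(p * 0.15)) for m, p in zip(names, pop)}

# Upper triangle of the distance matrix (km), rows in the order of `names`.
upper = [
    [0, 10, 18, 8, 14, 15],
    [0, 12, 9, 20, 18],
    [0, 14, 26, 20],
    [0, 16, 10],
    [0, 22],
    [0],
]
dist = {}
for a, row in enumerate(upper):
    for offset, km in enumerate(row):
        b = a + offset
        dist[names[a], names[b]] = km
        dist[names[b], names[a]] = km  # symmetric entry

EAST = {"Guadalupe", "Apodaca"}
NORTH_WEST = {"San Nicolás de los Garza", "General Escobedo", "Santa Catarina"}


def best_plan(C=200_000, B=3):
    """Enumerate all feasible plans; return (cost, centers, assignment) or None."""
    best = None
    for k in range(1, B + 1):
        for centers in combinations(names, k):
            # Equity constraints on the set of open centers.
            if not (EAST & set(centers)) or not (NORTH_WEST & set(centers)):
                continue
            # Every way of assigning each municipality to one open center.
            for choice in product(centers, repeat=len(names)):
                load = dict.fromkeys(centers, 0)
                for i, j in zip(names, choice):
                    load[j] += D[i]
                if max(load.values()) > C:
                    continue  # capacity violated
                cost = sum(dist[i, j] * D[i] for i, j in zip(names, choice))
                if best is None or cost < best[0]:
                    best = (cost, centers, dict(zip(names, choice)))
    return best


for C, B in [(200_000, 3), (250_000, 3), (200_000, 4)]:
    result = best_plan(C, B)
    if result is None:
        print(f"C={C}, B={B}: no feasible plan")
    else:
        cost, centers, _ = result
        print(f"C={C}, B={B}: cost={cost}, centers={centers}")
```

The first scenario should reproduce the objective value reported by PuLP. Enumeration grows exponentially with the number of municipalities, so it is only practical as a sanity check on very small instances like this one.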


In [None]:
# Simple validation checks for the found solution

# 1) Check that each municipality is assigned exactly once
assign_counts = {i: 0 for i in municipalities}
for i in municipalities:
    for j in municipalities:
        if x[i][j].varValue > 0.5:
            assign_counts[i] += 1

print("Assignments per municipality (should be 1 each):")
print(assign_counts)

# 2) Check capacity usage per center
capacity_use = {j: 0 for j in municipalities}
for j in municipalities:
    for i in municipalities:
        if x[i][j].varValue > 0.5:
            capacity_use[j] += D[i]

print("\nCapacity usage per open center (must be <= 200000 if y_j = 1):")
print(capacity_use)

# 3) Number of opened centers
# round() instead of int(): a solver value such as 0.9999999 would be truncated to 0
num_open_centers = sum(round(y[j].varValue) for j in municipalities)
print("\nNumber of opened centers:", num_open_centers)

# 4) Equity checks
east_northeast = round(y["Guadalupe"].varValue) + round(y["Apodaca"].varValue)
north_west = (round(y["San Nicolás de los Garza"].varValue) +
              round(y["General Escobedo"].varValue) +
              round(y["Santa Catarina"].varValue))

print("\nEquity east / north‑east (should be >= 1):", east_northeast)
print("Equity north / west (should be >= 1):", north_west)

## Conclusion

In this final project we modeled and solved a realistic **location–allocation** problem for Community Centers for Digital Innovation in the Monterrey Metropolitan Area. The model combines elements from linear programming and integer programming, using binary variables and several constraints on capacity, equity and assignment of demand.

By using real population data and reasonable assumptions for demand and distances, the optimisation model gives a concrete proposal of where to open the centers and how to assign each municipality to them. A mixed‑integer linear programming solver (with simplex and branch‑and‑bound inside) is well suited to this task, because the capacity and equity constraints interact and the best solution is hard to find by inspection or trial and error, especially as the number of municipalities grows.

The methodology developed here can be extended to more municipalities, different types of facilities, or other criteria like installation costs, marginalization indexes or priorities for vulnerable groups. In that sense, this project shows how the techniques of linear and integer programming studied in the course can support real decision making in public policies and educational innovation, not only in theory but also in practical applications.
