This notebook solves an optimization problem: find the location for a new grocery store that maximizes the number of residents who gain access to one.

How this works (implementation):

1. Use helper_population_allocation.py to allocate a population count to each residential building (not ready yet; it currently contains a placeholder function)
2. Use helper_distance_calculation.py to calculate existing access and the distance between each residential and commercial building
3. Once 1 and 2 are done, all parameters are ready
4. This notebook then does some pre-optimization setup and runs an optimization model in Gurobi. It's pretty anticlimactic.

Next steps:

- Update the population allocation helper file
- Decide how to uniquely identify a building
- Once the above is done, aggregate the access dataframe to get the access variable (a max aggregation gives the actual access value, and for that we need to know how to uniquely identify a building)
- Make sure the ordering of arrays and matrices is correct so that the Gurobi solution can be mapped back to the actual building (as far as I know, it's difficult to use Gurobi with dataframes, hence this)
- Once everything seems to work, run the function that calculates the distance between commercial and residential buildings on the full dataset and save the output as a CSV, because that run would take 2-3 hours
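
The max aggregation and the ordering concern from the list above could be handled roughly as in this sketch. It uses toy data and assumes a hypothetical `building_id` column as the unique identifier, since the real identifier has not been decided yet; only the `access` column name comes from this notebook.

```python
import pandas as pd

# Toy pairwise table: one row per (residential building, grocery store) pair.
# `building_id` is a hypothetical unique identifier, not a column in the real data.
pairs = pd.DataFrame({
    "building_id": ["r2", "r1", "r1", "r3", "r2"],
    "access":      [0,    1,    0,    0,    0],
})

# A building has access if at least one store is in range, so take the max per building.
access_by_building = pairs.groupby("building_id")["access"].max()

# Fix one explicit ordering and reuse it for every array passed to the solver,
# so that index j always refers to the same building.
residential_order = ["r1", "r2", "r3"]
access_array = access_by_building.reindex(residential_order).to_numpy()

print(access_array)
```

Keeping `residential_order` around as a plain list is what lets a solver index be translated back to a building afterwards.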

In [1]:
# Import libraries
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import haversine as hs
import gurobipy as gp
from gurobipy import GRB

# Helper modules
import helper_population_allocation as pa
import helper_distance_calculation as dc

# Suppress all warnings (mainly to hide pandas SettingWithCopyWarning)
import warnings
warnings.filterwarnings("ignore")



PRE-OPTIMIZATION SETUP

In [2]:
# Get the main buildings dataset 
buildings_df = gpd.read_file('../processed_data/relevant_buildings.shp')
buildings_df.drop_duplicates(inplace=True) # There are some pure duplicates
buildings_df = buildings_df.sample(n=1000)  # Temporary subsample for testing; remove later


# Population parameter (Pj)
res_population = pa.get_population()

# Existing access parameter (Aj)
res_groc_access =  dc.calculate_access(
                            geopandas_dataframe=buildings_df,
                            building_type_1='Residential',
                            building_type_2='Grocery',
                            identifier_column='class_reco', 
                            geo_column='geometry', 
                            output_format='dataframe'
)



In [None]:
# Residential-commercial access parameter (Bij)
# Run this once and save the dataset (it will take a long time)
res_comm_access = dc.calculate_access(
                            geopandas_dataframe=buildings_df,
                            building_type_1='Residential',
                            building_type_2='commercial',
                            identifier_column='class_reco', 
                            geo_column='geometry', 
                            output_format='dataframe'
)

OPTIMIZATION

Sets:
- i = index over the set of all commercial buildings
- j = index over the set of all residential buildings

Decision variable:
- Ci = 1 if the new grocery store is put in commercial building i, 0 otherwise

Parameters:
- Aj = 1 if residential building j already has access to a food store (within 1 mile), 0 otherwise
- Bij = 1 if commercial building i is within 1 mile of residential building j, 0 otherwise
- Pj = Population at building j
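
The 1-mile test behind Aj and Bij could be computed along the lines of the sketch below. The helper module's actual implementation is not shown in this notebook, so this is only a stand-in: it assumes points are given as (latitude, longitude) in degrees, and the function names `haversine_miles` and `within_one_mile` are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def within_one_mile(p, q, threshold_miles=1.0):
    """Return 1 if points p and q are within the threshold distance, else 0."""
    return 1 if haversine_miles(p[0], p[1], q[0], q[1]) <= threshold_miles else 0

# Two points 0.01 degrees of latitude apart are roughly 0.69 miles apart.
near = within_one_mile((40.00, -75.00), (40.01, -75.00))
# Two points 0.1 degrees of latitude apart are roughly 6.9 miles apart.
far = within_one_mile((40.00, -75.00), (40.10, -75.00))
```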

Objective function:

$$ \max \sum_i \sum_j (1 - A_j) \, P_j \, B_{ij} \, C_i $$

Constraint (only 1 grocery store location being allocated):
$$ \sum_i C_i = 1$$
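
Because exactly one C_i equals 1, the objective reduces to picking the commercial building i with the largest score, where the score is the sum over j of (1 - A_j) P_j B_ij. The sketch below computes these scores directly with NumPy on made-up data, which may be useful as a sanity check against the solver's answer. The arrays `A`, `P`, and `B` are illustrative, not the notebook's data.

```python
import numpy as np

# Toy data: 3 commercial buildings (rows of B), 4 residential buildings (columns of B).
A = np.array([1, 0, 0, 1])        # existing access A_j
P = np.array([50, 30, 20, 40])    # population P_j
B = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
])                                # B[i, j] = 1 if commercial i is within range of residential j

# Population without existing access at each residential building.
weights = (1 - A) * P

# Score of each candidate site: unserved population it would reach.
scores = B @ weights

best_i = int(np.argmax(scores))
print(best_i, scores[best_i])
```

With this toy data, the second commercial building (index 1) reaches the most unserved residents.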





In [33]:
########################################
# SET UP MODEL
########################################

m = gp.Model("food_access")
# jth entry is the population of the jth residential building (Pj)
res_population_array = np.array(
    res_population[res_population['class_reco'].str.contains('Residential')]['population']
)
# jth entry is the existing access of the jth residential building (Aj)
res_access_array = np.array(res_groc_access['access'])
# Entry [i, j] is Bij (rows: commercial buildings, columns: residential buildings)
res_comm_access_matrix = np.array(
    res_comm_access.pivot(index='commercial_coordinates', columns='Residential_coordinates', values='access')
)


num_commercial_buildings = len(buildings_df[buildings_df['class_reco'].str.contains('commercial')])
num_residential_buildings = len(buildings_df[buildings_df['class_reco'].str.contains('Residential')])


########################################
# ASSIGN DECISION VARIABLES
########################################

c_i = m.addVars(range(num_commercial_buildings), vtype=GRB.BINARY)


########################################
# OBJECTIVE FUNCTION
########################################

m.setObjective(
    gp.quicksum(
        (1 - res_access_array[j]) * res_population_array[j] * res_comm_access_matrix[i, j] * c_i[i]
        for i in range(num_commercial_buildings)
        for j in range(num_residential_buildings)
    ),
    GRB.MAXIMIZE,
)


########################################
# CONSTRAINTS
########################################

m.addConstr(gp.quicksum(c_i[i] for i in range(num_commercial_buildings)) == 1)


<gurobi.Constr *Awaiting Model Update*>

In [34]:
# Optimize and see results
m.optimize()

Gurobi Optimizer version 9.5.2 build v9.5.2rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 1 rows, 53 columns and 53 nonzeros
Model fingerprint: 0x81182e8f
Variable types: 0 continuous, 53 integer (53 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [5e+00, 2e+02]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective 109.0000000
Presolve removed 1 rows and 53 columns
Presolve time: 0.01s
Presolve: All rows and columns removed

Explored 0 nodes (0 simplex iterations) in 0.03 seconds (0.00 work units)
Thread count was 1 (of 8 available processors)

Solution count 2: 173 109 

Optimal solution found (tolerance 1.00e-04)
Best objective 1.730000000000e+02, best bound 1.730000000000e+02, gap 0.0000%
