This notebook runs the optimization that finds where a new grocery store should be placed to maximize access.

How this works (implementation):

1. Use helper_population_allocation.py to allocate a population count to each residential building (not ready yet; it currently contains a placeholder function).
2. Use helper_distance_calculation.py to calculate existing grocery access for each residential building and the distances between residential and commercial buildings.
3. Once 1 and 2 are done, all parameters are ready.
4. This notebook then does some pre-optimization setup and runs an optimization model in Gurobi. It's pretty anticlimactic.
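
Step 2 can be pictured with a small, dependency-free sketch. The actual `dc.calculate_access` helper is not shown in this notebook, so everything below is an assumption made for illustration: the function names `haversine_miles` and `access_matrix`, the `(latitude, longitude)` tuple layout, the Earth radius constant, and the 1-mile threshold (taken from the parameter definitions in the optimization section).

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def access_matrix(origins, destinations, threshold_miles=1.0):
    """Return (distances, access), both indexed [origin][destination].

    access[r][c] is 1 when destination c lies within threshold_miles of origin r.
    """
    distances = [[haversine_miles(o, d) for d in destinations] for o in origins]
    access = [[1 if dist <= threshold_miles else 0 for dist in row]
              for row in distances]
    return distances, access


# Hypothetical coordinates: one residential building, two commercial buildings
res_coords = [(40.0, -75.0)]
comm_coords = [(40.005, -75.0), (40.1, -75.0)]
distances, access = access_matrix(res_coords, comm_coords)
print(access)
```

Taking the row-wise maximum of an access matrix built against grocery stores would then give a per-building "already has access" flag, which is how the notebook derives its existing-access array.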

Next steps:

- Update the population allocation helper file
- Figure out the correct ordering of arrays. It's not accurate currently 
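
One possible way to approach the ordering item, sketched under assumptions: if per-building values (such as population) are looked up by `building_id` instead of by position, the arrays cannot drift out of order. The helper name `align_by_id` and the sample ids and populations below are hypothetical; the real helper outputs may be shaped differently.

```python
def align_by_id(ordered_ids, values_by_id):
    """Return values in the same order as ordered_ids; fail loudly on gaps."""
    missing = [b for b in ordered_ids if b not in values_by_id]
    if missing:
        raise KeyError(f"no value for building ids: {missing}")
    return [values_by_id[b] for b in ordered_ids]


# Made-up ids, sorted as strings (so "10-R" comes before "2-R")
res_ids = ["10-R", "2-R", "7-R"]
# Made-up populations, stored in a different order
population_by_id = {"7-R": 4, "10-R": 10, "2-R": 2}

population = align_by_id(res_ids, population_by_id)
print(population)
```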


In [96]:
# Import libraries
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import haversine as hs
import gurobipy as gp
from gurobipy import GRB

# Helper modules
import helper_population_allocation as pa
import helper_distance_calculation as dc

# Suppress all warnings (added to hide pandas set-copy warnings)
import warnings
warnings.filterwarnings("ignore")



PRE-OPTIMIZATION SETUP

In [97]:
%%time

# Get the main buildings dataset 
buildings_df = gpd.read_file('../processed_data/relevant_buildings.shp')

# Create ID variable
buildings_df.reset_index(drop=True, inplace=True)
buildings_df['building_id'] = buildings_df.index + 1
buildings_df['building_id'] = buildings_df.apply(lambda row: str(row['building_id']) + '-' + str(row['CLASS']) , axis=1)

buildings_df = buildings_df.sample(n=2000, random_state=1)  # Remove later




Wall time: 8.91 s


In [98]:
# Create arrays to track ordering (residential)
res_buildings = buildings_df[buildings_df['class_reco'].str.contains('Residential')]
res_buildings = res_buildings.sort_values('building_id')
res_buildings = dc.get_geocoordinate(res_buildings, 'geometry')

res_buildings_array = np.array(res_buildings['building_id'])
res_buildings_coordinates_array = np.array(res_buildings['coordinates'])

In [99]:
# Create arrays to track ordering (Commercial)
comm_buildings = buildings_df[buildings_df['class_reco'].str.contains('commercial')]
comm_buildings = comm_buildings.sort_values('building_id')
comm_buildings = dc.get_geocoordinate(comm_buildings, 'geometry')

comm_buildings_array = np.array(comm_buildings['building_id'])
comm_buildings_coordinates_array = np.array(comm_buildings['coordinates'])


In [100]:
# Create arrays to track ordering (grocery stores)
grocery_stores = buildings_df[buildings_df['class_reco'].str.contains('Grocery')]
grocery_stores = grocery_stores.sort_values('building_id')
grocery_stores = dc.get_geocoordinate(grocery_stores, 'geometry')

grocery_stores_array = np.array(grocery_stores['building_id'])
grocery_stores_coordinates_array = np.array(grocery_stores['coordinates'])


In [101]:
# Create parameter matrices (res-comm access matrix; its transpose is Bij in the model)
# [r, c] value indicates whether residential building r is within access distance of commercial building c
res_comm_distance_matrix, res_comm_access_matrix = dc.calculate_access(res_buildings_coordinates_array, comm_buildings_coordinates_array)

In [102]:
# Create parameter matrices (Res groc access array - Aj)
# jth value indicates whether the jth residential building has existing access
res_groc_distance_matrix, res_groc_access_matrix = dc.calculate_access(res_buildings_coordinates_array, grocery_stores_coordinates_array)
res_access_array = np.amax(res_groc_access_matrix, 1)


In [103]:
# Create parameter matrices (Res Population - Pj)
# jth value is the population of the jth residential building
res_population = pa.get_population(geopandas_dataframe=res_buildings) 
res_population_array = np.array(res_population['population'])
res_population_array

array([10, 10, 10, ...,  2,  2,  2], dtype=int64)

OPTIMIZATION

Sets:
- i = index over all commercial buildings
- j = index over all residential buildings

Decision variable:
- Ci = 1 if the new grocery store is put in commercial building i, 0 otherwise

Parameters:
- Aj = 1 if residential building j already has access to a food store (within 1 mile), 0 otherwise
- Bij = 1 if commercial building i is within 1 mile of residential building j, 0 otherwise
- Pj = population of residential building j

Objective function:

$$ \max \sum_i \sum_j (1 - A_j) \, P_j \, B_{ij} \, C_i $$

Constraint (only 1 grocery store location being allocated):
$$ \sum_i C_i = 1$$
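
Because exactly one $C_i$ equals 1, the objective reduces to picking the commercial building with the largest newly served population, $\sum_j (1 - A_j) P_j B_{ij}$. The sketch below checks that reading on toy data in plain Python. The function name `best_location` and all numbers are made up for illustration, and `B[i][j]` follows the model's indexing (commercial i, residential j), not the notebook's residential-by-commercial array layout.

```python
def best_location(A, P, B):
    """Return (index, gain) of the commercial building that serves the most
    people who currently lack access.

    B[i][j] is 1 if commercial building i is within range of residential j.
    """
    gains = [
        sum((1 - A[j]) * P[j] * B[i][j] for j in range(len(P)))
        for i in range(len(B))
    ]
    best = max(range(len(gains)), key=gains.__getitem__)
    return best, gains[best]


A = [1, 0, 0, 0]        # residential building 0 already has access
P = [50, 10, 20, 30]    # population per residential building
B = [
    [1, 1, 0, 0],       # commercial 0 reaches residential 0 and 1
    [0, 1, 1, 0],       # commercial 1 reaches residential 1 and 2
    [0, 0, 1, 1],       # commercial 2 reaches residential 2 and 3
]

index, gain = best_location(A, P, B)
print(index, gain)
```

A direct scan like this is a handy cross-check on the solver result for the single-store case; a solver becomes more useful once extra constraints or multiple stores are introduced.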





In [104]:
########################################
# SET UP MODEL
########################################

m = gp.Model("food_access")

num_commercial_buildings = len(comm_buildings_array)
num_residential_buildings = len(res_buildings_array)


########################################
# ASSIGN DECISION VARIABLES
########################################

c_i = m.addVars(range(num_commercial_buildings), vtype=GRB.BINARY)

print('decision vars added')

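########################################
# OBJECTIVE FUNCTION
########################################

# Maximize the population that gains access: sum over i, j of (1 - Aj) * Pj * Bij * Ci.
# res_comm_access_matrix is indexed [residential, commercial], hence [j, i].
m.setObjective(
    gp.quicksum(
        (1 - res_access_array[j])
        * res_population_array[j]
        * res_comm_access_matrix[j, i]
        * c_i[i]
        for i in range(num_commercial_buildings)
        for j in range(num_residential_buildings)
    ),
    GRB.MAXIMIZE,
)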

print('objective function set')

########################################
# CONSTRAINTS
########################################

m.addConstr(sum(c_i[i] for i in range(num_commercial_buildings)) ==  1) 

print('constraints added')


decision vars added
objective function set
constraints added


In [105]:
# Optimize and see results
m.optimize()

Gurobi Optimizer version 9.5.2 build v9.5.2rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 1 rows, 115 columns and 115 nonzeros
Model fingerprint: 0x1b5356fe
Variable types: 0 continuous, 115 integer (115 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 7e+02]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective 495.0000000
Presolve removed 1 rows and 115 columns
Presolve time: 0.00s
Presolve: All rows and columns removed

Explored 0 nodes (0 simplex iterations) in 0.01 seconds (0.00 work units)
Thread count was 1 (of 8 available processors)

Solution count 2: 655 495 

Optimal solution found (tolerance 1.00e-04)
Best objective 6.550000000000e+02, best bound 6.550000000000e+02, gap 0.0000%


ORDERING OF ARRAYS SANITY CHECK

In [106]:
# New population impacted
print(f"Total new population with access = {m.objVal}")

# Where is the store being placed? Binary variables come back as floats,
# so compare against 0.5 instead of testing for exact equality with 1.
store_location_index = next(
    i for i in range(num_commercial_buildings) if c_i[i].X > 0.5
)
print(f"store should be placed in commercial building {store_location_index}")



Total new population with access = 655.0
store should be placed in commercial building 32


In [107]:
# Now let's check if the ordering makes sense. 

# Which building is this
chosen_building_id = comm_buildings_array[store_location_index]
print(f"The chosen building id is {chosen_building_id}")

The chosen building id is 25316-C


In [108]:
# Let's find all residential buildings that fall within this commercial building's access range
relevant_res_indices = list(res_comm_access_matrix[:,store_location_index].nonzero()[0])
relevant_res_indices

[53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 81,
 82,
 83,
 84,
 85,
 86,
 88,
 89,
 90,
 91,
 226,
 227,
 228,
 229,
 230,
 231,
 232,
 233,
 235,
 236,
 237,
 270,
 271,
 272,
 273,
 274,
 275,
 276,
 450,
 451,
 452,
 453,
 454,
 455,
 456,
 457,
 459,
 460,
 461,
 462,
 464,
 465,
 466,
 468,
 469,
 470,
 471,
 472,
 474,
 475,
 476,
 477,
 478,
 479,
 480,
 481,
 482,
 483,
 485,
 486,
 488,
 489,
 490,
 491,
 492,
 493,
 494,
 495,
 496,
 507,
 508,
 509,
 511,
 513,
 514,
 515,
 518,
 519,
 521,
 522,
 524,
 525,
 527,
 531,
 532,
 533,
 535,
 536,
 537,
 540,
 542,
 544,
 545,
 546,
 547,
 549,
 550,
 552,
 555,
 974,
 975,
 976,
 977,
 978,
 979,
 980,
 981,
 982,
 983,
 984,
 985,
 986,
 987,
 988,
 989,
 990,
 991,
 1004]

In [109]:
# Which residential buildings are within range of this building and don't have current access?
# Build a new list instead of removing items while iterating, which skips elements.
relevant_res_indices = [i for i in relevant_res_indices if res_access_array[i] != 1]

relevant_res_indices

[53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 81,
 82,
 83,
 84,
 85,
 86,
 88,
 89,
 90,
 91,
 226,
 227,
 228,
 229,
 230,
 231,
 232,
 233,
 235,
 236,
 237,
 270,
 271,
 272,
 273,
 274,
 275,
 276,
 450,
 451,
 452,
 453,
 454,
 455,
 456,
 457,
 459,
 460,
 461,
 462,
 464,
 465,
 466,
 468,
 469,
 470,
 471,
 472,
 474,
 475,
 476,
 477,
 478,
 479,
 480,
 481,
 482,
 483,
 485,
 486,
 488,
 489,
 490,
 491,
 492,
 493,
 494,
 495,
 496,
 507,
 508,
 509,
 511,
 513,
 514,
 515,
 518,
 519,
 521,
 522,
 524,
 525,
 527,
 531,
 533,
 536,
 537,
 540,
 542,
 544,
 545,
 546,
 547,
 549,
 550,
 552,
 555,
 974,
 975,
 976,
 977,
 978,
 979,
 980,
 981,
 982,
 983,
 984,
 985,
 986,
 987,
 988,
 989,
 990,
 991,
 1004]
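
A note on filtering index lists like this one: removing items from a list while looping over it makes the loop skip the element that follows each removal, so some entries are never checked. The toy example below (made-up flags, not the notebook's data) shows the effect next to a list-comprehension filter.

```python
has_access = [0, 1, 1, 0]   # made-up flags: buildings 1 and 2 already have access
candidates = [0, 1, 2, 3]

# Removing while iterating: once 1 is removed, the loop steps past 2 unchecked.
buggy = list(candidates)
for i in buggy:
    if has_access[i] == 1:
        buggy.remove(i)

# Filtering into a new list checks every element exactly once.
filtered = [i for i in candidates if has_access[i] != 1]

print(buggy)
print(filtered)
```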

In [110]:
# Let's add up the population of the buildings in relevant_res_indices and see if it matches the optimal objective value
total_population = sum(res_population_array[i] for i in relevant_res_indices)


print(total_population)


655


In [111]:
# objVal is a float, so compare with a tolerance instead of exact equality
if np.isclose(total_population, m.objVal):
    print('results match and ordering makes sense. What a time to be alive')

else:
    print('manual and Gurobi results do not match, fix ordering of arrays')

results match and ordering makes sense. What a time to be alive
