This notebook demonstrates a type of location optimization analysis: finding the optimal locations of facilities on a network. Two types of Set-Coverage analysis are implemented in **Python**:

### Set-Coverage Problem
#### Objective: Determine the minimum number of facilities and their locations in order to cover all demands within a pre-specified maximum distance (or time) coverage

### Partial Set-Coverage Problem
#### Objective: Determine the minimum number of facilities and their locations in order to cover a given fraction of the population within a pre-specified maximum distance (or time) coverage
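
For reference, both objectives can be written as small integer programs. This is the standard textbook formulation, given here as a sketch. It is an assumption that `GOSTnet_Optimization` builds its model this way, and the symbols are introduced only for illustration: $I$ is the set of origins, $J$ the set of candidate facilities, $d_{ij}$ the travel time from origin $i$ to facility $j$, $S$ the maximum coverage, $N_i = \{\, j \in J : d_{ij} \le S \,\}$ the facilities that can cover origin $i$, and $x_j = 1$ if facility $j$ is selected.

Set-Coverage:

$$
\begin{aligned}
\min \quad & \sum_{j \in J} x_j \\
\text{s.t.} \quad & \sum_{j \in N_i} x_j \ge 1 && \forall i \in I \\
& x_j \in \{0, 1\} && \forall j \in J
\end{aligned}
$$

Partial Set-Coverage, with $p_i$ the population of origin $i$, $\alpha$ the fraction of the population to cover, and $z_i = 1$ if origin $i$ is covered:

$$
\begin{aligned}
\min \quad & \sum_{j \in J} x_j \\
\text{s.t.} \quad & z_i \le \sum_{j \in N_i} x_j && \forall i \in I \\
& \sum_{i \in I} p_i z_i \ge \alpha \sum_{i \in I} p_i \\
& x_j \in \{0, 1\},\; z_i \in \{0, 1\} && \forall j \in J,\ \forall i \in I
\end{aligned}
$$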

More information on GOSTNets Optimization can be found in the wiki: https://github.com/worldbank/GOST_PublicGoods/wiki/GOSTnets-Optimization

In [1]:
import geopandas as gpd
import pandas as pd
import os, sys
# add to your system path the location of the GOSTnet.py library
sys.path.append(r'../../../GOSTNets/GOSTNets')
import GOSTnet_Optimization as gn
import importlib

peartree version: 0.6.1 
networkx version: 2.2 
matplotlib version: 2.2.2 
osmnx version: 0.9 


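#### These Set-Coverage Problems require an OD matrix as an input
Read in the OD (origin-destination) matrix, saved as a CSV file, as a DataFrame. Rows are origin nodes, columns are candidate facility nodes, and values are travel times in seconds.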

In [2]:
pth = r'../../../../lima_optimization_output'
OD_df = pd.read_csv(os.path.join(pth, 'saved_OD.csv'),index_col=0)
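
To make the role of the OD matrix concrete, the sketch below turns a few travel times into "who can cover whom" sets by applying a maximum coverage threshold. It uses plain Python dictionaries rather than the DataFrame. The values are rounded from the first rows of the OD matrix preview shown later in this notebook. The helper name `coverage_sets` and the use of `<=` for the threshold are assumptions for illustration, not part of `GOSTnet_Optimization`.

```python
# Travel times in seconds, rounded from the OD matrix preview later in this notebook.
# Outer keys are origin nodes, inner keys are candidate facility nodes.
od = {
    6147: {6048: 1968.7, 2048: 1020.7, 6691: 363.7},
    2052: {6048: 525.8, 2048: 558.8, 6691: 1633.7},
    3: {6048: 1330.8, 2048: 448.8, 6691: 1230.2},
}


def coverage_sets(od, max_coverage):
    """Map each origin to the set of facilities within max_coverage seconds."""
    return {
        origin: {fac for fac, cost in row.items() if cost <= max_coverage}
        for origin, row in od.items()
    }


covers = coverage_sets(od, max_coverage=1200)
for origin, facilities in covers.items():
    print(origin, sorted(facilities))
```

With a 1200-second threshold, origin 3 can only be covered by facility 2048, so any full-coverage solution for these three origins would have to include it.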

In [3]:
import pulp

### Run the Set-Coverage Problem
#### Objective: Determine the minimum number of facilities and their locations in order to cover all demands within a pre-specified maximum distance (or time) coverage

In the example below, the pre-specified maximum coverage is a travel time of 1200 seconds.

In [66]:
set_coverage_result = gn.optimize_set_coverage(OD_df, max_coverage = 1200)

number of origins
678
print totalCoveredFacilities
678
print percent coverage
100.0
print prob obj
X_1556 + X_175 + X_2048 + X_2959 + X_3409 + X_367 + X_3914 + X_4154 + X_4198 + X_4233 + X_4647 + X_474 + X_4919 + X_6048 + X_6107 + X_6691 + X_917


In [57]:
set_coverage_result

[2048, 2959, 4919]
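
To illustrate what "minimum number of facilities that covers every origin" means, here is a minimal exhaustive-search sketch on a hypothetical toy input. It is not the method used by `gn.optimize_set_coverage`, which appears to build a linear program (the printed objective above is a sum of `X_` variables). Brute force is only practical for a handful of candidate facilities. The function name `min_set_cover` and the toy data are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical toy input: origin -> facilities that reach it within the threshold.
covers = {
    "a": {"F1", "F2"},
    "b": {"F2"},
    "c": {"F2", "F3"},
    "d": {"F3"},
}


def min_set_cover(covers):
    """Return a smallest facility list covering every origin, or None if impossible."""
    facilities = sorted(set().union(*covers.values()))
    # Try subsets in order of increasing size; the first feasible one is minimal.
    for k in range(1, len(facilities) + 1):
        for chosen in combinations(facilities, k):
            picked = set(chosen)
            if all(picked & reachable for reachable in covers.values()):
                return sorted(picked)
    return None  # at least one origin has no facility within the threshold


print(min_set_cover(covers))
```

Because subsets are tried smallest first, the returned selection is optimal in size, although several selections of the same size may exist.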

### Partial Set-Coverage Problem
#### Objective: Determine the minimum number of facilities and their locations in order to cover a given fraction of the population within a pre-specified maximum distance (or time) coverage
This problem factors in population coverage, so as an additional input we need a series that maps each origin to its population.

In [6]:
origins_gdf = pd.read_csv(os.path.join(pth, 'origins_snapped.csv'))
#origins_gdf
#origins_gdf.Population.values

In [7]:
origins_w_pop_series = pd.Series(origins_gdf.Population.values, index=origins_gdf.NN)
len(origins_w_pop_series)

707

In [16]:
# some origins snap to the same nearest node, so group by node (NN) and sum the origin populations
origins_w_pop_series_no_dupl = origins_w_pop_series.groupby('NN').sum()
len(origins_w_pop_series_no_dupl)

678
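
The grouping step collapses the 707 snapped origins onto 678 distinct nodes while keeping the total population unchanged. The sketch below spells out the same aggregation in plain Python on hypothetical toy values, with a check that no population is lost. The variable names and numbers are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical (nearest_node, population) pairs; node 21 appears twice.
snapped = [(3, 1000.0), (21, 1200.0), (21, 800.0), (32, 500.0)]

# Sum the populations of all origins that snapped to the same node.
pop_by_node = defaultdict(float)
for node, pop in snapped:
    pop_by_node[node] += pop

# The total population must be the same before and after grouping.
assert sum(pop_by_node.values()) == sum(pop for _, pop in snapped)

print(len(snapped), "origins ->", len(pop_by_node), "nodes")
```

The same sanity check can be applied to the pandas series by comparing the sum before and after the `groupby`.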

In [17]:
origins_w_pop_series_no_dupl[:10]

NN
3      1458.0
21     2232.0
32     2041.0
82     1508.0
84     1610.0
99     1295.0
106    1216.0
114     824.0
124     440.0
130    1104.0
dtype: float64

In [18]:
#sum(origins_w_pop_series_no_dupl)
origins_w_pop_series_no_dupl[3]

1458.0

In the example below, the inputs are a target of covering 80 percent of the population, a pre-specified maximum coverage of 900 seconds of travel time, and a series of origins with their populations.

In [19]:
OD_df[:5]

Unnamed: 0,6048,2048,6691,4154,4198,4647,4233,3914,2959,175,3409,367,1556,917,4919,474,6107
6147,1968.655016,1020.721567,363.676689,819.749517,1322.002788,1578.076932,1803.247252,806.455001,2049.990821,1517.705535,1879.396717,803.156991,1542.402922,1657.102815,583.090841,1736.154601,1584.512721
2052,525.817013,558.842693,1633.650766,1591.881656,274.594031,839.743363,1166.553316,1578.626774,872.524426,1600.311304,1019.7201,1744.22869,875.417373,694.618499,1785.150976,666.975509,544.361984
3,1330.771618,448.775456,1230.200991,1380.165021,873.074022,1498.688595,1962.518259,1307.025267,1668.335516,2097.397326,1815.601711,1546.804731,1534.362605,1386.775967,1520.879916,1471.197652,1236.519452
6154,153.55021,1138.24779,1658.305553,1700.636369,853.999129,1304.180223,723.313264,1988.980369,429.130522,1197.635321,576.396716,1787.841266,1304.052071,1079.897504,1828.763553,817.834594,822.268208
6162,202.318076,907.367997,1799.378687,1663.935343,623.119336,1074.262074,845.305328,1927.152078,551.122585,1319.627384,698.38878,1751.14024,1074.133922,849.979356,1792.062527,653.254488,592.35006


In [41]:
import importlib
importlib.reload(gn)

peartree version: 0.6.1 
networkx version: 2.2 
matplotlib version: 2.2.2 
osmnx version: 0.9 


<module 'GOSTnet_Optimization' from '../../../GOSTNets/GOSTNets/GOSTnet_Optimization.py'>

In [80]:
# need to double-check this
partial_set_coverage_result = gn.optimize_partial_set_coverage(OD_df, pop_coverage = .8, max_coverage = 900, origins_pop_series = origins_w_pop_series_no_dupl, existing_facilities = None)

print min_coverage
846785.6000000001


In [81]:
partial_set_coverage_result

[2048, 3409, 3914]
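
As a way to double-check a partial-coverage result, the sketch below solves a hypothetical toy instance by exhaustive search and reports how much population the selection covers. It is not the method used by `gn.optimize_partial_set_coverage`. The function name `min_partial_cover` and the toy data are assumptions for illustration. The required population is computed as the coverage fraction times the total population, which is presumably what the printed `min_coverage` value represents.

```python
from itertools import combinations

# Hypothetical toy inputs: which facilities reach each origin, and origin populations.
covers = {"a": {"F1"}, "b": {"F1", "F2"}, "c": {"F2"}, "d": {"F3"}}
population = {"a": 400, "b": 300, "c": 250, "d": 50}


def min_partial_cover(covers, population, pop_coverage):
    """Return (facilities, covered_population) for a smallest feasible selection."""
    required = pop_coverage * sum(population.values())
    facilities = sorted(set().union(*covers.values()))
    # Try subsets in order of increasing size; the first feasible one is minimal.
    for k in range(1, len(facilities) + 1):
        for chosen in combinations(facilities, k):
            picked = set(chosen)
            covered = sum(
                population[origin]
                for origin, reachable in covers.items()
                if picked & reachable
            )
            if covered >= required:
                return sorted(picked), covered
    return None  # the target share cannot be reached with these facilities


print(min_partial_cover(covers, population, pop_coverage=0.8))
```

In this toy case, relaxing the target from 100 percent to 80 percent drops the small, isolated origin `d` and saves one facility, which is the trade-off the partial problem is designed to expose.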