This is a location optimization analysis: finding the optimal locations of facilities on a network. It solves the P-Median Problem in **Python**.

### P-Median Problem
The P-median problem chooses the locations of a pre-specified number, P, of facilities so as to minimize the average travel distance (or time) between each demand point and its nearest selected facility. It can also weight each demand point by its level of demand (e.g. the number of people or the number of visits).
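To make the objective concrete, here is a minimal brute-force sketch on a toy problem. The travel times, demand values, and function names are all made up for illustration, and exhaustive search is only practical for very small inputs; it is not how GOSTnets solves the problem.

```python
from itertools import combinations

# Toy travel-time matrix: rows are demand points, columns are candidate facilities.
# All values here are made up for illustration.
travel_time = {
    "d1": {"A": 2, "B": 9, "C": 7},
    "d2": {"A": 8, "B": 1, "C": 6},
    "d3": {"A": 7, "B": 8, "C": 2},
    "d4": {"A": 3, "B": 7, "C": 5},
}
demand = {"d1": 10, "d2": 5, "d3": 1, "d4": 4}  # e.g. people at each point


def weighted_cost(selected):
    """Total demand-weighted travel time when each point uses its nearest selected facility."""
    return sum(
        demand[d] * min(row[f] for f in selected)
        for d, row in travel_time.items()
    )


def brute_force_p_median(p):
    """Try every combination of p candidate facilities and keep the cheapest."""
    candidates = sorted(next(iter(travel_time.values())))
    return min(combinations(candidates, p), key=weighted_cost)


best = brute_force_p_median(2)
print(best, weighted_cost(best))  # ('A', 'B') 44
```

With these numbers, the low-demand point `d3` ends up far from its nearest facility because serving the high-demand points well matters more to the weighted total.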

More information on GOSTNets optimization can be found in the wiki: https://github.com/worldbank/GOST_PublicGoods/wiki/GOSTnets-Optimization

In [34]:
import geopandas as gpd
import pandas as pd
import os, sys
# add the location of the GOSTnets library to your system path
sys.path.append(r'../../../GOSTNets/GOSTNets')
import GOSTnet_Optimization as gn
import importlib

## The P-Median Problem requires an OD matrix as an input
### Read in the OD matrix, saved as a CSV file, as a DataFrame

In [35]:
pth = r'../../../../lima_optimization_output'
OD_df = pd.read_csv(os.path.join(pth, 'saved_OD.csv'),index_col=0)

In [36]:
OD_df[:3]

Unnamed: 0,6048,2048,6691,4154,4198,4647,4233,3914,2959,175,3409,367,1556,917,4919,474,6107
6147,1968.655016,1020.721567,363.676689,819.749517,1322.002788,1578.076932,1803.247252,806.455001,2049.990821,1517.705535,1879.396717,803.156991,1542.402922,1657.102815,583.090841,1736.154601,1584.512721
2052,525.817013,558.842693,1633.650766,1591.881656,274.594031,839.743363,1166.553316,1578.626774,872.524426,1600.311304,1019.7201,1744.22869,875.417373,694.618499,1785.150976,666.975509,544.361984
3,1330.771618,448.775456,1230.200991,1380.165021,873.074022,1498.688595,1962.518259,1307.025267,1668.335516,2097.397326,1815.601711,1546.804731,1534.362605,1386.775967,1520.879916,1471.197652,1236.519452


In [37]:
OD_df.keys()

Index(['6048', '2048', '6691', '4154', '4198', '4647', '4233', '3914', '2959',
       '175', '3409', '367', '1556', '917', '4919', '474', '6107'],
      dtype='object')
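The column labels above are strings (`dtype='object')`, while the row labels in the preview look like integers. The sketch below reproduces this on a tiny made-up CSV with the same layout; `csv_text` and `toy_od` are hypothetical names, not part of the original workflow.

```python
import io
import pandas as pd

# A tiny made-up OD matrix in the same layout as saved_OD.csv:
# first column = origin node IDs, header row = destination node IDs.
csv_text = """,6048,2048
6147,1968.6,1020.7
2052,525.8,558.8
"""

toy_od = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(toy_od.index.tolist())    # origins are parsed as integers
print(toy_od.columns.tolist())  # destinations are kept as strings

# Label-based lookup must use the matching types:
# an integer for the row, a string for the column.
value = toy_od.loc[6147, "2048"]
print(value)
```

Keeping this difference in mind helps when node IDs from one step (for example a list of integers) are used to index the columns of the OD matrix.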

In [38]:
facilities = OD_df.columns.values.tolist()

In [39]:
# facilities list
facilities

['6048',
 '2048',
 '6691',
 '4154',
 '4198',
 '4647',
 '4233',
 '3914',
 '2959',
 '175',
 '3409',
 '367',
 '1556',
 '917',
 '4919',
 '474',
 '6107']

In [40]:
# PuLP is a Python linear-programming library
import pulp
#import importlib
#importlib.reload(gn)

## Compute the P-Median Problem
The third argument to the `optimize_facility_locations` function is the number of facilities to locate optimally (4 here).

In [41]:
results = gn.optimize_facility_locations(OD_df, facilities, 4, existing_facilities = None)
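For readers curious what a p-median model can look like in PuLP, here is a minimal sketch of the standard textbook formulation on made-up data. It is an assumption-laden illustration and not the GOSTnets implementation, whose internals are not shown in this notebook; the names `dist`, `origins`, `candidates`, `x`, and `y` are hypothetical.

```python
import pulp

# Made-up travel times: dist[origin][facility]
dist = {
    "o1": {"f1": 2, "f2": 9, "f3": 7},
    "o2": {"f1": 8, "f2": 4, "f3": 5},
    "o3": {"f1": 7, "f2": 8, "f3": 2},
    "o4": {"f1": 3, "f2": 7, "f3": 5},
}
origins = list(dist)
candidates = ["f1", "f2", "f3"]
p = 2  # number of facilities to open

prob = pulp.LpProblem("toy_p_median", pulp.LpMinimize)

# x[f] = 1 if facility f is opened; y[o][f] = 1 if origin o is served by f
x = pulp.LpVariable.dicts("open", candidates, cat="Binary")
y = pulp.LpVariable.dicts("assign", (origins, candidates), cat="Binary")

# Objective: total travel time between origins and the facility serving them
prob += pulp.lpSum(dist[o][f] * y[o][f] for o in origins for f in candidates)

# Open exactly p facilities
prob += pulp.lpSum(x[f] for f in candidates) == p

for o in origins:
    # Each origin is served by exactly one facility
    prob += pulp.lpSum(y[o][f] for f in candidates) == 1
    for f in candidates:
        # An origin can only be served by a facility that is open
        prob += y[o][f] <= x[f]

prob.solve(pulp.PULP_CBC_CMD(msg=False))

chosen = [f for f in candidates if x[f].value() > 0.5]
print(pulp.LpStatus[prob.status], chosen, pulp.value(prob.objective))
```

This toy model is unweighted; a demand-weighted variant would multiply each `dist[o][f]` term in the objective by the demand at origin `o`.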

### These are the node IDs of the selected optimal facilities

In [29]:
results

[2048, 3409, 4154, 6107]

In [42]:
# save results in a .txt file, one node ID per line
with open(os.path.join(pth, 'results.txt'), 'w') as f:
    for r in results:
        f.write('{}\n'.format(r))

### Optional, if you would like to access the objective function value

In [43]:
#generate pulp.LpProblem object without solving
return_problem = gn.optimize_facility_locations(OD_df, facilities, 4, existing_facilities = None, execute=False)

In [44]:
# solve the problem; PuLP returns a status code (1 means an optimal solution was found)
return_problem.solve()

1

In [45]:
#print out objective value
return_problem.objective.value()

312049.36805564194
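One way to sanity-check a value like this is to recompute, for a given set of selected facilities, each origin's distance to its nearest selected facility and sum the results. The sketch below does this on a made-up matrix. It assumes an unweighted objective; whether that matches the objective GOSTnets builds is not confirmed here, and `toy_od` and `selected` are hypothetical names.

```python
import pandas as pd

# Made-up OD matrix: integer origin IDs as index, string facility IDs as columns
toy_od = pd.DataFrame(
    {"10": [2.0, 8.0, 7.0], "20": [9.0, 1.0, 8.0], "30": [7.0, 6.0, 2.0]},
    index=[1, 2, 3],
)
selected = [10, 30]  # hypothetical result list of integer node IDs

# Column labels are strings, so convert the selected IDs before indexing
cols = [str(node) for node in selected]

# Distance from each origin to its nearest selected facility
nearest = toy_od[cols].min(axis=1)

total = nearest.sum()   # unweighted total over all origins
mean = nearest.mean()   # average per origin
print(total, mean)
```

Dividing the total by the number of origins gives an average travel distance (or time) per origin, which is often easier to interpret than the raw sum.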