This notebook performs a type of location optimization analysis: finding the optimal locations of facilities on a network. The analysis is the P-Median Problem, implemented in **Julia**:

### P-Median Problem
The P-median problem chooses the locations of a pre-specified number, P, of facilities so as to minimize the average travel distance (or time) between each demand point and the facility that serves it. The P-median problem can take into account the level of demand at each point (e.g. the number of people or the number of visits).
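
As a sketch, the problem can be written as a binary integer program. The notation here is introduced for illustration and is not from the original notebook: $I$ is the set of demand points (origins), $J$ the set of candidate facility sites, $d_{ij}$ the travel cost from origin $i$ to site $j$ taken from the OD matrix, $x_j = 1$ if site $j$ is selected, and $y_{ij} = 1$ if origin $i$ is served by site $j$. Because the number of origins is fixed, minimizing the total cost is equivalent to minimizing the average. Demand levels, if used, are assumed to be folded into $d_{ij}$ as multiplicative weights.

```latex
$$
\begin{aligned}
\min_{x,\,y} \quad & \sum_{i \in I} \sum_{j \in J} d_{ij}\, y_{ij} \\
\text{s.t.} \quad & \sum_{j \in J} x_j = P, \\
& \sum_{j \in J} y_{ij} = 1 \quad \forall i \in I, \\
& y_{ij} \le x_j \quad \forall i \in I,\ j \in J, \\
& x_j,\ y_{ij} \in \{0, 1\}.
\end{aligned}
$$
```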

More information on GOSTnets optimization can be found in the wiki: https://github.com/worldbank/GOST_PublicGoods/wiki/GOSTnets-Optimization

#### This is a Julia notebook. If you are new to Julia, follow these [steps](https://datatofish.com/add-julia-to-jupyter/) to add Julia to Jupyter Notebook.

In [46]:
using Pkg
Pkg.add("JuMP")
Pkg.add("Cbc")
Pkg.add("MathOptInterface")
Pkg.add("MathProgBase")
Pkg.add("CSV")
Pkg.add("DelimitedFiles")
Pkg.add("DataFrames")
println("Done installing packages")

[32m[1m Resolving[22m[39m package versions...
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Project.toml`
[90m [no changes][39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Manifest.toml`
[90m [no changes][39m
[32m[1m Resolving[22m[39m package versions...
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Project.toml`
[90m [no changes][39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Manifest.toml`
[90m [no changes][39m
[32m[1m Resolving[22m[39m package versions...
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Project.toml`
[90m [no changes][39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Manifest.toml`
[90m [no changes][39m
[32m[1m Resolving[22m[39m package versions...
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Project.toml`
[90m [no changes][39m
[32m[1m  Updating[22m[39m `~/.julia/environments/v1.0/Manifest.toml`
[90m [no changes][39m
[32m[1m Resolving[22m[39

In [47]:
#using JuMP, Cbc, GLPK, CPLEX, Test, Random, MathOptInterface, MathOptFormat, CSV, DataFrames, DelimitedFiles, MathProgBase
using JuMP, Cbc, MathOptInterface, CSV, DataFrames, DelimitedFiles, MathProgBase

In [48]:
# MathOptInterface is an abstraction layer for mathematical optimization solvers
const MOI = MathOptInterface

MathOptInterface

## This is the Julia P-Median function

In [10]:
function pMedian(numFacility::Int, CSVfile)

    println("numFacility")
    println(numFacility)

    # materialize a csv file as a DataFrame
    df = CSV.File(CSVfile) |> DataFrame!

    #extract column_headers
    column_headers = []
    #skip Column1
    for i=2:length(names(df))
      push!(column_headers,String(names(df)[i]))
    end
    
    OD_dict = Dict()
    for i in 1:size(df, 1)
        OD_dict[df[i,1]] = df[i,2:end]
    end

    #println("print OD_dict")
    #println(OD_dict)

    #origins as array
    origins = df[:,1]

    #println("origins")
    #println(origins)

    facilities = []
    for i in df[1,2:end]
      push!(facilities,trunc(Int, i))
    end

    #println("facilities")
    #println(facilities)

    #m = Model(with_optimizer(CPLEX.Optimizer))
    #output says threads were changed, but I do not see a difference on the resource monitor
    #m = Model(with_optimizer(Cbc.Optimizer, threads = 14))
    # solver limits: 2 threads and a time limit of 68400 seconds (19 hours)
    m = Model(with_optimizer(Cbc.Optimizer, threads = 2, seconds = 68400))

    # Facility locations
    #@variable(m, 0 <= s[1:numLocation] <= 1)
    #@variable(m, 0 <= x[1:length(facilities)] <= 1)
    #binary variable
    @variable(m, x[1:length(facilities)], binary=true)

    #println("print Facility location var")
    #println(x)

    # Aux. variable: y[i,j] = 1 if origin i is assigned to the facility at j
    #@variable(m, 0 <= x[1:numLocation,1:numCustomer] <= 1)
    #@variable(m, 0 <= y[origins,1:length(facilities)] <= 1)
    #binary variable
    @variable(m, y[origins,1:length(facilities)], binary=true)

    #println("print origin facility var")
    #println(y)

    # Objective: min distance
    #@objective(m, Min, sum(abs(customerLocations[a]-i)*x[i,a] for a = 1:numCustomer, i = 1:numLocation) )

    @objective(m, Min, sum(OD_dict[i][j]*y[i,j] for i in origins, j = 1:length(facilities)) )

    # Constraints


    # Subject to: exactly numFacility facilities are selected
    @constraint(m, sum(x[i] for i=1:length(facilities)) == numFacility )


    for i in origins
        # Subject to: origin i can be assigned to facility j only if j is selected (links y with x)
        for j in 1:length(facilities)
            @constraint(m, y[i,j] <= x[j])
        end

        # Subject to: each origin is assigned to exactly one facility
        @constraint(m, sum(y[i,j] for j=1:length(facilities)) == 1 )
    end


    JuMP.optimize!(m)

    println("Objective value is: ", JuMP.objective_value(m))

    #println("Objective bound is: ", JuMP.objective_bound(m))


    println("print array values")
    println(value.(x))
    println("print array length")
    println(length(value.(x)))

    result_array = value.(x)

    selected_facilities = []

    # x is binary, but the solver returns floats; use a threshold instead of testing == 1
    for i=1:length(result_array)
       if result_array[i] > 0.5
           push!(selected_facilities,column_headers[i])
       end
    end

    println("print selected_facilities")
    println(selected_facilities)

    #save selected_facilities array to file
    #C:\Users\gost_\Desktop\lima\data\OD_distance
    #writedlm("C:\\Users\\gost_\\Desktop\\lima\data\\OD_distance\\selected_facilities_array", selected_facilities)
    #writedlm("H:\\lima_optimality\\examples_testing\\OD2\\selected_facilities_array", selected_facilities)
    #writedlm("C:\\Temp\\lima_OD_distance_output\\selected_facilities_array_lima_distance_weighted_12hr_v2_binary_vars", selected_facilities)

    #println("finished writing selected_facilities_array to file")

    # check the solver status; a run stopped by the time limit may still hold a usable solution
    if termination_status(m) == MOI.OPTIMAL
        optimal_solution = value.(x)
        optimal_objective = objective_value(m)
    elseif termination_status(m) == MOI.TIME_LIMIT && has_values(m)
        suboptimal_solution = value.(x)
        suboptimal_objective = objective_value(m)
    else
        error("The model was not solved correctly.")
    end

    return selected_facilities

end

pMedian (generic function with 1 method)

### The pMedian function takes the number of facilities to place as its first input. Its second input is the path to the OD matrix, saved as a CSV file.
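
Judging from the function body, the CSV is expected to hold origin IDs in its first column, with every other column header being a facility ID and each cell the travel cost from that origin to that facility. The sketch below builds a tiny file in that layout and reads it back with the `DelimitedFiles` standard library. All IDs and distances are invented for illustration, and the `Column1` header is an assumption based on the "skip Column1" comment in the function.

```julia
using DelimitedFiles  # standard library

# Hypothetical 3-origin x 2-facility OD matrix; every ID and distance is invented.
csv_text = """
Column1,2048,4154
101,5.0,12.0
102,9.0,3.0
103,7.5,7.0
"""

path = tempname() * ".csv"
write(path, csv_text)

# header = true returns the data cells and the header row separately
cells, header = readdlm(path, ','; header = true)

facility_ids = String.(vec(header)[2:end])  # column headers after the first column
origin_ids = Int.(cells[:, 1])              # first column holds the origin IDs
distances = cells[:, 2:end]                 # remaining cells are origin-to-facility costs

println(facility_ids)
println(origin_ids)
println(size(distances))
```

A file in this shape could then be passed as the second argument, with the first argument no larger than the number of facility columns.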

In [42]:
selected_facilities = pMedian(4,"../../../../lima_optimization_output/saved_OD.csv")

numFacility
4
Objective value is: 312049.36805564194
print array values
[0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print array length
17
print selected_facilities
Any["2048", "4154", "3409", "6107"]
Welcome to the CBC MILP Solver 
Version: 2.9.9 
Build Date: Dec 31 2018 

command line - Cbc_C_Interface -threads 2 -seconds 68400 -solve -quit (default strategy 1)
threads was changed from 0 to 2
seconds was changed from 1e+100 to 68400
Continuous objective value is 312049 - 1.22 seconds
Cgl0004I processed model has 12205 rows, 11543 columns (11543 integer (11543 of which binary)) and 34595 elements
Cbc0038I Initial state - 0 integers unsatisfied sum - 0
Cbc0038I Solution found of 312049
Cbc0038I Before mini branch and bound, 11543 integers at bound fixed and 0 continuous
Cbc0038I Mini branch and bound did not improve solution (2.47 seconds)
Cbc0038I After 2.47 seconds - Feasibility pump exiting with objective of 312049 - took 0.06 seconds
Cbc0012

4-element Array{Any,1}:
 "2048"
 "4154"
 "3409"
 "6107"

In [43]:
selected_facilities

4-element Array{Any,1}:
 "2048"
 "4154"
 "3409"
 "6107"

In [45]:
#write-out selected_facilities
writedlm("../../../../lima_optimization_output/selected_facilities_file_from_julia",selected_facilities)
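
On very small instances, a solver result can be cross-checked by exhaustive enumeration. The following is a minimal sketch, not part of the original notebook: it assumes a dense cost matrix with origins as rows and candidate facilities as columns, and the helper names `assignment_cost` and `brute_force_pmedian` as well as the numbers are hypothetical. The search grows exponentially with the number of candidate sites, so it is only practical for a handful of columns.

```julia
# Total cost when every origin travels to its cheapest selected facility.
function assignment_cost(dist::AbstractMatrix, open_cols)
    return sum(minimum(dist[i, j] for j in open_cols) for i in axes(dist, 1))
end

# Try every subset of exactly `p` columns, encoded as a bitmask, and keep the cheapest.
function brute_force_pmedian(dist::AbstractMatrix, p::Int)
    n = size(dist, 2)
    best_cost, best_cols = Inf, Int[]
    for mask in 1:(2^n - 1)
        count_ones(mask) == p || continue
        cols = [j for j in 1:n if (mask >> (j - 1)) & 1 == 1]
        cost = assignment_cost(dist, cols)
        if cost < best_cost
            best_cost, best_cols = cost, cols
        end
    end
    return best_cols, best_cost
end

# Invented 4-origin x 3-facility cost matrix.
dist = [1.0 9.0 4.0;
        8.0 2.0 5.0;
        7.0 6.0 1.0;
        2.0 8.0 6.0]

cols, cost = brute_force_pmedian(dist, 2)
println("selected columns: ", cols, ", total cost: ", cost)
```

For this matrix the cheapest pair is columns 1 and 3, with a total cost of 9.0. The same comparison against the objective value reported by the solver would apply to a real OD matrix with few candidate sites.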