# Facility Location with Regions

Based on Example 14.3 from the SAS Optimization documentation: https://go.documentation.sas.com/doc/en/pgmsascdc/default/casmopt/casmopt_milpsolver_examples03.htm

For a set of customers and sites, choose which sites to build such that:
- The sum of the distances between customers and their assigned sites, plus the building costs of the sites, is minimized
- The capacity of each site is not exceeded
- Each customer is assigned to a site in the same region

#### Mixed Integer Linear Programming Formulation

$ \begin{array}{llllll} 
\min & \displaystyle \sum _{i \in L} \displaystyle \sum _{j \in F} c_{ij} x_{ij} &+& \displaystyle \sum _{j \in F} f_ j y_ j \\ 
\text{s.t.} & \displaystyle \sum _{j \in F} x_{ij} & = & 1 & \forall i \in L & \text{(assign\_def)} \\ 
& \displaystyle \sum _{i \in L} d_ i x_{ij} & \leq & Cy_ j & \forall j \in F & \text{(capacity)} \\ 
& x_{ij} & = & 0 & \forall i,j \text{ if } r_i \ne r_j & \text{(region\_con)} \\
\\
%& \displaystyle \sum _{j \in F} f_ j y_ j &\le& B & \text{(budget\_con)}\\
%\\
& x_{ij} \in \{ 0,1\} & & & \forall i \in L, j \in F \\
& y_{j} \in \{ 0,1\} & & & \forall j \in F 
\end{array} 
$
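
To make the formulation concrete, the following sketch enumerates every possible assignment on a tiny instance and keeps the cheapest feasible one. It reads $c_{ij}$ as the distance between customer $i$ and site $j$, $d_i$ as the demand, $f_j$ as the building cost, $C$ as the capacity and $r_i$ as the region, which matches how the model is set up later in this notebook. All numbers and the helper name `solve_brute_force` are hypothetical, and brute force is only viable for a handful of cities; it is not how the MILP solver works.

```python
from itertools import product

# Hypothetical toy instance: four cities, each both a customer and a candidate site
demand = [3, 2, 4, 1]        # d_i
build_cost = [5, 4, 6, 3]    # f_j
region = [0, 1, 1, 1]        # r_i
capacity = 6                 # C
dist = [[0, 2, 9, 8],
        [2, 0, 7, 6],
        [9, 7, 0, 3],
        [8, 6, 3, 0]]        # c_ij


def solve_brute_force(respect_regions):
    """Return (best cost, best assignment); assignment[i] is the site of customer i."""
    n = len(demand)
    best_cost, best_assignment = None, None
    # Picking exactly one site per customer enforces assign_def by construction
    for assignment in product(range(n), repeat=n):
        # region_con: skip assignments that cross a region boundary
        if respect_regions and any(region[i] != region[j]
                                   for i, j in enumerate(assignment)):
            continue
        # capacity: total demand assigned to a site must not exceed C
        load = [0] * n
        for i, j in enumerate(assignment):
            load[j] += demand[i]
        if max(load) > capacity:
            continue
        # y_j = 1 exactly for the sites that are used
        cost = (sum(dist[i][j] for i, j in enumerate(assignment))
                + sum(build_cost[j] for j in set(assignment)))
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


free_cost, free_assignment = solve_brute_force(respect_regions=False)
region_cost, region_assignment = solve_brute_force(respect_regions=True)
print(free_cost, free_assignment)
print(region_cost, region_assignment)
```

Because the region-restricted search only looks at a subset of the unrestricted assignments, its optimum can never be cheaper than the unrestricted one.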


#### Input Data

For the input data we have a delimited text file (`cities_germany.csv`, semicolon-separated with decimal commas) listing all cities in Germany with more than 50,000 inhabitants. The data was retrieved from the Federal Statistical Office of Germany (www.destatis.de) and also includes the geographic coordinates of the cities for plotting. The following code reads the file and plots the data on a map.

In [10]:
import folium
import pandas as pd

# Read the input data and make sure the numbers are all parsed correctly, then print the top of the DataFrame
indata = pd.read_csv('cities_germany.csv', sep=';', decimal=',')
indata["size"] = pd.to_numeric(indata["size"].str.replace(" ", ""), errors='coerce')
indata["density"] = pd.to_numeric(indata["density"].str.replace(" ", ""), errors='coerce')
print(indata)

# Display the cities on a map of Germany
map_input = folium.Map(location=(52, 9), zoom_start=6)
for index, row in indata.iterrows():
    folium.Marker(location=[row["lat"], row["lon"]], tooltip=f'{row["name"]}<br>Size: {row["size"]}<br>Density: {row["density"]}').add_to(map_input)
display(map_input)

     state                            name     size  density  zipcode  \
0       11                   Berlin, Stadt  3755251     4214    10178   
1        2   Hamburg, Freie und Hansestadt  1892122     2506    20095   
2        9       München, Landeshauptstadt  1512491     4868    80331   
3        5                     Köln, Stadt  1084831     2678    50667   
4        6        Frankfurt am Main, Stadt   773068     3113    60311   
..     ...                             ...      ...      ...      ...   
190      1                 Elmshorn, Stadt    50772     2377    25335   
191      3                    Emden, Stadt    50535      450    26721   
192      3                   Goslar, Stadt    50203      306    38640   
193      5                  Willich, Stadt    50144      740    47877   
194      8  Heidenheim an der Brenz, Stadt    50025      467    89522   

           lon        lat  type  
0    13.405538  52.517670     1  
1     9.996970  53.550678     1  
2    11.575997  48.13
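
The cell above strips spaces from `size` and `density` before converting them, which suggests that the file uses spaces as thousands separators. The following sketch reproduces that parsing step on a small in-memory sample. The sample rows are hypothetical, and the `astype(str)` cast is an addition that keeps the code working even if pandas has already parsed a column as numeric.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the assumed layout: ';' as separator,
# ',' as decimal mark, and spaces as thousands separators
sample = io.StringIO(
    "name;size;density;lat;lon\n"
    "A-Stadt;1 234 567;4 214;52,5;13,4\n"
    "B-Stadt;50 025;467;48,7;10,2\n"
)
df = pd.read_csv(sample, sep=";", decimal=",")

# Remove the spaces, then convert; values that still fail to parse become NaN
for col in ["size", "density"]:
    cleaned = df[col].astype(str).str.replace(" ", "", regex=False)
    df[col] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
print(df)
```

With `errors="coerce"`, malformed entries turn into `NaN` silently, so a quick `df.isna().sum()` after parsing is a cheap sanity check.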

#### Connecting to SAS 9.4

To connect to SAS 9.4 we use the saspy (https://github.com/sassoftware/saspy) package together with a configuration file (see https://sassoftware.github.io/saspy/configuration.html#sascfg-personal-py).
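
As a rough sketch, a `sascfg_personal.py` for a SAS installation on the same Linux machine might look like the following. The configuration name `local` matches the name reported when the connection is established, but the access method and the path are assumptions; a Windows or remote setup needs different keys, as described in the saspy configuration documentation.

```python
# sascfg_personal.py (sketch; the path below is a placeholder, not taken from this notebook)

# Names of the configurations that saspy may choose from
SAS_config_names = ["local"]

# STDIO-style configuration: saspy starts the SAS executable found at 'saspath'
local = {"saspath": "/path/to/SASHome/SASFoundation/9.4/bin/sas_u8"}
```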

The following code establishes a connection and creates a SAS data set from the pandas DataFrame.

In [11]:
import saspy

sas = saspy.SASsession(cfgfile="sascfg_personal.py")
_ = sas.df2sd(indata, table="indata")

Using SAS Config named: local
SAS Connection established. Subprocess id is 15956



#### Defining and Solving the Optimization Problem with OPTMODEL

In this specific example, every customer location (city) can also be a site. The demand of each city is its size, and its population density is used as the cost of building a site there. This means that building a site in a less densely populated city is preferable. The result is an interesting optimization problem for demonstration purposes, but it has no real-world meaning.

First we define the OPTMODEL code as a Python string, then we submit it to SAS.
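
The model computes the distance between two cities in kilometers with the SAS `geodist` function. To get a feeling for the magnitudes involved, the following sketch computes a great-circle distance in plain Python with the haversine formula. This is an approximation on a sphere with an assumed mean radius of 6371 km and is not necessarily the formula SAS uses, so small differences are to be expected. The coordinates for Berlin and Hamburg are taken from the data printed earlier; the function name is hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


berlin = (52.517670, 13.405538)    # (lat, lon)
hamburg = (53.550678, 9.996970)

d = haversine_km(*berlin, *hamburg)
print(f"Berlin to Hamburg: about {round(d)} km")
```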

In [12]:
# Define the model in OPTMODEL
optmodel_code = """
proc optmodel;
   /* Define set of sites, only need one set since all sites are also customers */
   set <str> SITES;

   /* Latitude and Longitude for SITES */
   num lat {SITES};
   num lon {SITES};

   /* Capacity of each site */
   num C = 5000000;

   /* Other parameters */
   num demand {SITES};
   num cost {SITES};
   num region {SITES};

   /* Define a set of tuples for all possible assignments */
   set PAIRS = {i in SITES, j in SITES};

   /* Compute distances between sites */
   num distance {<i,j> in PAIRS}
       = round(geodist(lat[i], lon[i], lat[j], lon[j], 'K'));

   /* Read the data */
   read data indata into SITES=[name] lat lon region=state demand=size cost=density;

   /* Create variables */
   var Assign {PAIRS} binary;
   var Build {SITES} binary;

   /* Define objective function */
   min TotalCost
       = sum {<i,j> in PAIRS} distance[i,j] * Assign[i,j]
         + sum {j in SITES} cost[j] * Build[j];

   /* Each customer must be assigned to exactly one site */
   con assign_def {i in SITES}:
      sum {<(i),j> in PAIRS} Assign[i,j] = 1;

   /* Each site we build can handle at most C demand */
   con capacity {j in SITES}:
      sum {<i,(j)> in PAIRS} demand[i] * Assign[i,j] <= C * Build[j];

   /* Solve with the MILP solver */
   solve;

   /* Create output data sets */
   create data assignments from
        [customer site]={<i,j> in PAIRS: Assign[i,j] > 0.5}
        lat1=lat[i] lon1=lon[i] lat2=lat[j] lon2=lon[j] distance[i,j];
   create data sites from
        [site]={j in SITES: Build[j] > 0.5}
            name=j lat[j] lon[j] cost[j];
quit;
"""
# Submit the model to SAS
log = sas.submitLOG(optmodel_code)
print(log)

# Create output data frames to plot the solution
assignments = sas.sasdata("assignments").to_df()
sites = sas.sasdata("sites").to_df()


13                                                         The SAS System                                 16:30 Monday, July 8, 2024

1641       ods listing close;ods html5 (id=saspy_internal) file=_tomods1 options(bitmap_mode='inline') device=svg style=HTMLBlue;
1641     ! ods graphics on / outputfmt=png;
NOTE: Writing HTML5(SASPY_INTERNAL) Body file: _TOMODS1
1642       
1643       
1644       proc optmodel;
1645          /* Define set of sites, only need one set since all sites are also customers */
1646          set <str> SITES;
1647       
1648          /* Latitude and Longitude for SITES */
1649          num lat {SITES};
1650          num lon {SITES};
1651       
1652          /* Capacity of each site */
1653          num C = 5000000;
1654       
1655          /* Other parameters */
1656          num demand {SITES};
1657          num cost {SITES};
1658          num region {SITES};
1659       
1660          /* Define a set of tuples for all possible assignments */
1661          s

The following code computes the two parts of the objective from the output data sets, so that the total distance between customers and their sites can be compared with the total building cost of the sites.

In [13]:
# Print the sum of the distances
total_distance = assignments["distance"].sum()
print(f"Total distance: {total_distance}")

# Print the sum of the building costs
total_site_cost = sites["cost"].sum()
print(f"Total site cost: {total_site_cost}")

print(f"Objective: {total_distance + total_site_cost}")

Total distance: 11735.0
Total site cost: 5197.0
Objective: 16932.0
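
Beyond the objective, the same kind of output can be used to double-check feasibility. The following sketch verifies the assignment and capacity constraints on hypothetical stand-in data; the frames `assignments_demo` and `demand_demo` are made up for illustration. With the real data, the demand lookup would presumably come from `indata` (for example a Series of `size` indexed by `name`), assuming the city names survive the round trip through SAS unchanged.

```python
import pandas as pd

# Hypothetical stand-ins for the `assignments` output and the demand per city
assignments_demo = pd.DataFrame({
    "customer": ["A", "B", "C", "D"],
    "site":     ["B", "B", "D", "D"],
})
demand_demo = pd.Series({"A": 3, "B": 2, "C": 4, "D": 1})
capacity = 6

# assign_def: every customer appears exactly once
each_assigned_once = (assignments_demo["customer"].is_unique
                      and set(assignments_demo["customer"]) == set(demand_demo.index))

# capacity: sum the demand of all customers assigned to each site
assignments_demo["demand"] = assignments_demo["customer"].map(demand_demo)
load = assignments_demo.groupby("site")["demand"].sum()
capacity_ok = bool((load <= capacity).all())

print(load)
print(f"Each customer assigned once: {each_assigned_once}")
print(f"Capacity respected: {capacity_ok}")
```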


Now display the map with the solution.

In [14]:
map_noregion = folium.Map(location=(52, 9), zoom_start=6)

# Plot all cities
for index, row in indata.iterrows():
    folium.Marker(location=[row["lat"], row["lon"]], tooltip=f'{row["name"]}<br>Size: {row["size"]}<br>Density: {row["density"]}').add_to(map_noregion)

# Plot the cities that are sites to build
for index, row in sites.iterrows():
    folium.Marker(location=[row["lat"], row["lon"]], tooltip=row["name"], icon=folium.Icon(color="green")).add_to(map_noregion)

# Plot the assignments of customers to sites
for idx, row in assignments.iterrows():
    folium.PolyLine([[row["lat1"], row["lon1"]],
                     [row["lat2"], row["lon2"]]]).add_to(map_noregion)

display(map_noregion)

#### Modify the Example to Respect Regions

The following code modifies the OPTMODEL code so that only assignments within the same region are possible. While it is possible to add constraints or `fix` statements to the OPTMODEL code, the more efficient solution is to restrict the set of available variables directly by restricting the `PAIRS` set.
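
To see why restricting `PAIRS` pays off, the following sketch counts the assignment variables with and without the region filter. The six cities and their states are hypothetical; for the 195 cities in the input data, the unrestricted set has 195 × 195 = 38,025 pairs.

```python
import pandas as pd

# Hypothetical miniature of the input data: six cities in three states
cities = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "state": [1, 1, 1, 2, 2, 3],
})

# Without the filter, every (customer, site) combination gets a variable
n = len(cities)
all_pairs = n * n

# With the filter, only pairs inside the same state remain:
# a state with k cities contributes k * k pairs
region_sizes = cities.groupby("state").size()
same_region_pairs = int((region_sizes ** 2).sum())

print(f"All pairs: {all_pairs}")
print(f"Same-region pairs: {same_region_pairs}")
```

Variables that are never created do not have to be passed to the solver and removed again, which is the main advantage over fixing them to zero.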

Note that this creates 16 independent optimization problems, one for each state, some of which are trivial because there is only one possible site. The result is an optimization problem with disjoint blocks. By default, the MILP solver detects this structure and exploits it (this works better in newer versions). It is also possible to request this explicitly with the statement `solve with milp / decomp=(method=concomp);`. With this `solve` statement, the decomposition (DECOMP) algorithm solves each block independently and in parallel. Other ways to handle this situation include [COFOR](https://go.documentation.sas.com/doc/en/pgmsascdc/default/casmopt/casmopt_optmodel_syntax11.htm#casmopt.optmodel.npxcoforstmt) and [BY-Group Processing](https://go.documentation.sas.com/doc/en/pgmsascdc/default/casactmopt/casactmopt_optimization_details05.htm).
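
The idea of disjoint blocks can be illustrated with a small connected-components computation: two cities belong to the same block if an assignment variable links them. The following sketch uses a simple union-find structure on hypothetical data. It only illustrates the concept behind `method=concomp` and makes no claim about how the DECOMP algorithm is implemented.

```python
from collections import defaultdict

# Hypothetical cities and their regions
region = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 3}

# Same restriction as in the modified PAIRS set
pairs = [(i, j) for i in region for j in region if region[i] == region[j]]

# Union-find: every city starts in its own block
parent = {city: city for city in region}


def find(city):
    """Return the representative of the block containing `city`."""
    while parent[city] != city:
        parent[city] = parent[parent[city]]   # path halving
        city = parent[city]
    return city


def union(a, b):
    """Merge the blocks containing `a` and `b`."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# An allowed assignment (i, j) couples customer i and site j
for i, j in pairs:
    union(i, j)

blocks = defaultdict(list)
for city in region:
    blocks[find(city)].append(city)
components = sorted(sorted(block) for block in blocks.values())
print(components)
```

With the region filter in place, the blocks coincide with the regions, so each one can be solved as its own small problem.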

In [15]:
# Change the model to assign only within regions
optmodel_code = optmodel_code.replace("set PAIRS = {i in SITES, j in SITES};","set PAIRS = {i in SITES, j in SITES: region[i] = region[j]};")

# Optional: Use DECOMP to solve the disconnected blocks individually
optmodel_code = optmodel_code.replace("solve;","solve with milp / decomp=(method=concomp);")

# Submit the changed code again
log = sas.submitLOG(optmodel_code)
print(log)

# Create output data frames to plot the solution
assignments = sas.sasdata("assignments").to_df()
sites = sas.sasdata("sites").to_df()


45                                                         The SAS System                                 16:30 Monday, July 8, 2024

1898       ods listing close;ods html5 (id=saspy_internal) file=_tomods1 options(bitmap_mode='inline') device=svg style=HTMLBlue;
1898     ! ods graphics on / outputfmt=png;
NOTE: Writing HTML5(SASPY_INTERNAL) Body file: _TOMODS1
1899       
1900       
1901       proc optmodel;
1902          /* Define set of sites, only need one set since all sites are also customers */
1903          set <str> SITES;
1904       
1905          /* Latitude and Longitude for SITES */
1906          num lat {SITES};
1907          num lon {SITES};
1908       
1909          /* Capacity of each site */
1910          num C = 5000000;
1911       
1912          /* Other parameters */
1913          num demand {SITES};
1914          num cost {SITES};
1915          num region {SITES};
1916       
1917          /* Define a set of tuples for all possible assignments */
1918          s

Print the objective for the modified problem.

In [16]:
# Print the sum of the distances
total_distance = assignments["distance"].sum()
print(f"Total distance: {total_distance}")

# Print the sum of the building costs
total_site_cost = sites["cost"].sum()
print(f"Total site cost: {total_site_cost}")

print(f"Objective: {total_distance + total_site_cost}")

Total distance: 9970.0
Total site cost: 18102.0
Objective: 28072.0


Display the map for the modified problem where regions are respected. Note that Berlin, Hamburg, and Saarbrücken now have to be sites, since they are the only possible sites in their regions. In the state of Bremen, Bremerhaven is chosen as the site and the city of Bremen is assigned to it (which might or might not be a sensible decision in practice).

In [17]:
map_region = folium.Map(location=(52, 9), zoom_start=6)

# Plot all cities
for index, row in indata.iterrows():
    folium.Marker(location=[row["lat"], row["lon"]], tooltip=f'{row["name"]}<br>Size: {row["size"]}<br>Density: {row["density"]}').add_to(map_region)

# Plot the cities that are sites to build
for index, row in sites.iterrows():
    folium.Marker(location=[row["lat"], row["lon"]], tooltip=row["name"], icon=folium.Icon(color="green")).add_to(map_region)

# Plot the assignments of customers to sites
for idx, row in assignments.iterrows():
    folium.PolyLine([[row["lat1"], row["lon1"]],
                     [row["lat2"], row["lon2"]]]).add_to(map_region)

display(map_region)

In [18]:
# End the SAS session and close the saspy connection
sas.endsas()

SAS Connection terminated. Subprocess id was 15956
