<a href="https://colab.research.google.com/github/aheiX/Teaching/blob/main/PuLP%20-%20Tutorial/Tutorial%20PuLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tutorial PuLP

This tutorial uses the well-known Traveling Salesperson Problem (TSP) to explain how the Python package PuLP can be used to model and solve small (mixed-integer) linear programs. A tutorial on implementing the TSP in CPLEX can be found here: [TSP Tutorial on CPLEX](https://www.scm.bwl.uni-kiel.de/de/lehre/tutorial-on-cplex.pdf).



## Problem Description

The Traveling Salesperson Problem (TSP) is one of the most famous combinatorial problems in the fields of mathematics, computer science, and operations research. The classical definition of the TSP is as follows: What is the shortest possible route for a traveling salesperson seeking to visit each city on a list exactly once and return to their city of origin? (Cook, William (2012) *In Pursuit of the Traveling Salesman*).

While the problem is easy to understand and easy to formulate as a mathematical model, solving it to optimality through complete enumeration of all feasible solutions quickly becomes intractable, because the number of possible tours grows factorially (that is, faster than exponentially) in the number of cities to visit.
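
To make this growth concrete, the following minimal sketch counts the distinct round trips for a fixed starting city. It assumes symmetric distances, so that a tour and its reverse count as one; the function name `count_tours` is introduced here for illustration only.

```python
import math

def count_tours(n_cities: int) -> int:
    """Number of distinct round trips through n_cities with a fixed start,
    assuming symmetric distances (a tour and its reverse count once)."""
    if n_cities < 3:
        return 1  # with one or two cities there is only a single round trip
    return math.factorial(n_cities - 1) // 2

# Print the number of tours for a few instance sizes
for n in (5, 9, 15, 20):
    print(n, count_tours(n))
```

For the nine locations used in this tutorial, this gives 8!/2 = 20160 tours, which is still easy to enumerate; the count explodes quickly for larger instances.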

To date, no known algorithm solves the TSP to optimality in polynomial time, and it is widely believed that no such algorithm exists. However, heuristics are usually capable of finding very good solutions in a short computation time.
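
As an illustration of what such a heuristic can look like, here is a minimal sketch of the nearest-neighbor construction heuristic. It is not used in the rest of this tutorial; the helper names and the toy distance data are hypothetical, and the heuristic gives no optimality guarantee in general.

```python
def nearest_neighbor_tour(dist, start):
    """Greedy construction: always move to the closest unvisited node."""
    tour = [start]
    unvisited = set(dist) - {start}
    while unvisited:
        current = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[current][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)  # close the round trip
    return tour

def tour_length(dist, tour):
    """Sum of the distances of consecutive nodes in the tour."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

# Hypothetical symmetric toy distances (not the tutorial's data)
toy = {
    'A': {'A': 0, 'B': 2, 'C': 9, 'D': 10},
    'B': {'A': 2, 'B': 0, 'C': 6, 'D': 4},
    'C': {'A': 9, 'B': 6, 'C': 0, 'D': 3},
    'D': {'A': 10, 'B': 4, 'C': 3, 'D': 0},
}

toy_tour = nearest_neighbor_tour(toy, 'A')
print(toy_tour, tour_length(toy, toy_tour))
```

The nested-dictionary layout `dist[a][b]` mirrors the `distances` structure built later in this tutorial, so the same function could be tried on the real data.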

Let's continue with a small artificial data set that is used throughout this tutorial. The example consists of nine locations, of which the first location ('Bahnhof') is the salesperson's city of origin (also referred to as the depot). The following table shows the latitude and longitude of the locations:
<br><br>
\begin{array}{lll}
         name  & latitude & longitude \\
         \hline
      Bahnhof & 54.315487 & 10.132285 \\
Friedrichsort & 54.393713 & 10.184142 \\
     Holtenau & 54.374804 & 10.148470 \\
   University & 54.348125 & 10.117918 \\
        Mitte & 54.324606 & 10.136630 \\
       Garden & 54.313102 & 10.150910 \\
  Wellingdorf & 54.328293 & 10.179582 \\
   Heikendorf & 54.379798 & 10.212037 \\
        Laboe & 54.409884 & 10.232679 \\
\end{array}

## Mathematical Formulation


The graph-based formulation used here follows the paper by Langevin, André, Francois Soumis, and Jacques Desrosiers ("Classification of travelling salesman problem formulations". In: Operations Research Letters. 1990). For the subtour elimination constraints, the popular formulation by Miller, Tucker, and Zemlin (MTZ) is used.

Let $N$ denote the number of nodes in the network, i.e., the number of cities. The distance for the salesperson to travel between any two nodes $i$ and $j$ is denoted by $c_{ij}$. The decision variable $x_{ij}$ describes whether the salesperson travels directly from node $i$ to node $j$ ($x_{ij}=1$) or not ($x_{ij}=0$). The auxiliary integer variable $u_{i}$ can be interpreted as the position of node $i$ in the tour and is needed only for the subtour elimination constraints. Using this notation, the mathematical model for the TSP is as follows:
<br><br>
$
\begin{align}
  \begin{array}{lll}
    &\textbf{Objective} & \\
    & \min \sum_{i=1,\dots,N} \sum_{j=1,\dots,N} c_{ij} \cdot x_{ij} &~~~ (1) \\
    &&\\
    &\textbf{Constraints} & \\
    & \sum_{i=1,\dots,N} x_{ij} = 1,~ \forall~ j = 1,\dots,N  &~~~ (2) \\
    & \sum_{j=1,\dots,N} x_{ij} = 1,~ \forall~ i = 1,\dots,N  &~~~ (3) \\
    & x_{ii} = 0,~ \forall~ i = 1,\dots,N  &~~~ (4) \\
    & u_{i} - u_{j} + N \cdot x_{ij} \le N -1,~ \forall~ i,j = 1,\dots,N: j \ne 1 ~\text{and}~ i \ne j  &~~~ (5) \\
    & u_{i} \in \mathbb{Z}^{+},~ \forall~ i = 1,\dots,N  &~~~ (6) \\
    & x_{ij} \in \{0,1\},~ \forall~ i,j = 1,\dots,N  &~~~ (7) \\
  \end{array}
\end{align}
$
<br><br>
Equation (1) states the objective (distance minimization) by summing up the distances of the selected arcs. Constraints (2) and (3) ensure that the salesperson enters and exits each node exactly once, respectively (i.e., that each node is visited exactly once), and Constraints (4) forbid arcs from a node to itself. Constraints (5) exclude solutions that contain subtours. Finally, Constraints (6) and (7) state the domains of the decision variables.
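
The following short argument sketches why Constraints (5) rule out subtours. It is an added explanation using the standard reasoning for Miller-Tucker-Zemlin constraints; the symbols $S$ and $k$ are introduced here and do not appear in the model. Suppose a solution contains a cycle through $k \ge 2$ nodes that does not pass through node 1, and let $S$ be the set of arcs of this cycle. Every arc $(i,j) \in S$ has $j \ne 1$, $i \ne j$, and $x_{ij} = 1$, so (5) applies to it:

$
\begin{align}
  u_{i} - u_{j} + N \le N - 1 \quad\Longleftrightarrow\quad u_{j} \ge u_{i} + 1, \quad \forall~ (i,j) \in S
\end{align}
$

Summing the left-hand inequality over all $k$ arcs of the cycle, each node of the cycle appears exactly once as $i$ and exactly once as $j$, so the $u$ terms cancel:

$
\begin{align}
  \sum_{(i,j) \in S} \left( u_{i} - u_{j} \right) + k \cdot N \le k \cdot (N - 1) \quad\Longrightarrow\quad k \cdot N \le k \cdot N - k
\end{align}
$

This is impossible for $k \ge 2$. Hence every cycle in a feasible solution must contain node 1, and because (2) and (3) allow node 1 to be entered and left only once, there can be only one cycle. For arcs with $x_{ij} = 0$, (5) reduces to $u_{i} - u_{j} \le N - 1$, which holds whenever all $u_{i}$ lie between $1$ and $N$.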

## Python Implementation

### Loading Packages

In [263]:
!pip install haversine
!pip install numpy
!pip install pandas
!pip install pulp
!pip install plotly

import haversine
import numpy as np
import pandas as pd
import plotly.express as px
import pulp




### Input Data

In [264]:
input_array = np.array(
    [
     ['Bahnhof', 54.31548738087378, 10.132285213361302],
     ['Friedrichsort', 54.39371269762963, 10.184142058855787], 
     ['Holtenau', 54.37480425366618, 10.148469747635753], 
     ['University', 54.348125164196105, 10.117918463246932],
     ['Mitte', 54.32460566307153, 10.136629552088058],
     ['Garden', 54.313101710271425, 10.15091017118389],
     ['Wellingdorf', 54.32829259013629, 10.179582183151096],
     ['Heikendorf', 54.37979792093858, 10.212036804008633],
     ['Laboe', 54.409884128644144, 10.232678775594744]
     ]
)

df = pd.DataFrame(input_array, columns=['name', 'latitude', 'longitude'])
df['longitude'] = pd.to_numeric(df['longitude'])
df['latitude'] = pd.to_numeric(df['latitude'])

print(df)

fig = px.scatter_mapbox(df, lat='latitude', lon='longitude', color='name', 
                        zoom=10, mapbox_style='open-street-map')
fig.update_traces(marker={'size': 15})
fig.show()


            name   latitude  longitude
0        Bahnhof  54.315487  10.132285
1  Friedrichsort  54.393713  10.184142
2       Holtenau  54.374804  10.148470
3     University  54.348125  10.117918
4          Mitte  54.324606  10.136630
5         Garden  54.313102  10.150910
6    Wellingdorf  54.328293  10.179582
7     Heikendorf  54.379798  10.212037
8          Laboe  54.409884  10.232679


### Distances

In [265]:
# distances[a][b]: great-circle distance from a to b,
# in km (the default unit of haversine.haversine), rounded to 2 decimals
distances = dict()
for _, row in df.iterrows():
  distances[row['name']] = dict()
  for _, row2 in df.iterrows():
    distances[row['name']][row2['name']] = round(haversine.haversine(
        (row['latitude'], row['longitude']), 
        (row2['latitude'], row2['longitude'])), 2)
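
For readers curious about what the haversine distance computes, here is a minimal pure-Python sketch of the formula. It is an illustration and not the library's actual code; the function name `haversine_km` and the mean Earth radius constant are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius in km

def haversine_km(p1, p2):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Coordinates taken from the table above
bahnhof = (54.315487, 10.132285)
mitte = (54.324606, 10.136630)
print(round(haversine_km(bahnhof, mitte), 2))
```

Because the formula is symmetric in its two arguments, the resulting distance matrix is symmetric as well, i.e., `distances[a][b]` equals `distances[b][a]`.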
    


### PuLP-Model

#### Initializing

In [266]:
from pulp.constants import LpMinimize

nodes = df.name
edges = [(i,j) for i in nodes for j in nodes]

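# x[i][j] = 1 if the salesperson travels directly from i to j, see (7)
x = pulp.LpVariable.dicts(name='x', indices=(nodes, nodes), 
                          lowBound=0, upBound=1, cat='Binary')
# u[i]: position of node i in the tour, used in the subtour constraints, see (6)
u = pulp.LpVariable.dicts(name='u', indices=nodes, 
                          lowBound=1, upBound=len(nodes), cat='Integer')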

# build problem
prob = pulp.LpProblem(name='TSP', 
                      sense=LpMinimize)

print('N: ' + str(len(nodes)))

N: 9


#### Add Model

$
\begin{align}
  \begin{array}{lll}
    &\textbf{Objective} & \\
    & \min \sum_{i=1,\dots,N} \sum_{j=1,\dots,N} c_{ij} \cdot x_{ij} &~~~ (1) \\
    &&\\
    &\textbf{Constraints} & \\
    & \sum_{i=1,\dots,N} x_{ij} = 1,~ \forall~ j = 1,\dots,N  &~~~ (2) \\
    & \sum_{j=1,\dots,N} x_{ij} = 1,~ \forall~ i = 1,\dots,N  &~~~ (3) \\
    & x_{ii} = 0,~ \forall~ i = 1,\dots,N  &~~~ (4) \\
    & u_{i} - u_{j} + N \cdot x_{ij} \le N -1,~ \forall~ i,j = 1,\dots,N: j \ne 1 ~\text{and}~ i \ne j  &~~~ (5) \\
    & u_{i} \in \mathbb{Z}^{+},~ \forall~ i = 1,\dots,N  &~~~ (6) \\
    & x_{ij} \in \{0,1\},~ \forall~ i,j = 1,\dots,N  &~~~ (7) \\
  \end{array}
\end{align}
$

#### Objective

In [267]:
prob = pulp.LpProblem(name='TSP', sense=LpMinimize)

# Objective (1): total distance of all selected arcs
prob += pulp.lpSum(distances[i][j] * x[i][j] for i in nodes for j in nodes), "Objective: total travel distance"


#### Constraints

In [268]:
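# re-create the problem so that this cell can be re-run without duplicating constraints
prob = pulp.LpProblem(name='TSP', sense=LpMinimize)

N = len(nodes)  # number of nodes, used in Constraints (5)

# Objective (1)
prob += pulp.lpSum(distances[i][j] * x[i][j] for i in nodes for j in nodes), "(1):Objective"

# Constraints (2): each node is entered exactly once
for j in nodes:
  prob += pulp.lpSum(x[i][j] for i in nodes) == 1, '(2):' + j

# Constraints (3): each node is left exactly once
for i in nodes:
  prob += pulp.lpSum(x[i][j] for j in nodes) == 1, '(3):' + i

# Constraints (4): no arc from a node to itself
for i in nodes:
  prob += x[i][i] == 0, '(4):' + i

# Constraints (5): subtour elimination (not applied to arcs entering the depot)
for i in nodes:
  for j in nodes:
    if j != nodes[0] and i != j:
      prob += u[i] - u[j] + N * x[i][j] <= N - 1, '(5):' + i + '>' + j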

#### Solve problem

In [269]:
# print problem
# print(prob)

# solve problem
prob.solve()

# results
print("Status:", pulp.LpStatus[prob.status])
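for v in prob.variables():
    # skip unset variables; 0.5 is a tolerance for binary values that should be 0
    if v.varValue is not None and v.varValue > 0.5:
      print(v.name, "=", v.varValue)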

Status: Optimal
u_Bahnhof = 1.0
u_Friedrichsort = 5.0
u_Garden = 9.0
u_Heikendorf = 7.0
u_Holtenau = 4.0
u_Laboe = 6.0
u_Mitte = 2.0
u_University = 3.0
u_Wellingdorf = 8.0
x_Bahnhof_Mitte = 1.0
x_Friedrichsort_Laboe = 1.0
x_Garden_Bahnhof = 1.0
x_Heikendorf_Wellingdorf = 1.0
x_Holtenau_Friedrichsort = 1.0
x_Laboe_Heikendorf = 1.0
x_Mitte_University = 1.0
x_University_Holtenau = 1.0
x_Wellingdorf_Garden = 1.0
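
As a quick plausibility check of this output, the following sketch follows the selected arcs from the depot and verifies that they form a single round trip. The arcs and `u` values are copied by hand from the solver output above; the names `selected_arcs`, `u_values`, and `extract_tour` are introduced here for illustration only.

```python
# Arcs with x = 1, copied from the solver output
selected_arcs = [
    ('Bahnhof', 'Mitte'), ('Friedrichsort', 'Laboe'), ('Garden', 'Bahnhof'),
    ('Heikendorf', 'Wellingdorf'), ('Holtenau', 'Friedrichsort'),
    ('Laboe', 'Heikendorf'), ('Mitte', 'University'),
    ('University', 'Holtenau'), ('Wellingdorf', 'Garden'),
]

# Values of u, copied from the solver output
u_values = {
    'Bahnhof': 1, 'Friedrichsort': 5, 'Garden': 9, 'Heikendorf': 7,
    'Holtenau': 4, 'Laboe': 6, 'Mitte': 2, 'University': 3, 'Wellingdorf': 8,
}

def extract_tour(arcs, start):
    """Follow the successor of each node until the start node is reached again."""
    successor = dict(arcs)
    tour = [start]
    while True:
        nxt = successor[tour[-1]]
        tour.append(nxt)
        if nxt == start:
            return tour

check_tour = extract_tour(selected_arcs, 'Bahnhof')

# A single tour visits every node once before returning to the depot
is_single_tour = len(check_tour) == len(selected_arcs) + 1

# Constraints (5) require u to increase along every arc that does not enter the depot
u_increases = all(u_values[j] >= u_values[i] + 1
                  for i, j in selected_arcs if j != 'Bahnhof')

print(check_tour)
print(is_single_tour, u_increases)
```

In this solution the `u` values increase by exactly one along the tour, so they directly give the visiting order of the nodes.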


### Graphical Visualization

In [270]:
import plotly.graph_objects as go

fig = go.Figure()

# start at the depot and follow the selected arcs until the tour is closed
tour = [nodes[0]]
while len(tour) <= len(nodes):
  for j in nodes:
    # 0.5 is a tolerance: the binary variable counts as selected if it is (close to) 1
    if x[tour[-1]][j].varValue > 0.5:
      # draw the segment from the current node to its successor j
      fig.add_trace(go.Scattermapbox(
          mode = "markers+lines",
          lon = [
              df.loc[df.name==tour[-1], 'longitude'].values[0],
              df.loc[df.name==j, 'longitude'].values[0]
          ],
          lat = [
              df.loc[df.name==tour[-1], 'latitude'].values[0],
              df.loc[df.name==j, 'latitude'].values[0]
          ],
          marker = {'size': 10},
          name='Segment ' + str(len(tour))
          ))
      tour.append(j)
      break

fig.update_layout(
    mapbox = {
        'center': {'lon': df.longitude.mean(), 'lat': df.latitude.mean()},
        'style': "open-street-map",
        'zoom': 10
        })

print('Optimal tour: ' + str(tour))

fig.show()

Optimal tour: ['Bahnhof', 'Mitte', 'University', 'Holtenau', 'Friedrichsort', 'Laboe', 'Heikendorf', 'Wellingdorf', 'Garden', 'Bahnhof']
