# Urban zone fares
In this problem, we look at how to price public transport in Copenhagen for a selected set of stations: Nørreport, Kastrup, Glostrup, Klampenborg, Herlev and Christianshavn.

## Initialization and data import
First, let's get the data in:

In [1]:
# Initialization
import xpress as xp
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Base fare, fare bounds, and bounds on the difference between fixed and mileage cost
f0 = 15
fmin = 10
fmax = 40
cdiffmin = 0
cdiffmax = 30

class Connection:
    def __init__(self, origin, destination, distance, elasticity, base_ridership):
        self.origin = origin
        self.destination = destination
        self.distance = distance
        self.elasticity = elasticity
        self.base_ridership = base_ridership
        
    def __str__(self):
        return f'{self.origin}->{self.destination}'
      
        
connections = [Connection("Norreport", "Kastrup", 7.9, -0.6, 50),
              Connection("Norreport", "Glostrup", 12.1, -0.7, 3),
              Connection("Norreport", "Klampenborg", 11.5, -0.6, 9),
              Connection("Norreport", "Herlev", 9.6, -0.8, 15),
              Connection("Norreport", "Christianshavn", 1.8, -0.9, 80),
              Connection("Kastrup", "Glostrup", 18.3, -0.7, 8),
              Connection("Kastrup", "Klampenborg", 19.1, -0.5, 9),
              Connection("Kastrup", "Herlev", 17.5, -0.9, 5),
              Connection("Kastrup", "Christianshavn", 6.2, -0.9, 60),
              Connection("Glostrup", "Klampenborg", 20.2, -0.6, 16),
              Connection("Glostrup", "Herlev", 7.5, -0.8, 26),
              Connection("Glostrup", "Christianshavn", 13.1, -0.9, 34),
              Connection("Klampenborg", "Herlev", 13.7, -0.4, 35),
              Connection("Klampenborg", "Christianshavn", 13.2, -0.9, 12),
              Connection("Herlev", "Christianshavn", 11.6, -0.9, 19)]

## Variable definition
Let's have a look at the variables we are going to need: the fare and the ridership on each connection, the fixed cost per fare and the mileage cost.

In [7]:
model = xp.problem("Urban zone fares")

f = {c : xp.var(vartype = xp.continuous, lb = fmin, ub = fmax, name = f'f_{c}') for c in connections}
cfix = xp.var(vartype = xp.continuous, lb = 0, name = 'cfix')
cmile = xp.var(vartype = xp.continuous, lb = 0, name = 'cmile')
P = {c : xp.var(vartype = xp.continuous, lb = 0, name = f'P_{c}') for c in connections}

model.addVariable(f,cfix,cmile, P)

These variables are related through a simple linear relation.

In [8]:
fare_relation = (xp.constraint(f[c] == cfix + cmile*c.distance, name = f'Fix fare price for {c}') 
                     for c in connections)
rideship_relation = (xp.constraint(P[c] == c.base_ridership * (1 + c.elasticity * ((f[c] - f0)/f0)),
                                   name = f'Fix ridership for {c}')
                     for c in connections)
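
To make the two relations concrete, here is a minimal plain-Python sketch (no solver involved) that evaluates them for a single connection. The helper names `fare` and `ridership` and the trial values for `cfix` and `cmile` are assumptions chosen for illustration; they are not part of the model above.

```python
F0 = 15  # base fare, as in the initialization cell

def fare(cfix, cmile, distance):
    """Fare = fixed cost + mileage cost * distance."""
    return cfix + cmile * distance

def ridership(base_ridership, elasticity, f, f0=F0):
    """Linear demand response around the base fare f0."""
    return base_ridership * (1 + elasticity * (f - f0) / f0)

# Norreport -> Kastrup: distance 7.9, elasticity -0.6, base ridership 50
f_trial = fare(cfix=12.0, cmile=0.5, distance=7.9)  # hypothetical prices
p_trial = ridership(50, -0.6, f_trial)
print(f_trial, p_trial)
```

At the base fare the ridership equals the base ridership; any fare above `F0` lowers it because the elasticities are negative.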

## Constraint definition
There is a minimum and maximum difference between the fixed and mileage cost:

In [9]:
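# Python evaluates a chained comparison (a <= x <= b) as two comparisons joined by `and`,
# so state the range explicitly via body/lb/ub
cdiffbound = xp.constraint(body = cfix - cmile, lb = cdiffmin, ub = cdiffmax,
                           name = 'Difference in mileage and fixed price')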
model.addConstraint(fare_relation, rideship_relation, cdiffbound)
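
Range constraints deserve some care in Python, because a chained comparison `a <= x <= b` is evaluated as `(a <= x) and (x <= b)`. The sketch below shows the effect with a toy class; `Expr` is a hypothetical stand-in and not the xpress API, and whether a given modelling library guards against this is library-specific.

```python
class Expr:
    """Toy stand-in for a solver expression (hypothetical, not the xpress API)."""

    def __le__(self, other):
        return ("expr <=", other)

    def __ge__(self, other):
        return ("expr >=", other)

e = Expr()

# `0 <= e` falls back to e.__ge__(0), which returns a truthy tuple,
# so `and` discards it and returns the result of `e <= 30`.
chained = 0 <= e <= 30
print(chained)
```

Only the upper comparison survives in `chained`, which is why stating the lower and upper bound explicitly is the safer way to write a range.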

## The objective function
The difficult part of the example is the formulation of the objective function, a task that is often neglected. Clever objective function modelling can often result in a much nicer and easier problem to handle.

First, we begin by stating that the overall revenue is given by ridership $P_{ij}$ times fare $f_{ij}$, i.e.
\begin{equation}
R = \sum \limits_{i} \sum \limits_{j>i} P_{ij}f_{ij}
\end{equation}

However, how is ridership established? The notes give a linear demand response around the base fare $f_{ij}^0$, with base ridership $P_{ij}^0$ and elasticity $e_{ij}$:
\begin{equation}
P_{ij} = P_{ij}^0 \left(1+ e_{ij}\frac{f_{ij} - f_{ij}^0}{f_{ij}^0}\right)
\end{equation}

Although we could probably put this directly into the solver, it is worth the effort to do some algebra here:
\begin{equation}
\begin{aligned}
P_{ij} &= P_{ij}^0 \left(\frac{f_{ij}^0 + e_{ij}f_{ij} - e_{ij}f_{ij}^0}{f_{ij}^0}\right) \\
P_{ij} &= \frac{P_{ij}^0}{f_{ij}^0}\left(e_{ij}f_{ij} + (1-e_{ij})f_{ij}^0\right) \\
P_{ij} &= \frac{e_{ij}P_{ij}^0}{f_{ij}^0}\left(f_{ij} + \frac{1-e_{ij}}{e_{ij}} f_{ij}^0\right)
\end{aligned}
\end{equation}

### Why should you do this?
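This little bit of algebra makes concavity visible. If I give you $f_{ij}P_{ij}^0 \left(1+ e_{ij}\frac{f_{ij} - f_{ij}^0}{f_{ij}^0}\right)$, do you immediately know that it is concave? In the last form, the revenue of a route is $f_{ij}P_{ij} = \frac{e_{ij}P_{ij}^0}{f_{ij}^0}\left(f_{ij}^2 + \frac{1-e_{ij}}{e_{ij}} f_{ij}^0 f_{ij}\right)$, a quadratic in $f_{ij}$ whose leading coefficient is negative because $e_{ij} < 0$ while $P_{ij}^0, f_{ij}^0 > 0$. The total revenue $R$ is a sum of such concave terms and is therefore concave. Since maximizing $R$ is equivalent to minimizing $-R$, which is convex, and all constraints are linear, this is a convex quadratic programming problem.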
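
The concavity argument can be checked numerically for a single route. The following is a minimal sketch in plain Python; the function names are assumptions for illustration, and the formulas for the curvature and the vertex follow from differentiating the quadratic revenue expression with respect to the fare.

```python
F0 = 15  # base fare

def route_revenue(f, p0, e, f0=F0):
    """Revenue of one route: fare times the linear ridership response."""
    return f * p0 * (1 + e * (f - f0) / f0)

def curvature(p0, e, f0=F0):
    """Second derivative of route_revenue in f; constant for a quadratic."""
    return 2 * e * p0 / f0

def unconstrained_best_fare(e, f0=F0):
    """Fare where the derivative of route_revenue vanishes (vertex of the parabola)."""
    return (e - 1) * f0 / (2 * e)

# Norreport -> Kastrup: base ridership 50, elasticity -0.6
print(curvature(50, -0.6))
f_star = unconstrained_best_fare(-0.6)
print(f_star, route_revenue(f_star, 50, -0.6))
```

The curvature is negative for every route with negative elasticity. Note that the vertex is the best fare for that route in isolation; in the model the fares are tied together through `cfix` and `cmile`, so individual routes generally end up away from their own vertex.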

> However, in the model itself we keep $P_{ij}$ and $f_{ij}$ as separate variables and simply write the objective as their product:

In [10]:
model.setObjective(xp.Sum(P[c]*f[c] for c in connections), sense = xp.maximize)

## Solution and post-processing

In [11]:
model.solve()
print(f'Solution status: {model.getProbStatusString()}')

Solution status: lp_optimal


### Let's look at the money

In [22]:
print(f'Total revenue: {np.round(model.getObjVal() / 1000,2)} M DKK per day')
print (f'Current revenue: {np.round(sum(c.base_ridership * f0 for c in connections) / 1000,2)} M DKK per day')

Total revenue: 5.84 M DKK per day
Current revenue: 5.72 M DKK per day
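
The current-revenue figure is easy to reproduce by hand, which is a useful sanity check on the data. The sketch below uses only the base ridership values from the connection list and the base fare; the variable names are assumptions for illustration.

```python
F0 = 15  # base fare

# Base ridership of the 15 connections, in the order they are defined above
base_ridership = [50, 3, 9, 15, 80, 8, 9, 5, 60, 16, 26, 34, 35, 12, 19]

# Every route currently charges the base fare
baseline = sum(p * F0 for p in base_ridership)
print(baseline / 1000)

# Relative gain, using the rounded totals printed above
relative_gain = (5.84 - 5.72) / 5.72
print(f"{100 * relative_gain:.1f}%")
```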


This means we earn about 120'000 DKK more per day with the new fares, an increase of roughly 2%. Ok, great. So now let's dig into the data itself. It may be interesting to look at the revenue generated by each route:

In [24]:
for c in connections:
    print(f'{c}: {np.round(model.getSolution(f[c])*model.getSolution(P[c]),2)} DKK')

Norreport->Kastrup: 783.71 DKK
Norreport->Glostrup: 46.44 DKK
Norreport->Klampenborg: 142.48 DKK
Norreport->Herlev: 227.48 DKK
Norreport->Christianshavn: 1203.33 DKK
Kastrup->Glostrup: 123.28 DKK
Kastrup->Klampenborg: 149.41 DKK
Kastrup->Herlev: 71.65 DKK
Kastrup->Christianshavn: 899.35 DKK
Glostrup->Klampenborg: 255.99 DKK
Glostrup->Herlev: 394.83 DKK
Glostrup->Christianshavn: 499.0 DKK
Klampenborg->Herlev: 586.11 DKK
Klampenborg->Christianshavn: 176.04 DKK
Herlev->Christianshavn: 280.6 DKK


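We see that the largest contributions come from Norreport to Christianshavn, Kastrup to Christianshavn and Norreport to Kastrup. Norreport to Glostrup, on the other hand, contributes very little. Note that these values are in the same units as the objective: they add up to the total above, so if that total is in M DKK after dividing by 1000, the per-route figures are in thousands of DKK.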

Lastly, we want to look at the change in ridership along the routes. Do some get busier?

In [27]:
for c in connections:
    print(f'{c}: {np.round(100* ((model.getSolution(P[c]) / c.base_ridership)-1),2)}%')

Norreport->Kastrup: -8.59%
Norreport->Glostrup: -14.37%
Norreport->Klampenborg: -11.78%
Norreport->Herlev: -13.46%
Norreport->Christianshavn: -4.75%
Kastrup->Glostrup: -20.79%
Kastrup->Klampenborg: -15.44%
Kastrup->Herlev: -25.66%
Kastrup->Christianshavn: -10.61%
Glostrup->Klampenborg: -19.51%
Glostrup->Herlev: -10.97%
Glostrup->Christianshavn: -19.8%
Klampenborg->Herlev: -9.16%
Klampenborg->Christianshavn: -19.94%
Herlev->Christianshavn: -17.81%


Ridership drops on every route, by up to about 26% (Kastrup to Herlev). Interestingly enough, the three biggest cash cows, i.e. the routes where we get the most revenue from, are all among the four routes with the smallest drop.
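
The fares themselves are not printed above, but they can be recovered approximately from the ridership changes by inverting the demand relation. The sketch below does this for two routes and then solves for the fixed and mileage cost. The helper name `fare_from_change` is an assumption for illustration, and the results are only approximate because the inputs are the rounded percentages from the output.

```python
F0 = 15  # base fare

def fare_from_change(pct_change, elasticity, f0=F0):
    """Invert P/P0 - 1 = e * (f - f0) / f0 for the fare f."""
    return f0 * (1 + (pct_change / 100) / elasticity)

# Norreport -> Christianshavn: distance 1.8, elasticity -0.9, change -4.75%
f_short = fare_from_change(-4.75, -0.9)
# Norreport -> Kastrup: distance 7.9, elasticity -0.6, change -8.59%
f_long = fare_from_change(-8.59, -0.6)

# Two fares on the line f = cfix + cmile * distance determine both prices
cmile_est = (f_long - f_short) / (7.9 - 1.8)
cfix_est = f_short - cmile_est * 1.8
print(round(cfix_est, 2), round(cmile_est, 3))
```

Both fares come out above the base fare of 15, which is consistent with ridership falling on every route.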