# Supply chain example
One day, your boss Anne calls you over and asks you:
> Our warehouses in Copenhagen and Hamburg are consistently running low on stock even though we are producing more than enough! Can’t we do better?

She also tells you that the freight cost is $90 \frac{\text{€}}{\text{L} \cdot 100\text{ km}}$, the freight loading cost is $25 \frac{\text{€}}{\text{L}}$ and you are handed the following information:
- Odense can supply 350 L
- Aarhus can supply 600 L
- Copenhagen requires 325 L
- Hamburg requires 275 L

## Extracting the actual problem
What do you do at this point? Since this is an example problem, the answer is rather on the nose, but it is not far off from what you would encounter in real life. First, we have to define what "better" means. Since we are given pricing information, it is reasonable to assume that we want to minimize our cost (the most common objective in mathematical programming). The second question is much trickier: who is "we"? This boils down to what degrees of freedom we have. Do we have full control over the shipping? Do we have to hire third-party vendors that require certain volumes to be met? The answers heavily influence what model you are going to build. However, as this is a simple example case, we can narrow our scope down to the following formulation:
> We want to find out how much we have to ship from Odense/Aarhus to Copenhagen/Hamburg such that we minimize cost subject to the market demand, supply capacities and physical constraints.

In a more mathematical way, we want to solve:
\begin{equation}
\begin{array}{ll}
\text{minimize} & \text{Shipping costs} \\
\text{subject to} & \text{Market demand} \\
& \text{Supply capacities}
\end{array}
\end{equation}

## Getting started
Ok, now that we have narrowed down what we want to solve, let's set up the problem and get the data into Python:

In [1]:
import numpy as np
import xpress as xp
from dataclasses import dataclass
from geopy.distance import geodesic

# Generate the model
model = xp.problem("Supply chain example")

# The location data
@dataclass()
class Location:
    name: str
    max_quantity: float # Positive for supplier, negative for consumer
    coordinate: tuple
        
    def get_distance(self, other_location) -> float:
        """Calculates the distance in units of 100km"""
        return geodesic(self.coordinate, other_location.coordinate).kilometers / 100

locations = [
    Location("Odense", 350, (55.396229, 10.390600)),
    Location("Aarhus", 600, (56.158150, 10.212030)),
    Location("Copenhagen", -325, (55.676098, 12.568337)),
    Location("Hamburg", -275, (53.553841, 9.991650))
]

# Extract origins and destinations
origins = [loc for loc in locations if loc.max_quantity > 0]
destinations = [loc for loc in locations if loc.max_quantity < 0]

# The cost data
distance_cost = 90  # € per L per 100km
loading_cost = 25  # € per L

A few notes before we continue:
1. The use of a `dataclass` may seem like overkill, but in my opinion it makes this code very easy to read. It also really highlights the power of Python, where we can define classes and do geodesic calculations in 15 lines of code. This is awesome!
2. The distances are "as the Nazgul flies". This is of course unrealistic, but digging deeper into this does not add more value to this example. However, feel free to e.g. use the Google Maps API for that (see e.g. [here](https://matthewkudija.com/blog/2018/11/19/google-maps-api/)).
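
To make the straight-line distances a bit more tangible, here is a minimal sketch that needs only the standard library. Note that it swaps in the haversine formula on a spherical Earth, which is *not* what `geodesic` from `geopy` computes (that one uses an ellipsoidal model), so the numbers will differ slightly. The function name `haversine_100km` and the radius constant are my own assumptions for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def haversine_100km(a, b):
    """Great-circle distance between two (lat, lon) pairs in units of 100 km."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h)) / 100


# Coordinates copied from the location data above
odense = (55.396229, 10.390600)
copenhagen = (55.676098, 12.568337)

print(f"Odense -> Copenhagen: {haversine_100km(odense, copenhagen):.2f} x 100 km")
```

For the short distances in this example the spherical approximation should be close enough to sanity-check the geodesic values, but I would not use it as a drop-in replacement without comparing the two.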

## Solving the problem
Now that we are all caught up, let's start solving the problem. To do that, we need to take our verbal formulation from above and translate it into mathematics. The first step is *defining the variables*:

### Defining the variables
We are interested in the shipping quantities, i.e. how much is transported from `origins` to `destinations`. Therefore, these are our variables. In fact, variables are always the things we *don't know* and want to find out, in other words our degrees of freedom. For our specific example, we define:
\begin{equation}
q_{o,d} \in [0,S_o]
\end{equation}
where $q_{o,d}$ is the amount of product shipped from origin $o$ to destination $d$ and $S_o$ is the supply available at origin $o$. The bounds of the variable are given by the fact that we cannot ship negative amounts, and that we can only ship as much from a given origin as that one has to offer.

In [56]:
# First we define the connections that are featured in the variable
class Connection:    
    def __init__(self, origin: Location, destination: Location):
        self.origin = origin
        self.destination = destination
        self.distance = origin.get_distance(destination)
        
# Generate all the connections
connections = [Connection(origin, destination) for origin in origins 
               for destination in destinations]

# Generate the variables and add to the problem
quantities = {conn : xp.var(vartype = xp.continuous, lb = 0, 
                            ub = conn.origin.max_quantity, 
                            name = f'q_({conn.origin.name},{conn.destination.name})') 
              for conn in connections}
model.addVariable(quantities)

A few notes here before we continue:
- You should *always* write your variables as dictionaries with the indices as keys and the variable objects as values.
- While it is possible to have tuples as the indices directly, in my experience it is *always* a good idea to create a custom class for the index combination needed and then use that. It makes the code much more versatile.
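
As a small illustration of the second note, here is one possible way to build such an index class, independent of the solver. This is a sketch and not the `Connection` class used above: the name `Route` is hypothetical, and I use a frozen `dataclass` so that two indices with the same fields are equal and hash to the same dictionary entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class Route:
    """Hypothetical index object for one origin/destination pair."""
    origin: str
    destination: str


supply = {"Odense": 350, "Aarhus": 600}

# A dictionary keyed by the index object, mirroring a dictionary of variables.
# Here the values are simply the upper bounds S_o.
upper_bounds = {
    Route(o, d): supply[o]
    for o in supply
    for d in ("Copenhagen", "Hamburg")
}

# An equal Route created elsewhere finds the same entry
print(upper_bounds[Route("Aarhus", "Hamburg")])  # 600
```

The advantage over a bare tuple is that the fields are named, and that you can later add attributes or methods (such as a distance) without touching the code that uses the index.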

### Defining the constraints
Now that we have the variables in place, we can tackle the different constraints:
#### Market demand
To fulfill the market demand, the amount shipped has to be at least as great as the demand at the consumer site. Mathematically, we can express this as:
\begin{equation}
\sum \limits_o q_{o,d} \geq D_d, \hspace{0.15cm} \forall d
\end{equation}
where $D_d$ is the demand at destination $d$.

#### Supply capacities
However, we only have a given amount available from the suppliers, and we have to ensure that the amount shipped does not exceed that. Mathematically, we can express this as:
\begin{equation}
\sum \limits_d q_{o,d} \leq S_o, \hspace{0.15cm} \forall o
\end{equation}
where $S_o$ is the supply available at origin $o$.
> This in fact makes the bound we defined for each individual shipping quantity redundant, as this constraint is *tighter*. However, it is still good to have bounds on your variables, even if they are not used.


In [57]:
MarketDemand = [xp.constraint(xp.Sum(quantities[conn] 
                                     for conn in connections 
                                     if conn.destination == dest) 
                              >= -dest.max_quantity, 
                              name = f'Market demand for {dest.name}') 
                for dest in destinations]
SupplyCapacities = [xp.constraint(xp.Sum(quantities[conn] 
                                         for conn in connections 
                                         if conn.origin == origin) 
                                  <= origin.max_quantity, 
                              name = f'Supply capacity for {origin.name}') 
                    for origin in origins]

model.addConstraint(MarketDemand, SupplyCapacities)

A few notes before we continue:
- Remember that we defined the demand at the destinations as negative quantities, which is why the right-hand side is `-dest.max_quantity`. If you forget about this, the constraint will be useless since it will say that the sum of non-negative variables has to be greater than a negative number.
- We *always* name our constraints.
- We separate variable creation from constraint creation as much as possible to keep the code clean and modular.
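
Before moving on, it can be worth checking that the demand and supply constraints can hold at the same time. Since every origin can ship to every destination here, this is the case exactly when the total supply covers the total demand. The following is a minimal sketch of that check with the numbers from the problem statement; the variable names are mine.

```python
supply = {"Odense": 350, "Aarhus": 600}       # L available per origin
demand = {"Copenhagen": 325, "Hamburg": 275}  # L required per destination

total_supply = sum(supply.values())
total_demand = sum(demand.values())
surplus = total_supply - total_demand

# A negative surplus would mean the model is infeasible
print(f"supply {total_supply} L, demand {total_demand} L, surplus {surplus} L")
# supply 950 L, demand 600 L, surplus 350 L
```

A check like this is cheap and, if it fails, gives a much clearer message than an "infeasible" status from the solver.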

### The objective function and solution
Before solving the problem, we still have to define the objective function. In our case this is pretty easy: we simply want to minimize the cost:
\begin{equation}
\sum \limits_{o,d} (c_l + c_d\delta_{o,d})q_{o,d}
\end{equation}
where $c_l$ is the loading cost, $c_d$ is the distance cost and $\delta_{o,d}$ is the distance between $o$ and $d$.
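
To see how the objective is put together, here is a short sketch in plain Python that evaluates the formula above for given quantities. The function names and the round-number distances are assumptions made for illustration only; they are not the geodesic distances of the example.

```python
LOADING_COST = 25   # c_l, € per L
DISTANCE_COST = 90  # c_d, € per L per 100 km


def unit_cost(distance_100km):
    """Cost of shipping one litre over the given distance (in 100 km)."""
    return LOADING_COST + DISTANCE_COST * distance_100km


def total_cost(shipments):
    """Sum of unit cost times quantity over (distance_100km, litres) pairs."""
    return sum(unit_cost(dist) * litres for dist, litres in shipments)


# Two hypothetical routes: 100 L over 100 km and 50 L over 200 km
print(total_cost([(1.0, 100), (2.0, 50)]))  # 21750.0
```

Note that the loading cost is paid per litre regardless of the distance, so it only shifts the total as long as the total shipped amount is fixed.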

In [58]:
model.setObjective(xp.Sum((loading_cost + distance_cost*conn.distance) * 
                          quantities[conn] for conn in connections))

Finally, we are now ready to solve the problem. So let's hit `solve`!

In [62]:
model.solve()

# Print how it ended
print(f'Solver status: {model.getProbStatusString()}')

Solver status: lp_optimal


## Analysing the result
Great, we now have solved the problem. So what do we get out? Basically, we get values for our degrees of freedom, i.e. our variables $q_{o,d}$:

In [67]:
print('Shipping quantities:')
for conn in connections:
    print(f'{conn.origin.name} -> {conn.destination.name}: {model.getSolution(quantities[conn])} L')

Shipping quantities:
Odense -> Copenhagen: 75.0 L
Odense -> Hamburg: 275.0 L
Aarhus -> Copenhagen: 250.0 L
Aarhus -> Hamburg: 0.0 L
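
As a quick sanity check, we can verify by hand that the quantities printed above respect both constraint families. This is a minimal sketch in plain Python with the reported values typed in; it does not query the solver.

```python
# Quantities as reported by the solver above, in L
shipped = {
    ("Odense", "Copenhagen"): 75.0,
    ("Odense", "Hamburg"): 275.0,
    ("Aarhus", "Copenhagen"): 250.0,
    ("Aarhus", "Hamburg"): 0.0,
}
supply = {"Odense": 350, "Aarhus": 600}
demand = {"Copenhagen": 325, "Hamburg": 275}

# Total amount arriving at each destination and leaving each origin
received = {d: sum(q for (_, dest), q in shipped.items() if dest == d)
            for d in demand}
sent = {o: sum(q for (orig, _), q in shipped.items() if orig == o)
        for o in supply}

assert all(received[d] >= demand[d] for d in demand)  # market demand
assert all(sent[o] <= supply[o] for o in supply)      # supply capacities

print(received)  # {'Copenhagen': 325.0, 'Hamburg': 275.0}
print(sent)      # {'Odense': 350.0, 'Aarhus': 250.0}
```

Both demands are met exactly, Odense is used to its full capacity, and Aarhus keeps 350 L in reserve.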


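This result makes intuitive sense: the longest and therefore most expensive route, Aarhus to Hamburg, is not used at all. Hamburg is supplied entirely from Odense, and Copenhagen receives the remaining 75 L from Odense plus 250 L from Aarhus. We also get the value of the objective function: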

In [70]:
print(f'Shipping costs: {model.getObjVal()} €')

Shipping costs: 110954.27135156743 €


A note of caution here: this kind of floating-point output, with more than ten digits after the decimal point, always comes up in continuous problems. Of course these are not [*significant* digits](https://en.wikipedia.org/wiki/Significant_figures), as neither the uncertainty of the data nor the simplifications of the model warrant that precision. Although this may seem obvious to you, it is not obvious to everybody, so I suggest that you only ever report a rounded version of the result. In this case, for example, I would probably report (e.g. in an interface):

In [75]:
print(f'Shipping costs: {round(model.getObjVal() / 1000, 1)} k€')

Shipping costs: 111.0 k€


Why did I choose this number and not another? No real reason, other than that it seemed common sense that we can probably say something down to 100 € in accuracy. But for each problem you encounter, you should always think about this and treat your output accordingly.
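
If you prefer to think in significant digits rather than in a fixed step such as 100 €, a small helper along the following lines can do the rounding. This is only a sketch: `round_sig` is a hypothetical name, and how many digits are justified remains a judgment call for each problem.

```python
import math


def round_sig(value, sig=3):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 5 for 110954.27
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


cost = 110954.27135156743  # objective value reported above, in €
print(round_sig(cost, 4))  # 111000.0
```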

But is this all we can get from the solution? Actually, we can get two more pieces of information that are often relevant:
- *The value of the dual variables:* We will touch upon this a little in the nonlinear part of this course, but dual variables effectively represent the derivative of the optimal objective value with respect to the right-hand sides of the constraints, i.e. they represent how "expensive" it is to have a certain constraint and how much could be gained by relaxing it. It is an extremely important and very deep concept, so feel free to go crazy with [this note](https://sites.math.washington.edu/~rtr/papers/rtr054-ConjugateDuality.pdf) and Chapter 5 in [this excellent book](https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf).
- *The log information:* Sometimes it is very useful to know *how* we arrived at a certain solution. Was the problem symmetric? Were there numerical difficulties? Those and other things are typically only reported in the log file and give a lot of insight into whether the solution is what we were looking for and how to improve on it. Also, most users would like some form of update on how their optimization progresses (unless it typically solves in less than 10 seconds), so you will most likely have to deal with the log from that standpoint (we will touch upon how to do this with callbacks in a later exercise).
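
To make the remark about dual variables a bit more concrete for our example, here is a rough sketch of the idea in the notation used above. The symbols $z^*$, $\lambda_d$ and $\mu_o$ are introduced here for illustration and do not appear in the model code. Let $z^*$ be the optimal shipping cost, viewed as a function of the demands $D_d$ and the supplies $S_o$. Wherever the derivatives exist, the dual variables of the market demand and supply capacity constraints can be read as:
\begin{equation}
\lambda_d = \frac{\partial z^*}{\partial D_d}, \hspace{0.15cm} \forall d
\qquad \text{and} \qquad
\mu_o = \frac{\partial z^*}{\partial S_o}, \hspace{0.15cm} \forall o
\end{equation}
In words, $\lambda_d$ is roughly the extra cost of having to deliver one more litre to destination $d$, and $\mu_o$ is roughly the change in cost if origin $o$ could supply one more litre. For a supply constraint that is not binding, the corresponding $\mu_o$ is zero, since extra capacity there would not change the solution. The sign of the reported values depends on the convention of the solver, so it is worth checking it on a small example like this one.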