### Transport Tutorial - Notebook 1

# Solve *Dantzig's Transport Problem* using **ixmp4** and **linopy**

## Aim and scope of this tutorial

This tutorial takes you through the steps to solve a simple optimization model
using the **ixmp4** database management package and the **linopy** optimization package.

The example is **Dantzig's transport problem**, which also serves as a [tutorial for linopy](https://linopy.readthedocs.io/en/latest/transport-tutorial.html).
The problem seeks a least-cost shipping schedule that meets demand at several markets (cities)
without exceeding the supply available at the factories.

For a reference on the transport problem, see:
> Dantzig, G B, Chapter 3.3. In Linear Programming and Extensions.  
> Princeton University Press, Princeton, New Jersey, 1963.

## Tutorial outline

This tutorial consists of three Jupyter notebooks:

0. Set up an **ixmp4.Platform** to store the scenario input data and solution
1. Implement the **baseline version of the transport problem** and solve it
2. Create an **alternative scenario** and solve it 

<div class="alert alert-info">

This notebook requires that you have set up a database and defined the units as shown in [**Notebook 0**](0_transport-tutorial_platform-setup.ipynb).

</div>

## The platform as a connection to the database

An [**ixmp4.Platform**](https://docs.ece.iiasa.ac.at/projects/ixmp4/en/latest/devs/ixmp4.core/platform.html#ixmp4.core.platform.Platform)
is the connection to a database instance that can hold scenario data and relevant additional information.

In [None]:
import ixmp4

In [None]:
platform = ixmp4.Platform("transport-tutorial")

An [**ixmp4.Run**](https://docs.ece.iiasa.ac.at/projects/ixmp4/en/latest/devs/ixmp4.core/run.html#ixmp4.core.run.Run)
is an object that holds all relevant information for one quantification of a "scenario".  
A run is identified by a *model name*, a *scenario name* and an automatically assigned *version number*.

As a first step to solve the **transport problem**, we create a new run.

In [None]:
run = platform.runs.create(model="transport problem", scenario="standard")

## Defining the structure of the optimization problem

### The IndexSets

An **IndexSet** defines a list of elements with a name. These sets can be used for "indexed assignment" of parameters, variables and equations. 
The entries of these parameters, etc. are then validated against the elements of the linked set. 
In database terms, a column of a parameter, etc. can be "foreign-keyed" onto a set.
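
To make the idea of validating entries against a set concrete, here is a minimal plain-Python sketch. The helper `check_against_indexset` is hypothetical and written only for illustration; it is not part of **ixmp4** and does not show how the package implements the check.

```python
# Toy illustration only; this is not how ixmp4 implements validation.
def check_against_indexset(column_name, entries, indexset_elements):
    """Raise ValueError if any entry is not an element of the linked set."""
    allowed = set(indexset_elements)
    unknown = [entry for entry in entries if entry not in allowed]
    if unknown:
        raise ValueError(
            f"Column '{column_name}' has entries not in the set: {unknown}"
        )


i_elements = ["seattle", "san-diego"]

# Every entry is an element of the set, so this passes silently.
check_against_indexset("i", ["seattle", "san-diego", "seattle"], i_elements)

# "portland" is not an element of the set, so this call is rejected.
try:
    check_against_indexset("i", ["seattle", "portland"], i_elements)
    rejected = False
except ValueError:
    rejected = True
```

The point of the "foreign key" is that a typo in a plant or market name surfaces when the data is added rather than when the model is solved.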

Below, we first show the data as they would be written in the linopy tutorial.

We now initialize these sets and add the data.

In [None]:
i = run.optimization.indexsets.create("i")
i.add(["seattle", "san-diego"])

We can display the elements of any **IndexSet** as a Python list:

In [None]:
i.data

An **IndexSet** can have a docstring as a means for documentation.

In [None]:
i.docs = "Canning Plants"

For simplicity, the steps of creating an **IndexSet** and assigning elements can be done in one line.

We illustrate this for the second index-set of the transport problem.

In [None]:
run.optimization.indexsets.create("j").add(["new-york", "chicago", "topeka"])

To add the docstring, we now have to explicitly get the index-set *j* and add the documentation.

In [None]:
run.optimization.indexsets.get("j").docs = "Markets"

### Parameters of the optimization problem

A **Parameter** is a table with a number of index columns (each constrained to an **IndexSet**) as well as *units* and *values* columns.

As a next step to solving the transport problem, we define the parameters *capacity* and *demand*.
The parameters are assigned on the indexsets *i* and *j*, respectively.

The parameter data can be assigned as a dictionary.

In [None]:
a = run.optimization.parameters.create(name="a", constrained_to_indexsets=["i"])
a.docs = "Capacity of plant i"

a_data = {
    "i": ["seattle", "san-diego"],
    "values": [350, 600],
    "units": ["cases", "cases"],
}
a.add(data=a_data)

Alternatively, the parameter data can be passed as a **pandas.DataFrame**.

In [None]:
import pandas as pd


b = run.optimization.parameters.create("b", constrained_to_indexsets=["j"])
b.docs = "Demand at market j"

b_data = pd.DataFrame(
    [
        ["new-york", 325, "cases"],
        ["chicago", 300, "cases"],
        ["topeka", 275, "cases"],
    ],
    columns=["j", "values", "units"],
)
b.add(b_data)

Notice how the data has three columns but has only been linked to one **IndexSet**? That's on purpose: every **Parameter** needs to have the columns *values* and *units*. The values can be any numbers, but the units have to be defined a priori in the **ixmp4.Platform**.

Here's how to access `parameter.data` to e.g. quickly confirm that *b* is set correctly:

In [None]:
b.data

We now turn to a multi-dimensional parameter...

It is possible to add data to a parameter in several steps...

Here, we first define the parameter *d* and add four datapoints.

In [None]:
d = run.optimization.parameters.create("d", constrained_to_indexsets=["i", "j"])
d.docs = "Distance between cities"

d_data = {
    "i": ["seattle", "seattle", "seattle", "san-diego"],
    "j": ["new-york", "chicago", "topeka", "new-york"],
    "values": [2.5, 1.7, 1.8, 2.5],
    "units": ["km", "km", "km", "km"],
}
d.add(d_data)

Now, we add the two other datapoints. This step-by-step manipulation of parameter data can be helpful in large models.

In [None]:
d.add({"i": ["san-diego"], "j": ["chicago"], "values": [1.8], "units": ["km"]})
d.add({"i": ["san-diego"], "j": ["topeka"], "values": [1.4], "units": ["km"]})

<div class="alert alert-warning">

Every time you add data, **all** columns of the parameter must be present!

</div>
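
As a rough illustration of this rule, the sketch below lists which columns a data dictionary would be missing for a parameter linked to given index-sets. The function `missing_columns` is a hypothetical helper for this explanation only, not an **ixmp4** call.

```python
# Toy model of the "all columns must be present" rule; the helper name and
# logic are illustrative assumptions, not the ixmp4 implementation.
def missing_columns(data, indexsets):
    """Return the required columns that are absent from `data`, sorted by name."""
    required = set(indexsets) | {"values", "units"}
    return sorted(required - set(data))


complete = {"i": ["san-diego"], "j": ["chicago"], "values": [1.8], "units": ["km"]}
incomplete = {"i": ["san-diego"], "j": ["chicago"], "values": [1.8]}

print(missing_columns(complete, ["i", "j"]))    # []
print(missing_columns(incomplete, ["i", "j"]))  # ['units']
```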

### Scalars

Another type of input data for optimization problems is a **Scalar**. A scalar is not linked to an **IndexSet**, but consists of only a value and a unit (and a docstring).

In [None]:
f = run.optimization.scalars.create(name="f", value=90, unit="USD/km")
f.docs = "Freight"

### Defining the solution structure

The solution of an optimization problem is a list of **Variable** and **Equation** objects.

We first define the variables and equations in the **ixmp4.Run**. The values of the solution (level and marginal, mathematically speaking) are read from the **linopy** output after solving the problem.

Here, *supply* can only come from the factories in `IndexSet` *i*, while *demand* needs to be met at the markets in `IndexSet` *j*.

Shipment happens from a factory to a market, so *x* needs to be assigned to both *i* and *j*.

In [None]:
x = run.optimization.variables.create("x", constrained_to_indexsets=["i", "j"])
z = run.optimization.variables.create("z")

supply = run.optimization.equations.create("supply", constrained_to_indexsets=["i"])
demand = run.optimization.equations.create("demand", constrained_to_indexsets=["j"])
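
Written out, the items defined so far correspond to the following linear program. This is a sketch under one assumption: $c_{ij}$ denotes the cost of shipping one case from plant $i$ to market $j$, derived from the freight rate $f$ and the distance $d_{ij}$; the exact scaling is set in the model script and is not stated in this notebook.

$$
\begin{aligned}
\min_{x \ge 0} \quad & z = \sum_{i} \sum_{j} c_{ij} \, x_{ij} \\
\text{s.t.} \quad & \sum_{j} x_{ij} \le a_i \quad \forall i \qquad \text{(supply)} \\
& \sum_{i} x_{ij} \ge b_j \quad \forall j \qquad \text{(demand)}
\end{aligned}
$$

The variable *x* is indexed over both *i* and *j*, *z* is a single number, and each equation is indexed over the set that remains after summing: *supply* over *i* and *demand* over *j*.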

## Solve the scenario

In this tutorial, we solve the problem using the open-source solver *highs* in **linopy**. 

The ``create_dantzig_model()`` function is a convenience shortcut to retrieve the data from the **ixmp4.Run**
and set up a linopy model correctly to solve the transport problem. Please see ``dantzig_model_linopy.py`` for details.

After solving, linopy stores the solution with the model object automatically.
The function ``read_dantzig_solution()`` reads the solution and stores it in the respective **Variable** and **Equation** objects of the **ixmp4.Run**.

In [None]:
from tutorial.transport.dantzig_model_linopy import (
    create_dantzig_model,
    read_dantzig_solution,
)


linopy_model = create_dantzig_model(run=run)
linopy_model.solve("highs")
read_dantzig_solution(model=linopy_model, run=run)

## Display and analyze the results

We can now retrieve and display the components of the solution.

First, the variable *z* is the total cost of satisfying the demand at all markets.

In [None]:
z.levels

The variable *x* shows the least-cost (optimal) shipment from plants to markets.

In [None]:
pd.DataFrame(x.data)
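
To sanity-check a shipping plan by hand, the plain-Python sketch below tests feasibility and computes the total cost for the input data of this notebook. Two things are assumptions rather than facts from the notebook: the cost per case is taken as `f * d / 1000`, the scaling used in the classic formulation of this problem, and `plan` is a hand-written candidate, not output read from the run. The solver may return a different plan with the same cost.

```python
# Input data as entered in this notebook.
a = {"seattle": 350, "san-diego": 600}                # capacity in cases
b = {"new-york": 325, "chicago": 300, "topeka": 275}  # demand in cases
d = {
    ("seattle", "new-york"): 2.5,
    ("seattle", "chicago"): 1.7,
    ("seattle", "topeka"): 1.8,
    ("san-diego", "new-york"): 2.5,
    ("san-diego", "chicago"): 1.8,
    ("san-diego", "topeka"): 1.4,
}
f = 90

# ASSUMPTION: cost per case is f * d / 1000, as in the classic formulation.
c = {route: f * distance / 1000 for route, distance in d.items()}

# ASSUMPTION: hand-written candidate plan, not read from the ixmp4 run.
plan = {
    ("seattle", "new-york"): 50,
    ("seattle", "chicago"): 300,
    ("san-diego", "new-york"): 275,
    ("san-diego", "topeka"): 275,
}

# Total shipped from each plant and received at each market.
shipped_from = {i: sum(q for (src, _), q in plan.items() if src == i) for i in a}
received_at = {j: sum(q for (_, dst), q in plan.items() if dst == j) for j in b}

supply_ok = all(shipped_from[i] <= a[i] for i in a)
demand_ok = all(received_at[j] >= b[j] for j in b)
total_cost = sum(c[route] * q for route, q in plan.items())

print(supply_ok, demand_ok, round(total_cost, 3))  # True True 153.675
```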

The levels and marginals of the **Equation** objects show the shipped quantities and shadow prices ("dual variables") of the least-cost solution.

In [None]:
demand.data

In [None]:
supply.data
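
One way to read the shadow prices is through linear-programming duality: a set of prices is consistent with optimality if no route is "underpriced" and the value of the demand at those prices equals the total cost of a feasible plan. The sketch below checks this by hand. The prices are hand-derived candidates and the cost scaling `f * d / 1000` is the same assumption as in the classic formulation; neither is read from the run, and the marginals reported by a solver may follow a different sign convention.

```python
a = {"seattle": 350, "san-diego": 600}                # capacity in cases
b = {"new-york": 325, "chicago": 300, "topeka": 275}  # demand in cases
d = {
    ("seattle", "new-york"): 2.5,
    ("seattle", "chicago"): 1.7,
    ("seattle", "topeka"): 1.8,
    ("san-diego", "new-york"): 2.5,
    ("san-diego", "chicago"): 1.8,
    ("san-diego", "topeka"): 1.4,
}
f = 90
c = {route: f * dist / 1000 for route, dist in d.items()}  # ASSUMED scaling

# ASSUMPTION: hand-derived candidate shadow prices, not solver output.
demand_price = {"new-york": 0.225, "chicago": 0.153, "topeka": 0.126}
supply_price = {"seattle": 0.0, "san-diego": 0.0}

tol = 1e-9  # guards against floating-point noise in the comparisons

# Dual feasibility: the price at market j, net of the capacity price at
# plant i, may not exceed the cost of shipping one case from i to j.
dual_feasible = all(
    demand_price[j] - supply_price[i] <= c[(i, j)] + tol for (i, j) in c
)

# Dual objective: value of all demand minus value of all capacity.
dual_value = sum(b[j] * demand_price[j] for j in b) - sum(
    a[i] * supply_price[i] for i in a
)

print(dual_feasible, round(dual_value, 3))  # True 153.675
```

Under these assumptions, each demand price is the cost of delivering one additional case to that market, and a zero supply price means that extra capacity at that plant would not lower the total cost.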

## Setting a default version of a run

The key benefit of **ixmp4** is handling a large number of scenarios - aka **ixmp4.Run** objects - in a database.
Each run is identified by a *model name*, a *scenario name* and an automatically assigned *version number*.

For every model-scenario combination, we can assign one run as the *default version*.
This allows you to keep previous versions in the database (for easy reference and comparison) while still having a well-defined way to get the "right" version (e.g., the latest version of a scenario).

In [None]:
run.set_as_default()
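
The bookkeeping behind versions and defaults can be pictured with a toy registry. The class `ToyRunRegistry` is hypothetical and says nothing about how **ixmp4** stores runs; it assumes, for illustration, that version numbers count up from 1 per model-scenario combination.

```python
class ToyRunRegistry:
    """Toy stand-in for a run database (illustrative assumption only)."""

    def __init__(self):
        self._latest = {}   # (model, scenario) -> highest version created
        self._default = {}  # (model, scenario) -> version marked as default

    def create(self, model, scenario):
        """Register a new run and return its automatically assigned version."""
        key = (model, scenario)
        self._latest[key] = self._latest.get(key, 0) + 1
        return self._latest[key]

    def set_as_default(self, model, scenario, version):
        """Mark one existing version as the default for this combination."""
        if not 1 <= version <= self._latest.get((model, scenario), 0):
            raise ValueError("No such version for this model and scenario")
        self._default[(model, scenario)] = version

    def get_default(self, model, scenario):
        """Return the default version; raises KeyError if none was set."""
        return self._default[(model, scenario)]


registry = ToyRunRegistry()
v1 = registry.create("transport problem", "standard")
v2 = registry.create("transport problem", "standard")

# Both versions stay in the registry; only one of them is the default.
registry.set_as_default("transport problem", "standard", v1)
print(v1, v2, registry.get_default("transport problem", "standard"))  # 1 2 1
```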