# Railway dispatching
Public transport is an industry with many use cases that benefit from mathematical optimization. Examples include long-term planning (which services to operate at what frequency, timetabling of services), mid-term (assigning rolling stock and crew to train trips) and short-term (rescheduling, dispatching). In this notebook we consider the railway dispatching problem. We are given a set of trains with their current position and desired route. The challenge is to decide on the sequence of movements, so each train reaches its endpoint as soon as possible while respecting capacity limitations at the tracks and intermediate stations.

## Load required packages
If you have a Gurobi license and installed the software already, you can skip the installation of `gurobipy`, but always make sure you have the [latest version](https://www.gurobi.com/downloads/gurobi-software) available.

In [1]:
%pip install gurobipy pandas plotly nbformat

In [2]:
import itertools as it
import math
import pandas as pd
import plotly.express as px
import datetime as dt
from os import path
import gurobipy as gp
from gurobipy import GRB

## Retrieving the data
We will be loading our scenarios from CSV files. These can be found in the `data` folder; each scenario consists of various CSV files within a subfolder. The files contain the following data:
- `stations.csv` contains the railway stations. Each has a `capacity` (number of trains that can be at the station concurrently) and `duration` (time required for a stop at the station). Stations also have a `y` coordinate for visualizing the schedule later.
- `tracks.csv` contains the railway tracks that connect pairs of stations. Each has a `capacity` and `duration` as above, as well as a `start` and `end` station. We assume tracks can be traversed in both directions, so the order of `start` and `end` is not relevant.
- `routes.csv` contains predefined routes through the network. Each route is a list of resources (stations and tracks).
- `trains.csv` contains the trains that require scheduling for traversing the network. Each train follows a `route`, but may `start` at any point in the route.

In [3]:
dataset = 'linear'
folder = f'data/{dataset}'
df_stations = pd.read_csv(path.join(folder, 'stations.csv')).set_index('id')
df_tracks = pd.read_csv(path.join(folder, 'tracks.csv')).set_index('id')
df_routes = pd.read_csv(path.join(folder, 'routes.csv')).set_index('id')
df_trains = pd.read_csv(path.join(folder, 'trains.csv')).set_index('id')

We will do a bit of processing, to make it easier to use our data.

In [4]:
# For each route, parse the station/track string
routes = df_routes['resources'].map(lambda route: [int(res) for res in route.split('-')]).to_dict()

# For each train, find the corresponding route and apply the starting position
trains = df_trains.apply(lambda row: list(it.dropwhile(lambda resource: resource!=row.start, routes[row.route])), axis=1).to_dict()

# For each resource, find the duration and capacity
duration = pd.concat([df_tracks['duration'], df_stations['duration']]).to_dict()
capacity = pd.concat([df_tracks['capacity'], df_stations['capacity']]).to_dict()

# Find Y coordinates for the stations
station_y = df_stations['y'].to_dict()

# For each track, find the pair of stations on both sides of the track
track_stations = df_tracks.apply(lambda track: (track.start, track.end), axis=1)

# Find the set of resources with their types (S=station, T=track)
resource_type = { key: 'S' for key in df_stations.index } | { key: 'T' for key in df_tracks.index }
resources = resource_type.keys()
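
To make the route handling above concrete, here is a minimal sketch of how a route string is parsed and then trimmed to a train's starting position. The route string and start value are illustrative assumptions in the format described for `routes.csv` and `trains.csv`; they are not read from the data files.

```python
import itertools as it

# Hypothetical inputs, in the format used by routes.csv and trains.csv
route_string = "0-1-2-3-4-5-6"
start = 3

# Split the string into integer resource ids
route = [int(res) for res in route_string.split("-")]

# Drop every resource before the starting position of the train
remaining = list(it.dropwhile(lambda resource: resource != start, route))
print(remaining)  # [3, 4, 5, 6]
```

Note that if `start` does not occur in the route, `dropwhile` discards everything and the train ends up with an empty route, so it may be worth validating the input first.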

We can show the input data as a simple DataFrame that combines tracks and stations, as below. Note that tracks have no `y` value, while stations have no `endpoints` value.

In [5]:
pd.DataFrame({'type': resource_type, 'duration': duration, 'capacity': capacity, 'y': station_y, 'endpoints': track_stations})

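| resource | type | duration | capacity | y | endpoints |
|---|---|---|---|---|---|
| 0 | S | 2 | 2 | 0.0 | |
| 1 | T | 6 | 1 | | (0, 2) |
| 2 | S | 4 | 2 | 6.0 | |
| 3 | T | 8 | 1 | | (2, 4) |
| 4 | S | 2 | 2 | 14.0 | |
| 5 | T | 5 | 1 | | (4, 6) |
| 6 | S | 2 | 2 | 19.0 | |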


To make our lives easier, we will define a little helper function that gives us, for a particular train and resource, the resource immediately *after* the given resource along the route for that train. If the given resource is the last one in the route, then we will return `None`.

In [6]:
def get_next_resource(train, resource):
    route = trains[train]
    index = route.index(resource)
    return route[index + 1] if index < len(route)-1 else None
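
As a quick illustration of the helper's behaviour, the sketch below applies the same logic to two made-up routes. The function `next_in_route` and the routes are hypothetical stand-ins, so that the example does not overwrite the notebook's `trains` dictionary.

```python
def next_in_route(route, resource):
    """Same logic as get_next_resource, but taking the route as an argument."""
    index = route.index(resource)
    return route[index + 1] if index < len(route) - 1 else None

# Hypothetical routes: one running "up" the line and one running "down"
route_up = [0, 1, 2, 3]
route_down = [4, 3, 2, 1, 0]

print(next_in_route(route_up, 1))    # 2
print(next_in_route(route_down, 1))  # 0
print(next_in_route(route_up, 3))    # None
```

The same resource can have a different successor depending on the direction of travel, which is why the helper needs the train as well as the resource.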

The various datasets included with this notebook are illustrated below. Nodes (circles) indicate stations whereas lines represent tracks. While the `linear` dataset does not involve branching, the `yshape` dataset is a bit more complex and the `xshape` dataset adds even more complexity.

![](images/Slide2.png)

## Mathematical optimization model

### Timing aspects
The main decision in this use case is about timing: when should each train visit each resource.

Let's define a few important sets.
- Let $R$ be the set of resources; these include both stations and tracks.
- Let $I$ be the set of trains; we define $R_i$ to be the (ordered) set of resources to be visited by train $i \in I$

For timing, we introduce two sets of decision variables.
- The main variables will be $t_{i,r}$ to denote the time at which train $i$ enters (starts occupying) resource $r$.
- Since we want to look at total throughput time, we will also define $t^F_i \: \forall i \in I$ as the finish time for train $i$

In [7]:
model = gp.Model()
events = gp.tuplelist([(train, resource) for train, route in trains.items() for resource in route])
t = model.addVars(events, name='t')
tf = model.addVars(trains.keys(), name='tf')

# Helper function to retrieve 't' or 'tf', depending on if resource=None
def tvar(train, resource):
    return t[train, resource] if resource is not None else tf[train]



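If we consider a single train, we know that resources are visited in a particular order and that the train needs at least a known duration on each resource.
- Let's assume $dur_r$ for each resource $r \in R$ denotes the duration required for traversing the resource
- Then for a pair of resources $(r, r')$ visited consecutively by a train $i$, we know $t_{i,r'} \geq t_{i,r} + dur_r$. This is an inequality rather than an equality because the train may have to wait on $r$ until $r'$ becomes available.
- For the last resource $r$ on the route of train $i$, there is nothing to wait for, so the finish time is $t^F_i = t_{i,r} + dur_r$

So we can link pairs of consecutive time variables for each train using these constraints.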

In [8]:
# Precedence constraints: a train can only enter the next resource after
# spending at least the required duration on the current one.
for train, route in trains.items():
    for current, following in zip(route, route[1:]):
        model.addConstr(t[train, following] - t[train, current] >= duration[current])
    # The finish time follows directly from the last resource on the route
    model.addConstr(tf[train] - t[train, route[-1]] == duration[route[-1]])

# Starting position
for train in trains:
    resource = trains[train][0]
    t[train, resource].UB=0
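
As a sanity check on the precedence constraints, the sketch below computes the earliest possible entry times for a train that is never held up. It assumes a train traversing all seven resources of the `linear` dataset in order, with durations copied from the resource table shown earlier. The variable names are illustrative only.

```python
from itertools import accumulate

# Durations per resource for the `linear` dataset (from the table above)
example_duration = {0: 2, 1: 6, 2: 4, 3: 8, 4: 2, 5: 5, 6: 2}
example_route = [0, 1, 2, 3, 4, 5, 6]  # assumed: a train crossing the whole line

durations = [example_duration[r] for r in example_route]

# The first resource is entered at time 0; every next entry time adds the
# duration of the resource that was just traversed.
entry_times = [0] + list(accumulate(durations))[:-1]
finish_time = sum(durations)

print(dict(zip(example_route, entry_times)))
# {0: 0, 1: 2, 2: 8, 3: 12, 4: 20, 5: 22, 6: 27}
print(finish_time)  # 29
```

These values are lower bounds on the corresponding $t$ variables. Any later time in the optimized schedule means the train was waiting for capacity somewhere along its route.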

### Conflict detection
If we would solve the model as it is now, we would optimize each train separately since there are no constraints linking them together. The relationship between trains is introduced because they share certain resources with limited capacity.

In order to deal with resource capacity, the first step is to focus on pairs of trains $i$ and $j$ that share a particular resource $r$. Let's assume train $i$ visits resource $u$ after $r$, while train $j$ visits resource $v$ next. 

Now depending on the value of the $t$ variables, we will face one of three situations:
- Train $i$ precedes $j$, which would mean train $i$ reaches $u$ before train $j$ reaches $r$.
- Similarly, train $j$ might precede train $i$ when it reaches $v$ before train $i$ reaches $r$.
- Finally, trains $i$ and $j$ might be sharing the resource concurrently for some period of time.

We can introduce binary variables to represent each of the three scenarios above:
- $y_{i,j,r}=1$ when $i$ precedes $j$; for this we add constraint $t_{j,r} - t_{i,u} \geq -M \cdot (1-y_{i,j,r})$
- $y_{j,i,r}=1$ when $j$ precedes $i$; for this we add constraint $t_{i,r} - t_{j,v} \geq -M \cdot (1-y_{j,i,r})$
- $x_{i,j,r}=1$ when $i$ and $j$ meet at resource $r$; for this we have a pair of constraints $t_{j,v} - t_{i,r} \geq -M \cdot (1-x_{i,j,r})$ and $t_{i,u} - t_{j,r} \geq -M \cdot (1-x_{i,j,r})$
- By definition, we must have $y_{i,j,r} + y_{j,i,r} + x_{i,j,r} = 1$

![](images/Slide1.png)

The left diagram shows how two trains $i$ and $j$ visit a shared resource $r$ and then continue to resources $u$ and $v$ respectively. The right diagram shows the various events involved in conflict detection at resource $r$. Each arrow indicates that the head node must follow the tail node. The $P$ arrows indicate precedence constraints for a single train. The $X$ arrows apply when the trains occupy the resource simultaneously. The $Y$ arrows apply when one train precedes the other on resource $r$.
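
To build some intuition for the three situations, the following sketch classifies a pair of trains from given event times. It is a plain-Python illustration and not part of the optimization model; the function name `classify` and the sample times are assumptions made for this example.

```python
def classify(t_ir, t_iu, t_jr, t_jv):
    """Classify trains i and j at shared resource r.

    t_ir / t_jr: time at which train i / j enters r
    t_iu / t_jv: time at which train i / j enters its next resource (leaves r)
    """
    if t_iu <= t_jr:
        return "i precedes j"  # corresponds to y[i,j,r] = 1
    if t_jv <= t_ir:
        return "j precedes i"  # corresponds to y[j,i,r] = 1
    return "i and j meet"      # corresponds to x[i,j,r] = 1

print(classify(t_ir=0, t_iu=8, t_jr=8, t_jv=16))   # i precedes j
print(classify(t_ir=10, t_iu=18, t_jr=0, t_jv=8))  # j precedes i
print(classify(t_ir=0, t_iu=8, t_jr=4, t_jv=12))   # i and j meet
```

When one train leaves at exactly the moment the other enters, the big-M constraints allow either a $y$ variable or the $x$ variable to equal 1. The sketch resolves such ties in favour of precedence, which is also what the solver will prefer whenever meeting is restricted by capacity.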

In [9]:
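# Big-M: total duration of all resources, times the number of trains.
# Note that sum(duration) would add up the dictionary keys (resource ids),
# so we sum the values explicitly.
M = sum(duration.values()) * len(trains)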

def create_disjunct_constraints(resource, train_i, resource_u, train_j, resource_v, M):
    t_ir = tvar(train_i, resource)
    t_iu = tvar(train_i, resource_u)
    t_jr = tvar(train_j, resource)
    t_jv = tvar(train_j, resource_v)

    # A before B
    y_ab = model.addVar(vtype=GRB.BINARY, name=f'y[{resource},{train_i},{train_j}]')
    model.addConstr(t_jr - t_iu >= -M *(1-y_ab), name=f'y[{resource},{train_i},{train_j}]')

    # B before A
    y_ba = model.addVar(vtype=GRB.BINARY, name=f'y[{resource},{train_j},{train_i}]')
    model.addConstr(t_ir - t_jv >= -M *(1-y_ba), name=f'y[{resource},{train_j},{train_i}]')

    # A and B meet
    x = model.addVar(vtype=GRB.BINARY, name=f'x[{resource},{train_i},{train_j}]')
    model.addConstr(t_jv - t_ir >= -M *(1-x), name=f'x1[{resource},{train_i},{train_j}]')
    model.addConstr(t_iu - t_jr >= -M *(1-x), name=f'x2[{resource},{train_i},{train_j}]')

    # Have exactly one of three situations
    model.addConstr(y_ab + y_ba + x == 1, name=f'xy[{resource},{train_i},{train_j}]')
    return x

### Resource capacity
With the constraints above, we would only derive the correct values for $x$ and $y$ given the values of $t$, without impacting the feasible region for $t$. The missing piece is the limited capacity of our resources.

Let's assume we have a resource with capacity $c$, which is visited by $n$ trains. If $c \geq n$ then we don't need to do anything. Otherwise, we will need constraints to ensure we never have more than $c$ trains on the resource at the same time. How can we model this using the $x$ variables for pairs of trains?

Imagine we have $c+1$ trains visiting at the same time. Those trains would form ${c+1 \choose 2}$ pairs of trains $(i,j)$ and all corresponding variables $x_{i,j,r}$ would have value 1 since they all meet at the same time. The sum of those variables would therefore equal ${c+1 \choose 2}$.

So the way to prevent this is to add a constraint for every subset of $c+1$ trains. That constraint should guarantee that at least one pair of trains in the subset does not meet at the resource. In other words, the sum of $x$ variables in the subset must be at most ${c+1 \choose 2} - 1$.

The algorithm to generate those constraints is as follows:
- Iterate over all resources
- Find all pairs of trains visiting the resource
- Generate the $x$ and $y$ variables using the helper function above, for each pair of trains
- If the number of trains does not exceed $c$, then skip
- Else, if the resource has capacity $c>1$:
  - Generate all subsets of trains of size $c+1$
  - Add a constraint that ensures the sum of $x$ variables for pairs of trains in this subset does not exceed ${c+1 \choose 2} - 1$
- Otherwise, we can simply force $x=0$ to prevent overlap.

In [10]:
# Conflicts
for resource in resources:
    # Find trains that visit
    resource_events = events.select('*', resource)
    resource_trains = [t for (t,r) in resource_events]
    trains_with_next = []

    # For each train, find the consecutive event
    for train in resource_trains:
        next_resource = get_next_resource(train, resource)
        trains_with_next.append((train, next_resource))

    # List of X variables to use for hotspot constraint later
    xlist = {}

    # Consider pairs of trains
    pairs = it.combinations(trains_with_next, 2)
    for pair in pairs:
        ((train_i, resource_u), (train_j, resource_v)) = pair
        xlist[train_i,train_j] = create_disjunct_constraints(resource, train_i, resource_u, train_j, resource_v, M)

    # Add hotspot constraints
    if len(resource_trains) <= capacity[resource]:
        continue
    elif(capacity[resource] > 1):
        subsets = it.combinations(resource_trains, capacity[resource]+1)
        rhs = math.comb(capacity[resource]+1, 2) - 1
        for subset in subsets:
            pairs = it.combinations(subset, 2)
            xsel = [xlist[ta,tb] for (ta,tb) in pairs]
            suffix = ','.join(str(train) for train in subset)
            model.addConstr(gp.quicksum(xsel) <= rhs, name=f'h[{resource},{suffix}]')
    else:
        for x in xlist.values():
            x.UB = 0
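
To see what the subset enumeration produces, here is a small stand-alone sketch that writes out the hotspot constraints as text. The four train names and the capacity of 2 are hypothetical values chosen for illustration; no Gurobi model is involved.

```python
import itertools as it
import math

example_trains = ["A", "B", "C", "D"]  # hypothetical trains visiting one resource
c = 2                                  # hypothetical capacity of that resource

# At most (c+1 choose 2) - 1 pairs within any subset of c+1 trains may meet
rhs = math.comb(c + 1, 2) - 1

hotspot_rows = []
for subset in it.combinations(example_trains, c + 1):
    pairs = it.combinations(subset, 2)
    terms = " + ".join(f"x[{a},{b}]" for a, b in pairs)
    hotspot_rows.append(f"{terms} <= {rhs}")

for row in hotspot_rows:
    print(row)
# x[A,B] + x[A,C] + x[B,C] <= 2
# x[A,B] + x[A,D] + x[B,D] <= 2
# x[A,C] + x[A,D] + x[C,D] <= 2
# x[B,C] + x[B,D] + x[C,D] <= 2
```

With $n$ trains the number of such constraints is ${n \choose c+1}$, which is one reason the model grows quickly for busy resources with a larger capacity.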

### Additional complexity
If desired, you can add more complexity to the model here by adding constraints on the event timing. For example, in the `linear` dataset you might want to force train `F` to arrive before train `E`. We can do that by uncommenting the statement below. Note how this increases overall delay, since train `E` must wait for train `F` at a station.

In [11]:
#model.addConstr(tf['F'] <= tf['E'])

### Minimizing delays
And that's most of the work! We have defined timing variables with precedence constraints per train; then we added relationships between trains when they share resources and made sure those trains never exceed the resource capacity. Now the only missing piece is an objective function. To optimize passenger experience as well as operational efficiency, we want to get all trains to their endpoint as soon as possible. We could model this in several ways:
- One simple objective could be to simply sum up the arrival times at the endpoints. Here it doesn't matter whether one train has a long delay, or the same total delay is distributed over multiple trains. What we do achieve here, is that resources with limited capacity are being used as soon as possible.
- Another objective could be to minimize the time by which *all* trains have arrived at their endpoint. This would be similar to the concept of *makespan* in other scheduling problems.
- These values are determined not only by our dispatching decisions, but also by the minimum travel time of each train, which cannot be avoided. We could compensate for that by subtracting this minimum duration from the scheduled arrival time of each train, so that the objective measures total delay. Note that since we are only subtracting a constant, we get the same optimal solution as with the first bullet.
- Finally, to make things more interesting, we could try and distribute delay between trains. While this might have negative impact on operational efficiency (it could be easier handling a single delay, than many small delays), it does represent a concept that could be considered *fairness*.

Below we will take the third approach, but one can easily modify the objective function to work with any of the ideas mentioned here. The fourth approach requires a full Gurobi license as it involves a quadratic objective.

In [12]:
def delay(train):
    min_duration = sum(duration[resource] for resource in trains[train])
    return tf[train] - min_duration
    
model.setObjective(gp.quicksum(delay(train) for train in trains))
model.ModelSense = GRB.MINIMIZE
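
The difference between the candidate objectives can be illustrated on a handful of numbers. The finish times and minimum travel times below are made up for this sketch, and the sum of squared delays is only one possible way to express the fairness idea.

```python
# Hypothetical finish times and unimpeded (minimum) travel times per train
finish = {"A": 41, "B": 33, "C": 17}
minimum = {"A": 29, "B": 27, "C": 17}

delays = {train: finish[train] - minimum[train] for train in finish}

total_arrival = sum(finish.values())                  # first option
makespan = max(finish.values())                       # second option
total_delay = sum(delays.values())                    # third option (used here)
squared_delay = sum(d ** 2 for d in delays.values())  # one possible fairness measure

print(total_arrival, makespan, total_delay, squared_delay)  # 91 41 18 180

# The first and third option differ only by a constant offset
print(total_arrival - total_delay == sum(minimum.values()))  # True
```

Because the offset between total arrival time and total delay does not depend on any decision, both objectives rank schedules identically. The squared measure penalizes the 12-minute delay of train `A` much more heavily than the 6-minute delay of train `B`.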

## Solving the model
Now we're ready to solve the model. Gurobi does this within a couple of seconds:

In [13]:
model.optimize()

Gurobi Optimizer version 12.0.0 build v12.0.0rc1 (mac64[arm] - Darwin 24.1.0 24B91)

CPU model: Apple M1
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 367 rows, 218 columns and 1069 nonzeros
Model fingerprint: 0x55c46576
Variable types: 38 continuous, 180 integer (180 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+02]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+02]
Presolve removed 154 rows and 108 columns
Presolve time: 0.00s
Presolved: 213 rows, 110 columns, 641 nonzeros
Variable types: 26 continuous, 84 integer (84 binary)
Found heuristic solution: objective 70.0000000

Root relaxation: objective 1.000000e+01, 39 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0   10.00000    0    9   70.00000   10.00000  85

## Visualizing the solution
### Retrieving the solution

Once the solution has been calculated, we want to validate this by visualizing in several ways. To get started with that, we first retrieve the values for all variables and structure this information in a useful way. For every combination of a train and a resource being visited, we collect the time of reaching that resource, as well as the next resource and the time of reaching that next resource.

In [14]:
intervals = {
    (train, resource): (tvar(train, resource).X, tvar(train, get_next_resource(train, resource)).X, get_next_resource(train, resource) )
    for (train, resource) in events
}
model.dispose()

### Displaying the solution per train and per resource

A quick-and-dirty way to visualize the resulting schedule, is by creating a (pivot) table where the columns represent the resources (the horizontal axis represents *space*) and the rows represent trains. The cells show the time at which the train starts occupying the resource. The direction of travel can be seen easily from the fact that the times of a particular train increase going from left to right, or right to left.

In [15]:
df_intervals = pd.DataFrame.from_dict(intervals, orient='index').rename(columns={0:'start',1:'end',2:'next'}).reset_index()
df_intervals['train'], df_intervals['resource'] = zip(*df_intervals['index'])
df_intervals.drop(columns='index', inplace=True)
df_intervals.pivot_table(values='start', columns='resource', index='train', fill_value='')


| train / resource | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| A | 0.0 | 6.0 | 12.0 | 24.0 | 32.0 | 34.0 | 39.0 |
| B | | 0.0 | 6.0 | 16.0 | 24.0 | 26.0 | 31.0 |
| C | | | | 0.0 | 8.0 | 10.0 | 15.0 |
| D | | | | | 0.0 | 2.0 | 7.0 |
| E | 26.0 | 20.0 | 16.0 | 8.0 | 0.0 | | |
| F | 50.0 | 44.0 | 40.0 | 32.0 | 26.0 | 15.0 | 0.0 |


### Visualizing the schedule with a time-space diagram

Train schedules are typically visualized by showing space on the *vertical* axis and time on the *horizontal* axis. Stations have specific points on the y-axis whereas tracks are the space between those stations. Individual train trips are paths through the diagram, connecting (location, time) pairs. Horizontal line segments show that the train stands still at a station. When two trains run in the same direction and their paths cross, it means the faster train overtakes the slower train. To construct such a visualization, we first need to compute Y coordinates for each event of each train.

In [16]:
# For each train, compute Y coordinates per event
train_coordinates = { }
for train, route in trains.items():
    current_resource = None
    for resource in route:
        if resource_type[resource] == 'S':
            train_coordinates[train, resource] = (station_y[resource], station_y[resource])
        else:
            endpoints = track_stations[resource]
            if current_resource is not None:
                first = current_resource
                second = endpoints[0] if endpoints[1] == current_resource else endpoints[1]
            else:
                second = route[1]
                first = endpoints[0] if endpoints[1] == second else endpoints[1]
            train_coordinates[train, resource] = (station_y[first], station_y[second])
        current_resource = resource

Once we have those coordinates, we can compute the individual line segments per train and then connect those as paths. We draw the stations as dotted lines. Usually the slope of the line segment represents the speed of the train.

Note that for the `yshape` dataset, there is a junction in the track but the two branches cannot be properly visualized in one dimension. Station 6 is drawn between stations 4 and 8, even though trains running from station 4 to 8 don't actually pass by/through station 6.

In [17]:
df_diagram = pd.DataFrame.from_records([
    {'train': train, 'y': train_coordinates[train, resource][i], 'x': time}
    for (train, resource), (start, end, _) in intervals.items()
    for i, time in enumerate((start, end))
])
fig = px.line(df_diagram, x='x', y='y', color='train', width=600, height=400)
fig.update_traces(line={'width': 4}, opacity=.7)
for station, y in station_y.items():
    fig.add_hline(y=y, annotation_text=str(station), line_dash="dot", line_width=1, line_color='gray')
fig.show()

### Resource occupation

Finally, we can visualize the occupancy of each resource in the form of a Gantt chart. Rows represent resources and the horizontal axis represents time. When two bars for different trains overlap, they occupy the same resource (partially) at the same time.

In [18]:
t0 = dt.datetime.combine(dt.date.today(), dt.time())  # midnight today, used as the reference time
df_timeline = pd.DataFrame.from_records([{ 'train':train, 'resource':resource, 'start':t0+dt.timedelta(minutes=start), 'end':t0+dt.timedelta(minutes=end) } for (train, resource), (start, end, _) in intervals.items()])
fig = px.timeline(df_timeline, x_start='start', x_end='end', y='resource', color='train', height=400, opacity=0.5)
fig.update_yaxes(type='category', showticklabels=True, categoryorder='array', categoryarray=sorted(int(x) for x in resources))
fig.show() 
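
Besides inspecting the chart visually, one could also check occupancy numerically. The sketch below counts the maximum number of trains on a resource at any moment, using the entry and exit times read from the pivot table above for track 3 (capacity 1) and station 4 (capacity 2). The function `max_concurrent` is a hypothetical helper written for this example, and intervals are treated as half-open, so a train leaving at time 8 does not clash with one entering at time 8.

```python
def max_concurrent(occupations):
    """Return the peak number of overlapping half-open intervals [start, end)."""
    changes = []
    for start, end in occupations:
        changes.append((start, 1))   # a train enters the resource
        changes.append((end, -1))    # a train leaves the resource
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so departures are processed before arrivals.
    changes.sort()
    current = peak = 0
    for _, delta in changes:
        current += delta
        peak = max(peak, current)
    return peak

# (entry, exit) times per train, read from the pivot table above
track_3 = [(0, 8), (8, 16), (16, 24), (24, 32), (32, 40)]              # C, E, B, A, F
station_4 = [(0, 2), (0, 8), (8, 10), (24, 26), (26, 32), (32, 34)]    # D, E, C, B, F, A

print(max_concurrent(track_3))    # 1
print(max_concurrent(station_4))  # 2
```

Both peaks match the capacities listed for these resources in the `linear` dataset, which is what we expect from a feasible schedule.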

## Conclusion

In this notebook we have looked at scheduling for railway operators. We have formulated the scheduling problem as a MIP which can be solved to optimality in reasonable amounts of time using Gurobi for relatively small instances. Of course with the chosen approach, the model size will grow quickly as the number of trains and resources, as well as the capacity of individual resources, increases. More advanced techniques have been developed for solving larger instances. Learn more about these in our [2024 webinar with Sintef](https://www.gurobi.com/events/sintef-railway-optimization/).