
## OR Study Group: Queueing Theory

### Collaborators: 
* Clare Essex
* Hamish MacGregor
* Rudi Narendran
* Jonathan Teagles
* Emma Tearse

This notebook is designed to run simple queueing theory models using Python via mybinder.org.



The notebook is structured in XX parts:
* [Set Up](#setup)
* [Theory](#theory1)
* [Simple Queues - M/M/1](#simple)
    * [Case Study 1: XX](#casestudy1)
* [Understanding $\lambda$, $\alpha$ and other variables](#theory2)
* [Complex Queues - M/M/$\infty$](#complex)
    * [Case Study 2: XX](#casestudy2)

## Set Up <a class = "anchor" id = "setup"></a>

We need to install some packages to run this notebook:
* **pandas** - this is a package for *shaping* data
* **numpy** - this is a package with helpful functions for *numerical transformations*
* **matplotlib** - this is a package for *visualisations*
* **ciw** - this is a package for *simulating queueing networks*

These are installed using the command **pip install 'package'** and then *imported* into your notebook using **import 'package'**. We can give each package a shortened name, as we will need to call it a lot later.

*Note: pip should be run from the command line. To run a shell command from within a notebook cell, you must put a ! in front of the command.*

So let's do this for the above packages:

In [None]:
#!pip install pandas
import pandas as pd

In [None]:
#!pip  install numpy
import numpy as np

In [None]:
#!pip install matplotlib
import matplotlib.pyplot as plt

In [None]:
#!pip install ciw
import ciw

## Theory <a class = "anchor" id = "theory1"></a>

## Simple Queues - M/M/1  <a class = "anchor" id = "simple"></a>

Here we will use the *ciw* package to build a simple M/M/1 queue, such as a queue at a supermarket checkout.
First, we create the _network_ '_N_', which defines the structure of the queueing system.
Functions preceded by _ciw._ are built into the *ciw* package.

In [None]:
#Set up mean arrival rate and service rate (see below)
arrival_rate = 0.2
service_rate = 0.25

ciw.seed(1) # defines a random seed, ensuring the results are the same on each run

N = ciw.create_network(
    arrival_distributions = [ciw.dists.Exponential(arrival_rate)],
    service_distributions = [ciw.dists.Exponential(service_rate)],
    number_of_servers = [1])

This network has three attributes:
* The **arrival distribution**, which we have set to be exponential (so arrivals form a Poisson process) with a mean arrival rate $\lambda$ of 0.2 customers per minute (1 every 5 minutes)
* The **service distribution**, which we have also set to be exponential with a mean service rate $\mu$ of 0.25 customers per minute (1 every 4 minutes). Since $\lambda < \mu$, the queue should be stable and not grow indefinitely.
* The **number of servers**, which in this case is 1.

Note that the choice of units (minutes) is arbitrary and will work as long as we are consistent.
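
As a quick check of the stability condition, the ratio $\rho = \lambda/\mu$ (often called the traffic intensity or utilisation) gives the long-run fraction of time the server is busy. The sketch below uses plain Python. The function name `utilisation` is our own and is not part of *ciw*.

```python
def utilisation(arrival_rate, service_rate):
    """Return rho = lambda / mu for a single-server queue."""
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    return arrival_rate / service_rate

rho = utilisation(0.2, 0.25)  # the rates used in this section
is_stable = rho < 1  # stable only when customers arrive more slowly than they are served
print(f"rho = {rho:.2f}, stable: {is_stable}")
```

With the rates used here the server is busy about 80% of the time, which leaves some slack but not a lot.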

We can now simulate the queue by creating and running a *Simulation* object, *Q*:

In [None]:
Q = ciw.Simulation(N) # a Simulation object for our network N

Q.simulate_until_max_time(1440) # run the simulation Q for 1440 minutes (one day)

The *ciw* package automatically records useful statistics about the simulation. For instance, we can obtain the average time spent waiting in the queue, or the average time to be served:

In [None]:
recs = Q.get_all_records() # extracts all individual records into the list 'recs'

wait_times = [r.waiting_time for r in recs] # loops through 'recs' extracting waiting times
service_times = [r.service_time for r in recs] # likewise for service times

We can now easily extract the mean waiting time and service time using np.mean():

In [None]:
#mean service time
np.mean(service_times)

In [None]:
#mean waiting time
np.mean(wait_times)

We set up the simulation with $\mu = 0.25$, so we expect a mean service time ($1/\mu$) of 4 minutes - our result of 3.94 minutes is not far off. For the mean waiting time $T$, we can take the theoretical mean time spent in the system and subtract the mean service time:

$T = \frac{1/\mu}{1-\lambda/\mu} - \frac{1}{\mu}$

In [None]:
T = (1/service_rate)/(1-arrival_rate/service_rate) - 1/service_rate
T
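
Working through the arithmetic by hand with $\lambda = 0.2$ and $\mu = 0.25$ (so $1/\mu = 4$ and $\lambda/\mu = 0.8$), the calculation in the cell above comes to:

$T = \frac{4}{1 - 0.8} - 4 = 20 - 4 = 16$ minutes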

Our simulated mean waiting time of 7.47 minutes is not so close to the theoretical value of 16 minutes. This is because we are doing just one simulation - to get accurate results, we should run the simulation multiple times and average the results. Try setting a different seed and running it again!

In [None]:
# This block is for study group team only - what happens when you run multiple times with different seeds. 
# the results are an underestimate due to the transient behaviour at the start, but this is dealt with later.
# we probably won't actually average their results during the presentation, just ask them to post in the chat. 

wait_array = np.array([])
for trial in range(200):
    ciw.seed(trial)
    Q = ciw.Simulation(N)
    Q.simulate_until_max_time(1440)
    recs = Q.get_all_records()
    wait_times = [r.waiting_time for r in recs]
    service_times = [r.service_time for r in recs]
    wait_array = np.append(wait_array, np.mean(wait_times))
    
np.mean(wait_array)

### Plotting the queue behaviour

Let's explore the M/M/1 queue a little further. We can track the state of the system using a 'tracker'. Let's run the same queue three times with different random seeds, and look at the queue length over time.

In [None]:
for trial in range(3):
    ciw.seed(trial) #set a different seed for each run
    
    #set up and run a new simulation using the same network, with a tracker on the service node
    Q2 = ciw.Simulation(N, tracker = ciw.trackers.NodePopulationSubset([0])) 
    Q2.simulate_until_max_time(480) #simulate for 8 hours
    
    #Extract the results (queue length over time) into an array
    h = np.array(Q2.statetracker.history, dtype = object)
    h[:, 1] = [i[0] for i in h[:, 1]] 
    plt.plot(h[:, 0], h[:, 1])

Try changing the mean arrival rate (at the top of this section) and see how the queue behaviour changes over time. (You will need to re-run any code blocks you change, in order top-to-bottom.) What happens if you set the arrival rate higher than the service rate? What if they are equal?

We can compare our results to what we expect from the theory. Let's try varying the arrival rate, and measuring the average queue length.
We can't take the time-average queue length when the arrival rate exceeds the service rate (since the queue length diverges to infinity over time), so we will keep our arrival rates below the service rate.
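
A time-average weights each queue length by how long the queue stayed at that length, rather than averaging the recorded values directly. The sketch below shows the idea in plain Python. The function name `time_average` and the example history are hypothetical, and we assume the history is a list of (time, length) pairs like the ones plotted above.

```python
def time_average(history, end_time):
    """Time-weighted mean of a piecewise-constant queue length.

    history: list of (time, length) pairs in increasing time order;
             each length holds from its time until the next entry.
    end_time: the time at which observation stops.
    """
    total = 0.0
    following = history[1:] + [(end_time, None)]  # pair each entry with the next change
    for (t, length), (t_next, _) in zip(history, following):
        total += length * (t_next - t)  # length multiplied by time spent at that length
    start_time = history[0][0]
    return total / (end_time - start_time)

# hypothetical history: empty for 2 minutes, one customer for 6, two customers for 2
example = [(0.0, 0), (2.0, 1), (8.0, 2)]
print(time_average(example, end_time=10.0))
```

A plain mean of the three recorded lengths would ignore how long each one lasted, which is why the weighting matters.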

First we set up our simulation parameters, and empty arrays to store the results:

In [None]:
# set parameters
test_rates = np.linspace(0.05, 0.245, 30) # we will test a range of arrival rates from 0.05 to 0.245
service_rate = 0.25
simulation_time = 1440

#create arrays for results
queue_means = np.array([])
queue_var = np.array([])
ciw.seed(3)

We can now run the simulations:

In [None]:
for arrival_mean in test_rates:
    N = ciw.create_network(
        arrival_distributions = [ciw.dists.Exponential(arrival_mean)],
        service_distributions = [ciw.dists.Exponential(service_rate)],
        number_of_servers = [1])
    
    Q = ciw.Simulation(N, tracker = ciw.trackers.NodePopulationSubset([0]))
    Q.simulate_until_max_time(simulation_time)
    probs = Q.statetracker.state_probabilities() # a discrete probability distribution for the queue length
    probs = np.array(list(probs.items()), dtype = object) # convert dictionary object to array
    probs[:, 0] = [i[0] for i in probs[:, 0]]
    
    queue_mean = sum(probs[:, 0] * probs[:, 1]) # calculate mean queue length
    queue_means = np.append(queue_means, queue_mean)
    queue_var = np.append(queue_var,
        sum(probs[:, 0] * probs[:, 0] * probs[:, 1]) - queue_mean**2) # calculate queue length variance, E[X^2] - (E[X])^2
    
plt.plot(test_rates, queue_means)

We can see that the average queue length increases rapidly as we approach equal arrival and service rates.

The theory predicts that the mean queue length $L$ is determined by the following expression:

$L =  \frac{\lambda/\mu}{1-\lambda/\mu}$
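
This expression follows from the standard M/M/1 result that, in steady state, the number of customers in the system is geometrically distributed: $P(n) = (1-\rho)\rho^n$ with $\rho = \lambda/\mu$. As a rough numerical check, the sketch below sums $n P(n)$ directly and compares it with the closed form. The function name `mean_number_in_system` and the truncation at 10,000 terms are our own choices.

```python
def mean_number_in_system(rho, n_terms=10_000):
    """Approximate the sum of n * P(n), where P(n) = (1 - rho) * rho**n."""
    return sum(n * (1 - rho) * rho**n for n in range(n_terms))

rho = 0.2 / 0.25  # lambda / mu for the rates used earlier in this section
numeric = mean_number_in_system(rho)
closed_form = rho / (1 - rho)
print(numeric, closed_form)
```

Both values should agree to many decimal places, since the terms of the sum shrink quickly once $n$ is large.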

We can compare this to our trials:

In [None]:
pred_length = (test_rates/service_rate)/(1 - test_rates/service_rate) #calculate theoretical results

#plot and compare
plt.plot(test_rates, queue_means, label = "Simulation results")
plt.plot(test_rates, pred_length, '--', label = "Theoretical results")
plt.legend()

Something is not quite right here! The results are OK at low $\lambda$, but as we increase the arrival rate, the simulation results become noisy and we appear to be underestimating the rise in average queue length. What have we done wrong?

Answer below...

.

.

.

.

.

.

.

.

**We have run the simulation for too short a time.** This is producing two effects:
* *The simulation becomes noisy as the arrival rate approaches the service rate* - the theory predicts that the size of the fluctuations in the queue length increases as $\lambda$ approaches $\mu$. In order to get accurate results, we must simulate for a long time.
* *The queue has not reached a steady state* - as $\lambda$ increases and the queue becomes longer, we must wait longer before the queue reaches a steady state. Near the start, the queue is shorter than average (since we started with an empty queue), so our mean queue length is too low. For a truly accurate result, we ought to exclude the start from our measurements, but we can improve things by running the simulation for longer.

Try running the simulation again for a longer time, and see if the results improve!
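
One way to exclude the start from our measurements is to discard every customer who arrived during a chosen warm-up period. The sketch below is a minimal illustration in plain Python, not a tuned method. The `Record` type, the function name `mean_wait_after_warmup` and the sample values are hypothetical stand-ins, and we assume each real record has an `arrival_date` attribute alongside the `waiting_time` used earlier.

```python
from collections import namedtuple
from statistics import mean

# Stand-in for a simulation record; only the two fields used here are modelled.
Record = namedtuple("Record", ["arrival_date", "waiting_time"])

def mean_wait_after_warmup(records, warmup):
    """Mean waiting time over customers arriving at or after `warmup`."""
    kept = [r.waiting_time for r in records if r.arrival_date >= warmup]
    if not kept:
        raise ValueError("no records after the warm-up period")
    return mean(kept)

# hypothetical records: the early arrivals find an almost empty queue
records = [Record(1.0, 0.0), Record(5.0, 1.0), Record(120.0, 10.0), Record(130.0, 14.0)]
print(mean_wait_after_warmup(records, warmup=100.0))
```

Choosing the warm-up length is a judgment call: too short and the start-up bias remains, too long and useful data is thrown away.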

## Case Study 1 <a class = "anchor" id = "casestudy1"></a>

## Understanding Variables <a class = "anchor" id = "theory2"></a>

## Complex Queues - M/M/$\infty$ <a class = "anchor" id = "complex"></a>

## Case Study 2 <a class = "anchor" id = "casestudy2"></a>