Ruggiero Seccia, PhD candidate in Operations Research

La Sapienza University of Rome

Email: ruggiero.seccia@uniroma1.it

Phone: +39 3318606535

# Standard Nurses Rostering Problem


This notebook implements a first, simple model of the nurse rostering problem (NRP). We assume that the number of nurses $N$ is large enough to cover the minimum staffing required in each shift, define the basic NRP model, and minimize the total number of hours worked by the nurses. We then change the objective so as to __minimize the worst-case scenario__ (i.e. minimize the number of hours worked by the nurse who works the most) and compare the two solutions to show the improvement obtained with the worst-case formulation.

We also provide an __interactive toolbox__ for checking which nurse works which shift on which day.


The model implemented in this notebook corresponds to the formulation $(1)$ described [here](http://www.optimization-online.org/DB_FILE/2020/03/7712.pdf)

## Importing the packages

In [1]:
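import numpy as np
# install docplex on the fly (Jupyter only) if it is not available
try:
    from docplex.mp.model import Model
except ImportError:
    !pip install docplex
    from docplex.mp.model import Model

import pandas as pd

# same for ipywidgets, used by the interactive schedule viewer
try:
    import ipywidgets as widgets
except ImportError:
    !pip install ipywidgets
    import ipywidgets as widgets
from ipywidgets import interact
import time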

## Parameters specification

Let us consider a hospital department with a given number of nurses $N$. We want to organize their shifts for the next $T$ days (e.g. $T=7$ for one week or $T=30$ for one month), and then for each subsequent period, so as to minimize the effort required of the staff to satisfy the demand. By contract, nurse $i$ has to work at least $H_i$ hours over the time horizon $T$ (e.g. at least 36 hours per week, $H_i=36$ and $T=7$). Any hours worked by nurse $i$ beyond $H_i$ count as overtime, which the healthcare structure pays at a higher rate. Each day three shifts need to be covered by the nurses: morning, afternoon and night. Each shift $s$ requires $R_s$ nurses and lasts $h_s$ hours. A nurse cannot cover more than one shift per day. Moreover, a nurse who covers a night shift needs to rest and cannot work the following day.

Let us consider the parameter $p_i$, which carries information about the previous period. Namely, $p_i$ is a boolean parameter such that
\begin{equation*}
    p_i=
    \begin{cases}
    1  \qquad \text{if nurse } i \text{ covered the night shift on the last day of the previous period} \\
    0 \qquad \text{otherwise.}
    \end{cases}
\end{equation*}

(To get an estimate of the minimum value of $N$ for which the number of nurses is large enough, consider that we need at least the nurses covering every shift in a day plus the nurses who are resting after the previous night shift. E.g. if we need 5 nurses during the morning shift, 4 during the afternoon and 3 during the night, then overall we need at least $N=5+4+3+3=15$ nurses.)
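
The estimate above can be reproduced in a few lines of plain Python. This is only a sketch of the rule of thumb described in the text; the helper `min_nurses_estimate` and the `_demo` names are illustrative and not part of the notebook.

```python
def min_nurses_estimate(required, night_shift='Night'):
    """Daily demand plus one extra night crew that is resting after the previous night."""
    return sum(required.values()) + required[night_shift]

# staffing requirement used as an example in the text
R_demo = {'Morning': 5, 'Afternoon': 4, 'Night': 3}
N_demo = 15

N_min = min_nurses_estimate(R_demo)
print('estimated minimum number of nurses:', N_min)
print('N is large enough according to the estimate:', N_demo >= N_min)
```

The estimate is a rule of thumb: a department that passes the check can still turn out to be infeasible once contract hours and the first-day restriction are taken into account.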


In [2]:
# number of nurses
N = 15
nurses = ['Nurse_' +str(n) for n in range(N)]
# periods to schedule, number of days
T = 7
days = ['Day_' +str(t) for t in range(T)]
# shifts
S = ['Morning', 'Afternoon', 'Night']


# standard number of hours by contract per nurse
H_base = 36
H = {n: H_base for n in nurses}

# update some nurses values
# H['Nurse_1'] =20

# number of nurses required per shift
R = {'Morning' : 5,
     'Afternoon' : 4,
     'Night' : 3}
# duration of each shift
h = {'Morning' : 7,
     'Afternoon' : 8,
     'Night' : 9}

# list of nurses that on the last day of the previous period covered  the night shift
p_list= ['Nurse_0']
# dictionary with the values of p per each nurse
p = {n:0 for n in nurses}
# update the dictionary with p_list
for pp in p_list:
    p[pp]=1


## Optimization model 

To formulate this optimization problem, let us introduce the binary variable $x_{ist}\in\{0,1\}$ such that 
\begin{equation*}
    x_{ist}=
    \begin{cases}
    1  \qquad \text{if nurse } i \text{ covers shift } s \text{ on day } t \\
    0 \qquad \text{otherwise}
    \end{cases}
\end{equation*}


We want to find the optimal schedule $x^\star$ such that the number of hours worked by nurses is minimized and all the department's constraints are satisfied. 

In [3]:
mdl = Model('Scheduling')

# create the variables
idx_x = [(i,s,t) for i in nurses for s in S for t in days]
x = mdl.binary_var_dict(idx_x)
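
To see how many binary variables this indexing produces, the index set can be rebuilt without the solver. The sketch below uses `itertools.product`, which yields the triples in the same (nurse, shift, day) order as the list comprehension; the `_demo` names are illustrative copies of the notebook parameters.

```python
from itertools import product

nurses_demo = ['Nurse_' + str(n) for n in range(15)]
days_demo = ['Day_' + str(t) for t in range(7)]
shifts_demo = ['Morning', 'Afternoon', 'Night']

# one (nurse, shift, day) triple per binary variable x_ist
idx_demo = list(product(nurses_demo, shifts_demo, days_demo))

print(len(idx_demo))   # N * 3 * T triples
print(idx_demo[0])     # first key of the variable dictionary
```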


### Objective function
The objective function minimizes the total number of hours worked by all nurses within the period under consideration:

\begin{equation*}
\begin{aligned}
& \underset{x_{ist}\in\{0,1\}}{\text{min}}
& & \sum_{i=1}^N\sum_{s=1}^3\sum_{t=1}^T x_{ist}h_s \\
\end{aligned}
\end{equation*}


In [4]:
# objective function definition
mdl.minimize(mdl.sum(x[i,s,t]*h[s] for i in nurses for s in S for t in days))
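
Since every shift $s$ needs $R_s$ nurses for $h_s$ hours on each of the $T$ days, any roster that meets the demand supplies at least $T\sum_s R_s h_s$ hours in total. The short sketch below evaluates this lower bound on the objective for the parameters of this notebook; the `_demo` names are illustrative copies, and the bound ignores the contract-hours and rest constraints, which can only push the total up.

```python
T_demo = 7
R_demo = {'Morning': 5, 'Afternoon': 4, 'Night': 3}   # nurses required per shift
h_demo = {'Morning': 7, 'Afternoon': 8, 'Night': 9}   # duration of each shift in hours

# hours of work demanded by a single day, then by the whole horizon
hours_per_day = sum(R_demo[s] * h_demo[s] for s in R_demo)
lower_bound = T_demo * hours_per_day

print('hours demanded per day:', hours_per_day)
print('lower bound on total hours:', lower_bound)
```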

### Constraints
- Each nurse cannot cover more than one shift on the same day.
$$ \sum_{s=1}^3 x_{ist}\leq 1 \qquad \forall i=1,\dots,N, \; t=1,\dots,T $$

- The number of nurses required by each shift on each day is satisfied.
$$ \sum_{i=1}^N x_{ist} \geq R_{s} \qquad \forall s=1,\dots,3, \; t=1,\dots,T $$

- Each nurse works at least the number of hours required by contract.
$$ \sum_{s=1}^3\sum_{t=1}^T x_{ist}h_s\geq H_i \qquad \forall i=1,\dots,N $$

- If a nurse covers a night shift ($s=3$), then they cannot work the next day.
$$ x_{i3t}+\sum_{s=1}^3 x_{is,t+1}\leq 1 \qquad \forall i=1,\dots,N, \; t=1,\dots,T-1 $$

- Each nurse cannot work on the first day of the new period if they covered the night shift on the last day of the previous period ($p_i=1$).
$$ \sum_{s=1}^3 x_{is1}\leq 1-p_i \qquad \forall i=1,\dots,N $$


In [5]:
# at most one shift per nurse per day
mdl.add_constraints(mdl.sum(x[i,s,t] for s in S) <= 1 for i in nurses for t in days);

# staffing requirement of every shift on every day
mdl.add_constraints(mdl.sum(x[i,s,t] for i in nurses) >= R[s] for s in S for t in days);

# minimum number of hours required by contract
mdl.add_constraints(mdl.sum(x[i,s,t]*h[s] for s in S for t in days) >= H[i] for i in nurses);

# rest on the day after a night shift (S[-1] is 'Night')
mdl.add_constraints(x[i,S[-1],t] + mdl.sum(x[i,s,days[j+1]] for s in S) <= 1 for i in nurses for j,t in enumerate(days[:-1]));

# no work on the first day for nurses flagged in p
mdl.add_constraints(mdl.sum(x[i,s,days[0]] for s in S) <= (1-p[i]) for i in nurses);
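
The five constraints can also be checked on a given roster without a solver, which is handy for testing a schedule by hand. The sketch below is a hypothetical helper, not part of the notebook: `check_roster` takes a set of (nurse, shift, day) triples, assumes that the last entry of `shifts` is the night shift, and returns the list of violated constraints. The small three-nurse instance is invented for illustration.

```python
def check_roster(roster, nurses, shifts, days, R, h, H, p):
    """Return the violated constraints of `roster`, a set of (nurse, shift, day) triples."""
    night = shifts[-1]  # assumption: the night shift is listed last
    violations = []

    def worked(i, t):
        return [s for s in shifts if (i, s, t) in roster]

    for i in nurses:                       # at most one shift per day
        for t in days:
            if len(worked(i, t)) > 1:
                violations.append(('one shift per day', i, t))
    for s in shifts:                       # staffing requirement
        for t in days:
            if sum((i, s, t) in roster for i in nurses) < R[s]:
                violations.append(('coverage', s, t))
    for i in nurses:                       # contract hours
        if sum(h[s] for (n, s, t) in roster if n == i) < H[i]:
            violations.append(('contract hours', i))
    for i in nurses:                       # rest after a night shift
        for t, t_next in zip(days[:-1], days[1:]):
            if (i, night, t) in roster and worked(i, t_next):
                violations.append(('rest after night', i, t_next))
    for i in nurses:                       # carry-over from the previous period
        if p[i] == 1 and worked(i, days[0]):
            violations.append(('first day', i, days[0]))
    return violations


# toy instance: 3 nurses, 2 shifts, 2 days
demo = dict(nurses=['A', 'B', 'C'], shifts=['Morning', 'Night'], days=['D0', 'D1'],
            R={'Morning': 1, 'Night': 1}, h={'Morning': 7, 'Night': 9},
            H={'A': 7, 'B': 7, 'C': 7}, p={'A': 1, 'B': 0, 'C': 0})

good = {('B', 'Morning', 'D0'), ('C', 'Night', 'D0'),
        ('B', 'Morning', 'D1'), ('A', 'Night', 'D1')}
# same roster, but C works the morning right after a night shift
bad = {('B', 'Morning', 'D0'), ('C', 'Night', 'D0'),
       ('C', 'Morning', 'D1'), ('A', 'Night', 'D1')}

print(check_roster(good, **demo))
print(check_roster(bad, **demo))
```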

### Defining KPI

In [6]:
mdl.add_kpi(mdl.max(mdl.sum(x[i,s,t]*h[s] for s in S for t in days)for i in nurses), 'Maximum # hours worked')
mdl.add_kpi(mdl.min(mdl.sum(x[i,s,t]*h[s] for s in S for t in days)for i in nurses), 'Minimum # hours worked');

### Solve the problem

In [7]:
mdl.print_information()
mdl.solve()
mdl.solution.solve_details

Model: Scheduling
 - number of variables: 345
   - binary=315, integer=0, continuous=30
 - number of constraints: 276
   - linear=276
 - parameters: defaults
 - problem type is: MILP


docplex.mp.SolveDetails(time=0.063,status='integer optimal solution')

In [8]:
mdl.report()

* model Scheduling solved with objective = 658
*  KPI: Maximum # hours worked = 47.000
*  KPI: Minimum # hours worked = 39.000


In [9]:
status = mdl.solve_details.status == 'integer optimal solution'
status

True

Note that if CPLEX cannot solve the problem, it is because the model is infeasible, typically because $N$ is too small for the staffing requirements. The extension of the model addresses this issue.

## Analysing the solution

Below we provide simple tools to analyse the solution obtained

In [10]:
def x_star_to_pandas(x):
    '''
    takes as input the solution of the optimization problem as a dictionary
    {(nurse, shift, day): value} and returns the selected assignments as a dataframe
    '''
    # a binary variable may come back as 0.999... or 1e-9, so compare against 0.5
    rows = [key for key, value in x.items() if value > 0.5]
    return pd.DataFrame(rows, columns=['Nurse', 'Shift', 'Day'])

In [11]:
# transform the solution into a dataframe
x_star_dict =mdl.solution.get_value_dict(x)
sol_x = x_star_to_pandas(x_star_dict)
sol_x.head()

|   | Nurse | Shift | Day |
|---|---|---|---|
| 0 | Nurse_0 | Morning | Day_2 |
| 1 | Nurse_0 | Morning | Day_5 |
| 2 | Nurse_0 | Afternoon | Day_1 |
| 3 | Nurse_0 | Afternoon | Day_3 |
| 4 | Nurse_0 | Afternoon | Day_4 |


### How many hours does each nurse work over the period?

In [12]:
worked_hours = {n:0 for n in nurses}

for i,j in sol_x.iterrows():
    worked_hours[j['Nurse']]+=h[j['Shift']]
worked_hours

{'Nurse_0': 46,
 'Nurse_1': 46,
 'Nurse_2': 44,
 'Nurse_3': 40,
 'Nurse_4': 45,
 'Nurse_5': 47,
 'Nurse_6': 47,
 'Nurse_7': 47,
 'Nurse_8': 46,
 'Nurse_9': 41,
 'Nurse_10': 46,
 'Nurse_11': 40,
 'Nurse_12': 41,
 'Nurse_13': 39,
 'Nurse_14': 43}

### Average hours worked per day

In [13]:
for i, j in worked_hours.items():
    print(i,':',j/T)

Nurse_0 : 6.571428571428571
Nurse_1 : 6.571428571428571
Nurse_2 : 6.285714285714286
Nurse_3 : 5.714285714285714
Nurse_4 : 6.428571428571429
Nurse_5 : 6.714285714285714
Nurse_6 : 6.714285714285714
Nurse_7 : 6.714285714285714
Nurse_8 : 6.571428571428571
Nurse_9 : 5.857142857142857
Nurse_10 : 6.571428571428571
Nurse_11 : 5.714285714285714
Nurse_12 : 5.857142857142857
Nurse_13 : 5.571428571428571
Nurse_14 : 6.142857142857143


### Visualization tool

Below we provide a tool to check the schedule. 

In [14]:
# remove warning from pandas (in the viz_tool it does what we need)
import warnings
warnings.simplefilter(action='ignore')

In [15]:

def viz_tool(nurse,shift,day):
    '''
    interactive function to extract the information required:
    if a value is 'All' then no filter is applied to that specific feature
    '''
    df_tmp = sol_x

    # build every mask from df_tmp itself so that its index always matches
    if nurse != 'All':
        df_tmp = df_tmp[df_tmp['Nurse']==nurse]

    if shift != 'All':
        df_tmp = df_tmp[df_tmp['Shift']==shift]

    if day != 'All':
        df_tmp = df_tmp[df_tmp['Day']==day]

    print(df_tmp)

interact(viz_tool, nurse = widgets.Dropdown(value="All",placeholder='Type something', options=nurses+['All']),
              shift=widgets.Dropdown(value='All',placeholder='Type something', options=S+['All']),
              day = widgets.Dropdown(value="All",placeholder='Type something', options=days+['All'])
        );




## Minimizing the worst case scenario

We want to modify the objective function so as to minimize the worst-case scenario, i.e. minimize the maximum number of hours worked by any single nurse. This is accomplished by introducing an additional continuous variable $y\in\mathbb{R}$, rewriting the objective function as
\begin{equation*}
    \underset{x_{ist}\in\{0,1\},\,y\in\mathbb{R}}{\text{min}} \quad  y 
\end{equation*}

and introducing the set of constraints:
\begin{equation*}
    y\geq \sum_{s=1}^3\sum_{t=1}^T x_{ist}h_s \qquad \forall i=1,...,N
\end{equation*}

Since $y$ is minimized and bounded below by the workload of every nurse, at the optimum it equals the largest workload.

In [16]:
# create the new variable for minimizing the worst case scenario
y = mdl.continuous_var()

# modify the objective function
mdl.minimize(y)

# add the linear constraint
mdl.add_constraints( y>= mdl.sum( x[i,s,t]*h[s] for s in S for t in days) for i in nurses);
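
A quick sanity check on the value of $y$: the busiest nurse works at least the average workload, and at least the contract hours $H_i$. The sketch below computes this simple lower bound from the notebook parameters (copied into illustrative `_demo` names). It is only a bound: the one-shift-per-day and rest-after-night constraints may force the optimal $y$ to be larger.

```python
import math

N_demo, T_demo, H_base_demo = 15, 7, 36
R_demo = {'Morning': 5, 'Afternoon': 4, 'Night': 3}
h_demo = {'Morning': 7, 'Afternoon': 8, 'Night': 9}

# total hours that any feasible roster has to supply
total_hours = T_demo * sum(R_demo[s] * h_demo[s] for s in R_demo)

# shift durations are integers, so the average can be rounded up
y_lower_bound = max(math.ceil(total_hours / N_demo), H_base_demo)

print('average workload:', round(total_hours / N_demo, 2))
print('lower bound on y:', y_lower_bound)
```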

In [17]:
mdl.print_information()
mdl.solve()
mdl.solution.solve_details

Model: Scheduling
 - number of variables: 346
   - binary=315, integer=0, continuous=31
 - number of constraints: 291
   - linear=291
 - parameters: defaults
 - problem type is: MILP


docplex.mp.SolveDetails(time=0.969,status='integer optimal solution')

In [18]:
mdl.report()

* model Scheduling solved with objective = 45.000
*  KPI: Maximum # hours worked = 45.000
*  KPI: Minimum # hours worked = 41.000


## Comparing the new solution with the old one

In [19]:
# transform the solution into a dataframe
x_star_dict =mdl.solution.get_value_dict(x)
sol_x = x_star_to_pandas(x_star_dict)
sol_x

|   | Nurse | Shift | Day |
|---|---|---|---|
| 0 | Nurse_0 | Morning | Day_2 |
| 1 | Nurse_0 | Morning | Day_3 |
| 2 | Nurse_0 | Morning | Day_5 |
| 3 | Nurse_0 | Afternoon | Day_1 |
| 4 | Nurse_0 | Afternoon | Day_4 |
| 5 | Nurse_0 | Afternoon | Day_6 |
| 6 | Nurse_1 | Morning | Day_0 |
| 7 | Nurse_1 | Morning | Day_1 |
| 8 | Nurse_1 | Morning | Day_2 |
| 9 | Nurse_1 | Morning | Day_4 |


### Compute number of hours worked

In [20]:
# number of hours worked by each nurse 
worked_hours_wc = {n:0 for n in nurses}

for i,j in sol_x.iterrows():
    worked_hours_wc[j['Nurse']]+=h[j['Shift']]


#### Comparison with the previous solution

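Hours are more evenly distributed: individual workloads now range from 41 to 45 hours instead of 39 to 47, while the total over all nurses is 658 hours in both solutions.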

In [21]:
print("{0:<10s} {1:<10s} {2:<10s}".format("Nurse","Old value", "New value") )
for i in worked_hours.keys():
    print("{0:<10s} {1:<10.0f} {2:<10.0f}".format(i+":",worked_hours[i],worked_hours_wc[i]) )

Nurse      Old value  New value 
Nurse_0:   46         45        
Nurse_1:   46         45        
Nurse_2:   44         45        
Nurse_3:   40         45        
Nurse_4:   45         41        
Nurse_5:   47         45        
Nurse_6:   47         43        
Nurse_7:   47         43        
Nurse_8:   46         45        
Nurse_9:   41         45        
Nurse_10:  46         45        
Nurse_11:  40         43        
Nurse_12:  41         41        
Nurse_13:  39         45        
Nurse_14:  43         42        
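
To put a number on how the two rosters compare, one can look at the total, the range and the standard deviation of the workloads. The sketch below copies the two columns of the table above into plain lists (the names `old_hours` and `new_hours` are illustrative) and uses only the standard library.

```python
from statistics import pstdev

# workloads of Nurse_0 ... Nurse_14, copied from the comparison table
old_hours = [46, 46, 44, 40, 45, 47, 47, 47, 46, 41, 46, 40, 41, 39, 43]
new_hours = [45, 45, 45, 45, 41, 45, 43, 43, 45, 45, 45, 43, 41, 45, 42]

for label, hours in (('Old', old_hours), ('New', new_hours)):
    print(label,
          '| total:', sum(hours),
          '| range:', max(hours) - min(hours),
          '| std: {:.2f}'.format(pstdev(hours)))
```

A smaller range and standard deviation at the same total indicate that the work is shared more evenly rather than reduced.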


### Maximum number of hours worked:

In [22]:
print('Maximum number of hours worked:')
print("Old value: {0:>2.0f} New value: {1:>2.0f}".format(max(worked_hours.values()),max(worked_hours_wc.values())) )


Maximum number of hours worked:
Old value: 47 New value: 45


In [23]:
interact(viz_tool, nurse = widgets.Dropdown(value="All",placeholder='Type something', options=nurses+['All']),
              shift=widgets.Dropdown(value='All',placeholder='Type something', options=S+['All']),
              day = widgets.Dropdown(value="All",placeholder='Type something', options=days+['All']),
        );


