# Employee Scheduling Solver

## Introduction
This notebook was created to help solve an employee scheduling problem during the post-peak coronavirus period.

In my company, during the coronavirus lockdown, most people in my part of the organization worked from home. When the spread of the virus subsided, we started preparing for a controlled return to the office. One of the many safety measures we implemented was limiting the number of workspots per floor to maintain a 1.5-meter distance.

With the number of workspots reduced, not everyone can come to the office at the same time. Everyone has a preference on which days to come in. Teams that work together should come together. Figuring out a schedule manually, even for a small group of people, is a tedious and error-prone task. There must be a better way...

One way of looking at schedule construction is to treat it as a mathematical optimization problem:
* Define the "utility" of a schedule as the sum of the employees' happiness. Coming in on a preferred day makes an employee happy. Coming in on a non-preferred day makes an employee very unhappy. Coming in together with teammates makes employees happier.
* Maximize the "utility" of a schedule, subject to constraints:
  - The number of employees who come in on any given day should not exceed the number of available workspots.
  - Optional: the number of times an employee comes in each week should not exceed a chosen limit, to give everyone a chance to come to the office.
 
This type of optimization problem can be solved by a Mixed-Integer Linear Programming (MILP) solver. The code below shows how to construct such a problem, feed it to the solver, and display the solution.
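
Written out, the problem described above might look as follows. This is a sketch in my own notation, not taken from the notebook: $x_{e,d} \in \{0,1\}$ says whether employee $e$ attends on day $d$, $p_{e,d}$ is the preference score, $c_{e_1,e_2}$ is the connection strength between two employees, $S$ is the number of workspots, and $V_{\max}$ is the weekly visit limit.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{e \in E} \sum_{d \in D} p_{e,d}\, x_{e,d}
  \;+\; \sum_{\substack{e_1, e_2 \in E \\ e_1 < e_2}} \sum_{d \in D} c_{e_1,e_2}\, x_{e_1,d}\, x_{e_2,d} \\
\text{s.t.} \quad & \sum_{e \in E} x_{e,d} \le S && \forall d \in D \\
& \sum_{d \in W} x_{e,d} \le V_{\max} && \forall e \in E,\ \text{for every week } W \subseteq D \\
& x_{e,d} \in \{0, 1\} && \forall e \in E,\ \forall d \in D
\end{aligned}
```

The product $x_{e_1,d}\, x_{e_2,d}$ in the objective is not linear, so the problem is not yet a MILP in this form. The "linearization" variables and constraints in the model code replace each product with an auxiliary binary variable.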

## Alternative narrative
This approach works equally well for the following problem:

A small group of Avengers is on rotating patrol duty this week. Each day, up to two Avengers share a Quinjet and fight evil around the world.

* A Quinjet fits only one or two Avengers.
* Each Avenger should be on duty at least once a week, but not more than two days per week.
* Each Avenger has personal preferences regarding which days they want to work.
* Each Avenger has personal preferences regarding with whom they want to work.

Your boss, Nick Fury, asked you to "make a damn schedule, and make it damn work".

So you get to work. ¯\\\_(ツ)_/¯


### Quinjet
![quinjet](quinjet.jpg)

### Bad choice
![Tony vs Steve](tony-vs-steve.jpeg)

### Good choice
![Tony and Bruce](tony-and-bruce.jpg)

In [None]:
using JuMP
using Cbc
using Combinatorics
using Base.Iterators
using XLSX

### Utility function - load data from Excel sheet

In [None]:
# The data is loaded from named ranges. See example sheet for reference.
# 
# Preferences of employees to attend each day are indicated with
#   Y (yes, strength 5), M (maybe, strength 1) and N (no, strength -99).
# Empty cells are treated as "N"
#
# Connections between employees are indicated by integers. Positive values will steer the scheduler
# to schedule those people together. Negative values can be used to ensure that some people are never scheduled together.
# Empty cells are treated as 0.

function load_data(input_file_name)
    xf = XLSX.openxlsx(input_file_name)
    sheet = xf[1]
    my_employee_names = sheet["EmployeeNames"]
    my_day_names = sheet["DayNames"]
    my_preferences = sheet["Preferences"]
    my_connections = sheet["Connections"]
    close(xf)

    day_names = string.(reshape(my_day_names, size(my_day_names)[2]) )
    employee_names = string.(reshape(my_employee_names, size(my_employee_names)[1]))
    connections = Int64.(replace(my_connections, missing => 0))
    preferences = Int64.(replace(lowercase.(replace(my_preferences, missing => "N")),
                "y" => 5, "m" => 1, "n" => -99))

    n_days = size(day_names)[1]
    n_employees = size(employee_names)[1]
    @assert (n_employees, n_employees) == size(connections)
    @assert (n_employees, n_days) == size(preferences)
    return (employee_names, day_names, preferences, connections)
end
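
The encoding of preferences can be tried out without an Excel file. The sketch below applies the same `replace`/`lowercase` steps as `load_data` to a small hand-made matrix; the matrix `raw` is a made-up example, not data from the example sheet.

```julia
# Hypothetical raw cell values, as they might come out of a spreadsheet:
# mixed-case letters and an empty cell (missing).
raw = Any["Y" "m" missing
          "N" "y" "M"]

# Empty cells become "N", letters are lower-cased, then mapped to scores.
filled = replace(raw, missing => "N")
scores = Int64.(replace(lowercase.(filled), "y" => 5, "m" => 1, "n" => -99))

println(scores)
```

The large negative score for "N" means that, in practice, the solver will avoid scheduling someone on a day they marked as unavailable unless a constraint forces it.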

#### Load real data from a sheet

In [156]:
#input_file_name = "input.xlsx"
input_file_name = "input-small.xlsx"
NUM_SPOTS = 2 # Change this to the number of workspots available
MAX_VISITS_PER_WEEK = 2
MIN_VISITS_PER_WEEK = 1
employee_names, day_names, preferences, connections = load_data(input_file_name);
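
`MAX_VISITS_PER_WEEK` and `MIN_VISITS_PER_WEEK` are applied per block of five consecutive days, because `build_model` splits the days with `partition(days, 5)`. The sketch below shows how that split behaves; the 12-day horizon is a hypothetical example, not the notebook's data.

```julia
using Base.Iterators: partition

days = 1:12                          # hypothetical 12-day horizon
weeks = collect(partition(days, 5))  # blocks of up to five consecutive days

for (i, week) in enumerate(weeks)
    println("week $i: days ", collect(week))
end
```

Note that a short trailing block (days 11 and 12 here) gets the same minimum and maximum as a full week, which is worth keeping in mind when the number of days is not a multiple of five.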

### Compute utility variables

In [157]:
function get_optimization_ranges(employee_names, day_names)
    employees = 1:length(employee_names)
    days = 1:length(day_names)
    return (employees, days)
end

employees, days = get_optimization_ranges(employee_names, day_names)
employee_pairs = collect(combinations(employees, 2));

In [158]:
#employees, days, employee_pairs

### Construct a linear model

In [179]:
function build_model(employee_names::Array{String,1}, day_names::Array{String,1},
        preferences::Array{Int64,2}, connections::Array{Int64,2}, 
        num_spots::Int64, max_visits_per_week::Int64, min_visits_per_week::Int64)

    roster_model = Model(Cbc.Optimizer)

    employees, days = get_optimization_ranges(employee_names, day_names)
    employee_pairs = collect(combinations(employees, 2));

    # Attendance variables - employee per day
    attendance_vars = Dict((e, d) => @variable(roster_model, 
            base_name="{$(employee_names[e])}_on_{$(day_names[d])}", binary=true) 
        for e in employees for d in days)

    # Linearization variables
    z_vars = Dict((e1, e2, d) => @variable(roster_model, 
            base_name="$(employee_names[e1])_{$(employee_names[e2])}_on_$(day_names[d])", binary=true)
        for (e1, e2) in employee_pairs for d in days)

    # Maximize the sum of preferences of all attending employees
    # and the sum of connections between present employees
    @objective(roster_model, Max, 
        sum(preferences[e, d] * attendance_vars[(e, d)] for e in employees for d in days) + 
        sum(z_vars[(e1, e2, d)] * connections[e1, e2] for (e1, e2) in employee_pairs for d in days))

    # Limit the number of employees attending each day to num_spots
    for d in days
        @constraint(roster_model, sum(attendance_vars[(e, d)] for e in employees) <= num_spots)
    end

    # Keep the number of visits per employee per week (block of 5 days)
    # between min_visits_per_week and max_visits_per_week
    for week in partition(days, 5)
        for e in employees
            @constraint(roster_model, sum(attendance_vars[(e, d)] for d in week) <= max_visits_per_week)
            @constraint(roster_model, sum(attendance_vars[(e, d)] for d in week) >= min_visits_per_week)
        end
    end

    # Linearization constraints
    for d in days
        for (e1, e2) in employee_pairs
            @constraint(roster_model, z_vars[(e1, e2, d)] <= attendance_vars[(e1, d)])
            @constraint(roster_model, z_vars[(e1, e2, d)] <= attendance_vars[(e2, d)])
            @constraint(roster_model, z_vars[(e1, e2, d)] >= attendance_vars[(e1, d)] + attendance_vars[(e2, d)] - 1)
        end
    end
    return roster_model, attendance_vars
end

model, attendance_vars = build_model(employee_names, day_names, preferences, connections, NUM_SPOTS, MAX_VISITS_PER_WEEK, MIN_VISITS_PER_WEEK);
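
The three linearization constraints in `build_model` are meant to force each `z` variable to equal the product of the two attendance variables it links. A small brute-force check, independent of JuMP, can confirm this for all binary combinations. The helper name `feasible_z` is introduced here purely for illustration.

```julia
# Return every binary z that satisfies the three linearization constraints
# for given binary attendance values x1 and x2.
function feasible_z(x1::Int, x2::Int)
    return [z for z in 0:1 if z <= x1 && z <= x2 && z >= x1 + x2 - 1]
end

for x1 in 0:1, x2 in 0:1
    println("x1=$x1, x2=$x2 -> feasible z = ", feasible_z(x1, x2),
            ", product = ", x1 * x2)
end
```

In each of the four cases exactly one value of `z` is feasible, and it equals `x1 * x2`, so the connection bonus or penalty is counted only on days when both employees are present.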

In [181]:
JuMP.latex_formulation(model);

### Run optimization

In [182]:
optimize!(model)

Welcome to the CBC MILP Solver 
Version: 2.10.5 
Build Date: Jan  1 1970 

command line - Cbc_C_Interface -solve -quit (default strategy 1)
Continuous objective value is 42 - 0.00 seconds
Cgl0003I 0 fixed, 0 tightened bounds, 15 strengthened rows, 0 substitutions
Cgl0003I 0 fixed, 0 tightened bounds, 15 strengthened rows, 0 substitutions
Cgl0004I processed model has 38 rows, 30 columns (30 integer (30 of which binary)) and 120 elements
Cutoff increment increased from 1e-05 to 0.9999
Cbc0038I Initial state - 0 integers unsatisfied sum - 0
Cbc0038I Solution found of -42
Cbc0038I Before mini branch and bound, 30 integers at bound fixed and 0 continuous
Cbc0038I Mini branch and bound did not improve solution (0.00 seconds)
Cbc0038I After 0.00 seconds - Feasibility pump exiting with objective of -42 - took 0.00 seconds
Cbc0012I Integer solution of -42 found by feasibility pump after 0 iterations and 0 nodes (0.00 seconds)
Cbc0001I Search completed - best objective -42, took 0 iterations and

#### Print results (the output can be large, depending on the size of your problem)

In [184]:
println("Status = ", termination_status(model))
println("Solution = ", objective_value(model))
println()

for e in employees
    n = lpad(employee_names[e], 8)
    print("$(n): ")
    for d in days
        # round rather than convert: a solver may return e.g. 0.9999999 for a binary variable
        print(round(Int, value(attendance_vars[(e, d)])), ", ")
    end
    println()
end

Status = OPTIMAL
Solution = 42.0

    Tony: 1, 0, 1, 0, 0, 
   Bruce: 1, 0, 1, 0, 0, 
   Steve: 0, 1, 0, 1, 0, 
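
As a quick sanity check, the printed schedule can be tested against the capacity and visit limits. This is a minimal sketch: the matrix is copied by hand from the output above, and the limits are the values set earlier (`NUM_SPOTS = 2`, `MAX_VISITS_PER_WEEK = 2`, `MIN_VISITS_PER_WEEK = 1`).

```julia
# Rows are employees, columns are days (copied from the printed result).
schedule = [1 0 1 0 0    # Tony
            1 0 1 0 0    # Bruce
            0 1 0 1 0]   # Steve

num_spots, max_visits, min_visits = 2, 2, 1

per_day = vec(sum(schedule, dims=1))       # attendance per day
per_employee = vec(sum(schedule, dims=2))  # visits per employee

println("attendance per day:  ", per_day)
println("visits per employee: ", per_employee)

@assert all(n -> n <= num_spots, per_day)
@assert all(v -> min_visits <= v <= max_visits, per_employee)
```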


### Utility function - save solution to an Excel file

In [185]:
function save_solution(output_file_name::String, 
        employee_names::Array{String,1}, 
        day_names::Array{String,1}, attendance_vars)

    employees, days = get_optimization_ranges(employee_names, day_names)
    # round rather than convert: solver values may be slightly off from 0 or 1
    allocations = [[round(Int64, value(attendance_vars[(e, d)])) for d in days] for e in employees]

    XLSX.openxlsx(output_file_name, mode="w") do xf
        sheet = xf[1]
        XLSX.rename!(sheet, "schedule")
        sheet[2, 1, dim=1] = employee_names # dim=1 means column
        sheet[1, 2, dim=2] = day_names # dim=2 means row
        for e in employees
            for d in days
                sheet[1 + e, 1 + d] = allocations[e][d]
            end
        end
    end
end

save_solution (generic function with 1 method)

In [186]:
output_file_name = "schedule.xlsx"
save_solution(output_file_name, employee_names, day_names, attendance_vars)