# I. Introduction

Greedy algorithms can solve many different types of problems.  This exercise explores the use of a greedy algorithm to assign security officers to a work-shift scheduling problem.

First we import helpful packages.

In [39]:
import copy
import pandas as pd
#import matplotlib.pyplot as plt
#import seaborn as sns
#%matplotlib notebook

# II. Generating the Test Data

To add some realism and complexity to the scheduling problem, a number of constraints were added to the initial assignment.

First, we need to schedule one security officer for each hour of each day in a week.

In [40]:
hours_in_day = 24
days_to_cover = ['0_mon', '1_tue', '2_wed', '3_thu', '4_fri', '5_sat', '6_sun']

To keep track of which officers are assigned when, we construct a list of 6 generic security officers.

In [41]:
peronnel = ['officer1', 'officer2', 'officer3', 'officer4', 'officer5', 'officer6']

The costs included for this project are outlined below.  In this more advanced problem, we consider that the security officers have yet to be hired.  As such, for every officer needed, there is an upfront fixed cost of $1,000 to hire, onboard, and train the officer.

In [42]:
base_hourly_wage = 15 # dollars
overtime_hourly_wage = base_hourly_wage + 5 # dollars
onboarding_cost = 1000 # dollars

The constraints for this problem include a maximum number of hours that any individual officer can work...
* per week
* per calendar day
* consecutively

There is also a minimum number of hours that an officer can work in a single shift.

In [43]:
max_per_week = 40 # hard max for each officer per week
max_per_day = 14 # hard max for each officer per day
max_consecutive = 14 # maximum consecutive hours an officer can work
min_hours_per_shift = 4 # minimum consecutive hours an officer can work

Per the original problem, overtime pay kicks in when an officer works more than 8 consecutive hours.

In [44]:
consecutive_before_overtime = 8 # maximum consecutive hours worked before overtime pay kicks in
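
To make the overtime rule concrete, here is a minimal sketch of how the wage cost of a single shift could be computed. It assumes that only the hours beyond the 8th consecutive hour are paid at the overtime rate, which is one reading of the rule above rather than something the problem statement spells out. `shift_cost` is a hypothetical helper and is not used elsewhere in this notebook.

```python
def shift_cost(shift_length, base_wage=15, overtime_wage=20, overtime_threshold=8):
    """Wage cost in dollars of one shift of `shift_length` consecutive hours.

    Assumption: only hours past `overtime_threshold` earn the overtime wage.
    """
    regular_hours = min(shift_length, overtime_threshold)
    overtime_hours = max(shift_length - overtime_threshold, 0)
    return regular_hours * base_wage + overtime_hours * overtime_wage

print(shift_cost(8))   # 8 regular hours
print(shift_cost(14))  # 8 regular hours + 6 overtime hours
```

Under this assumption an 8-hour shift costs $120 and a 14-hour shift costs $240, so stretching a shift from 8 to 14 hours costs $30 more than covering the same 6 hours at the base wage.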

All of these constraints have been parameterized so they can easily be altered and the problem re-solved.

Lastly, we instantiate a few data structures to track the assignment of the officers to each hour in a week.

In [45]:
hours_to_cover = {}
for day in days_to_cover:
    hours_to_cover[day] = hours_in_day

hours_worked = {}
for officer in peronnel:
    hours_worked[officer] = {}
    for day in days_to_cover:
        hours_worked[officer][day] = []

**hours_worked** is a doubly nested dictionary: for each officer and each day, it holds a list of the hours that officer works on that day.
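
As an illustration of this layout, the sketch below builds a small hand-filled version of the structure and reads hours back out of it. The name `toy_hours_worked` and its contents are made up for this example; the hour labels run from 1 to 24, matching how hours are appended later in the notebook.

```python
# Toy version of the nested structure: officer -> day -> list of hours worked.
toy_hours_worked = {
    'officer1': {'0_mon': [1, 2, 3, 4], '1_tue': []},
    'officer2': {'0_mon': [5, 6, 7, 8, 9, 10], '1_tue': [1, 2, 3, 4]},
}

# Hours worked by one officer on one day
monday_hours = len(toy_hours_worked['officer2']['0_mon'])

# Hours worked by one officer over the whole (toy) week
weekly_hours = sum(len(hours) for hours in toy_hours_worked['officer2'].values())

print(monday_hours, weekly_hours)
```

Here 'officer2' works 6 hours on Monday and 10 hours over the two days. Counting list lengths like this is how the weekly and daily limits are checked further on.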

The last 'constraint' is that officers need a minimum time off between shifts equal to the length of their last shift.  To help track the time needed before a new shift can be assigned to an officer, one last data structure is instantiated.

In [46]:
most_rec_consecutive = {}
for officer in peronnel:
    most_rec_consecutive[officer] = 0

# III. Functions for a Greedy Algorithm

To run our greedy algorithm, a few helper functions are defined first.

The first helper function finds the last hour an officer worked, so we can check whether their time off since their last shift has been long enough for them to be assigned a new shift.
* If the officer has worked today, the function returns the last hour they worked today.
* If the officer has not worked yesterday or today, the function returns _None_, indicating that it has been _at least_ 24 hours, more than the 14-hour maximum any officer would have to wait before their next shift.
* If the officer worked yesterday but not today, the function returns zero or a negative value indicating how many hours _before_ the last midnight they last worked.

In [47]:
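def last_two_days_last_hour_worked(officer, day, current_hour):
    # current_hour is not used here; it is kept so existing calls stay unchanged
    # last hour worked today, or None if the officer has not worked today
    last_hour_worked = max(hours_worked[officer][day], default = None)
    if last_hour_worked is None:
        # index - 1 wraps Monday back to Sunday, which is still empty while Monday is being filled
        prev_day = days_to_cover[days_to_cover.index(day) - 1]
        prev_last_hour = max(hours_worked[officer][prev_day], default = None)
        if prev_last_hour is not None:
            # zero or negative: hours before the most recent midnight
            last_hour_worked = prev_last_hour - hours_in_day
    return last_hour_worked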
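
To see the three possible return values side by side, the sketch below uses a stand-alone variant of this lookup that takes the schedule and day list as arguments instead of reading the notebook's global variables. The name `last_hour_worked_sketch` and the toy data are assumptions made for this illustration only.

```python
def last_hour_worked_sketch(schedule, days, officer, day, hours_in_day=24):
    """Stand-alone variant of the lookup above, with state passed in explicitly."""
    last_hour = max(schedule[officer][day], default=None)
    if last_hour is None:
        prev_day = days[days.index(day) - 1]
        prev_last = max(schedule[officer][prev_day], default=None)
        if prev_last is not None:
            # Express yesterday's last hour relative to today's midnight
            last_hour = prev_last - hours_in_day
    return last_hour

days = ['0_mon', '1_tue', '2_wed']
schedule = {'officer1': {'0_mon': [15, 16, 17, 18, 19, 20, 21, 22], '1_tue': [], '2_wed': []}}

print(last_hour_worked_sketch(schedule, days, 'officer1', '0_mon'))  # worked today
print(last_hour_worked_sketch(schedule, days, 'officer1', '1_tue'))  # worked yesterday only
print(last_hour_worked_sketch(schedule, days, 'officer1', '2_wed'))  # no work in two days
```

The three calls return 22, -2, and None. The -2 means the officer's shift ended 2 hours before midnight, so once 6 hours of Tuesday have been filled the gap since their last shift is 6 - (-2) = 8 hours.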

The next function helps us determine how many hours a given officer can work, starting at a given time.

We call this function with a specific officer, the current day and hour, and the maximum number of hours we want returned.  This last argument allows the function to be called first with the maximum time before overtime kicks in (8 hours) and then again with the hard maximum of consecutive hours (14) if our schedule cannot be filled with non-overtime work.

Within this function, we use the 'last hour worked' function we just defined and check that one of the following holds:
1. the **last_hour_worked** is _None_ (it has been at least 24 hours)
2. the difference between the current hour and the **last_hour_worked** is at least the length of the officer's last shift
3. the officer was previously assigned a shift that ends at the current time, but they are available to work more hours (specifically, when we call the function first with the overtime threshold of 8 consecutive hours and then call it again with the maximum of 14 consecutive hours)

In [48]:
def hours_can_work(officer, day, current_hour, max_hours):
    hours_needed = hours_in_day - current_hour
    weeks_hours_worked = sum([len(sublist) for sublist in hours_worked[officer].values()])
    todays_hours_worked = len(hours_worked[officer][day])
    
    # these constraints ignore any consecutive hours worked requirements
    week_constraint = max_per_week - weeks_hours_worked
    day_constraint = max_per_day - todays_hours_worked
    if min_hours_per_shift > week_constraint:
        return 0
    if min(hours_needed, min_hours_per_shift) > day_constraint:
        return 0
    
    # determines how many consecutive hours the officer can work, ignoring the complexities of the daily maximum constraint
    last_hour_worked = last_two_days_last_hour_worked(officer, day, current_hour)
    if last_hour_worked is not None:
        hour_lapse = current_hour - last_hour_worked
        if hour_lapse >= most_rec_consecutive[officer]:
            consec_hours_only = max_hours
        elif hour_lapse == 0 and most_rec_consecutive[officer] < max_hours:
            consec_hours_only = max_hours - most_rec_consecutive[officer]
        else:
            return 0
    else:
        consec_hours_only = max_hours
    
    # if the hours needed before the day is complete are less than the number of hours this officer can work this day, 
    # then we can ignore the day constraint
    if day_constraint >= hours_needed:
        return min(week_constraint, consec_hours_only)
    else:
        return min(week_constraint, day_constraint, consec_hours_only)
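
The rest-period logic in the middle of this function can be looked at in isolation. The sketch below is a simplified, stand-alone restatement of that one step, ignoring the weekly and daily limits. `consecutive_hours_allowed` is a hypothetical name, and `hour_lapse` is passed in directly rather than computed from the schedule.

```python
def consecutive_hours_allowed(hour_lapse, last_shift_length, max_hours):
    """Consecutive hours an officer may start now, considering only the rest rule.

    hour_lapse: hours since the officer's last worked hour, or None if they
                have not worked today or yesterday.
    """
    if hour_lapse is None:
        return max_hours                      # case 1: at least a full day off
    if hour_lapse >= last_shift_length:
        return max_hours                      # case 2: rested long enough
    if hour_lapse == 0 and last_shift_length < max_hours:
        return max_hours - last_shift_length  # case 3: extend the shift just ended
    return 0                                  # still resting

print(consecutive_hours_allowed(None, 8, 8))
print(consecutive_hours_allowed(8, 8, 8))
print(consecutive_hours_allowed(3, 8, 8))
print(consecutive_hours_allowed(0, 8, 14))
```

These calls return 8, 8, 0, and 6. The last one is the overtime case: an officer who has just finished 8 hours can be extended by up to 6 more when the 14-hour maximum is passed in.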

As part of our solution, because each additional officer in the schedule incurs a $1,000 up-front cost, we'd like to avoid including any unneeded officers in the schedule if possible.  As such, let's first determine the minimum possible number of security officers we would need, based solely on the 40-hour work week maximum.

In [49]:
min_personnel = -(-(hours_in_day * len(days_to_cover)) // max_per_week) # ceiling division: 168 / 40 rounds up to 5
included = peronnel[0:min_personnel]
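
The expression for **min_personnel** relies on negated floor division to round up. As a quick check of that idiom, the sketch below compares it with `math.ceil` for the numbers used in this problem (the variable names are local to this example).

```python
import math

hours_in_week = 24 * 7
max_per_week = 40

# Negating before and after floor division rounds up instead of down
via_floor_division = -(-hours_in_week // max_per_week)
via_math_ceil = math.ceil(hours_in_week / max_per_week)

print(via_floor_division, via_math_ceil)
```

Both give 5: four officers at 40 hours cover only 160 of the 168 hours in the week, so a fifth is needed for the remaining 8.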

Now we have the minimum set of potential security officers we'll need for this project in **included**.  If we get stuck solving the problem with the existing personnel, we will have to add security officers one at a time to this **included** data structure.

Our last helper function finds officers available to work (via the **hours_can_work** helper function we just defined) and assigns them to the next unfilled part of the schedule.  Shifts can extend from one day to the next, and whenever a new shift is assigned, the function _breaks_ so we can re-assess whether the current day's hours have all been filled yet.  

Each time this function is called, we always begin searching for available officers at 'officer1' and cycle through the officers in order to find an available one.  This is one part of the 'greediness' of this work-shift schedule solution.

In [50]:
def assign_hours(day, max_hours):
    prev_day = days_to_cover[days_to_cover.index(day) - 1]
    for officer in included:
        current_hour = hours_in_day - hours_to_cover[day]
        hours_to_work = hours_can_work(officer, day, current_hour, max_hours)
        if hours_to_work == 0:
            continue # skips the rest of the code in this iteration of the for-loop and moves to the next officer
        
        # if the officer can work and their shift will end by the end of this day
        if hours_to_work > 0 and hours_to_work <= hours_to_cover[day]:
            # if we're extending a previously assigned shift with the same officer
            # (logical 'and' is used because bitwise '&' binds tighter than '==')
            if max(hours_worked[officer][day], default = None) == current_hour or \
            (max(hours_worked[officer][prev_day], default = None) == hours_in_day and current_hour == 0):
                most_rec_consecutive[officer] = most_rec_consecutive[officer] + hours_to_work
            else:
                most_rec_consecutive[officer] = hours_to_work
            # this for-loop actually assigns the hours the officer will work as elements in a list
            for hour in range(hours_to_work):
                hours_worked[officer][day].append(current_hour + hour + 1)
            # now we can reduce the number of hours that still need coverage on this day
            hours_to_cover[day] = hours_to_cover[day] - hours_to_work
            # This is another greedy component to the solution. 
            # Every time a new shift is assigned, we exit the loop (and the function) 
            # and 'start over' to find the first available officer for the next shift.
            break
        
        # if the officer can work and their shift will continue to the next day
        if hours_to_work > 0 and hours_to_work > hours_to_cover[day]:
            # if we're extending a previously assigned shift with the same officer
            # (logical 'and' is used because bitwise '&' binds tighter than '==')
            if max(hours_worked[officer][day], default = None) == current_hour or \
            (max(hours_worked[officer][prev_day], default = None) == hours_in_day and current_hour == 0):
                most_rec_consecutive[officer] = most_rec_consecutive[officer] + hours_to_cover[day]
            else:
                most_rec_consecutive[officer] = hours_to_cover[day]
            # this for-loop actually assigns the hours the officer will work that day as elements in a list
            for hour in range(hours_to_cover[day]):
                hours_worked[officer][day].append(current_hour + hour + 1)
            # now we can reduce the number of hours that this officer can continue to work
            hours_to_work = hours_to_work - hours_to_cover[day]
            hours_to_cover[day] = 0
            
            # if the 'day' is Sunday, then the whole schedule has been filled
            # otherwise, we can keep assigning hours to the same officer for the 'next' day
            if day != '6_sun':
                next_day = days_to_cover[days_to_cover.index(day) + 1]
                most_rec_consecutive[officer] = most_rec_consecutive[officer] + hours_to_work
                # this for-loop actually assigns the hours the officer will work that day as elements in a list
                for hour in range(hours_to_work):
                    hours_worked[officer][next_day].append(hour + 1)
                # now we can reduce the number of hours that still need coverage on this day
                hours_to_cover[next_day] = hours_to_cover[next_day] - hours_to_work
            # This is another greedy component to the solution. 
            # Every time a new shift is assigned, we exit the loop (and the function) 
            # and 'start over' to find the first available officer for the next shift.
            break

This final section of code cycles through the days in the weekly schedule and, while a day still has unassigned hours, calls the **assign_hours** function.  

We take a copy of the schedule before calling **assign_hours** and compare it with the schedule afterwards.  If there is no change after calling **assign_hours** with the maximum consecutive hours ('max_hours') equal to the 8 consecutive hours an officer can work before overtime kicks in, then we call the same function again with the absolute maximum consecutive hours (14).  If there is still no change in the schedule, we add another officer.

This order of operations is another component of the 'greediness' of this solution.

In [51]:
# cycle through 'cases' from most desirable to least desirable
for day in days_to_cover:
    while hours_to_cover[day] > 0:
        start_sched = copy.deepcopy(hours_worked)
        assign_hours(day, consecutive_before_overtime)
        
        if hours_worked == start_sched:
            assign_hours(day, max_consecutive)
        
        if hours_worked == start_sched:
            included.append(peronnel[min_personnel])
            min_personnel = min_personnel + 1
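
One caveat with this loop: it assumes the pool always has another officer to add. With only 6 officers in the pool, indexing it for a seventh would raise an `IndexError`. The sketch below shows one way such a step could fail with a clearer message. `add_next_officer` is a hypothetical helper, not part of the notebook, and it assumes `included` is always the first few entries of the pool, as it is here.

```python
def add_next_officer(included, pool):
    """Append the next unused officer from `pool` to `included`.

    Hypothetical helper: raises a descriptive error when the pool is used up.
    Assumes `included` is a prefix of `pool`.
    """
    if len(included) >= len(pool):
        raise RuntimeError("No officers left in the pool; the schedule cannot be completed.")
    included.append(pool[len(included)])
    return included

pool = ['officer1', 'officer2', 'officer3']
team = ['officer1', 'officer2']
add_next_officer(team, pool)
print(team)
```

After this call `team` holds all three officers, and a further call would raise the `RuntimeError` instead of an unexplained index error.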

When the for-loop completes, **hours_worked** contains the full work-shift schedule.

# IV. Results Preparation & Checks

To view the results more easily, we convert the nested dictionary of lists into a pandas DataFrame of lists, with one column per officer and one row per day.

In [54]:
results_df = pd.DataFrame(hours_worked)

We then perform a few cursory checks of the results:
1. that each day has 24 hours of coverage
2. that no officer gets assigned more than 40 hours across the whole (weekly) schedule
3. that no officer gets assigned more than 14 hours in a single day

In [56]:
# 24-hour coverage check per day
sum([results_df[x].str.len() for x in results_df.columns]) == hours_in_day

0_mon    True
1_tue    True
2_wed    True
3_thu    True
4_fri    True
5_sat    True
6_sun    True
dtype: bool

In [59]:
# check max hours per week
[sum(results_df[x].str.len()) <= max_per_week for x in results_df.columns] 

[True, True, True, True, True, True]

In [60]:
# check max hours per day
[results_df[x].str.len() <= max_per_day for x in results_df.columns] 

[0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer1, dtype: bool, 0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer2, dtype: bool, 0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer3, dtype: bool, 0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer4, dtype: bool, 0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer5, dtype: bool, 0_mon    True
 1_tue    True
 2_wed    True
 3_thu    True
 4_fri    True
 5_sat    True
 6_sun    True
 Name: officer6, dtype: bool]
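
The coverage check above counts hours per day, so it cannot tell a gap from a double-booking: a day with one hour missing and another hour assigned twice would still total 24. A stricter check could confirm that every hour is covered exactly once. The sketch below is one way to do that on the nested dictionary. `coverage_problems` and the toy schedule are assumptions for illustration, not part of the original checks.

```python
from collections import Counter

def coverage_problems(schedule, days, hours_in_day=24):
    """Return {day: [hours not covered exactly once]} for an officer -> day -> hours dict."""
    problems = {}
    for day in days:
        counts = Counter(hour for officer in schedule for hour in schedule[officer][day])
        bad_hours = [h for h in range(1, hours_in_day + 1) if counts[h] != 1]
        if bad_hours:
            problems[day] = bad_hours
    return problems

toy_days = ['0_mon']
toy_schedule = {
    'officer1': {'0_mon': [1, 2, 3]},
    'officer2': {'0_mon': [3, 4]},   # hour 3 is double-booked, hour 5 is uncovered
}
print(coverage_problems(toy_schedule, toy_days, hours_in_day=5))
```

On this toy 5-hour day the function reports hours 3 and 5, even though the total number of assigned hours is 5. Called with the notebook's **hours_worked** and **days_to_cover**, an empty dictionary would indicate that every hour is covered exactly once.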

Lastly, we can just take a look at the results in their entirety to observe the schedule in more detail.

In [61]:
results_df

| | officer1 | officer2 | officer3 | officer4 | officer5 | officer6 |
|---|---|---|---|---|---|---|
| 0_mon | [1, 2, 3, 4, 5, 6, 7, 8, 17, 18, 19, 20, 21, 22] | [9, 10, 11, 12, 13, 14, 15, 16] | [23, 24] | [] | [] | [] |
| 1_tue | [7, 8, 9, 10, 11, 12, 13, 14, 23, 24] | [15, 16, 17, 18, 19, 20, 21, 22] | [1, 2, 3, 4, 5, 6] | [] | [] | [] |
| 2_wed | [1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19, 20, 21,... | [7, 8, 9, 10, 11, 12, 13, 14, 23, 24] | [] | [] | [] | [] |
| 3_thu | [] | [1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19, 20, 21,... | [7, 8, 9, 10, 11, 12, 13, 14, 23, 24] | [] | [] | [] |
| 4_fri | [] | [] | [1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19, 20, 21,... | [7, 8, 9, 10, 11, 12, 13, 14, 23, 24] | [] | [] |
| 5_sat | [] | [] | [7, 8, 9, 10, 11, 12, 13, 14] | [1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19, 20, 21,... | [23, 24] | [] |
| 6_sun | [] | [] | [] | [7, 8, 9, 10, 11, 12, 13, 14, 23, 24] | [1, 2, 3, 4, 5, 6, 15, 16, 17, 18, 19, 20, 21,... | [] |

# V. Cost Calculations & Results Discussion

Examination of the full results confirms that:
1. all our 'hard' constraints have been fully satisfied
2. we were able to avoid adding a 6th officer for an additional $1,000
3. we were able to avoid any overtime work

As such, our upfront fixed and weekly variable costs are as follows...

In [63]:
fixed_cost = onboarding_cost * len(included) # one onboarding fee per officer used
print("The upfront fixed cost for this schedule is: $", fixed_cost)
variable_cost = sum([sum(results_df[x].str.len())*base_hourly_wage for x in results_df.columns])
print("The weekly variable cost for this schedule is: $", variable_cost)

The upfront fixed cost for this schedule is: $ 5000
The weekly variable cost for this schedule is: $ 2520


In this case our greedy 'algorithm' found one of the optimal solutions to this problem.  

This problem has _many_ optimal solutions.  As a simple example, swapping the roles of any two officers gives a different schedule with the same total fixed & variable costs.  

However, with different constraints or available resources, this greedy approach may not always find an optimal solution.  It assigns officers in order, always fills the next unfilled hour first, and always assigns an officer the maximum amount of time they have available (overtime or not).  Thus, it is possible for the functions to produce a schedule that requires overtime work or additional officers when an alternate schedule would need neither.  

This is particularly true because of the **min_hours_per_shift** constraint.  An officer we evaluate might be available to work 6 consecutive hours at a particular time, in which case we would assign all 6 hours.  However, if they had previously worked a total of 32 hours, they have now been assigned 38 hours for the week, and the 2 hours remaining under the 40-hour weekly maximum are fewer than the 4 hours that **min_hours_per_shift** requires, so they cannot work any more.  A similar situation arose for 'officer1' in the solution we created above, who finished the week with 38 assigned hours.

In [66]:
sum(results_df['officer1'].str.len())

38

If we had instead assigned them only a 4-hour shift, we could have assigned them another 4-hour shift later, potentially avoiding the need for other officers to work overtime or for new officers to be added to the schedule.  
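
The arithmetic behind this can be sketched with a small hypothetical helper, `usable_hours_left`, which mirrors the weekly check in **hours_can_work**: any remaining weekly capacity below the minimum shift length is effectively lost.

```python
def usable_hours_left(hours_assigned, max_per_week=40, min_hours_per_shift=4):
    """Weekly hours an officer can still be given, under the minimum-shift rule."""
    remaining = max_per_week - hours_assigned
    return remaining if remaining >= min_hours_per_shift else 0

print(usable_hours_left(32))  # before the next shift is assigned
print(usable_hours_left(38))  # after a 6-hour shift
print(usable_hours_left(36))  # after a 4-hour shift instead
```

These calls return 8, 0, and 4: the 6-hour shift strands the officer's last 2 hours, while the 4-hour shift leaves room for one more minimum-length shift.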

However, we expect that in the vast majority of realistic scenarios, this approach will provide an optimal or _very close to optimal_ solution to this work-shift scheduling problem.