# Deploying A Schedule Building Algorithm
> Automating a manual process!

- toc: true
- badges: true
- comments: true
- author: David De Sa
- categories: [jupyter]

# Context

## Goal
Deploy an algorithm for schedulers to use that is:
-  easy to use and learn
-  transparent
-  quick
-  flexible

## Motivation
In a 24/7 manufacturing environment, the weekend shifts are covered mostly by overtime, which is scheduled according to employee availability, subject to constraints outlined in the labour collective agreement. Due to changing production needs as well as staff availability, the schedule must be re-drafted many times, often on short notice and under tight time constraints. Drafting it is tedious, error-prone, and time consuming. It could be automated.

## Challenges
 - Data is entered manually in a variety of formats, or is not available in machine-readable form at all (e.g. total hours, employee type, employee availability, individualized job restrictions, outlier reasons for non-eligibility such as consecutive days worked). This is probably the main challenge!
 - Algorithm ambiguity. The collective agreement defines the constraints that each assignment decision is subject to, but doesn't strictly specify all aspects, leaving some arbitrary choices to the scheduler.
 - Many esoteric rules and edge cases determine whether an assignment is valid, and these are subject to change whenever the contract is renegotiated.
 - Usability. The deployment must be available to all schedulers, and have a very low barrier to entry w.r.t. training and usability.

## A Bit of Lore
The notion of automating this process has been bouncing around my head for over a year. I always felt the main challenge was the data: it lives in badly formatted Excel sheets, necessarily so, because what works for human input and use is not what is best for machine readability. I am very confident I could have made something that worked in VBA, but that language is such a pain to develop in, particularly with badly formatted data. I knew Python was a better solution, but I didn't have the bridge between the two to make something that worked. Finally, while following the FastAI course, I came across the Hugging Face Spaces (HFS) + Gradio wombo combo for sharing Python scripts publicly via a great UI. That discovery got me to finally commit to this solution path.

# Solution
## Features
The program should take in various required inputs and return a completed weekend staffing schedule, plus a separate text output listing the sequence of assignments made. The Blocks functionality in Gradio/HFS should also make it possible to feed in an assignment number and get back the partially completed schedule at that step.
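As a rough illustration of the "partial schedule at step N" feature, here is a minimal sketch. It assumes the assignment sequence is kept as an ordered list of `(eeid, slot)` pairs; the names `assignment_log` and `schedule_at_step` and the slot labels are hypothetical, not taken from the deployed code.

```python
def schedule_at_step(assignment_log, step):
    """Replay the first `step` assignments and return the partial schedule."""
    schedule = {}  # slot label -> employee id
    for eeid, slot in assignment_log[:step]:
        schedule[slot] = eeid
    return schedule

# Hypothetical log of assignments, in the order they were made
assignment_log = [
    (101, "Sat-Day/Brew"),
    (102, "Sat-Eve/Pack"),
    (103, "Sun-Day/Ship"),
]

# Schedule as it stood after the second assignment
partial = schedule_at_step(assignment_log, 2)
print(partial)
```

Because the log is replayed from the start, the same list doubles as the text output of assignments made.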

## Codebase
The codebase uses Gradio, hosted via HFS, to make a Python algorithm available with easy integration of inputs and outputs. At first blush I thought that Pandas DataFrames would be the best input mode for tabular data, but I ruled that out when HFS didn't allow for bulk copying and pasting. Maybe that was for the best, because it pushed me to figure out how to work with the generic File input/output mode. It might be a little more painful to program (I have to define methods to identify the right tables within the Excel file), but it is a lot nicer for the end user (drag and drop the relevant files and go!). I was concerned about the extra steps of processing the Excel file, but as with everything in Python, there is a library for that! I started off with some basic tests to ensure that what I wanted/needed to do was possible.

### File Manipulation
Here is my proof of concept for file manipulation. If copied into an empty Gradio space on HFS, it takes in an Excel file and adds a new table to the spreadsheet. This was all I needed to know that this could be done...

In [None]:
import gradio as gr
import openpyxl as pyxl #openPyXl allows for excel file manipulation in python

def myFunction(fl,txt):
    myWb=pyxl.load_workbook(fl.name) #Load excel file
    tab = pyxl.worksheet.table.Table(displayName="Table3", ref="E1:F5") #Define new table
    style = pyxl.worksheet.table.TableStyleInfo(name="TableStyleMedium9",showRowStripes=True, showColumnStripes=True)
    tab.tableStyleInfo = style #Assign style to table
    ws=myWb.active 
    ws.add_table(tab) #Add defined table to sheet within the loaded workbook
    otpt_fl_name='try.xlsx' 
    myWb.save(otpt_fl_name) #Save file
    return otpt_fl_name #Define output for HFS interface

demo = gr.Interface(
    myFunction, #Func to take in file and text
    [
        gr.File(
        ),
        gr.Textbox(
            label="Initial text",
            lines=3,
            value="The quick brown fox jumped over the lazy dogs.",
        ),
    ],
    gr.File(),
    description="Enter refusal files",
)
demo.launch()

### Retrieving Disparate Tables

As mentioned previously, one challenge would be pulling data from tables scattered unpredictably throughout the sheet. Here I had to remember that sometimes the easiest way to rob a bank is through the front door, not by breaking through the wall... I simply changed the existing Excel template files (filled in by the end user) so that the data tables are actually defined as 'Tables' in Excel, which makes them referenceable by the openPyXl tools. Some further data type transformations were required. Here is an example with a blank book containing a trivial data table called 'tstTbl' in Excel:

In [3]:
import openpyxl as pyxl
import pandas as pd
import numpy as np
myWb=pyxl.load_workbook('../images/Other_Files/TblTestBook.xlsx') 
#Didn't think the .. parent directory would work but it does!
ws=myWb['Sheet1']
tab=ws.tables['tstTbl'] #Pull out table
def tbl_to_df(tab):
    ref=tab.ref #Pull cell reference to string for display
    tab=[[x.value for x in sublist] for sublist in ws[tab.ref]] #Convert to list of lists (each sublist as row of excel table)
    return pd.DataFrame(tab) #Convert nested lists to Dataframe
print(tbl_to_df(tab))

         0        1        2
0  myHead1  myHead2  myHead3
1        1        a        .
2        2        b        ,
3        3        c        ]


And pulling info from multiple tables in a sheet:

In [16]:
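for t in ws.tables:
    tab=ws.tables[t]
    tab=tbl_to_df(tab)
    #NOTE: 'ref' inside tbl_to_df is local, so the name used here is a stale global from another cell.
    #That is why every table below reports the same reference; use ws.tables[t].ref for each table's own range.
    print('Table cells reference is "'+str(ref)+'":')
    print(tab)
    print('')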

Table cells reference is "A9":
         0        1        2
0  myHead1  myHead2  myHead3
1        1        a        .
2        2        b        ,
3        3        c        ]

Table cells reference is "A9":
       0      1
0  Names  Hours
1  Alice      4
2    Bob     20
3  Clark      8
4   Dave     15

Table cells reference is "A9":
         0      1
0    Names  Hours
1   Arnold      4
2     Bill     60
3  Charles     53
4     Dick     10

Table cells reference is "A9":
        0      1
0   Names  Hours
1  Arthur     24
2  Blaire     70
3   Chuck     22
4  Darryl     12



At this point I am constantly resisting the urge to just run away with the coding! I'm trying to enforce a best practice: start with not just an abstract understanding of the problem, but a specific framework to operate in. That means figuring out the exact nature of the inputs I will have before I go nuts building my Tower of Babel! Next is to mock up a way to retrieve data when a worksheet has a single 'table' that is not defined as one in Excel. That is, manually entered data in a tabular layout that, due to legacy sheet formatting, cannot be defined as a native Excel Table, which precludes the table indexing seen in the previous example. My approach assumes a known top-left cell, and relies on knowing that only certain columns will be required.

In [6]:
ws=myWb['Arb_Tbl']
df = pd.DataFrame(ws.values)
df

            0     1          2          3          4          5          6          7          8
0        None  None       None       None       None       None       None       None       None
1        None  None       None       None       None       None       None       None       None
2        Name    id      Attr1      Attr2      Attr3      Attr4      Attr5      Attr6      Attr7
3    Bob Back     0  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5
4   Jeff Jahl     1  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5
5  Hodge Hoss     2  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5
6   Kev Kroll     3  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5
7     Tim Tin     4  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5  =RAND()*5


Above we can see the loose table in the wild... `ws.values` pulls the whole sheet, which would be good, except it grabs the formulas rather than their results. Unfortunately, per the docs, openPyXl will never evaluate formulas! Time to work smart, not hard. I chose to simply use the 'mouse wriggle' technique shared at [this webpage](https://trumpexcel.com/convert-formulas-to-values-excel/) to manually convert formulas to values before passing my workbook into my functions. It *is* sad, though, that the user experience won't be as smooth as just dragging and dropping files.

In [18]:
#Reload new book with static data
myWb=pyxl.load_workbook('../images/Other_Files/TblAsValues.xlsx') 
ws=myWb['Arb_Tbl']
#1st find bottom row with data
btmRow=199 #Fallback in case no empty cell is found within the search range
for i in range(3,200): #Loop up to an arbitrary number; a defined end guards against an infinite loop
    ref="A"+str(i) 
    if ws[ref].internal_value is None: 
        #Condition met when end of data found.
        btmRow=i-1
        break
tab=[[x.internal_value for x in sublist] for sublist in ws['A3:I'+str(btmRow)]]
df_IdNameHours=pd.DataFrame(tab) #Assuming column I is end of useful data
print(df_IdNameHours)

            0   1         2         3         4         5         6         7  \
0        Name  id     Attr1     Attr2     Attr3     Attr4     Attr5     Attr6   
1    Bob Back   0  0.666338  0.778057  4.822291  1.632892  2.374052  2.922761   
2   Jeff Jahl   1  3.865002  4.639129  4.989458  3.552441  0.809309   0.61727   
3  Hodge Hoss   2  3.088418  0.360548  0.416228  1.700045  3.059734  0.443541   
4   Kev Kroll   3  3.466984  3.967881  2.390047   0.40636   4.68963  0.376537   
5     Tim Tin   4  2.910975  1.203692  0.146427  1.411585  0.963408  0.467773   

          8  
0     Attr7  
1   3.91101  
2  4.264042  
3  4.720742  
4  1.797408  
5  0.369525  


Of course, the actual code deployed is more complex than this... One case in particular is converting the table that shows who is trained on what from a human-readable form to a more machine-readable one. The existing process has a table with one row per staff member and one column per job, holding a 1 or 0 depending on whether the person is trained.

In [20]:
ws=myWb['Skills_Matrix']
dataArr=np.array(pd.DataFrame(ws.values)) #Convert data table (skill matrix format) into data table (skills record format)
skills=[] #Initiate new container
for individual in dataArr[1:]: #iterate over all data rows
    for skl in range(1,len(individual)): #iterate over indices not containing the name
        if individual[skl]==1:
            skills.append([individual[0],dataArr[0][skl]])
dataArr=pd.DataFrame(skills)
print(tbl_to_df(ws.tables['Skills_Mtx']))
print('')
print(dataArr)

         0     1       2     3     4       5
0  Column1  Brew  Filter  Pack  Ship  Manage
1   Alfred     1       0     0     1       0
2     Bill     0       1     0     0       1
3    Chris     1       1     1     0       0
4    Dante     1       0     0     0       1
5    Edgar     0       0     1     1       0

         0       1
0   Alfred    Brew
1   Alfred    Ship
2     Bill  Filter
3     Bill  Manage
4    Chris    Brew
5    Chris  Filter
6    Chris    Pack
7    Dante    Brew
8    Dante  Manage
9    Edgar    Pack
10   Edgar    Ship


Now that the data tables are out of Excel, we need a means of filtering/sorting them. Unfortunately, I could not find a good way to do this with the tools already in use (numpy arrays, pandas DataFrames). Fortunately, this meant learning something new! Here I bring in sqlite3, which allows for running a SQL database locally. SQL is Structured Query Language, a language and toolset all about tabular data. It will allow us to do any sort of filter, sort, or view of a table that we could want. The following is a little sample modified from [the docs](https://docs.python.org/2/library/sqlite3.html):

In [5]:
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS stocks (date text, trans text, symbol text, qty, price)''') # Create table.
c.execute('''DELETE FROM stocks''') #if table already existed, will have data... delete existing data to refresh with new.
purchases = [['2006-03-28', 'BUY', 'IBM', 1000, 45.00],
            ['2006-04-05', 'BUY', 'MSFT', 1000, 72.00],
            ['2006-04-06', 'SELL', 'IBM', 500, 53.00],
        ]
c.executemany('INSERT INTO stocks VALUES (?,?,?,?,?)', purchases)
conn.commit()
c.execute('SELECT * FROM stocks ORDER BY price')
listbackTable=c.fetchall()
pd.DataFrame(listbackTable)

            0     1     2       3     4
0  2006-03-28   BUY   IBM  1000.0  45.0
1  2006-04-06  SELL   IBM   500.0  53.0
2  2006-04-05   BUY  MSFT  1000.0  72.0


And we can easily beautify the process by making our own mini API functions, so that the SQL is hidden when reading through the algorithm. This also makes typing these pesky commands a one-off.

In [7]:
def addTBL(tblName,fields=(),data=(),addOn=False):
    """Create table if not already existing, optionally with data, optionally clearing out old data if present. Fields as list of strings"""
    conn = sqlite3.connect('example.db')
    c = conn.cursor()
    listedFields='('+', '.join(fields)+')' #Comma separated field list wrapped in brackets
    c.execute('CREATE TABLE IF NOT EXISTS '+tblName+' '+listedFields) #Create table. Note the space before the table name.
    if not addOn:
        c.execute('DELETE FROM '+tblName) #Clear out old data
    if len(data)>0:
        placeholders=', '.join('?'*len(data[0])) #One '?' per column rather than a hard-coded five
        c.executemany('INSERT INTO '+tblName+' VALUES ('+placeholders+')', data)
    conn.commit()
    conn.close()

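def isNumeric(n):
    """Return True if n can be read as a number"""
    try:
        float(n) #float() accepts everything int() does, so one check covers both
        return True
    except (TypeError, ValueError): #TypeError covers None, ValueError covers non-numeric strings
        return False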

def viewTBL(tblName,fields=None,sortBy=None,filterOn=None):
    """return np array of table with optional select fields, filtered, sorted. Sort syntax=[(field1,asc/desc),(field2,asc/desc)...] Filter syntax=[(field1,value),(field2,value)...]"""
    conn = sqlite3.connect('example.db')
    c = conn.cursor()
    stmnt='SELECT '
    if fields!=None: 
        flds=''
        for f in fields:
            flds=flds+', '+f
        stmnt=stmnt+flds[2:]+ ' FROM ' +tblName+' '
    else: stmnt=stmnt+'* FROM '+tblName+' ' #unspecified, select all
    if filterOn!=None:
        filt='WHERE '
        for f in filterOn:
            if isNumeric(f[1]): filt=filt+f[0]+' = '+ str(f[1])+' AND '
            else: filt=filt+f[0]+" = '"+ f[1]+"' AND " #Single quotes mark a string literal in SQL
        filt=filt[:-4] #Remove naively added final "AND "
        stmnt=stmnt+filt
    if sortBy!=None:
        srt='ORDER BY '
        for s in sortBy:
            srt=srt+s[0]+' '+s[1]+', '
        srt=srt[:-2]
        stmnt=stmnt+srt
    stmnt=stmnt+';'
    #return stmnt
    c.execute(stmnt)
    return np.array(c.fetchall())

In [55]:
viewTBL('stocks')

array([['2006-03-28', 'BUY', 'IBM', '1000.0', '45.0'],
       ['2006-04-05', 'BUY', 'MSFT', '1000.0', '72.0'],
       ['2006-04-06', 'SELL', 'IBM', '500.0', '53.0']], dtype='<U32')

In [54]:
viewTBL('stocks',['symbol','price'],[('price','asc')],[('symbol','IBM')])

array([['IBM', '45.0'],
       ['IBM', '53.0']], dtype='<U32')

In [14]:
viewTBL('stocks',['symbol','price'],[('price','asc')],[('qty',1000)])

array([['IBM', '45.0'],
       ['MSFT', '72.0']], dtype='<U32')

These custom functions should allow us to very easily perform lookups and filters within the Python framework. The key use case is that we will sort employees by hours ascending when going through the priority sequence for voluntary assignment, but sort by seniority descending when retrieving the priority sequence for forcing assignments into vacant slots. Another use case is filtering the training data to identify what someone is trained on. At the time of writing this, I haven't yet figured out whether I will need to make this a truly relational database to make that work. I suspect it won't be necessary, as it may be easier to perform simple lookups, retrieving key values and then plugging them in where needed using plain Python. Time will tell!
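To make the two orderings concrete, here is a small self-contained sketch using an in-memory database. The `staff` table, its columns, and the sample rows are assumptions for illustration only; in particular `seniority_rank` is a hypothetical column where 1 means most senior, so sorting it descending puts the least senior person first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the example
conn.execute(
    "CREATE TABLE staff (eeid INTEGER, name TEXT, hours REAL, seniority_rank INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [(1, "Alice", 4.0, 1), (2, "Bob", 20.0, 3), (3, "Clark", 8.0, 2)],
)

# Voluntary assignment: fewest hours first
voluntary_order = [
    row[0] for row in conn.execute("SELECT name FROM staff ORDER BY hours ASC")
]

# Forced assignment: seniority descending (least senior first under the assumption above)
forcing_order = [
    row[0] for row in conn.execute("SELECT name FROM staff ORDER BY seniority_rank DESC")
]

# Filtering with a '?' placeholder avoids building the WHERE clause by string concatenation
bob = conn.execute("SELECT name FROM staff WHERE eeid = ?", (2,)).fetchone()

print(voluntary_order, forcing_order, bob)
```

The `?` placeholder style is the same one already used for the `INSERT` statements above, and it lets sqlite3 handle quoting of strings and numbers.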

...And a couple of weeks later, I return to this section of the post, battle-scarred and weary... I set myself a trap here, and hopefully one I do not soon forget. I cast the output of the above functions to np.array() for *no particular reason*. This brings a *reasonable* idea to mind: every character in a line of code should have a *reason* for being there. This little maneuver (which I clearly recall came from a sense of satisfaction in being spiffy and referencing a large library) cost me about 8 hours of troubleshooting much later in the process. What confounded me most was that the issue didn't present itself until I started using test data sources with entries in all fields. What happened is that a numpy array prefers to be homogeneous (storing all members as the same data type) whenever it can:

In [6]:
contents1=1,2,3
contents2=1,2,'3'
contents3=1,'2',None
l1=list(contents1)
l2=list(contents2)
l3=list(contents3)
ar1=np.array(contents1)
ar2=np.array(contents2)
ar3=np.array(contents3)
l1,l2,l3,ar1,ar2,ar3

([1, 2, 3],
 [1, 2, '3'],
 [1, '2', None],
 array([1, 2, 3]),
 array(['1', '2', '3'], dtype='<U11'),
 array([1, '2', None], dtype=object))

As can be seen above, a simple list will just leave its contents be, presenting them as they came in. A numpy array behaves differently... when a None value is present, the other values remain as they are. However, with even 1 string value among numbers, all the numbers are converted to strings. This is evident in the example query outputs above: note the quote marks around the price values. I didn't go so far as to reference those values for further computation when I initially tested these custom functions, and so I had a false sense of safety in using them. And since I had a data field empty throughout my testing, this lured me into a yet stronger and more false sense of security.

Finally, I reached the point where I filled in my data, and what broke was the 'filterOn' part of the querying: I was trying to filter for the employee ID as a numeric value, but it was showing up as a string. It was a real doozy to trace back, because I jumped to assuming the problem was inherent to SQLite3 for Python. What a fool I was. Classic case of troubleshooting best practice: only open as many black boxes as you need to. In this case I assumed the problem lay deeper in the weeds than it really did, and I wasted a lot of time in those weeds. Only when I went through all the trouble of mocking up little fake databases and entering a few lines did I realize that sqlite itself wasn't the problem... this one taught me via pain. If you want a list: don't use a numpy array!
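The lesson can be reproduced without numpy at all. The sketch below (with a hypothetical `staff` table and made-up rows) shows that `fetchall()` already returns plain Python values with their types intact, and that a numeric filter silently matches nothing once every value has been turned into a string, which is what a string-typed array does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (eeid INTEGER, name TEXT, hours REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(101, "Alice", 4.0), (102, "Bob", 20.0)],
)

# fetchall() returns a list of tuples holding int / str / float values
rows = conn.execute("SELECT eeid, name, hours FROM staff").fetchall()
value_types = [type(value).__name__ for value in rows[0]]

# Simulate what happens when every value is coerced to a string
stringified = [[str(value) for value in row] for row in rows]

# The same numeric comparison works on the native rows and fails quietly on the strings
matches_native = [row for row in rows if row[0] == 102]
matches_stringified = [row for row in stringified if row[0] == 102]

print(value_types, matches_native, matches_stringified)
```

No error is raised in the second case; the filter simply comes back empty, which is why this kind of bug is slow to trace.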

### Algorithm Structure


The preceding section is all about just getting the data we need into our hands in a readily manipulable format. Now the fun begins! 

The following lays out the process to be carried out, in progressively greater detail... a fun exercise in breaking a larger problem into sub-problems until a programmable level is reached:
1. Assign individual to a timeslot
1. Repeat until all slots filled

That's accurate, but useless for actually making anything happen... I've been thinking about the problem for a while, and can list off the design questions and answers I reached pre-development.

#### Preliminary Brainstorm
 - Which timeslot/job combination should be assigned when multiple are available?
     - Based on [this video](https://www.youtube.com/watch?v=d1KyYyLmGpA), I learned that for the best algorithm performance on this kind of problem, you want to perform a depth-first search where you make assignments to the most constrained variables first. From the perspective of assigning timeslots to staff, the sequence in which staff are assigned slots is not optional under the Collective Agreement. With a change in perspective, however, one can see this as a problem of assigning staff to timeslots. From this perspective, a single staff person has a subset of timeslots for which they are eligible to work. Within that subset, the heuristic of assigning to the most constrained slot can be applied. My idea is to keep track of how many eligible people there are for each job/timeslot combination in the voluntary or forced category, and to always assign people to the slot they are eligible for that has the fewest people available. I do have some concerns about whether it will work without the further consideration described in the video, namely the process of removing from consideration any assignment that would leave no assignment available for downstream decisions. I will forgo further worrying about this for the time being, since I think my problem is sufficiently different from the one explored in the video: here, it is acceptable and possible that a staff person with priority over another not be assigned, based on their voluntary hours and/or training.
 - How can eligibility criteria for timeslots be made flexible so that the program is useful even after a change to the collective agreement (CBA)?
     - My solution to the problem of varied esoteric constraints that are difficult to define, and for which the supplementary data needed to test them is not available, is to ignore these constraints in the algorithm's decision making, and instead let the user address them by passing in a list of assertion statements that enforce some assignments or disallow others.
     - My solution to the problem of decision making criteria needing to be transparent and easily modifiable is to have the eligibility criteria for assignments passed as an input to the system in a file (i.e. a Python module whose functions determine assignment eligibility).
 - How can the time slots be represented if every weekend has different jobs to fill?
     - Again, the idea is to use a file that will be the same most of the time to provide the time slots to be filled as an input to the program
 - How can the algorithm be proofread by a human user to ensure results are valid?
     - It is a distributed process. Each staff person is responsible for communicating if they have been assigned to a slot that they are not eligible for. If the scheduling team receives this feedback, or if in schedule review they identify an assignment that seems contrary to policy, the program should be able to provide a step-by-step list of the assignments that were made, and to visualize a partially completed schedule at an arbitrarily selected step. This gives the human scheduler the information required to judge whether a generated schedule was valid, or whether a new assertion should be added to the input to prevent an invalid assignment from being made.
         - Assertion Types:
            - Do Not Staff: DNS( slot )
            - Assign: A( eeid, slot, job, Type)
                - Type can be WWF, Voluntary, or Forced
            - Disallow Assignment: N( eeid, slot, optional job )
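
The three assertion types listed above could be represented along these lines. This is only a sketch under assumptions: the class names, the `kind` field, the slot labels, and the `is_allowed` helper are all hypothetical, chosen to mirror DNS( slot ), A( eeid, slot, job, Type ) and N( eeid, slot, optional job ).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DoNotStaff:            # DNS( slot )
    slot: str

@dataclass(frozen=True)
class Assign:                # A( eeid, slot, job, Type )
    eeid: int
    slot: str
    job: str
    kind: str                # "WWF", "Voluntary" or "Forced"

@dataclass(frozen=True)
class Disallow:              # N( eeid, slot, optional job )
    eeid: int
    slot: str
    job: Optional[str] = None  # None means "any job in this slot"

def is_allowed(assertions, eeid, slot, job):
    """Return False if any assertion blocks this employee/slot/job combination."""
    for a in assertions:
        if isinstance(a, DoNotStaff) and a.slot == slot:
            return False
        if isinstance(a, Disallow) and a.eeid == eeid and a.slot == slot and a.job in (None, job):
            return False
    return True

def prescribed(assertions):
    """Assignments that must be made before the algorithm starts."""
    return [a for a in assertions if isinstance(a, Assign)]

assertions = [
    DoNotStaff("Sun-Night"),
    Assign(101, "Sat-Day", "Brew", "WWF"),
    Disallow(102, "Sat-Day", "Pack"),
]
print(is_allowed(assertions, 102, "Sat-Day", "Pack"), prescribed(assertions))
```

Keeping the assertions as plain data means the scheduler's corrections can live in an input file and be re-applied on every run.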

#### Back Tracking

While in an ideal world, back tracking could be used so as not to recompute an entire schedule after a new assertion is made, the details of implementation would be very complex, and my initial assumption is that the reduction in computation time and energy would likely not be worth it, so we'll forgo that for now.

*Sneak peek*: I'm coming back to this statement days later, after programming it and thinking about it a fair deal. My view now is that backtracking may in fact be entirely necessary to construct the optimal schedule (i.e. the one with the least forcing, where the maximal number of people get voluntary overtime). The version 1 that I'm constructing is essentially a depth-first search, without defined decision criteria to facilitate backtracking.

Defining how to backtrack is not a simple matter. In a sequence of many nodes, how does one select the origin node at which to make a different decision? And once a node is chosen, how does one choose which alternative decision to make, generating the next node? My initial thought on the former question is that, when a given position is identified as requiring forcing, one should look back to the first decision where someone was assigned elsewhere when they could have been assigned to that slot, make that alternative assignment, and then see what bears out using the same algorithm from there. Of course, one issue with this is that the end result may be less optimal, and you really wouldn't know until you compute it all out. Does it reliably get worse, better, or fluctuate between the two if this recursion continues? Or, instead of recursing, do you return to your original backtrack point and select a different decision node to try a different path from?

As one can see, there are many decisions to make when designing a backtracking algorithm for which an optimal choice isn't obvious at all. For that reason, I'm going to stick to V1 without backtracking, for the sake of getting a working prototype up and running instead of spending my whole life on the thought treadmill. If I'm lucky, it works well enough that the human reviewer need only make a couple of changes.

The funny thing about this use case is that, from the optimization purist's perspective, this algorithm design is a nasty, horrible mess. From the realist's side, good enough may very well be within reach! Time to let go of my purism.

#### V1
With these ideas in mind, the algorithm becomes:
 - Review assertion list and make all prescribed assignments (such as dedicated weekend staff)
 - Iterate through staff in sequence defined by CBA
     - Apply eligibility constraint functions to timeslots to generate subset of eligible slots. If none, go to next staff person. If one, assign. If multiple, evaluate to see which is most constrained and assign to that.
         - Default constraint functions, applied in sequence of most to least constraining:
            1. trained on job
            1. volunteered for slot
            1. minimum 8 hours off between shifts
            1. <60 total hours worked in week
                -  On a long weekend, assuming 32 hours worked going into weekend
            1. max 12 total hours worked consecutively
 - If slots remain after all staff are iterated through, then:
     - Iterate through all staff from least to most senior, applying modified constraints to determine who must be forced:
        1. trained on job
        1. less than 48 hours worked in the week
        1. less than 8 hours forced
        1. minimum 12 hours off between shifts
        1. <60 total hours worked in week
        1. max 12 total hours worked consecutively

If in the end there is no one available for forcing, then the scheduling team would most likely decide to change which slots are being assigned, moving the gap to a different role for which a gap can be sustained in production. That would be another item to add to the assertion list.
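
The voluntary pass of V1 can be sketched as follows. This is a simplified illustration, not the deployed code: it assumes each person takes at most one slot, uses only two stand-in constraint functions (`trained`, `volunteered`), and the dictionary layout of staff and slots is hypothetical.

```python
def trained(person, slot):
    return slot["job"] in person["skills"]

def volunteered(person, slot):
    return slot["time"] in person["available"]

# Hypothetical stand-ins for the default constraint list, most constraining first
CONSTRAINTS = [trained, volunteered]

def eligible(person, slot, constraints=CONSTRAINTS):
    return all(check(person, slot) for check in constraints)

def voluntary_pass(staff, slots, constraints=CONSTRAINTS):
    """Walk staff in priority order; give each the most constrained slot they are eligible for."""
    schedule = {}  # slot id -> person name
    log = []       # sequence of assignments, for later review
    for person in staff:  # staff is assumed to be sorted in CBA priority order already
        open_slots = [
            s for s in slots
            if s["id"] not in schedule and eligible(person, s, constraints)
        ]
        if not open_slots:
            continue  # nothing available, go to next staff person

        def candidate_count(slot):
            # How many still-unassigned people could take this slot
            return sum(
                eligible(p, slot, constraints)
                for p in staff
                if p["name"] not in schedule.values()
            )

        choice = min(open_slots, key=candidate_count)  # fewest candidates = most constrained
        schedule[choice["id"]] = person["name"]
        log.append((person["name"], choice["id"]))
    return schedule, log

staff = [
    {"name": "Alice", "skills": {"Brew", "Pack"}, "available": {"Sat-Day", "Sat-Eve"}},
    {"name": "Bob", "skills": {"Brew"}, "available": {"Sat-Day"}},
]
slots = [
    {"id": "Sat-Day/Brew", "time": "Sat-Day", "job": "Brew"},
    {"id": "Sat-Eve/Pack", "time": "Sat-Eve", "job": "Pack"},
]
schedule, log = voluntary_pass(staff, slots)
print(schedule)
print(log)
```

In this toy case Alice could take either slot, but only she can cover Sat-Eve/Pack, so she is placed there and Bob still gets Sat-Day/Brew. The forcing pass would follow the same shape with the modified constraint list and the reversed seniority order.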

#### Focusing on Simple
While the above algorithm is what the collective agreement prescribes, the rules also state that a given employee has priority selection over another according to which shift they were staffed on in the week prior. Carrying out the above algorithm would result in lots of folks being assigned to a shift they volunteered for but don't have first dibs on; for example, the most common preference is day shift, but only 1/3 of staff are on day shift at any given time. Following the algorithm as initially defined would then lead to significant computational inefficiency, and to more complicated programming: a cascading/recursive bump management script. If someone were assigned to a shift they don't have priority selection for and it turns out another person has rights to it, the former would be removed. Since their already having been assigned implies they had greater priority to receive any assignment in general, another slot would have to be sought for them, including slots taken by someone with lesser priority... and so on.

For this reason the actual algorithm is modified to eliminate the need for backtracking (this is analogous to the process carried out manually at this time): instead of iterating through staff in the sequence determined by the CBA (# of hours), create separate lists of staff, split by crew and employment type. Carry out the above simple algorithm on each of these subsets, with the correspondingly reduced set of available slots each time.

Summarized:
- Carry out above algorithm with on-shift full-timers, for each shift (C/A/B) (in existing data input format, probationaries are included in FT list)
- Carry out above algorithm with off-shift full-timers, for each shift (C/A/B) (following the C->A->B->C->A priority selection format)
- Carry out above algorithm with on-shift Temp staff, for each shift (C/A/B)
- Carry out above algorithm with off-shift Temp staff, for each shift (C/A/B)

To conclude, the outermost loop is across the different staff groups, and the inner loop is across shifts. Once again, this is done so that there should never be a situation where an individual is bumped out of their slot by another later in the assignment process, simplifying the program implementation overall.
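
The loop nesting described above amounts to the following ordering of passes. The group labels and the `run_order` helper are hypothetical names for illustration; each pass would run the simple algorithm on that subset of staff and slots.

```python
from itertools import product

# Outer loop: staff groups, in the order given in the summary above
STAFF_GROUPS = [
    ("full-time", "on-shift"),
    ("full-time", "off-shift"),
    ("temp", "on-shift"),
    ("temp", "off-shift"),
]
# Inner loop: shifts
SHIFTS = ["C", "A", "B"]

def run_order(groups=STAFF_GROUPS, shifts=SHIFTS):
    """List every (group, shift) pass, with groups as the outer loop and shifts as the inner."""
    return list(product(groups, shifts))

passes = run_order()
for group, shift in passes:
    print(group, shift)
```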

#### Exploring Problems
A problem comes to mind. Postulate: the 'most constrained first' assignment heuristic, as defined, could generate a schedule where someone is passed over for a slot because of the minimum shift gap constraint, when a different valid assignment earlier would have created the possibility of a longer contiguous shift, which is in fact what should happen.

Example:
Worker A is interested in working an 8 hour shift between 7a and 7p and their priority is daytime (7a-3p). Worker B is interested in working 4 hours between 7a and 3p and their priority is daytime (7a-3p). Due to the 'most-constrained first' assignment heuristic, the higher priority individual B may be assigned the 11a-3p slot, leaving A to be assigned either the 7a-11a or the 3p-7p slot. The problem is that if B were assigned the first slot of the day, the following two slots could both be covered by A, removing the need to force anyone into the 4 hour gap left by the former arrangement.

The question is whether or not the postulate/thought experiment bears out. Further thoughts follow. In this situation, forcing being required would imply that no one else was available to fill the slots. Looking at the most-constrained heuristic in greater detail, this would mean that when B is assigned, slots 1 and 2 are tied at 2 potential assignees (A or B), whereas slot 3 is most constrained, with 1 potential assignee, A. I could leave this scenario under the umbrella of 'manual review + assignments', but my gut tells me it is instead a problem of providing decision criteria for when there is a tie between slots in the 'most constrained' heuristic. Proceeding with this, it seems obvious to me from the thought experiment that the criterion should be a comparison of the number of potential assignees for the slots neighbouring the one being considered.

The challenge in circumventing this problem is creating a decision criterion where the cure isn't worse than the disease in terms of code implementation... The central challenge of the entire scheduling problem looms large in this small decision case: the state that assignment variables will take later in the process can't be known except by carrying out the whole process to get there. The idea this leads me to is to check for a shift-splitting situation as described, which can be done as follows:
1. Remove the worker whose assignment is being made from the pool of eligible assignees.
1. Observe whether there are any sets of 2 or 3 contiguous slots with only one, and the same, worker eligible.
1. Remove those slots from the pool of eligible assignments for the worker whose assignment is being made.
1. Assign from the remaining slot pool according to the most-constrained criterion.
1. If >0 slots are available but they are insufficient to complete the person's voluntary shift, return the removed slots to the eligible pool and connect them.
1. If 0 slots are available once the others were removed, return them to the eligible pool but assign only slots from the edge of the group.

A slot is flagged as a shift splitting assignment if its neighbouring slots are unassigned and each have only one and the same potential assignee after removing the assignee in question from the pool. Applied to the same thought experiment, slot one would not be identified as a shift splitting assignment since the previous slot would be assigned. Slot two would be flagged as a shift splitting assignment since slots 1 and 3 each have only A as the eligible assignee after B is assigned to slot 2. Omitting the shift splitting assignment from the pool leaves only slot 1 to be assigned to B. Bear in mind that this assumes... If both available slots meet this criterion, then the fact of which slot comes first can arbitrarily be used to break the tie... This is because when it comes down to brass tacks, individual B is entitled to their OT selection in that scenario even if it forces that shift split and leads to someone being forced, or a gap. The problem of a gap can be addressed outside the context of the program.
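To make the check concrete, here is a minimal sketch of the shift-splitting test applied to the thought experiment. Everything in it is an assumption for illustration: slots are numbered 0, 1, 2 in chronological order, `eligible` maps each slot to the set of willing workers, and the names `splits_shift` and `pick_slot` are hypothetical, not functions from the real program.

```python
def splits_shift(slot, worker, eligible, assigned):
    """True if giving `slot` to `worker` leaves both neighbouring slots
    open with one and the same remaining candidate."""
    neighbours = [s for s in (slot - 1, slot + 1) if s in eligible]
    if len(neighbours) < 2:
        return False  # a slot at the edge of the day cannot split a shift
    remaining = []
    for s in neighbours:
        if assigned.get(s) is not None:
            return False  # a neighbour is already filled, nothing to split
        remaining.append(eligible[s] - {worker})
    return all(len(r) == 1 for r in remaining) and remaining[0] == remaining[1]


def pick_slot(worker, eligible, assigned):
    """Most constrained open slot for `worker`, avoiding shift splits if possible."""
    open_slots = [s for s in sorted(eligible)
                  if assigned.get(s) is None and worker in eligible[s]]
    safe = [s for s in open_slots
            if not splits_shift(s, worker, eligible, assigned)]
    pool = safe or open_slots  # the worker is still entitled to a slot if none is safe
    if not pool:
        return None
    # Fewest candidates first; ties go to the first slot chronologically
    return min(pool, key=lambda s: (len(eligible[s]), s))


# Thought experiment: slot 0 = 7a-11a, slot 1 = 11a-3p, slot 2 = 3p-7p
eligible = {0: {"A", "B"}, 1: {"A", "B"}, 2: {"A"}}
chosen = pick_slot("B", eligible, assigned={})
print(chosen)  # 0, leaving slots 1 and 2 as a contiguous 8 hours for A
```

The fallback to `open_slots` reflects the point that B keeps their selection even when every option splits a shift.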

The former problem posed by someone selecting a small 4 hour block also brings to mind the other problem of people volunteering for 12 hour blocks. The challenge is that the algorithm is defined as looking first at each shift (8 hour blocks), but staff are eligible for 12 hour blocks across shifts, or 8 hour blocks straddling shifts. 

I'm at the point in this thought experiment now where I think that trying to implement a problem specific solution here has too great a risk of introducing unintended consequences that bring failure. For the sake of time and simplicity I'll proceed with a version 1 that leaves the resolution of these issues in the hands of the user via the forced assignments function.

### Data Structure

To make the program easy to maintain, debug, and code, the data structure of classes/objects/attributes and their relationships should be carefully constructed to facilitate the intended actions. My goal is to have a rigorously modular/generalized system, where almost every process in the final algorithm is a method on an object. This will make the code readable, flexible, and easier to debug in development and deployment. With the above algorithm in mind, I made a dummy script to let intuition guide me as to which classes/attributes/methods would be required:

In [None]:
CollectData():
    pullTables()  # Per code in above sections
    configData()  # Define timeslots objects, worker objects, collections of workers per shift/type
Schd=Sched(date)
Schd.preFill() # Iterate through & enact prescribed assignments
numLoop=0
while Schd.openSlots.count>0: #Because forcing can open someone up for voluntary, need to make this looping capacity. Tracker bit prevents inf loop
    numLoop+=1
    Schd.VolunteerFill():
        for eeTypePool in (onShiftFT,offShiftFT,onShiftTemp,offShiftTemp):
            for shift in shiftSet: #shiftSet is built based on what days selected to schedule. Always seq last to first.
                namePool=poolPicker(shift,eeTypePool)
                for person in namePool: #Idea: Define a generator function to yield the next person, across ee categories
                    slotPool=filterSlots: #per sequence above, evaluate each criteria in sequence and remove slot from pool if fails any criteria
                        isTrained, Volunteered, shiftGapOK, wklyTotOK, maxShiftLenOK
                    if poolEmpty: next person
                    s=pickSlot #Most constrained slot (if *only* person avail for off-shift, assign there. If only person for multiple, assign first chronological. If candidates>1, take most constrained on-shift slot. If tied, take first chronological) 
                    assignSlot(s,type=voluntary) #Perform necessary functions i.e. removing tally of op from no-longer compatible shifts.
    Schd.forceFill():
        for slot in Schd.unassigned.chronologicalSeq:
            assignee=lowMan(slot):
                        filter all ee for training, sort seniority low-hi, check if already worked 8+ hours, check if already forced 8 hours
            #if no assignee, flag slot as 'no staff'
            result=unassignAsNecessary(assignee,slot)
                        check if the forcing would conflict with other constraint (inter-shift gap, weekly total hrs, shiftduration, etc)
                        if the conflicting slot requiring unassignment is from the assertion list, then return an error flag for printout
            assignSlot(slot,type=forced)
    if numLoop>5: break
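The dummy script above is pseudocode. As a rough, runnable skeleton of the same control flow (volunteer fill, then force fill, with a loop cap), something like the following could serve. It is a sketch under loud assumptions: the real constraint checks (training, shift gap, weekly totals) are replaced by a single stand-in rule that each person takes at most one slot, and the names `Slot`, `open_slots`, `busy` and `build` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    start: int                       # start hour, used for chronological order
    volunteers: list                 # names in priority-pick order (assumed input)
    assignee: Optional[str] = None
    kind: Optional[str] = None       # "voluntary", "forced" or "no staff"


class Sched:
    def __init__(self, slots, seniority):
        self.slots = slots
        self.seniority = seniority   # names, lowest seniority first

    def open_slots(self):
        return [s for s in self.slots if s.assignee is None]

    def busy(self):
        # Stand-in for the real constraints: one slot per person
        return {s.assignee for s in self.slots if s.assignee}

    def volunteer_fill(self):
        # Most constrained slot first; ties go to the earlier slot
        order = sorted(self.open_slots(), key=lambda s: (len(s.volunteers), s.start))
        for slot in order:
            for person in slot.volunteers:
                if person not in self.busy():
                    slot.assignee, slot.kind = person, "voluntary"
                    break

    def force_fill(self):
        for slot in sorted(self.open_slots(), key=lambda s: s.start):
            low_man = next((p for p in self.seniority if p not in self.busy()), None)
            if low_man is None:
                slot.kind = "no staff"   # flag it and leave it open
            else:
                slot.assignee, slot.kind = low_man, "forced"


def build(sched, max_loops=5):
    loops = 0
    while sched.open_slots() and loops < max_loops:  # cap prevents an infinite loop
        loops += 1
        sched.volunteer_fill()
        sched.force_fill()
    return loops


slots = [Slot(7, ["A", "B"]), Slot(11, ["A"]), Slot(15, [])]
sched = Sched(slots, seniority=["C", "A", "B"])
loops = build(sched)
print([(s.start, s.assignee, s.kind) for s in slots])
```

In this toy case A goes to the 11 o'clock slot (only volunteer), B takes the 7 o'clock slot, and C is forced into the slot nobody volunteered for.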

### Optimal vs Proven?

The algorithm explored above is my attempt at making explicit the process carried out by the human schedulers at present. This method hinges on scheduling one shift at a time, starting with the staff who have priority selection for that shift, then proceeding down to those with less priority. The thing is, I *know* I want my first version not to support solution tree exploration (backtracking and testing of different assignments). And trying to emulate that human method is sticky as hell because it really does demand that at each stage a check be made so that, even if someone had indicated willingness for a slot they had priority for, they *need* to be assigned to another slot if they are the only willing assignee for it... Incorporating this pre-check just feels so janky to me in the scheme as it is presented above. So I'm thinking that's because it's the wrong way to go about it. I'm going to make the assignment algorithm work opposite to how schedulers do it, by assigning staff to slots in the sequence of which slot is most constrained (respecting staff priority assignment sequence), as opposed to assigning slots to staff in the sequence of who has priority pick. And I think this will work just fine because the means by which staff will be selected for assignment to a slot will follow their priority sequence. As postulated previously, it will take implementing and testing to prove that this will work and in fact be better than the alternative w.r.t. minimizing overall forcing. But I wouldn't be surprised if, in the grand scheme, this leads to schedules that are qualitatively different from ones a human might make (for example, having a single person scheduled on different jobs across their shift when they could've stayed on the same one because they were interchangeable with the person on the other slot).
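A toy comparison may help show why the two orderings can differ. This is only an illustrative sketch: `staff_first` and `slot_first` are hypothetical names, `wants` is made-up input, and each person is assumed to take at most one slot.

```python
def staff_first(priority, wants):
    """Each person, in priority order, takes the first open slot they listed."""
    assigned = {}
    for person in priority:
        for slot in wants[person]:
            if slot not in assigned:
                assigned[slot] = person
                break
    return assigned


def slot_first(priority, wants, slots):
    """Fill the slot with the fewest remaining candidates first;
    within a slot, the highest-priority candidate wins."""
    assigned, busy = {}, set()

    def candidates(slot):
        return [p for p in priority if p not in busy and slot in wants[p]]

    open_slots = list(slots)
    while open_slots:
        # Stable sort: slots tied on candidate count keep chronological order
        open_slots.sort(key=lambda s: len(candidates(s)))
        slot = open_slots.pop(0)
        cands = candidates(slot)
        if cands:
            assigned[slot] = cands[0]
            busy.add(cands[0])
    return assigned


priority = ["A", "B"]                      # A picks before B
wants = {"A": ["s1", "s2"], "B": ["s1"]}   # only A is willing to take s2
print(staff_first(priority, wants))            # {'s1': 'A'}, s2 would need a forcing
print(slot_first(priority, wants, ["s1", "s2"]))  # {'s2': 'A', 's1': 'B'}
```

With staff-first ordering A takes `s1` and `s2` is left without a volunteer; with slot-first ordering both slots are filled voluntarily.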

#### Adjustment Required

I got a draft algorithm running smoothly but then hit a hitch when it came to logic... 

My process was as follows..
 - Start by forcing to all slots with no volunteers
 - Then fill all slots via voluntary takers
 - Finally, force again for all slots that might've been left open

This worked but the problem was as follows... individuals can only be forced up to a maximum number of hours worked in the week. This meant that, although an individual might be in line to be forced on a weekend shift, if they elected to work shifts taking place earlier in the same weekend, that can put them at their weekly limit and make them ineligible for forcing. It therefore becomes necessary to modify the process to account for this, so as not to have someone forced into a shift in Phase 1 when they had volunteered themselves out of it, a fact only determined in Phase 2.

My first idea was to sweep through the schedule, assigning to all jobs in chronological order, one timeslot at a time. The issue with this is that if I reduce the scope of 'most constrained slot' to a single concurrent slot, then this opens up the problem again of assigning someone to a slot which has volunteers to spare when that same person is the only volunteer for a different slot and so should certainly be assigned there. My idea to resolve this is as follows:
 - First, evaluate which slots have only 1 eligible volunteer, and mark those people as such. 
 - Second, sweep through the schedule, but instead of considering only a single time slot at one time, consider all time slots occurring prior to the first slot requiring a forcing for lack of volunteers. Assign all these slots to volunteers in sequence of most constrained 
   -  If making one such voluntary assignment and removing the individual from eligible volunteers to another slot leaves another slot with only one eligible volunteer, mark that person
   -  When a marked person being assigned to some voluntary slot according to the usual preferential selection sequence would result in them being unavailable for a slot they were marked for, then assign the next person in sequence, saving that individual for the marked slot
 - Evaluate forced slots as they are reached in chronological order of shift time
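The marking idea in the list above can be sketched roughly as follows. This is a simplification with stated assumptions: any assignment is taken to make a person unavailable for the slot they were marked for, and `mark_sole_volunteers` and `choose` are hypothetical helper names.

```python
def mark_sole_volunteers(eligible):
    """Map each slot that has exactly one eligible volunteer to that person."""
    return {slot: next(iter(people))
            for slot, people in eligible.items() if len(people) == 1}


def choose(slot, priority, eligible, marks):
    """Pick the highest-priority volunteer who is not being saved for another slot."""
    reserved = {person for s, person in marks.items() if s != slot}
    cands = [p for p in priority if p in eligible[slot]]
    for person in cands:
        if person not in reserved:
            return person
    # Everyone willing is reserved elsewhere: fall back to the usual sequence
    return cands[0] if cands else None


eligible = {"s1": {"A", "B"}, "s2": {"A"}}
marks = mark_sole_volunteers(eligible)
print(marks)                                         # {'s2': 'A'}
print(choose("s1", ["A", "B"], eligible, marks))     # B, A is saved for s2
print(choose("s2", ["A", "B"], eligible, marks))     # A
```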

Unfortunately this introduces extra challenges but I feel it is necessary to eliminate the possibility of the algorithm forcing someone into a slot that they shouldn't be forced into... more functions will have to be made to evaluate some of these unique circumstances which must now be considered. Necessity is the mother of invention.

### Adjusting Adjustments - Sept 12 /22

A different possibility occurred to me: proceed from the get go assigning slots, sequenced by most constrained. This means starting off by forcing slots without volunteers, in chronological order. At each assignment, properly update all slots (decrement eligible volunteers) which can no longer be assigned the worker as a result of the new constraint made by the assignment just entered. Here is the caveat: when a slot is decremented to 0 eligible volunteers, take that as evidence that the assignment strategy results in a forcing and remake the schedule from scratch, except this next time around, add the slot that needed forcing to the initial forcing phase.

I think this would work better than the previous idea... The reason is that, at every step of the way, the most constrained slot is still being assigned, avoiding the issue that could be faced if moving through chronologically with one's horizon limited by forced slots, which could result in someone being assigned to a populated slot when they were the only taker somewhere else later on.

This strategy would bypass the issue of "When I go back in my search tree, to which node do I go, and how do I select an alternative edge to search from that node?" by flipping the paradigm. It demands scrapping the old exploration tree, in favour of a new seed, one generated with the knowledge that proceeding with assignments per most constrained criteria would result in a forced slot... by forcing that slot in the beginning this time around, you ensure you haven't created the problem of assigning that person voluntary overtime later when they wouldn't be eligible for it due to being forced earlier. Arguably more, arguably less computation work to start from scratch, but either way, it ensures the problem of 'how to backtrack' is avoided. I do think I will try to implement this for sure as it should also be much easier to code, I think. Aside from copying the seed before starting and applying a loop, all I need to do is to check all slots someone is trained on against the already defined slot eligibility checker function to facilitate the continuous updating/removing of an individual from the tally of eligible volunteers for a given slot.

There is a loose string here, though. What is the decision criterion not to loop back to the original seed? I mean, the need to force is encountered, so we return to square one, and force from the get go. Assuming we have someone to force in, then that slot does not appear for forcing again. But another slot appears. So we start from scratch again. When does this end? The logical answer to me seems to be that a log is kept of which slots were encountered in this way, and that the process only starts from scratch the first time a given slot finds itself in need of a forcing as a result of voluntary assignments taking away the last voluntary overtime taker. The search down the exploration tree should go deeper and deeper each time around until finally all slots are assigned (or fail to be assigned via forcing for lack of staff). And this means that *all* forcings would be made first, before any volunteer scheduling. But this poses a problem again that I forgot about in my rush on this... how then to know if someone can be forced if you haven't yet determined if they elected to work prior to that shift? Again I feel bamboozled by this issue of not being able to foresee the future of an algorithm. I worry I may be fighting against an insurmountable wall here, like the Game Of Life in a different form... For now I think I will set this problem aside and proceed with this idea as is. Usually, very few people get forced, and so by shoving the potential algorithm error into the issue of 'they could've volunteered prior to their forcing', I believe it in fact becomes easier to evaluate for algorithmic error. Instead of an error being who knows where and more difficult to find, I think this method would make it so that, if a person was forced and was indeed meant to be forced, then you can be sure there are no more errors, since from there on the schedule was all built by assigning the most constrained slots in sequence.
And if it was an erroneous forcing in the sense that the person should've gotten voluntary time prior to it, then that can be added to the assignment list and the program re-run.
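The restart-from-seed idea, including the log that guarantees it ends, can be sketched as below. This is a toy under explicit assumptions, not the real implementation: each person takes at most one slot, a forcing is represented by the placeholder string `"FORCED"` rather than a real lowest-seniority lookup, and `attempt` and `build` are hypothetical names.

```python
FORCED = "FORCED"  # placeholder for whoever would actually be forced in


def attempt(slots, eligible, priority, priority_slots):
    """One pass from the seed. Returns (assignments, newly_emptied_slots)."""
    pool = {s: set(eligible[s]) for s in slots}      # fresh copy of the seed
    assigned = {}
    first = sorted(priority_slots, key=slots.index)  # priority phase, chronological
    while len(assigned) < len(slots):
        todo = [s for s in first if s not in assigned]
        if todo:
            slot = todo[0]
        else:
            rest = [s for s in slots if s not in assigned]
            # Most constrained first; ties go to the earlier slot
            slot = min(rest, key=lambda s: (len(pool[s]), slots.index(s)))
        cands = [p for p in priority if p in pool[slot]]
        person = cands[0] if cands else FORCED
        assigned[slot] = person
        emptied = []
        for other in slots:
            if other in assigned or person not in pool[other]:
                continue
            pool[other].discard(person)              # decrement eligible volunteers
            if not pool[other] and other not in priority_slots:
                emptied.append(other)                # first time this slot runs dry
        if emptied:
            return None, emptied                     # trigger a restart
    return assigned, []


def build(slots, eligible, priority, max_restarts=25):
    # The priority list doubles as the log of slots that already caused a restart
    priority_slots = [s for s in slots if not eligible[s]]
    for restarts in range(max_restarts + 1):
        assigned, emptied = attempt(slots, eligible, priority, priority_slots)
        if assigned is not None:
            return assigned, restarts
        priority_slots.extend(emptied)
    raise RuntimeError("restart cap reached")


result = build(["s1", "s2"], {"s1": {"A"}, "s2": {"A"}}, priority=["A"])
print(result)  # ({'s1': 'A', 's2': 'FORCED'}, 2)
```

Because a slot can only trigger a restart while it is absent from the priority list, each restart adds at least one new slot, so the number of restarts is bounded by the number of slots.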

I had plenty of time to think of this while making the first version of the fractional shift indicator code. That's the part that adds 1/2, or 1/3, 2/3 etc. beside people's names on their slots to help identify full shifts. That was a much bigger pain than I thought it would be, unfortunately. And just when I thought I had it working great, I tested a different use case and it's completely knackered... big sigh.
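For illustration, a fractional indicator of this kind could be computed along the following lines. This is a simplified sketch and not the actual script: it assumes a single chronological row of 4 hour slots for one job, with `None` marking an open slot, and `fraction_labels` is a hypothetical name.

```python
def fraction_labels(assignments):
    """Label each run of consecutive slots held by the same person as 1/n, 2/n, ..."""
    labels = [None] * len(assignments)
    i = 0
    while i < len(assignments):
        j = i
        # Extend j to the end of the run of identical names starting at i
        while j + 1 < len(assignments) and assignments[j + 1] == assignments[i]:
            j += 1
        run = j - i + 1
        name = assignments[i]
        if name is not None:
            for k in range(run):
                # A lone slot keeps the bare name; longer runs get a fraction
                labels[i + k] = name if run == 1 else f"{name} {k + 1}/{run}"
        i = j + 1
    return labels


row = ["A", "A", "B", None, "C", "C", "C"]
print(fraction_labels(row))
# ['A 1/2', 'A 2/2', 'B', None, 'C 1/3', 'C 2/3', 'C 3/3']
```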

### Developments Sept 17 /22

I got the 1/2,2/3,3/3 script working. It wasn't too hard in the end. What took the past few days was trying to implement the recursive scheduling strategy. I've got it working now (mostly?) though not without difficulties along the way with typos, logic errors, and accidental infinite loops. Par for the course, of course. But what a pain! I am now seeing what I hope is truly the final problem... the triggers I defined as reason to start a new schedule from the top are as follows: when assigning a slot, you follow up to see what slots that person could previously do which now they can't due to being assigned to the slot at hand. If one of those other slots now has no volunteers, that's the trigger to add that slot to the forcing list, and restart the scheduling loop. But I realized when making these things that it isn't right to think of it as the forcing phase. Rather, it is simply the priority phase. That is, before any assignments are made, we think we have the priority sequence of slots to assign based on volunteers available, but the process of assigning changes that sequence, as we find with our checking. So really phase one is not a forcing phase, it is simply the phase where we assign the slots we know to be most constrained, and that info only comes from previous iterations where the restart was triggered.

So... the trigger condition comes from checking which slots the person is no longer eligible for. But the problem I see now is the problem of forcing limitations. Specifically, when I was doing the check on 'What other slots are they still an eligible volunteer for?', I was making that check assuming all the OT was voluntary... but some slots may or may not become ineligible depending on how many hours they've been forced thus far in the week and how many have been worked in total in the week... The failure mode that revealed this to me was that after the priority forcing phase, a person was scheduled for voluntary time *prior* to their 8 forced hours. This created a situation where the person was being forced after 48 hours in the week, which is not allowed.

As I see it, the solution can be approached from one side or another: trying to fit this check in for all potential slots that someone could be assigned to, when they are assigned to another slot, or simply checking if the rule is being broken *after* assigning someone a slot. I think the latter method is easier (iterate through assigned slots and run the total hours/forced hours etc.) but the former identifies the issue before assignment, which is aligned with the other methods I'm employing. Nowhere else is the condition checked after assignment. I think doing it before assignment makes more sense from a strategy and consistency perspective. So the form that takes is as follows: when a person is scheduled to a slot, observe if they are forced for any hours, and if so, remove their eligibility for slots before the forcing such that the forced hours cannot go past 48 in the week... in this way, if that person was the only eligible volunteer for a slot prior to their forcing, that slot will then be identified as having no volunteers, putting it in the priority assignment phase, where it will be forced (or voluntarily assigned) prior to the other one which was being assigned. I remember in my previous update I mentioned the methodology I was using would leave the possibility open of not assigning people to slots prior to their forcing... I don't know if that's still possible. My brain hurts. I can't tell if I'm eliminating the failure modes entirely, or just one type of failure mode (>48hrs voluntary before forcing). But I think it's good.

Well then I thought to myself: if I remove their ID from the list of eligible volunteers for a slot, but this forcing past 48 thing isn't actually evaluated in the function that checks if a slot is ok to assign, doesn't that leave the failure mode open still? Yea... in a situation where there are other eligible volunteers for that slot, the trigger to restart wouldn't be hit... and if the person whose ID was removed from the list had priority assignment to that slot before the other people, then they would be selected and assigned, still breaking the rule. And I can't make their ID being present in the volunteer list a 'slot OK' check criterion because I already have a failsafe situation where someone might be an eligible volunteer though their name *isn't* in the list. So I can't just make the opposite happen here... I will have to resort to after-assignment checking of the rule being broken, I suppose.
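An after-assignment check of this kind could look roughly like the sketch below. It assumes the rule is simply that a forced slot may not take a person's running weekly total past 48 hours, that each assignment is a hypothetical `(slot_id, hours, kind)` tuple in chronological order, and that `kind` is `'V'` for voluntary or `'F'` for forced. The real rules have more conditions than this.

```python
WEEKLY_FORCE_CAP = 48  # hours; assumed to be the only limit that matters here


def forcing_violations(base_hours, assignments):
    """Return the ids of forced slots that push the running weekly total past the cap.

    base_hours: regular hours already worked in the week (e.g. 40 or 32)
    assignments: chronological list of (slot_id, hours, kind)
    """
    total = base_hours
    broken = []
    for slot_id, hours, kind in assignments:
        total += hours
        if kind == "F" and total > WEEKLY_FORCE_CAP:
            broken.append(slot_id)
    return broken


# Voluntary time *before* the forcing breaks the rule...
print(forcing_violations(40, [("sat_am", 4, "V"), ("sat_pm", 8, "F")]))  # ['sat_pm']
# ...while the same hours in the other order do not
print(forcing_violations(40, [("sat_pm", 8, "F"), ("sun_am", 4, "V")]))  # []
```

Any slot ids returned could then be added to the priority list before restarting the schedule.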

### Success Sort of - Sept 19

It really seems like it's fully working, but the latest real world weekend schedule had a *lot* of unusual particularities which would've taken me a long while to input, so I didn't proof check it with that real data. We'll give it a go next week!

If I can make the HFS side of it work in time. Doing that today.

### Progress, Surely - Sept 26
I got that working later that same night. The next week, I trialled it using the previous week's weekend data but... I was sunk by the edge cases of it. That week was a real gong show of peculiarities in the schedule. Trying again this week and the results are promising but not perfect. I'm seeing people being forced in for 12 hours, people forced who are on long term leave, and it failed to assign someone for 12 even though they were willing, forcing in lieu. The long term leave thing is a simple fix, but the forcing and failure to assign 12 hours is a bit of a mystery.

### Could This Be It? - Sept 27

Fixing people on long term leave was easy but still a nice classic little exercise in choosing between a quick and dirty way that appears to work but doesn't really, and an equally quick way that really works. The former, which I tried first, was to not define an employee if their crew wasn't one of the accepted values. I rapidly hit some snags such as a variety of unexpected values, and errors being thrown when the ee object wasn't created for someone who was on the WWF list who I failed to mark as inactive due to vacation. The better way was to add a new check in the 'slotOK' check function to see if the person's crew is valid. This way all ee objects are still made so there aren't any errors thrown when trying to reference one that will ultimately be invalid.

The matter of people being forced in for 12 hours was simply because I had forgotten to select '40 hour week' instead of '32' and so people were forceable for up to 16 hours to take them to 48 total in the week. The 'failing to assign, forcing in lieu' was a misread on my part as there were valid rules being followed that prevented that person from being eligible though they were volunteering.

I encountered some other interesting issues when troubleshooting, such as identifying a situation where someone was absent for multiple days in the week, but did OT on the other days. Since the sheet isn't given sick hours, it made it seem that the person wasn't forceable (being past 48 hours) when really they were. Until/unless that capability gets built in, forcings would always have to be double checked.

A more interesting problem was that when I put a specified assignment in the assignment list, it was being implemented but at the same time, not... the person's other slot indicated 2/2, implying the other job was there, and the slot was present in the employee object assignment list, but the slot was assigned to someone else. It turns out what was happening here was that when the list of slots to assign to is initially defined, it only omitted slots with 'DNS' or 'WWF' assignment, when it should've omitted 'V' and 'F' as well for those cases where those assignments were pre-assigned in the assignment list. Since that was missing, the assignments were made but then the slots were re-assigned (overwriting slot info but not original assignee info), resulting in the bug.
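In sketch form, the fix amounts to widening the set of codes that mark a slot as already spoken for. The dictionary layout and the names `PREASSIGNED_CODES` and `slots_to_assign` below are assumptions for illustration only; just the four codes come from the description above.

```python
# Codes that mean a slot is already settled and must not be assigned again.
# The bug was that only 'DNS' and 'WWF' were in this set.
PREASSIGNED_CODES = {"DNS", "WWF", "V", "F"}


def slots_to_assign(slots):
    """Keep only the slots whose code does not mark them as pre-assigned."""
    return [s for s in slots if s["code"] not in PREASSIGNED_CODES]


slots = [
    {"id": "sat_7a", "code": None},    # open
    {"id": "sat_11a", "code": "V"},    # voluntary, entered via the assignment list
    {"id": "sat_3p", "code": "WWF"},
    {"id": "sat_7p", "code": "F"},     # forced, entered via the assignment list
]
print([s["id"] for s in slots_to_assign(slots)])  # ['sat_7a']
```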

.....

And.. I've reviewed the real life version and the automatic version, and that's it! They aren't a perfect match, but for every different assignment, the automatic version is what it should be based strictly on the input data. The remaining differences I put down to human judgement or people changing their minds about working after the fact not being reflected in the polling sheet. Success!

### Adjustments Required - Sep 29 2022

After review with the team, we identified the issue that it needs to give precedence to someone who wants 8 hours, even if they are lower in the selection sequence than someone who wants 4 hours. The problem I stopped at when trying to do this in earlier versions was that it became very complicated when choosing to explore neighbours to the left or right (slots chronologically before or after). It's further complicated by the issue of shift-selection precedence and the possibility of someone taking an 8 hour shift straddling two shift periods, meaning the sequence of priority selection is different across those slots. The solution we came to is a slight injury to my pride, but I can live with that. That's because it is in fact a return to what I originally said could not be, which is assigning the slots the way the people working the process do it currently. Which is to iterate through the 3 shifts of the day, and iterate through the priority selection sequence assigning people the slots they want. But first to do that only for people who want full 8 hour slots on the shift they were scheduled for this past week. What this does is essentially take away one part of that 'which side to explore' question. By immediately assigning 8 hour shifts within the defined night, day, and afternoon shift periods to anyone who is eligible (chosen in priority sequence), slots that are unassigned going into the next phase can only become a part of an 8 hour shift if they become one half of a straddled shift. That makes it only a one way search, much easier. So this can be iterated upon to identify all straddled 8 hour shifts, or necessarily any 12 hour shift. Finally, any slot that remains unassigned is open to 4 hour assignments. Time to implement.

--
Few hours later. Good progress. I am for once implementing the best practice of testing that it's working after every couple of lines that I write. Hopefully this will prevent the need for any debugging once the last line is written. Have already added the differentiation between probationary and full time, modified the assignee selector to reflect what that sequence should be, and a few other things. So tempted to continue but need to go to the gym.

### Painful - Sep 29 2022

Many hours today. Finally lots of progress making it prioritize 8 hour blocks over 4, but I'm in the home stretch right now and I can't manage to tie the bow on it... one last nagging issue here of leaving one single slot open when it should be forced, and I can't understand why it's overlooking that.


Bloody bloody bloody lessons learned - caution with using sub functions inside a recursive function. Caution with version tracking and experimenting with different changes in different files. Caution especially with that latter item, and with importing those modules within other modules (changed the file name and reflected it in main but not in the other auxiliary file also referencing it), and AGAIN with copying and pasting old code and assuming it would work (not recursing when the final phase (4hr assn) caused a slot with no volunteers), and also again having the right logic but implementing it incorrectly ("phase 0.5"... no... phase 0 had to go through with pre8 in chronological order and assign pre8 if observed). And also keeping in mind that extend() works in place and returns None (as does append() on a plain Python list), so assigning its result loses the list. What a waste.
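As a quick illustration of that last lesson, using plain Python lists (the variable names are made up for the example):

```python
slots = ["7a-11a"]

# extend() mutates the list in place and returns None,
# so capturing its result throws the list away
result = slots.extend(["11a-3p", "3p-7p"])
print(result)  # None
print(slots)   # ['7a-11a', '11a-3p', '3p-7p']

# append() on a list behaves the same way
also_none = slots.append("7p-11p")
print(also_none)  # None

# To get a new list as a value, use concatenation instead
combined = slots + ["11p-3a"]
print(len(slots), len(combined))  # 4 5
```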

Thought I had it going but... seems not... when using only can lines, seeing weird things. Seems like some WWF spots are being overwritten, among other things as well... see 'cans only' file

also the other one ('rev 1') kicked out after 25 iterations with more spaces to go but can also see things wrong there.. check it out... trace it back..

### The Promised Land - Oct 13 2022

It was on Oct 6, last Thursday, that I finally slew that beast of the scheduling program. Coming back to it now to finish off this post and mention closing thoughts/details/lessons learned. The main thing, believe you me, is that I am a fool. In truth, I *PRESUMED* and that led me astray. I recalled that MIT AI lecture in which the prof demonstrated a greedy algorithm to solve constrained assignment problems, and from that faint, 2 year old memory, I dared to presume that - yes - I was right. I knew the way, the best way, the way it *should* be done, and that the way it *was* being done was wrong, and no one knew but me, and that was that. Well I think any practitioner of change management and programming could tell you that that was laughable! It's usually done the way it is for a reason. I was right, based on the assumptions I was making, but in the grand scheme of things, I was so wrong because the truth is that I was making some simplifying assumptions because it suited me. The killer mistake was omitting the requirement/constraint that 8 hour shifts took preference over 4 hour shifts, superseding the shift-selection-priority clause. Yes, flipping the script to assign people to slots instead of slots to people was a great idea! But the issue, as explored in earlier blog sections, was that by proceeding in this way (1 slot at a time), it couldn't account for the possibility of someone with lower precedence winning the slot because they took 8 hours. And I was all hung up on making it happen in 8 hour chunks because of the question 'how do you handle the possible 8 hour assignable period straddling 2 regularly scheduled shifts?' A chicken or egg problem, as the taking of a regularly scheduled 8 hour shift would inform the possibility of one of the halves being available for a straddle shift, while you could say the same about the reverse.

In the end, I just let go of my pride and did it the way it's always been done, instantiated in the code. Because (who guessed it) that's the way it has to be done. So the final process is as follows:
- Assign, forcing if necessary, all slots in the priority list, in chronological order, that list being initialized in the first iteration as all slots with no volunteers
- Look at every possible 8 hour block on regularly scheduled shift times and fill them in with eligible volunteers as they are found.
   - Iterating through 1st half + 2nd half possibilities in sequence of most constrained slots. (constraint measured by measure of eligible volunteers)
   - Look through all options for Sunday, then Friday, then Saturday, then Monday.
- Look at every possible 8 hour block straddling regularly scheduled shift times and fill them in with eligible volunteers as they are found.
   - Iterating through 1st half + 2nd half possibilities in sequence of most constrained slots. (constraint measured by measure of eligible volunteers)
   - Look through all options for Sunday, then Friday, then Saturday, then Monday.
   - In terms of who gets first pick at straddled shift, it goes to the crew the first half falls on for first pick, then the crew 2nd half falls on, then the third crew. 
- Finally, once all these are done, proceed with *'el classico David especiale'* assigning 4 hour slots one at a time, one full day at a time in order of previously mentioned day preference, and then in order of most to least constrained. 
- If at any time an assignment results in another slot having no more eligible volunteers, or a broken forcing rule (e.g. someone was forced in phase 1, then voluntarily assigned to an earlier slot, meaning they put in voluntary time, disallowing them from being forced afterwards), then add the slots that were broken to the priority list for assignment, forced or otherwise, in phase 1, and restart the entire process.
   - Since 8 hour assignments cannot be checked when evaluating one slot at a time, but the slots are evaluated one at a time in the forcing phase, there is a built in 8 hour assignment tracker such that in the first phase, when a slot is encountered on the priority list for which an 8 hour assignment was previously made, that assignment is made again in that initial phase so long as it doesn't break key rules for the previously identified assignee. An example of when the same assignment wouldn't be made is if that person was now forced into a prior slot, and the 8 hour assignment isn't viable, for example because there is too short of a shift gap between the forcing and the 8 hour assignment, or the forcing coincides with the 8 hour assignment but for a different job, etc.
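The two 8 hour phases in the list above differ only in how 4 hour slots are paired up. A minimal sketch of that pairing follows, assuming each day is six 4 hour slots indexed 0 to 5 so that regular shifts cover the pairs (0,1), (2,3), (4,5) and straddled shifts cover (1,2), (3,4). The function names and the `eligible` layout are hypothetical.

```python
def eight_hour_blocks(n_slots, straddling):
    """Pairs of adjacent 4 hour slot indices forming 8 hour blocks.

    straddling=False: blocks aligned with regular shifts, e.g. (0, 1), (2, 3)
    straddling=True:  blocks crossing a shift boundary, e.g. (1, 2), (3, 4)
    """
    start = 1 if straddling else 0
    return [(i, i + 1) for i in range(start, n_slots - 1, 2)]


def order_blocks(blocks, eligible):
    """Drop blocks nobody can work in full, then sort most constrained first."""
    def takers(block):
        first_half, second_half = block
        return eligible[first_half] & eligible[second_half]  # willing for both halves

    viable = [b for b in blocks if takers(b)]
    return sorted(viable, key=lambda b: (len(takers(b)), b))


eligible = {0: {"A"}, 1: {"A", "B"}, 2: {"A", "B"}, 3: {"A", "B"}, 4: set(), 5: set()}

regular = order_blocks(eight_hour_blocks(6, straddling=False), eligible)
straddled = order_blocks(eight_hour_blocks(6, straddling=True), eligible)
print(regular)    # [(0, 1), (2, 3)]
print(straddled)  # [(1, 2)]
```

Measuring constraint by the people eligible for both halves is one reading of "measure of eligible volunteers"; the real program may count differently.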

There were also a number of other significant upgrades made in that final week, such as the ability to schedule in the weekender crew according to the usual rules. Previously I just had all their assignments hard coded in via the assignment list. But for the long weekend, I made it so that by selecting one input or another, weekenders could be assigned overtime (above their 12+12) following the same rules as all the others, including them in the correct priority sequence of selection for any given slot etc. I also finally fixed the assignment logging function, which had been mostly broken for most of the development process, and enhanced the supplementary data printout functions on the secondary tabs, including that log in the printout to inform why decisions were made as they were throughout the assignment process so that this algorithm was completely transparent.

And to my great relief it finally worked very well on the Thanksgiving long weekend test case, for which I am most grateful after my long trials and tribulations. Now a couple more weeks of test cases/proof testing and training up and sharing with the team and I should be able to wash my hands of this.