# Airline Crew Schedule Bidding

*By Brian Smoliak*

An airline schedules its in-flight crews based on their seniority with the company. Each month, the airline divides its flight schedule into a series of subsets called *patterns*. Some patterns are mostly populated with short, regional daytrips. Others contain layovers, red-eye flights, and long days. Employees must bid for their preferred patterns. 

In [5]:
import pandas as pd
import pdftotext
import numpy as np
from IPython.display import display, HTML

## April 2020

1. Change the number of patterns (`n_patterns`) to match the number in the current month's bid package.
2. Change the number of flight attendants (`n_fas`) to match the number in the current month's bid package.

In [6]:
n_patterns = 146
n_fas = 397

3. Change the values of the Python dictionary `positions` below so that each key (a number of positions per line) maps to the lines that have that many positions, as indicated in the current month's bid package.

In [16]:
# key: number of positions on a line; value: the lines with that many positions
positions = {1: np.arange(73, 121).tolist() +
                [124, 129, 131, 135] +
                np.arange(137, 147).tolist(),
             3: [125, 126, 130, 133],
             4: np.arange(1, 73).tolist() +
                [121, 122, 123, 127, 128, 132, 134, 136]}
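
Because `positions` is edited by hand each month, a quick consistency check can catch a pattern that was skipped or listed twice. The sketch below is not part of the original notebook: `check_positions` is a hypothetical helper, and the toy dictionary stands in for a real bid package.

```python
from collections import Counter


def check_positions(positions, n_patterns):
    """Return (missing, duplicated, total_positions) for a positions dictionary.

    Hypothetical helper: `positions` maps a number of positions to the
    pattern numbers that have that many positions.
    """
    counts = Counter(int(p) for lines in positions.values() for p in lines)
    missing = [p for p in range(1, n_patterns + 1) if p not in counts]
    duplicated = sorted(p for p, c in counts.items() if c > 1)
    total_positions = sum(n * len(lines) for n, lines in positions.items())
    return missing, duplicated, total_positions


# Toy data: six patterns; pattern 5 is forgotten and pattern 2 is listed twice.
toy_positions = {1: [1, 2], 3: [2, 3], 4: [4, 6]}
missing, duplicated, total_positions = check_positions(toy_positions, n_patterns=6)
print(missing, duplicated, total_positions)
```

For a real bid package both lists should come back empty, and the total is worth comparing against `n_fas`.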

### Functions

In [7]:
def extract_bids_from_pdf(filename):
    """Extract each flight attendant's bids from the bid PDF into a DataFrame"""

    bids = {}

    with open(filename, "rb") as f:
        pdf = pdftotext.PDF(f)

    for page in pdf:
        for line in page.splitlines():
            line_items = line.split()
            # Bid lines start with a seniority number; bids follow the "FA" token.
            # Skip blank lines and any line that does not match.
            if line_items and line_items[0].isdigit() and "FA" in line_items:
                bid = line_items[line_items.index("FA") + 1:]
                bids[line_items[0]] = bid

    # Pad shorter bid lists with "0" (no bid) so every column has the same length
    max_bids = max(len(bid) for bid in bids.values())
    filled_bids = {fa: bid + ["0"] * (max_bids - len(bid))
                   for fa, bid in bids.items()}

    return pd.DataFrame(filled_bids).astype(int)
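
The parsing rule inside `extract_bids_from_pdf` can be tried on plain strings without a PDF. The sketch below assumes, as the function does, that a bid line starts with the seniority number and that the bids follow the token `FA`. The sample lines and the helper names `parse_bid_line` and `pad_bids` are made up for illustration.

```python
def parse_bid_line(line):
    """Return (seniority, bids) for a bid line, or None for any other line."""
    items = line.split()
    if not items or not items[0].isdigit() or "FA" not in items:
        return None
    return items[0], items[items.index("FA") + 1:]


def pad_bids(bids, fill="0"):
    """Pad every bid list with `fill` so all lists have the same length."""
    width = max((len(b) for b in bids.values()), default=0)
    return {fa: b + [fill] * (width - len(b)) for fa, b in bids.items()}


# Hypothetical lines; the real layout of the bid PDF may differ.
sample = ["Seniority Name Base Bids", "", "1 DOE FA 19", "2 ROE FA 19 16 25"]
parsed = dict(p for p in map(parse_bid_line, sample) if p is not None)
padded = pad_bids(parsed)
print(padded)
```

Header and blank lines are skipped, and the shorter bid list is padded with `"0"`, which the rest of the notebook treats as "no bid".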

In [8]:
def create_pattern(positions, n_patterns):
    """Create a dataframe containing pattern positions"""

    max_positions = max(positions.keys())

    column_names = ["Position " + str(i) for i in range(1, max_positions + 1)]

    patterns = pd.DataFrame(np.zeros((n_patterns, max_positions), dtype=int),
                            index=range(1, n_patterns + 1),
                            columns=column_names)
    patterns.index.name = "Pattern #"

    # Patterns with fewer than max_positions positions: keep the first i
    # columns open (0) and mark the remaining columns unavailable (999)
    for i in positions:
        if i == max_positions:
            continue
        for j in positions[i]:
            patterns.iloc[j - 1, i:] = 999

    return patterns

In [9]:
def assign_fa(seniority, bid, patterns):
    """Assign a flight attendant to a line"""

    row_ind = None

    # Honor the first bid whose pattern still has an open position (0)
    for choice in bid:
        if (choice != 0 and choice in patterns.index
                and (patterns.loc[choice] == 0).any()):
            row_ind = choice
            break

    # No bid could be honored: fall back to the first pattern with an opening
    if row_ind is None:
        for i in patterns.index:
            if (patterns.loc[i] == 0).any():
                row_ind = i
                break

    if row_ind is not None:
        # idxmax on a boolean row returns the label of the first open position
        col_ind = (patterns.loc[row_ind] == 0).idxmax()
        patterns.loc[row_ind, col_ind] = seniority

    return patterns

In [10]:
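def predict_pattern(bids, patterns, **kwargs):
    """Predict the pattern"""

    # stop is the last seniority number to assign (inclusive)
    stop = kwargs.get("stop", len(bids.columns))

    for i in range(1, stop + 1):

        patterns = assign_fa(i, bids[str(i)], patterns)

    return patterns

In [11]:
def organize_assignments(bids, patterns, **kwargs):
    """Look up the line and position assigned to each flight attendant"""

    # stop is the last seniority number to look up (inclusive)
    stop = kwargs.get("stop", len(bids.columns))

    assignments = pd.DataFrame(np.zeros((stop, 2), dtype=int),
                               index=range(1, stop + 1),
                               columns=["Line", "Position"])
    assignments.index.name = "Bid #"

    for i in range(1, stop + 1):

        # Row and column offsets of the cell holding seniority number i
        rows, cols = np.where(patterns.to_numpy() == i)
        if len(rows) == 0:
            continue  # FA i was not assigned; leave zeros
        assignments.loc[i, "Line"] = patterns.index[rows[0]]
        # Column labels look like "Position 3"; keep the number
        assignments.loc[i, "Position"] = int(patterns.columns[cols[0]].split()[-1])

    return assignments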

### Results

First we extract the bids from the triangle PDF into a pandas DataFrame:

In [17]:
bids = extract_bids_from_pdf("data/e-Crew-mar.pdf")

Calling the `head` method lets us look at the first few bids of each FA:

In [18]:
bids.head()

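|   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ... | 393 | 394 | 395 | 396 | 397 | 398 | 399 | 400 | 401 | 402 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 19 | 19 | 26 | 19 | 16 | 19 | 12 | 18 | 19 | 14 | ... | 121 | 105 | 99 | 106 | 107 | 98 | 0 | 121 | 0 | 121 |
| 1 | 0 | 0 | 0 | 16 | 25 | 16 | 18 | 8 | 15 | 16 | ... | 119 | 112 | 100 | 100 | 114 | 99 | 0 | 122 | 0 | 122 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 13 | 16 | 8 | 19 | ... | 120 | 111 | 114 | 99 | 119 | 100 | 0 | 120 | 0 | 113 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 117 | 118 | 115 | 119 | 115 | 107 | 0 | 113 | 0 | 107 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 116 | 119 | 116 | 113 | 113 | 114 | 0 | 106 | 0 | 104 |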


Next we call the `create_pattern` function to build a DataFrame containing `0` where an FA can be assigned and `999` where a position is unavailable for assignment.

In [85]:
patterns = create_pattern(positions, n_patterns)
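
To make the `0`/`999` convention concrete, the sketch below builds the same kind of grid for three toy patterns using plain NumPy. It illustrates the idea rather than reproducing `create_pattern`. The name `toy_grid` and the toy dictionary are assumptions, and each dictionary key is taken to be the number of open positions on a pattern.

```python
import numpy as np

BLOCKED = 999  # sentinel for a position that does not exist on a pattern

# Toy data: pattern 1 has 1 position, pattern 2 has 3, pattern 3 has 4.
toy_positions = {1: [1], 3: [2], 4: [3]}
n_toy_patterns = 3
max_positions = max(toy_positions)

toy_grid = np.zeros((n_toy_patterns, max_positions), dtype=int)
for n_open, pattern_numbers in toy_positions.items():
    for p in pattern_numbers:
        # Keep the first n_open columns open and block the rest.
        toy_grid[p - 1, n_open:] = BLOCKED

print(toy_grid)
```

Each row then has exactly as many zeros as the pattern has positions, which is what the assignment step relies on when it looks for an opening.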

At this point we can call the `predict_pattern` function on the entire triangle and `organize_assignments` to generate a DataFrame with lines and positions.

In [86]:
patterns = predict_pattern(bids, patterns)

In [87]:
assignments = organize_assignments(bids, patterns)
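
Conceptually, the prediction is a greedy pass in seniority order: each FA receives the first pattern on their bid list that still has an open position, and otherwise the first pattern with any opening. The following is a minimal pure-Python sketch of that idea on toy data, not the notebook's implementation. `greedy_assign` and its inputs are hypothetical.

```python
def greedy_assign(bids, capacity):
    """Assign FAs to patterns in seniority order.

    bids: dict mapping seniority number -> list of pattern numbers, best first.
    capacity: dict mapping pattern number -> number of open positions.
    Returns a dict mapping seniority number -> pattern number (or None).
    """
    remaining = dict(capacity)  # copy so the caller's dict is untouched
    result = {}
    for fa in sorted(bids):  # lowest seniority number chooses first
        choice = next((p for p in bids[fa] if remaining.get(p, 0) > 0), None)
        if choice is None:
            # No bid can be honored: take the lowest-numbered open pattern.
            choice = next((p for p in sorted(remaining) if remaining[p] > 0), None)
        if choice is not None:
            remaining[choice] -= 1
        result[fa] = choice
    return result


toy_bids = {1: [2], 2: [2, 1], 3: [2], 4: [1]}
toy_capacity = {1: 1, 2: 1, 3: 2}
toy_result = greedy_assign(toy_bids, toy_capacity)
print(toy_result)
```

In the toy data, FA 1 takes the only position on pattern 2, FA 2 settles for their second choice, and FAs 3 and 4 fall through to pattern 3 because everything they bid on is full.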

Once the assignments are predicted, we can display them by FA number and by pattern number:

In [None]:
display(HTML(assignments.to_html()))

In [None]:
display(HTML(patterns.to_html()))

#### Open positions remaining

Sometimes an FA may want to know which patterns are available just prior to their bid. We can use the `stop` argument of `predict_pattern` and `organize_assignments` to find out. Pass the seniority number just before the FA of interest (e.g. `124` for FA #125).

In [None]:
stop = 124
patterns = create_pattern(positions, n_patterns)
patterns = predict_pattern(bids, patterns, stop=stop)
assignments = organize_assignments(bids, patterns, stop=stop)

By stopping the assignment process early, we can count the 0s in each row and display the total as the number of open positions for each pattern.

In [88]:
open_positions = (patterns == 0).sum(axis=1).to_frame("Open Positions")
display(HTML(open_positions.to_html()))

| Pattern # | Open Positions |
| --- | --- |
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
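
Since most rows in such a table are zero, it can help to keep only the patterns that still have an opening. The sketch below shows one way to do that on a small made-up grid. The variable names and the toy values are assumptions, not output from the bid data.

```python
import pandas as pd

# Toy grid: 0 = open, 999 = position does not exist, other values = assigned FA.
toy = pd.DataFrame(
    [[1, 4, 0, 0], [2, 999, 999, 999], [3, 0, 0, 999]],
    index=pd.Index([1, 2, 3], name="Pattern #"),
    columns=[f"Position {i}" for i in range(1, 5)],
)

# Count the zeros in each row, then drop patterns that are already full.
open_counts = (toy == 0).sum(axis=1).rename("Open Positions")
still_open = open_counts[open_counts > 0]
print(still_open)
```

The same filter applied to the real `patterns` DataFrame would list only the lines an FA could still be awarded.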
