# Airline Crew Schedule Bidding

*By Brian Smoliak*

An airline schedules its in-flight crews based on their seniority with the company. Each month, the airline divides its flight schedule into a series of subsets called *patterns*. Some patterns consist mostly of short, regional day trips. Others contain layovers, red-eye flights, and long days. Employees bid for their preferred patterns, and bids are honored in seniority order.

In [1]:
import pandas as pd
import pdftotext
import numpy as np
from IPython.display import display, HTML

## April 2020

1. Change the number of patterns (`n_patterns`) to match the number in the current month's bid package.
2. Change the number of flight attendants (`n_fas`) to match the number in the current month's bid package.

In [2]:
n_patterns = 127
n_fas = 301

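3. Change the values of the Python dictionary `positions` to match the number of positions per line, as indicated in the current month's bid package. Each key is a number of positions, and each value lists the pattern numbers with that many positions. (In this notebook, *line* and *pattern* are used interchangeably.)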

In [3]:
positions = {1: np.arange(55, 106).tolist() +
                [107, 109, 111, 112, 113, 114,
                 117, 122, 123, 124, 126],
             2: [106, 108, 115, 116, 118, 119, 121, 125, 127],
             3: [110],
             4: np.arange(1, 55).tolist() + 
                [121, 122, 123, 127, 128, 132, 134, 136, 144]}
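
Because `positions` is edited by hand each month, a quick sanity check before running the rest of the notebook may be worthwhile. The sketch below is not part of the original workflow: `check_positions` is a hypothetical helper, and the toy dictionary stands in for a real bid package. It reports pattern numbers that are listed more than once and numbers that fall outside `1..n_patterns`.

```python
from collections import Counter


def check_positions(positions, n_patterns):
    """Return (duplicates, out_of_range) pattern numbers found in `positions`."""
    # Count how many times each pattern number is listed across all keys
    counts = Counter(p for pattern_numbers in positions.values()
                     for p in pattern_numbers)
    duplicates = sorted(p for p, n in counts.items() if n > 1)
    out_of_range = sorted(p for p in counts if not 1 <= p <= n_patterns)
    return duplicates, out_of_range


# Toy data (hypothetical): pattern 4 is listed twice, pattern 9 does not exist
toy_positions = {1: [3, 4], 2: [4, 5], 4: [1, 2, 9]}
duplicates, out_of_range = check_positions(toy_positions, n_patterns=6)
print(duplicates)    # [4]
print(out_of_range)  # [9]
```

Whether a duplicate or out-of-range entry matters depends on how the dictionary is consumed, so treat the result as a prompt to re-read the bid package rather than as an error.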

### Functions

In [4]:
def extract_bids_from_pdf(filename):
    """Extract each flight attendant's bids from the bid PDF into a DataFrame"""

    bids = {}

    with open(filename, "rb") as f:
        pdf = pdftotext.PDF(f)

    for page in pdf:
        for line in page.splitlines():
            line_items = line.split()
            # Skip blank lines and lines that are not bid rows
            if (not line_items or not line_items[0].isdigit()
                    or "FA" not in line_items):
                continue
            # A bid row starts with the seniority number; the bids follow "FA"
            bids[line_items[0]] = line_items[line_items.index("FA") + 1:]

    # Pad every bid list with "0" (no bid) to a common length
    max_bids = max(len(bid) for bid in bids.values())
    filled_bids = {fa: bid + ["0"] * (max_bids - len(bid))
                   for fa, bid in bids.items()}

    return pd.DataFrame(filled_bids).astype(int)
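
The parsing rule inside the loop can be tried on a single line without a PDF. This is a minimal sketch: `parse_bid_line` is a hypothetical helper, and the sample line layout (seniority number, a name, the token `FA`, then the bids) is an assumption inferred from the code above, not taken from a real bid package.

```python
def parse_bid_line(line):
    """Return (seniority, bids) for a bid row, or None for any other line."""
    items = line.split()
    # A bid row starts with a number and contains the token "FA"
    if not items or not items[0].isdigit() or "FA" not in items:
        return None
    # Everything after "FA" is treated as the ordered list of bids
    bids = [int(x) for x in items[items.index("FA") + 1:] if x.isdigit()]
    return items[0], bids


# Hypothetical sample lines
print(parse_bid_line("12 DOE J FA 36 39 11"))  # ('12', [36, 39, 11])
print(parse_bid_line("Page 3 of 9"))           # None
print(parse_bid_line(""))                      # None
```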

In [5]:
def create_pattern(positions, n_patterns):
    """Create a dataframe containing pattern positions"""

    max_positions = max(positions.keys())

    column_names = ["Position " + str(i) for i in range(1, max_positions + 1)]

    # Every position starts out open (0)
    patterns = pd.DataFrame(np.zeros((n_patterns, max_positions), dtype=int),
                            index=range(1, n_patterns + 1),
                            columns=column_names)
    patterns.index.name = "Pattern #"

    # Patterns with fewer than max_positions positions get their
    # surplus columns blocked with 999
    for i in positions:
        if i == max_positions:
            continue
        for j in positions[i]:
            patterns.iloc[j - 1, i:] = 999

    return patterns

In [6]:
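def assign_fa(seniority, bid, patterns):
    """Assign a flight attendant to a line"""

    row_ind = np.nan
    col_ind = np.nan

    # Honor the first bid that still has an open position (0);
    # bids for pattern numbers that are not in `patterns` are ignored
    for choice in bid:
        if choice != 0 and choice in patterns.index:
            if any(patterns.loc[choice] == 0):
                row_ind = choice
                col_ind = patterns.loc[choice].where(patterns.loc[choice] == 0).idxmin()
                break

    # No bid could be honored: fall back to the first pattern with an opening
    if np.isnan(row_ind):
        for i in patterns.index:
            if any(patterns.loc[i] == 0):
                row_ind = i
                col_ind = patterns.loc[row_ind].where(patterns.loc[row_ind] == 0).idxmin()
                break

    # Leave `patterns` unchanged if every position is already filled
    if not np.isnan(row_ind):
        patterns.loc[row_ind, col_ind] = seniority

    return patterns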

In [7]:
def predict_pattern(bids, patterns, **kwargs):
    """Assign FAs to patterns in seniority order, for FAs 1 through stop - 1"""

    stop = kwargs.get("stop", len(bids.columns))

    for i in range(1, stop):

        patterns = assign_fa(i, bids[str(i)], patterns)

    return patterns

In [8]:
def organize_assignments(bids, patterns, **kwargs):
    """List the line and position assigned to FAs 1 through stop - 1"""

    stop = kwargs.get("stop", len(bids.columns))

    assignments = pd.DataFrame(np.zeros((stop-1, 2), dtype=int),
                               index=range(1, stop),
                               columns=["Line", "Position"])
    assignments.index.name = "Bid #"

    for i in range(1, stop):

        # Locate the cell that holds seniority number i
        rows, cols = np.where(patterns.to_numpy() == i)
        if len(rows) == 0:
            continue  # FA i was not assigned; leave the row as 0
        assignments.loc[i, "Line"] = patterns.index[rows[0]]
        assignments.loc[i, "Position"] = cols[0] + 1

    return assignments

### Results

First we extract the bids from the triangle PDF into a pandas DataFrame:

In [9]:
bids = extract_bids_from_pdf("../data/e-Crew.pdf")

Calling the `head` method lets us look at the first few bids of each FA:

In [10]:
bids.head()

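| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ... | 388 | 389 | 390 | 391 | 392 | 393 | 394 | 395 | 396 | 397 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 36 | 39 | 36 | 68 | 39 | 36 | 39 | 11 | 46 | 11 | ... | 82 | 144 | 120 | 127 | 128 | 137 | 0 | 124 | 144 | 134 |
| 1 | 0 | 0 | 0 | 0 | 29 | 7 | 81 | 10 | 11 | 35 | ... | 83 | 121 | 127 | 128 | 135 | 141 | 0 | 131 | 145 | 139 |
| 2 | 0 | 0 | 0 | 0 | 0 | 11 | 53 | 36 | 10 | 72 | ... | 84 | 134 | 134 | 137 | 138 | 124 | 0 | 140 | 135 | 146 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 71 | ... | 85 | 131 | 121 | 0 | 131 | 143 | 0 | 141 | 133 | 143 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 86 | 125 | 145 | 0 | 127 | 131 | 0 | 143 | 126 | 137 |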


Next we call the `create_pattern` function to build a DataFrame containing `0` where an FA can be assigned and `999` where a position is unavailable for assignment.

In [11]:
patterns = create_pattern(positions, n_patterns)
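
To see the `0`/`999` convention on a small case, here is a plain-Python sketch that mimics the layout with nested lists instead of a DataFrame. The helper `toy_grid` and its input are illustrative assumptions, not the notebook's data.

```python
BLOCKED = 999  # same sentinel the notebook uses for unavailable positions


def toy_grid(positions, n_patterns):
    """Return {pattern number: [cell, ...]} with 0 = open, 999 = blocked."""
    max_positions = max(positions)
    # Start with every position open
    grid = {p: [0] * max_positions for p in range(1, n_patterns + 1)}
    for n_positions, pattern_numbers in positions.items():
        for p in pattern_numbers:
            # Block the columns beyond this pattern's number of positions
            for col in range(n_positions, max_positions):
                grid[p][col] = BLOCKED
    return grid


# Hypothetical month: pattern 2 has one position, pattern 3 has two,
# pattern 1 has three, and pattern 4 is not listed at all
grid = toy_grid({1: [2], 2: [3], 3: [1]}, n_patterns=4)
for pattern, row in grid.items():
    print(pattern, row)
```

A pattern that is not listed under any key (pattern 4 here) keeps every position open, which matches how the DataFrame version starts from all zeros.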

At this point we can call `predict_pattern` on the entire triangle, then `organize_assignments` to generate a DataFrame with each FA's line and position.

In [12]:
patterns = predict_pattern(bids, patterns)

In [13]:
assignments = organize_assignments(bids, patterns)
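
The assignment logic amounts to a greedy pass in seniority order: each FA gets the first pattern on their bid list that still has an open position, and otherwise the lowest-numbered pattern with room. A compact sketch with plain dictionaries follows. The helper `assign_by_seniority` and the toy data are hypothetical, and the sketch tracks only open-position counts rather than which position each FA holds.

```python
def assign_by_seniority(bids, capacity):
    """Greedy assignment.

    bids: list of preference lists, most senior FA first.
    capacity: {pattern number: number of open positions}.
    Returns ({seniority: pattern}, remaining capacity).
    """
    remaining = dict(capacity)
    assigned = {}
    for seniority, preferences in enumerate(bids, start=1):
        # First preference that still has room
        choice = next((p for p in preferences if remaining.get(p, 0) > 0), None)
        if choice is None:
            # Fallback: lowest-numbered pattern with an open position
            choice = next((p for p in sorted(remaining) if remaining[p] > 0), None)
        if choice is None:
            break  # every position is filled
        remaining[choice] -= 1
        assigned[seniority] = choice
    return assigned, remaining


# Toy data: four FAs all want pattern 2, which has only two positions
assigned, remaining = assign_by_seniority(
    bids=[[2, 1], [2, 3], [2, 1], [2]],
    capacity={1: 1, 2: 2, 3: 1},
)
print(assigned)   # {1: 2, 2: 2, 3: 1, 4: 3}
print(remaining)  # {1: 0, 2: 0, 3: 0}
```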

Once the assignments are predicted, we can display them by FA number and by pattern number:

In [14]:
display(HTML(assignments.to_html()))

| Bid # | Line | Position |
|---|---|---|
| 1 | 36 | 1 |
| 2 | 39 | 1 |
| 3 | 36 | 2 |
| 4 | 68 | 1 |
| 5 | 39 | 2 |
| 6 | 36 | 3 |
| 7 | 39 | 3 |
| 8 | 11 | 1 |
| 9 | 46 | 1 |
| 10 | 11 | 2 |


In [15]:
display(HTML(patterns.to_html()))

| Pattern # | Position 1 | Position 2 | Position 3 | Position 4 |
|---|---|---|---|---|
| 1 | 67 | 130 | 160 | 222 |
| 2 | 89 | 90 | 237 | 250 |
| 3 | 91 | 253 | 257 | 263 |
| 4 | 223 | 226 | 242 | 252 |
| 5 | 264 | 275 | 277 | 281 |
| 6 | 217 | 265 | 270 | 273 |
| 7 | 39 | 54 | 77 | 88 |
| 8 | 58 | 61 | 98 | 112 |
| 9 | 125 | 126 | 221 | 225 |
| 10 | 17 | 18 | 24 | 25 |


#### Open positions remaining

Sometimes an FA may want to know which patterns are available just prior to their bid. We can use the `stop` argument to `predict_pattern` and `organize_assignments` to discover this. Both functions process seniority numbers 1 through `stop - 1`, so pass the seniority number of the FA of interest (e.g. 125 for FA #125, which assigns FAs 1 through 124).

In [16]:
stop = 92
patterns = create_pattern(positions, n_patterns)
patterns = predict_pattern(bids, patterns, stop=stop)
assignments = organize_assignments(bids, patterns, stop=stop)

By stopping the assignment process early, we can count the 0s in each row and display the total as the number of open positions for each pattern.

In [17]:
display(HTML((patterns == 0).sum(axis=1).to_frame("Open Positions").to_html()))

| Pattern # | Open Positions |
|---|---|
| 1 | 3 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 4 |
| 6 | 4 |
| 7 | 0 |
| 8 | 0 |
| 9 | 4 |
| 10 | 0 |
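
The same count can be sketched without pandas. The grid below is made up for illustration: `0` is an open position, `999` is a blocked one, and any other number is the seniority number of the FA holding that position.

```python
BLOCKED = 999

# Hypothetical partial result after stopping the assignment early
grid = {
    1: [5, 0, 0, 0],
    2: [3, BLOCKED, BLOCKED, BLOCKED],
    3: [0, 0, BLOCKED, BLOCKED],
}

# Open positions per pattern: count the zeros in each row
open_positions = {pattern: row.count(0) for pattern, row in grid.items()}

# Patterns the next FA could still be awarded
available = [pattern for pattern, n in open_positions.items() if n > 0]

print(open_positions)  # {1: 3, 2: 0, 3: 2}
print(available)       # [1, 3]
```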
