# Airline analysis

Here, we look at which airlines dominate which airports (grouped by origin city market), using the same data. For simplicity, we only count departing passengers. Since most departing flights have a corresponding return flight, departure shares should be a reasonable proxy for overall shares.

In [None]:
import pandas as pd
import numpy as np

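First, we load the data from the provided CSV files. The sample identifies city markets and carriers by code, so we also merge in the lookup tables to get readable names.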

In [None]:
data = pd.read_csv("data/air_sample.csv")

In [None]:
market_ids = pd.read_csv("data/L_CITY_MARKET_ID.csv")

data = data.merge(
    market_ids.rename(columns={"Description": "OriginCity"}).set_index("Code"),
    left_on="OriginCityMarketID",
    right_index=True
)

data = data.merge(
    market_ids.rename(columns={"Description": "DestCity"}).set_index("Code"),
    left_on="DestCityMarketID",
    right_index=True
)

In [None]:
carriers = pd.read_csv("data/L_CARRIERS.csv")

data = data.merge(
    carriers.rename(columns={"Description": "OperatingCarrierName"}).set_index("Code"),
    left_on="OpCarrier",
    right_index=True
)

data = data.merge(
    carriers.rename(columns={"Description": "TicketingCarrierName"}).set_index("Code"),
    left_on="TkCarrier",
    right_index=True
)
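Note that `merge` defaults to an inner join, so any row whose code is missing from a lookup table is silently dropped. Below is a minimal sketch of a more defensive lookup. It assumes small made-up frames (the codes and names are illustrative, not taken from the real files), and `add_lookup` is a hypothetical helper, not part of the original analysis.

```python
import pandas as pd

# Toy stand-ins for the real tables; all values here are made up for illustration.
flights = pd.DataFrame({
    "OpCarrier": ["AA", "OH", "ZZ"],
    "Passengers": [100, 40, 5],
})
lookup = pd.DataFrame({
    "Code": ["AA", "OH"],
    "Description": ["American Airlines Inc.", "PSA Airlines Inc."],
})


def add_lookup(df, lookup, key, new_name):
    """Left-join a Code/Description lookup table onto df (hypothetical helper)."""
    renamed = lookup.rename(columns={"Description": new_name}).set_index("Code")
    return df.merge(
        renamed,
        left_on=key,
        right_index=True,
        how="left",        # keep flights even when the code is unknown
        validate="m:1",    # raise if a code appears more than once in the lookup
    )


merged = add_lookup(flights, lookup, "OpCarrier", "OperatingCarrierName")

# Rows whose code had no match end up with a missing name instead of vanishing.
unmatched = merged[merged["OperatingCarrierName"].isna()]
```

Inspecting `unmatched` after each merge shows how many passengers would otherwise have been left out of the shares computed below.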

## Market shares

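Now, we compute each operating carrier's share of the passengers departing from each origin city. To keep the output readable, we only display rows with more than 1,000 passengers.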

In [None]:
mkt_shares = (
    data
        .groupby(["OriginCity", "OperatingCarrierName"])
        .Passengers
        .sum()
        .reset_index()
)

mkt_shares["market_share"] = mkt_shares.Passengers / mkt_shares.groupby("OriginCity").Passengers.transform("sum")

mkt_shares = mkt_shares.sort_values("market_share", ascending=False)

mkt_shares.loc[mkt_shares.Passengers > 1000]
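To see what the `transform("sum")` step does, here is the same computation on a tiny table. The cities, carriers, and passenger counts are made up purely for illustration.

```python
import pandas as pd

# Made-up passenger totals per (city, carrier); not real data.
toy = pd.DataFrame({
    "OriginCity": ["A", "A", "B", "B", "B"],
    "OperatingCarrierName": ["X", "Y", "X", "Y", "Z"],
    "Passengers": [300, 100, 50, 30, 20],
})

# transform("sum") returns one value per row: the total for that row's city.
city_totals = toy.groupby("OriginCity").Passengers.transform("sum")
toy["market_share"] = toy.Passengers / city_totals

# Sanity check: the shares within each city should add up to 1.
share_sums = toy.groupby("OriginCity").market_share.sum()
```

Because the share is computed before the `Passengers > 1000` filter is applied, the shares displayed for a city need not sum to 1; the hidden rows still count toward the denominator.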

### Accounting for regional carriers

Many of the smaller airlines actually operate regional aircraft for larger carriers. For instance, PSA Airlines flies small aircraft for American Airlines, branded as American Eagle and sold with connections to/from American Airlines flights. Here, we repeat the analysis, grouping by `TicketingCarrierName` instead of `OperatingCarrierName`, so that regional flights count toward the carrier that sold the ticket.

In [None]:
mkt_shares = (
    data
        .groupby(["OriginCity", "TicketingCarrierName"])
        .Passengers
        .sum()
        .reset_index()
)

mkt_shares["market_share"] = mkt_shares.Passengers / mkt_shares.groupby("OriginCity").Passengers.transform("sum")

mkt_shares = mkt_shares.sort_values("market_share", ascending=False)

mkt_shares.loc[mkt_shares.Passengers > 1000]

For instance, American's share in Charlotte is now much higher than it was when we grouped by operating carrier.
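One way to make this kind of comparison explicit is to compute both sets of shares with a single function and join them side by side. The sketch below assumes made-up data, and `market_shares` is a hypothetical helper that wraps the same groupby logic used above.

```python
import pandas as pd


def market_shares(df, carrier_col):
    """Passenger share of each carrier within each origin city (hypothetical helper)."""
    out = (
        df.groupby(["OriginCity", carrier_col])
          .Passengers
          .sum()
          .reset_index()
    )
    out["market_share"] = out.Passengers / out.groupby("OriginCity").Passengers.transform("sum")
    return out.rename(columns={carrier_col: "Carrier"})


# Made-up rows: "Regional" operates some flights ticketed by "Mainline".
toy = pd.DataFrame({
    "OriginCity": ["A", "A", "A"],
    "OperatingCarrierName": ["Mainline", "Regional", "Other"],
    "TicketingCarrierName": ["Mainline", "Mainline", "Other"],
    "Passengers": [50, 30, 20],
})

by_operating = market_shares(toy, "OperatingCarrierName")
by_ticketing = market_shares(toy, "TicketingCarrierName")

comparison = by_operating.merge(
    by_ticketing,
    on=["OriginCity", "Carrier"],
    how="outer",
    suffixes=("_operating", "_ticketing"),
)

# Carriers that only appear under one definition get a share of 0 under the other.
share_cols = ["market_share_operating", "market_share_ticketing"]
comparison[share_cols] = comparison[share_cols].fillna(0)
comparison["change"] = comparison.market_share_ticketing - comparison.market_share_operating
```

Sorting `comparison` by `change` would surface the cities where attributing regional flights to the ticketing carrier shifts the picture most.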