### Airline Fleet Dataset Exploration and Visualization

<B>Dataset includes the following data: </B>

Parent Airline: e.g. International Airlines Group (IAG)

Airline: e.g. Iberia, Aer Lingus, British Airways, etc., which are owned by IAG

Aircraft Type: Manufacturer & Model

Current: Quantity of airplanes in operation

Future: Quantity of airplanes on order, from planespotters.net

Orders: Quantity of airplanes on order, from Wikipedia

Unit Cost: Average unit cost ($M) of Aircraft Type, as found on Wikipedia and through various Google searches

Total Cost: Current quantity * Unit Cost ($M)

Average Age: Average age of "Current" airplanes by "Aircraft Type"
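
As a quick illustration of the Total Cost definition, the sketch below recomputes it from Current and Unit Cost on a few made-up rows. The values, the aircraft names and the numeric column types are assumptions for illustration; in the real file the cost columns may be stored as formatted strings and would need parsing first.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    'Aircraft Type': ['Type A', 'Type B', 'Type C'],
    'Current': [10, 4, 0],
    'Unit Cost': [90.0, 250.0, 120.0],   # assumed to be $M per aircraft
})

# Total Cost ($M) = Current quantity * Unit Cost ($M)
toy['Total Cost'] = toy['Current'] * toy['Unit Cost']
print(toy)
```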

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
fleet = pd.read_csv('../input/Fleet Data.csv')

<B> Exploring the dataset: </B>

In [None]:
fleet.head()

In [None]:
fleet.describe()

In [None]:
fleet.info()

In [None]:
#Number of unique parent airlines:
fleet['Parent Airline'].nunique()

In [None]:
#Number of unique airlines:
fleet['Airline'].nunique()

In [None]:
# Number of unique types of aircraft:
fleet['Aircraft Type'].nunique()

In [None]:
fleet.columns

In [None]:
#Separating quantity and cost data from age:
aircraftfleet = fleet[['Airline','Aircraft Type', 'Current', 'Future', 'Historic', 'Total', 'Orders', 'Total Cost (Current)']]
parentfleet = fleet[['Parent Airline','Aircraft Type', 'Current', 'Future', 'Historic', 'Total', 'Orders', 'Total Cost (Current)']]

In [None]:
aircraftfleet.head(5)

In [None]:
parentfleet.head(5)

<h2> Airlines</h2>

In [None]:
# Grouping data by Airline name to see the aircraft fleet size across all aircraft types:
aircraftfleet.groupby('Airline').sum(numeric_only=True).head(10)

In [None]:
# Top 20 Airlines with the biggest currently active aircraft fleet:
aircraftfleet.groupby('Airline')['Current'].sum().sort_values(ascending=False).head(20)

<h2> Parent Airlines </h2>

In [None]:
# Grouping data by Parent Airline name to see the aircraft fleet size across all aircraft types:
parentfleet.groupby('Parent Airline').sum(numeric_only=True).head(10)

In [None]:
# Top 20 Parent Airlines with the biggest currently active aircraft fleet:
parentfleet.groupby('Parent Airline')['Current'].sum().sort_values(ascending=False).head(20)
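
The ranking above can also be written with `nlargest`, which returns the largest values already sorted in descending order. This is a minimal sketch on made-up rows; the group names `P1` to `P3` and the counts are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Parent Airline': ['P1', 'P1', 'P2', 'P3', 'P3'],
    'Current': [5, 7, 20, 1, 2],
})

# Select the column before summing so only the numeric data is aggregated.
fleet_size = toy.groupby('Parent Airline')['Current'].sum()

# nlargest(n) keeps the n biggest totals, sorted from largest to smallest.
top2 = fleet_size.nlargest(2)
print(top2)
```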

#### Let's explore some specific major Parent Airlines operating in Europe, the Middle East, Asia and North America in a bit more detail:

I'll choose the following airlines: <B> Air France/KLM, Aeroflot, Lufthansa, Emirates, American Airlines</B>

In [None]:
# Fleet size for these specific airlines:
chosen = ['Air France/KLM', 'Aeroflot', 'Lufthansa', 'Emirates', 'American Airlines']
fleet[fleet['Parent Airline'].isin(chosen)].groupby('Parent Airline')['Current'].sum().sort_values(ascending=False)

In [None]:
# Keep only the rows that belong to the chosen parent airlines (copied so later edits don't touch `fleet`):
selected_airlines = fleet[fleet['Parent Airline'].isin(['Air France/KLM', 'Aeroflot', 'Lufthansa', 'Emirates', 'American Airlines'])].copy()
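
Filtering on several parent airlines can be written either as a chain of `==` comparisons joined with `|` or as a single `isin` call. The sketch below checks on made-up rows that both forms select the same data; `Other Group` and the counts are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Parent Airline': ['Emirates', 'Lufthansa', 'Other Group', 'Aeroflot'],
    'Current': [3, 5, 8, 2],
})

wanted = ['Emirates', 'Lufthansa', 'Aeroflot']

# Chained comparisons joined with |
chained = toy[(toy['Parent Airline'] == 'Emirates')
              | (toy['Parent Airline'] == 'Lufthansa')
              | (toy['Parent Airline'] == 'Aeroflot')]

# Single membership test against a list
with_isin = toy[toy['Parent Airline'].isin(wanted)]

print(chained.equals(with_isin))
```

The `isin` form stays readable as the list of airlines grows, since adding one only means extending the list.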

In [None]:
# Top 20 oldest currently active aircraft types based on the average age:

selected_airlines[['Parent Airline', 'Aircraft Type','Average Age']].dropna(axis=0).sort_values(by='Average Age', ascending=False).head(20)

In [None]:
# Top 20 newest currently active aircraft types based on the average age:

selected_airlines[['Parent Airline', 'Aircraft Type','Average Age']].dropna(axis=0).sort_values(by='Average Age').head(20)

In [None]:
selected_airlines.columns

<h4> The following plot shows how the aircraft types are distributed among the subsidiary airlines of each Parent Airline.
We can see which airline operates the widest variety of planes.</h4>

In [None]:
plt.figure(figsize=(14,10))
sns.countplot(data=selected_airlines, x='Parent Airline', hue='Airline')
plt.legend(bbox_to_anchor=(1, 1.0))
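
Each bar in the count plot is the number of rows for one airline, and each row is one aircraft type. The same numbers can be read as a table; the sketch below shows the idea on made-up rows, with hypothetical airline and type names.

```python
import pandas as pd

# Made-up rows for illustration only: one row per (airline, aircraft type).
toy = pd.DataFrame({
    'Parent Airline': ['P1', 'P1', 'P1', 'P2', 'P2'],
    'Airline': ['A1', 'A1', 'A2', 'B1', 'B1'],
    'Aircraft Type': ['X', 'Y', 'X', 'X', 'Z'],
})

# Number of distinct aircraft types operated by each airline within its parent group.
type_counts = (toy.groupby(['Parent Airline', 'Airline'])['Aircraft Type']
                  .nunique()
                  .sort_values(ascending=False))
print(type_counts)
```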

In [None]:
# Number of unique aircraft types
selected_airlines['Aircraft Type'].nunique()

In [None]:
# Here we can find the number of unique aircraft types used by Emirates:

selected_airlines[selected_airlines['Parent Airline'] == 'Emirates']['Aircraft Type'].nunique()

In [None]:
selected_airlines[selected_airlines['Parent Airline'] == 'Emirates'][['Aircraft Type', 'Current', 'Future',
       'Historic', 'Total', 'Orders']]

<h4> Let's explore the average age of the aircraft across the selected Parent Airlines </h4>

In [None]:
sns.set_style('darkgrid')
plt.figure(figsize=(14,10))
sns.boxplot(data=selected_airlines, x='Parent Airline', y='Average Age', palette='coolwarm')
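
The box in a box plot spans the first to third quartile, with a line at the median. To read those values as numbers, one option is a per-group quantile table. The sketch below uses made-up ages and hypothetical group names; the quartiles come from pandas' default linear interpolation, which is an assumption about how the values should be computed.

```python
import pandas as pd

# Made-up rows for illustration only; None stands for a missing age.
toy = pd.DataFrame({
    'Parent Airline': ['P1', 'P1', 'P1', 'P2', 'P2', 'P2'],
    'Average Age': [2.0, 6.0, 10.0, 5.0, None, 15.0],
})

# Quartiles per parent airline; missing ages are skipped.
summary = (toy.groupby('Parent Airline')['Average Age']
              .quantile([0.25, 0.5, 0.75])
              .unstack())
summary.columns = ['Q1', 'Median', 'Q3']
print(summary)
```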

In [None]:
avg = selected_airlines.dropna(axis=0, subset=['Average Age',])[['Parent Airline','Airline','Aircraft Type','Average Age']]

In [None]:
# List of unique airplanes for these airlines:

avg['Aircraft Type'].unique()

In [None]:
plt.figure(figsize=(14,10))
sns.boxplot(data=avg, y='Aircraft Type', x='Average Age')

<h4> Now let's see the distribution of the big planes like Airbus A380, Airbus A330, Airbus A340, Boeing 747, Boeing 777, Boeing 787 among these selected airlines.</h4>

In [None]:
big_types = ['Airbus A380', 'Airbus A330', 'Airbus A340', 'Boeing 747', 'Boeing 777', 'Boeing 787 Dreamliner']
biggies = selected_airlines[selected_airlines['Aircraft Type'].isin(big_types)]
biggies.head(5)

In [None]:
# Filter first, then sort, so the boolean mask is aligned with the frame it indexes:
biggies[biggies['Current'] > 0].sort_values('Aircraft Type')[['Parent Airline', 'Airline', 'Aircraft Type', 'Current']].head(20)

In [None]:
# The plot shows, for each large aircraft type, how many airlines under each Parent Airline list it in their fleet.

plt.figure(figsize=(10,6))
sns.countplot(data=biggies, x='Aircraft Type', hue='Parent Airline')
plt.legend(bbox_to_anchor=(1, 1.0))

In [None]:
selected_airlines.columns

In [None]:
# Sum only the numeric columns; 'Average Age' is dropped because summing ages is not meaningful.
bigsorted = biggies.drop(columns='Average Age').groupby('Aircraft Type').sum(numeric_only=True)
bigsorted.sort_values('Current', ascending=False).head(10)

# As we can see, the most used aircraft type among the selected airlines is the Boeing 777.
# The Boeing 787 Dreamliner is quite a new plane and is not that widely used yet.
# The Boeing 747 is quite old already, and most airlines will probably soon replace these planes with newer ones, like the 787.

In [None]:
# Select 'Average Age' before averaging so the text columns are not passed to mean():
bigsorted = biggies.groupby('Aircraft Type')['Average Age'].mean().sort_values(ascending=False)
bigsorted

# As we can see below, the Boeing 747 is indeed generally older than the other planes, while the Boeing 787s are the youngest.
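
The mean above gives every row the same weight, whether it covers one aircraft or fifty. A fleet-size-weighted average is one alternative; the sketch below compares the two on made-up rows. The values and type names are hypothetical, and whether weighting by `Current` is appropriate for this dataset is an assumption.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    'Aircraft Type': ['X', 'X', 'Y', 'Y'],
    'Current': [1, 9, 5, 5],
    'Average Age': [20.0, 10.0, 4.0, 6.0],
})

# Keep only rows where the age is known and the fleet size is positive.
valid = toy.dropna(subset=['Average Age', 'Current'])
valid = valid[valid['Current'] > 0]

# Unweighted: every row counts equally.
unweighted = valid.groupby('Aircraft Type')['Average Age'].mean()

# Weighted: sum(age * count) / sum(count), computed per aircraft type.
age_times_count = (valid['Average Age'] * valid['Current']).groupby(valid['Aircraft Type']).sum()
counts = valid.groupby('Aircraft Type')['Current'].sum()
weighted = age_times_count / counts

print(pd.DataFrame({'unweighted': unweighted, 'weighted': weighted}))
```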

In [None]:
airplanes = biggies[['Parent Airline', 'Aircraft Type', 'Current']].copy()
airplanes.dropna(axis=0, subset=['Current',], inplace=True)
airplanes.sort_values('Aircraft Type')

In [None]:
airplanes = airplanes.groupby(by=['Parent Airline', 'Aircraft Type']).sum()
airplanes = airplanes.reset_index()

In [None]:
# `size` was renamed to `height` in seaborn 0.9:
sns.lmplot(x='Parent Airline', y='Current', hue='Aircraft Type', data=airplanes, fit_reg=False, height=6)

In [None]:
# FacetGrid plot to display the airline fleet data in detail. Columns: Aircraft Type, rows: Parent Airline

g = sns.FacetGrid(airplanes, row='Parent Airline' , col="Aircraft Type", hue='Aircraft Type', margin_titles=True)
g = g.map(plt.bar, "Aircraft Type", "Current")

In [None]:
# Alternative barplot graph to make it easier to compare the fleets of different Airlines.

plt.figure(figsize=(10,6))
sns.barplot(data=airplanes, x='Parent Airline',y='Current', hue='Aircraft Type')
plt.legend(bbox_to_anchor=(1, 1.0))
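
The grouped bar plot can be complemented by a table with one row per parent airline and one column per aircraft type. This is a minimal sketch using `pivot_table` on made-up rows; the names and counts are hypothetical, and filling missing combinations with 0 assumes that no row means no aircraft.

```python
import pandas as pd

# Made-up rows for illustration only, shaped like the grouped `airplanes` frame.
toy = pd.DataFrame({
    'Parent Airline': ['P1', 'P1', 'P2'],
    'Aircraft Type': ['X', 'Y', 'X'],
    'Current': [3, 4, 10],
})

# Rows: parent airlines, columns: aircraft types, cells: number of active aircraft.
table = toy.pivot_table(index='Parent Airline', columns='Aircraft Type',
                        values='Current', aggfunc='sum', fill_value=0)
print(table)
```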