# Bird Strikes in Aviation
`Jordan Burmylo-Magrann`

This project studies a dataset of bird strikes: collisions between aircraft and birds. The topic may seem plain, but a great deal of information is recorded about these common occurrences. If you have flown before, there is a chance that one of your flights struck a bird without your knowing it. On occasion, these strikes cause more damage than one would think possible.

Throughout this project, I explore how these strikes vary with plane size and type, cost and type of damage, bird type, weather, effect on the flight, and the total number of birds struck, and I compare and contrast the relationships among these variables.

## Motivation

This topic jumped out to me because it is unusual and interesting. I did not know how common bird strikes are, nor how much damage they can cause and what impacts they can have. Flying is a very common mode of travel, and not enough people know how regularly planes strike birds.

## Methods

### `Software hygiene`
Keeping the data clean and readable so that it can be viewed and analyzed more efficiently.
### `Cleaning dataframe`
Renaming columns so they are easier to access and refer to; certain columns also needed editing for plotting purposes.
### `Plotting` (Line, Scatter, Heatmap, Boxplot)
These plot types were the most fitting for this data. They gave the clearest views for visualization and analysis and showed the trends best.


In [2]:
# Basic imports to begin
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
import hvplot.pandas
# Turn off scroll wheel 
hv.plotting.bokeh.element.ElementPlot.active_tools = ['pan']

# Begin by creating a dataframe of the bird strike information
birdstrike_data = pd.read_csv('Data/Bird_strikes.csv')
birdstrike_data.head()




Unnamed: 0,RecordID,AircraftType,AirportName,AltitudeBin,MakeModel,NumberStruck,NumberStruckActual,Effect,FlightDate,Damage,...,RemainsSentToSmithsonian,Remarks,WildlifeSize,ConditionsSky,WildlifeSpecies,PilotWarned,Cost,Altitude,PeopleInjured,IsAircraftLarge?
0,202152,Airplane,LAGUARDIA NY,"(1000, 2000]",B-737-400,Over 100,859,Engine Shut Down,11/23/00 0:00,Caused damage,...,False,FLT 753. PILOT REPTD A HUNDRED BIRDS ON UNKN T...,Medium,No Cloud,Unknown bird - medium,N,30736,1500,0,Yes
1,208159,Airplane,DALLAS/FORT WORTH INTL ARPT,"(-1, 0]",MD-80,Over 100,424,,7/25/01 0:00,Caused damage,...,False,102 CARCASSES FOUND. 1 LDG LIGHT ON NOSE GEAR ...,Small,Some Cloud,Rock pigeon,Y,0,0,0,No
2,207601,Airplane,LAKEFRONT AIRPORT,"(30, 50]",C-500,Over 100,261,,9/14/01 0:00,No damage,...,False,FLEW UNDER A VERY LARGE FLOCK OF BIRDS OVER AP...,Small,No Cloud,European starling,N,0,50,0,No
3,215953,Airplane,SEATTLE-TACOMA INTL,"(30, 50]",B-737-400,Over 100,806,Precautionary Landing,9/5/02 0:00,No damage,...,False,"NOTAM WARNING. 26 BIRDS HIT THE A/C, FORCING A...",Small,Some Cloud,European starling,Y,0,50,0,Yes
4,219878,Airplane,NORFOLK INTL,"(30, 50]",CL-RJ100/200,Over 100,942,,6/23/03 0:00,No damage,...,False,NO DMG REPTD.,Small,No Cloud,European starling,N,0,50,0,No


In [3]:
# Determine what information is given to us and if we need to edit column names for efficiency  
birdstrike_data.columns

Index(['RecordID', 'AircraftType', 'AirportName', 'AltitudeBin', 'MakeModel',
       'NumberStruck', 'NumberStruckActual', 'Effect', 'FlightDate', 'Damage',
       'Engines', 'Operator', 'OriginState', 'FlightPhase',
       'ConditionsPrecipitation', 'RemainsCollected?',
       'RemainsSentToSmithsonian', 'Remarks', 'WildlifeSize', 'ConditionsSky',
       'WildlifeSpecies', 'PilotWarned', 'Cost', 'Altitude', 'PeopleInjured',
       'IsAircraftLarge?'],
      dtype='object')

In [4]:
# Clean columns 
birdstrike_data.columns = [col.lower() for col in birdstrike_data.columns]
birdstrike_data.rename({'recordid': 'record_id', 'aircrafttype': 'aircraft_type', 
                       'airportname': 'airport_name', 'altitudebin': 'altitude_bin', 
                       'makemodel': 'make_model', 'numberstruck': 'number_struck', 
                       'numberstruckactual': 'number_struck_actual', 'flightdate': 'flight_date',
                       'originstate': 'origin_state', 'flightphase': 'flight_phase',
                       'conditionsprecipitation': 'conditions_precipitation', 'remainscollected?': 'remains_collected',
                       'remainssenttosmithsonian': 'remains_sent_to_smithsonian', 'wildlifesize': 'wildlife_size',
                       'conditionssky': 'conditions_sky', 'wildlifespecies': 'wildlife_species',
                       'pilotwarned': 'pilot_warned', 'peopleinjured': 'people_injured', 
                       'isaircraftlarge?': 'is_aircraft_large'}, axis=1, inplace=True)


In [5]:
birdstrike_data.columns

Index(['record_id', 'aircraft_type', 'airport_name', 'altitude_bin',
       'make_model', 'number_struck', 'number_struck_actual', 'effect',
       'flight_date', 'damage', 'engines', 'operator', 'origin_state',
       'flight_phase', 'conditions_precipitation', 'remains_collected',
       'remains_sent_to_smithsonian', 'remarks', 'wildlife_size',
       'conditions_sky', 'wildlife_species', 'pilot_warned', 'cost',
       'altitude', 'people_injured', 'is_aircraft_large'],
      dtype='object')
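
The manual mapping above could also be generated programmatically. The following is a minimal sketch, not the method used in this notebook: `to_snake_case` is a hypothetical helper, and `demo` is an empty stand-in frame that only carries a few of the original column names.

```python
import re
import pandas as pd

def to_snake_case(name: str) -> str:
    """Convert a CamelCase column name such as 'IsAircraftLarge?' to snake_case."""
    # Drop punctuation such as the trailing '?'
    name = re.sub(r"[^0-9A-Za-z]+", "", name)
    # Insert an underscore wherever a lowercase letter or digit is followed by an uppercase letter
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return name.lower()

# Stand-in frame (hypothetical); only the column names matter here
demo = pd.DataFrame(columns=["RecordID", "AircraftType",
                             "NumberStruckActual", "IsAircraftLarge?"])
demo = demo.rename(columns=to_snake_case)
print(list(demo.columns))
```

For the column names in this dataset, this rule gives the same names as the hand-written dictionary, but acronym-heavy names in other datasets may need special handling.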

In [6]:
# Begin analyzing
birdstrike_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25429 entries, 0 to 25428
Data columns (total 26 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   record_id                    25429 non-null  int64 
 1   aircraft_type                25429 non-null  object
 2   airport_name                 25429 non-null  object
 3   altitude_bin                 25429 non-null  object
 4   make_model                   25429 non-null  object
 5   number_struck                25429 non-null  object
 6   number_struck_actual         25429 non-null  int64 
 7   effect                       2078 non-null   object
 8   flight_date                  25429 non-null  object
 9   damage                       25429 non-null  object
 10  engines                      25195 non-null  object
 11  operator                     25429 non-null  object
 12  origin_state                 24980 non-null  object
 13  flight_phase                 25

In [7]:
birdstrike_data.describe()

Unnamed: 0,record_id,number_struck_actual,altitude,people_injured
count,25429.0,25429.0,25429.0,25429.0
mean,253800.148767,2.699634,799.028432,0.000826
std,38472.800499,12.825804,1740.079843,0.047339
min,1195.0,1.0,0.0,0.0
25%,225742.0,1.0,0.0,0.0
50%,248609.0,1.0,50.0,0.0
75%,269044.0,1.0,700.0,0.0
max,321909.0,942.0,18000.0,6.0


In [8]:
# Missing data/information?
birdstrike_data.isnull().sum()

record_id                          0
aircraft_type                      0
airport_name                       0
altitude_bin                       0
make_model                         0
number_struck                      0
number_struck_actual               0
effect                         23351
flight_date                        0
damage                             0
engines                          234
operator                           0
origin_state                     449
flight_phase                       0
conditions_precipitation       23414
remains_collected                  0
remains_sent_to_smithsonian        0
remarks                         4761
wildlife_size                      0
conditions_sky                     0
wildlife_species                   0
pilot_warned                       0
cost                               0
altitude                           0
people_injured                     0
is_aircraft_large                  0
dtype: int64
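
Raw counts are easier to judge as a share of all rows: `effect` and `conditions_precipitation` are each missing in roughly 92% of the 25,429 records. The sketch below shows one way to compute that share and flag mostly empty columns. The `demo` frame and the 50% threshold are illustrative assumptions, not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame standing in for birdstrike_data
demo = pd.DataFrame({
    "effect": [np.nan, "Engine Shut Down", np.nan, np.nan],
    "engines": ["2", "2", np.nan, "2"],
    "damage": ["No damage", "Caused damage", "No damage", "No damage"],
})

# Fraction of missing values per column, largest first
missing_share = demo.isnull().mean().sort_values(ascending=False)

# Columns that are more than half empty (threshold chosen for illustration)
mostly_missing = missing_share[missing_share > 0.5].index.tolist()

print(missing_share)
print(mostly_missing)
```

Columns flagged this way are candidates for being treated as "not reported" rather than being used directly in comparisons.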

In [9]:
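# Convert flight_date to datetime. The explicit format is an assumption based on the
# sample rows (e.g. "11/23/00 0:00"); it avoids pandas' format-inference warning.
birdstrike_data['flight_date'] = pd.to_datetime(birdstrike_data['flight_date'], format='%m/%d/%y %H:%M')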




## Main Results

In [19]:
import Modules.BirdStrikePlotter as mbsv
plotter = mbsv.BirdStrikePlotter(birdstrike_data)
dashboard = plotter.create_dashboard()
dashboard

My main finding is that the number of bird strikes generally increased over the years covered by the data. Most strikes occur at low altitudes, aside from some outliers at high altitudes. An interesting find was that the highest number of strikes takes place from July through October, potentially because of weather and migration patterns. The number of birds struck spiked noticeably in August of both 2009 and 2010. Another notable result is that small birds are, for the most part, struck at very low altitudes, while medium and large birds are struck across a wider span, from low to fairly high altitudes.
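
The plotting module is not shown here, so the following is only a sketch of how the monthly and altitude summaries behind these findings might be computed, not the dashboard's actual code. It uses the cleaned column names from above. The rows in `demo` are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical rows for illustration only
demo = pd.DataFrame({
    "flight_date": pd.to_datetime(
        ["2009-08-03", "2009-08-17", "2010-08-09", "2010-01-22", "2009-10-05"]),
    "number_struck_actual": [1, 3, 2, 1, 1],
    "wildlife_size": ["Small", "Medium", "Small", "Large", "Medium"],
    "altitude": [0, 1500, 50, 800, 3000],
})

# Number of strike reports per calendar month, pooled across years
strikes_by_month = demo.groupby(demo["flight_date"].dt.month).size()

# Total birds struck per (year, month), useful for spotting spikes
year = demo["flight_date"].dt.year.rename("year")
month = demo["flight_date"].dt.month.rename("month")
birds_by_year_month = demo.groupby([year, month])["number_struck_actual"].sum()

# Altitude spread for each wildlife size class
altitude_by_size = (demo.groupby("wildlife_size")["altitude"]
                        .describe()[["min", "50%", "max"]])

print(strikes_by_month)
print(birds_by_year_month)
print(altitude_by_size)
```

Run on the full `birdstrike_data` frame, the same three summaries would correspond to the seasonal pattern, the spikes in specific months, and the altitude ranges by bird size described above.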

## Conclusion

Overall, this data set yielded several clear findings. Most birds are struck at low altitudes, with some outliers, specifically medium birds, struck more often at high altitudes. Recorded strikes peak from July through October, with August being the highest month. Migration and other factors may contribute to these patterns. Going forward, we need to find more ways to protect bird life and reduce strike numbers, as a large number of birds are killed annually.