# Tornados and Trailer Parks

It is a well-known fact that when a tornado touches down, the first thing it does is ask for 
directions to the nearest trailer park.  This is unfortunate for people who live in trailer 
parks (as I did when I was a kid).

There is a dataset out there that contains statistical information on every recorded tornado in the 
US from 1950 to 2015.  So it occurred to me that this data might be useful in determining the best 
place to build a trailer park in the US.

Now I have come to understand from reputable people that tornados are spawned as a result of 
pre-existing conditions.  They don't actually get to choose where they are going to touch down.  So 
they can't just drop in on a specific trailer park in a targeted way.

Therefore, there must exist some optimal location to build a trailer park which is removed from the
areas where tornados frequently spawn.

Our goal in this effort is to find that sweet spot, if you will.

# Import our tornados .csv file into a pandas DataFrame

In [29]:
import pandas as pd

df = pd.read_csv(r'Capstone/Tornadoes_SPC_1950to2015.csv')

df.head()

Unnamed: 0,om,yr,mo,dy,date,time,tz,st,stf,stn,...,fat,loss,closs,slat,slon,elat,elon,len,wid,fc
0,1,1950,1,3,1/3/1950,11:00:00,3,MO,29,1,...,0,6.0,0.0,38.77,-90.22,38.83,-90.03,9.5,150,0
1,2,1950,1,3,1/3/1950,11:55:00,3,IL,17,2,...,0,5.0,0.0,39.1,-89.3,39.12,-89.23,3.6,130,0
2,3,1950,1,3,1/3/1950,16:00:00,3,OH,39,1,...,0,4.0,0.0,40.88,-84.58,0.0,0.0,0.1,10,0
3,4,1950,1,13,1/13/1950,5:25:00,3,AR,5,1,...,1,3.0,0.0,34.4,-94.37,0.0,0.0,0.6,17,0
4,5,1950,1,25,1/25/1950,19:30:00,3,MO,29,2,...,0,5.0,0.0,37.6,-90.68,37.63,-90.65,2.3,300,0
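The `Unnamed: 0` column in the output above looks like a row index that was saved into the .csv file (that is my assumption; I have not checked how the file was produced). If so, a minimal sketch of how to avoid loading it as a data column is to pass `index_col=0` to `read_csv`. The tiny inline CSV below is a made-up stand-in for the real file so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: the leading unnamed column
# is assumed to be a row index that was written out with the data.
sample_csv = io.StringIO(
    ",om,yr,st\n"
    "0,1,1950,MO\n"
    "1,2,1950,IL\n"
)

# index_col=0 tells pandas to use the first column as the row index
# instead of loading it as a data column named "Unnamed: 0".
sample = pd.read_csv(sample_csv, index_col=0)

print(list(sample.columns))  # ['om', 'yr', 'st']
```
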


So the fields in this dataset are as follows:
om = tornado number within a given year
yr = year
mo = month
dy = day
date = date in MM/DD/YYYY format
time = time of tornado 'birth'
tz = time zone
st = State
stf = a numeric code for the state (we'll drop this column as we don't need it)
stn = the number of this tornado in its home state in a given year
mag = magnitude of tornado on the Fujita scale (F0 - F5; the Enhanced Fujita scale, EF0 - EF5, has been used in the US since 2007)
inj = number of injuries
fat = number of fatalities
loss = property loss as a category code, where 1 means up to $50, 2 up to $500, 3 up to $5,000, 4 up to $50,000, 5 up to $500,000 and so on.
closs = crop loss
slat = starting latitude
slon = starting longitude
elat = ending latitude
elon = ending longitude (not to be confused with Elon Musk)
len = length in miles
wid = width in yards
fc = a composite code that contains info on multiple states affected.  Earlier entries lack this so we will drop it.
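Since the loss field is a category code rather than a dollar amount, it may help to turn a code into the upper dollar bound it stands for. Going by the list above, each step multiplies the bound by ten, so code n maps to 5 * 10^n dollars. Here is a rough sketch; the helper name `loss_upper_bound` is mine, and treating codes below 1 as "unknown" is an assumption, not something the dataset documentation confirms here.

```python
import pandas as pd

def loss_upper_bound(code):
    """Upper dollar bound for a loss code, following the list above
    (1 -> $50, 2 -> $500, 3 -> $5,000, ...)."""
    # Assumption: missing values and codes below 1 carry no loss information.
    if pd.isna(code) or code < 1:
        return None
    return 5 * 10 ** int(code)

# The loss codes from the first four rows of df.head()
codes = pd.Series([6.0, 5.0, 4.0, 3.0])
bounds = codes.map(loss_upper_bound)

print(bounds.tolist())  # [5000000, 500000, 50000, 5000]
```
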

# Drop the columns we don't need

In [31]:
df = df.drop(["stf", "fc"], axis=1)

df.head()

Unnamed: 0,om,yr,mo,dy,date,time,tz,st,stn,mag,inj,fat,loss,closs,slat,slon,elat,elon,len,wid
0,1,1950,1,3,1/3/1950,11:00:00,3,MO,1,3,3,0,6.0,0.0,38.77,-90.22,38.83,-90.03,9.5,150
1,2,1950,1,3,1/3/1950,11:55:00,3,IL,2,3,3,0,5.0,0.0,39.1,-89.3,39.12,-89.23,3.6,130
2,3,1950,1,3,1/3/1950,16:00:00,3,OH,1,1,1,0,4.0,0.0,40.88,-84.58,0.0,0.0,0.1,10
3,4,1950,1,13,1/13/1950,5:25:00,3,AR,1,3,1,1,3.0,0.0,34.4,-94.37,0.0,0.0,0.6,17
4,5,1950,1,25,1/25/1950,19:30:00,3,MO,2,2,5,0,5.0,0.0,37.6,-90.68,37.63,-90.65,2.3,300
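A couple of the rows above show elat and elon as 0.0, which is nowhere near the US. My guess (and it is only a guess) is that zero means the end point was not recorded. Under that assumption, a small sketch of how one might mark those values as missing so they don't skew any location math later on. The `sample` frame is a hand-copied slice of the rows above, not the full dataset.

```python
import numpy as np
import pandas as pd

# Two rows copied from the output above: one with an end point, one without.
sample = pd.DataFrame({
    "slat": [38.77, 40.88],
    "slon": [-90.22, -84.58],
    "elat": [38.83, 0.0],
    "elon": [-90.03, 0.0],
})

# Assumption: elat == 0 and elon == 0 together mean "end point not recorded".
no_end = (sample["elat"] == 0) & (sample["elon"] == 0)

# Replace the placeholder zeros with NaN so pandas treats them as missing.
sample.loc[no_end, ["elat", "elon"]] = np.nan

print(sample)
```

The starting coordinates are left alone, so every tornado still has a usable location.
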
