# Table of Contents

- [1 Aim of notebook](#Aim-of-notebook)
- [2 Load the main airport traffic data](#Load-the-main-airport-traffic-data)
- [3 Load lookup table provided by BTS](#Load-lookup-table-provided-by-BTS)
- [4 Create *enhanced* lookuptable](#Create-enhanced-lookuptable)
    - [4.1 Remove Code that is not present our dataset](#Remove-Code-that-is-not-present-our-dataset)
    - [4.2 Parse state,city, and airport-name from 'Description' field](#Parse-state,city,-and-airport-name-from-'Description'-field)
    - [4.3 Add state 'region' information](#Add-state-'region'-information)
    - [4.4 Add airport latitude and longitude information using Google geocoder](#Add-airport-latitude-and-longitude-information-using-Google-geocoder)
    - [4.5 Add column with both city and state](#Add-column-with-both-city-and-state)
- [5 All done. Save dataframe on disk](#All-done.-Save-dataframe-on-disk)

In [1]:
%matplotlib inline
import pandas as pd
import time
import numpy as np

from pprint import pprint
from util import print_time

# Aim of notebook

- The [airport traffic dataset](http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time) identifies each airport by a unique ID number.

- In this notebook, we'll create an *enhanced lookup table* by taking the lookup table provided by the BTS ([download link](http://www.transtats.bts.gov/Download_Lookup.asp?Lookup=L_AIRPORT_ID)) and adding further information about each airport (such as latitude/longitude).

# Load the main airport traffic data

- Load three years' worth of air-traffic data provided by BTS ([link](http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time)), covering November 2013 to October 2016.

In [2]:
from util import load_airport_data_3years
df_data = load_airport_data_3years()

 ... load dataframe from 2013-11.zip 
 ... load dataframe from 2013-12.zip 
 ... load dataframe from 2014-01.zip 
 ... load dataframe from 2014-02.zip 
 ... load dataframe from 2014-03.zip 
 ... load dataframe from 2014-04.zip 
 ... load dataframe from 2014-05.zip 
 ... load dataframe from 2014-06.zip 
 ... load dataframe from 2014-07.zip 
 ... load dataframe from 2014-08.zip 
 ... load dataframe from 2014-09.zip 
 ... load dataframe from 2014-10.zip 
 ... load dataframe from 2014-11.zip 
 ... load dataframe from 2014-12.zip 
 ... load dataframe from 2015-01.zip 
 ... load dataframe from 2015-02.zip 
 ... load dataframe from 2015-03.zip 
 ... load dataframe from 2015-04.zip 
 ... load dataframe from 2015-05.zip 
 ... load dataframe from 2015-06.zip 
 ... load dataframe from 2015-07.zip 
 ... load dataframe from 2015-08.zip 
 ... load dataframe from 2015-09.zip 
 ... load dataframe from 2015-10.zip 
 ... load dataframe from 2015-11.zip 
 ... load dataframe from 2015-12.zip 
 ... load da

In [3]:
print df_data.shape
df_data.head(n=5)

(17364696, 7)


Unnamed: 0,YEAR,QUARTER,MONTH,DAY_OF_MONTH,DAY_OF_WEEK,ORIGIN_AIRPORT_ID,DEST_AIRPORT_ID
0,2013,4,11,3,7,12478,10693
1,2013,4,11,4,1,12478,10693
2,2013,4,11,5,2,12478,10693
3,2013,4,11,6,3,12478,10693
4,2013,4,11,7,4,12478,10693


# Load lookup table provided by BTS

In [4]:
df_lookup = pd.read_csv('../data/L_AIRPORT_ID.csv')
print df_lookup.shape

(6409, 2)


# Create *enhanced* lookuptable

In [5]:
df_lookup.head(n=10)

Unnamed: 0,Code,Description
0,10001,"Afognak Lake, AK: Afognak Lake Airport"
1,10003,"Granite Mountain, AK: Bear Creek Mining Strip"
2,10004,"Lik, AK: Lik Mining Camp"
3,10005,"Little Squaw, AK: Little Squaw Airport"
4,10006,"Kizhuyak, AK: Kizhuyak Bay"
5,10007,"Klawock, AK: Klawock Seaplane Base"
6,10008,"Elizabeth Island, AK: Elizabeth Island Airport"
7,10009,"Homer, AK: Augustin Island"
8,10010,"Hudson, NY: Columbia County"
9,10011,"Peach Springs, AZ: Grand Canyon West"


## Remove Code that is not present our dataset

In [6]:
# unique IDs in the dataset
uniq_orig = df_data['ORIGIN_AIRPORT_ID'].unique().tolist()
uniq_dest = df_data['DEST_AIRPORT_ID'].unique().tolist()

# an airport may appear as both origin and destination,
# so apply ``set`` to the concatenated list to keep each ID once
uniq_id = list(set(uniq_orig + uniq_dest))

print "There are {} Airport-Codes in the lookup table".format(df_lookup.shape[0])
print "There are {} unique airport-codes in our dataset".format(len(uniq_id))

There are 6409 Airport-Codes in the lookup table
There are 334 unique airport-codes in our dataset
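
As a side note, the same union of origin and destination IDs can be computed in one step with NumPy. The sketch below is only an illustration on a tiny stand-in frame (`toy` and `uniq_id_toy` are hypothetical names, and the IDs are borrowed from the tables shown in this notebook); it is written so that it also runs on Python 3.

```python
import numpy as np
import pandas as pd

# toy stand-in for df_data (hypothetical; IDs borrowed from this notebook's tables)
toy = pd.DataFrame({
    'ORIGIN_AIRPORT_ID': [12478, 12478, 10135, 10136],
    'DEST_AIRPORT_ID':   [10693, 10135, 10693, 12478],
})

# np.union1d returns the sorted unique values found in either array
uniq_id_toy = np.union1d(toy['ORIGIN_AIRPORT_ID'].values,
                         toy['DEST_AIRPORT_ID'].values)
print(uniq_id_toy)
```

Unlike the `set` approach, the result comes back sorted, which can make it easier to compare against the lookup table by eye.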


Let's drop the lookup-table rows whose codes never appear in our dataset, since we do not need them in our analysis.

In [7]:
# keep only the codes that appear in the main dataframe
_mask = df_lookup['Code'].isin( uniq_id )
df_lookup = df_lookup[ _mask ].reset_index(drop=True)

print df_lookup.shape
df_lookup.head(10)

(334, 2)


Unnamed: 0,Code,Description
0,10135,"Allentown/Bethlehem/Easton, PA: Lehigh Valley ..."
1,10136,"Abilene, TX: Abilene Regional"
2,10140,"Albuquerque, NM: Albuquerque International Sun..."
3,10141,"Aberdeen, SD: Aberdeen Regional"
4,10146,"Albany, GA: Southwest Georgia Regional"
5,10154,"Nantucket, MA: Nantucket Memorial"
6,10155,"Waco, TX: Waco Regional"
7,10157,"Arcata/Eureka, CA: Arcata"
8,10158,"Atlantic City, NJ: Atlantic City International"
9,10165,"Adak Island, AK: Adak"


## Parse state,city, and airport-name from 'Description' field

- Above we see that the ``Description`` field contains the *city*, *state*, and *name* of the airport.

- Let's create an individual field for each piece of information.

- Fortunately, the ``Description`` column uses a comma (``,``) and a colon (``:``) to delimit the city, state, and airport name, so splitting them is straightforward.
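
For comparison, here is a minimal sketch of the same parsing done with a single regular expression and named groups via `Series.str.extract`. It is not the approach used in this notebook, just an alternative; the names `desc`, `pattern`, and `parsed` are made up for the illustration, and the sample strings are taken from the lookup table.

```python
import pandas as pd

# a few Description strings from the lookup table
desc = pd.Series([
    "Allentown/Bethlehem/Easton, PA: Lehigh Valley International",
    "Abilene, TX: Abilene Regional",
    "Hattiesburg/Laurel, MS: Hattiesburg-Laurel Regional",
])

# named groups become column names: City is everything before the first ", ",
# State is everything up to the first ": ", and Airport is the rest
pattern = r'^(?P<City>.+?),\s(?P<State>[^:]+):\s(?P<Airport>.+)$'
parsed = desc.str.extract(pattern, expand=True)
print(parsed)
```

A row that does not match the pattern comes back as all-NaN, so malformed descriptions can be spotted with `parsed.isnull().any(axis=1)`.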



In [8]:
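# split on ", " and ": "; because the pattern is a capturing group, the delimiters
# are kept in the result: [City, ', ', State, ': ', Airport] -> use items 0, 2, 4
df_parse = [{'City':splits[0],'State':splits[2],'Airport':splits[4]}
            for splits in df_lookup['Description'].str.split(r'(,\s|:\s)')]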

pprint(df_parse[:5])

# convert the list of dicts to a dataframe
df_parse = pd.DataFrame(df_parse)
df_parse.head(5)

[{'Airport': 'Lehigh Valley International',
  'City': 'Allentown/Bethlehem/Easton',
  'State': 'PA'},
 {'Airport': 'Abilene Regional', 'City': 'Abilene', 'State': 'TX'},
 {'Airport': 'Albuquerque International Sunport',
  'City': 'Albuquerque',
  'State': 'NM'},
 {'Airport': 'Aberdeen Regional', 'City': 'Aberdeen', 'State': 'SD'},
 {'Airport': 'Southwest Georgia Regional', 'City': 'Albany', 'State': 'GA'}]


Unnamed: 0,Airport,City,State
0,Lehigh Valley International,Allentown/Bethlehem/Easton,PA
1,Abilene Regional,Abilene,TX
2,Albuquerque International Sunport,Albuquerque,NM
3,Aberdeen Regional,Aberdeen,SD
4,Southwest Georgia Regional,Albany,GA


In [9]:
# now we can readily add this information to our lookup table
df_lookup = df_lookup.join(df_parse)

print df_lookup.shape
df_lookup.head()

(334, 5)


Unnamed: 0,Code,Description,Airport,City,State
0,10135,"Allentown/Bethlehem/Easton, PA: Lehigh Valley ...",Lehigh Valley International,Allentown/Bethlehem/Easton,PA
1,10136,"Abilene, TX: Abilene Regional",Abilene Regional,Abilene,TX
2,10140,"Albuquerque, NM: Albuquerque International Sun...",Albuquerque International Sunport,Albuquerque,NM
3,10141,"Aberdeen, SD: Aberdeen Regional",Aberdeen Regional,Aberdeen,SD
4,10146,"Albany, GA: Southwest Georgia Regional",Southwest Georgia Regional,Albany,GA


## Add state 'region' information

I would also like to study patterns across the four regions of the United States:

1. Northeast
2. South
3. West
4. Midwest

I saved a JSON lookup file for this purpose.

In [10]:
%%bash
cat ../data/us_states_regions.json

{
"Northeast" : ["Connecticut","Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont","New Jersey", "New York", "Pennsylvania"],
"Midwest"   : ["Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin", "Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota"],
"South"     : [ "Delaware", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "District of Columbia", "West Virginia",             "Alabama", "Kentucky", "Mississippi", "Tennessee","Arkansas", "Louisiana", "Oklahoma", "Texas"],
"West"      : ["Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah",  "Wyoming", "Alaska", "California", "Hawaii", "Oregon", "Washington"]
}

In [11]:
import json
with open('../data/us_states_regions.json','r') as f:
    regions = json.load(f)

print regions.keys()
print regions.values()

[u'West', u'Northeast', u'Midwest', u'South']
[[u'Arizona', u'Colorado', u'Idaho', u'Montana', u'Nevada', u'New Mexico', u'Utah', u'Wyoming', u'Alaska', u'California', u'Hawaii', u'Oregon', u'Washington'], [u'Connecticut', u'Maine', u'Massachusetts', u'New Hampshire', u'Rhode Island', u'Vermont', u'New Jersey', u'New York', u'Pennsylvania'], [u'Illinois', u'Indiana', u'Michigan', u'Ohio', u'Wisconsin', u'Iowa', u'Kansas', u'Minnesota', u'Missouri', u'Nebraska', u'North Dakota', u'South Dakota'], [u'Delaware', u'Florida', u'Georgia', u'Maryland', u'North Carolina', u'South Carolina', u'Virginia', u'District of Columbia', u'West Virginia', u'Alabama', u'Kentucky', u'Mississippi', u'Tennessee', u'Arkansas', u'Louisiana', u'Oklahoma', u'Texas']]


In [12]:
df_region = []
for key in regions:
    _dftmp = pd.DataFrame( regions[key], columns=['State']  )
    _dftmp['Region'] = key
    df_region.append(_dftmp)
    
df_region = pd.concat(df_region,ignore_index=True)

df_region.head()

Unnamed: 0,State,Region
0,Arizona,West
1,Colorado,West
2,Idaho,West
3,Montana,West
4,Nevada,West
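
The loop above flattens a `{region: [states]}` dictionary into a two-column table. An equivalent way to think about it is to invert the dictionary into `{state: region}`. The sketch below shows that inversion on an abbreviated stand-in for the JSON file (`regions_toy` lists only two states per region, and all names in it are hypothetical).

```python
import pandas as pd

# abbreviated stand-in for us_states_regions.json (two states per region only)
regions_toy = {
    "Northeast": ["New York", "Pennsylvania"],
    "Midwest": ["Ohio", "Iowa"],
    "South": ["Texas", "Georgia"],
    "West": ["Arizona", "Alaska"],
}

# invert {region: [states]} into {state: region}
state_to_region = {state: region
                   for region, states in regions_toy.items()
                   for state in states}

# the same information as a two-column dataframe, sorted by state name
df_region_toy = pd.DataFrame(sorted(state_to_region.items()),
                             columns=['State', 'Region'])
print(df_region_toy)
```

The inverted dictionary could also be applied directly with `Series.map`, which avoids the merge step at the cost of keeping the mapping outside a dataframe.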


Let's use a hash table (source) to map each state name to its abbreviation.

In [13]:
from util import hash_state_to_abbrev
hash_state = hash_state_to_abbrev()

df_region['State'] = df_region['State'].map(lambda key: hash_state[key])
df_region.head()

Unnamed: 0,State,Region
0,AZ,West
1,CO,West
2,ID,West
3,MT,West
4,NV,West


In [14]:
# good, we're now ready to join this "Region" information to our lookup table
df_lookup = df_lookup.merge(df_region,on='State',how='left')

df_lookup.head(10)

Unnamed: 0,Code,Description,Airport,City,State,Region
0,10135,"Allentown/Bethlehem/Easton, PA: Lehigh Valley ...",Lehigh Valley International,Allentown/Bethlehem/Easton,PA,Northeast
1,10136,"Abilene, TX: Abilene Regional",Abilene Regional,Abilene,TX,South
2,10140,"Albuquerque, NM: Albuquerque International Sun...",Albuquerque International Sunport,Albuquerque,NM,West
3,10141,"Aberdeen, SD: Aberdeen Regional",Aberdeen Regional,Aberdeen,SD,Midwest
4,10146,"Albany, GA: Southwest Georgia Regional",Southwest Georgia Regional,Albany,GA,South
5,10154,"Nantucket, MA: Nantucket Memorial",Nantucket Memorial,Nantucket,MA,Northeast
6,10155,"Waco, TX: Waco Regional",Waco Regional,Waco,TX,South
7,10157,"Arcata/Eureka, CA: Arcata",Arcata,Arcata/Eureka,CA,West
8,10158,"Atlantic City, NJ: Atlantic City International",Atlantic City International,Atlantic City,NJ,Northeast
9,10165,"Adak Island, AK: Adak",Adak,Adak Island,AK,West
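
A left merge keeps every airport but leaves `Region` empty for any state code that has no entry in `df_region`. One way to list such codes is the `indicator` option of `merge`, sketched here on toy frames (`lookup_toy`, `region_toy`, `merged`, and `unmatched` are hypothetical names; the `TT` code is the one the lookup table uses for Pago Pago).

```python
import pandas as pd

lookup_toy = pd.DataFrame({
    'Code': [10135, 10136, 14222],
    'State': ['PA', 'TX', 'TT'],   # 'TT' has no entry in the region table below
})
region_toy = pd.DataFrame({
    'State': ['PA', 'TX'],
    'Region': ['Northeast', 'South'],
})

# indicator=True adds a "_merge" column telling where each row came from
merged = lookup_toy.merge(region_toy, on='State', how='left', indicator=True)

# rows marked "left_only" found no matching state in the region table
unmatched = merged.loc[merged['_merge'] == 'left_only', 'State'].unique().tolist()
print(unmatched)
```

Checking this list right after the merge makes it clear which airports will carry a missing region into later analysis.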


## Add airport latitude and longitude information using Google geocoder

- Next we'll query the geographical latitude/longitude of each airport using the Google geocoder.

- This information will be especially useful when creating visualizations.

- There's a nice Python package for querying lat/lon: https://pypi.python.org/pypi/geocoder

- The cell below takes a while to run, so it's a good time to brew more coffee :)
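
To put a rough number on "a while": the loop pauses 10 seconds after every query, and there are 334 airports, so the pauses alone set a lower bound on the run time. The small calculation below makes that explicit (the variable names are only for illustration).

```python
n_items = 334        # airports left in the lookup table
pause_seconds = 10   # the loop calls time.sleep(10) after every query

# lower bound: counts only the pauses, not the time spent waiting on the geocoder
total_seconds = n_items * pause_seconds
print("at least {:.1f} minutes".format(total_seconds / 60.0))
```

That is a little under an hour, which matches the elapsed times logged by the cell.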

In [15]:
import geocoder
from util import print_time

t = time.time()
lat,lon = [],[]

n_items = df_lookup.shape[0]
for i,airport in enumerate(df_lookup['Airport']):
    if i%20==0:
         print '({:3} out of {})'.format(i,n_items),print_time(t)
    loc = geocoder.google(airport)
    time.sleep(10) # add a pause to avoid getting service timed-out

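    # geocoder returns a result object even when the search fails,
    # so test the coordinates rather than the object itself
    if loc.lat is not None and loc.lng is not None:
        lon.append(loc.lng)
        lat.append(loc.lat)
    else:
        # lookup failed
        lon.append(None)
        lat.append(None)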

# add as new columns
df_lookup['lat'] = lat
df_lookup['lon'] = lon

n_nans = df_lookup['lat'].isnull().sum(axis=0)
print "-- {} NANs out {} ({:.2f}%) --".format(n_nans,n_items,n_nans/float(n_items)*100)

(  0 out of 334) Elapsed time:  0.00 seconds
( 20 out of 334) Elapsed time: 204.68 seconds
( 40 out of 334) Elapsed time: 413.08 seconds
( 60 out of 334) Elapsed time: 620.54 seconds
( 80 out of 334) Elapsed time: 831.28 seconds
(100 out of 334) Elapsed time: 1042.17 seconds
(120 out of 334) Elapsed time: 1253.24 seconds
(140 out of 334) Elapsed time: 1464.41 seconds
(160 out of 334) Elapsed time: 1676.92 seconds
(180 out of 334) Elapsed time: 1887.61 seconds
(200 out of 334) Elapsed time: 2098.65 seconds
(220 out of 334) Elapsed time: 2309.66 seconds
(240 out of 334) Elapsed time: 2520.07 seconds
(260 out of 334) Elapsed time: 2730.17 seconds
(280 out of 334) Elapsed time: 2941.02 seconds
(300 out of 334) Elapsed time: 3152.57 seconds
(320 out of 334) Elapsed time: 3363.34 seconds
-- 22 NANs out 334 (6.59%) --


- 6.59% of the queries failed...

- For the airports whose lookup failed, search again using City + State information (we lose a bit of locality, but this will suffice for our analysis)
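
The "try a more specific query first, then fall back to a coarser one" idea can be sketched as a small helper. This is only an illustration under stated assumptions: `geocode_with_fallback`, `FakeLocation`, and `fake_lookup` are hypothetical names, the lookup function is passed in so the example runs offline, and the coordinates are placeholders rather than real positions. The only thing assumed about the real geocoder result is what this notebook already relies on, namely `.lat` and `.lng` attributes.

```python
import time


def geocode_with_fallback(queries, lookup, pause=0.0):
    """Try each query string in order; return the first (lat, lng) found.

    `lookup` is any callable that returns an object with `.lat` and `.lng`
    attributes (None when the search fails). Returns (None, None) if every
    query fails.
    """
    for query in queries:
        loc = lookup(query)
        time.sleep(pause)  # pause between requests to avoid service time-outs
        if loc is not None and loc.lat is not None and loc.lng is not None:
            return loc.lat, loc.lng
    return None, None


# --- offline stand-in for the geocoder, for illustration only ---
class FakeLocation(object):
    def __init__(self, lat=None, lng=None):
        self.lat, self.lng = lat, lng


def fake_lookup(query):
    # placeholder coordinates, NOT real positions
    known = {'Hattiesburg, MS': FakeLocation(1.0, 2.0)}
    return known.get(query, FakeLocation())


# airport name fails in the fake lookup, so the city/state query is used instead
print(geocode_with_fallback(['Hattiesburg-Laurel Regional', 'Hattiesburg, MS'],
                            fake_lookup))
```

With the real service one would pass the geocoding function and a non-zero `pause`; testing the coordinates rather than the returned object matters because a failed search still returns an object.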

In [17]:
idx_nan = [] # keep track of the index locations that fail yet again
for i in xrange(n_items):
    print i,
    if lat[i] is not None:
        continue
    city,state = df_lookup['City'].loc[i], df_lookup['State'].loc[i]
    loc = geocoder.google('{}, {}'.format(city,state))
    time.sleep(10) # add a pause to avoid getting service timed-out

    # a failed search still returns an object, so test the coordinates instead
    if loc.lat is not None and loc.lng is not None:
        lon[i] = loc.lng
        lat[i] = loc.lat
    else:
        print '    lookup failed for: {}, {}'.format(city,state)
        idx_nan.append(i)
        
# update columns
df_lookup['lat'] = lat
df_lookup['lon'] = lon

n_nans = df_lookup['lat'].isnull().sum(axis=0)
print "-- {} NANs out {} ({:.2f}%) --".format(n_nans,n_items,n_nans/float(n_items)*100)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 27

- So at this point, we have a single lookup failure.

- Although inelegant, I'll just query this one manually in the geocoder.

In [18]:
df_lookup[df_lookup['lat'].isnull()]

Unnamed: 0,Code,Description,Airport,City,State,Region,lat,lon
252,14109,"Hattiesburg/Laurel, MS: Hattiesburg-Laurel Reg...",Hattiesburg-Laurel Regional,Hattiesburg/Laurel,MS,South,,


In [19]:
# one lookup failed...let's just use the cityname before the "/" char
loc = geocoder.google('Hattiesburg MS')
loc

<[OK] Google - Geocode [Hattiesburg, MS, USA]>

In [20]:
df_lookup.loc[252,'lat'] = loc.lat
df_lookup.loc[252,'lon'] = loc.lng

df_lookup.isnull().sum()

Code           0
Description    0
Airport        0
City           0
State          0
Region         8
lat            0
lon            0
dtype: int64

## Add column with both city and state

- Since I am not familiar with many of the airport names, I'd rather work with city and state names.

- However, there may be multiple airports in the same city (e.g., JFK and LaGuardia in NYC), so uniqueness of "City/State" is not guaranteed.

- Here, I'll create one more (and final) column containing both the city and state information, and modify duplicates as needed.



In [21]:
df_lookup['City_State'] = df_lookup['City'] + ' (' + df_lookup['State'] + ')'

df_lookup.sample(5)

Unnamed: 0,Code,Description,Airport,City,State,Region,lat,lon,City_State
34,10620,"Billings, MT: Billings Logan International",Billings Logan International,Billings,MT,West,45.803738,-108.537214,Billings (MT)
180,12888,"Laramie, WY: Laramie Regional",Laramie Regional,Laramie,WY,West,41.320194,-105.670345,Laramie (WY)
153,12250,"Hyannis, MA: Barnstable Municipal-Boardman/Pol...",Barnstable Municipal-Boardman/Polando Field,Hyannis,MA,Northeast,41.667338,-70.284745,Hyannis (MA)
257,14222,"Pago Pago, TT: Pago Pago International",Pago Pago International,Pago Pago,TT,,-14.331389,-170.711389,Pago Pago (TT)
103,11525,"Elko, NV: Elko Regional",Elko Regional,Elko,NV,West,40.827819,-115.786212,Elko (NV)


In [22]:
# check duplicates in "City_State"
dups = df_lookup['City_State'].value_counts()
dups = dups[dups != 1]

dups

Houston (TX)       3
Chicago (IL)       2
Washington (DC)    2
New York (NY)      2
Name: City_State, dtype: int64

In [23]:
# create hash-table for airport name lookup
hash_airport = df_lookup.set_index('Code')['Airport'].to_dict()
pprint({k: hash_airport[k] for k in hash_airport.keys()[:5]})

{10245: 'King Salmon Airport',
 10754: 'Wiley Post/Will Rogers Memorial',
 11267: 'James M Cox/Dayton International',
 13830: 'Kahului Airport',
 14696: 'South Bend International'}


In [24]:
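# for each duplicated "City_State", list its airports; note that the number printed
# after "Code =" is the count of departing flights, not the airport code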
for dup in dups.index:
    print dup
    for i in df_lookup[df_lookup['City_State'] == dup].Code:
        print "    (Code = {:6}) Airport = {}".format((df_data['ORIGIN_AIRPORT_ID'] == i).sum(), hash_airport[i])

Houston (TX)
    (Code =      1) Airport = Ellington
    (Code = 171407) Airport = William P Hobby
    (Code = 478137) Airport = George Bush Intercontinental/Houston
Chicago (IL)
    (Code = 264913) Airport = Chicago Midway International
    (Code = 853523) Airport = Chicago O'Hare International
Washington (DC)
    (Code = 230760) Airport = Ronald Reagan Washington National
    (Code = 135697) Airport = Washington Dulles International
New York (NY)
    (Code = 302634) Airport = John F. Kennedy International
    (Code = 314816) Airport = LaGuardia


Again, this is kind of hacky, but I'll manually replace these duplicates using the "cleaner" list below.

In [25]:
cleaner = [
    ('Houston (TX) [Ell]', 'Ellington'), 
    ('Houston (TX) [WP.Hobby]', 'William P Hobby'), 
    ('Houston (TX) [G.Bush]',  'George Bush Intercontinental/Houston'), 
    ('Chicago (IL) [Midway]',   'Chicago Midway International'),
    ("Chicago (IL) [O'Hare]",   "Chicago O'Hare International"),
    ('Washington (DC) [R.Reagan]',   'Ronald Reagan Washington National'),
    ('Washington (DC) [W.Dulles]',   'Washington Dulles International'),
    ("New York (NY) [JFK]",   "John F. Kennedy International"),
    ("New York (NY) [Lag]",   "LaGuardia"),
]

for _replace, _airport in cleaner:
    df_lookup.loc[df_lookup['Airport'] == _airport, 'City_State'] = _replace

# check duplicates are removed
assert np.all(df_lookup['City_State'].value_counts() == 1)
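
If hand-written labels are not needed, the duplicates could also be disambiguated automatically by appending the airport name. The sketch below is an alternative to the manual `cleaner` list, not what this notebook does; it uses a toy frame (`toy` and `is_dup` are hypothetical names) built from airports shown above.

```python
import pandas as pd

toy = pd.DataFrame({
    'Airport': ['Chicago Midway International',
                "Chicago O'Hare International",
                'Denver International'],
    'City_State': ['Chicago (IL)', 'Chicago (IL)', 'Denver (CO)'],
})

# keep=False marks every member of a duplicated group, not just the repeats
is_dup = toy['City_State'].duplicated(keep=False)

# append the airport name only where the city/state label is ambiguous
toy.loc[is_dup, 'City_State'] = (toy.loc[is_dup, 'City_State']
                                 + ' [' + toy.loc[is_dup, 'Airport'] + ']')

# every label should now be unique
assert toy['City_State'].is_unique
print(toy['City_State'].tolist())
```

The trade-off is longer labels than the short hand-picked tags, which may matter for plot legends.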

# All done. Save dataframe on disk

- We have created our *enhanced* lookup table.

- Let's save it to disk for later analysis.

In [26]:
print df_lookup.shape
df_lookup.sample(10).sort_index()

(334, 9)


Unnamed: 0,Code,Description,Airport,City,State,Region,lat,lon,City_State
46,10739,"Brainerd, MN: Brainerd Lakes Regional",Brainerd Lakes Regional,Brainerd,MN,Midwest,46.393046,-94.141263,Brainerd (MN)
77,11111,"Columbia, MO: Columbia Regional",Columbia Regional,Columbia,MO,Midwest,38.817316,-92.221582,Columbia (MO)
87,11274,"Dubuque, IA: Dubuque Regional",Dubuque Regional,Dubuque,IA,Midwest,42.397092,-90.705606,Dubuque (IA)
89,11292,"Denver, CO: Denver International",Denver International,Denver,CO,West,39.856096,-104.673738,Denver (CO)
91,11308,"Dothan, AL: Dothan Regional",Dothan Regional,Dothan,AL,South,31.321656,-85.452433,Dothan (AL)
119,11697,"Fort Lauderdale, FL: Fort Lauderdale-Hollywood...",Fort Lauderdale-Hollywood International,Fort Lauderdale,FL,South,26.074234,-80.150602,Fort Lauderdale (FL)
154,12255,"Hays, KS: Hays Regional",Hays Regional,Hays,KS,Midwest,38.842222,-99.273167,Hays (KS)
167,12397,"Ithaca/Cortland, NY: Ithaca Tompkins Regional",Ithaca Tompkins Regional,Ithaca/Cortland,NY,Northeast,42.490638,-76.459941,Ithaca/Cortland (NY)
317,15356,"Trenton, NJ: Trenton Mercer",Trenton Mercer,Trenton,NJ,Northeast,40.277031,-74.818049,Trenton (NJ)
324,15412,"Knoxville, TN: McGhee Tyson",McGhee Tyson,Knoxville,TN,South,35.810833,-83.993889,Knoxville (TN)


In [28]:
df_lookup.to_csv('df_lookup.csv',index=False)
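
A quick way to confirm that the file can be read back as expected is a round trip through `to_csv` and `read_csv`. The sketch below does this in memory on a two-row stand-in (`toy`, `buf`, and `restored` are hypothetical names; the rows are taken from the tables above) and assumes Python 3, where `io.StringIO` accepts the text that pandas writes.

```python
import io
import pandas as pd

toy = pd.DataFrame({
    'Code': [10620, 14222],
    'City_State': ['Billings (MT)', 'Pago Pago (TT)'],
    'Region': ['West', None],   # a missing region is written as an empty field
})

buf = io.StringIO()
toy.to_csv(buf, index=False)  # index=False: do not write the row index as a column
buf.seek(0)

restored = pd.read_csv(buf)
print(restored)
```

The empty `Region` field comes back as NaN, so the missing regions survive the round trip and still need handling in later analysis.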