# Mapping 

Transportation is about getting from place A to place B.  Therefore, most transportation data has a spatial component to it.  It is nice to be able to put these data on a map and see what is going on.  It is even better if we can put it on a map and interact with the data.  It would be even cooler if we could put our interactive map on a website to show it off!

To do this, we are going to use a package called folium.  You can find the documentation here: 

https://folium.readthedocs.io/en/latest/

The source code is on GitHub here: 

https://github.com/python-visualization/folium


### Credits

This lesson draws from the folium quickstart notebook, and from Vik Paruchuri's DataQuest lesson: 

https://www.dataquest.io/blog/python-data-visualization-libraries/

### A side note on static mapping

Sometimes you may want to create a static map instead of an interactive map.  Interactive maps are nice for exploring your data, but static maps work well for an image that you can insert into a paper.  If you want to create static maps, then basemap is a good tool.  Here is a nice lesson focused on mapping earthquake activity: 

http://introtopython.org/visualization_earthquakes.html



### OK, back to interactive mapping, because that's fun...

It turns out that folium doesn't do much itself.  It is just a wrapper around something called leafletjs.  You can read more about that here:

http://leafletjs.com/index.html

Leaflet is a library written in JavaScript, the language used for most web applications.  We could do the same thing using JavaScript and Leaflet directly, but then we would have to learn the syntax for another language.  That might not be too hard, but to keep it simple, we'll stick to the Python wrapper for now.  It is good to be aware of Leaflet, though, because if you want more options than folium allows, you can go to it directly.  

What makes this possible is the fact that leaflet has a well-defined API.  That means that we can pass data back and forth, even from a different language.  


### Setup

Start by installing folium using pip.  At a command prompt, type: 

    pip install folium

Hmm...when I tried this on my desktop, I got an error that says: 

    PermissionError: [WinError 5] Access is denied: 'c:\\program files\\anaconda3\\Lib\\site-packages\\folium'
    
It seems that pip is trying to install something in the Program Files directory, which Windows has protected.  Whether this happens will depend on the security settings on your machine.  If you get this error, open a command prompt as an administrator: in the Windows search bar, type cmd.  When you see the command prompt, right-click it and select Run as administrator.  

This did the trick, and now I get: 

    Successfully installed folium-0.2.1
    
In addition, let's go to GitHub and clone the folium repository (https://github.com/python-visualization/folium) to our desktop.  This gives us the source code on our local machine.  What we're really interested in is the examples folder, which contains a bunch of Jupyter notebooks showing how to do different things.  You are welcome to explore these as needed. 

You also need to install geopandas, which will make it easier to work with geographic data.  The pip installer often fails on Windows (the long explanation is here: http://geoffboeing.com/2014/09/using-geopandas-windows/), so we'll install it using conda.  Type: 

    conda install -c conda-forge geopandas
 


Getting Started
---------------

To create a base map, simply pass your starting coordinates to Folium:

In [1]:
import folium

In [3]:
folium.Map?

In [6]:
m = folium.Map(location=[38.034,-84.500])

To display it in your notebook, just ask for the object representation. 

In [7]:
m

In [11]:
dhaka = folium.Map (location =[23.8103,90.4125])
dhaka

To save it to a file:

In [None]:
m.save('lex.html')

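We can use different backgrounds, or tilesets.  Several are built in.  Options include Stamen Terrain, Stamen Toner, Mapbox Bright, and Mapbox Control Room tiles.  Which names are accepted depends on your folium version, so check the documentation if one of these is rejected. 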

In [13]:
folium.Map (
    location =[23.8103,90.4125],
    tiles = 'Stamen Toner',
    zoom_start = 13
)

In [12]:
folium.Map(
    location=[38.034,-84.500],
    tiles='Stamen Toner',
    zoom_start=13
)

Pick one you like and work with that for the rest of the class.  

Folium also supports Cloudmade and Mapbox custom tilesets: simply pass your key to the API_key keyword.  These are services where you can buy more backgrounds to make your maps look nice. 

```python
folium.Map(location=[45.5236, -122.6750],
           tiles='Mapbox',
           API_key='your.API.key')
```

### Open flights

Let's go back to our openflight data and make some maps. 

In [5]:
import pandas as pd
import numpy as np

In [6]:
# These files use \N as a missing value indicator.  When reading the CSVs, we tell
# pandas to treat that value as missing (NA).  The double backslash is required because
# otherwise Python would read \N as the start of a named Unicode escape, not as two literal characters. 

# Read in the airports data.
airports = pd.read_csv("data/airports.dat", header=None, na_values='\\N')
airports.columns = ["id", "name", "city", "country", "iata", "icao", "latitude", "longitude", "altitude","timezone", "dst", "tz", "type", "source"]

# Read in the airlines data.
airlines = pd.read_csv("data/airlines.dat", header=None, na_values='\\N')
airlines.columns = ["id", "name", "alias", "iata", "icao", "callsign", "country", "active"]

# Read in the routes data.
routes = pd.read_csv("data/routes.dat", header=None, na_values='\\N')
routes.columns = ["airline", "airline_id", "source", "source_id", "dest", "dest_id", "codeshare", "stops", "equipment"]
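
To see what the doubled backslash and the missing-value marker are doing, here is a minimal sketch using only the standard library.  The two-row sample and the `read_rows` helper are made up for illustration; pandas does the equivalent work for us through `na_values`.

```python
import csv
import io

# Both spellings produce the same two-character string: a backslash, then N.
assert '\\N' == r'\N'
assert len(r'\N') == 2

# Hypothetical two-row sample in the OpenFlights style, where \N marks a missing value.
sample = io.StringIO('1,"Goroka Airport",GKA\n2,"Unknown Strip",\\N\n')

def read_rows(handle, missing=r'\N'):
    """Parse CSV rows, replacing the missing-value marker with None."""
    return [[None if cell == missing else cell for cell in row]
            for row in csv.reader(handle)]

rows = read_rows(sample)
print(rows)
```

A raw string (`r'\N'`) is an alternative to doubling the backslash; either form should work as the `na_values` argument.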

In [4]:
# let's peek at what we have
airports.head()

Unnamed: 0,id,name,city,country,iata,icao,latitude,longitude,altitude,timezone,dst,tz,type,source
0,1,Goroka Airport,Goroka,Papua New Guinea,GKA,AYGA,-6.08169,145.391998,5282,10.0,U,Pacific/Port_Moresby,airport,OurAirports
1,2,Madang Airport,Madang,Papua New Guinea,MAG,AYMD,-5.20708,145.789001,20,10.0,U,Pacific/Port_Moresby,airport,OurAirports
2,3,Mount Hagen Kagamuga Airport,Mount Hagen,Papua New Guinea,HGU,AYMH,-5.82679,144.296005,5388,10.0,U,Pacific/Port_Moresby,airport,OurAirports
3,4,Nadzab Airport,Nadzab,Papua New Guinea,LAE,AYNZ,-6.569803,146.725977,239,10.0,U,Pacific/Port_Moresby,airport,OurAirports
4,5,Port Moresby Jacksons International Airport,Port Moresby,Papua New Guinea,POM,AYPY,-9.44338,147.220001,146,10.0,U,Pacific/Port_Moresby,airport,OurAirports


In [11]:
airlines.head()

Unnamed: 0,id,name,alias,iata,icao,callsign,country,active
0,-1,Unknown,,-,,,,Y
1,1,Private flight,,-,,,,Y
2,2,135 Airways,,,GNL,GENERAL,United States,N
3,3,1Time Airline,,1T,RNX,NEXTIME,South Africa,Y
4,4,2 Sqn No 1 Elementary Flying Training School,,,WYT,,United Kingdom,N


In [12]:
routes.head()

Unnamed: 0,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment
0,2B,410.0,AER,2965.0,KZN,2990.0,,0,CR2
1,2B,410.0,ASF,2966.0,KZN,2990.0,,0,CR2
2,2B,410.0,ASF,2966.0,MRV,2962.0,,0,CR2
3,2B,410.0,CEK,2968.0,KZN,2990.0,,0,CR2
4,2B,410.0,CEK,2968.0,OVB,4078.0,,0,CR2


Make a map with the airports on it.

In [10]:
# since there are a lot of airports, making the map can be slow
# so limit it to US airports
us_airports = airports[airports['country']=='United States']
len(us_airports)

1435

In [11]:
us_airports.iterrows?

In [10]:
# Get a basic world map.
# location is [latitude, longitude]: 30 is the latitude (30 degrees north),
# and 0 is the longitude (the prime meridian)
airports_map = folium.Map(location=[30, 0], zoom_start=2)

# Loop through the airports, and draw each one as a marker on the map
# popup tells it what to display when you click on it
for name, row in us_airports.iterrows():
    
    # For some reason, this one airport causes issues with the map.
    if row["name"] != "South Pole Station":
        marker = folium.Marker([row["latitude"], row["longitude"]], popup=row['name'])
        marker.add_to(airports_map)
        
# Save it to a file (it's kinda big for the notebook)
airports_map.save('airports.html')

Hmm...it looks like there are airports everywhere!  Let's try again with smaller markers. 

We can also specify the color.  A list of custom colors is available here: 

http://www.w3schools.com/cssref/css_colors.asp

In [12]:
# over-write the airports_map, rather than just adding more markers to it. 
airports_map = folium.Map(location=[30, 0], zoom_start=2)

# use circle markers this time, with custom size and color
for name, row in us_airports.iterrows():
        
    # For some reason, this one airport causes issues with the map.
    if row["name"] != "South Pole Station":
        marker = folium.CircleMarker([row["latitude"], row["longitude"]], 
                                     radius=5,
                                     color='DarkCyan',
                                     fill_color='DarkCyan', 
                                     popup=row['name'])
        marker.add_to(airports_map)
        
airports_map.save('airports.html')

You can also select icons to use as markers.  That code would look like: 

    marker = folium.Marker([row["latitude"], row["longitude"]], 
                           icon=folium.Icon(icon='cloud'), 
                           popup=row['name'])
                           
The list of icons comes from something called bootstrap, and can be found here: 

http://www.bootstrapicons.com/


Or you can use clusters of markers to clean up the map.  This will group them when you zoom out, similar to a Craigslist map.  You can see how to do that here: 

https://ocefpaf.github.io/python4oceanographers/blog/2015/12/14/geopandas_folium/

You can clean up the rest of this airports map as part of your homework this week.  

Let's draw the routes, but since we have lots, let's just start with the routes departing Lexington. 

In [52]:
# Select the LEX routes, then join the source airports
lex_routes = routes[(routes['source']=="LEX")]
#lex_routes = pd.merge(lex_routes, airports, left_on='source_id', right_on='id', how='left')

In [49]:
routes.head ()

Unnamed: 0,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment
0,2B,410.0,AER,2965.0,KZN,2990.0,,0,CR2
1,2B,410.0,ASF,2966.0,KZN,2990.0,,0,CR2
2,2B,410.0,ASF,2966.0,MRV,2962.0,,0,CR2
3,2B,410.0,CEK,2968.0,KZN,2990.0,,0,CR2
4,2B,410.0,CEK,2968.0,OVB,4078.0,,0,CR2


In [50]:
airports.head ()

Unnamed: 0,id,name,city,country,iata,icao,latitude,longitude,altitude,timezone,dst,tz,type,source
0,1,Goroka Airport,Goroka,Papua New Guinea,GKA,AYGA,-6.08169,145.391998,5282,10.0,U,Pacific/Port_Moresby,airport,OurAirports
1,2,Madang Airport,Madang,Papua New Guinea,MAG,AYMD,-5.20708,145.789001,20,10.0,U,Pacific/Port_Moresby,airport,OurAirports
2,3,Mount Hagen Kagamuga Airport,Mount Hagen,Papua New Guinea,HGU,AYMH,-5.82679,144.296005,5388,10.0,U,Pacific/Port_Moresby,airport,OurAirports
3,4,Nadzab Airport,Nadzab,Papua New Guinea,LAE,AYNZ,-6.569803,146.725977,239,10.0,U,Pacific/Port_Moresby,airport,OurAirports
4,5,Port Moresby Jacksons International Airport,Port Moresby,Papua New Guinea,POM,AYPY,-9.44338,147.220001,146,10.0,U,Pacific/Port_Moresby,airport,OurAirports


In [51]:
lex_routes.head ()

Unnamed: 0,airline,airline_id,source_x,source_id,dest,dest_id,codeshare,stops,equipment,id,name,city,country,iata,icao,latitude,longitude,altitude,timezone,dst,tz,type,source_y
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,OurAirports
1,AA,24.0,LEX,4017.0,CLT,3876.0,Y,0,CR7 CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,OurAirports
2,AA,24.0,LEX,4017.0,DFW,3670.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,OurAirports
3,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,OurAirports
4,AF,137.0,LEX,4017.0,ATL,3682.0,Y,0,CRJ CR9,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,OurAirports


In [53]:
# join the destination airports.  Here we need to use the suffixes option, because 
# the column names overlap, and we want to distinguish between source and dest
lex_routes = pd.merge(lex_routes, airports, 
                      left_on='dest_id', 
                      right_on='id', 
                      how='left', 
                      suffixes=['_source','_dest'])

In [57]:
lex_routes=lex_routes.drop_duplicates(subset=['source_source','dest'])
lex_routes

Unnamed: 0,airline,airline_id,source_source,source_id,dest,dest_id,codeshare,stops,equipment,id,name,city,country,iata,icao,latitude,longitude,altitude,timezone,dst,tz,type,source_dest
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports
1,AA,24.0,LEX,4017.0,CLT,3876.0,Y,0,CR7 CRJ,3876,Charlotte Douglas International Airport,Charlotte,United States,CLT,KCLT,35.214001,-80.9431,748,-5.0,A,America/New_York,airport,OurAirports
2,AA,24.0,LEX,4017.0,DFW,3670.0,Y,0,ERD ER4,3670,Dallas Fort Worth International Airport,Dallas-Fort Worth,United States,DFW,KDFW,32.896801,-97.038002,607,-6.0,A,America/Chicago,airport,OurAirports
3,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,3830,Chicago O'Hare International Airport,Chicago,United States,ORD,KORD,41.9786,-87.9048,672,-6.0,A,America/Chicago,airport,OurAirports
6,DL,2009.0,LEX,4017.0,DCA,3520.0,Y,0,CRJ,3520,Ronald Reagan Washington National Airport,Washington,United States,DCA,KDCA,38.8521,-77.037697,15,-5.0,A,America/New_York,airport,OurAirports
7,DL,2009.0,LEX,4017.0,DTW,3645.0,Y,0,CR7 CRJ CR9,3645,Detroit Metropolitan Wayne County Airport,Detroit,United States,DTW,KDTW,42.212399,-83.353401,645,-5.0,A,America/New_York,airport,OurAirports
8,DL,2009.0,LEX,4017.0,LGA,3697.0,,0,ERJ,3697,La Guardia Airport,New York,United States,LGA,KLGA,40.777199,-73.872597,21,-5.0,A,America/New_York,airport,OurAirports
9,DL,2009.0,LEX,4017.0,MSP,3858.0,Y,0,CRJ,3858,Minneapolis-St Paul International/Wold-Chamber...,Minneapolis,United States,MSP,KMSP,44.882,-93.221802,841,-6.0,A,America/Chicago,airport,OurAirports
10,G4,35.0,LEX,4017.0,FLL,3533.0,,0,M80,3533,Fort Lauderdale Hollywood International Airport,Fort Lauderdale,United States,FLL,KFLL,26.072599,-80.152702,9,-5.0,A,America/New_York,airport,OurAirports
11,G4,35.0,LEX,4017.0,PGD,7056.0,,0,M80,7056,Charlotte County Airport,Punta Gorda,United States,PGD,KPGD,26.9202,-81.990501,26,-5.0,A,America/New_York,airport,OurAirports


In [58]:
one_transfer = pd.merge(lex_routes, routes, 
                      left_on='dest', 
                      right_on='source', 
                      how='inner', 
                      suffixes=['_lex','_trnsfr'])

In [60]:
one_transfer=one_transfer.drop_duplicates()

In [61]:
one_transfer

Unnamed: 0,airline_lex,airline_id_lex,source_source,source_id_lex,dest_lex,dest_id_lex,codeshare_lex,stops_lex,equipment_lex,id,name,city,country,iata,icao,latitude,longitude,altitude,timezone,dst,tz,type,source_dest,airline_trnsfr,airline_id_trnsfr,source,source_id_trnsfr,dest_trnsfr,dest_id_trnsfr,codeshare_trnsfr,stops_trnsfr,equipment_trnsfr
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3
1,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3
2,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,MEI,4335.0,,0,SF3
3,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,MSL,5756.0,,0,SF3
4,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,PIB,5759.0,,0,SF3
5,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,3M,20710.0,ATL,3682.0,TUP,5773.0,,0,SF3
6,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,9E,3976.0,ATL,3682.0,AZO,4039.0,,0,CRJ
7,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,9E,3976.0,ATL,3682.0,CHA,3578.0,,0,CRJ
8,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,9E,3976.0,ATL,3682.0,CID,4043.0,,0,CRJ
9,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,OurAirports,9E,3976.0,ATL,3682.0,CRW,4285.0,,0,CRJ


In [25]:
# here is what our data looks like
one_transfer.describe()

Unnamed: 0,airline_id_lex,source_id_lex,dest_id_lex,stops_lex,id_source,latitude_source,longitude_source,altitude_source,timezone_source,id_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,airline_id_trnsfr,source_id_trnsfr,dest_id_trnsfr,stops_trnsfr
count,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3558.0,3557.0,3558.0,3558.0,3558.0
mean,1834.94407,4017.0,3726.698145,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,3726.698145,35.999216,-86.388806,613.05846,-5.418212,2775.65336,3726.698145,3293.567173,0.0
std,1856.5372,0.0,267.048878,0.0,0.0,0.0,0.0,0.0,0.0,267.048878,5.132088,6.488571,361.867601,0.493335,2478.664191,267.048878,1172.594345,0.0
min,24.0,4017.0,3520.0,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,3520.0,26.072599,-97.038002,9.0,-6.0,24.0,3520.0,16.0,0.0
25%,24.0,4017.0,3670.0,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,3670.0,32.896801,-93.221802,607.0,-6.0,321.0,3670.0,3458.0,0.0
50%,2009.0,4017.0,3682.0,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,3682.0,33.6367,-84.428101,672.0,-5.0,2009.0,3682.0,3660.5,0.0
75%,3976.0,4017.0,3830.0,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,3830.0,41.9786,-81.425753,1026.0,-5.0,5209.0,3830.0,3830.0,0.0
max,5209.0,4017.0,7056.0,0.0,4017.0,38.036499,-84.605904,979.0,-5.0,7056.0,44.882,-73.872597,1026.0,-5.0,20710.0,7056.0,11051.0,0.0


In [23]:
one_transfer=one_transfer.drop_duplicates()

In [21]:
pd.options.display.max_columns=50

In [14]:
# It looks like source has some duplicate names.  Drop the values from the airports
# file and keep the one from the routes file
lex_routes = lex_routes.drop(['source_y','source'], axis=1)
lex_routes = lex_routes.rename(columns={'source_x': 'source'})

In [15]:
# Let's keep only one route between each airport pair
# so we don't have a bunch of lines on top of each other
# The subset option tells it to consider just those columns when determining
# what is a duplicate. 

lex_routes = lex_routes.drop_duplicates(subset=['source', 'dest'])
lex_routes

Unnamed: 0,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment,id_source,...,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,...,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport
1,AA,24.0,LEX,4017.0,CLT,3876.0,Y,0,CR7 CRJ,4017,...,United States,CLT,KCLT,35.214001,-80.9431,748,-5.0,A,America/New_York,airport
2,AA,24.0,LEX,4017.0,DFW,3670.0,Y,0,ERD ER4,4017,...,United States,DFW,KDFW,32.896801,-97.038002,607,-6.0,A,America/Chicago,airport
3,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,...,United States,ORD,KORD,41.9786,-87.9048,672,-6.0,A,America/Chicago,airport
6,DL,2009.0,LEX,4017.0,DCA,3520.0,Y,0,CRJ,4017,...,United States,DCA,KDCA,38.8521,-77.037697,15,-5.0,A,America/New_York,airport
7,DL,2009.0,LEX,4017.0,DTW,3645.0,Y,0,CR7 CRJ CR9,4017,...,United States,DTW,KDTW,42.212399,-83.353401,645,-5.0,A,America/New_York,airport
8,DL,2009.0,LEX,4017.0,LGA,3697.0,,0,ERJ,4017,...,United States,LGA,KLGA,40.777199,-73.872597,21,-5.0,A,America/New_York,airport
9,DL,2009.0,LEX,4017.0,MSP,3858.0,Y,0,CRJ,4017,...,United States,MSP,KMSP,44.882,-93.221802,841,-6.0,A,America/Chicago,airport
10,G4,35.0,LEX,4017.0,FLL,3533.0,,0,M80,4017,...,United States,FLL,KFLL,26.072599,-80.152702,9,-5.0,A,America/New_York,airport
11,G4,35.0,LEX,4017.0,PGD,7056.0,,0,M80,4017,...,United States,PGD,KPGD,26.9202,-81.990501,26,-5.0,A,America/New_York,airport


That looks better.  Now, let's create a map.  To avoid adding duplicate airports, we are going to use a container called a set.  A set is an unordered collection of unique elements.  This means we can keep adding LEX to the set, and end up with only 1 LEX in the end.  
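
Here is a small sketch of how a set behaves, using made-up airport codes.  It also shows why the routes are stored as tuples: a set can only hold hashable items, and a tuple is hashable while a list is not.

```python
# Adding the same airport code repeatedly leaves only one copy in the set.
airport_codes = set()
for code in ["LEX", "ATL", "LEX", "ORD", "LEX"]:
    airport_codes.add(code)

print(sorted(airport_codes))

# Routes are (source, dest) tuples.  The order inside the tuple matters,
# so LEX-ATL and ATL-LEX count as two different routes.
route_pairs = set()
route_pairs.add(("LEX", "ATL"))
route_pairs.add(("LEX", "ATL"))   # duplicate, ignored
route_pairs.add(("ATL", "LEX"))   # reverse direction, kept

print(len(route_pairs))
```

Checking membership with `in` or `not in` is fast for a set, which is why the loop below can test every row without slowing down.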

In [13]:
# create a basic map, centered on Lexington
lex_air = folium.Map(
    location=[38.034,-84.500],
    tiles='Stamen Toner',
    zoom_start=4
)

In [22]:
# Define some empty sets
airport_set = set()
route_set = set()

# Make sure we don't add duplicates, especially for the origins
for name, row in lex_routes.iterrows():
    
    if row['source'] not in airport_set: 
        popup_string = row['city_source'] + ' (' + row['source'] + ')'
        marker = folium.CircleMarker([row["latitude_source"], row["longitude_source"]], 
                                     color='DarkCyan',
                                     fill_color='DarkCyan', 
                                     radius=5, popup=popup_string)
        marker.add_to(lex_air)
        airport_set.add(row['source'])
        
    if row['dest'] not in airport_set: 
        popup_string = row['city_dest'] + ' (' + row['dest'] + ')'
        marker = folium.CircleMarker([row["latitude_dest"], row["longitude_dest"]], 
                                     color='MidnightBlue',
                                     fill_color='MidnightBlue', 
                                     radius=5, popup=popup_string)
        marker.add_to(lex_air)
        airport_set.add(row['dest'])
    
    # the inner parentheses indicate that we are checking for (and adding) a tuple in the route_set
    if (row['source'],row['dest']) not in route_set:            
        popup_string = row['source'] + '-' + row['dest']        
        line = folium.PolyLine([(row["latitude_source"], row["longitude_source"]), 
                                (row["latitude_dest"], row["longitude_dest"])], 
                                weight=2, 
                                popup=popup_string)
        line.add_to(lex_air)
        route_set.add((row['source'],row['dest']))
        
lex_air

That's cool.  But airplanes don't follow a straight line on a flat map.  They follow a great circle, which is the shortest path over the surface of the globe.  So when you fly from Chicago to London, you go over Greenland (which is really pretty on a clear day!).  Can we make the lines follow a great circle? 

It looks like there are some options here: 

http://gis.stackexchange.com/questions/47/what-tools-in-python-are-available-for-doing-great-circle-distance-line-creati

Let's try one of them. 

In [14]:
import pyproj

# when creating a function, it is good practice to define the API!
def getGreatCirclePoints(startlat, startlon, endlat, endlon): 
    """
    startlat - starting latitude 
    startlon - starting longitude 
    endlat   - ending latitude 
    endlon   - ending longitude 
    
    returns - a list of tuples, where each tuple is the lat-long for a point
              along the curve.  
    """
    # calculate distance between points
    g = pyproj.Geod(ellps='WGS84')
    (az12, az21, dist) = g.inv(startlon, startlat, endlon, endlat)

    # calculate line string along path with segments <= 20 km
    lonlats = g.npts(startlon, startlat, endlon, endlat,
                     1 + int(dist / 20000))

    # the npts function uses lon-lat, while the folium functions use lat-lon
    # This sort of thing is maddening!  What happens is the lines don't show
    # up on the map and you don't know why.  Learn from my mistakes
    latlons = []
    for lon_lat in lonlats: 
        
        # this is how you get values out of a tuple
        (lon, lat) = lon_lat
        
        # add them to our list
        latlons.append((lat, lon)) 
    
    # npts doesn't include start/end points, so prepend/append them
    latlons.insert(0, (startlat, startlon))
    latlons.append((endlat, endlon))
    
    return latlons
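
For intuition about what the pyproj call is computing, here is a pyproj-free sketch that interpolates points along a great circle.  It assumes the earth is a perfect sphere, so its numbers will differ slightly from the WGS84 ellipsoid results above, and it is not meant for exactly opposite (antipodal) points.  The name `great_circle_points` and the `n_segments` parameter are made up for this illustration.  It returns points in lat-lon order, the order folium expects.

```python
import math

def great_circle_points(start_lat, start_lon, end_lat, end_lon, n_segments=10):
    """Return (lat, lon) tuples along the great circle, endpoints included.

    Assumes a perfect sphere, so results differ slightly from pyproj's WGS84.
    """
    lat1, lon1, lat2, lon2 = map(
        math.radians, (start_lat, start_lon, end_lat, end_lon))

    # Angular distance between the endpoints (haversine formula).
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    d = 2 * math.asin(math.sqrt(min(1.0, h)))
    if d == 0:
        # Start and end coincide, so there is nothing to interpolate.
        return [(start_lat, start_lon), (end_lat, end_lon)]

    points = []
    for i in range(n_segments + 1):
        f = i / n_segments  # fraction of the way along the path
        a = math.sin((1 - f) * d) / math.sin(d)
        b = math.sin(f * d) / math.sin(d)
        # Blend the two endpoints as 3-D unit vectors, then convert back.
        x = a * math.cos(lat1) * math.cos(lon1) + b * math.cos(lat2) * math.cos(lon2)
        y = a * math.cos(lat1) * math.sin(lon1) + b * math.cos(lat2) * math.sin(lon2)
        z = a * math.sin(lat1) + b * math.sin(lat2)
        lat = math.atan2(z, math.hypot(x, y))
        lon = math.atan2(y, x)
        points.append((math.degrees(lat), math.degrees(lon)))
    return points

# Chicago O'Hare to London Heathrow (approximate coordinates).
chicago_london = great_circle_points(41.98, -87.90, 51.47, -0.45, n_segments=20)
print(max(lat for lat, lon in chicago_london))
```

The printed value is the northernmost latitude along the path.  It is higher than the latitude of either airport, which is the northward bulge you see when a great circle is drawn on a flat map.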


In [15]:
# any time we write a function, we should test that it works
p = getGreatCirclePoints(38.034, -84.500, 33.636700, -84.428101) 
p

[(38.034, -84.5),
 (37.864933929949096, -84.49708149511396),
 (37.695862920583586, -84.49417629534568),
 (37.52678697986378, -84.49128425988357),
 (37.35770611591111, -84.48840524964173),
 (37.18862033700805, -84.48553912723197),
 (37.019529651598035, -84.48268575693639),
 (36.85043406828536, -84.47984500468044),
 (36.68133359583508, -84.4770167380065),
 (36.51222824317288, -84.474200826048),
 (36.343118019385024, -84.47139713950395),
 (36.17400293371815, -84.46860555061399),
 (36.00488299557918, -84.46582593313389),
 (35.83575821453518, -84.46305816231151),
 (35.66662860031317, -84.46030211486323),
 (35.49749416279997, -84.45755766895067),
 (35.328354912042094, -84.45482470415813),
 (35.15921085824549, -84.45210310147009),
 (34.99006201177538, -84.44939274324938),
 (34.82090838315613, -84.44669351321562),
 (34.65174998307092, -84.44400529642411),
 (34.48258682236167, -84.44132797924507),
 (34.3134189120287, -84.43866144934323),
 (34.14424626323062, -84.43600559565783),
 (33.9750688872

In [16]:
# create a basic map, centered on Lexington
lex_air = folium.Map(
    location=[38.034,-84.500],
    tiles='Stamen Toner',
    zoom_start=4
)

In [26]:
# define the map in the same way, but use great circles for the lines

# Define some empty sets
airport_set = set()
route_set = set()

# Make sure we don't add duplicates, especially for the origins
for name, row in lex_routes.iterrows():
    
    if row['source'] not in airport_set: 
        popup_string = row['city_source'] + ' (' + row['source'] + ')'
        marker = folium.CircleMarker([row["latitude_source"], row["longitude_source"]], 
                                     color='DarkCyan',
                                     fill_color='DarkCyan', 
                                     radius=5, popup=popup_string)
        marker.add_to(lex_air)
        airport_set.add(row['source'])
        
    if row['dest'] not in airport_set: 
        popup_string = row['city_dest'] + ' (' + row['dest'] + ')'
        marker = folium.CircleMarker([row["latitude_dest"], row["longitude_dest"]], 
                                     color='MidnightBlue',
                                     fill_color='MidnightBlue', 
                                     radius=5, popup=popup_string)
        marker.add_to(lex_air)
        airport_set.add(row['dest'])
    
    # PolyLine will accept a whole list of tuples, not just two
    if (row['source'],row['dest']) not in route_set:            
        popup_string = row['source'] + '-' + row['dest']       
        
        gc_points = getGreatCirclePoints(row["latitude_source"], 
                                         row["longitude_source"], 
                                         row["latitude_dest"], 
                                         row["longitude_dest"])
        
        line = folium.PolyLine(gc_points, weight=2, popup=popup_string)
        line.add_to(lex_air)
        route_set.add((row['source'],row['dest']))
        
lex_air   


In [17]:
# save it to its own file
lex_air.save("lex_air.html")

### Your turn

The above map shows everywhere you can get to from Lexington on a direct flight.  Your job is to:

1. Make a map of all the possible destinations with one transfer. 
2. Make a map of all the possible destinations with two transfers. 

Make the maps look nice!  Use color coding, vary the size of the features, or be selective about what you display in order to communicate the information effectively.  

Bonus: This is the air travel version of the Kevin Bacon game (https://oracleofbacon.org/).  What is the number N, such that you can reach every airport in the world with N or fewer transfers?  

Extra Bonus: Use this very important piece of knowledge to impress your friends at parties!
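
One way to think about the bonus question is as a breadth-first search over the route network.  The sketch below is a hint rather than a solution: it runs on a tiny made-up route list, and the `transfers_from` helper is hypothetical.  With the real data you would pass in the (source, dest) pairs from the routes table instead.

```python
from collections import deque

def transfers_from(origin, direct_routes):
    """Breadth-first search: map each reachable airport to its transfer count.

    direct_routes is an iterable of (source, dest) pairs.  A direct flight
    counts as 0 transfers, one connection as 1 transfer, and so on.
    """
    # Build a lookup from each airport to the airports it flies to.
    neighbors = {}
    for source, dest in direct_routes:
        neighbors.setdefault(source, set()).add(dest)

    # flights_needed[airport] is the fewest flights required to get there.
    flights_needed = {origin: 0}
    queue = deque([origin])
    while queue:
        airport = queue.popleft()
        for nxt in neighbors.get(airport, ()):
            if nxt not in flights_needed:
                flights_needed[nxt] = flights_needed[airport] + 1
                queue.append(nxt)

    # Transfers are one fewer than flights; leave out the origin itself.
    return {a: n - 1 for a, n in flights_needed.items() if a != origin}

# A made-up toy network, not the real OpenFlights routes.
toy_routes = [("LEX", "ATL"), ("LEX", "ORD"), ("ATL", "MCN"),
              ("ORD", "LHR"), ("LHR", "DEL"), ("MCN", "ATL")]
result = transfers_from("LEX", toy_routes)
print(result)
```

The largest value in the result is the N from the bonus question for this toy network.  Airports that never appear in the result cannot be reached from the origin at all.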

In [7]:
# Select the LEX routes, then join the source airports
lex_routes = routes[(routes['source']=="LEX")]
lex_routes = pd.merge(lex_routes, airports, left_on='source_id', right_on='id', how='left')

In [9]:
lex_routes = pd.merge(lex_routes, airports, 
                      left_on='dest_id', 
                      right_on='id', 
                      how='left', 
                      suffixes=['_source','_dest'])

In [12]:
lex_routes = lex_routes.drop(['source_y','source'], axis=1)
lex_routes = lex_routes.rename(columns={'source_x': 'source'})

In [16]:
lex_routes = lex_routes.drop_duplicates(subset=['source', 'dest'])
lex_routes

Unnamed: 0,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment,id_source,name_source,city_source,country_source,iata_source,icao_source,latitude_source,longitude_source,altitude_source,timezone_source,dst_source,tz_source,type_source,id_dest,name_dest,city_dest,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport
1,AA,24.0,LEX,4017.0,CLT,3876.0,Y,0,CR7 CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3876,Charlotte Douglas International Airport,Charlotte,United States,CLT,KCLT,35.214001,-80.9431,748,-5.0,A,America/New_York,airport
2,AA,24.0,LEX,4017.0,DFW,3670.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3670,Dallas Fort Worth International Airport,Dallas-Fort Worth,United States,DFW,KDFW,32.896801,-97.038002,607,-6.0,A,America/Chicago,airport
3,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3830,Chicago O'Hare International Airport,Chicago,United States,ORD,KORD,41.9786,-87.9048,672,-6.0,A,America/Chicago,airport
6,DL,2009.0,LEX,4017.0,DCA,3520.0,Y,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3520,Ronald Reagan Washington National Airport,Washington,United States,DCA,KDCA,38.8521,-77.037697,15,-5.0,A,America/New_York,airport
7,DL,2009.0,LEX,4017.0,DTW,3645.0,Y,0,CR7 CRJ CR9,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3645,Detroit Metropolitan Wayne County Airport,Detroit,United States,DTW,KDTW,42.212399,-83.353401,645,-5.0,A,America/New_York,airport
8,DL,2009.0,LEX,4017.0,LGA,3697.0,,0,ERJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3697,La Guardia Airport,New York,United States,LGA,KLGA,40.777199,-73.872597,21,-5.0,A,America/New_York,airport
9,DL,2009.0,LEX,4017.0,MSP,3858.0,Y,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3858,Minneapolis-St Paul International/Wold-Chamber...,Minneapolis,United States,MSP,KMSP,44.882,-93.221802,841,-6.0,A,America/Chicago,airport
10,G4,35.0,LEX,4017.0,FLL,3533.0,,0,M80,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3533,Fort Lauderdale Hollywood International Airport,Fort Lauderdale,United States,FLL,KFLL,26.072599,-80.152702,9,-5.0,A,America/New_York,airport
11,G4,35.0,LEX,4017.0,PGD,7056.0,,0,M80,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,7056,Charlotte County Airport,Punta Gorda,United States,PGD,KPGD,26.9202,-81.990501,26,-5.0,A,America/New_York,airport


In [32]:
# Show up to 60 columns so the wide merged tables below are not truncated.
pd.options.display.max_columns = 60

In [17]:
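# Join each route out of LEX to every route that departs from its destination,
# giving all itineraries that involve one transfer.
one_transfer = pd.merge(lex_routes, routes,
                        left_on='dest',
                        right_on='source',
                        how='inner',
                        suffixes=('_lex', '_trnsfr'))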

In [18]:
one_transfer.head()

Unnamed: 0,airline_lex,airline_id_lex,source_lex,source_id_lex,dest_lex,dest_id_lex,codeshare_lex,stops_lex,equipment_lex,id_source,name_source,city_source,country_source,iata_source,icao_source,latitude_source,longitude_source,altitude_source,timezone_source,dst_source,tz_source,type_source,id_dest,name_dest,city_dest,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest,airline_trnsfr,airline_id_trnsfr,source_trnsfr,source_id_trnsfr,dest_trnsfr,dest_id_trnsfr,codeshare_trnsfr,stops_trnsfr,equipment_trnsfr
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3
1,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3
2,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MEI,4335.0,,0,SF3
3,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MSL,5756.0,,0,SF3
4,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,PIB,5759.0,,0,SF3
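
The join above can be hard to follow on a table this wide, so here is a minimal sketch of the same pattern on a tiny, made-up route table. The names `toy_routes`, `toy_lex`, and `toy_one_transfer` and the routes themselves are hypothetical and are not part of the lesson's data; only the `pd.merge` arguments mirror the cell above.

```python
import pandas as pd

# Hypothetical toy route table; these routes are illustrative only.
toy_routes = pd.DataFrame({
    "source": ["LEX", "LEX", "ATL", "ATL", "ORD"],
    "dest":   ["ATL", "ORD", "MCO", "LEX", "SEA"],
})

# First legs: routes that leave LEX.
toy_lex = toy_routes[toy_routes["source"] == "LEX"]

# Match each first leg's destination against the source of every route.
# Both tables have 'source' and 'dest' columns, so the suffixes tell the
# first leg (_lex) apart from the second leg (_trnsfr).
toy_one_transfer = pd.merge(
    toy_lex, toy_routes,
    left_on="dest", right_on="source",
    how="inner",
    suffixes=("_lex", "_trnsfr"),
)
print(toy_one_transfer)
```

Each row of the result is one itinerary: `source_lex` to `dest_lex`, then `source_trnsfr` to `dest_trnsfr`. Because the join condition forces `dest_lex` and `source_trnsfr` to be equal, those two columns always hold the same airport.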


In [19]:
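# Several airlines can fly the same leg; keep one row per
# (transfer airport, final destination) pair.
one_transfer = one_transfer.drop_duplicates(subset=['source_trnsfr', 'dest_trnsfr'])
one_transfer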

Unnamed: 0,airline_lex,airline_id_lex,source_lex,source_id_lex,dest_lex,dest_id_lex,codeshare_lex,stops_lex,equipment_lex,id_source,name_source,city_source,country_source,iata_source,icao_source,latitude_source,longitude_source,altitude_source,timezone_source,dst_source,tz_source,type_source,id_dest,name_dest,city_dest,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest,airline_trnsfr,airline_id_trnsfr,source_trnsfr,source_id_trnsfr,dest_trnsfr,dest_id_trnsfr,codeshare_trnsfr,stops_trnsfr,equipment_trnsfr
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3
1,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3
2,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MEI,4335.0,,0,SF3
3,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MSL,5756.0,,0,SF3
4,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,PIB,5759.0,,0,SF3
5,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,TUP,5773.0,,0,SF3
6,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,AZO,4039.0,,0,CRJ
7,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,CHA,3578.0,,0,CRJ
8,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,CID,4043.0,,0,CRJ
9,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,CRW,4285.0,,0,CRJ


In [29]:
# Extend each one-transfer itinerary with a further leg that departs
# from its last airport.
two_transfer = pd.merge(one_transfer, routes,
                        left_on='dest_trnsfr',
                        right_on='source',
                        how='inner')

In [33]:
two_transfer.head()

Unnamed: 0,airline_lex,airline_id_lex,source_lex,source_id_lex,dest_lex,dest_id_lex,codeshare_lex,stops_lex,equipment_lex,id_source,name_source,city_source,country_source,iata_source,icao_source,latitude_source,longitude_source,altitude_source,timezone_source,dst_source,tz_source,type_source,id_dest,name_dest,city_dest,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest,airline_trnsfr,airline_id_trnsfr,source_trnsfr,source_id_trnsfr,dest_trnsfr,dest_id_trnsfr,codeshare_trnsfr,stops_trnsfr,equipment_trnsfr,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3,3M,20710.0,LWB,6958.0,ATL,3682.0,,0,SF3
1,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3,UA,5209.0,LWB,6958.0,IAD,3714.0,Y,0,SF3
2,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3,3M,20710.0,MCN,3754.0,ATL,3682.0,,0,SF3
3,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3,3M,20710.0,MCN,3754.0,MCO,3878.0,,0,SF3
4,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.6367,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MEI,4335.0,,0,SF3,3M,20710.0,MEI,4335.0,ATL,3682.0,,0,SF3


In [35]:
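# Keep one row per (first transfer airport, final destination) pair,
# regardless of which airport the second transfer goes through.
two_transfer = two_transfer.drop_duplicates(subset=['source_trnsfr', 'dest'])
two_transfer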

Unnamed: 0,airline_lex,airline_id_lex,source_lex,source_id_lex,dest_lex,dest_id_lex,codeshare_lex,stops_lex,equipment_lex,id_source,name_source,city_source,country_source,iata_source,icao_source,latitude_source,longitude_source,altitude_source,timezone_source,dst_source,tz_source,type_source,id_dest,name_dest,city_dest,country_dest,iata_dest,icao_dest,latitude_dest,longitude_dest,altitude_dest,timezone_dest,dst_dest,tz_dest,type_dest,airline_trnsfr,airline_id_trnsfr,source_trnsfr,source_id_trnsfr,dest_trnsfr,dest_id_trnsfr,codeshare_trnsfr,stops_trnsfr,equipment_trnsfr,airline,airline_id,source,source_id,dest,dest_id,codeshare,stops,equipment
0,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3,3M,20710.0,LWB,6958.0,ATL,3682.0,,0,SF3
1,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,LWB,6958.0,,0,SF3,UA,5209.0,LWB,6958.0,IAD,3714.0,Y,0,SF3
3,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,MCN,3754.0,,0,SF3,3M,20710.0,MCN,3754.0,MCO,3878.0,,0,SF3
8,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,3M,20710.0,ATL,3682.0,TUP,5773.0,,0,SF3,3M,20710.0,TUP,5773.0,GLH,6130.0,,0,SF3
10,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,AZO,4039.0,,0,CRJ,AA,24.0,AZO,4039.0,ORD,3830.0,Y,0,ER4 ERD
12,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,AZO,4039.0,,0,CRJ,DL,2009.0,AZO,4039.0,DTW,3645.0,,0,ERJ
13,9E,3976.0,LEX,4017.0,ATL,3682.0,,0,CRJ,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3682,Hartsfield Jackson Atlanta International Airport,Atlanta,United States,ATL,KATL,33.636700,-84.428101,1026,-5.0,A,America/New_York,airport,9E,3976.0,ATL,3682.0,AZO,4039.0,,0,CRJ,DL,2009.0,AZO,4039.0,MSP,3858.0,Y,0,CRJ
15,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3830,Chicago O'Hare International Airport,Chicago,United States,ORD,KORD,41.978600,-87.904800,672,-6.0,A,America/Chicago,airport,AA,24.0,ORD,3830.0,AZO,4039.0,Y,0,ER4 ERD,9E,3976.0,AZO,4039.0,ATL,3682.0,,0,CRJ
16,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3830,Chicago O'Hare International Airport,Chicago,United States,ORD,KORD,41.978600,-87.904800,672,-6.0,A,America/Chicago,airport,AA,24.0,ORD,3830.0,AZO,4039.0,Y,0,ER4 ERD,AA,24.0,AZO,4039.0,ORD,3830.0,Y,0,ER4 ERD
18,AA,24.0,LEX,4017.0,ORD,3830.0,Y,0,ERD ER4,4017,Blue Grass Airport,Lexington KY,United States,LEX,KLEX,38.036499,-84.605904,979,-5.0,A,America/New_York,airport,3830,Chicago O'Hare International Airport,Chicago,United States,ORD,KORD,41.978600,-87.904800,672,-6.0,A,America/Chicago,airport,AA,24.0,ORD,3830.0,AZO,4039.0,Y,0,ER4 ERD,DL,2009.0,AZO,4039.0,DTW,3645.0,,0,ERJ
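
The row labels in the de-duplicated table above skip values (0, 1, 3, 8, 10, ...). That is because `drop_duplicates` keeps the first row it sees for each combination of the `subset` columns and leaves the original index untouched. A minimal sketch on a made-up table, where `toy` and `toy_unique` are hypothetical names and the rows are illustrative only:

```python
import pandas as pd

# Hypothetical toy itineraries: two airlines fly the same ATL -> MCO leg.
toy = pd.DataFrame({
    "airline_trnsfr": ["DL", "WN", "DL"],
    "source_trnsfr":  ["ATL", "ATL", "ATL"],
    "dest_trnsfr":    ["MCO", "MCO", "LEX"],
})

# Rows count as duplicates when they agree on the subset columns only;
# the airline column is ignored for the comparison.
toy_unique = toy.drop_duplicates(subset=["source_trnsfr", "dest_trnsfr"])
print(toy_unique)

# The surviving rows keep their original labels (0 and 2 here).
# reset_index(drop=True) renumbers them if consecutive labels are wanted.
toy_renumbered = toy_unique.reset_index(drop=True)
```

Note that which airline survives depends only on row order, so the `airline` columns in the de-duplicated table should be read as one example carrier for that pair, not the only one.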
