### Your turn

The above map shows everywhere you can get to from Lexington on a direct flight.  Your job is to:

1. Make a map of all the possible destinations with one transfer. 
2. Make a map of all the possible destinations with two transfers. 

Make the maps look nice!  Use color coding, vary the size of the features, or be selective about what you display in order to communicate the information effectively.  

Bonus: This is the air travel version of the Kevin Bacon game.  What is the number N, such that you can reach every airport in the world with N or fewer transfers?  

Extra Bonus: Use this very important piece of knowledge to impress your friends at parties!

In [None]:
import folium
import pandas as pd
import numpy as np

In [None]:
# These files use \N as the missing-value marker, so we pass it to read_csv via na_values.
# The backslash is doubled because a bare '\N' in a Python string literal starts a
# named-Unicode escape (\N{...}) and would not parse; '\\N' is a literal backslash plus N.

# Read in the airports data.
airports = pd.read_csv("data/airports.dat", header=None, na_values='\\N')
airports.columns = ["id", "name", "city", "country", "iata", "icao", "latitude", "longitude", "altitude","timezone", "dst", "tz", "type", "source"]

# Read in the airlines data.
airlines = pd.read_csv("data/airlines.dat", header=None, na_values='\\N')
airlines.columns = ["id", "name", "alias", "iata", "icao", "callsign", "country", "active"]

# Read in the routes data.
routes = pd.read_csv("data/routes.dat", header=None, na_values='\\N')
routes.columns = ["airline", "airline_id", "source", "source_id", "dest", "dest_id", "codeshare", "stops", "equipment"]
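
To see why the doubled backslash matters, here is a small sketch that uses only the standard library. The sample rows are made up for illustration, and `parse_rows` is a hypothetical helper (not part of pandas) that roughly mimics what `na_values` does.

```python
import csv
import io

# '\\N' is two characters: a backslash followed by N. A raw string spells the same thing.
MISSING = '\\N'
assert MISSING == r'\N' and len(MISSING) == 2

# Made-up sample rows; the second airport has no IATA code.
SAMPLE = 'id,name,iata\n1,Blue Grass,LEX\n2,Nameless Strip,\\N\n'

def parse_rows(text, missing=MISSING):
    """Read CSV text and replace the missing-value marker with None."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: (None if value == missing else value) for key, value in row.items()}
            for row in reader]

rows = parse_rows(SAMPLE)
print(rows[1]['iata'])
```

With pandas, `na_values='\\N'` and `na_values=r'\N'` should be interchangeable, since both spell the same two-character string.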

In [None]:
# one solution

print('There are a total of ' + str(len(airports)) + ' airports in the world.')

connected_airports = {'LEX'}
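# each pass adds one more flight leg, so after pass i we have every airport
# reachable with i or fewer transfers (pass 0 = direct flights)
for i in range(0,15): 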
    dest_airports = set()
    
    for index, row in routes.iterrows(): 
        if row['source'] in connected_airports: 
            dest_airports.add(row['dest'])
            
    # this creates the union of the sets
    connected_airports = connected_airports | dest_airports
    
    print('Within ' + str(i) + ' transfers we can reach a total of ' + str(len(connected_airports)) + ' airports.')
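
The loop above rescans every route on each pass. As an alternative sketch (not the notebook's method), a breadth-first search records the minimum number of transfers for each airport in one traversal, which also answers the bonus question directly. `transfers_from` and the toy route list are hypothetical and invented for illustration; only the standard library is used.

```python
from collections import defaultdict, deque

def transfers_from(origin, route_pairs):
    """Return {airport: minimum number of transfers} for airports reachable from origin."""
    # Adjacency list: source airport -> set of direct destinations.
    graph = defaultdict(set)
    for source, dest in route_pairs:
        graph[source].add(dest)

    # Start at -1 so that airports one flight away come out as 0 transfers.
    transfers = {origin: -1}
    queue = deque([origin])
    while queue:
        airport = queue.popleft()
        for nxt in graph.get(airport, ()):
            if nxt not in transfers:
                transfers[nxt] = transfers[airport] + 1
                queue.append(nxt)

    del transfers[origin]  # the origin is not a destination
    return transfers

# Toy routes, invented for illustration only.
toy_routes = [('LEX', 'ATL'), ('LEX', 'ORD'), ('ATL', 'LHR'), ('LHR', 'NBO'), ('ORD', 'ATL')]
result = transfers_from('LEX', toy_routes)
print(result)
print('N =', max(result.values()))
```

On the real data, something like `transfers_from('LEX', zip(routes['source'], routes['dest']))` should work, assuming the `routes` frame loaded earlier. The resulting counts could also drive the color coding on the maps.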


In [None]:
# seems like we max out at 3378 airports.  Let's check to see if that's right. 
# the apply function is a shortcut to avoid looping
# to see what's in there, dump the data to a csv and have a look

airports['connected'] = airports['iata'].apply(lambda x: x in connected_airports)
airports.to_csv('airports.csv')

In [None]:
# it looks like there are a bunch that don't have an airport code, so drop those, and 
# see how many valid airports are left

airports = airports.dropna(subset=['iata'])
airports.to_csv('airports.csv')
len(airports)

In [None]:
# have a look at the unconnected airports.  Have you heard of any of those?  
# maybe they don't have any commercial flights going there.  
# create a separate set of all airports that have a route going there. 

airports_with_flights = set()

for index, row in routes.iterrows(): 
    airports_with_flights.add(row['source'])
    airports_with_flights.add(row['dest'])
        
print('There are a total of ' + str(len(airports_with_flights)) + ' airports with scheduled flights.')


In [None]:
# there are still a few we can't reach.  Drop those without flights, and look at those that are still detached

airports['has_flights'] = airports['iata'].apply(lambda x: x in airports_with_flights)
airports = airports[airports['has_flights']]
airports.to_csv('airports.csv')

In [None]:
# these are our leftovers.  Boeing Field makes sense.  Not sure why Melbourne is on there.  Don't know the others. 
# you can dig into these more to see what is going on.  Maybe the other end of the route is not a valid airport

airports[~airports['connected']]
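
One possible way to dig further, sketched under assumptions: an airport that appears in the routes only as a source, never as a destination, cannot be reached from anywhere. The helper `inbound_counts` and the airport codes other than LEX are hypothetical, made up for illustration, and say nothing about the real leftovers.

```python
def inbound_counts(route_pairs):
    """Count how many routes arrive at each airport that appears in the routes."""
    counts = {}
    for source, dest in route_pairs:
        counts.setdefault(source, 0)            # make sure sources are listed too
        counts[dest] = counts.get(dest, 0) + 1  # one more route arriving at dest
    return counts

# Toy routes with invented codes: QQ1 only has a departing route.
toy = [('LEX', 'QQ2'), ('QQ1', 'QQ2'), ('QQ2', 'LEX')]
no_inbound = sorted(code for code, n in inbound_counts(toy).items() if n == 0)
print(no_inbound)  # ['QQ1']
```

Having no inbound routes is enough to make an airport unreachable, but it is not the only cause: a small group of airports that only connect to each other would also stay detached.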