---

# 3 Finding Best Routes (Q3)
Whenever you plan to fly to a specific city, your goal is to find the most efficient and fastest flight to reach your destination. In the system you are designing, the best route is defined as the one that minimizes the total distance flown to the greatest extent possible.
> In this task, you need to implement a function that, given an origin and destination city, determines the best possible route between them. To simplify, the focus will be limited to flights operating on a specific day.

Note: Each city may have multiple airports; in such cases, the function should calculate the best route for every possible airport pair between the two cities. For example, if city A has airports a1, a2, and city B has b1, b2, the function should compute the best routes for a1 → b1, a1 → b2, a2 → b1 and a2 → b2. If it's not possible to travel from one airport in the origin city to another airport in the destination city on that date, you must report it as well.

The function takes the following inputs:
1. Flights network
2. Origin city name
3. Destination city name
4. Considered Date (in yyyy-mm-dd format)

The function output:
1. A table with three columns: 'Origin_city_airport', 'Destination_city_airport', and 'Best_route'.

Note: In the "Best_route" column, we expect a list of airport names connected by →, showing the order in which they are visited along the optimal route. If no such route exists, the entry should display "No route found."

---

In [1]:
import modules.shortest_path
import pandas as pd
import numpy as np
df = pd.read_csv('Airports2.csv')
df = df.drop_duplicates(keep='last') # Drop duplicates, keeping the last entry

Having loaded the data and removed duplicate rows (keeping the last occurrence of each), we select only the columns relevant to Question 3, namely 'Origin_airport', 'Destination_airport', 'Fly_date', 'Origin_city', 'Destination_city', and 'Distance', and store them in a new dataframe, **working_df**.

In [3]:
df.head()

Unnamed: 0,Origin_airport,Destination_airport,Origin_city,Destination_city,Passengers,Seats,Flights,Distance,Fly_date,Origin_population,Destination_population,Org_airport_lat,Org_airport_long,Dest_airport_lat,Dest_airport_long
0,MHK,AMW,"Manhattan, KS","Ames, IA",21,30,1,254,2008-10-01,122049,86219,39.140999,-96.670799,,
1,EUG,RDM,"Eugene, OR","Bend, OR",41,396,22,103,1990-11-01,284093,76034,44.124599,-123.211998,44.254101,-121.150002
2,EUG,RDM,"Eugene, OR","Bend, OR",88,342,19,103,1990-12-01,284093,76034,44.124599,-123.211998,44.254101,-121.150002
3,EUG,RDM,"Eugene, OR","Bend, OR",11,72,4,103,1990-10-01,284093,76034,44.124599,-123.211998,44.254101,-121.150002
4,MFR,RDM,"Medford, OR","Bend, OR",0,18,1,156,1990-02-01,147300,76034,42.374199,-122.873001,44.254101,-121.150002


In [2]:
# I need origin_airport, destination_airport, fly_date, origin_city, destination_city, and distance
working_df = df[['Origin_airport', 'Destination_airport', 'Fly_date', 'Origin_city', 'Destination_city', 'Distance']].copy()

Performing some sanity checks below:

In [5]:
working_df.isna().sum()

Origin_airport         0
Destination_airport    0
Fly_date               0
Origin_city            0
Destination_city       0
Distance               0
dtype: int64

In [6]:
working_df.dtypes

Origin_airport         object
Destination_airport    object
Fly_date               object
Origin_city            object
Destination_city       object
Distance                int64
dtype: object

In [7]:
# Single quotes are used inside the f-strings so the code also runs on Python versions before 3.12
print(f"Any empty values in 'Origin_airport' column: {(working_df.Origin_airport == '').sum()}") # count empty strings in Origin_airport
print(f"Any empty values in 'Destination_airport' column: {(working_df.Destination_airport == '').sum()}") # count empty strings in Destination_airport
print(f"Any empty values in 'Fly_date' column: {(working_df.Fly_date == '').sum()}") # count empty strings in Fly_date
print(f"Any empty values in 'Origin_city' column: {(working_df.Origin_city == '').sum()}") # count empty strings in Origin_city
print(f"Any empty values in 'Destination_city' column: {(working_df.Destination_city == '').sum()}") # count empty strings in Destination_city
print(f"Any NA value in 'Distance' column: {working_df.Distance.isna().sum()}") # count missing values in Distance

Any empty values in 'Origin_airport' column: 0
Any empty values in 'Destination_airport' column: 0
Any empty values in 'Fly_date' column: 0
Any empty values in 'Origin_city' column: 0
Any empty values in 'Destination_city' column: 0
Any NA value in 'Distance' column: 0


In [3]:
working_df.Origin_airport = working_df.Origin_airport.str.strip()
working_df.Destination_airport = working_df.Destination_airport.str.strip()
working_df.Fly_date = working_df.Fly_date.str.strip()
working_df.Origin_city = working_df.Origin_city.str.strip()
working_df.Destination_city = working_df.Destination_city.str.strip()

Dijkstra's algorithm, which we will use to find the shortest path by distance, requires non-negative edge weights (distances). We therefore check that no distance in the dataframe is negative; if any were, perhaps due to data errors, we would remove those records.

In [4]:
working_df.Distance = working_df.Distance.astype(int)
print(f"Is there any record of flight with negative distance? Answer: {'Yes' if np.any(working_df.Distance < 0) else 'No'}")

Is there any record of flight with negative distance? Answer: No


We now convert the dataframe into a __network__ of flights using NetworkX. The `create_flight_network` function, defined in the __shortest_path__ module, takes a dataframe and returns a directed graph (`nx.DiGraph`) whose nodes are airports and whose edges are flights, with the city stored as a node attribute and the distance and date stored as edge attributes.

In [6]:
help(modules.shortest_path.create_flight_network)
flight_network = modules.shortest_path.create_flight_network(working_df)

Help on function create_flight_network in module modules.shortest_path:

create_flight_network(working_df)
    Input:
    working_df: pd.DataFrame, the working dataframe
    
    Output:
    G: nx.DiGraph, the flight network
    
    About:
    This function creates a directed graph using the networkx library.
    The nodes of the graph are the airports and the edges are the flights between the airports.
    The graph has the following attributes:
    - Node attributes: city
    - Edge attributes: distance, date



100%|██████████| 3565050/3565050 [00:05<00:00, 603082.35it/s]
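
As a rough illustration of what such a constructor might do, the sketch below builds a plain-dictionary stand-in for the directed graph from a few rows of the table shown earlier. It is not the module's implementation: the helper name `build_network_sketch`, the tuple layout of the rows, and the choice to store every date on which a leg was flown are assumptions made for this example.

```python
from collections import defaultdict

def build_network_sketch(rows):
    """Build a dictionary-based directed flight network (illustrative only).

    rows: iterable of (origin_airport, destination_airport, origin_city,
          destination_city, distance, fly_date) tuples.
    Returns (cities, edges): cities maps airport -> city, and
    edges maps origin -> {destination: {"distance": int, "dates": set}}.
    """
    cities = {}
    edges = defaultdict(dict)
    for origin, dest, origin_city, dest_city, distance, fly_date in rows:
        # Node attribute: the city each airport belongs to
        cities[origin] = origin_city
        cities[dest] = dest_city
        # Edge attributes: distance, plus every date the leg was flown (assumption)
        edge = edges[origin].setdefault(dest, {"distance": distance, "dates": set()})
        edge["distance"] = min(edge["distance"], distance)
        edge["dates"].add(fly_date)
    return cities, dict(edges)

# Rows copied from the df.head() output above
rows = [
    ("EUG", "RDM", "Eugene, OR", "Bend, OR", 103, "1990-11-01"),
    ("EUG", "RDM", "Eugene, OR", "Bend, OR", 103, "1990-12-01"),
    ("MFR", "RDM", "Medford, OR", "Bend, OR", 156, "1990-02-01"),
]
cities, edges = build_network_sketch(rows)
print(sorted(edges["EUG"]["RDM"]["dates"]))  # ['1990-11-01', '1990-12-01']
```

Keeping a set of dates per edge is only one possible design; a graph that stores a single date per edge would retain just one of the two EUG to RDM dates above.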


In [11]:
import zipfile
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import re
from collections import defaultdict
import folium
import requests
import time
from modules.graph import *
from modules.utils import *
from modules.shortest_path import *
from IPython.display import display
import random
import heapq
from collections import deque
import math

In [26]:
def degree_centrality(G, node):
    '''
    Function that calculates the normalized degree centrality of a node:
    its degree (in-degree plus out-degree) divided by the number of other nodes
    Inputs:
    - G (nx.DiGraph): graph of the flight network
    - node (str): node of the airport
    Outputs:
    - float: degree centrality of the node, i.e. degree / (n - 1)
    '''
    return G.degree(node) / (G.number_of_nodes()-1)

def closeness_centrality(G, node):
    '''
    Function that calculates the closeness centrality of a node
    Inputs:
    - G (nx.DiGraph): graph of the flight network
    - node (str): node of the airport
    Outputs:
    - float: closeness centrality of the node
    '''
    distances, _ = deploy_dijkstra(G, node) # compute distances to each other node
    reachable = [x for x in distances.values() if x < float('inf')] # keep only reachable nodes
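    if len(reachable) <=1: # if no other node is reachable, return 0
        return 0
    r = len(reachable) - 1 # number of reachable nodes, excluding the source itself
    # Closeness over the reachable nodes, scaled by the reachable fraction (Wasserman-Faust) so that poorly connected nodes are not inflated
    return (r / sum(reachable)) * (r / (G.number_of_nodes()-1))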

def betweenness_centrality(G, node=None):
    '''
    Function that calculates the betweenness centrality of a node
    Inputs:
    - G (nx.DiGraph): graph of the flight network
    - node (str): node of the airport
    Outputs:
    - float: betweenness centrality of the node if given, otherwise a dictionary of all betweenness centralities
    '''
    C_B = defaultdict(float) # initialize default dictionaries of betwenness centrality scores

    # Iterate over source nodes s
    for s in G.nodes:

        # Initializations
        distances = {x: float('inf') for x in G.nodes} # intialize distances to s (lengths of shortest paths)
        distances[s] = 0 # set distance of s to itself to 0
        predecessors = defaultdict(list) # initialize defaultdict of predecessors
        pq = [(0,s)] # initialize priority queue
        num_shortest_paths = defaultdict(int) # initialize dictionary of numbers of shortest paths
        num_shortest_paths[s] = 1 # number of shortest paths of source is 1
        backtrack_stack = deque() # initialize empty stack for backtracking

        # Convert priority queue into a heap
        heapq.heapify(pq)
        
        #print(f'Computing Dijkstra for airport {s} now.')

        # Dijkstra loop
        while len(pq)>0:
            # Extract tuple (distance, node) with smallest distance from the heap
            current_distance, current_node = heapq.heappop(pq)
            # Skip stale heap entries (a shorter distance was already recorded for this node)
            if current_distance > distances[current_node]:
                continue
            # Add current_node to backtrack_stack exactly once, in order of finalization
            backtrack_stack.append(current_node)

            # Iterate over the neighbors of current_node
            for neighbor in G.successors(current_node):
                
                #print(f'Analyzing neighbor {neighbor} of {current_node}')

                # Compute the distance do the neighbor using the 'distance' attribute
                edge_weight = G[current_node][neighbor]['distance']
                # Compute new distance
                new_distance = current_distance + edge_weight
                # See if the new distance is smaller than the previous distance to the neighbor
                if distances[neighbor] > new_distance:
                    # Update distance of this neighbor to the source s
                    distances[neighbor] = new_distance
                    # Update number of shortest paths from s to neighbor
                    num_shortest_paths[neighbor] = num_shortest_paths[current_node]
                    # Update priority queue, pushing the neighbor with the new distance
                    heapq.heappush(pq, (new_distance, neighbor))
                    # Update predecessors
                    predecessors[neighbor] = [current_node]
                # See if the new distance is equally long as the previous one
                elif distances[neighbor] == new_distance:
                    # Update number of shortest paths from source to the neighbor
                    num_shortest_paths[neighbor] += num_shortest_paths[current_node]
                    # Update predecessors
                    predecessors[neighbor].append(current_node)
            
        # Initialize dictionary of dependencies
        dependencies = defaultdict(float)

        #print(f'Backtracking for airport {s} now.')
        
        # Backpropagation through the graph to find dependencies
        while backtrack_stack:
            # Take last element from the stack
            current_node = backtrack_stack.pop()
            # Iterate over the predecessors of the current node
            for predecessor in predecessors[current_node]:
                # Update dependency of source s on predecessor
                dependencies[predecessor] += (num_shortest_paths[predecessor] / num_shortest_paths[current_node]) * (1 + dependencies[current_node])
            # If the current node is not the source node
            if current_node != s:
                # Update betweenness centrality
                C_B[current_node] += dependencies[current_node]

    # Compute normalization
    n = G.number_of_nodes()
    normalization = (n-1) * (n-2)

    # Normalize betweenness centralities
    for key in C_B.keys():
        C_B[key] /= normalization

    # Return betweenness centrality of the target node if one was given
    if node is not None:
        return C_B[node]
    return C_B
    

def pagerank(G, node, a=0.5, seed=42, T=10000):
    '''
    Function that estimates the PageRank of a node with a random walk
    Inputs:
    - G (nx.DiGraph): graph of the flight network
    - node (str): starting node
    - a (float): probability in [0,1] of jumping to a uniformly random node
    - seed (int): random seed
    - T (int): number of steps
    Outputs:
    - float: estimated PageRank of the node (fraction of steps spent at it)
    '''
    random.seed(seed) # set random seed
    t = 1
    current = node # starting node
    freq = defaultdict(int) # initialize defaultdict to measure the frequency with which a node is seen

    # Random walk through the airport graph
    while t<T+1:
        coin_flip = random.choices([0, 1], weights=[1-a, a], k=1)[0] # flip a coin

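        # If outcome is 0, go to a random out-neighbor of the current node
        if coin_flip==0:
            out_neighbors = list(G.successors(current)) # continue from the current node, not the start node
            # At a dead end (no out-neighbors), fall back to a uniformly random node
            current = random.choice(out_neighbors) if out_neighbors else random.choice(list(G.nodes))
        
        # If outcome is 1, go to a random node
        if coin_flip==1:
            current = random.choice(list(G.nodes))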
        
        freq[current] +=1
        t += 1
    
    # Set the Pagerank
    Pagerank = defaultdict(int)
    for u in G.nodes:
        Pagerank[u] = freq[u] / T

    return Pagerank[node]

In [124]:
from heapq import heappop, heappush  # Import heap functions for priority queue operations
from itertools import count  # Import count to generate unique sequence numbers

def betweenness_centrality(G):
    """
    Calculate the betweenness centrality for a directed, weighted graph G.

    Parameters:
        G (nx.DiGraph): A directed graph whose edges carry a 'distance' attribute.

    Returns:
        dict: A dictionary with nodes as keys and their betweenness centrality as values.
    """

    # Initialize betweenness centrality for each node to 0.0
    centrality = dict.fromkeys(G, 0.0)
    all_nodes = G  # Extract all nodes from the G

    # Iterate over all nodes in the G to calculate their contributions to centrality
    for start_node in all_nodes:

        # Dijkstra's algorithm setup
        visited_stack = []  # Stack to keep track of the nodes visited in order
        predecessors = {}  # Dictionary to store predecessors of each node
        for node in G:
            predecessors[node] = []

        path_count = dict.fromkeys(G, 0.0)  # Initialize path counts for each node
        shortest_distances = {}  # Dictionary to store shortest distances from the start node
        path_count[start_node] = 1.0  # There's one path to the start node itself

        # Priority queue for nodes to be explored; stores (distance, unique ID, predecessor, current node)
        push = heappush
        pop = heappop
        seen_distances = {start_node: 0}  # Dictionary to track the minimum distance seen for each node
        node_counter = count()  # Unique sequence numbers for heap operations
        priority_queue = []
        push(priority_queue, (0, next(node_counter), start_node, start_node))

        # Process the priority queue until empty
        while priority_queue:
            (current_distance, _, from_node, current_node) = pop(priority_queue)

            # Skip processing if the node is already finalized
            if current_node in shortest_distances:
                continue

            # Update the number of shortest paths to the current node
            path_count[current_node] += path_count[from_node]
            visited_stack.append(current_node)  # Add the current node to the stack

            # Finalize the shortest distance to the current node
            shortest_distances[current_node] = current_distance

            # Explore neighbors of the current node
            for neighbor, _ in G[current_node].items():
                distance_to_neighbor = current_distance + G.edges[current_node, neighbor]['distance']

                # If a shorter path to the neighbor is found, update the priority queue and path counts
                if neighbor not in shortest_distances and (neighbor not in seen_distances or distance_to_neighbor < seen_distances[neighbor]):
                    seen_distances[neighbor] = distance_to_neighbor
                    push(priority_queue, (distance_to_neighbor, next(node_counter), current_node, neighbor))
                    path_count[neighbor] = 0.0
                    predecessors[neighbor] = [current_node]

                # If another shortest path to the neighbor is found, update path counts and predecessors
                elif distance_to_neighbor == seen_distances[neighbor]:
                    path_count[neighbor] += path_count[current_node]
                    predecessors[neighbor].append(current_node)

        # Accumulate dependencies for betweenness centrality
        dependencies = dict.fromkeys(visited_stack, 0)  # Initialize dependency for each node in the stack
        while visited_stack:
            current_node = visited_stack.pop()  # Pop nodes in reverse order of finishing times
            coefficient = (1 + dependencies[current_node]) / path_count[current_node]
            for from_node in predecessors[current_node]:
                dependencies[from_node] += path_count[from_node] * coefficient

            # Accumulate betweenness centrality for nodes other than the start node
            if current_node != start_node:
                centrality[current_node] += dependencies[current_node]

    # Normalize the betweenness centrality values
    total_nodes = len(G)  # Total number of nodes in the G
    if total_nodes <= 2:
        scale_factor = None  # No normalization if there are less than 3 nodes
    else:
        scale_factor = 1 / ((total_nodes - 1) * (total_nodes - 2))

    if scale_factor is not None:
        for node in centrality:
            centrality[node] *= scale_factor

    return centrality


In [125]:
betweenness_centrality(flight_network)['LGA']

0.004842129824535282

In [51]:
nx.betweenness_centrality(flight_network,weight='distance')['LGA']

0.004842129824535282

Having prepared the flight network, we can get a brief look at the problem we are trying to solve: <br>

__Generic Problem__: Single-Source Shortest Paths.

__Input__: A directed graph $G = (V, E)$, a starting vertex $s \in V$, and a nonnegative length $l_e$ for each edge $e \in E$. <br>
__Output__: $dist(s,v)$ for every vertex $v \in V$. <br> <br>

---

<h4 style="text-align:center;">Pseudocode for the Algorithm Deployed: <i>(deploy_dijkstra in the shortest_path module)</i></h4>

$$\text{Dijkstra's Algorithm}$$

**Input:**
- A graph $ G = (V, E) $ in adjacency-list form. Each edge $ e = (u, v) $ must have a non-negative weight $ l_{u,v} $.
- A source node $ s \in V $.

**Output:**
- `distances[v]`: The shortest distance from $ s $ to every node $ v $.
- `predecessors[v]`: The predecessor of $ v $ in the shortest path, which we will use for path reconstruction.


__Steps:__

1. **Initialization:**
    > Create a `distances` dictionary: <br>
     - distances[v] = $+\infty$ for all $ v \in V $ (because we don't know the shortest distance).<br>
     - distances[s] = 0 (distance to the source is zero).<br>
    > Create a `predecessors` dictionary:<br>
     - predecessors[v] = None, for all $ v \in V $ (initially, we don't know the path leading to each $ v $, so we set each to None).<br>
    > Create a `priority_queue` (min-heap) and insert $(0, s)$, where the first value in the tuple is the initial distance and the second value is the name of the node. *The queue ensures that the node with the smallest known distance is processed first.*<br>

2. **Main Loop:** <br>
    While `priority_queue` is not empty: <br>
    > Extract the node with the smallest distance by popping the first tuple from the queue: `current_distance, current_node = heapq.heappop(priority_queue)`. <br>
    > If `current_distance > distances[current_node]`, skip this node (stale entry). <br>
    > For each neighbor $ v $ of `current_node`:<br>
       - Retrieve the edge weight (distance): $ l_{current\_node, v} $.<br>
       - Compute the *tentative* distance:<br>
         - `new_distance = current_distance + l_{current_node, v}`.<br>
       - If `new_distance < distances[v]`:<br>
         - Update `distances[v] = new_distance`.<br>
         - Update `predecessors[v] = current_node` so that we record which node precedes $ v $ on the shortest path found so far.<br>
         - Push `(new_distance, v)` into the `priority_queue` so that we can iteratively start from the next node with the smallest distance.<br>

3. **Returns:**
   - `distances`: The shortest distance from $ s $ to every reachable node.
   - `predecessors`: The predecessor of each node for path reconstruction.

__Postcondition__: for every vertex $v$, the value `distances[v]` equals the true shortest-path distance $dist(s, v)$, which remains $+\infty$ when $v$ is unreachable from $s$.
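
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration on a dictionary-of-dictionaries graph, not the `deploy_dijkstra` function from the module; the name `dijkstra_sketch` and the toy graph are assumptions made for this example.

```python
import heapq

def dijkstra_sketch(adjacency, source):
    """Single-source shortest paths on a dict-of-dicts graph with non-negative weights."""
    # Collect every node, including those that only appear as a destination
    nodes = set(adjacency)
    for neighbors in adjacency.values():
        nodes.update(neighbors)

    # 1. Initialization
    distances = {v: float("inf") for v in nodes}
    predecessors = {v: None for v in nodes}
    distances[source] = 0
    priority_queue = [(0, source)]

    # 2. Main loop
    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)
        if current_distance > distances[current_node]:
            continue  # stale entry: a shorter distance was already found
        for neighbor, weight in adjacency.get(current_node, {}).items():
            new_distance = current_distance + weight
            if new_distance < distances[neighbor]:
                distances[neighbor] = new_distance
                predecessors[neighbor] = current_node
                heapq.heappush(priority_queue, (new_distance, neighbor))

    # 3. Returns
    return distances, predecessors

# Hypothetical toy graph: E can reach A, but nothing reaches E from A
toy = {"A": {"B": 4, "C": 1}, "C": {"B": 2}, "B": {"D": 5}, "E": {"A": 1}}
distances, predecessors = dijkstra_sketch(toy, "A")
print(distances["B"], distances["D"], distances["E"])  # 3 8 inf
```

Note how the path A, C, B (length 3) replaces the direct edge A, B (length 4), and how the unreachable node E keeps its initial distance of infinity.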

---

In the __shortest_path__ module, we have created a function `deploy_dijkstra` following the pseudocode outlined above. It takes a flight network and a source airport code as input, and returns the shortest-path distances from the source airport to all other airports in the network, together with the predecessor of each airport, using Dijkstra's algorithm. For more information, please refer to the docstring of the function using the following command: `help(shortest_path.deploy_dijkstra)` or navigate to the __shortest_path__ module and check the docstring of the function. <br>

In [12]:
print(f"Printing the Docstring of the function deploy_dijkstra for reference:")
help(shortest_path.deploy_dijkstra)

Printing the Docstring of the function deploy_dijkstra for reference:
Help on function deploy_dijkstra in module shortest_path:

deploy_dijkstra(flight_network, source)
    Input:
    flight_network: nx.DiGraph
    source: str, the origin airport code

    Output:
    shortest_paths: dict, key: destination airport code, value: shortest path from source to destination
    predecessors: dict, key: destination airport code, value: predecessor of the destination airport

    About:
    This function computes the shortest paths from a source airport to all other airports in the network using Dijkstra's algorithm.



Recall that the question requires us to find the shortest paths from the airports of an origin city to the airports of a destination city on a __specific date__. To this end, we have created a function named `get_graph_for_a_date`, which takes as inputs a directed graph and a date (str), and returns a directed graph containing only the edges whose date attribute equals the given date. This function serves as a helper for the next function, `get_shortest_paths_for_a_date`, which we will describe shortly. For more information on `get_graph_for_a_date`, please refer to the docstring of the function using the following command: `help(shortest_path.get_graph_for_a_date)` or navigate to the __shortest_path__ module and check the docstring of the function. <br>

In [8]:
print(f"Printing the Docstring of the function get_graph_for_a_date for reference:")
help(modules.shortest_path.get_graph_for_a_date)

Printing the Docstring of the function get_graph_for_a_date for reference:
Help on function get_graph_for_a_date in module modules.shortest_path:

get_graph_for_a_date(G, date)
    Inputs:
    G: (nx.DiGraph): Original graph.
    date: (str): Date to filter edges.
    
    Output:
    graph_for_a_date: (nx.DiGraph): Filtered graph.
    
    About:
    Filters the graph to include only flights available on the given date.
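
To make the idea of date filtering concrete, here is a small dependency-free sketch. It is not the module's `get_graph_for_a_date`: the function name `filter_edges_by_date_sketch` and the representation of flights as `(origin, destination, distance, date)` tuples are assumptions made for this example. The two flights are the ones reported in the outputs of this notebook.

```python
def filter_edges_by_date_sketch(flights, date):
    """Keep only the flights operated on `date`; return a dict-of-dicts adjacency."""
    adjacency = {}
    for origin, dest, distance, fly_date in flights:
        if fly_date != date:
            continue  # flight not operated on the requested date
        neighbors = adjacency.setdefault(origin, {})
        # If a leg is listed twice for the same date, keep the shorter distance
        neighbors[dest] = min(distance, neighbors.get(dest, distance))
    return adjacency

flights = [
    ("JFK", "LAX", 2475, "2009-12-01"),
    ("LGA", "LAX", 2469, "2009-10-01"),
]
print(filter_edges_by_date_sketch(flights, "2009-12-01"))  # {'JFK': {'LAX': 2475}}
print(filter_edges_by_date_sketch(flights, "2009-09-01"))  # {}
```

An empty result for a date simply means that no listed flight operated on that day, which is the situation in which every airport pair is reported as having no route.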



We have defined our Dijkstra's algorithm in a way that it also returns a dictionary of __predecessors__, which is a mapping of each node to its predecessor. This dictionary will be used to reconstruct the shortest path from the source to the destination. Here is where the __helper function__ `reconstruct_path` comes into play. It takes as inputs the predecessors dictionary, the source node, and the destination node, and returns the shortest path as a list of nodes. For more information on `reconstruct_path`, please refer to the docstring of the function using the following command: `help(shortest_path.reconstruct_path)` or navigate to the __shortest_path__ module and check the docstring of the function. <br>

In [14]:
print(f"Printing the Docstring of the function reconstruct_path for reference:")
help(shortest_path.reconstruct_path)

Printing the Docstring of the function reconstruct_path for reference:
Help on function reconstruct_path in module shortest_path:

reconstruct_path(predecessors, source, destination)
    Inputs:
    - predecessors (dict): Mapping of each node to its predecessor.
    - source (str): The source node.
    - destination (str): The destination node.

    Output:
    path: (list): List of nodes representing the shortest path.

    About:
    Reconstructs the shortest path from source to destination.
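
Path reconstruction amounts to following predecessor links backwards from the destination. The sketch below shows one way this could look; it is not the module's `reconstruct_path`, and the name `reconstruct_path_sketch` as well as the convention of returning an empty list for an unreachable destination are assumptions made for this example.

```python
def reconstruct_path_sketch(predecessors, source, destination):
    """Walk predecessor links back from destination; return [] if source is never reached."""
    path = []
    node = destination
    while node is not None:
        path.append(node)
        if node == source:
            return path[::-1]  # reverse so the path runs source -> destination
        node = predecessors.get(node)
    return []  # ran out of predecessors without meeting the source

# Predecessor map consistent with the route LGA->ORD->LAX reported in this notebook
predecessors = {"LGA": None, "ORD": "LGA", "LAX": "ORD", "JRA": None}
route = reconstruct_path_sketch(predecessors, "LGA", "LAX")
print("->".join(route) if route else "No route found")  # LGA->ORD->LAX
print(reconstruct_path_sketch(predecessors, "JRA", "LAX"))  # []
```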



Finally, we can move on to the main function `get_shortest_paths_for_a_date`. This function takes as inputs a date (str), a source city (str), a destination city (str), and a directed graph, and returns the shortest paths, in form of a __table__, from all the airports in the source city to all airports in the destination city on the given date. The function uses the helper functions `get_graph_for_a_date`, `deploy_dijkstra`, and `reconstruct_path` to achieve this. For more information on `get_shortest_paths_for_a_date`, please refer to the docstring of the function using the following command: `help(shortest_path.get_shortest_paths_for_a_date)` or navigate to the __shortest_path__ module and check the docstring of the function.

In [15]:
print(f"Printing the Docstring of the function get_shortest_paths_for_a_date for reference:")
help(shortest_path.get_shortest_paths_for_a_date)

Printing the Docstring of the function get_shortest_paths_for_a_date for reference:
Help on function get_shortest_paths_for_a_date in module shortest_path:

get_shortest_paths_for_a_date(flight_network, origin_city_name, destination_city_name, date)
    Inputs:
    flight_network: nx.DiGraph, the flight network
    origin_city_name: str, the name of the origin city
    destination_city_name: str, the name of the destination city
    date: str, the date of the flight

    Output:
    df: pd.DataFrame, the table with the best routes

    About:
    This function computes the best routes between all possible airport pairs between the origin and destination cities on a given date.
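
The pairing logic can be sketched independently of the graph library. The example below enumerates every origin and destination airport pair and formats one table row per pair; it is only an illustration, not the module's implementation. The name `best_routes_table_sketch` and the `find_route` callback, which stands in for Dijkstra plus path reconstruction, are assumptions, and the routes in `known_routes` are copied from the results reported in this notebook for 2009-12-01.

```python
from itertools import product

def best_routes_table_sketch(cities, find_route, origin_city, destination_city):
    """Return one row per (origin airport, destination airport) pair.

    cities: dict mapping airport code -> city name.
    find_route: callable (origin, destination) -> list of airports, or [] if none.
    """
    origins = [a for a, c in cities.items() if c == origin_city]
    destinations = [a for a, c in cities.items() if c == destination_city]
    rows = []
    for origin, destination in product(origins, destinations):
        route = find_route(origin, destination)
        rows.append({
            "Origin_city_airport": origin,
            "Destination_city_airport": destination,
            # Pairs without a route are reported rather than dropped
            "Best_route": "->".join(route) if route else "No route found",
        })
    return rows

cities = {"JFK": "New York, NY", "LGA": "New York, NY",
          "JRA": "New York, NY", "LAX": "Los Angeles, CA"}
known_routes = {("JFK", "LAX"): ["JFK", "LAX"],
                ("LGA", "LAX"): ["LGA", "ORD", "LAX"]}
table = best_routes_table_sketch(
    cities, lambda o, d: known_routes.get((o, d), []),
    "New York, NY", "Los Angeles, CA")
for row in table:
    print(row)
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame` to obtain a table with the three requested columns.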



In [9]:
np.random.seed(21) # Set the seed for reproducibility
origin_city = "New York, NY"
destination_city = "Los Angeles, CA"    
all_dates = list(set(working_df.Fly_date.values)) # Get all the dates in the dataset
date = np.random.choice(all_dates) # Randomly select one date using np.random.choice
shortest_paths = modules.shortest_path.get_shortest_paths_for_a_date(flight_network, origin_city, destination_city, date)

In [10]:
print(f"The following table prints the best routes from a particular airport in a specific origin city to a particular airport in a specific destination city on a particular date: {date}")
shortest_paths

The following table prints the best routes from a particular airport in a specific origin city to a particular airport in a specific destination city on a particular date: 2006-07-01


Unnamed: 0,Origin_city_airport,Destination_city_airport,Best_route,Total_distance
0,JFK,LAX,No route found,
1,LGA,LAX,No route found,
2,JRA,LAX,No route found,
3,JRB,LAX,No route found,
4,TSS,LAX,No route found,
5,WTC,LAX,No route found,


Even though it was not requested, we also added the total distance from the origin airport to the destination airport for comparison. To check our results, we looked for a direct edge (a flight) from an airport in New York to an airport in Los Angeles (only LAX), since both cities are major hubs. If such an edge exists on the given date, we expect it to be the shortest path between the two airports, because a direct flight covers less distance than a sequence of connecting flights.

In [27]:
# Get the filtered flight network for the specific date
flight_network_filtered = shortest_path.get_graph_for_a_date(flight_network, date)

# Get airports for New York City
airports_for_nyc = [airport_code for airport_code, data in flight_network_filtered.nodes(data=True) if data['city'] == 'New York, NY']
print(f"Airports for New York City: {airports_for_nyc}")

# Get airports for Los Angeles
airports_for_la = [airport_code for airport_code, data in flight_network_filtered.nodes(data=True) if data['city'] == 'Los Angeles, CA']
print(f"Airports for Los Angeles: {airports_for_la}")

# Check for flights from New York City to Los Angeles
for airport_code in airports_for_nyc:
    if flight_network_filtered.has_edge(airport_code, airports_for_la[0]):
        print(f"Flight from New York City to Los Angeles on {date} is available from airport {airport_code}")
        # Print the distances
        print(f"Distance: {flight_network_filtered[airport_code][airports_for_la[0]]['distance']} miles")
    else:
        print(f"No flight available from New York City to Los Angeles, on {date} from airport {airport_code} to {airports_for_la}. For this reason, no distance available")

Airports for New York City: ['JFK', 'LGA', 'JRA', 'JRB', 'TSS', 'WTC']
Airports for Los Angeles: ['LAX']
No flight available from New York City to Los Angeles, on 2009-09-01 from airport JFK to ['LAX']. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport LGA to ['LAX']. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport JRA to ['LAX']. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport JRB to ['LAX']. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport TSS to ['LAX']. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport WTC to ['LAX']. For this reason, no distance available


The output above shows that there was no direct flight from NYC to LA on the selected date, which is consistent with the "No route found" entries in our results.

---

**Another check**

Here we use the original flight network (not filtered by date) to check for flights from New York City to Los Angeles, so that we can find a date on which such a flight exists.

In [32]:
airports_for_nyc = [airport_code for airport_code, data in flight_network.nodes(data=True) if data['city'] == 'New York, NY']
print(f"Airports for New York City: {airports_for_nyc}")

# Get airports for Los Angeles
airports_for_la = [airport_code for airport_code, data in flight_network.nodes(data=True) if data['city'] == 'Los Angeles, CA']
print(f"Airports for Los Angeles: {airports_for_la}")

# Check for flights from New York City to Los Angeles
for airport_code in airports_for_nyc:
    if flight_network.has_edge(airport_code, airports_for_la[0]):
        print(f"Flight from New York City to Los Angeles on {flight_network[airport_code][airports_for_la[0]]['date']} is available from airport {airport_code}")
        # Print the distances
        print(f"Distance: {flight_network[airport_code][airports_for_la[0]]['distance']} miles")
    else:
        print(f"No flight available from New York City to Los Angeles, on {date} from airport {airport_code} to {airports_for_la[0]}. For this reason, no distance available")

Airports for New York City: ['JFK', 'LGA', 'JRA', 'JRB', 'TSS', 'WTC']
Airports for Los Angeles: ['LAX']
Flight from New York City to Los Angeles on 2009-12-01 is available from airport JFK
Distance: 2475 miles
Flight from New York City to Los Angeles on 2009-10-01 is available from airport LGA
Distance: 2469 miles
No flight available from New York City to Los Angeles, on 2009-09-01 from airport JRA to LAX. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport JRB to LAX. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport TSS to LAX. For this reason, no distance available
No flight available from New York City to Los Angeles, on 2009-09-01 from airport WTC to LAX. For this reason, no distance available


In [33]:
date = '2009-12-01'
shortest_paths = shortest_path.get_shortest_paths_for_a_date(flight_network, origin_city, destination_city, date)

print(f"The following table prints the best routes from a particular airport in a specific origin city to a particular airport in a specific destination city on a particular date")

shortest_paths

The following table prints the best routes from a particular airport in a specific origin city to a particular airport in a specific destination city on a particular date


Unnamed: 0,Origin_city_airport,Destination_city_airport,Best_route,Total_distance
0,JFK,LAX,JFK->LAX,2475.0
1,LGA,LAX,LGA->ORD->LAX,2478.0
2,JRA,LAX,No route found,
3,JRB,LAX,No route found,
4,TSS,LAX,No route found,
5,WTC,LAX,No route found,


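The table above matches our expectations: on 2009-12-01 the best route from JFK is the direct flight to LAX (2475 miles), which agrees with the direct edge found in the previous check, while LGA, whose direct edge to LAX is dated 2009-10-01, instead connects through ORD (2478 miles).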

---