# 3. Finding Best Routes (Q3)

## Task: Finding the Best Flight Routes

### Problem Statement

When planning a flight to a specific city, the goal is to find the **best route**, defined here as the one that minimizes the **total distance flown**.

The focus is on flights operating on a **specific day**.

#### Key Considerations:
- Each city may have **multiple airports**.  
- The function should compute the **best route** for every possible pair of airports between the two cities.

For example, if city **A** has airports \( a_1, a_2 \) and city **B** has airports \( b_1, b_2 \), the function should calculate the best routes for:
-  `a₁ → b₁`
-  `a₁ → b₂`
-  `a₂ → b₁`
-  `a₂ → b₂`

If no valid route exists for a given pair, it should be reported as **"No route found"**.
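The pairing described above is a Cartesian product of the two airport lists. A minimal sketch, assuming placeholder airport codes (`a1`, `b1`, and so on are illustrative, not real codes):

```python
from itertools import product

# Hypothetical airport codes, used only to illustrate the pairing.
city_a_airports = ["a1", "a2"]
city_b_airports = ["b1", "b2"]

# Cartesian product: every origin airport paired with every destination airport.
airport_pairs = list(product(city_a_airports, city_b_airports))

for origin, destination in airport_pairs:
    print(f"{origin} -> {destination}")
```

With two airports per city this yields four pairs, and each pair gets its own shortest-route search.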

---

### Function Inputs:
1. **Flights network**: A graph representation of the flight data (airports and distances).
2. **Origin city name**: The starting city.
3. **Destination city name**: The target city.
4. **Considered Date**: The specific date for which flights are being analyzed, in `yyyy-mm-dd` format.
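Because the date arrives as a `yyyy-mm-dd` string, it can help to parse it once up front so that a malformed value fails early. A small sketch using only the standard library (the helper name `parse_considered_date` is an assumption for illustration):

```python
from datetime import date, datetime

def parse_considered_date(text: str) -> date:
    """Parse a 'yyyy-mm-dd' string; raises ValueError if it cannot be parsed."""
    return datetime.strptime(text, "%Y-%m-%d").date()

considered = parse_considered_date("2001-01-01")
print(considered.isoformat())
```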

---

### Function Output:
A table with the following three columns:
1. **`Origin_city_airport`**: The code of the origin airport.
2. **`Destination_city_airport`**: The code of the destination airport.
3. **`Best_route`**: A list of airport names showing the order of travel in the optimal route.  
   - The route should minimize the total distance flown.  
   - If no valid route exists, the entry should display **"No route found"**.
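The three columns can be sketched as one record per airport pair. The helper `make_row` below is hypothetical, and it assumes a missing route is represented as `None` before formatting:

```python
def make_row(origin_airport, destination_airport, path):
    """Build one output row; `path` is a list of airport codes, or None if no route exists."""
    return {
        "Origin_city_airport": origin_airport,
        "Destination_city_airport": destination_airport,
        "Best_route": " → ".join(path) if path else "No route found",
    }

rows = [
    make_row("LGA", "ORD", ["LGA", "ORD"]),
    make_row("JFK", "MDW", None),
]

for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame` would then give a table with exactly these three columns.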

---



In [20]:
import pandas as pd
import heapq
from collections import defaultdict

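class FlightNetwork:
    """Directed graph of airports whose edge weights are flight distances."""

    def __init__(self):
        # Adjacency list: origin airport -> list of (destination airport, distance)
        self.graph = defaultdict(list)

    def add_flight(self, origin, destination, distance):
        self.graph[origin].append((destination, distance))

    def dijkstra(self, start, end):
        """Return (path, total_distance) for the shortest route from start to end."""
        # Heap entries are (distance so far, airport, path taken to reach it)
        heap = [(0, start, [start])]
        visited = set()

        while heap:
            current_distance, current_node, path = heapq.heappop(heap)

            # Skip stale heap entries for airports that were already settled
            if current_node in visited:
                continue
            visited.add(current_node)

            if current_node == end:
                return path, current_distance

            for neighbor, weight in self.graph[current_node]:
                if neighbor not in visited:
                    heapq.heappush(heap, (current_distance + weight, neighbor, path + [neighbor]))

        # The heap emptied without ever reaching `end`
        return "No route found", float('inf')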

    @staticmethod
    def load_and_build_network(file_path, date, origin_city, destination_city):
        df = pd.read_csv(file_path)

        required_columns = {'Fly_date', 'Origin_city', 'Destination_city', 'Origin_airport', 'Destination_airport', 'Distance'}
        missing_columns = required_columns - set(df.columns)
        if missing_columns:
            raise KeyError(f"Dataset is missing required columns: {sorted(missing_columns)}")

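        df['Fly_date'] = pd.to_datetime(df['Fly_date']).dt.date
        # Keep only flights on the given date that go directly from the origin city
        # to the destination city; connecting flights through other cities are
        # therefore not part of the resulting network.
        df_date = df[
            (df['Fly_date'] == pd.to_datetime(date).date()) &
            (df['Origin_city'] == origin_city) &
            (df['Destination_city'] == destination_city)
        ]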
        print(f"Number of flights on {date} from {origin_city} to {destination_city}: {len(df_date)}")

        if df_date.empty:
            raise ValueError(f"No flights found for {origin_city} to {destination_city} on {date}.")
        
        network = FlightNetwork()

        for _, row in df_date.iterrows():
            network.add_flight(row['Origin_airport'], row['Destination_airport'], row['Distance'])

        return network, df_date



    @staticmethod
    def find_best_routes(network, df_date, origin_city, destination_city):
        origin_airports = df_date[df_date['Origin_city'] == origin_city]['Origin_airport'].unique()
        destination_airports = df_date[df_date['Destination_city'] == destination_city]['Destination_airport'].unique()

        results = []

        for origin_airport in origin_airports:
            for destination_airport in destination_airports:
                path, distance = network.dijkstra(origin_airport, destination_airport)
                results.append({
                    'Origin_city_airport': origin_airport,
                    'Destination_city_airport': destination_airport,
                    'Best_route': ' → '.join(path) if isinstance(path, list) else path,
                    'Distance': distance
                })

        return pd.DataFrame(results).sort_values(by='Distance', ascending=True)
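
The search itself is not limited to direct flights: whenever the graph contains connecting edges, it can return a multi-leg route that is shorter than the direct one. A self-contained sketch on a toy graph (the function `shortest_route`, the airport codes, and the distances are all hypothetical, and a missing route is returned as `None` here rather than as a string):

```python
import heapq
from collections import defaultdict

def shortest_route(edges, start, end):
    """Return (path, distance) for the shortest route, or (None, inf) if unreachable."""
    graph = defaultdict(list)
    for origin, destination, distance in edges:
        graph[origin].append((destination, distance))

    heap = [(0, start, [start])]  # (distance so far, airport, path)
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return path, dist
        for neighbor, weight in graph[node]:
            if neighbor not in visited:
                heapq.heappush(heap, (dist + weight, neighbor, path + [neighbor]))
    return None, float("inf")

# Hypothetical airports and distances, chosen only for illustration.
toy_edges = [
    ("AAA", "BBB", 100),
    ("BBB", "CCC", 150),
    ("AAA", "CCC", 400),
]
path, distance = shortest_route(toy_edges, "AAA", "CCC")
print(path, distance)  # ['AAA', 'BBB', 'CCC'] 250
```

Here the two-leg route (100 + 150 = 250) beats the direct edge of 400, so the connection is preferred.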

In [23]:
file_path = 'C:/Users/EMILIO/Documents/università/ADM/ADM-HW5/ADM-HW5/Airports2.csv'  

date = "2001-01-01"
origin_city = "New York, NY"
destination_city = "Chicago, IL"


network, df = FlightNetwork.load_and_build_network(file_path, date, origin_city, destination_city)
results = FlightNetwork.find_best_routes(network, df, origin_city, destination_city)

print("Best Routes Between Cities:")
print(results)


Number of flights on 2001-01-01 from New York, NY to Chicago, IL: 27
Best Routes Between Cities:
  Origin_city_airport Destination_city_airport      Best_route  Distance
1                 LGA                      MDW       LGA → MDW     725.0
0                 LGA                      ORD       LGA → ORD     733.0
2                 JFK                      ORD       JFK → ORD     740.0
3                 JFK                      MDW  No route found       inf
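
In the run above, the network holds only the direct New York to Chicago flights for that date, so every route has a single leg and `JFK → MDW` has no candidate. One possible variation, assuming connections through other cities should count, is to build the graph from every flight on the date and look up each city's airports separately. The sketch below uses hypothetical records and a hypothetical helper `build_graph_for_date`. Only the column names come from the dataset.

```python
from collections import defaultdict

# Hypothetical records standing in for rows of the flights dataset.
flights = [
    {"Fly_date": "2001-01-01", "Origin_city": "City A", "Destination_city": "City C",
     "Origin_airport": "A1", "Destination_airport": "C1", "Distance": 300.0},
    {"Fly_date": "2001-01-01", "Origin_city": "City C", "Destination_city": "City B",
     "Origin_airport": "C1", "Destination_airport": "B1", "Distance": 200.0},
    {"Fly_date": "2001-01-02", "Origin_city": "City A", "Destination_city": "City B",
     "Origin_airport": "A1", "Destination_airport": "B1", "Distance": 450.0},
]

def build_graph_for_date(records, considered_date):
    """Build an adjacency list from all flights on one date, plus a city -> airports map."""
    graph = defaultdict(list)
    airports_by_city = defaultdict(set)
    for row in records:
        if row["Fly_date"] != considered_date:
            continue  # ignore flights on other days
        graph[row["Origin_airport"]].append((row["Destination_airport"], row["Distance"]))
        airports_by_city[row["Origin_city"]].add(row["Origin_airport"])
        airports_by_city[row["Destination_city"]].add(row["Destination_airport"])
    return graph, airports_by_city

graph, airports_by_city = build_graph_for_date(flights, "2001-01-01")
print(dict(graph))
print(dict(airports_by_city))
```

On the toy data, `A1` reaches `B1` only through `C1`, a connection that a filter restricted to direct City A to City B rows would drop.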
