# Finding the Cheapest and Shortest Flights from Riga

## Objective

Design an algorithm to find the cheapest and shortest flights from Riga to each European capital city. While some flights may have negative costs due to promotions, distances are always positive. The solution should cater to both budget-conscious travelers and those seeking to minimize environmental impact by traveling shorter distances. Additionally, detect if there are any negative cycles in the flight routes.

## Problem Statement
Input:

A directed graph representing European capital cities as nodes and flights as weighted edges.
Each flight has a cost that could be positive, zero, or negative (due to promotions) and a distance that is always positive.
Tasks:

Implement an algorithm to:
Find the cheapest flights from Riga to all other European capitals.
Find the shortest flights (by distance) from Riga to all other European capitals.
Bonus: Detect if a negative cycle exists in the graph, which would imply an infinitely decreasing flight cost by continuously traversing the cycle.


## Dataset

Each capital city is connected to between 3 and 5 other capitals.
Each connection has both a cost and a distance.
The dataset includes 30 to 40 European capitals to ensure sufficient complexity.


## Requirements
Algorithm Choice:

For finding the cheapest flights (by cost), use Bellman-Ford or an equivalent algorithm that can handle negative weights.
For finding the shortest flights (by distance), use Dijkstra’s algorithm, another algorithm optimized for non-negative weights, or the same Bellman-Ford implementation as above.
For the bonus, use the cycle detection capabilities of Bellman-Ford or an alternative method if preferred.
Implementation Details:

Output two lists or tables:

Cheapest flights: minimum flight cost from Riga to each other capital.
Shortest flights: minimum distance from Riga to each other capital.
If a capital is unreachable from Riga, denote it appropriately in the output.
For the bonus, indicate if a negative cycle was detected.
Additional Constraints:

Students should handle edge cases, such as isolated cities or cities with only long-distance connections.
Solutions should be efficient, given the dataset size.
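
The bonus task and the output requirement can be combined by having Bellman-Ford report a negative cycle as a flag instead of raising an exception. The following is a minimal sketch under stated assumptions: the graph is given as a plain list of `(from_city, to_city, weight)` tuples, and the function name `bellman_ford_with_flag` and the toy prices are illustrative only, not part of the dataset.

```python
import math

def bellman_ford_with_flag(nodes, edges, start):
    """Return (distances, has_negative_cycle) for a directed weighted graph."""
    distances = {node: math.inf for node in nodes}
    distances[start] = 0

    # Relax every edge up to |V| - 1 times; stop early once nothing changes
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if distances[u] + w < distances[v]:
                distances[v] = distances[u] + w
                changed = True
        if not changed:
            break

    # If any edge can still be relaxed, a negative cycle is reachable from start
    has_negative_cycle = any(distances[u] + w < distances[v] for u, v, w in edges)
    return distances, has_negative_cycle


# Hypothetical toy data (not taken from the flight dataset)
nodes = ["Riga", "Tallinn", "Vilnius", "Oslo"]
edges = [("Riga", "Tallinn", 10), ("Tallinn", "Vilnius", -4), ("Riga", "Vilnius", 9)]
costs, has_cycle = bellman_ford_with_flag(nodes, edges, "Riga")
print(costs, has_cycle)
```

In this toy graph Oslo has no incoming flights, so its cost stays at infinity, which is one way to mark an unreachable city. The early exit is optional; it only saves passes when the costs settle before the last round.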

## Expected Output
Two tables or dictionaries (can be saved to text, CSV, JSON, etc.):
One for cheapest costs from Riga to each capital.
One for shortest distances from Riga to each capital.
An indicator of whether a negative cycle was detected.
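
One possible shape for this output, sketched with the standard library only. The helper names `results_to_rows` and `rows_to_csv`, the column labels, and the `"unreachable"` marker are assumptions made for illustration; the city "Minsk" is a made-up unreachable entry and is not claimed to be in the dataset.

```python
import csv
import io
import json
import math

def results_to_rows(cheapest, shortest):
    """Merge two {city: value} dicts into rows, marking unreachable cities."""
    def fmt(value):
        # A missing value or float('inf') means the city was never reached
        return "unreachable" if value is None or math.isinf(value) else value

    cities = sorted(set(cheapest) | set(shortest))
    return [
        {
            "City": city,
            "Cheapest cost": fmt(cheapest.get(city)),
            "Shortest distance (km)": fmt(shortest.get(city)),
        }
        for city in cities
    ]

def rows_to_csv(rows):
    """Serialize the rows to CSV text (write this string to a file if needed)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Illustrative sample values; "Minsk" is a hypothetical unreachable city
cheapest = {"Riga": 0, "Zagreb": -13, "Minsk": math.inf}
shortest = {"Riga": 0, "Zagreb": 286, "Minsk": math.inf}

rows = results_to_rows(cheapest, shortest)
csv_text = rows_to_csv(rows)
json_text = json.dumps({"negative_cycle_detected": False, "results": rows}, indent=2)
print(csv_text)
```

Replacing infinity with a text marker before serializing also avoids the non-standard `Infinity` token that `json.dumps` would otherwise emit.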


In [2]:
import pandas as pd
import heapq
# Minimal directed graph stored as an adjacency list: node -> [(neighbor, weight), ...]
class Graph:
    def __init__(self):
        self.adjacency_list = {}

    def add_node(self, node):
        if node not in self.adjacency_list:
            self.adjacency_list[node] = []

    def add_edge(self, from_node, to_node, weight):
        self.add_node(from_node)
        self.add_node(to_node)

        self.adjacency_list[from_node].append((to_node, weight))

    def get_neighbors(self, node):
        return self.adjacency_list.get(node, [])

    def __str__(self):
        return str(self.adjacency_list)

def initialize_graph(file_path):
    # Build two directed graphs (distance and cost) from the flights CSV
    flight_data = pd.read_csv(file_path)
    cost_graph = Graph()
    distance_graph = Graph()

    for _, row in flight_data.iterrows():
        from_city = row['From']
        to_city = row['To']
        distance = row['Distance (km)']
        cost = row['Cost']

        distance_graph.add_edge(from_city, to_city, distance)
        cost_graph.add_edge(from_city, to_city, cost)

    return distance_graph, cost_graph


def bellman_ford(graph, start):
    distances = {node: float('inf') for node in graph.adjacency_list}
    distances[start] = 0

    for _ in range(len(graph.adjacency_list) - 1):
        for node in graph.adjacency_list:
            for neighbor, weight in graph.get_neighbors(node):
                if distances[node] + weight < distances[neighbor]:
                    distances[neighbor] = distances[node] + weight

    for node in graph.adjacency_list:
        for neighbor, weight in graph.get_neighbors(node):
            if distances[node] + weight < distances[neighbor]:
                raise ValueError("Graph contains a negative-weight cycle")

    return distances


def dijkstra(graph, start):
    distances = {node: float('inf') for node in graph.adjacency_list}
    distances[start] = 0

    priority_queue = [(0, start)]  # (distance, node)

    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)

        if current_distance > distances[current_node]:
            continue

        for neighbor, weight in graph.get_neighbors(current_node):
            distance = current_distance + weight

            # Only update if a shorter path to the neighbor is found
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return distances

# Run the algorithms from Riga on both graphs
file_path = 'Enhanced_Flight_Dataset_with_Distances.csv'
distance_graph, cost_graph = initialize_graph(file_path)
cheapest_flight_dict = bellman_ford(cost_graph, "Riga")
print(cheapest_flight_dict)
shortest_distance_dict = dijkstra(distance_graph, "Riga")
print(shortest_distance_dict)
# let's do bellman ford for distance as well
shortest_distance_dict_bellman = bellman_ford(distance_graph, "Riga")
print(shortest_distance_dict_bellman)

{'Riga': 0, 'Zagreb': -13, 'Bern': 82, 'Lisbon': 95, 'Vienna': 81, 'Tallinn': 8, 'Berlin': 8, 'Athens': 103, 'Brussels': 33, 'Vilnius': 135, 'Dublin': 155, 'Helsinki': 119, 'Luxembourg': 207, 'Budapest': 152, 'Sofia': 177, 'Stockholm': 187, 'Sarajevo': 73, 'Oslo': 108, 'London': 118, 'Copenhagen': 100, 'Belgrade': 180, 'Amsterdam': 82, 'Warsaw': 21, 'Bratislava': 5, 'Paris': 13, 'Prague': 67, 'Ljubljana': 135, 'Rome': 97, 'Madrid': 134, 'Skopje': 233}
{'Riga': 0, 'Zagreb': 286, 'Bern': 1686, 'Lisbon': 2063, 'Vienna': 1970, 'Tallinn': 1977, 'Berlin': 2783, 'Athens': 2333, 'Brussels': 2625, 'Vilnius': 3123, 'Dublin': 2620, 'Helsinki': 2545, 'Luxembourg': 2667, 'Budapest': 2299, 'Sofia': 2518, 'Stockholm': 3379, 'Sarajevo': 2578, 'Oslo': 2730, 'London': 3561, 'Copenhagen': 3285, 'Belgrade': 2809, 'Amsterdam': 3305, 'Warsaw': 1827, 'Bratislava': 3187, 'Paris': 2271, 'Prague': 2937, 'Ljubljana': 2749, 'Rome': 1403, 'Madrid': 2763, 'Skopje': 5200}
{'Riga': 0, 'Zagreb': 286, 'Bern': 1686, 'Li
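
Because every distance is positive, Dijkstra and Bellman-Ford should return the same values for the distance graph. Rather than comparing the printed dictionaries by eye, a small check can list any cities where the two results disagree. This is a sketch; the helper name `differing_cities` and the tolerance are assumptions, and the sample dictionaries contain only the first three entries visible in the output above.

```python
import math

def differing_cities(result_a, result_b, tolerance=1e-9):
    """Return cities whose values differ between two {city: distance} dicts."""
    mismatches = []
    for city in sorted(set(result_a) | set(result_b)):
        a = result_a.get(city, math.inf)
        b = result_b.get(city, math.inf)
        if a == b:
            continue  # also covers both being inf (unreachable in both results)
        if math.isinf(a) or math.isinf(b) or abs(a - b) > tolerance:
            mismatches.append(city)
    return mismatches

# First entries copied from the printed output above
dijkstra_sample = {"Riga": 0, "Zagreb": 286, "Bern": 1686}
bellman_sample = {"Riga": 0, "Zagreb": 286, "Bern": 1686}
print(differing_cities(dijkstra_sample, bellman_sample))
```

In the notebook the same call would take `shortest_distance_dict` and `shortest_distance_dict_bellman`; an empty list means the two algorithms agree.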

In [3]:
# let's get coordinates for each city from an API
# let's plot the cities on a map
# let's plot the flights on a map




## Getting Geographical Coordinates

In [4]:
# get list of cities from the dataset
# get the coordinates of each city

# cheapest_flight_dict already has every city name as a key, so we reuse it
cities = sorted(cheapest_flight_dict.keys())
cities

['Amsterdam',
 'Athens',
 'Belgrade',
 'Berlin',
 'Bern',
 'Bratislava',
 'Brussels',
 'Budapest',
 'Copenhagen',
 'Dublin',
 'Helsinki',
 'Lisbon',
 'Ljubljana',
 'London',
 'Luxembourg',
 'Madrid',
 'Oslo',
 'Paris',
 'Prague',
 'Riga',
 'Rome',
 'Sarajevo',
 'Skopje',
 'Sofia',
 'Stockholm',
 'Tallinn',
 'Vienna',
 'Vilnius',
 'Warsaw',
 'Zagreb']

In [None]:
# url to get city coordinates
url = 'https://nominatim.openstreetmap.org/search?format=json&q='

# get the coordinates of each city
import requests
import json
import time

user_agent = 'Mozilla/5.0'

error_count = 0
MAX_ERRORS = 3

city_coordinates = {}
for city in cities:
    city_url = url + city
    print(f"Getting coordinates from {city_url}")
    response = requests.get(city_url, headers={'User-Agent': user_agent})
    if response.status_code != 200:
        print(f"Error getting coordinates for {city} status code: {response.status_code}")
        error_count += 1
        if error_count >= MAX_ERRORS:
            print("Too many errors", error_count)
            break
        continue # we go to the start of the loop again try next city
    data = response.json()
    # filter out only those cities where name matches
    data = [x for x in data if x['name'] == city]
    print(f"Found {len(data)} exact matches for {city}")
    # also filter out only those where "addresstype": "city",
    data = [x for x in data if x['addresstype'] == 'city'] # FIXME why some cities look fine when seen in browser but not here?
    print(f"Found {len(data)} addresstype cities for {city}")
    if not data:
        print(f"No coordinates found for {city}")
        continue
    print(f"Found {len(data)} coordinates for {city}")
    # sort by importance
    data = sorted(data, key=lambda x: x['importance'], reverse=True)
    city_coordinates[city] = (data[0]['lat'], data[0]['lon'])
    time.sleep(1)  # pause 1 second between requests to avoid getting blocked

city_coordinates
# FIXME why some cities are missing coordinates?

Getting coordinates from https://nominatim.openstreetmap.org/search?format=json&q=Amsterdam
Found 4 exact matches for Amsterdam
Found 2 addresstype cities for Amsterdam
Found 2 coordinates for Amsterdam
Getting coordinates from https://nominatim.openstreetmap.org/search?format=json&q=Athens
Found 7 exact matches for Athens
Found 0 addresstype cities for Athens
No coordinates found for Athens
Getting coordinates from https://nominatim.openstreetmap.org/search?format=json&q=Belgrade
Found 5 exact matches for Belgrade
Found 2 addresstype cities for Belgrade
Found 2 coordinates for Belgrade
Getting coordinates from https://nominatim.openstreetmap.org/search?format=json&q=Berlin
Found 1 exact matches for Berlin
Found 1 addresstype cities for Berlin
Found 1 coordinates for Berlin
Getting coordinates from https://nominatim.openstreetmap.org/search?format=json&q=Bern
Found 3 exact matches for Bern
Found 3 addresstype cities for Bern
Found 3 coordinates for Bern
Getting coordinates from https:/

{'Amsterdam': ('52.3730796', '4.8924534'),
 'Belgrade': ('45.773279', '-111.184535'),
 'Berlin': ('52.510885', '13.3989367'),
 'Bern': ('46.9484742', '7.4521749'),
 'Bratislava': ('48.1516988', '17.1093063'),
 'Budapest': ('47.48138955', '19.14609412691246'),
 'Dublin': ('53.3493795', '-6.2605593'),
 'Helsinki': ('60.1674881', '24.9427473'),
 'Lisbon': ('46.441634', '-97.68121'),
 'London': ('51.5074456', '-0.1277653'),
 'Luxembourg': ('49.6112768', '6.129799'),
 'Madrid': ('40.4167047', '-3.7035825'),
 'Oslo': ('48.1951323', '-97.131159'),
 'Paris': ('48.8534951', '2.3483915'),
 'Prague': ('35.4867369', '-96.6850174'),
 'Rome': ('40.9814147', '-91.682387'),
 'Sarajevo': ('43.8519774', '18.3866868'),
 'Stockholm': ('59.3251172', '18.0710935'),
 'Tallinn': ('59.4372155', '24.7453688'),
 'Vilnius': ('54.6870458', '25.2829111'),
 'Warsaw': ('41.2381017', '-85.8530544'),
 'Zagreb': ('45.8130967', '15.9772795')}
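
Several of the coordinates above have longitudes far to the west of Europe (for example Belgrade at -111.18 and Warsaw at -85.85), which suggests that namesake places elsewhere were picked. A quick sanity check can flag such entries. This is a rough sketch: the bounding box values and the helper name `outside_europe` are assumptions chosen for illustration, not an exact definition of Europe.

```python
# Rough bounding box for Europe (an assumption chosen for illustration)
EUROPE_LAT = (34.0, 72.0)
EUROPE_LON = (-25.0, 45.0)

def outside_europe(coordinates):
    """Return the cities whose (lat, lon) fall outside the rough Europe box."""
    suspicious = []
    for city, (lat, lon) in coordinates.items():
        lat, lon = float(lat), float(lon)  # the values above are stored as strings
        in_box = (EUROPE_LAT[0] <= lat <= EUROPE_LAT[1]
                  and EUROPE_LON[0] <= lon <= EUROPE_LON[1])
        if not in_box:
            suspicious.append(city)
    return sorted(suspicious)

# A few entries copied from the output above
sample = {
    'Amsterdam': ('52.3730796', '4.8924534'),
    'Belgrade': ('45.773279', '-111.184535'),
    'Lisbon': ('46.441634', '-97.68121'),
    'Warsaw': ('41.2381017', '-85.8530544'),
    'Zagreb': ('45.8130967', '15.9772795'),
}
print(outside_europe(sample))  # ['Belgrade', 'Lisbon', 'Warsaw']
```

Running `outside_europe(city_coordinates)` on the full dictionary would list every city that needs a second lookup.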

In [13]:
# let's repeat the request for one problem city, Athens (Belgrade can be checked the same way)
# we print the raw json response to see what we are getting
city_url = url + "Athens"
response = requests.get(city_url, headers={'User-Agent': user_agent})
print(response.json())


[{'place_id': 553154, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'relation', 'osm_id': 119353, 'lat': '33.9597677', 'lon': '-83.376398', 'class': 'boundary', 'type': 'administrative', 'place_rank': 16, 'importance': 0.5987053733126958, 'addresstype': 'city', 'name': 'Athens-Clarke County Unified Government', 'display_name': 'Athens-Clarke County Unified Government, Athens-Clarke County, Georgia, United States', 'boundingbox': ['33.8480209', '34.0394660', '-83.5374696', '-83.2408342']}, {'place_id': 325534942, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright', 'osm_type': 'relation', 'osm_id': 182845, 'lat': '39.3289242', 'lon': '-82.1012479', 'class': 'boundary', 'type': 'administrative', 'place_rank': 16, 'importance': 0.5195750939573028, 'addresstype': 'town', 'name': 'Athens', 'display_name': 'Athens, Athens Township, Athens County, Ohio, 45701, United States', 'boundingbox': ['39.2863058', '39.360
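
The visible part of this response hints at why the two filters discard Athens: the first result has `addresstype` `'city'` but its `name` is not exactly `'Athens'`, while the second has the exact name but `addresstype` `'town'`. One alternative, sketched below, is to skip the name and type filters and keep the most important result that lies inside a rough Europe bounding box. The function name `pick_european_candidate` and the box values are assumptions, and the third sample entry is a hypothetical stand-in with approximate coordinates and a placeholder importance, since the real response is cut off above.

```python
def pick_european_candidate(results, lat_range=(34.0, 72.0), lon_range=(-25.0, 45.0)):
    """Pick the most important result inside a rough Europe box, or None."""
    candidates = []
    for item in results:
        lat, lon = float(item["lat"]), float(item["lon"])
        if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]:
            candidates.append(item)
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item.get("importance", 0))
    return float(best["lat"]), float(best["lon"])

sample_results = [
    # First two entries: fields copied from the response printed above
    {"lat": "33.9597677", "lon": "-83.376398", "importance": 0.5987053733126958,
     "addresstype": "city", "name": "Athens-Clarke County Unified Government"},
    {"lat": "39.3289242", "lon": "-82.1012479", "importance": 0.5195750939573028,
     "addresstype": "town", "name": "Athens"},
    # Hypothetical entry: approximate location of Athens, Greece; placeholder importance
    {"lat": "37.98", "lon": "23.73", "importance": 0.7,
     "addresstype": "city", "name": "Athens (hypothetical entry)"},
]
print(pick_european_candidate(sample_results))
```

Nominatim's search API also documents a `countrycodes` query parameter that could narrow results on the server side; it is not used in this sketch.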

In [16]:
# let's save our city coordinates into csv file
city_coordinates_df = pd.DataFrame(city_coordinates).T
city_coordinates_df.columns = ['Latitude', 'Longitude'] 
# we want the index to be named City
city_coordinates_df.index.name = 'City'

city_coordinates_df.to_csv('city_coordinates.csv')

In [17]:
# let's plot the cities on OpenStreetMap
import folium

In [21]:


# create a map centered on Riga
m = folium.Map(location=[56.946285, 24.105078], zoom_start=4)

# add markers for each city
for city, coordinates in city_coordinates.items():
    folium.Marker(coordinates, popup=city).add_to(m)

# m.save('cities.html')
# display the map
m
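
The earlier plan also mentions plotting the flights on the map. A possible sketch is shown below: it turns city pairs into coordinate segments and draws each one with `folium.PolyLine`. The helper names `flight_segments` and `add_flights_to_map` are hypothetical, the two coordinates are copied from the output above, and the edge list is made up for illustration.

```python
def flight_segments(edges, coordinates):
    """Turn (from_city, to_city) pairs into [[lat, lon], [lat, lon]] segments.

    Flights with a missing endpoint coordinate are skipped.
    """
    segments = []
    for from_city, to_city in edges:
        if from_city in coordinates and to_city in coordinates:
            start = [float(x) for x in coordinates[from_city]]
            end = [float(x) for x in coordinates[to_city]]
            segments.append([start, end])
    return segments


def add_flights_to_map(flight_map, segments):
    """Draw each segment as a thin line on an existing folium map."""
    import folium  # imported here so flight_segments works without folium installed
    for segment in segments:
        folium.PolyLine(segment, weight=1).add_to(flight_map)
    return flight_map


# Coordinates copied from the output above; the edge list is hypothetical
sample_coordinates = {
    'Tallinn': ('59.4372155', '24.7453688'),
    'Vilnius': ('54.6870458', '25.2829111'),
}
sample_edges = [('Tallinn', 'Vilnius'), ('Tallinn', 'Athens')]
segments = flight_segments(sample_edges, sample_coordinates)
print(segments)
```

In the notebook, the edges could be collected from `distance_graph.adjacency_list`, and `add_flights_to_map(m, segments)` would add the lines to the map `m` created above. Cities without coordinates, such as Athens here, are silently left out, so any wrong or missing coordinates should be fixed first.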