## Introduction
This project aims to extract subway route data from OpenStreetMap using the Overpass API and visualize the network. By building a graph of subway stations and routes, we can analyze and display the subway network of major cities like Berlin. The main tools we use are:

- **Python**: The programming language used.
- **Overpass API**: To query OpenStreetMap data.
- **NetworkX**: For creating and analyzing graphs.
- **Folium**: For interactive map visualizations.

The goal is to draw the subway routes on a map and show how stations and lines connect.


## Importing Libraries
We start by importing the necessary libraries for making API requests, handling JSON data, creating graphs, and visualizing the map.


In [1]:
import requests
import networkx as nx
import matplotlib.pyplot as plt
import folium

from IPython.display import IFrame

## Data Extraction Functions
In this section, we define helper functions to interact with the Overpass API and extract relevant data. These functions will help retrieve subway routes and station data from OpenStreetMap.

- `extract_osm_geodata(query)`: Sends a query to the Overpass API and retrieves subway data in JSON format.
- `extract_route_elements(osm_data)`: Extracts subway route data from the API response.
- `extract_node_elements(osm_data)`: Extracts station data from the API response.


In [2]:
def extract_osm_geodata(query):
    """
    Extracts geodata from OpenStreetMap using the Overpass API.

    Parameters:
    query (str): The Overpass API query string to execute.

    Returns:
    dict: The JSON response returned by the Overpass API.
    """

    # Define the Overpass API URL
    overpass_url = "https://lz4.overpass-api.de/api/interpreter"

    # Define the parameter for the GET request
    params = {'data': query}

    # Send a GET request to the Overpass API with the query
    response = requests.get(overpass_url, params=params)

    # Raise an HTTPError for 4xx/5xx responses, otherwise return the parsed JSON
    response.raise_for_status()
    return response.json()

def extract_route_elements(osm_data):
    """
    Extracts route elements from the OpenStreetMap data response.

    Parameters:
    osm_data (dict): The JSON response data from the Overpass API.

    Returns:
    list: A list of route elements containing 'tags' with 'route' information.
    """

    route_elements = [element for element in osm_data['elements'] if 'tags' in element and 'route' in element['tags']]
    return route_elements

def extract_node_elements(osm_data):
    """
    Extracts node elements from the OpenStreetMap data response.

    Parameters:
    osm_data (dict): The JSON response data from the Overpass API.

    Returns:
    dict: A dictionary where the keys are node IDs and the values are the corresponding node elements.
    """

    node_elements = {element['id']: element for element in osm_data['elements'] if element['type'] == 'node'}
    return node_elements
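
To make the shape of the data concrete, the sketch below applies the same two filters to a hand-written stand-in for an Overpass response. Every ID, name, colour, and coordinate in it is invented for illustration and is not real Berlin data.

```python
# Hand-written stand-in for an Overpass JSON response (all values are illustrative)
sample_osm_data = {
    "elements": [
        {
            "type": "relation",
            "id": 1,
            "tags": {"route": "subway", "ref": "U1", "colour": "#7DAD4C"},
            "members": [
                {"type": "node", "ref": 101, "role": "stop"},
                {"type": "node", "ref": 102, "role": "stop"},
            ],
        },
        {"type": "node", "id": 101, "lat": 52.50, "lon": 13.44, "tags": {"name": "Stop A"}},
        {"type": "node", "id": 102, "lat": 52.50, "lon": 13.42, "tags": {"name": "Stop B"}},
        {"type": "node", "id": 103, "lat": 52.51, "lon": 13.40},  # node without tags
    ]
}

# Same filter idea as extract_route_elements: keep elements carrying a 'route' tag
routes = [e for e in sample_osm_data["elements"] if "route" in e.get("tags", {})]

# Same filter idea as extract_node_elements: index every node by its ID
nodes = {e["id"]: e for e in sample_osm_data["elements"] if e["type"] == "node"}

print([r["tags"]["ref"] for r in routes])
print(sorted(nodes))
```

Note that node 103 has no `tags` key at all, so any code that reads node tags needs a fallback for that case.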

## Creating the Subway Route Graph
Using the extracted subway data, we now create a graph where stations are nodes, and subway routes between stations are edges. This graph representation helps us analyze the structure and connectivity of the subway system.

- `create_route_graph(route_elements, node_elements)`: This function constructs the graph using subway stations and routes.


In [3]:

def create_route_graph(route_elements, node_elements):
    """
    Creates a graph of routes with stop nodes and edges between consecutive stops.

    Parameters:
    route_elements (list): A list of route elements containing 'members' and 'tags'.
    node_elements (dict): A dictionary of node elements with their IDs as keys.

    Returns:
    nx.Graph: A NetworkX graph object representing the routes and stop nodes.
    """

    # Create a graph object
    G = nx.Graph()

    for route in route_elements:
        # Keep only stop members that are nodes returned by the query,
        # so every node and edge endpoint in the graph has a position
        stop_nodes = [
            member for member in route['members']
            if member['type'] == 'node'
            and 'stop' in member['role']
            and member['ref'] in node_elements
        ]

        # Add nodes to the graph
        for node in stop_nodes:
            ref = node['ref']
            node_data = node_elements[ref]
            # Nodes without tags or without a name fall back to their ID
            name = node_data.get('tags', {}).get('name', str(ref))
            colour = route['tags'].get('colour', '#808080')  # Default gray color if no color is defined

            G.add_node(ref, pos=(node_data['lon'], node_data['lat']), name=name, colour=colour)

        # Add edges between consecutive stop nodes
        for i in range(len(stop_nodes) - 1):
            G.add_edge(stop_nodes[i]['ref'], stop_nodes[i + 1]['ref'])
    return G
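
The edge-building step pairs each stop with its successor in route order. The sketch below isolates that idea with a hypothetical helper, `consecutive_pairs`, and made-up stop IDs. It also shows why an undirected graph does not duplicate edges when the same line appears once per direction, which is an assumption about how the routes are mapped.

```python
def consecutive_pairs(stop_refs):
    """Pair each stop ID with the next one in route order (hypothetical helper)."""
    return list(zip(stop_refs, stop_refs[1:]))


stops = [101, 102, 103, 104]  # illustrative stop IDs
print(consecutive_pairs(stops))  # [(101, 102), (102, 103), (103, 104)]

# The same line traversed in both directions yields mirrored pairs
forward = consecutive_pairs([101, 102, 103])
backward = consecutive_pairs([103, 102, 101])

# Treating each pair as unordered, as an undirected graph does, removes the duplicates
undirected_edges = {frozenset(pair) for pair in forward + backward}
print(len(undirected_edges))  # 2
```

A route with a single stop produces no pairs, so it contributes a node but no edges.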

## Visualizing the Graph with NetworkX
Here, we visualize the subway route graph using NetworkX. This visualization displays the stations and the routes connecting them. Each station is represented as a node, and the routes are edges.


In [4]:
def visualize_graph(G, title):
    """
    Visualizes the graph with nodes and edges.

    Parameters:
    G (nx.Graph): A NetworkX graph object representing the routes and stop nodes.
    title (str): The title shown above the plot.

    Returns:
    None: Displays a plot of the graph.
    """
    # Set the plot size
    plt.figure(figsize=(15, 8))  # Adjust the width and height as needed

    # Draw the graph with nodes and edges
    pos = nx.get_node_attributes(G, 'pos')
    node_labels = nx.get_node_attributes(G, 'name')
    # Build one colour per node, in node order, so the list always matches G.nodes
    node_colours = [G.nodes[n].get('colour', '#808080') for n in G.nodes]
    # Edges are drawn once, in gray, by nx.draw itself
    nx.draw(G, pos, with_labels=True, labels=node_labels, node_size=100,
            node_color=node_colours, edge_color='gray', font_size=8)
    plt.title(title)
    plt.show()

## Visualizing the Subway Routes with Folium
To make the subway routes more intuitive, we plot them on an interactive map using Folium. This map allows us to see the geographical layout of the stations and their connections, offering a spatial perspective on the subway network.


In [11]:

def create_folium_map(G, zoom_start=12):
    """
    Function to create a Folium map from a NetworkX graph with 'pos' attributes (longitude, latitude),
    and show tooltips with node names on hover.

    Parameters:
    G: NetworkX graph where nodes have 'pos' attributes with (longitude, latitude).
    zoom_start: Initial zoom level for the map.

    Returns:
    folium.Map object.
    """
    # Get the positions and labels of nodes
    positions = nx.get_node_attributes(G, 'pos')  # Get 'pos' attribute (lon, lat)
    node_labels = nx.get_node_attributes(G, 'name')  # Get node labels (names)
    node_colors = nx.get_node_attributes(G, 'colour')  # Get node colors

    if not positions:
        raise ValueError("Graph nodes must contain a 'pos' attribute with (longitude, latitude).")

    # Split 'pos' into lists of longitudes and latitudes
    latitudes = [pos[1] for pos in positions.values()]  # Extract latitudes
    longitudes = [pos[0] for pos in positions.values()]  # Extract longitudes

    center_lat = sum(latitudes) / len(latitudes)
    center_lon = sum(longitudes) / len(longitudes)

    # Create the Folium map centered on the calculated average coordinates
    folium_map = folium.Map(location=[center_lat, center_lon], zoom_start=zoom_start)

    # Add markers for each node in the graph with a tooltip showing the node name
    for node, pos in positions.items():
        node_name = node_labels.get(node, f"Node {node}")  # Default to "Node {node}" if no name
        color = node_colors.get(node, 'blue')
        folium.CircleMarker(
            location=[pos[1], pos[0]],  # (lat, lon)
            radius=3,  # Circle size
            color=color,  # Circle border color
            fill=True,
            fill_color=color,  # Circle fill color
            fill_opacity=0.7,  # Circle transparency
            tooltip=node_name  # Show node name when hovering
        ).add_to(folium_map)

    # Define a function to get the edge color from the colors of its two end nodes
    def get_edge_color(u, v):
        # Look up the color of both end nodes, defaulting to 'grey'
        color_u = node_colors.get(u, 'grey')
        color_v = node_colors.get(v, 'grey')
        # Use the shared color if both ends match, otherwise grey
        return color_u if color_u == color_v else 'grey'

    # Add edges to the map as lines connecting nodes
    for u, v in G.edges():
        if u in positions and v in positions:
            color = get_edge_color(u, v)
            folium.PolyLine(
                locations=[(positions[u][1], positions[u][0]),  # (lat, lon) for node u
                           (positions[v][1], positions[v][0])],  # (lat, lon) for node v
                color=color, weight=2.5, opacity=1
            ).add_to(folium_map)

    return folium_map
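
Two details of the function above are easy to get wrong: the graph stores positions as (longitude, latitude) while Folium expects [latitude, longitude], and an edge is colored only when both end nodes share a color. The sketch below reproduces both rules in plain Python. The helper names `map_center` and `edge_color`, and all coordinates and colors, are illustrative assumptions.

```python
def map_center(positions):
    """Average the node positions and return [lat, lon], the order Folium expects."""
    lats = [lat for _, lat in positions.values()]
    lons = [lon for lon, _ in positions.values()]
    return [sum(lats) / len(lats), sum(lons) / len(lons)]


def edge_color(colors, u, v, default="grey"):
    """Return the shared color of both end nodes, or the default if they differ."""
    color_u = colors.get(u, default)
    color_v = colors.get(v, default)
    return color_u if color_u == color_v else default


# node ID -> (lon, lat), matching the 'pos' attribute layout (illustrative values)
positions = {101: (13.44, 52.50), 102: (13.42, 52.52)}
colors = {101: "#7DAD4C", 102: "#7DAD4C", 103: "#F3791D"}

center = map_center(positions)
print(center)                         # latitude first, then longitude
print(edge_color(colors, 101, 102))   # both ends share a color
print(edge_color(colors, 102, 103))   # colors differ, so the default is used
```

Because each node keeps a single color, a station served by several lines takes the color of whichever route was processed last, and edges touching it may fall back to grey.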

## Query to Extract Subway Data
In this section, we construct an Overpass API query to extract subway routes in Berlin. The query selects every relation within the Berlin area whose `route` tag matches `subway`, then uses the recurse-down statement (`>;`) to fetch the ways and nodes those relations reference.

We use the Overpass API to request the data and receive a JSON response that includes subway routes and associated nodes. The extracted data will be used in subsequent steps to build a graph of the subway network.
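
The query for this section names Berlin directly. As a rough sketch of how the same pattern could be reused for another city, the hypothetical helper `build_subway_query` below fills in the area name, and `urlencode` shows what the resulting GET request looks like. The city name is only an example, and a name can match more than one area, so results for other inputs are not guaranteed.

```python
from urllib.parse import urlencode


def build_subway_query(area_name):
    """Return an Overpass QL query for subway route relations in a named area (hypothetical helper)."""
    return (
        "[out:json];\n"
        f'area[name="{area_name}"]->.searchArea;\n'
        'relation["route"~"subway"](area.searchArea);\n'
        "out meta;\n"
        ">;\n"
        "out body;\n"
    )


query_hamburg = build_subway_query("Hamburg")  # example city, chosen for illustration
print(query_hamburg)

# The query travels as the 'data' parameter of the GET request, percent-encoded
request_url = "https://lz4.overpass-api.de/api/interpreter?" + urlencode({"data": query_hamburg})
print(request_url[:80])
```

Passing `params={'data': query}` to `requests.get` performs an equivalent encoding, so building the URL by hand is only useful for inspection or debugging.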

In [6]:
query = """
[out:json];
area[name="Berlin"]->.searchArea;
relation["route"~"subway"](area.searchArea);
out meta;
>;
out body;
"""

## Applying Functions to Extract and Process Subway Data
Now that we have defined our query, we apply the previously defined functions to extract, process, and build a graph from the subway data. We first use `extract_osm_geodata()` to retrieve the data, then extract route and node elements with `extract_route_elements()` and `extract_node_elements()`. Finally, we call `create_route_graph()` to build a NetworkX graph of the subway routes and stops.

In [7]:
# Extract subway data using the Overpass API
subway_berlin_data = extract_osm_geodata(query)

# Extract route and node elements
route_elements = extract_route_elements(subway_berlin_data)
node_elements = extract_node_elements(subway_berlin_data)

# Create a graph of the subway network
subway_berlin_G = create_route_graph(route_elements, node_elements)
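
Once the graph exists, a few standard NetworkX calls give a first summary of its structure. The sketch below runs them on a small toy graph that stands in for `subway_berlin_G`. The node names are invented, and the rule "more than two neighbours means interchange" is a simplifying assumption.

```python
import networkx as nx

# Toy stand-in for the subway graph (illustrative station names)
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("B", "D"), ("E", "F")])

# Basic size of the network
print(G.number_of_nodes(), G.number_of_edges())

# Number of disconnected pieces; more than one suggests gaps in the extracted data
print(nx.number_connected_components(G))

# Stations with more than two neighbours are candidate interchanges
interchanges = sorted(node for node, degree in G.degree() if degree > 2)
print(interchanges)
```

The same calls can be applied to `subway_berlin_G` unchanged, since it is also an `nx.Graph`.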


## Visualization of the Subway Network
In this section, we visualize the Berlin subway network. A static view is available by calling `visualize_graph(subway_berlin_G, title)`, which draws the graph with NetworkX and Matplotlib.

The cell below builds the interactive Folium map. Subway stops appear as circle markers, edges represent the connections between consecutive stops, and hovering over a stop shows its name. Stops take the `colour` tag of their route (gray when the tag is missing), and an edge is colored only when both of its end stops share the same color.

In [10]:
# Visualize the graph with a Folium map
# (named subway_map to avoid shadowing Python's built-in map)
subway_map = create_folium_map(subway_berlin_G, zoom_start=12)
#subway_map.show_in_browser()

# Save the map to an HTML file
subway_map.save('subway_map.html')

# Display the map in the notebook
IFrame('subway_map.html', width=800, height=600)

## Conclusion
In this project, we successfully extracted subway route data from OpenStreetMap using the Overpass API and visualized it with both NetworkX and Folium. This process helps better understand the structure and connectivity of subway systems. Future work could involve expanding to other cities or incorporating additional types of public transit routes.
