# Adding Spatial Context with OpenStreetMap (OSM) Using the Overpass API

In this notebook, we download real-world geographic features from OpenStreetMap
(OSM) to add **spatial context** to our analysis. Examples of “context” layers
include:

- supermarkets and stores
- schools and hospitals
- parks
- roads or transit features

We will:
1. Send an Overpass API query to OpenStreetMap
2. Extract returned locations (lat/lon + tags)
3. Visualize results on an interactive map (Folium)
4. View results as a table (Pandas DataFrame)

**Prerequisite:** You should already have a study area in mind (city/county/region).
(Geocoded points from previous notebooks are optional.)

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
# import packages
import pandas as pd
import requests
import folium
from IPython.display import display

## Querying OpenStreetMap for Context Layers


### What is the Overpass API?

Overpass is a query service for OpenStreetMap. It lets us request specific
features (shops, parks, schools, etc.) by specifying tags such as:

- `shop=supermarket`
- `amenity=school`
- `leisure=park`

The result includes geographic coordinates and descriptive tags (like name).
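
To make the shape of a result concrete, here is a minimal sketch that parses a hand-written sample shaped like Overpass JSON output. The IDs, names, and coordinates are placeholders, not real OSM data, and the helper name `element_coords` is hypothetical. The sketch assumes the query ends with `out center;`, in which case nodes carry `lat`/`lon` directly while ways and relations carry them under `center`.

```python
import json

# Hand-written sample shaped like an Overpass JSON response.
# IDs, names, and coordinates are placeholders, not real OSM data.
sample_response = json.loads("""
{
  "elements": [
    {"type": "node", "id": 1, "lat": 39.77, "lon": -86.15,
     "tags": {"shop": "supermarket", "name": "Example Market"}},
    {"type": "way", "id": 2, "center": {"lat": 39.80, "lon": -86.20},
     "tags": {"shop": "supermarket"}},
    {"type": "relation", "id": 3, "tags": {"shop": "supermarket"}}
  ]
}
""")


def element_coords(element):
    """Return (lat, lon) for an element, or None if it has no coordinates."""
    center = element.get("center", {})
    lat = element.get("lat", center.get("lat"))
    lon = element.get("lon", center.get("lon"))
    if lat is None or lon is None:
        return None
    return (lat, lon)


rows = []
for element in sample_response["elements"]:
    coords = element_coords(element)
    if coords is None:
        continue  # skip elements without usable coordinates
    # Not every feature has a name tag, so fall back to a placeholder
    name = element.get("tags", {}).get("name", "Unknown")
    rows.append((element["type"], name, coords[0], coords[1]))

for row in rows:
    print(row)
```

The third element is skipped because it has neither `lat`/`lon` nor a `center`, which is why the functions later in this notebook check for coordinates before using an element.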


### Workflow Overview

1. Build an Overpass query using OSM tags  
2. Send the query to the Overpass endpoint  
3. Parse returned JSON  
4. Convert results into a table  
5. Plot results on a map  


### Choosing Correct OSM Tags

To find the right tag keywords, use the OpenStreetMap “Map Features” reference:
- https://wiki.openstreetmap.org/wiki/Map_features


### Important Note About Search Terms

OSM queries work best using **standard OSM tags** (e.g., `shop=supermarket`)
rather than brand names like “Target” or “Walmart”.

Brand filtering is possible, but it is less consistent because OSM tagging
depends on volunteer contributors.
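
As a rough illustration of how a tag filter and an optional brand filter combine into one Overpass QL query, the sketch below builds the query text without sending it. The helper `build_overpass_query` and its parameters are hypothetical. It deliberately swaps the exact brand match for a case-insensitive regular-expression match (`~` with the `,i` flag), which is one way to tolerate inconsistent brand tagging. Inputs are not escaped, so a value containing a double quote would break the query.

```python
def build_overpass_query(key, value, area_name, brand=None, timeout_s=25):
    """Build an Overpass QL query string (hypothetical helper for illustration)."""
    tag_filter = f'["{key}"="{value}"]'
    if brand:
        # "~" is a regex match and ",i" makes it case-insensitive, so
        # "Walmart" also matches values like "walmart" or "Walmart Supercenter".
        tag_filter += f'["brand"~"{brand}",i]'
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'area["name"="{area_name}"]->.searchArea;\n'
        "(\n"
        f"  node{tag_filter}(area.searchArea);\n"
        f"  way{tag_filter}(area.searchArea);\n"
        f"  relation{tag_filter}(area.searchArea);\n"
        ");\n"
        "out center;"
    )


# Tag-only query, then the same query narrowed by brand
print(build_overpass_query("shop", "supermarket", "Indianapolis"))
print()
print(build_overpass_query("shop", "supermarket", "Indianapolis", brand="Walmart"))
```

Printing the query before sending it is a cheap way to check the tag keywords against the Map Features reference.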


In [3]:
def get_locations(categories, queries, cities, state, country, brand=None):
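    """Fetch locations from OpenStreetMap using Overpass API.

    Note: `state` and `country` are accepted but not used in the query;
    the search area is matched by city name only.
    """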
    overpass_url = "https://overpass-api.de/api/interpreter"
    all_locations = []

    # Construct the optional brand filter
    brand_filter = f'["brand"="{brand}"]' if brand else ''

    # Return early if any required input list is empty
    if not categories or not queries or not cities:
        print("No categories, queries, or cities provided.")
        return []

    for category in categories:
        for city in cities:
            for query in queries:
                overpass_query = f"""
                [out:json];
                area[name="{city}"]->.searchArea;
                (
                  node["{category}"="{query}"]{brand_filter}(area.searchArea);
                  way["{category}"="{query}"]{brand_filter}(area.searchArea);
                  relation["{category}"="{query}"]{brand_filter}(area.searchArea);
                );
                out center;
                """

                try:
                    # A client-side timeout keeps the cell from hanging on a busy server
                    response = requests.get(overpass_url, params={'data': overpass_query}, timeout=60)
                    response.raise_for_status()
                    data = response.json()
                    all_locations.extend(data.get("elements", []))
                # JSONDecodeError subclasses RequestException, so it must be caught first
                except requests.exceptions.JSONDecodeError:
                    print(f"Error decoding JSON response from API for {category}={query} in {city}.")
                except requests.exceptions.RequestException as e:
                    print(f"Request error for {category}={query} in {city}: {e}")

    return all_locations

In [4]:
def plot_locations(data, city, state, country):
    """Plot locations on a Folium map."""
    if not data:
        print("No locations found.")
        return None

    # Extract the first valid location for map centering
    for place in data:
        lat = place.get('lat', place.get('center', {}).get('lat'))
        lon = place.get('lon', place.get('center', {}).get('lon'))
        if lat is not None and lon is not None:  # 0.0 is a valid coordinate
            m = folium.Map(location=[lat, lon], zoom_start=12)
            break
    else:
        print("No valid coordinates found.")
        return None

    # Add markers
    for place in data:
        lat = place.get('lat', place.get('center', {}).get('lat'))
        lon = place.get('lon', place.get('center', {}).get('lon'))
        if lat is not None and lon is not None:  # 0.0 is a valid coordinate
            name = place.get('tags', {}).get('name', 'Unknown')
            folium.Marker([lat, lon], popup=f"{name} ({lat}, {lon})").add_to(m)

    return m

In [5]:
def display_locations(data):
    """Display location names with coordinates in a DataFrame."""
    locations = []
    for place in data:
        lat = place.get('lat', place.get('center', {}).get('lat'))
        lon = place.get('lon', place.get('center', {}).get('lon'))
        if lat is not None and lon is not None:  # 0.0 is a valid coordinate
            name = place.get('tags', {}).get('name', 'Unknown')
            locations.append([name, lat, lon])

    df = pd.DataFrame(locations, columns=['Name', 'Latitude', 'Longitude'])
    return df

### Refer to the OSM Map Features Page to See Which Tag Keywords to Use

In [8]:
# Example Query Parameters
category = ["shop"]  # General category
#queries = ["supermarket", "department_store", "greengrocer", "farm", "health_food", "retail"]  # Multiple specific queries
queries = ["supermarket"]
#cities = ["Indianapolis", "Lawrence"]  # Cities in Marion County, IN
cities = ["Indianapolis"]
state = "Indiana"
country = "USA"
brand = None  # Set to a brand name such as "Walmart" or "Target" to filter, or leave as None

# Fetch Data
data = get_locations(category, queries, cities, state, country, brand)

# Plot Data on Map
map_result = plot_locations(data, cities[0] if cities else None, state, country)

# Display DataFrame of Locations
df_locations = display_locations(data)

# Display the map and data
if map_result:
    display(map_result)

display(df_locations)

|     | Name | Latitude | Longitude |
|-----|------|----------|-----------|
| 0 | Kroger | 39.913798 | -86.205519 |
| 1 | Trader Joe's | 39.912493 | -86.212464 |
| 2 | Kroger | 39.875170 | -86.119302 |
| 3 | Hana Market | 39.754988 | -86.242418 |
| 4 | Needler's Fresh Market | 39.771972 | -86.151670 |
| ... | ... | ... | ... |
| 112 | El Rancho Grande | 39.808881 | -86.240642 |
| 113 | Apna Bazaar | 39.822927 | -86.268925 |
| 114 | Safeway | 39.789167 | -86.083754 |
| 115 | Saraga International Grocery | 39.650968 | -86.120974 |


## Extra: Something to Think About - `Reverse Geocoding`

### (coords -> place names)

Example Coordinates:

`(lat, lon) = (38.8977, -77.0365)`

Reverse-geocoding converts coordinates to a street address like:

`1600 Pennsylvania Ave NW, Washington, DC 20500`

Geopy supports reverse geocoding, which is handy when you have GPS points and want human-readable place names.

In [14]:
from geopy.geocoders import Nominatim

In [15]:
# Initialize the geolocator
geolocator = Nominatim(user_agent="r_geocoder_sample")  # Replace with your app name

**Note:** The geocoding server may be busy. If the cell below fails, retry it later.

In [16]:
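# Reverse geocode a sample coordinate (in Indianapolis)
reverse = geolocator.reverse((39.7684, -86.1581), exactly_one=True, timeout=30)  # 30-second timeout
# reverse() returns None when no address is found, so guard before reading .address
print("Reverse result:", reverse.address if reverse else "no address found")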

Reverse result: Indianapolis, Marion County, Indiana, 46282, United States
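
Because the server may be busy, one option is to retry the call automatically instead of re-running the cell by hand. The sketch below is a generic retry loop with exponential backoff, not part of geopy. The helper `retry_call` and its parameters are hypothetical, and `flaky_lookup` is a stand-in that simulates a busy server so the example runs offline.

```python
import time


def retry_call(func, *args, attempts=3, base_delay_s=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            delay = base_delay_s * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)


# Stand-in for a flaky network call (fails twice, then succeeds).
calls = {"count": 0}


def flaky_lookup(coords):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("server busy")
    return f"address for {coords}"


# sleep is replaced with a no-op so this demo does not actually wait
result = retry_call(flaky_lookup, (39.7684, -86.1581), sleep=lambda s: None)
print(result)
```

In this notebook the same helper could wrap the real call, for example `retry_call(geolocator.reverse, (39.7684, -86.1581), exactly_one=True, timeout=30)`. Catching every `Exception` is a simplification. Narrowing the `except` clause to geopy's timeout and availability errors would avoid retrying on genuine bugs.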
