# Overture Maps Land Use Data Fetch

This notebook demonstrates how to fetch **Land Use** data from Overture Maps for a specific Area of Interest (AOI) using DuckDB and save it as a Parquet file. It then visualizes the data using GeoPandas.

**Objective:** Fetch Land Use features (e.g., residential, commercial, industrial) for the Southeastern USA.

**Tools:**
*   `duckdb`: For querying data directly from S3.
*   `geopandas`: For handling and visualizing geospatial data.
*   `shapely`: For geometric operations.

In [None]:
import duckdb
import geopandas as gpd
from shapely.geometry import box
import matplotlib.pyplot as plt

# Configuration
OVERTURE_RELEASE = "2025-11-19.0"
THEME = "base"
TYPE = "land_use"
S3_PATH = f"s3://overturemaps-us-west-2/release/{OVERTURE_RELEASE}/theme={THEME}/type={TYPE}/*"
OUTPUT_PARQUET = "overture_land_use_se_usa.parquet"

# AOI: Southeastern USA
# BBOX: (xmin, ymin, xmax, ymax)
AOI_BBOX = (-89.9190, 24.7674, -76.7229, 36.7593)

print(f"Target S3 Path: {S3_PATH}")
print(f"Area of Interest (BBOX): {AOI_BBOX}")

## 1. Fetch Data with DuckDB

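We use DuckDB to query the Overture Maps Parquet files directly from S3. We filter on each feature's `bbox` column so that only features whose bounding box lies entirely inside the AOI are retrieved; features that straddle the AOI boundary are excluded.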

**Schema Note:** We are selecting `id`, `subtype`, `class`, and `geometry`.
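
The `WHERE` clause in the query below keeps a feature only when its `bbox` is strictly inside the AOI. As a minimal sketch of how that predicate is assembled, and of an alternative that also keeps features overlapping the AOI edge, the hypothetical helper `bbox_predicate` (not part of DuckDB or Overture) builds both forms from a `(xmin, ymin, xmax, ymax)` tuple. The `intersects` variant is an assumption about what a looser filter could look like; this notebook does not use it.

```python
# Hypothetical helper for illustration; the name and the "mode" parameter are assumptions.
AOI_BBOX = (-89.9190, 24.7674, -76.7229, 36.7593)  # (xmin, ymin, xmax, ymax)


def bbox_predicate(aoi, mode="within"):
    """Return a SQL predicate string over a struct column named `bbox`."""
    xmin, ymin, xmax, ymax = aoi
    if mode == "within":
        # Feature bbox lies strictly inside the AOI (the filter used in this notebook).
        return (
            f"bbox.xmin > {xmin} AND bbox.xmax < {xmax} AND "
            f"bbox.ymin > {ymin} AND bbox.ymax < {ymax}"
        )
    if mode == "intersects":
        # Feature bbox overlaps the AOI, including features that cross its edge.
        return (
            f"bbox.xmin <= {xmax} AND bbox.xmax >= {xmin} AND "
            f"bbox.ymin <= {ymax} AND bbox.ymax >= {ymin}"
        )
    raise ValueError(f"unknown mode: {mode!r}")


print(bbox_predicate(AOI_BBOX))
print(bbox_predicate(AOI_BBOX, mode="intersects"))
```

The `within` form never returns geometry outside the AOI but drops large polygons that cross its border. The `intersects` form keeps them, at the cost of geometry that extends beyond the AOI.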

In [None]:
con = duckdb.connect()

# Install and load necessary extensions
con.execute("INSTALL spatial; LOAD spatial;")
con.execute("INSTALL httpfs; LOAD httpfs;")

# Set the bucket region; the Overture bucket is public, so no credentials are configured
con.execute("SET s3_region='us-west-2';")

query = f"""
COPY (
    SELECT
        id,
        subtype,
        class,
        geometry
    FROM read_parquet('{S3_PATH}')
    WHERE
        bbox.xmin > {AOI_BBOX[0]} AND
        bbox.xmax < {AOI_BBOX[2]} AND
        bbox.ymin > {AOI_BBOX[1]} AND
        bbox.ymax < {AOI_BBOX[3]}
) TO '{OUTPUT_PARQUET}' (FORMAT PARQUET, COMPRESSION 'ZSTD', ROW_GROUP_SIZE 100000);
"""

print("Executing DuckDB Query... (This may take a few minutes)")
con.execute(query)
print(f"✅ Data successfully saved to {OUTPUT_PARQUET}")
con.close()

## 2. Load and Visualize Data

Now we load the saved Parquet file into a GeoDataFrame and visualize the land use classes.

In [None]:
print(f"Reading {OUTPUT_PARQUET}...")
gdf = gpd.read_parquet(OUTPUT_PARQUET)
print(f"Loaded {len(gdf)} rows.")

# Display first few rows
display(gdf.head())
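
Because the query keeps only features whose `bbox` is inside the AOI, the extent of the loaded data should not exceed the AOI. A minimal sketch of that sanity check follows. `bounds_within` is a hypothetical helper, and the sample bounds are made-up values standing in for `gdf.total_bounds`, which GeoPandas returns as `(minx, miny, maxx, maxy)`.

```python
# Hypothetical sanity check; the helper name and the sample bounds are illustrative assumptions.
AOI_BBOX = (-89.9190, 24.7674, -76.7229, 36.7593)  # (xmin, ymin, xmax, ymax)


def bounds_within(bounds, aoi):
    """Return True when (minx, miny, maxx, maxy) lies inside the AOI bbox."""
    minx, miny, maxx, maxy = bounds
    xmin, ymin, xmax, ymax = aoi
    return bool(xmin <= minx and maxx <= xmax and ymin <= miny and maxy <= ymax)


# In the notebook this would be: bounds_within(tuple(gdf.total_bounds), AOI_BBOX)
sample_bounds = (-85.0, 30.0, -80.0, 35.0)  # made-up extent for illustration
print(bounds_within(sample_bounds, AOI_BBOX))
```

If the check fails on real data, the stored `bbox` values and the actual geometries may disagree, which would be worth investigating before plotting.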

In [None]:
# Plotting
fig, ax = plt.subplots(figsize=(12, 10))
gdf.plot(column='class', ax=ax, legend=True, legend_kwds={'bbox_to_anchor': (1, 1)})
ax.set_title("Overture Land Use - Southeastern USA")
ax.set_axis_off()
plt.tight_layout()
plt.show()

In [None]:
# Summary of Land Use Classes
print("Land Use Class Distribution:")
print(gdf['class'].value_counts())