# Overture Maps Loader

`OvertureMapsLoader` downloads Overture Maps data from the S3 bucket for a given region.

It is a wrapper around the [`OvertureMaestro`](https://github.com/kraina-ai/overturemaestro) library, which can download the data in the original format and also offers some advanced functions.

In the `SRAI` context, `OvertureMapsLoader` returns features in the so-called wide format, in which each column represents a potential category of the object. If you want to read more about this format, check out the OvertureMaestro [docs page](https://kraina-ai.github.io/overturemaestro/latest/examples/advanced_functions/wide_form/).
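
To make the idea of a wide format concrete, here is a minimal sketch built with plain `pandas` on toy data. It does not call `OvertureMapsLoader`. The feature ids and most category names are hypothetical and only mimic the `theme|type|category` naming used later in this notebook.

```python
import pandas as pd

# Hypothetical long-form records: one row per (feature, category) pair.
long_df = pd.DataFrame(
    {
        "feature_id": ["f1", "f2", "f2", "f3"],
        "category": [
            "transportation|segment|road",
            "base|water|river",
            "base|water|canal",
            "buildings|building|residential",
        ],
    }
)

# Pivot to wide form: one row per feature, one boolean column per category.
wide_df = pd.crosstab(long_df["feature_id"], long_df["category"]).astype(bool)

print(wide_df)

# Summing a boolean column counts the features that belong to that category.
print(wide_df.sum())
```

In this layout a feature can be flagged in several columns at once, which is why the plotting code below selects rows with `.any(axis=1)` over a group of columns.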

In [None]:
from shapely.geometry import box

from srai.constants import GEOMETRY_COLUMN
from srai.loaders import OvertureMapsLoader
from srai.regionalizers import geocode_to_region_gdf

## Using OvertureMapsLoader to download data for a specific area

### Download all available features in Paris, France

In [None]:
loader = OvertureMapsLoader()
paris = geocode_to_region_gdf("Paris")
paris_features_gdf = loader.load(paris)
paris_features_gdf

### Plot features

Colours from this palette: https://colorhunt.co/palette/f8ededff8225b43f3f173b45

In [None]:
ax = paris.plot(color="#F8EDED", figsize=(16, 16))

# plot water
water_columns = [c for c in paris_features_gdf.columns if "water" in c]
water_data = paris_features_gdf[paris_features_gdf[water_columns].any(axis=1)]
water_data.plot(ax=ax, color="#FF8225", markersize=0)

# plot roads
roads_data = paris_features_gdf[paris_features_gdf["transportation|segment|road"]]
roads_data.plot(ax=ax, color="#B43F3F", markersize=0, linewidth=0.25)

# plot buildings
building_columns = [c for c in paris_features_gdf.columns if c.startswith("buildings")]
buildings_data = paris_features_gdf[paris_features_gdf[building_columns].any(axis=1)]
buildings_data.plot(ax=ax, color="#173B45", markersize=0)

paris.boundary.plot(ax=ax, color="#173B45", linewidth=2, alpha=0.5)

xmin, ymin, xmax, ymax = paris.total_bounds
ax.set_xlim(xmin - 0.001, xmax + 0.001)
ax.set_ylim(ymin - 0.001, ymax + 0.001)

ax.set_axis_off()

### Download more detailed data with a higher hierarchy depth

By default, the `hierarchy_depth` value is equal to `1`, but it can be set to `None` to get a full list of all possible columns.
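
The sketch below illustrates one way to think about hierarchy depth, using toy data only. It assumes that a depth of `1` keeps the theme, the type and one category level, as the `transportation|segment|road` column name used earlier suggests. The helper `collapse_columns` and the four-part column names are hypothetical and are not part of `SRAI` or `OvertureMaestro`.

```python
import pandas as pd


def collapse_columns(wide_df: pd.DataFrame, depth: int) -> pd.DataFrame:
    """Merge boolean columns that share theme, type and the first `depth` category levels."""
    merged = {}
    for column in wide_df.columns:
        # Keep theme + type + `depth` category levels of the pipe-delimited name.
        key = "|".join(column.split("|")[: 2 + depth])
        if key in merged:
            # A feature belongs to the merged column if it belongs to any child column.
            merged[key] = merged[key] | wide_df[column]
        else:
            merged[key] = wide_df[column]
    return pd.DataFrame(merged)


# Hypothetical full-depth columns for three features.
full_depth_df = pd.DataFrame(
    {
        "places|place|eat_and_drink|restaurant": [True, False, False],
        "places|place|eat_and_drink|bar": [False, True, False],
        "transportation|segment|road|residential": [False, False, True],
    },
    index=["f1", "f2", "f3"],
)

shallow_df = collapse_columns(full_depth_df, depth=1)
print(shallow_df.columns.tolist())
```

A lower depth yields fewer, broader columns, while `None` in the loader keeps every available level.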

In [None]:
manhattan_bbox = box(-73.994551, 40.762396, -73.936872, 40.804239)
loader = OvertureMapsLoader(hierarchy_depth=None)
new_york_features_gdf = loader.load(manhattan_bbox)
new_york_features_gdf

As you can see, there are over `2600` columns available.

Let's see the 20 most popular columns.

In [None]:
new_york_features_gdf.drop(columns=GEOMETRY_COLUMN).sum().sort_values(ascending=False).head(20)

## Configure places dataset

The places schema is the only one that is treated differently from the other data types.

By default, places use both `primary` and `alternate` categories to define a feature.

Additionally, a filter is applied to keep only features with a confidence score `>= 0.75`.

There are two dedicated parameters: `places_minimal_confidence` and `places_use_primary_category_only` to configure how the data should be transformed.
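
As a rough illustration of what these two settings control, the sketch below applies the same kind of filtering to a toy table. It is not the loader's implementation. The flat `confidence`, `primary` and `alternate` columns, the place ids and the helper `select_categories` are all hypothetical.

```python
import pandas as pd

# Hypothetical places records with a confidence score and category labels.
places = pd.DataFrame(
    {
        "id": ["p1", "p2", "p3"],
        "confidence": [0.99, 0.80, 0.50],
        "primary": ["cafe", "bar", "hotel"],
        "alternate": [["bakery"], [], ["hostel"]],
    }
)


def select_categories(df, minimal_confidence=0.75, primary_only=False):
    """Return {place id: categories} for places that pass the confidence filter."""
    kept = df[df["confidence"] >= minimal_confidence]
    result = {}
    for row in kept.itertuples(index=False):
        categories = [row.primary]
        if not primary_only:
            # Alternate categories also describe the place unless disabled.
            categories += list(row.alternate)
        result[row.id] = categories
    return result


default_selection = select_categories(places)
strict_selection = select_categories(places, minimal_confidence=0.99, primary_only=True)
print(default_selection)
print(strict_selection)
```

Raising the threshold drops low-confidence places entirely, while the primary-only switch reduces each remaining place to a single category.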

Let's look at an example that uses both of these parameters. We will also use the `theme_type_pairs` parameter to limit the scope of the downloaded data.

In [None]:
default_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")], places_use_primary_category_only=True
)
strict_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")],
    places_minimal_confidence=0.99,
    places_use_primary_category_only=True,
)
shibuya = geocode_to_region_gdf("Shibuya, Tokyo")
shibuya_default_confidence_features_gdf = default_confidence_loader.load(shibuya)
shibuya_strict_confidence_features_gdf = strict_confidence_loader.load(shibuya)

# number of places returned by each loader
print(f"Default confidence features: {len(shibuya_default_confidence_features_gdf)}")
print(f"Strict confidence features:  {len(shibuya_strict_confidence_features_gdf)}")

Let's see the count of categories in the places dataset with confidence score `>= 0.99`.

In [None]:
shibuya_strict_confidence_features_df = shibuya_strict_confidence_features_gdf.drop(
    columns=GEOMETRY_COLUMN
)
shibuya_strict_confidence_features_df.sum().loc[lambda x: x > 0].sort_values(ascending=False)

### Plot features

Now we will see the difference between the default list of places (gray dots) and the strict ones (coloured circles).

In [None]:
m = shibuya_default_confidence_features_gdf.loc[
    shibuya_default_confidence_features_gdf.index.difference(
        shibuya_strict_confidence_features_gdf.index
    )
].geometry.explore(
    tiles="CartoDB Voyager",
    color="gray",
    tooltip=False,
    style_kwds=dict(opacity=0.25, stroke=False),
)
shibuya.boundary.explore(m=m, color="black")

shibuya_gdf_with_categories = shibuya_strict_confidence_features_gdf.join(
    shibuya_strict_confidence_features_df.dot(shibuya_strict_confidence_features_df.columns).rename(
        "category"
    )
)
shibuya_gdf_with_categories.geometry.explore(
    m=m,
    tooltip=False,
    marker_kwds=dict(radius=6),
    style_kwds=dict(color="black", fillOpacity=1),
)
shibuya_gdf_with_categories[[GEOMETRY_COLUMN, "category"]].explore(
    m=m,
    column="category",
    tooltip=["feature_id", "category"],
    cmap="tab20",
    marker_kwds=dict(radius=4),
    style_kwds=dict(fillOpacity=1),
)