# OSM PBF Loader

`OSMPbfLoader` quickly parses a full OSM extract stored as an `*.osm.pbf` file.

It can download and parse large numbers of features much faster than `OSMOnlineLoader`, and it is most useful when many different features are required at once (for example, when using predefined filters).

When only one or a few features are needed, `OSMOnlineLoader` might be a better choice, since `OSMPbfLoader` uses a full extract of all features in a given region and has to iterate over all of them.

In [None]:
from srai.loaders.osm_loaders.filters import HEX2VEC_FILTER, GEOFABRIK_LAYERS
from srai.loaders.osm_loaders.filters.popular import get_popular_tags
from srai.loaders.osm_loaders import OSMPbfLoader
from srai.constants import REGIONS_INDEX, WGS84_CRS
from srai.regionalizers import geocode_to_region_gdf
from srai.geometry import buffer_geometry

from shapely.geometry import Point, box
import geopandas as gpd

## Using OSMPbfLoader to download data for a specific area

### Download all features from `HEX2VEC_FILTER` in Warsaw, Poland

In [None]:
# geocode the city boundary, then load every feature matching the filter
loader = OSMPbfLoader()
warsaw_gdf = geocode_to_region_gdf("Warsaw, Poland")
warsaw_features_gdf = loader.load(warsaw_gdf, HEX2VEC_FILTER)
warsaw_features_gdf

### Plot features

Inspired by [`prettymaps`](https://github.com/marceloprates/prettymaps)

In [None]:
clipped_features_gdf = warsaw_features_gdf.clip(warsaw_gdf.geometry.unary_union)

In [None]:
ax = warsaw_gdf.plot(color="lavender", figsize=(16, 16))

# plot water
clipped_features_gdf.dropna(subset=["water", "waterway"], how="all").plot(
    ax=ax, color="deepskyblue"
)

# plot greenery
clipped_features_gdf[
    clipped_features_gdf["landuse"].isin(
        ["grass", "orchard", "flowerbed", "forest", "greenfield", "meadow"]
    )
].plot(ax=ax, color="mediumseagreen")

# plot buildings
clipped_features_gdf.dropna(subset=["building"], how="all").plot(
    ax=ax, color="dimgray", markersize=0.1
)

xmin, ymin, xmax, ymax = warsaw_gdf.total_bounds
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)

ax.set_axis_off()

### Download all features from popular tags based on OSMTagInfo in Vienna, Austria

In [None]:
popular_tags = get_popular_tags(in_wiki_only=True)

num_keys = len(popular_tags)
f"Unique keys: {num_keys}."

In [None]:
{k: popular_tags[k] for k in list(popular_tags)[:10]}
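
The filter passed to the loader is a mapping from OSM tag keys to accepted values, as the preview above shows. As a rough illustration of what such a filter selects, here is a plain-Python sketch. It is not how `OSMPbfLoader` is implemented; `TagsFilter`, `matches_filter` and the sample data are hypothetical names introduced only for this example, and the sketch assumes a filter value may be `True` (any value), a single string, or a list of strings.

```python
from typing import Dict, List, Union

# Hypothetical alias for illustration: tag key -> True (any value),
# a single accepted value, or a list of accepted values.
TagsFilter = Dict[str, Union[bool, str, List[str]]]


def matches_filter(feature_tags: Dict[str, str], tags_filter: TagsFilter) -> bool:
    """Return True if at least one tag of the feature is accepted by the filter."""
    for key, accepted in tags_filter.items():
        value = feature_tags.get(key)
        if value is None:
            # The feature does not carry this tag key at all.
            continue
        if accepted is True:
            return True
        if isinstance(accepted, str) and value == accepted:
            return True
        if isinstance(accepted, list) and value in accepted:
            return True
    return False


# Made-up filter and features, not taken from the popular tags list.
sample_filter: TagsFilter = {"building": True, "amenity": ["parking", "cafe"]}

print(matches_filter({"building": "residential"}, sample_filter))
print(matches_filter({"amenity": "bank"}, sample_filter))
```

A larger filter means more keys to check per feature, which is part of why a full extract pays off mainly when many tags are requested at once.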

In [None]:
vienna_center_circle = buffer_geometry(Point(16.37009, 48.20931), meters=1000)
vienna_center_circle_gdf = gpd.GeoDataFrame(
    geometry=[vienna_center_circle],
    crs=WGS84_CRS,
    index=gpd.pd.Index(data=["Vienna"], name=REGIONS_INDEX),
)
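
To get a feel for how large a 1000 m buffer is in WGS84 degrees, the sketch below estimates the bounding box of such a circle around the same point. It relies on the common approximation of about 111.32 km per degree of latitude and is only an illustration: `approx_buffer_bounds` is a hypothetical helper, and `buffer_geometry` may compute the buffer differently.

```python
import math

# Rough average length of one degree of latitude (assumption for illustration).
METERS_PER_DEGREE_LAT = 111_320.0


def approx_buffer_bounds(lon: float, lat: float, meters: float):
    """Approximate (west, south, east, north) of a circle with the given radius."""
    dlat = meters / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = meters / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


west, south, east, north = approx_buffer_bounds(16.37009, 48.20931, 1000)
print(f"width: {east - west:.4f} deg, height: {north - south:.4f} deg")
```

At Vienna's latitude the box is wider in degrees of longitude than it is tall in degrees of latitude, which is why the buffer is specified in meters rather than in degrees.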

In [None]:
loader = OSMPbfLoader()
vienna_features_gdf = loader.load(vienna_center_circle_gdf, popular_tags)
vienna_features_gdf

### Plot features

Uses `default` preset colours from [`prettymaps`](https://github.com/marceloprates/prettymaps)

In [None]:
clipped_vienna_features_gdf = vienna_features_gdf.clip(vienna_center_circle)

In [None]:
ax = vienna_center_circle_gdf.plot(color="#F2F4CB", figsize=(16, 16))

# plot water
clipped_vienna_features_gdf.dropna(subset=["water", "waterway"], how="all").plot(
    ax=ax, color="#a8e1e6"
)

# plot streets
clipped_vienna_features_gdf.dropna(subset=["highway"], how="all").plot(
    ax=ax, color="#475657", markersize=0.1
)

# plot buildings
clipped_vienna_features_gdf.dropna(subset=["building"], how="all").plot(ax=ax, color="#FF5E5B")

# plot parking areas and pedestrian streets
clipped_vienna_features_gdf[
    (clipped_vienna_features_gdf["amenity"] == "parking")
    | (clipped_vienna_features_gdf["highway"] == "pedestrian")
].plot(ax=ax, color="#2F3737", markersize=0.1)

# plot greenery
clipped_vienna_features_gdf[
    clipped_vienna_features_gdf["landuse"].isin(
        ["grass", "orchard", "flowerbed", "forest", "greenfield", "meadow"]
    )
].plot(ax=ax, color="#8BB174")

xmin, ymin, xmax, ymax = vienna_center_circle_gdf.total_bounds
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)

ax.set_axis_off()

### Download all grouped features based on Geofabrik layers in New York, USA

In [None]:
manhattan_bbox = box(-73.994551, 40.762396, -73.936872, 40.804239)
manhattan_bbox_gdf = gpd.GeoDataFrame(
    geometry=[manhattan_bbox],
    crs=WGS84_CRS,
    index=gpd.pd.Index(data=["New York"], name=REGIONS_INDEX),
)

In [None]:
loader = OSMPbfLoader()
new_york_features_gdf = loader.load(manhattan_bbox_gdf, GEOFABRIK_LAYERS)
new_york_features_gdf
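
With a grouped filter, the resulting columns are group names and the cell values are `key=value` strings, which is why the plotting code compares the `leisure` column to `"leisure=park"`. The sketch below mimics that behaviour in plain Python. It is an assumption-laden illustration, not the loader's implementation: `first_matching_tag`, `group_feature` and the sample groups are hypothetical and do not reproduce the actual contents of `GEOFABRIK_LAYERS`.

```python
from typing import Dict, List, Optional, Union

# Hypothetical aliases for illustration only.
TagsFilter = Dict[str, Union[bool, str, List[str]]]
GroupedTagsFilter = Dict[str, TagsFilter]


def first_matching_tag(
    feature_tags: Dict[str, str], tags_filter: TagsFilter
) -> Optional[str]:
    """Return the first accepted tag as a 'key=value' string, or None."""
    for key, accepted in tags_filter.items():
        value = feature_tags.get(key)
        if value is None:
            continue
        if (
            accepted is True
            or value == accepted
            or (isinstance(accepted, list) and value in accepted)
        ):
            return f"{key}={value}"
    return None


def group_feature(
    feature_tags: Dict[str, str], grouped_filter: GroupedTagsFilter
) -> Dict[str, Optional[str]]:
    """Build one value per group, like one row of a grouped features table."""
    return {
        group: first_matching_tag(feature_tags, tags_filter)
        for group, tags_filter in grouped_filter.items()
    }


# Made-up groups, much smaller than a real layer definition.
sample_grouped_filter: GroupedTagsFilter = {
    "leisure": {"leisure": ["park", "playground"]},
    "buildings": {"building": True},
}

row = group_feature({"leisure": "park", "name": "Central Park"}, sample_grouped_filter)
print(row)
```

In this sketch a park ends up with `"leisure=park"` in the `leisure` group and `None` in the `buildings` group, so dropping rows where a group column is empty selects the features of that group.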

### Plot features

Inspired by the [`flat-pale`](https://snazzymaps.com/style/14889/flat-pale) style from Snazzy Maps

In [None]:
ax = manhattan_bbox_gdf.plot(color="#e7e7df", figsize=(16, 16))

# plot greenery (grouped columns hold "key=value" strings)
new_york_features_gdf[new_york_features_gdf["leisure"] == "leisure=park"].plot(
    ax=ax, color="#bae5ce"
)

# plot water
new_york_features_gdf.dropna(subset=["water", "waterways"], how="all").plot(ax=ax, color="#c7eced")

# plot streets
new_york_features_gdf.dropna(subset=["paths_unsuitable_for_cars"], how="all").plot(
    ax=ax, color="#e7e7df", linewidth=1
)
new_york_features_gdf.dropna(
    subset=["very_small_roads", "highway_links", "minor_roads"], how="all"
).plot(ax=ax, color="#fff", linewidth=2)
new_york_features_gdf.dropna(subset=["major_roads"], how="all").plot(
    ax=ax, color="#fac9a9", linewidth=3
)

# plot buildings
new_york_features_gdf.dropna(subset=["buildings"], how="all").plot(ax=ax, color="#cecebd")

xmin, ymin, xmax, ymax = manhattan_bbox_gdf.total_bounds
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)

ax.set_axis_off()