## Case Study - OpenAerialMap (OAM)

https://map.openaerialmap.org

> OpenAerialMap (OAM) is a set of tools for searching, sharing, and using openly licensed satellite and unmanned aerial vehicle (UAV) imagery.

Believe it or not, OAM was created more than 9 years ago. Its design was innovative and its backend was advanced for the time: it was one of the first applications to use dynamic tiling (https://github.com/mojodna/marblecutter-openaerialmap) and STAC-like metadata (https://github.com/hotosm/oam-api).

Looking at the web app and backend today, they still feel great, and better than a lot of similar projects.


### Goals

This notebook shows how to scrape the OAM API, create proper STAC Items and a Collection from its metadata, and ingest them into an eoAPI stack.


### Requirements

- `httpx`
- `pystac`
- `folium`
- `pypgstac` (==0.8.5, because it's the version of the pgstac database we'll use)


```
python -m pip install httpx pystac "pypgstac[psycopg]==0.8.5" folium
```


#### 1. Get OAM metadata

In [1]:
import httpx

# First get the number of items
resp = httpx.get("https://api.openaerialmap.org/meta", params={"limit": 1})
metadata = resp.json()["meta"]
print(metadata)

total_items_number = metadata["found"]
page_size = 500

# Get all the OAM metadata
items = []
page_number = 1
while True:
    resp = httpx.get("https://api.openaerialmap.org/meta", params={"limit": page_size, "page": page_number})
    resp.raise_for_status()
    if not resp.json()["results"]:
        break

    items.extend(resp.json()["results"])
    page_number += 1


{'provided_by': 'OpenAerialMap', 'license': 'CC-BY 4.0', 'website': 'http://beta.openaerialmap.org', 'page': 1, 'limit': 1, 'found': 16194}


In [2]:
# We make sure we have all the items
assert len(items) == total_items_number
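
The pagination loop above can also be written as a small generator that does not depend on the HTTP client, which makes it easy to try out without hitting the API. This is only a minimal sketch: `iter_pages` and `fetch_page` are hypothetical names, and the only thing assumed about the OAM API is what the loop above already relies on (a `results` list that is empty once the last page has been passed).

```python
from typing import Callable, Dict, Iterator, List


def iter_pages(fetch_page: Callable[[int], List[Dict]], start: int = 1) -> Iterator[Dict]:
    """Yield records page by page until `fetch_page` returns an empty list."""
    page = start
    while True:
        results = fetch_page(page)
        if not results:
            break
        yield from results
        page += 1


# Offline stand-in for the API: two "pages" of made-up records.
fake_pages = {1: [{"_id": "a"}, {"_id": "b"}], 2: [{"_id": "c"}]}
records = list(iter_pages(lambda page: fake_pages.get(page, [])))
print(len(records))
```

With `httpx`, `fetch_page` would be a function that requests `https://api.openaerialmap.org/meta` with the `limit` and `page` parameters used above and returns the `results` list of the response.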

#### 2. OAM metadata -> STAC

In [25]:
import json

import pystac
from pystac.utils import datetime_to_str, str_to_datetime
from pystac.extensions.item_assets import AssetDefinition, ItemAssetsExtension

with open("oam_items.njson", "w") as fout:
    for oam_item in items:
        if not oam_item['acquisition_start']:
            continue

        extensions = [
            "https://stac-extensions.github.io/file/v2.1.0/schema.json",
        ]

        extra_fields = {}
        asset_url = oam_item["uuid"]
        if asset_url.startswith("http://oin-hotosm") or asset_url.startswith("https://oin-hotosm.s3"):
            s3_url = asset_url.replace("http://oin-hotosm.s3.amazonaws.com", "s3://oin-hotosm")
            s3_url = s3_url.replace("http://oin-hotosm-staging.s3.amazonaws.com", "s3://oin-hotosm-staging")
            s3_url = s3_url.replace("https://oin-hotosm.s3.us-east-1.amazonaws.com", "s3://oin-hotosm")
            s3_url = s3_url.replace("https://oin-hotosm.s3.amazonaws.com", "s3://oin-hotosm")

            extra_fields = {
                "alternate:name": "S3",
                "alternate": {"s3": {"href": s3_url}},
            }
            extensions.append("https://stac-extensions.github.io/alternate-assets/v1.2.0/schema.json")

        item = pystac.Item(
            id=oam_item["_id"],
            geometry=oam_item["geojson"],
            bbox=oam_item["bbox"],
            collection="openaerialmap",
            stac_extensions=extensions,
            datetime=None,
            properties={
                "title": oam_item["title"],
                "platform": oam_item["platform"],
                "provider": oam_item["provider"],
                # "contact": oam_item["contact"],  # omitted for privacy reasons
                "start_datetime": datetime_to_str(str_to_datetime(oam_item["acquisition_start"])),
                "end_datetime": datetime_to_str(str_to_datetime(oam_item["acquisition_end"])),
                "file:size": oam_item["file_size"]
            },
        )
        item.add_link(
            pystac.Link(
                pystac.RelType.COLLECTION,
                "openaerialmap",
                media_type=pystac.MediaType.JSON,
            )
        )

        item.add_asset(
            key="image",
            asset=pystac.Asset(
                href=asset_url,
                media_type=pystac.MediaType.COG,
                roles=["data"],
                extra_fields=extra_fields,
            ),
        )

        # Here we make sure to remove any invalid characters (non-ASCII characters and escaped double quotes)
        fout.write(
            json.dumps(
                item.to_dict(), ensure_ascii=False
            ).encode("ascii", "ignore").decode("utf-8").replace('\\"', "") + "\n"
        )

init_data = min(
    [
       str_to_datetime(oam_item["acquisition_start"]) for oam_item in items if oam_item["acquisition_start"]
    ]
)

collection = pystac.Collection(
    id="openaerialmap",
    title="OpenAerialMap Imagery catalog",
    description="OpenAerialMap",
    extent=pystac.Extent(
        spatial=pystac.SpatialExtent(
            [
                [-180, -90, 180, 90],
            ],
        ),
        temporal=pystac.TemporalExtent(
            intervals=[
                [init_data, None],
            ]
        )
    ),
    license="CC-BY-4.0",  # SPDX identifier for the "CC-BY 4.0" license reported by the OAM API
    extra_fields={
        "renders": {
            "visual": {
                "title": "Visual Image",
                "assets": [
                    "image"
                ],
            },
        },
    },
)

item_assets = ItemAssetsExtension.ext(collection, add_if_missing=True)
item_assets.item_assets = {
    "image": AssetDefinition.create(
        title="image",
        description=None,
        media_type="image/tiff; application=geotiff; profile=cloud-optimized",
        roles=None,
    ),
}

with open("oam_collection.json", "w") as f:
    f.write(json.dumps(collection.to_dict(), ensure_ascii=False))
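
The chain of `str.replace` calls used above to build the S3 alternate link can also be expressed as a lookup on the URL's hostname. The following is a sketch rather than the notebook's method: `HOST_TO_BUCKET` and `to_s3_url` are hypothetical names, the hostnames are the ones that appear in the cell above, and unlike the original it accepts both `http` and `https` for every listed host.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostnames seen in the conversion cell above, mapped to their bucket names.
HOST_TO_BUCKET = {
    "oin-hotosm.s3.amazonaws.com": "oin-hotosm",
    "oin-hotosm.s3.us-east-1.amazonaws.com": "oin-hotosm",
    "oin-hotosm-staging.s3.amazonaws.com": "oin-hotosm-staging",
}


def to_s3_url(href: str) -> Optional[str]:
    """Return the s3:// form of a known OAM HTTP(S) URL, or None if the host is unknown."""
    parsed = urlparse(href)
    bucket = HOST_TO_BUCKET.get(parsed.netloc)
    if parsed.scheme not in ("http", "https") or bucket is None:
        return None
    return f"s3://{bucket}{parsed.path}"


print(to_s3_url("https://oin-hotosm.s3.us-east-1.amazonaws.com/66e0032ecd0baa0001b62068/0/66e0032ecd0baa0001b62069.tif"))
```

Returning `None` for unknown hosts makes it explicit which assets get no `alternate` entry.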


#### 3. Ingest OAM STAC

```bash
pypgstac load collections oam_collection.json --dsn postgresql://{db-user}:{db-password}@{db-host}:{db-port}/{db-name} --method insert

# NOTE: we need to set `--method ignore` because some items are duplicated in the OAM database
pypgstac load items oam_items.njson --dsn postgresql://{db-user}:{db-password}@{db-host}:{db-port}/{db-name} --method ignore
```
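
Since `--method ignore` skips duplicated ids without reporting them, it can help to count them before loading. A minimal sketch, assuming one JSON Item per line as written to `oam_items.njson` above; `find_duplicate_ids` is a hypothetical helper, and the sample lines are made up.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def find_duplicate_ids(lines: Iterable[str]) -> Dict[str, int]:
    """Map each item id that appears more than once to its number of occurrences."""
    counts = Counter(json.loads(line)["id"] for line in lines if line.strip())
    return {item_id: n for item_id, n in counts.items() if n > 1}


# Made-up lines standing in for the content of `oam_items.njson`.
sample = ['{"id": "a"}', '{"id": "b"}', '{"id": "a"}', ""]
print(find_duplicate_ids(sample))
```

On the real file, the open file object can be passed directly, for example `with open("oam_items.njson") as f: find_duplicate_ids(f)`.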

#### 4. Search and visualize some data

In [43]:
import httpx

stac_endpoint = "https://stac.eoapi.dev"

#### Let's try to visualize a dataset from a specific Provider

We can use the STAC API `filter` parameter to find the Items whose `properties.provider` field equals `'CEN NA - MD'`.

In [63]:
# use /search endpoint with some `filter` parameter
response = httpx.get(
    f"{stac_endpoint}/search",
    params={
        "filter": "provider='CEN NA - MD' AND collection='openaerialmap'",
        "filter-lang": "cql2-text",
        "limit": 100,
    },
)
print(response.json()["context"])

feature_collection = response.json()

{'limit': 100, 'matched': 6, 'returned': 6}
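
The `context` object printed above is what this endpoint returned when the notebook was run. As an assumption that does not come from this notebook, other STAC API deployments may omit `context` and report the count in a top-level `numberMatched` field instead. A defensive sketch (`matched_count` is a hypothetical helper) that accepts either shape:

```python
from typing import Dict, Optional


def matched_count(search_response: Dict) -> Optional[int]:
    """Return the number of matched Items, or None if the response does not report it."""
    context = search_response.get("context") or {}
    if "matched" in context:
        return context["matched"]
    # Assumption: deployments without `context` may expose this top-level field.
    return search_response.get("numberMatched")


# Same values as the `context` printed above.
print(matched_count({"context": {"limit": 100, "matched": 6, "returned": 6}}))
```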


In [64]:
# Put the result of the search request (FeatureCollection) on a Map

from folium import Map, TileLayer, GeoJson

geojson = GeoJson(
    data=feature_collection,
    style_function=lambda x: {
        'opacity': 1, 'dashArray': '1', 'fillOpacity': 0, 'weight': 5
    },
)
bounds = geojson._get_self_bounds()

lat = (bounds[0][0] + bounds[1][0]) / 2.0
lon = (bounds[0][1] + bounds[1][1]) / 2.0
m = Map(tiles="OpenStreetMap", location=(lat, lon), zoom_start=11)

geojson.add_to(m)
m
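
`_get_self_bounds()` is a private folium method, so the same bounds can also be derived from the `bbox` of each returned Item. This is a sketch under two assumptions: every feature carries a 2D `bbox` (`[west, south, east, north]`), and no footprint crosses the antimeridian. `union_bbox` is a hypothetical helper, and the second feature below is made up.

```python
from typing import Dict, List


def union_bbox(features: List[Dict]) -> List[float]:
    """Return [west, south, east, north] covering the 2D `bbox` of every feature."""
    boxes = [feature["bbox"] for feature in features]
    return [
        min(box[0] for box in boxes),
        min(box[1] for box in boxes),
        max(box[2] for box in boxes),
        max(box[3] for box in boxes),
    ]


features = [
    {"bbox": [1.23557, 44.917514, 1.239309, 44.919786]},
    {"bbox": [0.993319, 45.2, 1.0, 45.216803]},  # made-up second footprint
]
west, south, east, north = union_bbox(features)
print([west, south, east, north])
# Same extent in the [[lat_min, lon_min], [lat_max, lon_max]] order used by folium
print([[south, west], [north, east]])
```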


In [66]:
# Let's visualize one Item using the raster API
import httpx
import json

raster_endpoint = "https://raster.eoapi.dev"

feature = feature_collection["features"][0]
print(feature)

item_id = feature["id"]
collection_id = "openaerialmap"

# Check what assets are available
resp = httpx.get(f"{raster_endpoint}/collections/{collection_id}/items/{item_id}/assets").json()
print("Available Assets: ", resp)

# Fetch `image` asset info
print(f"Fetching Raster info for Item {item_id}")
info = httpx.get(f"{raster_endpoint}/collections/{collection_id}/items/{item_id}/info", params={"assets": "image"}).json()

print("Returned metadata for Assets:", list(info.keys()))
print()
print(json.dumps(info["image"], indent=4))
print()

print("Min/Max zoom for Asset `Image` are", info["image"]["minzoom"], info["image"]["maxzoom"])

{'id': '66e00568cd0baa0001b6206a', 'bbox': [1.23557, 44.917514, 1.239309, 44.919786], 'type': 'Feature', 'links': [{'rel': 'collection', 'type': 'application/json', 'href': 'https://stac.eoapi.dev/collections/openaerialmap'}, {'rel': 'parent', 'type': 'application/json', 'href': 'https://stac.eoapi.dev/collections/openaerialmap'}, {'rel': 'root', 'type': 'application/json', 'href': 'https://stac.eoapi.dev/'}, {'rel': 'self', 'type': 'application/geo+json', 'href': 'https://stac.eoapi.dev/collections/openaerialmap/items/66e00568cd0baa0001b6206a'}], 'assets': {'image': {'href': 'https://oin-hotosm.s3.us-east-1.amazonaws.com/66e0032ecd0baa0001b62068/0/66e0032ecd0baa0001b62069.tif', 'type': 'image/tiff; application=geotiff; profile=cloud-optimized', 'roles': ['data'], 'alternate': {'s3': {'href': 's3://oin-hotosm/66e0032ecd0baa0001b62068/0/66e0032ecd0baa0001b62069.tif'}}, 'alternate:name': 'S3'}}, 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[1.23557, 44.919728], [1.23564, 44.917

In [67]:

resp = httpx.get(
    f"{raster_endpoint}/collections/{collection_id}/items/{item_id}/WebMercatorQuad/tilejson.json", params={"assets": "image", "minzoom": 16, "maxzoom": 22}
).json()
print(resp)

lat = (info["image"]["bounds"][3] + info["image"]["bounds"][1]) / 2.0
lon = (info["image"]["bounds"][0] + info["image"]["bounds"][2]) / 2.0
m = Map(tiles="OpenStreetMap", location=(lat, lon), zoom_start=16)

aod_layer = TileLayer(
    tiles=resp["tiles"][0],
    attr=f"Item {item_id}",
    min_zoom=16,
    max_zoom=22,
    max_native_zoom=17,
)
aod_layer.add_to(m)
m

{'tilejson': '2.2.0', 'version': '1.0.0', 'scheme': 'xyz', 'tiles': ['https://raster.eoapi.dev/collections/openaerialmap/items/66e00568cd0baa0001b6206a/tiles/WebMercatorQuad/{z}/{x}/{y}@1x?assets=image'], 'minzoom': 16, 'maxzoom': 22, 'bounds': [1.23557, 44.917514, 1.239309, 44.919786], 'center': [1.2374395, 44.91865, 16]}
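
The `tiles` entry of the TileJSON is a URL template in which `{z}`, `{x}` and `{y}` are replaced by a tile index. The sketch below shows how such an index can be computed for a point, assuming the usual Web Mercator XYZ tiling (the response above reports `'scheme': 'xyz'`). `lonlat_to_tile` is a hypothetical helper; the template and center are copied from the output above.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) index of the Web Mercator XYZ tile containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Template and center copied from the TileJSON output above.
template = "https://raster.eoapi.dev/collections/openaerialmap/items/66e00568cd0baa0001b6206a/tiles/WebMercatorQuad/{z}/{x}/{y}@1x?assets=image"
zoom = 16
x, y = lonlat_to_tile(1.2374395, 44.91865, zoom)
print(template.format(z=zoom, x=x, y=y))
```

The printed URL points at the single tile covering the center of the Item at zoom 16, which is what a map client requests when it fills in the template.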


#### Create Virtual Mosaic with all the Items

In [96]:
response = httpx.post(
    f"{raster_endpoint}/searches/register",
    data=json.dumps(
        {
            "filter": {
                "op": "and",
                "args": [
                    {"op": "=", "args": [{"property": "collection"}, "openaerialmap"]},
                    {"op": "=", "args": [{"property": "provider"}, "CEN NA - MD"]},
                ],
            },
            "filter-lang": "cql2-json",
            "metadata": {
                "name": "CEN NA - MD",
                # NOTE: We set minzoom to 13 because we know there are only a few images,
                # so it is fine to visualize them at a low zoom level
                # (this won't be the case for collections with a higher number of images)
                "minzoom": 13,
                "maxzoom": 22,
                # NOTE: we reuse the bounds computed from the results of the /search request
                "bounds": [bounds[0][1], bounds[0][0], bounds[1][1], bounds[1][0]],
                "assets": [
                    "image",
                ],
                "defaults": {
                    "image": {
                        "assets": ["image"],
                    },
                }
            }
        }
    ),
).json()

print(json.dumps(response, indent=4))

{
    "id": "2fb7cb756f682dbe56754d6f9d7cd50f",
    "links": [
        {
            "rel": "metadata",
            "title": "Mosaic metadata",
            "type": "application/json",
            "href": "https://raster.eoapi.dev/searches/2fb7cb756f682dbe56754d6f9d7cd50f/info"
        },
        {
            "rel": "tilejson",
            "title": "Link for TileJSON",
            "type": "application/json",
            "href": "https://raster.eoapi.dev/searches/2fb7cb756f682dbe56754d6f9d7cd50f/tilejson.json"
        },
        {
            "rel": "map",
            "title": "Link for Map viewer",
            "type": "application/json",
            "href": "https://raster.eoapi.dev/searches/2fb7cb756f682dbe56754d6f9d7cd50f/map"
        },
        {
            "rel": "wmts",
            "title": "Link for WMTS",
            "type": "application/json",
            "href": "https://raster.eoapi.dev/searches/2fb7cb756f682dbe56754d6f9d7cd50f/WMTSCapabilities.xml"
        },
        {
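
The CQL2-JSON filter in the registration request above is a nested structure of `op`/`args` objects. When several mosaics need to be registered, that structure can be generated from a plain mapping. A minimal sketch limited to equality tests joined by `and`; `equals_all` is a hypothetical helper, not part of any library.

```python
from typing import Any, Dict


def equals_all(conditions: Dict[str, Any]) -> Dict:
    """Build a CQL2-JSON filter requiring every property to equal its value."""
    if not conditions:
        raise ValueError("at least one condition is required")
    args = [
        {"op": "=", "args": [{"property": name}, value]}
        for name, value in conditions.items()
    ]
    # A single condition does not need to be wrapped in an `and`.
    return args[0] if len(args) == 1 else {"op": "and", "args": args}


cql2_filter = equals_all({"collection": "openaerialmap", "provider": "CEN NA - MD"})
print(cql2_filter)
```

For the two properties used here, the result is the same dictionary as the `filter` value passed to `/searches/register` above.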
    

In [98]:
search_id = response["id"]

resp = httpx.get(
    f"{raster_endpoint}/searches/{search_id}/WebMercatorQuad/tilejson.json", params={"assets": "image"}
).json()
print(resp)

lat = (resp["bounds"][3] + resp["bounds"][1]) / 2.0
lon = (resp["bounds"][0] + resp["bounds"][2]) / 2.0
mosaic = Map(tiles="OpenStreetMap", location=(lat, lon), zoom_start=13)

aod_layer = TileLayer(
    tiles=resp["tiles"][0],
    attr=f"Mosaic {search_id}",
    min_zoom=resp["minzoom"],
    max_zoom=resp["maxzoom"],
    max_native_zoom=17,
)
aod_layer.add_to(mosaic)
geojson.add_to(mosaic)
mosaic

{'tilejson': '2.2.0', 'name': 'CEN NA - MD', 'version': '1.0.0', 'scheme': 'xyz', 'tiles': ['https://raster.eoapi.dev/searches/2fb7cb756f682dbe56754d6f9d7cd50f/tiles/WebMercatorQuad/{z}/{x}/{y}?assets=image'], 'minzoom': 13, 'maxzoom': 22, 'bounds': [0.993319, 44.917514, 1.239309, 45.216803], 'center': [1.116314, 45.0671585, 13]}
