<a href="https://colab.research.google.com/github/pacificspatial/flateau/blob/main/stac_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Learning about STAC
https://stacspec.org/en/tutorials/1-read-stac-python/

https://pystac.readthedocs.io/en/stable/



In [2]:
# Install the required libraries
! pip install pystac
! pip install rasterio
! pip install shapely

Collecting rasterio
  Downloading rasterio-1.3.10-cp310-cp310-manylinux2014_x86_64.whl (21.5 MB)
Collecting affine (from rasterio)
  Downloading affine-2.4.0-py3-none-any.whl (15 kB)
Collecting snuggs>=1.4.1 (from rasterio)
  Downloading snuggs-1.4.7-py3-none-any.whl (5.4 kB)
Installing collected packages: snuggs, affine, rasterio
Successfully installed affine-2.4.0 rasterio-1.3.10 snuggs-1.4.7


In [11]:
import json
import shutil
import tempfile
from pathlib import Path

from pystac import Catalog, get_stac_version
from pystac.extensions.eo import EOExtension
from pystac.extensions.label import LabelExtension

import os
import rasterio
import urllib.request
import pystac

from datetime import datetime, timezone
from shapely.geometry import Polygon, mapping
from tempfile import TemporaryDirectory

In [None]:
# Read the example catalog
root_catalog = Catalog.from_file('https://raw.githubusercontent.com/stac-utils/pystac/main/docs/example-catalog/catalog.json')

<h1>Getting high-level catalog information</h1>

In [None]:
root_catalog.describe()

* <Catalog id=landsat-stac-collection-catalog>
    * <Collection id=landsat-8-l1>
      * <Item id=LC80140332018166LGN00>
      * <Item id=LC80150322018141LGN00>
      * <Item id=LC80150332018189LGN00>
      * <Item id=LC80300332018166LGN00>


This catalog contains one Collection, with four Items beneath it.

In [None]:
# Print some basic metadata from the Catalog
print(f"ID: {root_catalog.id}")
print(f"Title: {root_catalog.title or 'N/A'}")
print(f"Description: {root_catalog.description or 'N/A'}")

ID: landsat-stac-collection-catalog
Title: STAC for Landsat data
Description: STAC for Landsat data


In [None]:
# check STAC version
print(get_stac_version())

1.0.0


<h1>Exploring the Collection</h1>

In [None]:
collections = list(root_catalog.get_collections())

print(f"Number of collections: {len(collections)}")
print("Collections IDs:")
for collection in collections:
    print(f"- {collection.id}")

Number of collections: 1
Collections IDs:
- landsat-8-l1


There is one Collection, and its ID is `landsat-8-l1`.

Next, use the Collection ID to retrieve the Collection instance.

In [None]:
collection = root_catalog.get_child("landsat-8-l1")
if collection is None:
    print("Collection is Empty. Check your downloads and try again.")
else:
    print("Collection has a root child. You may proceed to the following steps.")

Collection has a root child. You may proceed to the following steps.


<h1>Learning about STAC Items</h1>
A STAC Item is the basic building block of a catalog. Each Item is the entry for one piece of data; Collections organize Items, and a Catalog groups Collections, making the Catalog the top-level element.
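
As a rough illustration of this hierarchy, the sketch below models the example catalog as nested plain dictionaries (not pystac objects) and walks it in the same order that `describe()` prints it. The `children` and `items` keys and the `walk` and `count_items` helpers are hypothetical names chosen for this sketch; they are not the STAC JSON layout or part of pystac.

```python
# Plain-dict stand-in for the Catalog > Collection > Item hierarchy.
# The "children"/"items" keys are illustrative only, not the STAC JSON layout.
tree = {
    "type": "Catalog",
    "id": "landsat-stac-collection-catalog",
    "children": [
        {
            "type": "Collection",
            "id": "landsat-8-l1",
            "children": [],
            "items": [
                "LC80140332018166LGN00",
                "LC80150322018141LGN00",
                "LC80150332018189LGN00",
                "LC80300332018166LGN00",
            ],
        }
    ],
    "items": [],
}


def walk(node, depth=0):
    """Return indented lines for a node, its child containers, then its Items."""
    lines = [f"{'  ' * depth}* <{node['type']} id={node['id']}>"]
    for child in node["children"]:
        lines.extend(walk(child, depth + 1))
    for item_id in node["items"]:
        lines.append(f"{'  ' * (depth + 1)}* <Item id={item_id}>")
    return lines


def count_items(node):
    """Count Items at every level below a node."""
    return len(node["items"]) + sum(count_items(c) for c in node["children"])


lines = walk(tree)
print("\n".join(lines))
print("items:", count_items(tree))
```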

In [None]:
items = list(root_catalog.get_all_items())

print(f"Number of items: {len(items)}")
for item in items:
    print(f"- {item.id}")

Number of items: 4
- LC80140332018166LGN00
- LC80150322018141LGN00
- LC80150332018189LGN00
- LC80300332018166LGN00


- Item metadata is made up of the following parts:
 - Core Item Metadata
 - Common Metadata
 - STAC Extensions
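
To make the three groups concrete, here is a minimal sketch that sorts an Item's `properties` into them. It assumes that extension fields can be recognized by their `prefix:` naming and that the Common Metadata fields of interest are the handful listed in `COMMON_FIELDS`; both the set and the `group_properties` helper are hypothetical. The property values are the ones this notebook reads from the example Item, with `datetime` written as an ISO string for illustration.

```python
# Values taken from the example Item in this notebook; the grouping rules
# below are a simplification for illustration, not part of pystac.
properties = {
    "datetime": "2018-06-15T15:39:09Z",
    "platform": "landsat-8",
    "instruments": ["OLI_TIRS"],
    "gsd": 30,
    "eo:cloud_cover": 22,
}

# A partial, hand-picked list of Common Metadata field names (assumption).
COMMON_FIELDS = {"platform", "instruments", "constellation", "mission", "gsd", "license"}


def group_properties(props):
    """Split properties into core, common, extension, and other fields."""
    groups = {"core": {}, "common": {}, "extension": {}, "other": {}}
    for key, value in props.items():
        if key == "datetime":
            groups["core"][key] = value       # required by the Item spec
        elif ":" in key:
            groups["extension"][key] = value  # e.g. "eo:cloud_cover"
        elif key in COMMON_FIELDS:
            groups["common"][key] = value
        else:
            groups["other"][key] = value
    return groups


groups = group_properties(properties)
for name, fields in groups.items():
    print(name, fields)
```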

Use the Item's id to retrieve the Item instance.

In [None]:
item = root_catalog.get_item("LC80140332018166LGN00", recursive=True)

Below, we read out several of its attributes.

In [None]:
# Geometry (Core Item)
item.geometry

{'type': 'Polygon',
 'coordinates': [[[-76.12180471942207, 39.95810181489563],
   [-73.94910518227414, 39.55117185146004],
   [-74.49564725552679, 37.826064511480496],
   [-76.66550404911956, 38.240699151776084],
   [-76.12180471942207, 39.95810181489563]]]}

In [None]:
# BBOX (Core Item)
item.bbox

[-76.66703, 37.82561, -73.94861, 39.95958]
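
The bbox is ordered `[west, south, east, north]` and should enclose the geometry. The sketch below derives a bbox from the polygon coordinates printed for this Item and checks that the Item's stated bbox contains it. `ring_bbox` and `bbox_contains` are hypothetical helpers written for this illustration, and the code works on plain lists rather than on the pystac `Item`.

```python
# Geometry and bbox copied from the outputs of the example Item.
geometry = {
    "type": "Polygon",
    "coordinates": [[
        [-76.12180471942207, 39.95810181489563],
        [-73.94910518227414, 39.55117185146004],
        [-74.49564725552679, 37.826064511480496],
        [-76.66550404911956, 38.240699151776084],
        [-76.12180471942207, 39.95810181489563],
    ]],
}
stated_bbox = [-76.66703, 37.82561, -73.94861, 39.95958]


def ring_bbox(geom):
    """Return [west, south, east, north] for the exterior ring of a Polygon."""
    ring = geom["coordinates"][0]
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return [min(xs), min(ys), max(xs), max(ys)]


def bbox_contains(outer, inner):
    """True if the outer bbox fully encloses the inner bbox."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


derived = ring_bbox(geometry)
print("derived bbox:", derived)
print("stated bbox encloses geometry:", bbox_contains(stated_bbox, derived))
```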

In [None]:
# date (Core Item)
item.datetime

datetime.datetime(2018, 6, 15, 15, 39, 9, tzinfo=tzutc())

In [None]:
# Collection ID (Core Item)
item.collection_id

'landsat-8-l1'

In [None]:
# To look up the Collection an Item belongs to (Core Item)
item.get_collection()

<h2>About Common Metadata</h2>
Common Metadata holds information such as the data license and the sensor.

In [None]:
item.common_metadata.instruments

['OLI_TIRS']

In [None]:
item.common_metadata.platform

'landsat-8'

In [None]:
item.common_metadata.gsd

30

<h2>About STAC Extensions</h2>
Extensions cover metadata that Core and Common Metadata cannot.

In [None]:
item.stac_extensions

['https://stac-extensions.github.io/eo/v1.1.0/schema.json',
 'https://stac-extensions.github.io/view/v1.0.0/schema.json',
 'https://stac-extensions.github.io/projection/v1.1.0/schema.json']
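
Each entry in `stac_extensions` is a schema URL that encodes the extension name and version. As a small sketch, assuming the `<name>/<version>/schema.json` path layout seen in the three URLs listed for this Item, the hypothetical `parse_extension_url` helper below pulls both parts out with the standard library.

```python
from urllib.parse import urlparse

# Schema URLs as listed by item.stac_extensions in this notebook.
extension_urls = [
    "https://stac-extensions.github.io/eo/v1.1.0/schema.json",
    "https://stac-extensions.github.io/view/v1.0.0/schema.json",
    "https://stac-extensions.github.io/projection/v1.1.0/schema.json",
]


def parse_extension_url(url):
    """Split a schema URL into (name, version).

    Assumes the path looks like /<name>/<version>/schema.json.
    """
    parts = urlparse(url).path.strip("/").split("/")
    return parts[0], parts[1]


parsed = [parse_extension_url(u) for u in extension_urls]
names = {name for name, _ in parsed}
print(parsed)
print("eo listed:", "eo" in names)
print("label listed:", "label" in names)
```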

In [None]:
EOExtension.has_extension(item)

True

In [None]:
LabelExtension.has_extension(item)

False

In [None]:
# Cloud coverage
eo_item_ext = EOExtension.ext(item)
eo_item_ext.cloud_cover

22

In [None]:
# Cloud cover, retrieved another way
item.properties['eo:cloud_cover']

22

<h1>Accessing the assets of a STAC Item</h1>
The asset hrefs listed here can be used to access the data directly.

In [None]:
for asset_key in item.assets:
    asset = item.assets[asset_key]
    print('{}: {} ({})'.format(asset_key, asset.href, asset.media_type))

index: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/index.html (text/html)
thumbnail: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_thumb_large.jpg (image/jpeg)
B1: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B1.TIF (image/tiff)
B2: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B2.TIF (image/tiff)
B3: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B3.TIF (image/tiff)
B4: https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B4.TIF (image/tiff)
B5: https://s3-us

In [None]:
asset = item.assets['B3']
asset.to_dict()

{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B3.TIF',
 'type': 'image/tiff',
 'title': 'Band 3 (green)',
 'eo:bands': [{'name': 'B3',
   'full_width_half_max': 0.06,
   'center_wavelength': 0.56,
   'common_name': 'green'}],
 'roles': []}

In [None]:
eo_asset_ext = EOExtension.ext(asset)
bands = eo_asset_ext.bands
bands

[<Band name=B3>]

In [None]:
bands[0].to_dict()

{'name': 'B3',
 'full_width_half_max': 0.06,
 'center_wavelength': 0.56,
 'common_name': 'green'}
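
Band metadata like this makes it possible to pick assets by what they contain rather than by key. The sketch below works on plain asset dictionaries shaped like the `to_dict()` output for `B3`, plus the thumbnail entry from the asset listing, and returns the keys whose `eo:bands` include a given `common_name`. `assets_with_common_name` is a hypothetical helper, not a pystac function.

```python
# Two assets in dictionary form, copied from the outputs in this notebook.
assets = {
    "thumbnail": {
        "href": "https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_thumb_large.jpg",
        "type": "image/jpeg",
    },
    "B3": {
        "href": "https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/014/033/LC08_L1TP_014033_20180615_20180703_01_T1/LC08_L1TP_014033_20180615_20180703_01_T1_B3.TIF",
        "type": "image/tiff",
        "title": "Band 3 (green)",
        "eo:bands": [{"name": "B3",
                      "full_width_half_max": 0.06,
                      "center_wavelength": 0.56,
                      "common_name": "green"}],
        "roles": [],
    },
}


def assets_with_common_name(asset_dicts, common_name):
    """Return the keys of assets that contain a band with this common_name."""
    matches = []
    for key, asset in asset_dicts.items():
        # Assets without band metadata (e.g. the thumbnail) are skipped.
        for band in asset.get("eo:bands", []):
            if band.get("common_name") == common_name:
                matches.append(key)
                break
    return matches


green_keys = assets_with_common_name(assets, "green")
print(green_keys)
```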

<h1>Creating a STAC Catalog</h1>



In [None]:
# Download the sample data (saved as /content/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif)
!wget https://spacenet-dataset.s3.amazonaws.com/spacenet/SN5_roads/train/AOI_7_Moscow/MS/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif

--2024-05-13 03:33:03--  https://spacenet-dataset.s3.amazonaws.com/spacenet/SN5_roads/train/AOI_7_Moscow/MS/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif
Resolving spacenet-dataset.s3.amazonaws.com (spacenet-dataset.s3.amazonaws.com)... 3.5.28.218, 3.5.25.132, 52.217.132.121, ...
Connecting to spacenet-dataset.s3.amazonaws.com (spacenet-dataset.s3.amazonaws.com)|3.5.28.218|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1693036 (1.6M) [image/tiff]
Saving to: ‘SN5_roads_train_AOI_7_Moscow_MS_chip996.tif’


2024-05-13 03:33:04 (9.09 MB/s) - ‘SN5_roads_train_AOI_7_Moscow_MS_chip996.tif’ saved [1693036/1693036]



<h2>Create a STAC Catalog</h2>
`id` and `description` are required. Those two are enough to get started.

In [4]:
catalog = Catalog(id='tutorial-catalog', description='デモ用カタログ with SpaceNet 5.')

Nothing has been added yet, so the catalog is empty.

In [5]:
print(list(catalog.get_children()))
print(list(catalog.get_items()))

[]
[]


Running the following command shows the JSON as it is being built up.

In [6]:
print(json.dumps(catalog.to_dict(), indent=4))

{
    "type": "Catalog",
    "id": "tutorial-catalog",
    "stac_version": "1.0.0",
    "description": "\u30c7\u30e2\u7528\u30ab\u30bf\u30ed\u30b0 with SpaceNet 5.",
    "links": []
}
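
Two things are worth noting about this output. First, `json.dumps` escapes non-ASCII characters by default, which is why the Japanese description appears as `\u....` sequences; passing `ensure_ascii=False` keeps it readable. Second, the dictionary already carries the fields that the STAC 1.0.0 Catalog spec marks as required. The sketch below checks both on a plain dict copied from the output; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names for this illustration.

```python
import json

# The catalog dictionary as printed in the cell output.
catalog_dict = {
    "type": "Catalog",
    "id": "tutorial-catalog",
    "stac_version": "1.0.0",
    "description": "デモ用カタログ with SpaceNet 5.",
    "links": [],
}

# Required Catalog fields per the STAC 1.0.0 Catalog spec.
REQUIRED_FIELDS = ("type", "stac_version", "id", "description", "links")


def missing_fields(d):
    """Return the required Catalog fields absent from a dict."""
    return [name for name in REQUIRED_FIELDS if name not in d]


escaped = json.dumps(catalog_dict, indent=4)                       # default: ASCII only
readable = json.dumps(catalog_dict, indent=4, ensure_ascii=False)  # keeps Japanese text

print(readable)
print("missing:", missing_fields(catalog_dict))
```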


In [None]:
# Define a function that builds the geometry and bbox needed to add the sample raster as an Item
def get_bbox_and_footprint(raster):
    with rasterio.open(raster) as r:
        bounds = r.bounds
        bbox = [bounds.left, bounds.bottom, bounds.right, bounds.top]
        footprint = Polygon([
            [bounds.left, bounds.bottom],
            [bounds.left, bounds.top],
            [bounds.right, bounds.top],
            [bounds.right, bounds.bottom]
        ])

        return (bbox, mapping(footprint))

In [None]:
# Use the function to get the geometry and BBOX
bbox, footprint = get_bbox_and_footprint('/content/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif')
print("bbox: ", bbox, "\n")
print("footprint: ", footprint)

bbox:  [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011] 

footprint:  {'type': 'Polygon', 'coordinates': (((37.6616853489879, 55.73478197572927), (37.6616853489879, 55.73882710285011), (37.66573047610874, 55.73882710285011), (37.66573047610874, 55.73478197572927), (37.6616853489879, 55.73478197572927)),)}
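
One detail about ring order: the footprint above lists its corners clockwise, whereas RFC 7946 (GeoJSON) specifies counterclockwise exterior rings, while also asking parsers not to reject the other order. As a hedged alternative, the sketch below builds the same rectangle counterclockwise without shapely and verifies the orientation with the shoelace formula. `bounds_to_footprint` and `signed_area` are hypothetical helpers, and the code assumes the bounds are already longitude/latitude degrees, as they are for this sample file.

```python
def bounds_to_footprint(left, bottom, right, top):
    """Build a closed, counterclockwise GeoJSON Polygon from raster bounds."""
    ring = [
        [left, bottom],
        [right, bottom],
        [right, top],
        [left, top],
        [left, bottom],  # repeat the first point to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def signed_area(ring):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


# Bounds taken from the bbox printed above.
left, bottom, right, top = (37.6616853489879, 55.73478197572927,
                            37.66573047610874, 55.73882710285011)

footprint_ccw = bounds_to_footprint(left, bottom, right, top)

# Ring in the order produced by the notebook's get_bbox_and_footprint().
notebook_ring = [(left, bottom), (left, top), (right, top),
                 (right, bottom), (left, bottom)]

print("sketch ring is counterclockwise:", signed_area(footprint_ccw["coordinates"][0]) > 0)
print("notebook ring is counterclockwise:", signed_area(notebook_ring) > 0)
```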


In [None]:
# Create a timestamp
datetime_utc = datetime.now(tz=timezone.utc)
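
Note that this timestamp is simply the moment the cell runs, not the acquisition time of the image. When the Item is serialized, the datetime shows up as a UTC string ending in `Z`. The sketch below reproduces that formatting with the standard library only; `to_utc_string` is a hypothetical helper, not the function pystac uses internally.

```python
from datetime import datetime, timezone


def to_utc_string(dt):
    """Format a timezone-aware datetime as an ISO 8601 UTC string ending in Z."""
    # astimezone() converts any aware datetime to UTC before formatting.
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


# A fixed example value, matching the timestamp that appears in this notebook.
example = datetime(2024, 5, 13, 3, 33, 14, 468410, tzinfo=timezone.utc)
formatted = to_utc_string(example)
print(formatted)
```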

<h3>Creating an Item</h3>


*   id
*   geometry
*   bbox
*   datetime
*   properties



In [None]:
item = pystac.Item(id='local-image',
                 geometry=footprint,
                 bbox=bbox,
                 datetime=datetime_utc,
                 properties={})

In [None]:
print(item.get_parent() is None)

True


In [None]:
# Add the Item to the catalog
catalog.add_item(item)

In [None]:
# Confirm the parent catalog
item.get_parent()

In [None]:
# The catalog structure can also be checked this way
catalog.describe()

* <Catalog id=tutorial-catalog>
  * <Item id=local-image>


<h2>Adding a STAC Asset (the information about the data itself)</h2>
So far the Item has been created and added to the Catalog, but no actual asset has been attached to the Item yet. We add one here.

https://pystac.readthedocs.io/en/stable/api/asset.html#pystac-asset

We add an asset to the Item created earlier, specifying a `key` and an `asset`.

The asset specifies `href` and `media_type`.
GeoTIFF is given as the `media_type` here, but other types are available.

https://pystac.readthedocs.io/en/stable/api/media_type.html

For vector data, the available types include parquet, kml, geojson, geopackage, and flatgeobuf.
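
When many files are cataloged at once, the media type can be chosen from the file extension. The sketch below is one way to do that with the standard library. The suffix-to-type table is an assumption written for this illustration: the strings are intended to match the values behind `pystac.MediaType` (the GeoTIFF one appears in this notebook's own output), but they should be checked against the media type page linked above before relying on them. `guess_media_type` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Assumed media type strings per file suffix; verify against the pystac docs.
MEDIA_TYPES = {
    ".tif": "image/tiff; application=geotiff",
    ".tiff": "image/tiff; application=geotiff",
    ".geojson": "application/geo+json",
    ".gpkg": "application/geopackage+sqlite3",
    ".parquet": "application/x-parquet",
    ".kml": "application/vnd.google-earth.kml+xml",
    ".fgb": "application/vnd.flatgeobuf",
}


def guess_media_type(href):
    """Return a media type string for an href, or None if the suffix is unknown."""
    suffix = PurePosixPath(href).suffix.lower()
    return MEDIA_TYPES.get(suffix)


raster_type = guess_media_type("/content/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif")
vector_type = guess_media_type("01100_sapporo-shi_2020_building_centroid_lod0.parquet")
print(raster_type)
print(vector_type)
```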

In [None]:
# Add Asset and all its information to Item
item.add_asset(
    key='image',
    asset=pystac.Asset(
        href='/content/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif',
        media_type=pystac.MediaType.GEOTIFF
    )
)

In [None]:
print(json.dumps(item.to_dict(), indent=4))

{
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "local-image",
    "properties": {
        "datetime": "2024-05-13T03:33:14.468410Z"
    },
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [
                    37.6616853489879,
                    55.73478197572927
                ],
                [
                    37.6616853489879,
                    55.73882710285011
                ],
                [
                    37.66573047610874,
                    55.73882710285011
                ],
                [
                    37.66573047610874,
                    55.73478197572927
                ],
                [
                    37.6616853489879,
                    55.73478197572927
                ]
            ]
        ]
    },
    "links": [
        {
            "rel": "root",
            "href": null,
            "type": "application/json"
        },
        {
            "rel": "paren

<h1>Saving the catalog</h1>

At this point, the catalog itself has no href set.

In [None]:
print(catalog.get_self_href() is None)
print(item.get_self_href() is None)

True
True


Set the catalog's HREF.

In [None]:
catalog.normalize_hrefs(os.path.join('/content/sample_data', "stac"))

In [None]:
print("Catalog HREF: ", catalog.get_self_href())
print("Item HREF: ", item.get_self_href())

Catalog HREF:  /content/sample_data/stac/catalog.json
Item HREF:  /content/sample_data/stac/local-image/local-image.json
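
The two HREFs follow a simple pattern: the catalog goes to `<root>/catalog.json` and an Item directly under it goes to `<root>/<item-id>/<item-id>.json`. The sketch below reproduces only that observed pattern with `posixpath`; it is not pystac's layout logic, which also handles Collections and nested catalogs. `catalog_href` and `item_href` are hypothetical helpers.

```python
import posixpath


def catalog_href(root):
    """HREF of the root catalog file under the given directory."""
    return posixpath.join(root, "catalog.json")


def item_href(root, item_id):
    """HREF of an Item stored directly under the root catalog."""
    return posixpath.join(root, item_id, f"{item_id}.json")


root = posixpath.join("/content/sample_data", "stac")
print("Catalog HREF: ", catalog_href(root))
print("Item HREF: ", item_href(root, "local-image"))
```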


<h2>Save the catalog</h2>

In [None]:
catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)

In [None]:
!ls /content/sample_data/stac/*

/content/sample_data/stac/catalog.json

/content/sample_data/stac/local-image:
local-image.json


In [None]:
with open(catalog.self_href) as f:
    print(f.read())

{
  "type": "Catalog",
  "id": "tutorial-catalog",
  "stac_version": "1.0.0",
  "description": "\u30c7\u30e2\u7528\u30ab\u30bf\u30ed\u30b0 with SpaceNet 5.",
  "links": [
    {
      "rel": "root",
      "href": "./catalog.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./local-image/local-image.json",
      "type": "application/json"
    }
  ]
}
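
With `SELF_CONTAINED`, the links are written as relative paths, so the whole `stac` folder can be moved without breaking them. The sketch below shows how such a relative link resolves against the file that contains it, using only `posixpath`. `resolve_link` is a hypothetical helper for local paths; pystac performs this resolution itself when reading a catalog.

```python
import posixpath


def resolve_link(self_href, link_href):
    """Resolve a relative link against the file that declares it."""
    base_dir = posixpath.dirname(self_href)
    return posixpath.normpath(posixpath.join(base_dir, link_href))


catalog_path = "/content/sample_data/stac/catalog.json"

# The catalog's "item" link, resolved relative to catalog.json.
item_path = resolve_link(catalog_path, "./local-image/local-image.json")
print(item_path)

# The Item file links back to its root and parent with "../catalog.json".
back_to_catalog = resolve_link(item_path, "../catalog.json")
print(back_to_catalog)
```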


In [None]:
with open(item.self_href) as f:
    print(f.read())

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "id": "local-image",
  "properties": {
    "datetime": "2024-05-13T03:33:14.468410Z"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          37.6616853489879,
          55.73478197572927
        ],
        [
          37.6616853489879,
          55.73882710285011
        ],
        [
          37.66573047610874,
          55.73882710285011
        ],
        [
          37.66573047610874,
          55.73478197572927
        ],
        [
          37.6616853489879,
          55.73478197572927
        ]
      ]
    ]
  },
  "links": [
    {
      "rel": "root",
      "href": "../catalog.json",
      "type": "application/json"
    },
    {
      "rel": "parent",
      "href": "../catalog.json",
      "type": "application/json"
    }
  ],
  "assets": {
    "image": {
      "href": "/content/SN5_roads_train_AOI_7_Moscow_MS_chip996.tif",
      "type": "image/tiff; application=geotiff"
    }
  },
  "bbox"

<h1>Building a catalog for Flateau</h1>

In [7]:
import pandas as pd
import geopandas as gpd
import uuid

In [9]:
catalog = Catalog(id='flateau-catalog', description='flateau catalog')

A geometry and a BBOX are required for each Item.

In [10]:
stac_file = r'/content/stac_items.csv'
stac_df = pd.read_csv(stac_file)
stac_df.head()

Unnamed: 0,_filename,url,_xmin,_xmax,_ymin,_ymax
0,01100_sapporo-shi_2020_building_centroid_lod0....,https://data.source.coop/pacificspatial/flatea...,141.119971,141.505164,42.896122,43.184199
1,01205_muroran-shi_2022_building_centroid_lod0....,https://data.source.coop/pacificspatial/flatea...,140.908458,141.048746,42.302084,42.420657
2,01100_sapporo-shi_2020_building_centroid_lod0....,https://data.source.coop/pacificspatial/flatea...,141.119971,141.505164,42.896122,43.184199
3,01205_muroran-shi_2022_building_centroid_lod0....,https://data.source.coop/pacificspatial/flatea...,140.908458,141.048746,42.302084,42.420657
4,01639_sarabetsu-mura_2023_building_centroid_lo...,https://data.source.coop/pacificspatial/flatea...,143.10478,143.304112,42.571978,42.720527
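
The preview shows the same `_filename` on rows 0 and 2, and again on rows 1 and 3. Because the Item id is taken from `_filename`, repeated rows produce Items with the same id, and the saved catalog then links to the same Item file more than once. A minimal sketch for spotting and dropping such repeats before building Items is shown below, using only the standard library on a list of names. The four names are the ones that appear in full in the saved catalog; `duplicated` and `unique_in_order` are hypothetical helpers.

```python
from collections import Counter

# File names for the first four rows of the CSV preview.
filenames = [
    "01100_sapporo-shi_2020_building_centroid_lod0.parquet",
    "01205_muroran-shi_2022_building_centroid_lod0.parquet",
    "01100_sapporo-shi_2020_building_centroid_lod0.parquet",
    "01205_muroran-shi_2022_building_centroid_lod0.parquet",
]


def duplicated(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value, n in counts.items() if n > 1]


def unique_in_order(values):
    """Drop repeats while keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))


dupes = duplicated(filenames)
unique_names = unique_in_order(filenames)
print("duplicated ids:", dupes)
print("unique ids:", unique_names)
```

With pandas, the equivalent step on the DataFrame itself would be `stac_df.drop_duplicates(subset="_filename")`.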


In [12]:
for index, row in stac_df.iterrows():

  bbox = [row._xmin, row._ymin, row._xmax, row._ymax]
  footprint = Polygon([
      [row._xmin, row._ymin],
      [row._xmin, row._ymax],
      [row._xmax, row._ymax],
      [row._xmax, row._ymin]
  ])

  item = pystac.Item(id=row._filename,
    geometry=mapping(footprint),
    bbox=bbox,
    datetime=datetime.now(tz=timezone.utc),
    properties={})

  # add an item to the catalog
  catalog.add_item(item)

  # Add Asset and all its information to Item
  item.add_asset(
    key=str(uuid.uuid4()),
    asset=pystac.Asset(
      href=row.url,
      media_type=pystac.MediaType.PARQUET
    )
  )



  print('added one item')

added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one item
added one 

In [13]:
# Confirm the parent catalog
item.get_parent()

In [14]:
item

In [15]:
catalog.normalize_hrefs(os.path.join('/content/sample_data', "stac"))

In [16]:
print("Catalog HREF: ", catalog.get_self_href())
print("Item HREF: ", item.get_self_href())

Catalog HREF:  /content/sample_data/stac/catalog.json
Item HREF:  /content/sample_data/stac/47201_naha-shi_2020_building_centroid_lod0.parquet/47201_naha-shi_2020_building_centroid_lod0.parquet.json


In [17]:
catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)

In [18]:
with open(catalog.self_href) as f:
    print(f.read())

{
  "type": "Catalog",
  "id": "flateau-catalog",
  "stac_version": "1.0.0",
  "description": "flateau catalog",
  "links": [
    {
      "rel": "root",
      "href": "./catalog.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./01100_sapporo-shi_2020_building_centroid_lod0.parquet/01100_sapporo-shi_2020_building_centroid_lod0.parquet.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./01205_muroran-shi_2022_building_centroid_lod0.parquet/01205_muroran-shi_2022_building_centroid_lod0.parquet.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./01100_sapporo-shi_2020_building_centroid_lod0.parquet/01100_sapporo-shi_2020_building_centroid_lod0.parquet.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./01205_muroran-shi_2022_building_centroid_lod0.parquet/01205_muroran-shi_2022_building_centroid_lod0.parquet.json",
      "type": "appl

In [19]:
with open(item.self_href) as f:
    print(f.read())

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "id": "47201_naha-shi_2020_building_centroid_lod0.parquet",
  "properties": {
    "datetime": "2024-05-13T18:15:54.071833Z"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          127.63998422337508,
          26.17658338710737
        ],
        [
          127.63998422337508,
          26.24653602356476
        ],
        [
          127.73870144544492,
          26.24653602356476
        ],
        [
          127.73870144544492,
          26.17658338710737
        ],
        [
          127.63998422337508,
          26.17658338710737
        ]
      ]
    ]
  },
  "links": [
    {
      "rel": "root",
      "href": "../catalog.json",
      "type": "application/json"
    },
    {
      "rel": "parent",
      "href": "../catalog.json",
      "type": "application/json"
    }
  ],
  "assets": {
    "154ef181-8cee-4678-8120-81eb22eae705": {
      "href": "https://data.source.coop/pacificspatial/flat

In [20]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [21]:
%cd /content/sample_data

/content/sample_data


In [22]:
!zip -r stac.zip stac/

  adding: stac/ (stored 0%)
  adding: stac/22216_fukuroi-shi_2023_building_centroid_lod0.parquet/ (stored 0%)
  adding: stac/22216_fukuroi-shi_2023_building_centroid_lod0.parquet/22216_fukuroi-shi_2023_building_centroid_lod0.parquet.json (deflated 64%)
  adding: stac/22461_mori-machi_2023_building_centroid_lod0.parquet/ (stored 0%)
  adding: stac/22461_mori-machi_2023_building_centroid_lod0.parquet/22461_mori-machi_2023_building_centroid_lod0.parquet.json (deflated 64%)
  adding: stac/22342_nagaizumi-cho_2023_building_centroid_lod0.parquet/ (stored 0%)
  adding: stac/22342_nagaizumi-cho_2023_building_centroid_lod0.parquet/22342_nagaizumi-cho_2023_building_centroid_lod0.parquet.json (deflated 64%)
  adding: stac/34207_fukuyama-shi_2020_building_centroid_lod0.parquet/ (stored 0%)
  adding: stac/34207_fukuyama-shi_2020_building_centroid_lod0.parquet/34207_fukuyama-shi_2020_building_centroid_lod0.parquet.json (deflated 64%)
  adding: stac/22220_susono-shi_2023_building_centroid_lod0.parque

In [23]:
from google.colab import files
files.download('stac.zip')
