# Working with STAC API in AWS and S3

## Introduction

This notebook provides code snippets for the following data scientist use cases:
0. **Define SageMaker Notebook connections**
1. **Check connection to STAC API.**
2. **Create and upload STAC catalog using PySTAC.**
  - Importing images from external sources.
  - Importing images from external STAC Catalog APIs.
      - See this [tutorial for accessing NAIP data with the Planetary Computer STAC API](https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/naip/naip-example.ipynb)
      - See this [tutorial for accessing 3DEP Lidar COG data with the Planetary Computer STAC API](https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/3dep-lidar/3dep-lidar-cog-example.ipynb)
  - Creating STAC catalog with collection and items using PySTAC.
      - See this [tutorial on creating a STAC Catalog with a Collection using PySTAC](https://stacspec.org/en/tutorials/4-create-stac-collection/)
  - Saving the catalog to a local directory.
  - Upload STAC collection and items to STAC API.
3. **Query images from STAC API.**
  - Query based on Item ID.
  - Query based on Intersect.
4. **Working with STAC Collection and Items with S3 bucket.**
  - Upload STAC collection and items to S3 bucket.
  - Download STAC collection and items from S3 bucket.

> **Remark:** This notebook should be run in the Jupyter environment of a SageMaker Notebook instance in AWS. Upload all the necessary files in the `notebook` folder to the SageMaker Notebook instance before running it.

Sources:
- [PySTAC: How to create STAC Catalogs](https://github.com/stac-utils/pystac/blob/d70dea5c70a243450ac64120e0eba84d786574b4/docs/tutorials/how-to-create-stac-catalogs.ipynb)
- [Planet: Introduction to STAC Part 2](https://developers.planet.com/docs/planetschool/introduction-to-stac-part-2-creating-an-example-stac-catalog-of-planet-imagery-with-pystac/)
- [Medium: Organizing Geospatial data with Spatio Temporal Assets Catalogs - STAC using python](https://towardsdatascience.com/organizing-geospatial-data-with-spatio-temporal-assets-catalogs-stac-using-python-45f1a64ca082)
- [STAC: About STAC, The STAC Specification](https://stacspec.org/en/about/stac-spec/)
- [EODAG: EODAG as STAC client](https://eodag.readthedocs.io/en/stable/notebooks/tutos/tuto_stac_client.html)

## 0. Define SageMaker Notebook connections

The following code snippet defines the SageMaker Notebook connections to the STAC API and S3 bucket.

In [1]:
APP_HOST = "http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080"
S3_BUCKET = "data-bucket-5ndo"

## 1. Check connection to STAC API

Before you follow this tutorial, you need to make sure you have access to a local or a Cloud STAC API. This section checks the connection to your STAC API.

In [2]:
import requests

### You can use the following to programmatically check the STAC API.
app_host = APP_HOST # Update with your STAC API endpoint.
response = requests.get(app_host)
if response.status_code == 200:
    print('STAC API is reachable. Congratulations!')
else:
    print('STAC API is NOT reachable. Something went wrong!')

STAC API is reachable. Congratulations!
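
A status code of 200 only shows that something answered at that address. A slightly stricter check, sketched below, also confirms that the response looks like a STAC landing page. The helper names and the 10-second timeout are assumptions made for illustration, and the sample document is made up; the `links` and `conformsTo` fields are the ones a STAC API landing page is expected to carry.

```python
def summarize_landing_page(doc: dict) -> dict:
    """Pull out the fields that identify a STAC API landing page."""
    links = doc.get("links", [])
    return {
        "id": doc.get("id"),
        "stac_version": doc.get("stac_version"),
        "link_rels": sorted({link.get("rel") for link in links if link.get("rel")}),
        "conformance_classes": len(doc.get("conformsTo", [])),
    }


def check_stac_api(app_host: str, timeout: float = 10.0) -> dict:
    """Fetch the landing page; raises on network errors or non-2xx responses."""
    import requests  # imported here so the helper above works without it

    response = requests.get(app_host, timeout=timeout)
    response.raise_for_status()
    return summarize_landing_page(response.json())


# Hypothetical, shortened landing page to show what the summary looks like.
sample = {
    "type": "Catalog",
    "id": "example-stac-api",
    "stac_version": "1.0.0",
    "links": [
        {"rel": "self", "href": "/"},
        {"rel": "data", "href": "/collections"},
        {"rel": "search", "href": "/search"},
    ],
    "conformsTo": ["https://api.stacspec.org/v1.0.0/core"],
}
summary = summarize_landing_page(sample)
print(summary)
```

In this notebook, `check_stac_api(APP_HOST)` could take the place of the plain status-code check.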



## 2. Create and upload STAC catalog using PySTAC.
### Importing images from external sources

This section downloads some images from `https://spacenet-dataset.s3.amazonaws.com`, an open S3 bucket on AWS.

In [3]:
import os
import urllib.request

tmp_dir = "data/user-create-catalog"

# Create the folder if it does not exist yet
os.makedirs(tmp_dir, exist_ok=True)

img_path1 = os.path.join(tmp_dir, 'image1.tif')
img_path2 = os.path.join(tmp_dir, 'image2.tif')

In [4]:
# Fetch and store data
url1 = ('https://spacenet-dataset.s3.amazonaws.com/'
       'spacenet/SN5_roads/train/AOI_7_Moscow/MS/'
       'SN5_roads_train_AOI_7_Moscow_MS_chip996.tif')
urllib.request.urlretrieve(url1, img_path1)

url2 = ('https://spacenet-dataset.s3.amazonaws.com/'
       'spacenet/SN5_roads/train/AOI_7_Moscow/MS/'
       'SN5_roads_train_AOI_7_Moscow_MS_chip997.tif')
urllib.request.urlretrieve(url2, img_path2)

print("img_path1: " , img_path1, "\n", "img_path2: ", img_path2)

img_path1:  data/user-create-catalog/image1.tif 
 img_path2:  data/user-create-catalog/image2.tif


### Importing images from external STAC Catalog APIs

Several public STAC APIs can be queried over the internet. The tutorials below explain how; consult them if you are interested.

See this [tutorial for accessing NAIP data with the Planetary Computer STAC API](https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/naip/naip-example.ipynb)

See this [tutorial for accessing 3DEP Lidar COG data with the Planetary Computer STAC API](https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/3dep-lidar/3dep-lidar-cog-example.ipynb)
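
For orientation, the kind of request behind such tutorials can be sketched without a STAC client library. The example below only builds the body of a `POST /search` request. The Planetary Computer endpoint URL and the `naip` collection ID are assumed to be the ones used by the NAIP tutorial; the bounding box, the date range, and the helper name are illustrative.

```python
import json

# Assumed public endpoint of the Planetary Computer STAC API.
PC_STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def build_search_body(collection_id, bbox, start, end, limit=5):
    """Build an item-search body filtered by collection, bounding box and time."""
    west, south, east, north = bbox
    return {
        "collections": [collection_id],       # always a list of collection IDs
        "bbox": [west, south, east, north],   # longitude/latitude in WGS84
        "datetime": f"{start}/{end}",         # interval of RFC 3339 timestamps
        "limit": limit,
    }


body = build_search_body(
    "naip",
    bbox=[-122.45, 47.50, -122.20, 47.70],    # arbitrary box near Seattle
    start="2018-01-01T00:00:00Z",
    end="2020-01-01T00:00:00Z",
)
print(json.dumps(body, indent=2))

# To send it (needs network access and the `requests` package):
#   import requests
#   features = requests.post(f"{PC_STAC_URL}/search", json=body, timeout=30).json()["features"]
```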

### Creating STAC catalog with collection and items using PySTAC.

The new images imported from the first section are just that, images. We can use the PySTAC package to create STAC Items and STAC Collections that can be uploaded to our STAC API. The following section showcases this.

In [5]:
import pystac

In [6]:
print(pystac.Catalog.__doc__)

A PySTAC Catalog represents a STAC catalog in memory.

    A Catalog is a :class:`~pystac.STACObject` that may contain children,
    which are instances of :class:`~pystac.Catalog` or :class:`~pystac.Collection`,
    as well as :class:`~pystac.Item` s.

    Args:
        id : Identifier for the catalog. Must be unique within the STAC.
        description : Detailed multi-line description to fully explain the catalog.
            `CommonMark 0.29 syntax <https://commonmark.org/>`_ MAY be used for rich
            text representation.
        title : Optional short descriptive one-line title for the catalog.
        stac_extensions : Optional list of extensions the Catalog implements.
        href : Optional HREF for this catalog, which be set as the
            catalog's self link's HREF.
        catalog_type : Optional catalog type for this catalog. Must
            be one of the values in :class:`~pystac.CatalogType`.
        strategy : The layout strategy to use for setting the
     

Let's just give an ID and a description. We don't have to worry about the HREF right now; that will be set later.

In [7]:
catalog = pystac.Catalog(id='test-catalog', description='Tutorial catalog with Collection.')

There are no children or items in the catalog, since we haven't added anything yet.

In [8]:
print(list(catalog.get_children()))
print(list(catalog.get_items()))

[]
[]


We'll now create an Item to represent the image. Check the docstring to see what you need to supply:

In [9]:
print(pystac.Item.__doc__)

An Item is the core granular entity in a STAC, containing the core metadata
    that enables any client to search or crawl online catalogs of spatial 'assets' -
    satellite imagery, derived data, DEM's, etc.

    Args:
        id : Provider identifier. Must be unique within the STAC.
        geometry : Defines the full footprint of the asset represented by this
            item, formatted according to
            `RFC 7946, section 3.1 (GeoJSON) <https://tools.ietf.org/html/rfc7946>`_.
        bbox :  Bounding Box of the asset represented by this item
            using either 2D or 3D geometries. The length of the array must be 2*n
            where n is the number of dimensions. Could also be None in the case of a
            null geometry.
        datetime : datetime associated with this item. If None,
            a start_datetime and end_datetime must be supplied.
        properties : A dictionary of additional metadata for the item.
        start_datetime : Optional start datetim

Using [rasterio](https://rasterio.readthedocs.io/en/stable/), we can pull out the bounding box of the image to use for the image metadata. If the image contained a NoData border, we would ideally pull out the footprint and save it as the geometry; in this case, we're working with a small chip that most likely has no NoData values.

In [10]:
import rasterio
from shapely.geometry import Polygon, mapping

def get_bbox_and_footprint(raster_uri):
    with rasterio.open(raster_uri) as ds:
        bounds = ds.bounds
        bbox = [bounds.left, bounds.bottom, bounds.right, bounds.top]
        footprint = Polygon([
            [bounds.left, bounds.bottom],
            [bounds.left, bounds.top],
            [bounds.right, bounds.top],
            [bounds.right, bounds.bottom]
        ])
        
        return (bbox, mapping(footprint))

In [11]:
# Run the function and print out the results for image 1
bbox, footprint = get_bbox_and_footprint(img_path1)
print(bbox)
print(footprint)

[37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011]
{'type': 'Polygon', 'coordinates': (((37.6616853489879, 55.73478197572927), (37.6616853489879, 55.73882710285011), (37.66573047610874, 55.73882710285011), (37.66573047610874, 55.73478197572927), (37.6616853489879, 55.73478197572927)),)}


In [12]:
# Run the function and print out the results for image 2
bbox2, footprint2 = get_bbox_and_footprint(img_path2)
print("bbox: ", bbox2, "\n")
print("footprint: ", footprint2)

bbox:  [37.67786535472783, 55.726691972859314, 37.68191048184866, 55.730737099980146] 

footprint:  {'type': 'Polygon', 'coordinates': (((37.67786535472783, 55.726691972859314), (37.67786535472783, 55.730737099980146), (37.68191048184866, 55.730737099980146), (37.68191048184866, 55.726691972859314), (37.67786535472783, 55.726691972859314)),)}
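
The footprints printed above are simply the bounding boxes written as closed GeoJSON rings, where the first coordinate is repeated at the end. The sketch below reproduces that structure in plain Python to illustrate what the function returns for these images. The helper name `bbox_to_polygon` is hypothetical, and the sketch assumes, as in this example, that the raster bounds are already longitude/latitude values.

```python
def bbox_to_polygon(bbox):
    """Return a GeoJSON Polygon dict covering [left, bottom, right, top]."""
    left, bottom, right, top = bbox
    ring = [
        [left, bottom],
        [left, top],
        [right, top],
        [right, bottom],
        [left, bottom],  # GeoJSON rings must end where they start
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Bounding box of image 1, copied from the output above.
bbox1 = [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011]
polygon1 = bbox_to_polygon(bbox1)
print(polygon1["coordinates"][0])
```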


In [13]:
### Add band information from [WorldView-3 Data Sheet](https://www.spaceimagingme.com/downloads/sensors/datasheets/DG_WorldView3_DS_2014.pdf)
from pystac.extensions.eo import Band, EOExtension
wv3_bands = [Band.create(name='Coastal', description='Coastal: 400 - 450 nm', common_name='coastal'),
             Band.create(name='Blue', description='Blue: 450 - 510 nm', common_name='blue'),
             Band.create(name='Green', description='Green: 510 - 580 nm', common_name='green'),
             Band.create(name='Yellow', description='Yellow: 585 - 625 nm', common_name='yellow'),
             Band.create(name='Red', description='Red: 630 - 690 nm', common_name='red'),
             Band.create(name='Red Edge', description='Red Edge: 705 - 745 nm', common_name='rededge'),
             Band.create(name='Near-IR1', description='Near-IR1: 770 - 895 nm', common_name='nir08'),
             Band.create(name='Near-IR2', description='Near-IR2: 860 - 1040 nm', common_name='nir09')]

Beyond what a Catalog requires, a Collection requires a license for the data in the collection and an extent that describes the range of space and time that the items it holds occupy.

An extent consists of a SpatialExtent and a TemporalExtent. These hold one or more bounding boxes and time intervals, respectively, that completely cover the items contained in the collection.

Let's start by creating two new Items. They begin as core Items; the EO extension is then enabled on each Item's asset with `EOExtension.ext(..., add_if_missing=True)`, which also registers the extension in the Item's `stac_extensions`.

We're also using `datetime.now(timezone.utc)` to supply the required datetime property for our Item. Since this is a required property, you might often find yourself making up a time to fill in if you don't know the exact capture time.

In [14]:
from datetime import datetime, timezone

### STAC Item 1
collection_item = pystac.Item(id='local-image-col-1',
                               geometry=footprint,
                               bbox=bbox,
                               datetime=datetime.now(timezone.utc),
                               properties={})

collection_item.common_metadata.gsd = 0.3
collection_item.common_metadata.platform = 'Maxar'
collection_item.common_metadata.instruments = ['WorldView3']

asset = pystac.Asset(href=img_path1, 
                      media_type=pystac.MediaType.GEOTIFF)
collection_item.add_asset("image", asset)
eo = EOExtension.ext(collection_item.assets["image"], add_if_missing=True)
eo.apply(wv3_bands)

### STAC Item 2
collection_item2 = pystac.Item(id='local-image-col-2',
                               geometry=footprint2,
                               bbox=bbox2,
                               datetime=datetime.now(timezone.utc),
                               properties={})

collection_item2.common_metadata.gsd = 0.3
collection_item2.common_metadata.platform = 'Maxar'
collection_item2.common_metadata.instruments = ['WorldView3']

asset2 = pystac.Asset(href=img_path2,
                     media_type=pystac.MediaType.GEOTIFF)
collection_item2.add_asset("image", asset2)
eo = EOExtension.ext(collection_item2.assets["image"], add_if_missing=True)
eo.apply([
    band for band in wv3_bands if band.name in ["Red", "Green", "Blue"]
])
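
When the acquisition time is known, recording it is usually preferable to using the current time. A minimal sketch, assuming a made-up capture timestamp and a hypothetical helper name, shows how to obtain a timezone-aware UTC `datetime` that could be passed as the `datetime` argument of `pystac.Item`:

```python
from datetime import datetime, timedelta, timezone


def to_utc(dt: datetime) -> datetime:
    """Return a timezone-aware UTC datetime; naive inputs are assumed to be UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


# Hypothetical capture time reported in local time (UTC+3).
local_capture = datetime(2019, 4, 1, 11, 30, tzinfo=timezone(timedelta(hours=3)))
capture_utc = to_utc(local_capture)
print(capture_utc.isoformat())
```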

We can use our two items' metadata to find out what the proper bounds are:

In [15]:
from shapely.geometry import shape

# Calculate spatial extent for Collection.
unioned_footprint = shape(footprint).union(shape(footprint2))
collection_bbox = list(unioned_footprint.bounds)
spatial_extent = pystac.SpatialExtent(bboxes=[collection_bbox])

# Calculate temporal extent for Collection.
collection_interval = sorted([collection_item.datetime, collection_item2.datetime])
temporal_extent = pystac.TemporalExtent(intervals=[collection_interval])

# Combine spatial and temporal extents.
collection_extent = pystac.Extent(spatial=spatial_extent, temporal=temporal_extent)
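
The same calculation generalizes to any number of items without shapely: the spatial extent is the minimum and maximum of the bounding-box corners, and the temporal extent runs from the earliest to the latest datetime. The function below is an illustrative sketch with hypothetical names. The bounding boxes are the two printed earlier, and the timestamps are the ones recorded in this notebook's saved collection.

```python
from datetime import datetime, timezone


def extent_from_items(bboxes, datetimes):
    """Return (bbox, interval) covering every bounding box and datetime given."""
    lefts, bottoms, rights, tops = zip(*bboxes)
    bbox = [min(lefts), min(bottoms), max(rights), max(tops)]
    interval = [min(datetimes), max(datetimes)]
    return bbox, interval


item_bboxes = [
    [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011],
    [37.67786535472783, 55.726691972859314, 37.68191048184866, 55.730737099980146],
]
item_datetimes = [
    datetime(2024, 6, 1, 18, 36, 49, 324852, tzinfo=timezone.utc),
    datetime(2024, 6, 1, 18, 36, 49, 324493, tzinfo=timezone.utc),
]

covering_bbox, covering_interval = extent_from_items(item_bboxes, item_datetimes)
print(covering_bbox)
print(covering_interval)
```

The results can be passed to `pystac.SpatialExtent(bboxes=[covering_bbox])` and `pystac.TemporalExtent(intervals=[covering_interval])` in the same way as in the cell above. This simple min/max approach does not handle bounding boxes that cross the antimeridian.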


In [16]:
collection = pystac.Collection(id='wv3-images',
                               description='Spacenet 5 images over Moscow',
                               extent=collection_extent,
                               license='CC-BY-SA-4.0')

In [17]:
collection.add_items([collection_item, collection_item2])
catalog.add_child(collection)

`describe()` is a useful method on `Catalog` - but be careful when using it on large catalogs, as it will walk the entire tree of the STAC.

In [18]:
catalog.describe()

* <Catalog id=test-catalog>
    * <Collection id=wv3-images>
      * <Item id=local-image-col-1>
      * <Item id=local-image-col-2>


### Saving the catalog to a local directory.

The current STAC Catalog exists only in the memory of this machine. We can save it as JSON files in a local folder.

There are still no HREFs set on these in-memory items. PySTAC uses the `self` link on STAC objects to track where the file lives. Because we haven't set them, they evaluate to `None`:

In [19]:
print(catalog.get_self_href())

None


In order to set them, we can use `normalize_hrefs`. This method will create a normalized set of HREFs for each STAC object in the catalog, according to the [best practices document](https://github.com/radiantearth/stac-spec/blob/v0.8.1/best-practices.md#catalog-layout)'s recommendations on how to lay out a catalog.

In [20]:
catalog.normalize_hrefs(os.path.join(tmp_dir, 'stac-collection'))

Now that we've normalized to a root directory (`stac-collection` inside `tmp_dir`), we see that the `self` links are set:

In [21]:
print(catalog.get_self_href())

/home/ec2-user/SageMaker/data/user-create-catalog/stac-collection/catalog.json
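
The layout that `normalize_hrefs` assigns to this catalog can also be written out by hand. The sketch below does not call PySTAC; it only mirrors the file layout visible in this notebook's outputs (one folder per collection, one folder per item), and the function name is hypothetical.

```python
import posixpath


def expected_layout(root, collection_id, item_ids):
    """List the JSON paths for a catalog that holds a single collection."""
    paths = [
        posixpath.join(root, "catalog.json"),
        posixpath.join(root, collection_id, "collection.json"),
    ]
    for item_id in item_ids:
        # Each item gets its own folder, named after the item ID.
        paths.append(posixpath.join(root, collection_id, item_id, f"{item_id}.json"))
    return paths


layout = expected_layout(
    "data/user-create-catalog/stac-collection",
    "wv3-images",
    ["local-image-col-1", "local-image-col-2"],
)
for path in layout:
    print(path)
```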


We can now call `save` on the catalog, which will recursively save all the STAC objects to their respective self HREFs.

Save requires a `CatalogType` to be set. You can review the [API docs](https://pystac.readthedocs.io/en/stable/api.html#catalogtype) on `CatalogType` to see what each type means (unfortunately `help` doesn't show docstrings for attributes).

In [22]:
catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)

In [23]:
!ls {tmp_dir}/stac-collection/*

data/user-create-catalog/stac-collection/catalog.json

data/user-create-catalog/stac-collection/wv3-images:
collection.json  local-image-col-1  local-image-col-2


In [24]:
# Investigate the JSON file that represents the STAC Catalog.
with open(catalog.self_href) as f:
    print(f.read())

{
  "type": "Catalog",
  "id": "test-catalog",
  "stac_version": "1.0.0",
  "description": "Tutorial catalog with Collection.",
  "links": [
    {
      "rel": "root",
      "href": "./catalog.json",
      "type": "application/json"
    },
    {
      "rel": "child",
      "href": "./wv3-images/collection.json",
      "type": "application/json"
    }
  ]
}
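
A self-contained catalog is meant to be portable: its links are relative and no `self` link is written, which is what the JSON above shows. A small check along those lines, with a hypothetical function name and the link list copied from the output above, might look like this:

```python
from urllib.parse import urlparse


def looks_self_contained(stac_dict):
    """True if the document has no self link and only relative link HREFs."""
    for link in stac_dict.get("links", []):
        if link.get("rel") == "self":
            return False
        parsed = urlparse(link.get("href", ""))
        # An absolute HREF has a scheme (http, s3, ...) or starts at the root.
        if parsed.scheme or parsed.path.startswith("/"):
            return False
    return True


catalog_json = {
    "type": "Catalog",
    "id": "test-catalog",
    "links": [
        {"rel": "root", "href": "./catalog.json", "type": "application/json"},
        {"rel": "child", "href": "./wv3-images/collection.json", "type": "application/json"},
    ],
}
print(looks_self_contained(catalog_json))
```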


In [25]:
# Investigate the JSON file that represents the STAC Collection.
with open(collection.self_href) as f:
    print(f.read())

{
  "type": "Collection",
  "id": "wv3-images",
  "stac_version": "1.0.0",
  "description": "Spacenet 5 images over Moscow",
  "links": [
    {
      "rel": "root",
      "href": "../catalog.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./local-image-col-1/local-image-col-1.json",
      "type": "application/json"
    },
    {
      "rel": "item",
      "href": "./local-image-col-2/local-image-col-2.json",
      "type": "application/json"
    },
    {
      "rel": "parent",
      "href": "../catalog.json",
      "type": "application/json"
    }
  ],
  "extent": {
    "spatial": {
      "bbox": [
        [
          37.6616853489879,
          55.726691972859314,
          37.68191048184866,
          55.73882710285011
        ]
      ]
    },
    "temporal": {
      "interval": [
        [
          "2024-06-01T18:36:49.324493Z",
          "2024-06-01T18:36:49.324852Z"
        ]
      ]
    }
  },
  "license": "CC-BY-SA-4.0"
}


In [26]:
# Investigate the JSON file that represents the STAC Item.
with open(collection_item.self_href) as f:
    print(f.read())

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "id": "local-image-col-1",
  "properties": {
    "gsd": 0.3,
    "platform": "Maxar",
    "instruments": [
      "WorldView3"
    ],
    "datetime": "2024-06-01T18:36:49.324493Z"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          37.6616853489879,
          55.73478197572927
        ],
        [
          37.6616853489879,
          55.73882710285011
        ],
        [
          37.66573047610874,
          55.73882710285011
        ],
        [
          37.66573047610874,
          55.73478197572927
        ],
        [
          37.6616853489879,
          55.73478197572927
        ]
      ]
    ]
  },
  "links": [
    {
      "rel": "root",
      "href": "../../catalog.json",
      "type": "application/json"
    },
    {
      "rel": "collection",
      "href": "../collection.json",
      "type": "application/json"
    },
    {
      "rel": "parent",
      "href": "../collection.json",
 

### Upload STAC collection and items to STAC API <a id="upload-stac-api"></a>

We have so far converted the STAC Catalog from memory to JSON files in a local folder. Now we are ready to upload the JSON files from the local folder to our STAC API, so that we can geospatially query these STAC Items in the future.

In [27]:
# Helper function to POST or PUT to the local STAC API.
def post_or_put(url: str, data: dict):
    """Post or put data to url."""
    r = requests.post(url, json=data)
    if r.status_code == 409:
        new_url = url if data["type"] == "Collection" else url + f"/{data['id']}"
        # Exists, so update
        r = requests.put(new_url, json=data)
        # An unchanged document may return a 404; ignore that case
        if r.status_code != 404:
            r.raise_for_status()
    else:
        r.raise_for_status()

In [28]:
import json
from urllib.parse import urljoin

app_host = APP_HOST # Update with your STAC API endpoint.

# Upload the STAC Collection to the STAC API
with open(os.path.join(tmp_dir, 'stac-collection/wv3-images/collection.json')) as f:
    collection = json.load(f)

post_or_put(urljoin(app_host, "/collections"), collection)

In [29]:
import os
import json
from urllib.parse import urljoin
from pathlib import Path

app_host = APP_HOST # Update with your STAC API endpoint.

# Upload the STAC items to the STAC API Collection
root_folder = Path(os.path.join(tmp_dir, 'stac-collection/wv3-images/'))
img_dirs = [folder for folder in root_folder.iterdir() if folder.is_dir()]
for img_dir in img_dirs:
    json_files = img_dir.glob("*.json")
    for json_file in json_files:
        with open(json_file) as f:
            img_json = json.load(f)

        post_or_put(urljoin(app_host, f"collections/{collection['id']}/items"), img_json)
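
Note that `urljoin` treats its second argument the way a browser treats a link, so the result depends on leading and trailing slashes. Both calls above work because `APP_HOST` has no path component; if your STAC API is served under a path prefix, the behaviour changes. The sketch below illustrates this with a made-up host and a hypothetical `api_url` helper that simply concatenates path segments.

```python
from urllib.parse import urljoin

base_plain = "http://stac.example.com:8080"          # made-up host, no path
base_prefixed = "http://stac.example.com:8080/stac"  # made-up host with a path prefix

print(urljoin(base_plain, "/collections"))
print(urljoin(base_prefixed, "/collections"))  # leading slash drops the "/stac" prefix
print(urljoin(base_prefixed, "collections"))   # no trailing slash on base: "stac" is replaced


def api_url(base, *segments):
    """Join path segments onto a base URL without dropping an existing prefix."""
    parts = [base.rstrip("/")] + [segment.strip("/") for segment in segments]
    return "/".join(parts)


items_url = api_url(base_prefixed, "collections", "wv3-images", "items")
print(items_url)
```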

## 3. Query images from STAC API.

Now that the STAC Items also live on the STAC API, we can geospatially query the items that we are interested in.

### Query based on Item ID

A STAC Item can be retrieved directly by its ID.

In [30]:
import requests

app_host = APP_HOST # Update with your STAC API endpoint.
collection = "wv3-images"
item_id = "local-image-col-1"

response = requests.get(urljoin(app_host, "/collections/" + collection + "/items/" + item_id))
if response.status_code == 200:
    print('STAC API responded successfully. Congratulations!')
else:
    print('STAC API did NOT respond successfully. Something went wrong!')

STAC API responded successfully. Congratulations!


In [31]:
import json

# Parse the JSON data response
print("Returned STAC Item:")
response.json()

Returned STAC Item:


{'id': 'local-image-col-1',
 'bbox': [37.6616853489879,
  55.73478197572927,
  37.66573047610874,
  55.73882710285011],
 'type': 'Feature',
 'links': [{'rel': 'collection',
   'type': 'application/json',
   'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images'},
  {'rel': 'parent',
   'type': 'application/json',
   'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images'},
  {'rel': 'root',
   'type': 'application/json',
   'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/'},
  {'rel': 'self',
   'type': 'application/geo+json',
   'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images/items/local-image-col-1'}],
 'assets': {'image': {'href': 'data/user-create-catalog/image1.tif',
   'type': 'image/tiff; application=geotiff',
   'eo:bands': [{'name': 'Coastal',
     'common_name': 'coastal',
     'description': 'C

In [32]:
print("The asset of the returned STAC Item is located locally in:", response.json().get("assets").get("image").get("href"))

The asset of the returned STAC Item is located locally in: data/user-create-catalog/image1.tif


### Query based on Intersect

STAC Items can also be queried by a geometry, here a point, that they intersect.

In [33]:
from shapely.geometry import Polygon

# Define the polygon coordinates for one of the images in the STAC API collection wv3-images.
polygon_coordinates = [
    [37.6616853489879, 55.73478197572927],
    [37.6616853489879, 55.73882710285011],
    [37.66573047610874, 55.73882710285011],
    [37.66573047610874, 55.73478197572927]
]

# Create a Polygon object
polygon = Polygon(polygon_coordinates)

# Pick a point that is guaranteed to lie within the polygon
point_within_polygon = polygon.representative_point()

# Extract the x and y coordinates of the point
x = point_within_polygon.x
y = point_within_polygon.y

# Store the point coordinates as an [x, y] list
point_coordinates = [x, y]
point_coordinates

[37.66370791254832, 55.73680453928969]

In [34]:
import requests
import json

### Request items from the collection, passing the point as an `intersects` parameter.

app_host = APP_HOST # Update with your STAC API endpoint.
collection = "wv3-images"
point_coordinates = [x, y]
limit = 10

# Define the JSON payload
payload = {
    "collections": collection,
    "intersects": {
        "type": "Point",
        "coordinates": point_coordinates
    },
    "limit": limit
}
print(payload)

response = requests.get(urljoin(app_host, "/collections/" + collection + "/items"), params=payload)
if response.status_code == 200:
    print('STAC API responded successfully. Congratulations!')
else:
    print('STAC API did NOT respond successfully. Something went wrong!')

{'collections': 'wv3-images', 'intersects': {'type': 'Point', 'coordinates': [37.66370791254832, 55.73680453928969]}, 'limit': 10}
STAC API responded successfully. Congratulations!


In [35]:
import json

# Parse the JSON data response
print("Returned STAC Item:")
response.json()

Returned STAC Item:


{'type': 'FeatureCollection',
 'context': {'limit': 10, 'returned': 2},
 'features': [{'id': 'local-image-col-2',
   'bbox': [37.67786535472783,
    55.726691972859314,
    37.68191048184866,
    55.730737099980146],
   'type': 'Feature',
   'links': [{'rel': 'collection',
     'type': 'application/json',
     'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images'},
    {'rel': 'parent',
     'type': 'application/json',
     'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images'},
    {'rel': 'root',
     'type': 'application/json',
     'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/'},
    {'rel': 'self',
     'type': 'application/geo+json',
     'href': 'http://internal-stacapi-lb-1304670335.us-east-1.elb.amazonaws.com:8080/collections/wv3-images/items/local-image-col-2'}],
   'assets': {'image': {'href': 'data/user-create-catalog/image2.tif',
     'type
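
In the recorded output above, the response reports two returned features and lists `local-image-col-2` first, although the query point lies inside the footprint of `local-image-col-1`. This suggests that the `intersects` filter was not applied by the `GET /collections/{id}/items` request. In the STAC API specification, filtering with `intersects` belongs to the item search endpoint (`/search`), typically sent as a JSON `POST` body with `collections` given as a list. The sketch below builds such a body and adds a local bounding-box check to show which item should match. It assumes the API implements item search, and the helper names are hypothetical.

```python
def build_intersects_body(collection_id, point, limit=10):
    """Build an item-search body that filters by a GeoJSON Point."""
    return {
        "collections": [collection_id],  # a list, not a bare string
        "intersects": {"type": "Point", "coordinates": list(point)},
        "limit": limit,
    }


def bbox_contains(bbox, point):
    """True if the [west, south, east, north] box contains the (lon, lat) point."""
    west, south, east, north = bbox
    lon, lat = point
    return west <= lon <= east and south <= lat <= north


def search_items(app_host, body, timeout=30):
    """POST the body to <app_host>/search and return the matching features."""
    import requests  # imported lazily so the helpers above run without it

    response = requests.post(app_host.rstrip("/") + "/search", json=body, timeout=timeout)
    response.raise_for_status()
    return response.json()["features"]


# Values copied from the outputs earlier in this notebook.
query_point = [37.66370791254832, 55.73680453928969]
known_bboxes = {
    "local-image-col-1": [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011],
    "local-image-col-2": [37.67786535472783, 55.726691972859314, 37.68191048184866, 55.730737099980146],
}

search_body = build_intersects_body("wv3-images", query_point)
expected_ids = [item_id for item_id, box in known_bboxes.items() if bbox_contains(box, query_point)]
print(expected_ids)
```

With a working item search endpoint, `search_items(APP_HOST, search_body)` would be expected to return only the item listed in `expected_ids`.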

## 4. Working with STAC Collection and Items with S3 bucket.
### Upload STAC collection and items to S3 bucket

The STAC Collection and Items created above can also be uploaded to an S3 bucket as JSON files.
These files can later be downloaded and uploaded to a STAC API.

In [36]:
import boto3
import os
from botocore.exceptions import NoCredentialsError

# Upload a local directory tree to S3, optionally encrypting with a KMS key alias
def uploadDirectoryToS3(local_path, bucketname, destination_path, kms_key_alias=""):
    s3 = boto3.client('s3')

    # Only request SSE-KMS encryption when a key alias is given
    extra_args = None
    if kms_key_alias:
        extra_args = {
            'ServerSideEncryption': 'aws:kms',
            'SSEKMSKeyId': 'alias/' + kms_key_alias
        }

    for root, dirs, files in os.walk(local_path):
        for file in files:
            local_file = os.path.join(root, file)
            relative_path = os.path.relpath(local_file, local_path)
            # S3 keys use forward slashes, regardless of the local OS
            s3_key = os.path.join(destination_path, relative_path).replace("\\", "/")
            s3.upload_file(local_file, bucketname, s3_key, ExtraArgs=extra_args)

In [37]:
# Upload STAC catalog to S3.

# Specify the bucket name, the local folder, and the destination prefix
bucket_name = S3_BUCKET
local_folder = 'data/user-create-catalog'
destination_folder = '1-raw'

# Upload the folder to S3
uploadDirectoryToS3(local_folder, bucket_name, destination_folder)
print("STAC catalog in folder", local_folder, "uploaded to:", os.path.join(bucket_name, destination_folder))

STAC catalog in folder data/user-create-catalog uploaded to: data-bucket-5ndo/1-raw
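
Before uploading a large folder, it can help to preview which object keys would be created. The sketch below mirrors the key construction used in `uploadDirectoryToS3` but makes no AWS calls; the function name and the temporary demo folder are made up for illustration.

```python
import os
import tempfile


def plan_s3_keys(local_path, destination_path):
    """Return (local_file, s3_key) pairs for every file under local_path."""
    plan = []
    for root, _dirs, files in os.walk(local_path):
        for name in sorted(files):
            local_file = os.path.join(root, name)
            relative_path = os.path.relpath(local_file, local_path)
            # S3 keys use forward slashes regardless of the local OS.
            s3_key = "/".join([destination_path.strip("/")] + relative_path.split(os.sep))
            plan.append((local_file, s3_key))
    return plan


# Demo on a throwaway folder that imitates part of the catalog layout.
with tempfile.TemporaryDirectory() as demo_dir:
    nested = os.path.join(demo_dir, "stac-collection", "wv3-images")
    os.makedirs(nested)
    open(os.path.join(demo_dir, "image1.tif"), "w").close()
    open(os.path.join(nested, "collection.json"), "w").close()
    keys = sorted(key for _local, key in plan_s3_keys(demo_dir, "1-raw"))

print(keys)
```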


### Download STAC collection and items from S3 bucket

STAC collection and items as JSON files can be downloaded from an S3 bucket. 
Then, these JSON files of STAC collection and items can be uploaded to a local STAC API.

In [38]:
import boto3
import os 

def downloadDirectoryFromS3(bucketName, remote_folder, local_folder):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucketName)
    # Match only keys under "<remote_folder>/", not siblings that share the prefix
    prefix = remote_folder.rstrip('/') + '/'
    for obj in bucket.objects.filter(Prefix=prefix):
        # Skip "folder" placeholder keys, which have no file content to download
        if obj.key.endswith('/'):
            continue
        relative_key = obj.key[len(prefix):]
        local_file = os.path.join(local_folder, relative_key)
        os.makedirs(os.path.dirname(local_file), exist_ok=True)
        bucket.download_file(obj.key, local_file)

In [39]:
# Download STAC catalog from S3 to local folder.
import boto3

s3_client = boto3.client('s3')

bucket_name = S3_BUCKET
prefix = '1-raw'
local_folder = 'data/user-download-catalog'

downloadDirectoryFromS3(bucket_name, prefix, local_folder)

The downloaded STAC Collection and Items can now be uploaded again to the STAC API using the code snippets in [Upload STAC collection and items to STAC API](#upload-stac-api).
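
Because the downloaded folder holds images and the catalog file next to the Collection and Item JSON files, a small helper can sort out what to send where. The sketch below groups JSON files by their STAC `type` field so that Collections can be posted before their Items; the function name and the demo files are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def find_stac_documents(root):
    """Return (collections, items) as lists of parsed JSON dicts found under root."""
    collections, items = [], []
    for json_file in sorted(Path(root).rglob("*.json")):
        with open(json_file) as f:
            doc = json.load(f)
        if doc.get("type") == "Collection":
            collections.append(doc)
        elif doc.get("type") == "Feature":  # STAC Items are GeoJSON Features
            items.append(doc)
        # Catalogs and unrelated JSON files are ignored.
    return collections, items


# Demo with a throwaway folder standing in for the downloaded catalog.
with tempfile.TemporaryDirectory() as demo_dir:
    collection_dir = Path(demo_dir) / "stac-collection" / "wv3-images"
    item_dir = collection_dir / "local-image-col-1"
    item_dir.mkdir(parents=True)
    (collection_dir.parent / "catalog.json").write_text(
        json.dumps({"type": "Catalog", "id": "test-catalog"}))
    (collection_dir / "collection.json").write_text(
        json.dumps({"type": "Collection", "id": "wv3-images"}))
    (item_dir / "local-image-col-1.json").write_text(
        json.dumps({"type": "Feature", "id": "local-image-col-1"}))
    found_collections, found_items = find_stac_documents(demo_dir)

print([c["id"] for c in found_collections], [i["id"] for i in found_items])
```

Each Collection found this way could then be passed to `post_or_put` with the `/collections` URL, and each Item with the matching `/collections/{id}/items` URL, as in the upload section.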