# Generate a STAC Catalog

For better or worse, it is easier to build a STAC catalog at the same time that the STAC items are created. Below, we generate a catalog and item metadata for a directory of data.

---------------------
## Step I: Make the catalog

In [4]:
import datetime
from pathlib import Path
import ssl
# This is required for verification / validation against remote resources from inside the network.
# Note that it disables HTTPS certificate verification for the whole process.
ssl._create_default_https_context = ssl._create_unverified_context

import pystac
from pystac import Collection, SpatialExtent, TemporalExtent, Extent



description = """
The Solid State Imager (SSI) on NASA's Galileo spacecraft acquired more than 500 images of Jupiter's moon, Europa, 
providing the only moderate- to high-resolution images of the moon's surface. Images were acquired as observation 
sequences during each orbit that targeted the moon. Each of these observation sequences consists of between 1 and 
19 images acquired close in time, that typically overlap, have consistent illumination and similar pixel scale. 
The observations vary from relatively low-resolution hemispherical imaging, to high-resolution targeted images that 
cover a small portion of the surface. Here we provide average mosaics of each of the individual observation sequences 
acquired by the Galileo spacecraft. These observation mosaics were constructed from a set of 481 Galileo images that 
were photogrammetrically controlled globally (along with 221 Voyager 1 and 2 images) to improve their relative 
locations on Europa's surface. The 92 observation mosaics provide users with nearly the entire Galileo Europa 
imaging dataset at its native resolution and with improved relative image locations.

The Solid State Imager (SSI) on NASA's Galileo spacecraft provided the only moderate- to high-resolution images 
of Jupiter's moon, Europa. Unfortunately, uncertainty in the position and pointing of the spacecraft, as well as 
the position and orientation of Europa, when the images were acquired resulted in significant errors in image 
locations on the surface. The result of these errors is that images acquired during different Galileo orbits, or 
even at different times during the same orbit, are significantly misaligned (errors of up to 100 km on the surface).
Previous work has generated global mosaics of Galileo and Voyager images that photogrammetrically control a subset 
of the available images to correct their relative locations. However, these efforts result in a "static" mosaic 
that is projected to a consistent pixel scale, and only use a fraction of the dataset (e.g., high resolution images 
are not included). The purpose of this current dataset is to increase the usability of the entire Galileo image set 
by photogrammetrically improving the locations of nearly every Europa image acquired by Galileo, and making them 
available to the community at their native resolution and in easy-to-use regional mosaics based on their acquisition time.
The dataset therefore provides a set of image mosaics that can be used for scientific analysis and mission planning activities.
"""

coll = Collection(id='usgs_controlled_voy1_voy2_galileo',
                  title='USGS Controlled Europa Voyager 1, Voyager 2, and Galileo Image Data',
                  description=description,
                  # STAC extents are lists of bounding boxes and lists of [start, end] intervals
                  extent=Extent(SpatialExtent([[-180, -90, 180, 90]]), TemporalExtent([[datetime.datetime(2021, 1, 1), None]])),
                  href='https://asc-jupiter.s3-us-west-2.amazonaws.com/europa/individual_l2/collection.json',
                  license='PDDL-1.0'
                 )
coll.validate()

----------------
## Step II: Get a list of the input data

Below, we generate the catalog from a list file that contains fully qualified paths. One could also use glob to build the file list dynamically from within a notebook.
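
As a minimal sketch of the glob alternative, the helper below collects products by walking a directory tree. The function name `list_products` and the `*.cub` pattern are assumptions made for illustration; adjust the pattern to match the products you actually want to catalog.

```python
from pathlib import Path


def list_products(root, pattern="*.cub"):
    """Return sorted paths under root (searched recursively) that match pattern."""
    return sorted(Path(root).rglob(pattern))


# Hypothetical usage; substitute the real archive directory:
# products = list_products("/archive/projects/europa/GLL_FinProducts")
```

Sorting keeps the item order stable between runs, which makes the generated catalog easier to compare across rebuilds.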

In [5]:
UPLOAD_DIR = '/scratch/ARD/stac/jupiter/europa/'

# List the products to generate STAC for...
with open('/archive/projects/europa/GLL_FinProducts/observation_lev2_products.lis', 'r') as f:
    products = f.readlines()
products = [Path(p.rstrip()) for p in products]

In [7]:
print(products[0:2])

[PosixPath('/archive/projects/europa/GLL_FinProducts/10ESGLOBAL01/Lev2/s0413742778.equi.cub'), PosixPath('/archive/projects/europa/GLL_FinProducts/10ESGLOBAL01/Lev2/s0413742778.equi.photr.cub')]
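
The two paths above differ only in their suffixes, and Step III derives each output name with `os.path.splitext`. The short check below, using the two paths from the output above, shows that only the final `.cub` extension is stripped, so the two products still get distinct JSON file names.

```python
import os

paths = [
    "/archive/projects/europa/GLL_FinProducts/10ESGLOBAL01/Lev2/s0413742778.equi.cub",
    "/archive/projects/europa/GLL_FinProducts/10ESGLOBAL01/Lev2/s0413742778.equi.photr.cub",
]

# splitext removes only the last extension (".cub"), so earlier suffixes are kept.
outnames = [os.path.splitext(os.path.basename(p))[0] for p in paths]
print(outnames)  # ['s0413742778.equi', 's0413742778.equi.photr']
```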


-----------------------
## Step III: Cook Metadata and Update the Catalog

In [None]:
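import json
import os

# FGDCMetadata, GDALMetadata, IsisMetadata, UnifiedMetadata, and to_stac are assumed
# to have been imported already from the metadata library in use.
for f in products:
    f = str(f)
    basename = os.path.basename(f)
    # Strip only the final extension (e.g. '.cub') to get the output name
    outname = os.path.splitext(basename)[0]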

    fgdc = FGDCMetadata('sample.xml')
    gd = GDALMetadata(f)
    imd = IsisMetadata(f)

    overrides = {'license': 'PDDL-1.0',
         'missions':['Voyager 1', 'Voyager 2', 'Galileo'],
         'doi':'mydoinumber',
         'href':'https://asc-jupiter.s3-us-west-2.amazonaws.com/europa/individual_l2'}

    record = UnifiedMetadata([fgdc, gd, imd], overrides=overrides, mappings={'bbox':IsisMetadata, })

    assets = [{'title':'JPEG thumbnail of image {productid}',
       'href':'{href}/{productid}.jpeg',
       'media_type':'image/jpeg',
       'roles':['thumbnail'],
       'key':'thumbnail'},
     {'title': 'Cloud optimized GeoTiff for image {productid}',
      'href':'{href}/{productid}-cog.tif',
      'media_type':'image/tiff; application=geotiff; profile=cloud-optimized',
      'roles':['data'],
      'key':'B1'},
     {'title': 'GDAL PAM Metadata for image {productid}',
      'href':'{href}/{productid}-cog.tif.aux.xml',
      'media_type':'application/xml',
      'roles':['metadata'],
      'key':'gdal_metadata'}]

    # Convert the generic metadata record into a STAC formatted metadata record
    as_stac = to_stac(record, assets=assets)
    as_stac.validate()
    
    # Add the item to the parent collection. This also adds the collection to the item
    coll.add_item(as_stac)
    
    # Write the STAC metadata
    # Use a separate name for the file handle so the loop variable f is not shadowed
    with open(f'{UPLOAD_DIR}/{outname}.json', 'w') as out:
        json.dump(as_stac.to_dict(), out)

# Now write the collection
coll.validate()
with open(f'{UPLOAD_DIR}/collection.json', 'w') as f:
    json.dump(coll.to_dict(), f)
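
The asset definitions in Step III contain `{href}` and `{productid}` placeholders. How `to_stac` fills them is not shown here, so the following is only a sketch of one plausible expansion using `str.format`. The helper `expand_asset` and the example product ID are hypothetical and are not part of the library used above.

```python
def expand_asset(template, **fields):
    """Fill {placeholders} in a template's title and href (hypothetical helper)."""
    expanded = dict(template)  # shallow copy so the template itself is left unchanged
    for key in ("title", "href"):
        expanded[key] = template[key].format(**fields)
    return expanded


thumbnail = {
    "title": "JPEG thumbnail of image {productid}",
    "href": "{href}/{productid}.jpeg",
    "media_type": "image/jpeg",
    "roles": ["thumbnail"],
    "key": "thumbnail",
}

# The product ID below is an assumed example value.
asset = expand_asset(
    thumbnail,
    href="https://asc-jupiter.s3-us-west-2.amazonaws.com/europa/individual_l2",
    productid="s0413742778.equi",
)
print(asset["href"])
```

Keeping the templates free of concrete values means the same asset list can be reused for every product in the loop.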