## Converting STAC catalogs to Intake catalogs
This notebook gives a brief example of converting a STAC catalog to an Intake catalog using the Sat-STAC library.

Note: To ensure that code runs as expected, open the notebook in the conda environment provided in `environment.yml`, or ensure that the dependencies listed there are installed.

In [29]:
import yaml
from satstac import Catalog, Collection, Item

## Examining catalog structure
Note: Sat-STAC's `Catalog.items` method iterates through all items that descend from the given catalog. There doesn't seem to be a simple way to obtain only the direct descendants of a catalog. To avoid duplication, the block below displays only the catalog/collection structure.

In [30]:
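def traverse(cat, level=0):
    """Print the IDs of a catalog and its sub-catalogs as an indented tree."""
    print('\t' * level + cat.id)
    for child in cat.children():
        # Recurse only into children that are Catalog instances.
        if isinstance(child, Catalog):
            traverse(child, level + 1)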

cat = Catalog.open("https://storage.googleapis.com/pdd-stac/disasters/catalog.json")
traverse(cat)

planet-disaster-data
	hurricane-harvey
		hurricane-harvey-0831
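
The same recursion can also return the tree instead of printing it, which makes the structure easier to reuse or test. The sketch below runs offline against a hypothetical stand-in class, `Node`, that only mimics the `id` attribute and `children()` method used above; it is not part of Sat-STAC, and `tree_lines` is an illustrative name.

```python
class Node:
    """Hypothetical stand-in for a catalog: exposes `id` and `children()`."""

    def __init__(self, id, children=()):
        self.id = id
        self._children = list(children)

    def children(self):
        return iter(self._children)


def tree_lines(cat, level=0):
    """Yield one tab-indented line per catalog, parents before children."""
    yield '\t' * level + cat.id
    for child in cat.children():
        yield from tree_lines(child, level + 1)


# Rebuild the structure printed above from stand-in nodes.
root = Node('planet-disaster-data', [
    Node('hurricane-harvey', [Node('hurricane-harvey-0831')]),
])

print('\n'.join(tree_lines(root)))
```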


## Examining catalog items
The block below displays the assets of a single catalog item.

In [31]:
item = next(cat.items())
print(item, item.assets)

20170831_172754_101c {'thumbnail': {'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_thumb_large.png', 'title': 'Thumbnail', 'type': 'image/png'}, 'analytic': {'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS.tif', 'title': 'PSScene4Band Analytic GeoTIFF', 'pl:type': 'https://api.planet.com/data/v1/asset-types/analytic', 'type': 'image/vnd.stac.geotiff; cloud-optimized=true'}, 'analytic_xml': {'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS_metadata.xml', 'title': 'PSScene4Band XML Metadata', 'pl:type': 'https://api.planet.com/data/v1/asset-types/analytic_xml', 'type': 'text/xml'}, 'udm': {'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS_DN_udm.tif', 'title': 'PSScene4Band Unusable Data Mask', 'pl:type': 'https://api.planet.com/data/v1
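
The raw printout is hard to scan, so a small helper that prints one line per asset can help. This is a minimal sketch: `summarize_assets` is a hypothetical helper name, and the `assets` dict is copied from the first three entries of the output above (the truncated `udm` entry is left out).

```python
# First three assets from the output above; the truncated 'udm' entry is omitted.
assets = {
    'thumbnail': {
        'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_thumb_large.png',
        'title': 'Thumbnail',
        'type': 'image/png',
    },
    'analytic': {
        'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS.tif',
        'title': 'PSScene4Band Analytic GeoTIFF',
        'pl:type': 'https://api.planet.com/data/v1/asset-types/analytic',
        'type': 'image/vnd.stac.geotiff; cloud-optimized=true',
    },
    'analytic_xml': {
        'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS_metadata.xml',
        'title': 'PSScene4Band XML Metadata',
        'pl:type': 'https://api.planet.com/data/v1/asset-types/analytic_xml',
        'type': 'text/xml',
    },
}


def summarize_assets(assets):
    """Return one 'key | type | href' line per asset, with keys padded to align."""
    width = max((len(key) for key in assets), default=0)
    return [
        f"{key:<{width}} | {info.get('type', 'unknown')} | {info['href']}"
        for key, info in assets.items()
    ]


for line in summarize_assets(assets):
    print(line)
```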

## Converting STAC catalog to Intake Catalog
The block below converts a STAC catalog to an Intake-compatible `catalog.yml` file. The function `get_driver` maps asset media types to their appropriate Intake drivers, while `to_intake` returns a correctly formatted object describing the catalog.

Note: This process will need to be extended to handle a wider array of file types and to output item-specific metadata and driver arguments, such as chunk size, where possible.

In [32]:
def get_driver(datatype):
    """Map a STAC asset media type to the name of an Intake driver."""
    drivers = {
        'application/netcdf': 'netcdf',
        'image/vnd.stac.geotiff': 'rasterio',
        'image/vnd.stac.geotiff; cloud-optimized=true': 'rasterio',
        'image/png': 'xarray_image',
        'image/jpg': 'xarray_image',
        'image/jpeg': 'xarray_image',
        'text/xml': 'textfiles',
    }
    # Unknown media types fall back to the media type string itself.
    return drivers.get(datatype, datatype)
    

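def to_intake(catalog):
    """Build a dict describing an Intake catalog with one source per asset."""
    sources = {}
    for item in catalog.items():
        for key, value in item.assets.items():
            # Source names are the item ID followed directly by the asset key.
            sources[item.id + key] = {
                'description': value.get('title', key),
                'driver': get_driver(value['type']),
                'args': {
                    'urlpath': value['href'],
                    'chunks': {}
                }
            }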
    return {
        'metadata': {
            "version": 1
        },
        'plugins': {
            'source': [{'module': 'intake_xarray'}]
        },
        'sources': sources
    }

with open('catalog.yml', 'w') as outfile:
    yaml.dump(to_intake(cat), outfile, default_flow_style=False)
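
One possible direction for the extension mentioned in the note above is to parse media-type parameters instead of listing every variant, and to keep driver-specific default arguments next to each driver. The sketch below is an assumption-laden illustration, not the notebook's method: `DRIVERS`, `parse_media_type`, and `build_source` are hypothetical names, and the choice to pass `chunks` only to the xarray-based drivers is a guess that should be checked against each driver's documentation.

```python
import copy

# Base media type -> (driver name, default driver arguments).
# Driver names are the ones used in this notebook; the default
# arguments per driver are assumptions made for illustration.
DRIVERS = {
    'application/netcdf': ('netcdf', {'chunks': {}}),
    'image/vnd.stac.geotiff': ('rasterio', {'chunks': {}}),
    'image/png': ('xarray_image', {'chunks': {}}),
    'image/jpg': ('xarray_image', {'chunks': {}}),
    'image/jpeg': ('xarray_image', {'chunks': {}}),
    'text/xml': ('textfiles', {}),
}


def parse_media_type(media_type):
    """Split 'type/subtype; key=value' into (base type, parameter dict)."""
    base, *params = [part.strip() for part in media_type.split(';')]
    parsed = {}
    for param in params:
        if not param:
            continue  # tolerate a trailing ';'
        name, _, value = param.partition('=')
        parsed[name.strip()] = value.strip()
    return base.lower(), parsed


def build_source(asset):
    """Return a source entry for one asset, or None if its type is unsupported."""
    base, params = parse_media_type(asset['type'])
    if base not in DRIVERS:
        return None
    driver, default_args = DRIVERS[base]
    # Deep-copy the defaults so entries never share mutable objects.
    args = {'urlpath': asset['href'], **copy.deepcopy(default_args)}
    return {
        'description': asset.get('title', ''),
        'driver': driver,
        'args': args,
        'metadata': params,  # e.g. {'cloud-optimized': 'true'}
    }


# The 'analytic' asset from the item shown earlier in this notebook.
analytic = {
    'href': 'https://storage.googleapis.com/pdd-stac/disasters/hurricane-harvey/0831/20170831_172754_101c_3B_AnalyticMS.tif',
    'title': 'PSScene4Band Analytic GeoTIFF',
    'type': 'image/vnd.stac.geotiff; cloud-optimized=true',
}
print(build_source(analytic))
```

Returning `None` for unknown types lets the caller skip those assets rather than writing a media type string into the `driver` field.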

## Examining the converted Intake catalog
The block below opens a GUI in which the converted Intake catalog can be examined and browsed.

In [28]:
import intake
intake.gui

Intake GUI instance: to get widget to display, you must install ipy/jupyter-widgets, run in a notebook and, in the case of jupyter-lab, install the jlab extension.

## Accessing items through Intake
The block below demonstrates how catalog items can be accessed and loaded through Intake.

Note: Data structures other than Dask arrays have not been tested yet.

In [33]:
cat = intake.open_catalog('catalog.yml')
tif = cat['20170831_162740_ssc1d1visual']
tif.to_dask()

<xarray.DataArray (band: 3, y: 27671, x: 28122)>
dask.array<shape=(3, 27671, 28122), dtype=uint8, chunksize=(3, 27671, 28122)>
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 3.384e+06 3.384e+06 3.384e+06 ... 3.359e+06 3.359e+06
  * x        (x) float64 -1.063e+07 -1.063e+07 ... -1.06e+07 -1.06e+07
Attributes:
    transform:   (0.9156731177980102, 0.0, -10627664.662946375, 0.0, -0.91567...
    crs:         +init=epsg:3857
    res:         (0.9156731177980102, 0.9156731177980102)
    is_tiled:    1
    nodatavals:  (nan, nan, nan)
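
The repr above reports a single Dask chunk spanning the whole array (`chunksize` equals the array shape), which appears consistent with the empty `chunks: {}` argument written by `to_intake`. As a rough sketch of what an explicit chunk size would change, the arithmetic below compares that layout with a tiled one. The `(1, 2048, 2048)` chunk shape and the helper name `chunk_layout` are assumptions chosen for illustration, not values from the notebook; the item size of 1 byte follows from the `uint8` dtype shown above.

```python
import math


def chunk_layout(shape, chunks, itemsize=1):
    """Describe how an array of `shape` splits into blocks of size `chunks`."""
    # Number of blocks along each dimension (edge blocks may be partial).
    counts = tuple(math.ceil(dim / size) for dim, size in zip(shape, chunks))
    return {
        'counts': counts,
        'n_chunks': math.prod(counts),
        # Size in bytes of one full (non-edge) block.
        'chunk_bytes': math.prod(chunks) * itemsize,
        'total_bytes': math.prod(shape) * itemsize,
    }


shape = (3, 27671, 28122)  # (band, y, x) from the repr above, dtype uint8

whole = chunk_layout(shape, shape)            # one chunk, as in the repr
tiled = chunk_layout(shape, (1, 2048, 2048))  # hypothetical tiled layout

print(whole)
print(tiled)
```

With a single chunk, any computation has to load the full array (about 2.3 GB here) at once, whereas a tiled layout lets Dask work on a few megabytes at a time.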