# **SpatioTemporal Asset Catalog (STAC)**

## 1. STAC Inquiry
### 1.1 What is STAC?
The SpatioTemporal Asset Catalog (STAC) provides programmatic access to an index of Earth Observation metadata, with a wide variety of metadata fields available for searching. STAC defines a common set of search terms so that geospatial information can be indexed and found quickly. Assets are organized by both time and space.

### 1.2 The Motivation for creating STAC
Groups working with spatial data have often been unaware of the spatial standards that are common in other areas of research. Moreover, data is often dispersed and difficult to collect, especially when several search tools are required to obtain different types of data. With the creation of STAC, retrieving satellite data has become easier.

The main goal of STAC is to implement a global index of satellite, aerial, and other spatial imagery, such as data derived from LiDAR, SAR, Full Motion Video, and hyperspectral sensors.

### 1.3 Who can use STAC?
Anyone can! STAC is set up so that professionals *and* non-professionals can easily query and retrieve data.

### 1.4 How does STAC work?
Once data is released, it can be added to STAC, where it is indexed and can be discovered through common search terms and access methods. STAC has a well-defined structure, detailed in the STAC Specification.
STAC consists of 4 main specifications:

- **STAC Item:** A single spatiotemporal asset, expressed as a GeoJSON Feature.
- **STAC Catalog:** Structure for browsing and cataloging items.
- **STAC Collection:** Extension of the Catalog that provides additional information about its items, such as license information and other metadata. Each collection represents a different dataset together with its items' metadata.
- **STAC API:** An API that makes the data searchable, built on the WFS3 format (now published as OGC API - Features).
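
To make the Item specification more concrete, here is a minimal sketch of an Item written as a Python dictionary. The ID, coordinates, datetime, and asset URL are hypothetical placeholders rather than records from a real catalog; only the top-level field names follow the STAC Item specification.

```python
import json

# Hypothetical example Item; every value below is a placeholder.
item = {
    "type": "Feature",  # a STAC Item is a GeoJSON Feature
    "stac_version": "1.0.0",
    "id": "example-scene-001",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-123.3, 49.2], [-123.1, 49.2], [-123.1, 49.3],
            [-123.3, 49.3], [-123.3, 49.2],
        ]],
    },
    "bbox": [-123.3, 49.2, -123.1, 49.3],  # [west, south, east, north]
    "properties": {"datetime": "2016-07-01T19:00:00Z"},
    "links": [],
    "assets": {
        "B04": {"href": "https://example.com/example-scene-001/B04.tif"},
    },
}

# Top-level fields the Item specification expects to be present.
REQUIRED_FIELDS = ["type", "stac_version", "id", "geometry", "properties", "links", "assets"]
missing = [field for field in REQUIRED_FIELDS if field not in item]
print(json.dumps(item["properties"]), "missing:", missing)
```

The `assets` entries are where the direct links to the data live, while `properties` carries the searchable metadata.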


### 1.5 When to use STAC?
STAC can index data quickly and efficiently, so it is a good option when querying large amounts of geospatial data. With the growth of geographic data science, quick access to large volumes of satellite and aerial imagery is essential. STAC makes it easy and fast to find data and its metadata.

Any open-access data collection with direct links to its data can be added to the STAC Catalog for future indexing. STAC's indexing makes it possible to search for specific items by data attributes such as bounding box, geometry, date range, and item properties.
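
As a rough illustration of how these attributes combine, the sketch below assembles a search body of the kind a STAC API item search accepts. The helper `build_search_body` and the collection ID are hypothetical, and the `query` block assumes the server supports property filtering.

```python
import json

def build_search_body(collections, bbox, start, end, max_cloud=None, limit=10):
    """Assemble a STAC API item-search body from common search attributes."""
    body = {
        "collections": collections,
        "bbox": bbox,                  # [west, south, east, north]
        "datetime": f"{start}/{end}",  # start/end interval
        "limit": limit,
    }
    if max_cloud is not None:
        # Property filter in the style of the STAC API query extension
        body["query"] = {"eo:cloud_cover": {"lt": max_cloud}}
    return body

body = build_search_body(
    collections=["example-collection"],  # hypothetical collection ID
    bbox=[-123.27, 49.20, -123.13, 49.29],
    start="2016-01-01T00:00:00Z",
    end="2016-12-31T23:59:59Z",
    max_cloud=5,
)
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the API's search endpoint; a geometry can be supplied through an `intersects` field instead of `bbox`.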


## 2. The Benefits
- The STAC spec is easy to implement and adapts flexibly to existing implementations. The core philosophy is to enable maximum flexibility.
- It uses links (URLs), which enable modeling of much more complex relationships, such as pointing to the source images that were combined into a mosaic of thousands of images.
- It provides human-readable HTML representations of STAC items in the STAC catalog.
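
The link-based model can be sketched in a few lines. The link list below is hypothetical; it only mirrors the `rel`/`href` shape that STAC objects use, with `derived_from` marking the source items of a derived product such as a mosaic.

```python
def hrefs_by_rel(links, rel):
    """Return the href of every link whose relation type matches `rel`."""
    return [link["href"] for link in links if link.get("rel") == rel]

# Hypothetical link list in the shape STAC objects use.
links = [
    {"rel": "self", "href": "https://example.com/collections/mosaic"},
    {"rel": "parent", "href": "https://example.com/"},
    {"rel": "derived_from", "href": "https://example.com/items/scene-001"},
    {"rel": "derived_from", "href": "https://example.com/items/scene-002"},
]

sources = hrefs_by_rel(links, "derived_from")
print(sources)
```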

## 3. GEOAnalytics Canada's STAC Server
The GEOAnalytics Canada platform has implemented the STAC Catalog API for use across the platform, providing easy access to openly searchable geospatial assets. Currently, Landsat 8 data is available for querying and access. Using your API Token, you can search, update, and analyze geospatial data through the GEOAnalytics tools, for example by creating a dataframe from the data in a notebook in our GEOAnalytics JupyterLab, or by calling the STAC items from the Desktop VM.

## 4. PySTAC Client and StackSTAC

PySTAC Client is a Python package that makes it easier for us to work with STAC Catalogs and APIs. It does this by offering higher-level functionality and the ability to make better use of STAC API search endpoints.

StackSTAC is a Python package that enables us to transform a STAC collection into an xarray.DataArray using dask. This lets us run analyses directly on the data we queried from a STAC server!

To use these packages we must first import them:

In [16]:
import urllib.parse
import requests
import stackstac

import geopandas as gpd
import numpy as np

from pystac_client import Client

### 4.1 STAC Example
Here is an example of the STAC API showing the collections available on our STAC server. With your API Token and the base STAC URL, we can list the collections currently available on the GEOAnalytics STAC Server.

In [None]:
API_TOKEN = input("Please copy and paste your API Access Token here: ").strip()

In [9]:
STAC_BASE_URL = "https://stac.geoanalytics.ca"  
requests_headers = {'cookie': API_TOKEN}

In [17]:
def get_stac_endpoint(endpoint):
    # Join the endpoint onto the base URL and send an authenticated GET request
    url = urllib.parse.urljoin(STAC_BASE_URL, endpoint)
    response = requests.get(url, headers=requests_headers)
    return response
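
The helper relies on `urljoin`, whose result depends on whether the endpoint starts with a slash. A small sketch of that behaviour follows; the `example.com` base with a `/stac/` prefix is hypothetical and only there to show the difference.

```python
from urllib.parse import urljoin

base = "https://stac.geoanalytics.ca"

# A leading slash makes the endpoint relative to the server root.
collections_url = urljoin(base, "/collections")

# Hypothetical base with a path prefix, to show how the two forms differ.
prefixed = "https://example.com/stac/"
relative_url = urljoin(prefixed, "collections")   # stays under /stac/
absolute_url = urljoin(prefixed, "/collections")  # replaces the /stac/ prefix
print(collections_url, relative_url, absolute_url)
```

With a base URL that has no path, as here, both forms resolve to the same address, so passing `'/collections'` is safe.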

In [18]:
get_stac_endpoint('/collections').json()

{'collections': [{'id': 'landsat-8-l1_EXAMPLE',
   'links': [{'rel': 'items',
     'type': 'application/geo+json',
     'href': 'http://stac.geoanalytics.ca/collections/landsat-8-l1_EXAMPLE/items'},
    {'rel': 'parent',
     'type': 'application/json',
     'href': 'http://stac.geoanalytics.ca/'},
    {'rel': 'root',
     'type': 'application/json',
     'href': 'http://stac.geoanalytics.ca/'},
    {'rel': 'self',
     'type': 'application/json',
     'href': 'http://stac.geoanalytics.ca/collections/landsat-8-l1_EXAMPLE'},
    {'rel': 'child',
     'href': 'https://landsat-stac.s3.amazonaws.com/landsat-8-l1/paths/catalog.json'}],
   'title': 'Landsat 8 L1',
   'extent': {'spatial': {'bbox': [[-180, -90, 180, 90]]},
    'temporal': {'interval': [['2013-06-01T00:00:00Z', None]]}},
   'license': 'PDDL-1.0',
   'keywords': ['landsat'],
   'providers': [{'url': 'https://landsat.usgs.gov/',
     'name': 'USGS',
     'roles': ['producer', 'licensor']},
    {'url': 'https://github.com/landsat
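
The full response is verbose, so it can help to reduce it to the fields needed for choosing a collection. A minimal sketch, assuming the response has the shape shown above; `summarize_collections` is a hypothetical helper and `sample_response` is a trimmed-down stand-in for the real response.

```python
def summarize_collections(response_json):
    """Return (id, title) pairs from a /collections response body."""
    return [
        (collection["id"], collection.get("title", ""))
        for collection in response_json.get("collections", [])
    ]

# Trimmed-down stand-in for the response shown above.
sample_response = {
    "collections": [
        {"id": "landsat-8-l1_EXAMPLE", "title": "Landsat 8 L1", "license": "PDDL-1.0"},
    ]
}

summary = summarize_collections(sample_response)
print(summary)
```

In the notebook, the same helper could be applied to `get_stac_endpoint('/collections').json()`.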

Now that we know which collections are currently available on the GEOAnalytics STAC server, let's use PySTAC Client to query a single collection.

### 4.2 How to query a STAC server
Here is an example of how to query the STAC server and load the result into an xarray.DataArray using StackSTAC.

In this example you'll learn how to query the GEOAnalytics STAC server by collection, date range, AOI, and specific bands.

1. Start by accessing the GEOAnalytics STAC server using PySTAC Client.
2. Create a polygon and convert it into a geopandas GeoDataFrame, for use as our AOI.
3. Set the remaining parameters necessary for your query.
4. Query the STAC server using PySTAC Client.
5. Ingest the output of your query into an xarray.DataArray using StackSTAC.

In [19]:
# Use PySTAC Client to access the GEOAnalytics STAC server, sending the API token with each request
catalog = Client.open(STAC_BASE_URL, headers=requests_headers)

In [20]:
# Create a polygon defining our Area of Interest (AOI). In this case we use a rough
# outline of Vancouver, BC, created using: https://www.keene.edu/campus/maps/tool/
polygon = {
    "type": "Polygon",
    "coordinates": [
        [
            [-123.1460953, 49.2792286],
            [-123.242569, 49.2895303],
            [-123.2666016, 49.2669084],
            [-123.2120132, 49.2272391],
            [-123.1419754, 49.2030184],
            [-123.1313324, 49.2675805],
            [-123.1460953, 49.2792286],
        ]
    ],
}

In [21]:
from shapely.geometry import Polygon

# Build a shapely polygon from the (lon, lat) pairs of the outer ring
polygon_geom = Polygon(polygon['coordinates'][0])
crs = 'EPSG:4326'
# Wrap the geometry in a one-row GeoDataFrame (this rebinds `polygon`)
polygon = gpd.GeoDataFrame(index=[0], crs=crs, geometry=[polygon_geom])
polygon

In [None]:
FOOTPRINT = polygon.to_crs('epsg:4326').geometry[0].envelope
FOOTPRINT

In [None]:
FOOTPRINT.bounds
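
For readers who want to see what the envelope's bounds contain, here is a plain-Python sketch that derives the same kind of tuple directly from the AOI ring defined earlier. It does not call shapely; it only reproduces the `(minx, miny, maxx, maxy)` ordering by hand.

```python
# Outer ring of the Vancouver AOI defined earlier, as (lon, lat) pairs.
ring = [
    (-123.1460953, 49.2792286),
    (-123.242569, 49.2895303),
    (-123.2666016, 49.2669084),
    (-123.2120132, 49.2272391),
    (-123.1419754, 49.2030184),
    (-123.1313324, 49.2675805),
    (-123.1460953, 49.2792286),
]

lons = [lon for lon, _ in ring]
lats = [lat for _, lat in ring]

# Same ordering as shapely's `bounds`: (minx, miny, maxx, maxy)
bounds = (min(lons), min(lats), max(lons), max(lats))
print(bounds)
```

This ordering is also what `bounds_latlon` expects when the stack is built later on.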

In [None]:
# CONFIG 
# -------------
TGT_BANDS = ['B04', 'B08']  # 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B09', 'B11', 'B12', 'B8A'
YEAR = ['2016']
BEGIN_MONTH = '01'
END_MONTH = '12'
MAX_CLOUD = 5
READ_IN_CHUNK = 4096
RESOLUTION = 10
# -------------
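
The year and month settings can be combined into the interval string that a STAC search's `datetime` parameter accepts. This is a minimal sketch, assuming the query should cover whole months within a single year; `month_range` and `explicit_range` are hypothetical names introduced for illustration.

```python
import calendar

def month_range(year, begin_month, end_month):
    """Build a 'start/end' date interval covering whole months of one year."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(int(year), int(end_month))[1]
    return f"{year}-{begin_month}-01/{year}-{end_month}-{last_day:02d}"

# Values mirror the config cell above.
YEAR = ['2016']
BEGIN_MONTH = '01'
END_MONTH = '12'

explicit_range = month_range(YEAR[0], BEGIN_MONTH, END_MONTH)
print(explicit_range)
```

Looking up the last day with `calendar.monthrange` avoids hard-coding month lengths, including February in leap years.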

In [None]:
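# `date_range` was not defined earlier; assumed here to span BEGIN_MONTH to END_MONTH
# of YEAR. pystac-client expands month-precision dates to cover the whole month.
date_range = f"{YEAR[0]}-{BEGIN_MONTH}/{YEAR[0]}-{END_MONTH}"

items = catalog.search(
    collections=['sentinel-2-l2a'],
    intersects=FOOTPRINT,
    query={"eo:cloud_cover": {"lt": MAX_CLOUD}, "s2:mgrs_tile": {"eq": '10UFA'}},
    datetime=date_range,
).get_all_items()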
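
The `query` argument is evaluated by the server, but its meaning can be mirrored locally. The sketch below is not how the server implements filtering; `matches` is a hypothetical helper and the item properties are made up for illustration.

```python
import operator

# Comparison operators, keyed by the names used in the `query` argument.
OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
}

def matches(properties, query):
    """Check whether an item's properties satisfy every condition in `query`."""
    for field, conditions in query.items():
        if field not in properties:
            return False
        for op_name, expected in conditions.items():
            if not OPERATORS[op_name](properties[field], expected):
                return False
    return True

query = {"eo:cloud_cover": {"lt": 5}, "s2:mgrs_tile": {"eq": "10UFA"}}

# Hypothetical item properties, for illustration only.
clear_item = {"eo:cloud_cover": 1.2, "s2:mgrs_tile": "10UFA"}
cloudy_item = {"eo:cloud_cover": 42.0, "s2:mgrs_tile": "10UFA"}

print(matches(clear_item, query), matches(cloudy_item, query))
```

All conditions must hold at once, so an item on the right tile is still excluded when its cloud cover is too high.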

In [None]:
data = (
    stackstac.stack(
        items,
        assets=TGT_BANDS,
        resolution=RESOLUTION,  # set all bands to this resolution
        bounds_latlon=FOOTPRINT.bounds,  # clip to AOI bounds
        epsg=32610,
    )
    .where(lambda x: x > 0, other=np.nan)  # convert nodata zeros to np.nan
    .assign_coords(
        band=lambda x: x.common_name.rename("band"),  # use common names
        time=lambda x: x.time.dt.round("D"),  # round timestamps to the nearest day
    )
)

## 5. Further Documentation
Here are resources for a deeper look into the SpatioTemporal Asset Catalog and its API:

- **Link to GEOAnalytics Canada STAC: https://stac.geoanalytics.ca/**
- **The STAC Specification: https://github.com/radiantearth/stac-spec**
- **The STAC API Specification: https://github.com/radiantearth/stac-api-spec**
- **GEOAnalytics STAC FastAPI using Swagger: https://stac.geoanalytics.ca/docs**
    - Documentation for the STAC FastAPI: https://github.com/stac-utils/stac-fastapi