# Goose Catalog Python Client Tutorial

In [1]:
from dgcatalog import Stac
from pprint import pprint

All interaction with the catalog is done using a `Stac` object.

For production use you do not need to specify the `url` parameter, because the default catalog is used.  Use the `url` parameter to point to test and development catalogs.

There are two ways to specify GBDX credentials when constructing a `Stac` object.  If you already have a GBDX token you can provide it to the `Stac` constructor using the `token` parameter.  Or you can use the `username` and `password` parameters to specify GBDX credentials.  In this case the constructor calls GBDX to generate a token.  If the password is omitted then the constructor will prompt you for it.

If `verbose` is True then `Stac` methods will print brief messages and web requests and responses to stdout.

The `Stac` object does not handle token expiration.  If you use a `Stac` object long enough that its token expires then you must create a new `Stac` object.
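
Because the `Stac` object does not refresh its token, one option is to rebuild the client after a fixed age.  The following is only a sketch: `RefreshingClient`, `factory`, and `max_age_seconds` are hypothetical names that are not part of `dgcatalog`, and the token lifetime is not stated in this tutorial, so pick a value that suits your environment.

```python
import time


class RefreshingClient:
    """Rebuild a client object once it is older than max_age_seconds.

    Hypothetical helper, not part of dgcatalog.  `factory` is any
    zero-argument callable that returns a new client, for example
    lambda: Stac(url=service_url, username=user, password=password).
    """

    def __init__(self, factory, max_age_seconds, clock=time.monotonic):
        self._factory = factory
        self._max_age = max_age_seconds
        self._clock = clock
        self._client = None
        self._created_at = None

    def get(self):
        """Return the current client, creating a new one if it is too old."""
        now = self._clock()
        if self._client is None or now - self._created_at >= self._max_age:
            self._client = self._factory()
            self._created_at = now
        return self._client
```

Call `get()` each time you need the client instead of holding on to a single `Stac` object for the life of a long-running process.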

Use one of the following service URLs depending on the environment:

In [2]:
# service_url = 'https://api-test-2.discover.digitalglobe.com/v2/stac'
# service_url = 'https://api-dev-2.discover.digitalglobe.com/v2/stac'
service_url = 'https://api-2.discover.digitalglobe.com/v2/stac'

In [3]:
stac = Stac(url=service_url, username='super_tester@mailinator.com', verbose=True)

Password: ········
Requesting token from https://geobigdata.io/auth/v1/oauth/token
Token successfully received.


## WV04 search examples

First search over an area in Colorado.

In [4]:
items = stac.search(bbox=[-106.5, 40, -106, 40.5])

POST: https://api-2.discover.digitalglobe.com/v2/stac/search
HTTP Status: 200
Request ID: bf777c04-5480-49e2-89e8-1aebb5c96f9c
Elapsed seconds: 2.597763


Print some properties of the items returned.

In [5]:
[
    (item['properties']['eo:platform'], item['properties']['eo:cloud_cover'])
    for item in items
]

[('WORLDVIEW04', 0.17204629),
 ('WORLDVIEW04', 2.6806977),
 ('WORLDVIEW04', 5.1619554),
 ('WORLDVIEW04', 5.343196)]
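
The same properties can be pulled out and ordered with a small helper.  This is a minimal sketch that works on plain item dictionaries shaped like the ones above; `summarize_items` is a hypothetical name and not part of `dgcatalog`.

```python
def summarize_items(items):
    """Return (platform, cloud_cover) pairs sorted by ascending cloud cover."""
    pairs = [
        (item['properties'].get('eo:platform'),
         item['properties'].get('eo:cloud_cover'))
        for item in items
    ]
    # Items without a cloud cover value sort last.
    return sorted(pairs, key=lambda pair: (pair[1] is None, pair[1] or 0.0))
```

Using `.get` means an item that lacks one of the properties yields `None` instead of raising `KeyError`.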

Search for WV04 images in the same area over Colorado.  Only images with cloud cover below 5% are returned:

In [6]:
items = stac.search(bbox=[-106.5, 40, -106, 40.5], query="eo:platform='WORLDVIEW04' and eo:cloud_cover < 5")

POST: https://api-2.discover.digitalglobe.com/v2/stac/search
HTTP Status: 200
Request ID: 2a8cc60d-3f38-4f33-8239-2d7174a27d51
Elapsed seconds: 1.818656


In [7]:
len(items)

2

In [8]:
pprint(items[0])

{'assets': {},
 'geometry': {'coordinates': [[[-106.399763246126, 39.5871126751837],
                               [-106.400030236603, 39.6182674043723],
                               [-106.40082535579, 39.6511323049505],
                               [-106.39964782195, 39.6828261702596],
                               [-106.398943026485, 39.715034658781],
                               [-106.398746046759, 39.7203962983793],
                               [-106.398730017936, 39.7230980002722],
                               [-106.398512394396, 39.7267561395016],
                               [-106.397765645994, 39.7470820678722],
                               [-106.396181141685, 39.7789859668921],
                               [-106.396283557033, 39.8124711820583],
                               [-106.397176066328, 39.8469995451619],
                               [-106.398117844242, 39.8819442951468],
                               [-106.398515595927, 39.9167029291437],
        

In [9]:
import arcgis.features
from arcgis.gis import GIS

collection = {
    'type': 'FeatureCollection',
    'features': items
}
features = arcgis.features.FeatureSet.from_geojson(collection)

mygis = GIS()
# Use a name other than `map` to avoid shadowing the Python builtin.
map_view = mygis.map()
map_view.draw(features)
map_view

MapView(layout=Layout(height='400px', width='100%'))
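
If `arcgis` is not available, the same feature collection can be serialized with the standard library and opened in any tool that reads GeoJSON.  This is a sketch only; the function names are hypothetical.

```python
import json


def to_feature_collection(items):
    """Wrap a list of STAC items (GeoJSON features) in a FeatureCollection."""
    return {'type': 'FeatureCollection', 'features': list(items)}


def to_geojson_text(items):
    """Serialize the items as GeoJSON FeatureCollection text."""
    return json.dumps(to_feature_collection(items))
```

Write the returned text to a file with a `.geojson` extension to load it elsewhere.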

## Working with catalogs

Every catalog has an associated JSON schema used to validate STAC items when they are added to the catalog.
Associating a JSON schema with a catalog in this way is a DigitalGlobe extension to the STAC specification.

When STAC items are inserted into a catalog they are also validated against a basic STAC item JSON schema,
which verifies they are valid GeoJSON and have the minimum required STAC properties (like `datetime`).  So regardless
of what JSON schema is associated with a catalog this additional validation is always performed.

For this tutorial we simply use the GeoJSON Feature schema.  Since every STAC item is a GeoJSON feature this is suitable for demo purposes.  Later there will be STAC JSON schemas that are more suitable for validating STAC items.
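
Before inserting items it can be useful to run a rough local pre-check.  The sketch below only approximates the kind of checks described above (GeoJSON feature shape plus a `datetime` property); it is not the catalog's actual validation, which is performed on the server with JSON schemas.  `basic_item_problems` is a hypothetical helper.

```python
def basic_item_problems(item):
    """Return a list of problems found by a rough local pre-check.

    Illustration only: the real validation is done by the catalog
    service, and it may check more than this function does.
    """
    problems = []
    if item.get('type') != 'Feature':
        problems.append("'type' must be 'Feature'")
    if 'geometry' not in item:
        problems.append("missing 'geometry'")
    properties = item.get('properties')
    if not isinstance(properties, dict):
        problems.append("missing 'properties' object")
    elif 'datetime' not in properties:
        problems.append("missing 'properties.datetime'")
    return problems
```

An empty list means the item passed this local check; the service may still reject it.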

In [4]:
import json
import requests
from pprint import pprint

Here is how to fetch an existing catalog, like the "maxar" catalog currently containing all WV04 images:

In [11]:
maxar = stac.get_catalog('maxar')
pprint(maxar, depth=1)

Get catalog catalog_id=maxar
GET: https://api-2.discover.digitalglobe.com/v2/stac/catalog/maxar
HTTP Status: 200
Request ID: 3d0dad2a-0050-425f-a782-64c0c83f02de
Elapsed seconds: 1.415249
{'description': 'DigitalGlobe STAC catalog',
 'id': 'maxar',
 'links': [...],
 'stac_item_schema': {...},
 'stac_version': '0.6.0',
 'title': 'DigitalGlobe'}


In [5]:
schema = json.loads(requests.get('http://geojson.org/schema/Feature.json').text)

catalog = {
    'stac_version': '0.6.0',
    'id': 'chris',
    'title': 'Chris test',
    'description': 'DigitalGlobe WV images',
    'links': [
        {
            'rel': 'self',
            'href': 'https://api.discover.digitalglobe.com/v2/stac/catalog/chris'
        }
    ],
    'stac_item_schema': schema
}
stac.insert_catalog(catalog)

POST: https://api-dev-2.discover.digitalglobe.com/v2/stac/catalog
HTTP Status: 500
Request ID: 50af0ce3-451d-4e46-9b55-071e44d85f1f
Elapsed seconds: 3.501035


StacException: A database exception occurred (Request ID: 50af0ce3-451d-4e46-9b55-071e44d85f1f)

In [None]:
catalog = stac.get_catalog('wv')

In [None]:
pprint(catalog, depth=1)

Catalogs can be updated.  A catalog's ID cannot be changed but its other properties can, including its schema.
Note that if a catalog's schema is modified, existing items in the catalog are not revalidated against the new schema.

In [None]:
catalog['description'] = 'DigitalGlobe WorldView 4 images'
stac.update_catalog(catalog)

## Working with STAC items

For this tutorial we will copy a few catalog records from the DUC database to the Goose database.  We will
use the `duc_get_image` function in the `dgcatalog.tools` module.  It reads an image's catalog
metadata from the DUC catalog service and returns it as a STAC item.

In [None]:
from dgcatalog.tools import duc_get_image, duc_query

In [None]:
item = duc_get_image(image_id='10400100108FCE00')

In [None]:
pprint(item, depth=1)

Inserting a new item into a catalog:

Use `head_item` to perform an HTTP HEAD operation and determine whether a STAC item exists.

In [None]:
stac.head_item()

In [None]:
stac.insert_item(item, 'wv')

In [None]:
item = stac.get_item('10400100108FCE00')

In [None]:
image_ids = ["10200100782DBA00", "103001008817BC00", "102001007FB83A00", "102001007D14CD00", "102001007C528D00"]
items = duc_get_image(image_ids=image_ids)
stac.insert_items(items, 'wv')

Let's create some test data to search on.  Select a month's worth of WV04 images from DUC and insert them into the Goose "wv" catalog:

In [None]:
items = duc_query("collect_time_start >= '2017-01-01' and collect_time_start <= '2017-02-01' and vehicle_name = 'WV04'")

In [None]:
len(items)

In [None]:
stac.insert_items(items, 'wv')

## Working with item attachments

The catalog supports associating an arbitrary JSON object, called its "attachments," with each STAC item.  Properties in the attachments associate metadata with a STAC item that is not included in the item's feature itself.

Some attachment properties may be recognized by the catalog itself.  For now the only such property is "data-access-profile".

When inserting multiple STAC items using a feature collection you can specify an attachments property that is copied to each newly inserted item.

In [None]:
image_ids = ['10500100144DD900', '102001008164D600', '1020010080207100']
items = duc_get_image(image_ids=image_ids)

attachments = {
    'data-access-profile': {
        'policies': [
            {
                'startDate': '2019-02-01T00:00:00Z',
                'endDate': '2019-03-01T00:00:00Z',
                'allow': ['customer.001'],
                'deny': []
            },
            {
                'startDate': '2019-03-01T00:00:00Z',
                'endDate': '9999-12-31T23:59:59Z',
                'allow': ['dataaccess.public'],
                'deny': []
            }
        ]
    }
}

stac.insert_items(items, 'wv', attachments)
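
To inspect a policy list like the one above locally, the date ranges can be compared against a point in time.  This sketch makes an assumption that the tutorial does not confirm: that `startDate` is inclusive and `endDate` is exclusive.  `parse_utc` and `policies_in_effect` are hypothetical helpers, not part of `dgcatalog`.

```python
from datetime import datetime, timezone


def parse_utc(text):
    """Parse timestamps like '2019-02-01T00:00:00Z' into aware datetimes."""
    return datetime.fromisoformat(text.replace('Z', '+00:00'))


def policies_in_effect(attachments, when):
    """Return the policies whose date range contains `when`.

    Assumption for illustration only: startDate is inclusive and endDate
    is exclusive.  How the catalog interprets these fields is not stated
    in this tutorial.
    """
    policies = attachments.get('data-access-profile', {}).get('policies', [])
    return [
        policy for policy in policies
        if parse_utc(policy['startDate']) <= when < parse_utc(policy['endDate'])
    ]
```

Pass a timezone-aware value for `when`, for example `datetime(2019, 2, 15, tzinfo=timezone.utc)`, so the comparison with the parsed timestamps is valid.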

Get the attachments for one of the images just inserted.

In [None]:
att = stac.get_attachments('10500100144DD900', 'wv')

In [None]:
att

You can update an item's attachments.  Here we remove the first policy in the array and write the remaining policies back to the item:

In [None]:
att['data-access-profile']['policies'].pop(0)

In [None]:
stac.update_attachments('10500100144DD900', 'wv', att)

Read it back to make sure it was updated.

In [None]:
att = stac.get_attachments('10500100144DD900', 'wv')

In [None]:
att

Each STAC item has its own attachments.  When we inserted the three images above with
attachments, the attachments were copied to each STAC item.  Read the
attachments for another of the images to see that they are unchanged.

In [None]:
att = stac.get_attachments('102001008164D600', 'wv')

In [None]:
att

In [None]:
stac.delete_attachments('102001008164D600', 'wv')

## Searching catalogs

Some notes on searching:
    
* A search may be performed against the entire database or against a particular catalog.
Specify the `catalog_id` parameter in the call to `search` to search against a particular catalog, otherwise the search is against the entire database.
* The maximum number of items returned by any search is 1000.  For larger result sets use multiple calls to `search` with paging and ordering to retrieve the full set of results.

### Searching by date and time
Note that when searching by datetime the start is inclusive and the end is exclusive.

In [None]:
from datetime import datetime
items = stac.search(catalog_id='wv', start_datetime=datetime(2017, 1, 1), end_datetime=datetime(2017, 1, 2))

In [None]:
stac._last_response.text
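
Because the start is inclusive and the end is exclusive, a long date range can be split into adjacent windows without any item being returned twice.  The sketch below is one way to stay under the 1000-item limit; it is not the paging mechanism mentioned above.  `date_windows` and `search_in_windows` are hypothetical helpers, and `search` is any callable that accepts `start_datetime` and `end_datetime` keyword arguments, such as the `stac.search` method.

```python
from datetime import datetime, timedelta


def date_windows(start, end, step=timedelta(days=1)):
    """Yield (window_start, window_end) pairs that cover [start, end)."""
    if step <= timedelta(0):
        raise ValueError('step must be positive')
    current = start
    while current < end:
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


def search_in_windows(search, start, end, step=timedelta(days=1), **kwargs):
    """Call `search` once per window and concatenate the results.

    Extra keyword arguments (for example catalog_id) are passed through
    unchanged to every call.
    """
    items = []
    for window_start, window_end in date_windows(start, end, step):
        items.extend(
            search(start_datetime=window_start, end_datetime=window_end, **kwargs)
        )
    return items
```

A call might look like `search_in_windows(stac.search, datetime(2017, 1, 1), datetime(2017, 2, 1), catalog_id='wv')`.  A single window that matches more than 1000 items would still be truncated, so choose a step small enough for your data.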

### Searching with a property filter

Use the `query` parameter to filter results.  Filtering is performed on the server side against properties in the STAC item's "properties" dictionary.

The general form of a query filter is "property operation value".

* "property" is the name of a STAC item property
* "operation" is one of the following:
    * Comparison operators:  =, !=, <>, <, >, <=, >=
    * "is" and "is not" for comparing with booleans and null.
    * "like" and "not like" for comparing strings with SQL patterns.
    * "in" for comparing with a list of integers or strings.
* "value" is a number, string, boolean, or null.
    * Exponential notation for floating-point values is not supported.
    * Strings are delimited by single quotes.  There is no facility for escaping single quotes inside the string or for any other escape sequences.
    * A boolean value is "true" or "false", specified without quotes, and case-insensitive.
    * A null value is "null", specified without quotes, and case-insensitive.

Filters can be combined using the operators "and" and "or".  The "and" operator takes precedence over the "or" operator.  Parentheses can be used when combining filters with "and" and "or".

String comparisons are case-sensitive.

These are examples of valid query filters:

* vendor = 'DigitalGlobe'
* eo:cloud_cover < 20
* dg:rda_available is true
* dg:rda_available is false
* eo:gsd < 1.5
* eo:epsg is null
* eo:epsg in (32613, 26913, 26914)
* dg:sun_elevation_min < 20 and dg:sun_azimuth_max < 30
* (vendor = 'DigitalGlobe' and eo:platform = 'WORLDVIEW02') or (vendor = 'KOMPSAT' and eo:platform = 'KOMPSAT3A')
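
Building filter strings by hand is error-prone, so a small helper that applies the rules above can help.  This is a sketch under the stated rules only (no exponential notation, single-quoted strings with no escaping, unquoted `true`/`false`/`null`); `format_value`, `make_filter`, and `all_of` are hypothetical names that are not part of `dgcatalog`.

```python
import math


def format_value(value):
    """Render a Python value as a query-filter literal."""
    if value is None:
        return 'null'
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return 'true' if value else 'false'
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError('NaN and infinity cannot be used in a query')
        # Exponential notation is not supported, so force fixed-point output
        # (10 decimal places, then trim trailing zeros).
        text = format(value, '.10f').rstrip('0')
        return text + '0' if text.endswith('.') else text
    if isinstance(value, str):
        if "'" in value:
            raise ValueError('single quotes cannot be escaped in a query string')
        return "'" + value + "'"
    raise TypeError('unsupported value type: %r' % type(value))


def make_filter(prop, op, value):
    """Build one 'property operation value' filter."""
    if op == 'in':
        return '%s in (%s)' % (prop, ', '.join(format_value(v) for v in value))
    if value is None or isinstance(value, bool):
        # Booleans and null are compared with "is" / "is not".
        if op not in ('is', 'is not'):
            raise ValueError("use 'is' or 'is not' with booleans and null")
    return '%s %s %s' % (prop, op, format_value(value))


def all_of(*filters):
    """Join simple filters (as produced by make_filter) with 'and'."""
    return ' and '.join(filters)
```

The result of `all_of(...)` can be passed as the `query` argument to `search`.  Rejecting strings that contain a single quote is deliberate, since the query syntax has no escape mechanism.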

It is not an error to specify a property that an item doesn't have, but the item will
not be returned by the query no matter what other filters are provided.

Properties with nested values are not currently supported for filtering on.

A property filter is not intended to search on an item's `datetime` property.
Use the `start_datetime` and `end_datetime` search parameters for that.
    

In [None]:
# Filter on the eo:cloud_cover property; an unknown property name would match no items.
items = stac.search(start_datetime=datetime(2015, 1, 1), end_datetime=datetime(2016, 1, 1), query='eo:cloud_cover < 10')

In [None]:
len(items)