# Goose Catalog Python Client Tutorial

In [1]:
from dgcatalog import Stac
from pprint import pprint

All interaction with the catalog is done using a `Stac` object.

For production use you do not need to specify the `url` parameter, because the default catalog is used.  The `url` parameter can, however, point the client at test and development catalogs.

There are two ways to specify GBDX credentials when constructing a `Stac` object.  If you already have a GBDX token you can provide it to the `Stac` constructor using the `token` parameter.  Or you can use the `username` and `password` parameters to specify GBDX credentials.  In this case the constructor calls GBDX to generate a token.  If the password is omitted then the constructor will prompt you for it.

If `verbose` is True then `Stac` methods will print brief messages and web requests and responses to stdout.

The `Stac` object does not handle token expiration.  If you use a `Stac` object long enough that its token expires then you must create a new `Stac` object.
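
Because expiration is not handled for you, it can help to check how much time a token has left before starting a long job. The sketch below assumes the token is a JSON Web Token (the token printed later in this tutorial has that form) and reads its standard `exp` claim using only the standard library. `token_expiry` and `is_expired` are hypothetical helper names, not part of `dgcatalog`, and the signature is not verified.

```python
import base64
import json
from datetime import datetime, timezone


def token_expiry(token):
    """Return the expiry time of a JWT as an aware UTC datetime.

    Hypothetical helper, not part of dgcatalog. It only decodes the
    payload segment; it does not verify the signature.
    """
    payload_b64 = token.split('.')[1]
    # JWT segments are base64url-encoded with the padding stripped.
    payload_b64 += '=' * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return datetime.fromtimestamp(payload['exp'], tz=timezone.utc)


def is_expired(token, now=None):
    """True if the token's `exp` time is at or before `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return token_expiry(token) <= now
```

With a `Stac` object this might be called as `is_expired(stac._token)`. Note that `_token` is a private attribute, so relying on it is an assumption that may not hold across versions. When the check returns True, construct a new `Stac` object.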

Use one of the following service URLs, depending on the environment:

In [2]:
# service_url = 'https://api-test-2.discover.digitalglobe.com/v2/stac'
service_url = 'https://api-dev-2.discover.digitalglobe.com/v2/stac'

In [11]:
stac = Stac(url=service_url, username='super_tester@mailinator.com', verbose=True)

Password:  ············


Requesting token from https://geobigdata.io/auth/v1/oauth/token
Token successfully received.


In [12]:
stac._token

'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik1Ea3hPREE1UTBFeFJUTXpOek01UlVSRE5qWTRRelpHT1ROR1FUWTBNMFJHTnpjMFEwTTFSZyJ9.eyJodHRwczovL2dlb2JpZ2RhdGEuaW8vYWNjb3VudF9sZXZlbCI6ImN1c3RvbSIsImh0dHBzOi8vZ2VvYmlnZGF0YS5pby9pZCI6IjFjNzU5NjNmLWYxMDMtNDRhNi04MWU3LTc1NmJmMTljY2NhYiIsImh0dHBzOi8vZ2VvYmlnZGF0YS5pby9hY2NvdW50X2lkIjoiN2IyMTZiZDktNjUyMy00Y2E5LWFhM2ItMWQ4YTU5OTRmMDU0IiwiaHR0cHM6Ly9nZW9iaWdkYXRhLmlvL3JvbGVzIjpbInN1cGVyX2FkbWluIl0sImh0dHBzOi8vZ2VvYmlnZGF0YS5pby9lbWFpbCI6InN1cGVyX3Rlc3RlckBtYWlsaW5hdG9yLmNvbSIsImlzcyI6Imh0dHBzOi8vZGlnaXRhbGdsb2JlLXByb2R1Y3Rpb24uYXV0aDAuY29tLyIsInN1YiI6ImF1dGgwfDVjMTAxNGRkNDg5NzQ4NWYzOTM2ZmRlYiIsImF1ZCI6WyJnZW9iaWdkYXRhLmlvIiwiaHR0cHM6Ly9kaWdpdGFsZ2xvYmUtcHJvZHVjdGlvbi5hdXRoMC5jb20vdXNlcmluZm8iXSwiaWF0IjoxNTQ1NDEzNjMxLCJleHAiOjE1NDYwMTg0MzEsImF6cCI6ImRieFU1Y1pka08wU0hUbXNoRkNXbkk4OTR2eFExTmJ6Iiwic2NvcGUiOiJvcGVuaWQgZW1haWwgb2ZmbGluZV9hY2Nlc3MiLCJndHkiOiJwYXNzd29yZCJ9.Fx502T3zskmlsAE_Kv613z5hF1ZBMVHpStVei0JviLLmBcg3t-I0DuNfJz3nqqsO4bMBhjOaGt1AJYh7UN0uH0-DO

## Working with catalogs

Every catalog has an associated JSON schema used to validate STAC items when they are added to the catalog.
Associating a JSON schema with a catalog in this way is a DigitalGlobe extension to the STAC specification.

When STAC items are inserted into a catalog they are also validated against a basic STAC item JSON schema,
which verifies they are valid GeoJSON and have the minimum required STAC properties (like `datetime`).  So regardless
of what JSON schema is associated with a catalog this additional validation is always performed.
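
As a rough illustration of what that baseline check covers, the sketch below tests a dictionary for the GeoJSON Feature shape and a `datetime` property. It is a local stand-in written with the standard library, not the service's actual schema, and the exact set of required fields is an assumption. `basic_item_problems` is a hypothetical name.

```python
def basic_item_problems(item):
    """Return a list of problems found; an empty list means the item passed.

    Hypothetical local pre-check. The catalog service applies its own
    JSON schema, which may be stricter than this.
    """
    problems = []
    if item.get('type') != 'Feature':
        problems.append("'type' must be 'Feature'")
    if 'geometry' not in item:
        problems.append("missing 'geometry'")
    if not item.get('id'):
        problems.append("missing 'id'")
    properties = item.get('properties')
    if not isinstance(properties, dict):
        problems.append("'properties' must be an object")
    elif 'datetime' not in properties:
        problems.append("missing 'properties.datetime'")
    return problems
```

Running a check like this before inserting can surface obvious mistakes without a round trip to the service, but a clean result here does not guarantee the server will accept the item.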

For this tutorial we simply use the GeoJSON Feature schema.  Since every STAC item is a GeoJSON feature this is suitable for demo purposes.  Later there will be STAC JSON schemas that are more suitable for validating STAC items.

In [None]:
import json
import requests
schema = json.loads(requests.get('http://geojson.org/schema/Feature.json').text)

In [None]:
catalog = {
    'stac_version': '0.6.0',
    'id': 'wv',
    'title': 'DigitalGlobe WV',
    'description': 'DigitalGlobe WV images',
    'links': [
        {
            'rel': 'self',
            'href': 'https://api.discover.digitalglobe.com/v2/stac/catalog/wv'
        }
    ],
    'stac_item_schema': schema
}
stac.insert_catalog(catalog)

In [None]:
catalog = stac.get_catalog('wv')

In [None]:
pprint(catalog, depth=1)

Catalogs can be updated.  A catalog's ID cannot be changed but its other properties can, including its schema.
Note that if a catalog's schema is modified existing items in the catalog are not revalidated against the new schema.

In [None]:
catalog['description'] = 'DigitalGlobe WorldView 4 images'
stac.update_catalog(catalog)

In [None]:
stac.update_catalog({'id': 'asdf'})

## Working with STAC items

For this tutorial we will copy a few catalog records from the DUC database to the Goose database.  We will
use the `duc_get_image` function in the `dgcatalog.tools` module.  It reads an image's catalog
metadata from the DUC catalog service and returns it as a STAC item.

In [6]:
from dgcatalog.tools import duc_get_image, duc_query

In [7]:
item = duc_get_image(image_id='10400100108FCE00')

In [8]:
pprint(item, depth=1)

{'assets': {...},
 'geometry': {...},
 'id': '10400100108FCE00',
 'links': [...],
 'properties': {...},
 'type': 'Feature'}


Inserting a new item into a catalog:

In [9]:
stac.insert_item(item, 'wv')

POST: https://api-dev-2.discover.digitalglobe.com/v2/stac/catalog/wv/item
HTTP Status: 400
Request ID: 959c0bf9-0538-11e9-9821-01fc1590fd29


StacException: STAC item already exists in catalog.  Key (item_id)=(10400100108FCE00) already exists.

In [10]:
item = stac.get_item('10400100108FCE00')

GET: https://api-dev-2.discover.digitalglobe.com/v2/stac/search
HTTP Status: 502


StacException: Error in catalog request.

In [None]:
image_ids = ["10200100782DBA00", "103001008817BC00", "102001007FB83A00", "102001007D14CD00", "102001007C528D00"]
items = duc_get_image(image_ids=image_ids)
stac.insert_items(items, 'wv')

Let's create some test data to search on.  Select a month's worth of WV04 images from DUC and insert them into the Goose "wv" catalog:

In [None]:
items = duc_query("collect_time_start >= '2017-01-01' and collect_time_start <= '2017-02-01' and vehicle_name = 'WV04'")

In [None]:
len(items)

In [None]:
stac.insert_items(items, 'wv')

## Searching catalogs

Some notes on searching:
    
* A search may be performed against the entire database or against a particular catalog.
Specify the `catalog_id` parameter in the call to `search` to search against a particular catalog, otherwise the search is against the entire database.
* The maximum number of items returned by any search is 1000.  For larger result sets, use multiple calls to `search` with paging and ordering to retrieve the full set of results.
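
This tutorial does not show the paging parameters of `search`, so the sketch below keeps them abstract. `fetch_page` is a hypothetical callable that takes an offset and a limit and returns one list of items in a stable order; wrapping the real `search` call to fit that shape is left as an assumption.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect every result by requesting consecutive pages.

    fetch_page(offset, limit) is a hypothetical callable that returns a
    list of at most `limit` items, in a stable order, starting at `offset`.
    """
    results = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        results.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return results
        offset += page_size


# Stand-in data source so the loop can be exercised without a catalog.
fake_items = list(range(2500))
all_items = fetch_all(lambda offset, limit: fake_items[offset:offset + limit])
print(len(all_items))  # 2500
```

A stable ordering matters here: without it, items could be skipped or repeated between pages.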

### Searching by date and time
Note that when searching by datetime, the start is inclusive and the end is exclusive.

In [None]:
from datetime import datetime
items = stac.search(catalog_id='wv', start_datetime=datetime(2017, 1, 1), end_datetime=datetime(2017, 1, 2))
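
The inclusive start and exclusive end mean the search covers a half-open interval. The local sketch below mirrors that rule so the boundary behaviour is easy to see. `in_search_window` is a hypothetical helper and does not call the catalog.

```python
from datetime import datetime


def in_search_window(moment, start, end):
    """True if start <= moment < end (start inclusive, end exclusive)."""
    return start <= moment < end


start = datetime(2017, 1, 1)
end = datetime(2017, 1, 2)
print(in_search_window(datetime(2017, 1, 1, 0, 0, 0), start, end))     # True
print(in_search_window(datetime(2017, 1, 1, 23, 59, 59), start, end))  # True
print(in_search_window(datetime(2017, 1, 2, 0, 0, 0), start, end))     # False
```

Consecutive windows, such as January 1 to 2 followed by January 2 to 3, therefore never return the same item twice.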

In [None]:
stac._last_response.text

### Searching with a property filter

Use the `query` parameter to filter results.  Filtering is performed on the server side against properties in the STAC item's "properties" dictionary.

The general form of a query filter is "property operation value".

* "property" is the name of a STAC item property
* "operation" is one of the following:
    * Comparison operators:  =, !=, <>, <, >, <=, >=
    * "is" and "is not" for comparing with booleans and null.
    * "like" and "not like" for comparing strings with SQL patterns.
    * "in" for comparing with a list of integers or strings.
* "value" is a number, string, boolean, or null.
    * Exponential notation for floating-point values is not supported.
    * Strings are delimited by single quotes.  There is no facility for escaping single quotes inside the string or for any other escape sequences.
    * A boolean value is "true" or "false", specified without quotes, and case-insensitive.
    * A null value is "null", specified without quotes, and case-insensitive.
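
The value rules above can be sketched as a small formatter that turns Python values into filter literals. `format_value` is a hypothetical helper, not part of `dgcatalog`; it refuses strings containing single quotes (since they cannot be escaped) and writes floats in fixed-point form (since exponential notation is not supported).

```python
import math
from decimal import Decimal


def format_value(value):
    """Render a Python value as a query-filter literal (hypothetical helper)."""
    if value is None:
        return 'null'
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return 'true' if value else 'false'
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError('NaN and infinity cannot be written as a literal')
        # Exponential notation is not supported, so force fixed-point digits.
        return format(Decimal(repr(value)), 'f')
    if isinstance(value, str):
        if "'" in value:
            raise ValueError('single quotes cannot be escaped in a filter string')
        return "'" + value + "'"
    if isinstance(value, (list, tuple)):
        # Used with the "in" operation, e.g. eo:epsg in (32613, 26913).
        return '(' + ', '.join(format_value(v) for v in value) + ')'
    raise TypeError('unsupported value type: ' + type(value).__name__)


print('eo:gsd < ' + format_value(1.5))                      # eo:gsd < 1.5
print('eo:epsg in ' + format_value([32613, 26913, 26914]))  # eo:epsg in (32613, 26913, 26914)
print('dg:rda_available is ' + format_value(True))          # dg:rda_available is true
```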

Filters can be combined using the operators "and" and "or".  The "and" operator takes precedence over the "or" operator.  Parentheses can be used to control grouping when combining filters with "and" and "or".
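
When filters are assembled in code, making the grouping explicit avoids surprises from operator precedence. The helpers below (`group`, `all_of`, `any_of`) are hypothetical names used only for illustration; they build plain strings and do not contact the catalog.

```python
def group(expr):
    """Wrap a filter expression in parentheses."""
    return '(' + expr + ')'


def all_of(*filters):
    """Join filters with "and"."""
    return ' and '.join(filters)


def any_of(*filters):
    """Join filters with "or"."""
    return ' or '.join(filters)


# Two "and" groups joined by "or", each group parenthesized explicitly.
query = any_of(
    group(all_of("vendor = 'DigitalGlobe'", "eo:platform = 'WORLDVIEW02'")),
    group(all_of("vendor = 'KOMPSAT'", "eo:platform = 'KOMPSAT3A'")),
)
print(query)

# Without group(), "and" would bind first and this would be read as
# eo:cloud_cover < 10 or (eo:gsd < 1.5 and vendor = 'DigitalGlobe').
low_cloud_or_sharp = all_of(
    group(any_of('eo:cloud_cover < 10', 'eo:gsd < 1.5')),
    "vendor = 'DigitalGlobe'",
)
print(low_cloud_or_sharp)
```

The resulting strings can be passed as the `query` argument to `search`.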

String comparisons are case-sensitive.

These are examples of valid query filters:

* vendor = 'DigitalGlobe'
* eo:cloud_cover < 20
* dg:rda_available is true
* dg:rda_available is false
* eo:gsd < 1.5
* eo:epsg is null
* eo:epsg in (32613, 26913, 26914)
* dg:sun_elevation_min < 20 and dg:sun_azimuth_max < 30
* (vendor = 'DigitalGlobe' and eo:platform = 'WORLDVIEW02') or (vendor = 'KOMPSAT' and eo:platform = 'KOMPSAT3A')

It is not an error to specify a property that an item doesn't have, but the item will
not be returned by the query no matter what other filters are provided.

Filtering on properties with nested values is not currently supported.

A property filter is not intended to search on an item's `datetime` property.
Use the `start_datetime` and `end_datetime` search parameters for that.
    

In [None]:
items = stac.search(start_datetime=datetime(2015, 1, 1), end_datetime=datetime(2016, 1, 1), query='eo:cloud_cover < 10')

In [None]:
len(items)