# The Open Data Cube and Metadata

The Open Data Cube (ODC) is purpose-built to access and analyse Earth observation data and metadata. In this notebook, we explore basic methods for extracting metadata from DEA products stored in the public S3 bucket, which have been indexed into our ODC instance.

In [1]:
# First we will load some Python modules
import datacube
import pandas

# And we set some configuration
%matplotlib inline
pandas.set_option('display.max_colwidth', 200)
pandas.set_option('display.max_rows', None)

## First steps
Let's explore which products have been indexed into our ODC instance and what associated information is available in their metadata.

In [12]:
dc = datacube.Datacube(app='metadata-demo') # Start a datacube instance.


In [8]:
products = dc.list_products() # Get all available product information
products

id,name,description,creation_time,product_type,lon,lat,platform,instrument,label,time,format,crs,resolution,tile_size,spatial_dimensions
10,ls5_fc_albers,"Landsat 5 Fractional Cover 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,fractional_cover,,,LANDSAT_5,TM,,,,EPSG:3577,"[-25, 25]","[100000.0, 100000.0]","(y, x)"
5,ls5_nbart_geomedian_annual,"Surface Reflectance Geometric Median 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,surface_reflectance_statistical_summary,,,LANDSAT_5,TM,,,GeoTIFF,EPSG:3577,"[-25, 25]","[100000.0, 100000.0]","(y, x)"
11,ls7_fc_albers,"Landsat 7 Fractional Cover 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,fractional_cover,,,LANDSAT_7,ETM,,,,EPSG:3577,"[-25, 25]","[100000.0, 100000.0]","(y, x)"
4,ls7_nbart_geomedian_annual,"Surface Reflectance Geometric Median 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,surface_reflectance_statistical_summary,,,LANDSAT_7,ETM,,,GeoTIFF,EPSG:3577,"[-25, 25]","[100000.0, 100000.0]","(y, x)"
2,ls8_fc_albers,"Landsat 8 Fractional Cover 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,fractional_cover,,,LANDSAT_8,OLI_TIRS,,,GeoTIFF,EPSG:3577,"[-25, 25]",,"(y, x)"
3,ls8_nbart_geomedian_annual,"Surface Reflectance Geometric Median 25 metre, 100km tile, Australian Albers Equal Area projection (EPSG:3577)",,surface_reflectance_statistical_summary,,,LANDSAT_8,OLI,,,GeoTIFF,EPSG:3577,"[-25, 25]","[100000.0, 100000.0]","(y, x)"
6,s2a_l1c_aws_pds,Sentinel-2A MSI L1C - AWS PDS,,level1,,,SENTINEL_2A,MSI,,,JPEG2000,,,,
8,s2a_nrt_granule,Sentinel-2A MSI NRT - NBAR NBART and Pixel Quality,,ard,,,SENTINEL_2A,MSI,,,GeoTiff,,,,
7,s2b_l1c_aws_pds,Sentinel-2B MSI L1C - AWS PDS,,level1,,,SENTINEL_2B,MSI,,,JPEG2000,,,,
9,s2b_nrt_granule,Sentinel-2B MSI NRT - NBAR NBART and Pixel Quality,,ard,,,SENTINEL_2B,MSI,,,GeoTiff,,,,


In [9]:
# Select the products whose platform name starts with 'SENT', i.e. Sentinel
products[products['platform'].str.match('SENT')]


id,name,description,creation_time,product_type,lon,lat,platform,instrument,label,time,format,crs,resolution,tile_size,spatial_dimensions
6,s2a_l1c_aws_pds,Sentinel-2A MSI L1C - AWS PDS,,level1,,,SENTINEL_2A,MSI,,,JPEG2000,,,,
8,s2a_nrt_granule,Sentinel-2A MSI NRT - NBAR NBART and Pixel Quality,,ard,,,SENTINEL_2A,MSI,,,GeoTiff,,,,
7,s2b_l1c_aws_pds,Sentinel-2B MSI L1C - AWS PDS,,level1,,,SENTINEL_2B,MSI,,,JPEG2000,,,,
9,s2b_nrt_granule,Sentinel-2B MSI NRT - NBAR NBART and Pixel Quality,,ard,,,SENTINEL_2B,MSI,,,GeoTiff,,,,
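
The filter above relies on `str.match`, which anchors the pattern at the start of each string, so `'SENT'` selects every platform whose name begins with those letters. The following is a minimal sketch of the same idea on a small hand-built table. The `products` DataFrame here is only a stand-in for the output of `dc.list_products()`, with a few rows copied from the table above, and the Landsat 8 filter is an illustrative assumption rather than part of the original notebook.

```python
import pandas

# Stand-in for dc.list_products(): a few rows copied from the table above.
products = pandas.DataFrame(
    {
        "name": [
            "ls8_fc_albers",
            "ls8_nbart_geomedian_annual",
            "s2a_nrt_granule",
            "s2b_nrt_granule",
        ],
        "platform": ["LANDSAT_8", "LANDSAT_8", "SENTINEL_2A", "SENTINEL_2B"],
    },
    index=pandas.Index([2, 3, 8, 9], name="id"),
)

# str.match anchors the pattern at the start of each string.
# na=False treats a missing platform as "no match" instead of propagating NaN.
sentinel = products[products["platform"].str.match("SENT", na=False)]

# str.contains searches anywhere in the string (illustrative filter).
landsat8 = products[products["platform"].str.contains("_8", na=False)]

sentinel_names = sentinel["name"].tolist()
landsat8_names = landsat8["name"].tolist()
print(sentinel_names)
print(landsat8_names)
```

Passing `na=False` is a defensive choice: if a product has no platform recorded, the boolean mask stays valid for indexing.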


In [10]:
# Measurements are variables that are available for a product
measurements = dc.list_measurements() # Get all measurement information
measurements

product,measurement,aliases,dtype,flags_definition,name,nodata,spectral_definition,units
ls5_fc_albers,BS,[bare],int8,,BS,-1,,percent
ls5_fc_albers,PV,[green_veg],int8,,PV,-1,,percent
ls5_fc_albers,NPV,[dead_veg],int8,,NPV,-1,,percent
ls5_fc_albers,UE,[err],int8,,UE,-1,,1
ls5_nbart_geomedian_annual,blue,,int16,,blue,-999,,1
ls5_nbart_geomedian_annual,green,,int16,,green,-999,,1
ls5_nbart_geomedian_annual,red,,int16,,red,-999,,1
ls5_nbart_geomedian_annual,nir,,int16,,nir,-999,,1
ls5_nbart_geomedian_annual,swir1,,int16,,swir1,-999,,1
ls5_nbart_geomedian_annual,swir2,,int16,,swir2,-999,,1
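
Because the measurements table is indexed by product and then by measurement, the rows for a single product can be pulled out with `.loc`. The sketch below rebuilds a few rows of the table above by hand as a stand-in for `dc.list_measurements()`. The reduced set of columns is an assumption made to keep the example short.

```python
import pandas

# Rows copied from the table above: (product, measurement, dtype, nodata, units).
rows = [
    ("ls5_fc_albers", "BS", "int8", -1, "percent"),
    ("ls5_fc_albers", "PV", "int8", -1, "percent"),
    ("ls5_fc_albers", "NPV", "int8", -1, "percent"),
    ("ls5_fc_albers", "UE", "int8", -1, "1"),
    ("ls5_nbart_geomedian_annual", "blue", "int16", -999, "1"),
    ("ls5_nbart_geomedian_annual", "red", "int16", -999, "1"),
]

# Stand-in for dc.list_measurements(), indexed by (product, measurement).
measurements = pandas.DataFrame(
    rows, columns=["product", "measurement", "dtype", "nodata", "units"]
).set_index(["product", "measurement"])

# All measurements of one product: select on the first index level.
fc = measurements.loc["ls5_fc_albers"]
fc_names = fc.index.tolist()

# A single value: select on both index levels and one column.
red_nodata = int(measurements.loc[("ls5_nbart_geomedian_annual", "red"), "nodata"])

print(fc_names)
print(red_nodata)
```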


## Querying Metadata for an Individual Dataset

The example below is a basic query by product, time range and spatial extent. More advanced queries can also filter on other properties, such as product lineage.

In [None]:
dc = datacube.Datacube(app='advance-query-example')

# Find Sentinel-2A NRT datasets for September 2018 within a small lon/lat window
scenes = dc.find_datasets(
    product='s2a_nrt_granule',
    time=('2018-09-01', '2018-10-01'),
    x=(146.30, 146.40), y=(-43.30, -43.20)
)
scenes
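
The query above combines a product name, a time range and longitude/latitude extents. When the same kind of search is repeated for several locations, it can help to build the search terms in one place. The sketch below is one way to do that. `bbox_query` is a hypothetical helper, not part of the datacube API, and the centre point and half-width are assumptions chosen to reproduce the extents used above.

```python
def bbox_query(product, start, end, lon, lat, half_width=0.05):
    """Build search terms for a box centred on (lon, lat).

    Hypothetical helper: half_width is in degrees, and extents are rounded
    to 6 decimal places to avoid floating-point noise.
    """
    return {
        "product": product,
        "time": (start, end),
        "x": (round(lon - half_width, 6), round(lon + half_width, 6)),
        "y": (round(lat - half_width, 6), round(lat + half_width, 6)),
    }


query = bbox_query(
    "s2a_nrt_granule", "2018-09-01", "2018-10-01", lon=146.35, lat=-43.25
)
print(query)

# With a configured datacube instance, the terms would be passed as keywords:
# scenes = dc.find_datasets(**query)
```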

In [14]:
for scene in scenes:
    uid = scene.id
    bounds = scene.bounds
    time = scene.center_time

    print("uid: {},\t date: {},\t bounds: {}".format(
        uid,
        time.date(),
        [bounds.left, bounds.bottom, bounds.right, bounds.top]
    ))


uid: 8ae129db-3457-41a4-8126-f4d5ad411aab,	 date: 2018-09-17,	 bounds: [399960, 5190220, 509760, 5300020]
uid: 9be02a41-93ea-43e2-a1f9-ff062be46a3e,	 date: 2018-09-30,	 bounds: [399960, 5190220, 509760, 5300020]
uid: 52b10515-a1d3-43e4-b07f-1a4984bc4e47,	 date: 2018-09-10,	 bounds: [399960, 5190220, 509760, 5300020]
uid: c1cd1733-f02f-4cff-a54d-376fb3844bfc,	 date: 2018-09-27,	 bounds: [399960, 5190220, 509760, 5300020]
uid: 9af91f68-6977-43f8-9d04-cd265ac54a81,	 date: 2018-09-07,	 bounds: [399960, 5190220, 509760, 5300020]
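
The scenes above are not listed in date order, and their bounds look like projected coordinates in metres rather than the longitude/latitude values used in the query. As a small worked example, the sketch below sorts the scenes by date and computes the footprint size from the bounds. The `Bounds` namedtuple is only a stand-in for the object returned by `scene.bounds`, and the values are copied from the output above.

```python
from collections import namedtuple
from datetime import date

# Stand-in for the bounds object returned by scene.bounds.
Bounds = namedtuple("Bounds", ["left", "bottom", "right", "top"])

tile = Bounds(399960, 5190220, 509760, 5300020)

# (date, bounds) pairs copied from the output above, in the order returned.
records = [
    (date(2018, 9, 17), tile),
    (date(2018, 9, 30), tile),
    (date(2018, 9, 10), tile),
    (date(2018, 9, 27), tile),
    (date(2018, 9, 7), tile),
]

# Sort chronologically on the first element of each pair.
by_date = sorted(records, key=lambda record: record[0])
dates = [d.isoformat() for d, _ in by_date]

# Footprint size in the units of the bounds (assumed here to be metres).
width = tile.right - tile.left
height = tile.top - tile.bottom

print(dates)
print(width, height)
```

All five scenes share the same bounds, which suggests they are repeat acquisitions of a single tile.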


In [19]:
dir(scenes[0].metadata)

['creation_time',
 'format',
 'grid_spatial',
 'id',
 'instrument',
 'label',
 'lat',
 'lon',
 'measurements',
 'platform',
 'product_type',
 'sources',
 'time']

In [15]:
for scene in scenes:
    print('location of metadata:\n')
    print(scene.id)
    print(scene.uris)
    print('---------------------\n')

    # Print each measurement in this scene and the path to its file
    for measurement in scene.measurements:
        print("measurement: {},\t path: {}".format(measurement, scene.measurements[measurement]['path']))

location of metadata:
8ae129db-3457-41a4-8126-f4d5ad411aab
['s3://dea-public-data/L2/sentinel-2-nrt/S2MSIARD/2018-09-17/S2A_OPER_MSI_ARD_TL_EPAE_20180917T012034_A016901_T55GDN_N02.06/ARD-METADATA.yaml']
---------------------
measurement: fmask,	 path: QA/S2A_OPER_MSI_ARD_TL_EPAE_20180917T012034_A016901_T55GDN_N02.06_FMASK.TIF
measurement: exiting,	 path: SUPPLEMENTARY/S2A_OPER_MSI_ARD_TL_EPAE_20180917T012034_A016901_T55GDN_N02.06_EXITING.TIF
measurement: incident,	 path: SUPPLEMENTARY/S2A_OPER_MSI_ARD_TL_EPAE_20180917T012034_A016901_T55GDN_N02.06_INCIDENT.TIF
measurement: nbar_red,	 path: NBAR/NBAR_B04.TIF
measurement: nbar_blue,	 path: NBAR/NBAR_B02.TIF
measurement: nbart_red,	 path: NBART/NBART_B04.TIF
measurement: timedelta,	 path: SUPPLEMENTARY/S2A_OPER_MSI_ARD_TL_EPAE_20180917T012034_A016901_T55GDN_N02.06_TIMEDELTA.TIF
measurement: nbar_green,	 path: NBAR/NBAR_B03.TIF
measurement: nbar_nir_1,	 path: NBAR/NBAR_B08.TIF
measurement: nbar_nir_2,	 path: NBAR/NBAR_B8A.TIF
measurement: n