# Data Set

A Data Set is a collection of Data Layers to which defaults (e.g. attributes and security controls) can be applied.

In most cases, a Data Set groups interrelated data, usually acquired contemporaneously (e.g. the 'bands' in the data provided by the ESA Sentinel 2 satellite). However, nothing prevents a Data Set from being used on more informal terms.

Searching for Data Sets was discussed in [Catalog - General](general.ipynb); this section covers how to retrieve the extensive metadata about a Data Set.

In [None]:
%pip install configparser
%pip install ibmpairs

In [1]:
import os
import ibmpairs.client as client
import ibmpairs.catalog as catalog
import configparser

# Best practice is not to include secrets in source code, so we read
# an api key, tenant id and org id from a secrets.ini file.
# You could set the credentials in-line here but we don't
# recommend it for security reasons.
config = configparser.RawConfigParser()
config.read('../../../auth/secrets.ini')

EI_API_KEY    = config.get('EI', 'api.api_key')
EI_TENANT_ID  = config.get('EI', 'api.tenant_id') 
EI_ORG_ID     = config.get('EI', 'api.org_id') 

# Authenticate and get a client object.
ei_client = client.get_client(api_key   = EI_API_KEY,
                              tenant_id = EI_TENANT_ID,
                              org_id    = EI_ORG_ID)

2025-01-10 10:35:22 - paw - INFO - The client authentication method is assumed to be OAuth2.
2025-01-10 10:35:22 - paw - INFO - Legacy Environment is False
2025-01-10 10:35:22 - paw - INFO - The authentication api key type is assumed to be IBM EIS, because the api key prefix 'PHX' is present.
2025-01-10 10:35:26 - paw - INFO - Authentication success.
2025-01-10 10:35:26 - paw - INFO - HOST: https://api.ibm.com/geospatial/run/na/core/v3
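
The cell above expects `secrets.ini` to contain an `[EI]` section with the keys `api.api_key`, `api.tenant_id` and `api.org_id`. As a minimal sketch of that layout, the snippet below parses an equivalent file from a string and checks that all three keys are present. The values are placeholders and the early-failure check is an illustrative addition, not part of `ibmpairs`.

```python
import configparser

# Placeholder contents for secrets.ini. The section and key names mirror the
# config.get() calls in the cell above; the values are made up.
SAMPLE_SECRETS = """
[EI]
api.api_key = <your-api-key>
api.tenant_id = <your-tenant-id>
api.org_id = <your-org-id>
"""

config = configparser.RawConfigParser()
config.read_string(SAMPLE_SECRETS)

# Fail early with a readable message if a key is missing.
required = ("api.api_key", "api.tenant_id", "api.org_id")
missing = [key for key in required if not config.has_option("EI", key)]
if missing:
    raise KeyError(f"secrets.ini is missing: {', '.join(missing)}")

credentials = {key: config.get("EI", key) for key in required}
print(sorted(credentials))
```

A check like this can be worthwhile because `config.read()` silently skips files it cannot find, so a wrong path only surfaces later as a `NoSectionError` from `config.get()`.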


## Get a list of Data Sets

To list all Data Sets available to the current user, call the `ibmpairs.catalog.get_data_sets()` function.

In [2]:
ds_list = catalog.get_data_sets()
ds_list.display()

|     | id  | name | description_short | description_long |
| --- | --- | --- | --- | --- |
| 0 | 574 | 15-46 day ECMWF weather forecast (ML post-proc... | ML Post-processed temperature and precipitatio... | ML Post-processed temperature and precipitatio... |
| 1 | 575 | 1-15 day ECMWF weather forecast (ML post-proce... | ML Post-processed temperature and precipitatio... | ML Post-processed temperature and precipitatio... |
| 2 | 306 | Atmospheric weather (ERA5) | A global reanalysis data set produced by ECMWF... | ERA5 is the direct successor to the ERA Interi... |
| 3 | 63 | High resolution aerial imagery (USDA NAIP) | High resolution (<1m) aerial imagery from the ... | National Agriculture Imagery Program (NAIP) ac... |
| 4 | 369 | Buoy Data Wave Summary | Precise wave conditions around bouys belonging... | Local measurements of wave attributes and thei... |
| ... | ... | ... | ... | ... |
| 97 | 36 | GFM Firescars | | |
| 98 | 35 | GFM Above Ground Biomass | | |
| 99 | 38 | Above Ground Biomass | Above Ground Biomass related data | Above Ground Biomass data generated using GFM ... |
| 100 | 70 | TWC AGE Data | AGE Data layers from The Weather Company, an I... | |
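
The `id` column in this listing is the value later passed to `get_data_set()`. As a rough sketch of picking IDs out of such a listing by name, the snippet below runs a case-insensitive substring match over a few rows copied from the output above. The `rows` list and the `find_ids` helper are hypothetical stand-ins so the example runs offline; they are not part of the `ibmpairs` API.

```python
# Rows copied from the listing above (untruncated names only). This is a
# stand-in for the real result of catalog.get_data_sets().
rows = [
    {"id": "306", "name": "Atmospheric weather (ERA5)"},
    {"id": "63", "name": "High resolution aerial imagery (USDA NAIP)"},
    {"id": "369", "name": "Buoy Data Wave Summary"},
    {"id": "36", "name": "GFM Firescars"},
    {"id": "35", "name": "GFM Above Ground Biomass"},
    {"id": "38", "name": "Above Ground Biomass"},
    {"id": "70", "name": "TWC AGE Data"},
]


def find_ids(rows, text):
    """Return the IDs of all rows whose name contains `text`, ignoring case."""
    needle = text.lower()
    return [row["id"] for row in rows if needle in row["name"].lower()]


biomass_ids = find_ids(rows, "biomass")
print(biomass_ids)
```

The IDs are kept as strings here because the `get_data_set()` call in the next section is given its ID as a string.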


## Get a Data Set

To retrieve all metadata for a single Data Set, pass its ID to the `ibmpairs.catalog.get_data_set()` helper function:

In [3]:
ds = catalog.get_data_set(id = "177")
print(ds)

{
    "category": {
        "id": 1,
        "name": "Satellite"
    },
    "created_at": "1593733829000",
    "crs": "",
    "data_set_response": {},
    "data_source_attribution": "Source: European Space Agency - ESA; Contains modified Copernicus Sentinel data [2018 and Ongoing]",
    "data_source_description": "Level-2A is generated by the Payload Data Ground Segment using the Sen2Cor processor. Level-2A products are made available to users via the Copernicus Open Access Hub: https://scihub.copernicus.eu/dhus/#/home",
    "data_source_links": [
        "https://sentinel.esa.int/web/sentinel/sentinel-data-access"
    ],
    "data_source_name": "European Space Agency Sentinel-2",
    "description_links": [],
    "description_long": "Sentinel-2 is a set of two satellites in polar orbit 180 degrees apart. It monitors land surface and coastal waters every 5 days at the equator and more frequently at mid-latitudes. The coverage is between latitudes 56\u00b0 south and 84\u00b0 north. Image