# GWS Data Library: Exploring Datasets
This notebook demonstrates how to use the GWSDataClient class from the gwc_datalib Python library to interact with the GWC Data Infrastructure.

The goal is to allow researchers to discover, access, and manage datasets published across different repositories (e.g., Azure Blob, Dataverse, ERDA) in a unified way.

## Getting Started

In [1]:
from gwc_datalib.client import GWSDataClient

client = GWSDataClient()

When you run the above cell, you may be prompted to enter your Auth0 username and password, unless your credentials are already configured in a .env file. This ensures that access to your datasets is secure and user-specific.
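
To illustrate the .env option, the sketch below uses only the standard library to parse a .env file and report which credential keys are missing. The key names AUTH0_USERNAME and AUTH0_PASSWORD are assumptions made for this example; the names that gwc_datalib actually reads are not stated in this notebook, so check the library's own documentation before relying on them.

```python
from pathlib import Path

# Hypothetical key names; gwc_datalib may expect different ones.
EXPECTED_KEYS = ("AUTH0_USERNAME", "AUTH0_PASSWORD")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_credentials(path, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty in the file."""
    env_path = Path(path)
    if not env_path.exists():
        return list(expected)
    values = read_env_file(env_path)
    return [key for key in expected if not values.get(key)]
```

If missing_credentials(".env") returns an empty list, the file contains a non-empty value for every expected key; otherwise the client would presumably fall back to the interactive prompt.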

## Searching Datasets in the Catalog

In [2]:
client.list_datasets(
    tag="soil",
    # name = 'soil',
)

[{'dataset_name': 'Soil Moisture Denmark',
  'description': 'Monthly soil moisture raster files from Denmark for years 2018 to 2023',
  'storage_service': 'AzureBlob',
  'storage_format': 'tif',
  'dataset_metadata': {'delimiter': None,
   'has_header': None,
   'data_types': None,
   'filename_pattern': '<country>_<date_from>_<date_to>_TSEB-PT_100m_RZSM.tif',
   'additional_metadata': {},
   'bands': 4,
   'band_descriptions': {'Band 1': 'Red',
    'Band 2': 'Green',
    'Band 3': 'Blue',
    'Band 4': 'NIR'},
   'coordinate_system': 'EPSG:4326',
   'resolution': '[100.0, 100.0]',
   'extent': 'None',
   'nodata_value': '-9999',
   'compression': 'None'},
  'tags': ['soil'],
  'version': '0.0.1',
  'country': 'DK'}]

This retrieves all public datasets tagged with "soil". You can also search by name using the name= parameter, or leave both blank to list all public datasets.
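
Because the result shown above is a plain list of dictionaries, it can be narrowed further on the client side. The following is a minimal sketch: filter_datasets is a hypothetical helper, not part of gwc_datalib, and the sample record is an abbreviated copy of the output above.

```python
def filter_datasets(records, **criteria):
    """Keep records whose top-level fields equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(field) == wanted for field, wanted in criteria.items())
    ]


# Abbreviated copy of the record returned by client.list_datasets(tag="soil").
sample_records = [
    {
        "dataset_name": "Soil Moisture Denmark",
        "storage_service": "AzureBlob",
        "storage_format": "tif",
        "tags": ["soil"],
        "version": "0.0.1",
        "country": "DK",
    }
]

danish_rasters = filter_datasets(sample_records, country="DK", storage_format="tif")
print([record["dataset_name"] for record in danish_rasters])
# prints ['Soil Moisture Denmark']
```

In a live session, the list returned by client.list_datasets() could be passed in place of sample_records.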

## Listing Your Uploaded Datasets

In [3]:
client.list_user_datasets()

[{'dataset_name': 'Soil Moisture Denmark',
  'description': 'Monthly soil moisture raster files from Denmark for years 2018 to 2023',
  'storage_service': 'AzureBlob',
  'storage_format': 'tif',
  'dataset_metadata': {'delimiter': None,
   'has_header': None,
   'data_types': None,
   'filename_pattern': '<country>_<date_from>_<date_to>_TSEB-PT_100m_RZSM.tif',
   'additional_metadata': {},
   'bands': 4,
   'band_descriptions': {'Band 1': 'Red',
    'Band 2': 'Green',
    'Band 3': 'Blue',
    'Band 4': 'NIR'},
   'coordinate_system': 'EPSG:4326',
   'resolution': '[100.0, 100.0]',
   'extent': 'None',
   'nodata_value': '-9999',
   'compression': 'None'},
  'tags': ['soil'],
  'version': '0.0.1',
  'country': 'DK'}]

This retrieves all datasets added by the authenticated user.

## Loading a Dataset

First, load the dataset by its name:

In [4]:
soil_moisture_dataset = client.load_dataset(dataset_name="Soil Moisture Denmark")

Now you can list the available files:

In [5]:
soil_moisture_dataset.files

['denmark_20180101_20180131_TSEB-PT_100m_RZSM.tif',
 'denmark_20180201_20180228_TSEB-PT_100m_RZSM.tif',
 'denmark_20180301_20180331_TSEB-PT_100m_RZSM.tif',
 'denmark_20180401_20180430_TSEB-PT_100m_RZSM.tif',
 'denmark_20180501_20180531_TSEB-PT_100m_RZSM.tif',
 'denmark_20180601_20180630_TSEB-PT_100m_RZSM.tif',
 'denmark_20180701_20180731_TSEB-PT_100m_RZSM.tif',
 'denmark_20180801_20180831_TSEB-PT_100m_RZSM.tif',
 'denmark_20180901_20180930_TSEB-PT_100m_RZSM.tif',
 'denmark_20181001_20181031_TSEB-PT_100m_RZSM.tif',
 'denmark_20181101_20181130_TSEB-PT_100m_RZSM.tif',
 'denmark_20181201_20181231_TSEB-PT_100m_RZSM.tif',
 'denmark_20190101_20190131_TSEB-PT_100m_RZSM.tif',
 'denmark_20190201_20190228_TSEB-PT_100m_RZSM.tif',
 'denmark_20190301_20190331_TSEB-PT_100m_RZSM.tif',
 'denmark_20190401_20190430_TSEB-PT_100m_RZSM.tif',
 'denmark_20190501_20190531_TSEB-PT_100m_RZSM.tif',
 'denmark_20190601_20190630_TSEB-PT_100m_RZSM.tif',
 'denmark_20190701_20190731_TSEB-PT_100m_RZSM.tif',
 'denmark_20
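
The file names follow the dataset's filename_pattern, `<country>_<date_from>_<date_to>_TSEB-PT_100m_RZSM.tif`, so a single month or year can be selected by parsing them. The sketch below assumes the dates are always written as YYYYMMDD, as in the listing above; parse_filename and files_for_year are hypothetical helpers, not part of gwc_datalib.

```python
import re
from datetime import datetime

# Mirrors the dataset's filename_pattern:
# <country>_<date_from>_<date_to>_TSEB-PT_100m_RZSM.tif
FILENAME_RE = re.compile(
    r"^(?P<country>[a-z]+)_(?P<date_from>\d{8})_(?P<date_to>\d{8})"
    r"_TSEB-PT_100m_RZSM\.tif$"
)


def parse_filename(file_name):
    """Return (country, date_from, date_to), or None if the name does not match."""
    match = FILENAME_RE.match(file_name)
    if match is None:
        return None
    start = datetime.strptime(match["date_from"], "%Y%m%d").date()
    end = datetime.strptime(match["date_to"], "%Y%m%d").date()
    return match["country"], start, end


def files_for_year(file_names, year):
    """Select the file names whose start date falls in the given year."""
    selected = []
    for name in file_names:
        parsed = parse_filename(name)
        if parsed is not None and parsed[1].year == year:
            selected.append(name)
    return selected
```

For example, files_for_year(soil_moisture_dataset.files, 2019) would keep only the 2019 rasters. Names that do not match the pattern are skipped rather than raising an error.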

To access the data, you can either create download links for the files or load them directly into xarray (for raster files) or pandas (for CSV files).

In [7]:
soil_moisture_dataset.get_download_links(
    file_name="denmark_20231201_20231231_TSEB-PT_100m_RZSM.tif"  # Leave as None to get links for all files
)

[{'file': 'denmark_20231201_20231231_TSEB-PT_100m_RZSM.tif',
  'url': 'https://ccubed.blob.core.windows.net/soil-moisture-denmark/SM/output/100m/denmark/denmark_20231201_20231231_TSEB-PT_100m_RZSM.tif?se=2025-05-16T16%3A23%3A45Z&sp=r&sv=2025-01-05&sr=b&sig=tTUcwHyuE2LPGFUA99FEJDk8ywj3LIH9Axaf4/lCokY%3D'}]
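
The query string of the returned URL appears to be an Azure shared access signature, in which `se` carries the expiry time and `sp=r` grants read access, so the link is valid only for a limited period. The sketch below reads the expiry with the standard library before a download is attempted. It assumes `se` uses the same timestamp format as the link above; link_expiry and is_expired are hypothetical helpers, not part of gwc_datalib, and the example URL is a made-up placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def link_expiry(url):
    """Return the 'se' query parameter as an aware UTC datetime, or None."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes values
    values = query.get("se")
    if not values:
        return None
    parsed = datetime.strptime(values[0], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(url, now=None):
    """True if the link carries an expiry time that has already passed."""
    now = now or datetime.now(timezone.utc)
    expiry = link_expiry(url)
    return expiry is not None and now >= expiry


# Placeholder URL with the same query layout as the link above.
example_url = (
    "https://example.blob.core.windows.net/container/file.tif"
    "?se=2025-05-16T16%3A23%3A45Z&sp=r&sv=2025-01-05&sr=b&sig=placeholder"
)
print(link_expiry(example_url).isoformat())
# prints 2025-05-16T16:23:45+00:00
```

If is_expired returns True for a stored link, calling get_download_links again should produce a fresh one.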

In [6]:
soil_moisture_dataset.to_xarray(
    file_name="denmark_20231201_20231231_TSEB-PT_100m_RZSM.tif"  # Leave as None to load all files
)