## How to Query satellittdata.no Using CSW

The **Catalogue Service for the Web (CSW)** is an **Open Geospatial Consortium (OGC)** standard that enables structured querying and discovery of geospatial metadata over the web. It provides a standardised interface to search metadata catalogues using parameters such as keywords, spatial extent, and time. This tutorial demonstrates how to use CSW to query datasets hosted on [satellittdata.no](https://www.satellittdata.no/), Norway’s national satellite data portal. With Python tools, we can efficiently discover and retrieve metadata about available Earth observation products.

> **Note**: This tutorial is still in development. The functionality shown below should work, and we plan to expand it in the future.

## Importing libraries

Let's begin by importing the Python libraries we’ll use in this session.

If you don’t already have **OWSLib** installed, you may need to run:

```bash
pip install owslib
```

Now we can import the libraries:

In [1]:
from owslib import fes
from owslib.csw import CatalogueServiceWeb
from datetime import datetime, timedelta
import pytz

## Defining functions

Let's now define some helper functions for querying the catalogue with CSW. We will use them to run a range of queries in the next section.

In [2]:
def _get_csw_connection(endpoint):
    """
    Connect to the CSW server at `endpoint`.
    """
    csw = CatalogueServiceWeb(endpoint, timeout=60)
    return csw

def _get_freetxt_search(kw_names):
    """
    Return a CSW free-text filter based on the input string.
    """
    # '*' matches any run of characters and '.' matches a single character.
    freetxt_filt = fes.PropertyIsLike('apiso:AnyText', literal=str(kw_names),
                                      escapeChar="\\", singleChar=".",
                                      wildCard="*", matchCase=True)
    return freetxt_filt

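def _get_csw_records(csw, filter_list, pagesize=10, maxrecords=1):
    """
    Page through the query results, `pagesize` records per request, until
    the server reports no further records or the start position reaches
    `maxrecords`. The collected records are stored in `csw.records`.
    """
    csw_records = {}
    startposition = 0
    while True:
        csw.getrecords2(
            constraints=filter_list,
            startposition=startposition,
            maxrecords=pagesize,
            outputschema="http://www.opengis.net/cat/csw/2.0.2",
            esn='full',
        )
        # Print the paging summary: matches, returned, nextrecord.
        print(csw.results)
        csw_records.update(csw.records)
        # A nextrecord of 0 means the server has no more records to return.
        if csw.results["nextrecord"] == 0:
            break
        # NOTE: from the third request onward this stride steps over one
        # record per page; csw.results["nextrecord"] holds the contiguous
        # start position.
        startposition += pagesize + 1
        if startposition >= maxrecords:
            break
    # Merge everything collected back into csw.records.
    csw.records.update(csw_records)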

def _fes_date_filter(start, stop, constraint="overlaps"):
    """
    Take datetime-like objects and return a pair of fes filters for the
    date range (begin and end inclusive).
    NOTE: Minutes and seconds are truncated; only the hour is kept.
    """
    start = start.strftime("%Y-%m-%d %H:00")
    stop = stop.strftime("%Y-%m-%d %H:00")
    if constraint == "overlaps":
        # The dataset starts no later than `stop` and ends no earlier than `start`.
        begin = fes.PropertyIsLessThanOrEqualTo(
            propertyname="apiso:TempExtent_begin", literal=stop
        )
        end = fes.PropertyIsGreaterThanOrEqualTo(
            propertyname="apiso:TempExtent_end", literal=start
        )
    elif constraint == "within":
        # The dataset starts no earlier than `start` and ends no later than `stop`.
        begin = fes.PropertyIsGreaterThanOrEqualTo(
            propertyname="apiso:TempExtent_begin", literal=start
        )
        end = fes.PropertyIsLessThanOrEqualTo(
            propertyname="apiso:TempExtent_end", literal=stop
        )
    else:
        raise ValueError("Unrecognized constraint {}".format(constraint))
    return begin, end

def _create_filter(bbox, start, stop, product_name_pattern):
    """
    Create a CSW-compatible filter for querying satellittdata.no.

    Parameters:
        bbox (list or tuple): A bounding box defined as [minx, miny, maxx, maxy]
                              in decimal degrees (longitude, latitude).
        start (datetime): Start of the time range as a datetime object
                          (e.g. datetime(2025, 6, 1, tzinfo=pytz.utc)).
        stop (datetime): End of the time range as a datetime object
                         (e.g. datetime(2025, 6, 30, 23, tzinfo=pytz.utc)).
        product_name_pattern (str): A text pattern or full name of the product 
                                    to search for (e.g. 'S1*GRDM*').

    Returns:
        csw (owslib.csw.CatalogueServiceWeb): A CSW connection object.
        filter_list (list): A list of FES (Filter Encoding Specification) 
                            constraints to use in a CSW query.
    """
    endpoint = 'https://nbs.csw.met.no'
    crs = 'urn:x-ogc:def:crs:EPSG:6.18:4326'

    constraints = []
    # Connect to the endpoint; re-raise on failure, since the function
    # cannot return a connection otherwise.
    try:
        csw = _get_csw_connection(endpoint)
    except Exception as e:
        print("Exception: %s" % str(e))
        raise
    
    if product_name_pattern:
        freetxt_filt = _get_freetxt_search(product_name_pattern)
        constraints.append(freetxt_filt)
    
    if all(v is not None for v in [start, stop]):
        begin, end = _fes_date_filter(start, stop)
        constraints.append(begin)
        constraints.append(end)
    
    if bbox:
        bbox_crs = fes.BBox(bbox, crs=crs)
        constraints.append(bbox_crs)
    
    if len(constraints) >= 2:
        filter_list = [fes.And(constraints)]
    else:
        filter_list = constraints

    return csw, filter_list
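
To see what the hour truncation in `_fes_date_filter` means in practice, the short sketch below applies the same `strftime` pattern to two timestamps. It uses only the standard library, and the example timestamps are arbitrary values chosen for illustration.

```python
from datetime import datetime, timezone

# Same format string as in _fes_date_filter: minutes and seconds are dropped.
HOUR_FORMAT = "%Y-%m-%d %H:00"

# Arbitrary example timestamps (not taken from a real query).
start = datetime(2025, 6, 5, 13, 47, 12, tzinfo=timezone.utc)
stop = datetime(2025, 6, 5, 14, 5, 0, tzinfo=timezone.utc)

start_literal = start.strftime(HOUR_FORMAT)
stop_literal = stop.strftime(HOUR_FORMAT)

print(start_literal)  # 2025-06-05 13:00
print(stop_literal)   # 2025-06-05 14:00
```

Here a stop time of 14:05 is sent to the catalogue as 14:00, so it can be worth rounding `stop` up to the next full hour when the last few minutes matter.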
    

## Querying Using CSW

There are three main parameters you can use to query the catalogue:

- **Bounding box**: A bounding box defined as [minx, miny, maxx, maxy] in decimal degrees (longitude, latitude).
- **Time range**: A start and end time given as Python `datetime` objects (e.g. `datetime(2025, 6, 1, tzinfo=pytz.utc)`). Minutes and seconds are truncated, so the query is resolved to the hour.
- **Product name (or pattern)**: This can be a full product identifier or a partial match using wildcards, e.g.:
    - A specific product: `S1A_EW_GRDM_1SDH_20250613T174547_20250613T174623_059631_076768_FDA0`
    - All Sentinel-2 products: `S2*`
    - All Sentinel-1 GRDM products: `S1*GRDM*`
    - Any other similar pattern can be used to filter products by name.
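
To get a feel for which product names a pattern such as `S1*GRDM*` selects, the sketch below applies the same `*` wildcard locally with Python's `fnmatch` module. This is only an offline illustration: the catalogue evaluates the pattern on the server against its free-text index, so results there may differ. The helper name `select` is made up for this example; the product names are taken from elsewhere in this tutorial.

```python
from fnmatch import fnmatchcase

product_names = [
    "S1A_EW_GRDM_1SDH_20250613T174547_20250613T174623_059631_076768_FDA0",
    "S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B",
    "S2C_MSIL2A_20250613T104641_N0511_R051_T33VUG_20250613T134507",
]


def select(names, pattern):
    """Return the names matching a case-sensitive shell-style pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


s1_grdm = select(product_names, "S1*GRDM*")
s2_all = select(product_names, "S2*")

print(len(s1_grdm), len(s2_all))  # 2 1
```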

Let's first return the CSW records for an example query.

In [3]:
bbox = [-10, 75, 10, 90] # minx, miny, maxx, maxy
start = datetime(2025, 6, 5, 00, 00, 00).replace(tzinfo=pytz.utc)
stop = datetime(2025, 7, 3, 00, 00, 00).replace(tzinfo=pytz.utc)
product_name_pattern = 'S1*GRDM*'

csw, filter_list = _create_filter(bbox, start, stop, product_name_pattern)

_get_csw_records(csw, filter_list, pagesize=3, maxrecords=6)
csw.records

{'matches': 285, 'returned': 3, 'nextrecord': 4}
{'matches': 285, 'returned': 3, 'nextrecord': 7}


OrderedDict([('9484cbcc-c1d0-48d8-0de6-0107560ca083',
              <owslib.catalogue.csw2.CswRecord at 0x7206880e2ad0>),
             ('9484cbcc-c1d0-48d8-0de6-0107560ca710',
              <owslib.catalogue.csw2.CswRecord at 0x7206880e3750>),
             ('9484cbcc-c1d0-48d8-0de6-0107560cce39',
              <owslib.catalogue.csw2.CswRecord at 0x7206880e2a90>),
             ('f124218a-3dcb-4ee1-ac93-1051d5efbbc3',
              <owslib.catalogue.csw2.CswRecord at 0x7206a4ed3910>),
             ('30794d32-44bf-40b4-bf59-bb9deb03f863',
              <owslib.catalogue.csw2.CswRecord at 0x7206880e1650>),
             ('3191358e-d41d-4948-85e6-c00935d982fe',
              <owslib.catalogue.csw2.CswRecord at 0x7206880e2710>)])

We used `pagesize` and `maxrecords` to limit the number of products returned. The query matched 285 products, but we asked for only 3 per request and 6 in total.
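
The interplay between `pagesize` and `maxrecords` can be tried out offline. The sketch below is one possible paging loop, not the function defined earlier: instead of computing the next start position from `pagesize`, it reads it from the `nextrecord` value that the server reports, so consecutive pages are contiguous. `FakeCatalogue` and `fetch_records` are hypothetical names introduced for this example; `FakeCatalogue` only imitates the `getrecords2`, `records` and `results` members of a CSW connection.

```python
class FakeCatalogue:
    """Hypothetical stand-in for a CSW connection holding `total` records."""

    def __init__(self, total):
        self.total = total
        self.records = {}
        self.results = {}

    def getrecords2(self, startposition=1, maxrecords=10, **kwargs):
        # Imitate one page of results, numbered from 1 like CSW records.
        first = max(startposition, 1)
        last = min(first + maxrecords - 1, self.total)
        self.records = {f"id-{i}": i for i in range(first, last + 1)}
        self.results = {
            "matches": self.total,
            "returned": len(self.records),
            "nextrecord": last + 1 if last < self.total else 0,
        }


def fetch_records(csw, pagesize=10, maxrecords=100, **query):
    """Collect up to `maxrecords` records, `pagesize` per request."""
    collected = {}
    position = 1
    while len(collected) < maxrecords:
        # Never ask for more than is still needed.
        page = min(pagesize, maxrecords - len(collected))
        csw.getrecords2(startposition=position, maxrecords=page, **query)
        collected.update(csw.records)
        # Continue from the position the server reports; 0 means "done".
        position = csw.results["nextrecord"]
        if position == 0 or not csw.records:
            break
    return collected


few = fetch_records(FakeCatalogue(285), pagesize=3, maxrecords=6)
everything = fetch_records(FakeCatalogue(61), pagesize=10, maxrecords=100)
print(len(few), len(everything))  # 6 61
```

With a real connection, the extra keyword arguments would carry the query itself, for example `constraints=filter_list`, `outputschema="http://www.opengis.net/cat/csw/2.0.2"` and `esn="full"`.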

Let's now extract some useful information from this query, such as various URLs for accessing the data.

### Returning dictionaries that include all the URLs

In [4]:
for key, value in list(csw.records.items()):
    for ref in value.references:
        print(ref)

{'scheme': 'OPeNDAP:OPeNDAP', 'url': 'https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.nc'}
{'scheme': 'OGC:WMS', 'url': 'https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca083/wms'}
{'scheme': 'WWW:DOWNLOAD-1.0-http--download', 'url': 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.zip'}
{'scheme': 'OPeNDAP:OPeNDAP', 'url': 'https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T071952_20250630T072052_003009_006209_F006.nc'}
{'scheme': 'OGC:WMS', 'url': 'https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca710/wms'}
{'scheme': 'WWW:DOWNLOAD-1.0-http--download', 'url': 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T071952_20250630T072052_003009_006209_F006.zip'}
{'scheme': 'OPeNDAP:OPeNDAP', 'url': 'https://nbstds.met.n

### Returning a List of OPeNDAP URLs

**OPeNDAP** (Open-source Project for a Network Data Access Protocol) is a protocol for remote access to scientific datasets, served here in **netCDF** format. It lets users retrieve specific subsets of data over the web (such as selected variables, spatial regions, or time ranges) without downloading the entire file. Access is performed via standard HTTP requests, which makes it well suited to working with large datasets.

In the NBS project, the following products are served in netCDF format via OPeNDAP:

- **Sentinel-1 GRD** products (30-day rolling archive)  
- **Sentinel-2** products (365-day rolling archive)

In [5]:
url_opendap = []

for key, value in list(csw.records.items()):
    for ref in value.references:
        if ref['scheme'] == 'OPeNDAP:OPeNDAP':
            url_opendap.append(ref['url'])

url_opendap

['https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.nc',
 'https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T071952_20250630T072052_003009_006209_F006.nc',
 'https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T085810_20250630T085910_003010_006213_9C2C.nc',
 'https://nbstds.met.no/thredds/dodsC/NBS/S1A/2025/06/30/EW/S1A_EW_GRDM_1SDH_20250630T175659_20250630T175748_059879_077004_798C.nc',
 'https://nbstds.met.no/thredds/dodsC/NBS/S1A/2025/06/11/EW/S1A_EW_GRDM_1SDH_20250611T063951_20250611T064055_059595_076620_4563.nc',
 'https://nbstds.met.no/thredds/dodsC/NBS/S1A/2025/06/26/EW/S1A_EW_GRDM_1SDH_20250626T070538_20250626T070638_059814_076DC0_D27C.nc']

### Returning a List of WMS URLs

**WMS** (Web Map Service) is an Open Geospatial Consortium (OGC) standard for serving georeferenced map images over the web. In the NBS project, WMS is used to visualise Sentinel products directly in a web mapping application, allowing users to preview datasets without downloading the raw data. These visualisations can be accessed by following the provided WMS URLs.

In [6]:
url_wms = []

for key, value in list(csw.records.items()):
    for ref in value.references:
        if ref['scheme'] == 'OGC:WMS':
            url_wms.append(ref['url'])

url_wms

['https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca083/wms',
 'https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca710/wms',
 'https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560cce39/wms',
 'https://adc-wms.met.no/get_wms/f124218a-3dcb-4ee1-ac93-1051d5efbbc3/wms',
 'https://adc-wms.met.no/get_wms/30794d32-44bf-40b4-bf59-bb9deb03f863/wms',
 'https://adc-wms.met.no/get_wms/3191358e-d41d-4948-85e6-c00935d982fe/wms']
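
A WMS endpoint is usually explored by first asking it for its capabilities document, which lists the available layers. The sketch below only builds such a request URL with the standard WMS query parameters; it assumes that the endpoints above accept them and that version 1.3.0 is supported, neither of which is confirmed here. `capabilities_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def capabilities_url(wms_endpoint, version="1.3.0"):
    """Build a GetCapabilities request URL for a WMS endpoint."""
    query = urlencode(
        {"service": "WMS", "version": version, "request": "GetCapabilities"}
    )
    return f"{wms_endpoint}?{query}"


url = capabilities_url(
    "https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca083/wms"
)
print(url)
```

The resulting URL can be opened in a browser or passed to a WMS client of your choice.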

### Returning a List of URLs for Direct Download

In the NBS project, the Sentinel products are available for direct download via a THREDDS catalogue.  
**THREDDS** (Thematic Real-time Environmental Distributed Data Services) is a web server developed by Unidata that provides metadata and data access for scientific datasets, typically using standard protocols such as OPeNDAP, HTTP, and WMS.

In [7]:
url_download = []

for key, value in list(csw.records.items()):
    for ref in value.references:
        if ref['scheme'] == 'WWW:DOWNLOAD-1.0-http--download':
            url_download.append(ref['url'])

url_download

['https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T071952_20250630T072052_003009_006209_F006.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T085810_20250630T085910_003010_006213_9C2C.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1A/2025/06/30/EW/S1A_EW_GRDM_1SDH_20250630T175659_20250630T175748_059879_077004_798C.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1A/2025/06/11/EW/S1A_EW_GRDM_1SDH_20250611T063951_20250611T064055_059595_076620_4563.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S1A/2025/06/26/EW/S1A_EW_GRDM_1SDH_20250626T070538_20250626T070638_059814_076DC0_D27C.zip']
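
The three loops above differ only in the scheme they test for, so they could be folded into a single pass that groups every URL by its scheme. The sketch below shows one way to do that. `urls_by_scheme` is a hypothetical helper, and `sample_records` is a small stand-in for `csw.records` built from the first record shown earlier, so that the example runs without a network connection.

```python
from collections import defaultdict
from types import SimpleNamespace


def urls_by_scheme(records):
    """Map each reference scheme to the list of URLs found in `records`."""
    grouped = defaultdict(list)
    for record in records.values():
        for ref in record.references:
            grouped[ref["scheme"]].append(ref["url"])
    return dict(grouped)


# Stand-in for csw.records: objects that expose a `references` list.
sample_records = {
    "9484cbcc-c1d0-48d8-0de6-0107560ca083": SimpleNamespace(references=[
        {"scheme": "OPeNDAP:OPeNDAP",
         "url": "https://nbstds.met.no/thredds/dodsC/NBS/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.nc"},
        {"scheme": "OGC:WMS",
         "url": "https://adc-wms.met.no/get_wms/9484cbcc-c1d0-48d8-0de6-0107560ca083/wms"},
        {"scheme": "WWW:DOWNLOAD-1.0-http--download",
         "url": "https://nbstds.met.no/thredds/fileServer/nbsArchive/S1C/2025/06/30/EW/S1C_EW_GRDM_1SDH_20250630T072052_20250630T072152_003009_006209_645B.zip"},
    ]),
}

urls = urls_by_scheme(sample_records)
print(sorted(urls))
# ['OGC:WMS', 'OPeNDAP:OPeNDAP', 'WWW:DOWNLOAD-1.0-http--download']
```

With a real query, `urls_by_scheme(csw.records)["OPeNDAP:OPeNDAP"]` would then give the same list as `url_opendap`.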

### Full Example of Querying and Returning URLs for Direct Download

In [8]:
bbox = [11, 60, 12, 61] # minx, miny, maxx, maxy
start = datetime(2025, 6, 5, 00, 00, 00).replace(tzinfo=pytz.utc)
stop = datetime(2025, 7, 3, 00, 00, 00).replace(tzinfo=pytz.utc)
product_name_pattern = 'S2*L2A*'

csw, filter_list = _create_filter(bbox, start, stop, product_name_pattern)

_get_csw_records(csw, filter_list, pagesize=10, maxrecords=100)

url_download = []

for key, value in list(csw.records.items()):
    for ref in value.references:
        if ref['scheme'] == 'WWW:DOWNLOAD-1.0-http--download':
            url_download.append(ref['url'])

url_download

{'matches': 61, 'returned': 10, 'nextrecord': 11}
{'matches': 61, 'returned': 10, 'nextrecord': 21}
{'matches': 61, 'returned': 10, 'nextrecord': 32}
{'matches': 61, 'returned': 10, 'nextrecord': 43}
{'matches': 61, 'returned': 10, 'nextrecord': 54}
{'matches': 61, 'returned': 7, 'nextrecord': 0}


['https://nbstds.met.no/thredds/fileServer/nbsArchive/S2C/2025/06/13/S2C_MSIL2A_20250613T104641_N0511_R051_T33VUG_20250613T134507.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2C/2025/06/07/S2C_MSIL2A_20250607T102621_N0511_R108_T32VPM_20250607T161214.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2C/2025/06/13/S2C_MSIL2A_20250613T104641_N0511_R051_T33VUH_20250613T134507.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2B/2025/06/22/S2B_MSIL2A_20250622T102559_N0511_R108_T33VUH_20250622T124912.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2B/2025/06/22/S2B_MSIL2A_20250622T102559_N0511_R108_T32VPM_20250622T124912.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2B/2025/06/22/S2B_MSIL2A_20250622T102559_N0511_R108_T33VUG_20250622T124912.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsArchive/S2B/2025/06/22/S2B_MSIL2A_20250622T102559_N0511_R108_T32VPN_20250622T124912.zip',
 'https://nbstds.met.no/thredds/fileServer/nbsAr