<img src="EMODnet_compact_colour.png" align="right" width="40%"></img>
# EMODnet OGC services Workshop

## 1. Search through metadata using the OGC Catalogue Service (CSW)

OWSLib is a Python library for accessing data and metadata through OGC services. It wraps the underlying HTTP API calls in ordinary function calls. We will use it throughout the tutorial to interact with the EMODnet OGC services.

The library is available on GitHub at https://github.com/geopython/OWSLib
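
To give a sense of what is being abstracted, the sketch below builds the key-value-pair URL of a CSW `GetCapabilities` request by hand, using only the standard library. The endpoint is the one used later in this tutorial. The helper name `build_csw_url` is made up for illustration and is not part of OWSLib, whose internal request construction may differ.

```python
from urllib.parse import urlencode

# Endpoint used throughout this tutorial
CSW_ENDPOINT = "https://emodnet.ec.europa.eu/geonetwork/emodnet/eng/csw"


def build_csw_url(endpoint, request, **extra):
    """Return a KVP-encoded CSW 2.0.2 request URL (illustrative helper)."""
    params = {"service": "CSW", "version": "2.0.2", "request": request}
    params.update(extra)  # optional request-specific parameters
    return f"{endpoint}?{urlencode(params)}"


capabilities_url = build_csw_url(CSW_ENDPOINT, "GetCapabilities")
print(capabilities_url)
```

No request is sent here; the URL is only printed. OWSLib performs the equivalent call and parses the XML response when a `CatalogueServiceWeb` object is created.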

In [1]:
from owslib.csw import CatalogueServiceWeb

#### Create a CatalogueServiceWeb object connecting to the EMODnet catalogue webservice


In [2]:
csw = CatalogueServiceWeb('https://emodnet.ec.europa.eu/geonetwork/emodnet/eng/csw')

#### Inspect its properties using print()

In [3]:
print(csw.identification.type)
print(csw.identification.title)
print(csw.identification.version)
print([op.name for op in csw.operations])

CSW
None
2.0.2
['GetCapabilities', 'DescribeRecord', 'GetDomain', 'GetRecords', 'GetRecordById', 'Transaction', 'Harvest']


#### Inspect the supported GetRecords parameters

In [4]:
csw.get_operation_by_name("GetRecords").parameters

{'resultType': {'values': ['hits', 'results', 'validate']},
 'outputFormat': {'values': ['application/xml']},
 'outputSchema': {'values': ['http://www.opengis.net/cat/csw/2.0.2',
   'http://www.isotc211.org/2005/gfc',
   'http://www.w3.org/ns/dcat#',
   'http://www.isotc211.org/2005/gmd',
   'http://standards.iso.org/iso/19115/-3/mdb/2.0']},
 'typeNames': {'values': ['csw:Record',
   'gfc:FC_FeatureCatalogue',
   'dcat',
   'gmd:MD_Metadata',
   'mdb:MD_Metadata']},
 'CONSTRAINTLANGUAGE': {'values': ['FILTER', 'CQL_TEXT']}}
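
The advertised parameters can be used to check a value before sending a query. The sketch below works on a literal copy of part of the dictionary printed above, so it runs without a network connection; `is_supported` is a hypothetical helper, not an OWSLib function.

```python
# Literal copy of part of the output above. In a live session, use
# csw.get_operation_by_name("GetRecords").parameters instead.
get_records_params = {
    "resultType": {"values": ["hits", "results", "validate"]},
    "outputFormat": {"values": ["application/xml"]},
    "outputSchema": {"values": [
        "http://www.opengis.net/cat/csw/2.0.2",
        "http://www.isotc211.org/2005/gfc",
        "http://www.w3.org/ns/dcat#",
        "http://www.isotc211.org/2005/gmd",
        "http://standards.iso.org/iso/19115/-3/mdb/2.0",
    ]},
    "CONSTRAINTLANGUAGE": {"values": ["FILTER", "CQL_TEXT"]},
}


def is_supported(parameters, name, value):
    """Return True if `value` is advertised for parameter `name`."""
    return value in parameters.get(name, {}).get("values", [])


gmd_ok = is_supported(get_records_params, "outputSchema",
                      "http://www.isotc211.org/2005/gmd")
json_ok = is_supported(get_records_params, "outputFormat", "application/json")
print(gmd_ok, json_ok)
```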

#### Get supported constraint languages

In [5]:
csw.getdomain('GetRecords.CONSTRAINTLANGUAGE')
csw.results

{'type': 'csw:Record',
 'parameter': 'GetRecords.CONSTRAINTLANGUAGE',
 'values': []}

#### Get the supported elementsets

In [6]:
csw.getdomain('GetRecords.ElementSetName')
csw.results

{'type': 'csw:Record',
 'parameter': 'GetRecords.ElementSetName',
 'values': ['brief', 'summary', 'full']}

#### Get supported output formats

In [7]:
csw.getdomain('GetRecords.outputFormat')
csw.results

{'type': 'csw:Record',
 'parameter': 'GetRecords.outputFormat',
 'values': ['application/xml']}

#### Get supported output schemas

In [8]:
csw.getdomain('GetRecords.outputSchema')
csw.results

{'type': 'csw:Record',
 'parameter': 'GetRecords.outputSchema',
 'values': ['http://www.isotc211.org/2005/gmd',
  'http://www.opengis.net/cat/csw/2.0.2',
  'http://www.isotc211.org/2005/gfc',
  'http://www.w3.org/ns/dcat#',
  'http://standards.iso.org/iso/19115/-3/mdb/2.0']}

#### Search data by using OGC Filter Encoding

In [9]:
from owslib.fes import PropertyIsEqualTo, PropertyIsLike

> ##### Example: search for AnyText fields that equal "Mediterranean Sea"
'AnyText' searches for the term within any of the text fields of the metadata records

In [10]:
anytext_query = PropertyIsEqualTo('apiso:AnyText', 'Mediterranean Sea')
csw.getrecords2(constraints=[anytext_query], maxrecords=5, esn='full', outputschema='http://www.isotc211.org/2005/gmd')
print(csw.results)
for rec in csw.records:
    print(csw.records[rec].identification[0].title)
    print(csw.records[rec].identification[0].abstract)
    print("----")

{'matches': 758, 'returned': 5, 'nextrecord': 6}
Modelled occurrence probability for Posidonia oceanica meadows across the Mediterranean Sea
This dataset is an output of the “Mediterranean Sensitive Habitats” project (MEDISEH). It shows under a raster form modelled spatial distributions of Posidonia oceanica across the Mediterranean Sea. Posidonia oceanica is endemic to the Mediterranean Sea, where it is the dominant seagrass, covering about 50,000 km2 of coastal to offshore sandy and rocky areas down to depths of about 45 m. P. oceanica is a protected species according to EU legislation (Habitat directive), the Bern and Barcelona Conventions and several national legislations. While its distribution is well documented along the European shores of the western Mediterranean Sea, limited information is available about the southern shore and the eastern Mediterranean Sea. In order to bridge this information gap, one of the goals of Task 1.3 of the MEDISEH project was to model P. oceani
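
The `csw.results` dictionary above reports 758 matches but only 5 returned records, with `nextrecord` pointing at the start of the next page. The sketch below shows one way to walk through all pages. It is written against a generic `fetch_page` callable so that it can run offline with a fake catalogue; `iter_pages` and `fake_fetch` are illustrative names, not OWSLib functions. With OWSLib, `fetch_page` could wrap `csw.getrecords2(..., startposition=start, maxrecords=size)` and return `(csw.results, list(csw.records.values()))`.

```python
def iter_pages(fetch_page, page_size=5):
    """Yield one list of records per page until the service reports no more.

    fetch_page(start, size) must return (results, records), where results
    is shaped like csw.results: {'matches', 'returned', 'nextrecord'}.
    """
    start = 1  # assumption: positions are 1-based (nextrecord was 6 after 5 records)
    while True:
        results, records = fetch_page(start, page_size)
        if records:
            yield records
        next_record = results.get("nextrecord", 0)
        # Stop when nextrecord is 0, does not advance, or points past the end
        if (not next_record or next_record <= start
                or next_record > results.get("matches", 0)):
            break
        start = next_record


def fake_fetch(start, size, total=12):
    """Stand-in for a live call, simulating a catalogue with `total` records."""
    ids = list(range(start, min(start + size, total + 1)))
    next_record = ids[-1] + 1 if ids and ids[-1] < total else 0
    results = {"matches": total, "returned": len(ids), "nextrecord": next_record}
    return results, ids


pages = list(iter_pages(fake_fetch, page_size=5))
print(pages)  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12]]
```

Keeping pages small and iterating is gentler on the service than requesting a very large `maxrecords` in a single call.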

> ##### Example: search for title fields that contain "Physics"
'Title' searches for the term solely within dataset titles

In [11]:
title_query = PropertyIsLike('apiso:Title', '%Physics%', wildCard="%")
csw.getrecords2(constraints=[title_query], maxrecords=5, esn='full', outputschema='http://www.isotc211.org/2005/gmd')
print(csw.results)
for rec in csw.records:
    print(csw.records[rec].identification[0].title)
    print(csw.records[rec].identification[0].abstract)
    print("----")

{'matches': 22, 'returned': 5, 'nextrecord': 6}
EMODnet Physics - Deseasonalized Sea Level monthly means Global Oceans
EMODnet Physics - Deseasonalized Sea Level monthly means Global Oceans. This product is based, uses and reprocess the CMEMS product id. SEALEVEL_GLO_PHY_CLIMATE_L4_REP_OBSERVATIONS_008_057.
----
EMODnet Physics - Registry of continuous noise monitoring sites
This product displays the fixed stations and transects to assess continuous noise. Data are collected from ICES DB (https://www.ices.dk/data/data-portals/Pages/Continuous-Noise.aspx), the JOMOPANS (https://jomopansgestool.au.dk/en/about) project and QUIETSEA (https://quietseas.eu/).
----
geophysics_pol_index
Geophysics that are shown as polygons. Sometimes the real position of geophysical lines cannot be shown because of confidentiality reasons and in this case a polygon that shows the approximate location is used instead. In other cases the geophysics is best represented by a polygon – for example for 3D seismic s
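
For readers curious about what a filter looks like on the wire, the sketch below hand-builds an OGC Filter Encoding 1.1-style `PropertyIsLike` element with the standard library. It is an approximation for illustration only: `property_is_like_xml` is a made-up helper, and the exact XML that OWSLib generates (attribute defaults, extra attributes) may differ.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("ogc", OGC_NS)  # serialize with the usual "ogc:" prefix


def property_is_like_xml(property_name, pattern, wildcard="%",
                         single_char="_", escape_char="\\"):
    """Build a Filter containing one PropertyIsLike comparison (sketch)."""
    flt = ET.Element(f"{{{OGC_NS}}}Filter")
    like = ET.SubElement(
        flt,
        f"{{{OGC_NS}}}PropertyIsLike",
        {"wildCard": wildcard, "singleChar": single_char,
         "escapeChar": escape_char},
    )
    ET.SubElement(like, f"{{{OGC_NS}}}PropertyName").text = property_name
    ET.SubElement(like, f"{{{OGC_NS}}}Literal").text = pattern
    return ET.tostring(flt, encoding="unicode")


xml_text = property_is_like_xml("apiso:Title", "%Physics%")
print(xml_text)
```

The `wildCard` attribute tells the server which character in the literal stands for "any sequence of characters", which is why the title query passes `wildCard="%"` alongside the `%Physics%` pattern.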

#### Search with a CQL query
CQL lets you combine search terms and filters into more elaborate queries that narrow down the results. Here we look for records whose text contains "Sea Level" and whose title contains "Absolute".

> Example: search for "Sea Level" in AnyText fields and "Absolute" in titles

In [12]:
# Note: this is OGC CQL (the CQL_TEXT constraint language advertised by the service), not Confluence's CQL.
cql_query = "csw:AnyText like '%Sea Level%' AND dc:title like '%Absolute%'"
csw.getrecords2(cql=cql_query, maxrecords=5, esn='full', outputschema='http://www.isotc211.org/2005/gmd')
print(csw.results)
for rec in csw.records:
    print(csw.records[rec].identification[0].title)
    print(csw.records[rec].identification[0].abstract)
    print("----")

{'matches': 7, 'returned': 5, 'nextrecord': 6}
EMODnet Physics - Absolute Sea Level Trend
EMODnet Physics - Absolute Sea Level Trends - trends are derived from the DUACS delayed-time (DT-2018 version) altimeter gridded maps of sea level anomalies based on a stable number of altimeters (two) in the satellite constellation
----
EMODnet Physics - Map of the Absolute Sea Level Trend (DUACS) - ERDDAP
EMODnet Physics - Absolute Sea Level Trends - trends are derived from the DUACS delayed-time (DT-2018 version) altimeter gridded maps of sea level anomalies based on a stable number of altimeters (two) in the satellite constellation.
----
EMODnet Physics - Map of the Absolute Sea Level Trend (GLORYS12V) - ERDDAP
EMODnet - Regional sea level trends are derived from the GLORYS12v1 delayed-time (DT-2018 version) altimeter gridded maps of sea level anomalies based on a stable number of altimeters (two) in the satellite constellation
----
EMODnet Physics - Map of the Absolute Sea Level Trend (DUACS)
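
When search terms come from user input, assembling the CQL string by hand becomes error-prone. The sketch below shows a small pair of helpers that rebuild the query used above. `cql_like` and `cql_and` are hypothetical names, and doubling single quotes to escape them is an assumption based on SQL-style quoting; check how the target service handles quotes before relying on it.

```python
def cql_like(prop, term):
    """Return a CQL 'like' clause matching `term` anywhere in `prop`."""
    escaped = term.replace("'", "''")  # assumption: SQL-style quote doubling
    return f"{prop} like '%{escaped}%'"


def cql_and(*clauses):
    """Join clauses so that all of them must match."""
    return " AND ".join(clauses)


cql = cql_and(
    cql_like("csw:AnyText", "Sea Level"),
    cql_like("dc:title", "Absolute"),
)
print(cql)  # csw:AnyText like '%Sea Level%' AND dc:title like '%Absolute%'
```

The resulting string can be passed as the `cql` argument of `csw.getrecords2`.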

#### Get record metadata
Choose one record from the list returned above and inspect it in more detail.

In [13]:
from pprint import pprint
records = [csw.records[rec] for rec in csw.records]
record = records[0]
pprint(vars(record.identification[0]))

{'abstract': 'EMODnet Physics - Absolute Sea Level Trends - trends are derived '
             'from the DUACS delayed-time (DT-2018 version) altimeter gridded '
             'maps of sea level anomalies based on a stable number of '
             'altimeters (two) in the satellite constellation',
 'abstract_url': None,
 'accessconstraints': [],
 'aggregationinfo': '',
 'alternatetitle': None,
 'bbox': <owslib.iso.EX_GeographicBoundingBox object at 0x000001C4D9C6AD50>,
 'classification': [],
 'contact': [<owslib.iso.CI_ResponsibleParty object at 0x000001C4D98F9150>],
 'contributor': [],
 'creator': [],
 'date': [<owslib.iso.CI_Date object at 0x000001C4DA05C050>],
 'datetype': [],
 'denominators': [],
 'distance': [],
 'edition': None,
 'extent': <owslib.iso.EX_Extent object at 0x000001C4D9947810>,
 'graphicoverview': [],
 'identtype': 'dataset',
 'keywords': [<owslib.iso.MD_Keywords object at 0x000001C4D9947B90>,
              <owslib.iso.MD_Keywords object at 0x000001C4DA15DC10>,
      

#### Get record data
Access the online distribution endpoints of the selected record.

In [14]:
for resource in record.distribution.online:
    print('Description: ', resource.description)
    print('Protocol: ', resource.protocol)
    print('URL: ', resource.url)
    print("---")

Description:  EMODNET_SEA_LEVEL_TREND
Protocol:  OGC:WMS
URL:  https://prod-erddap.emodnet-physics.eu/ncWMS/wms?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.3.0&DATASET=EMODNET_SEA_LEVEL_TREND
---
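
A record can list several online resources with different protocols. The sketch below groups URLs by protocol so that, for instance, all WMS endpoints can be picked out at once. It uses `SimpleNamespace` stand-ins with the same `description`, `protocol` and `url` attributes as the objects printed above, so it runs offline; the second entry is invented for illustration, and `group_by_protocol` is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_protocol(resources):
    """Map each protocol to the list of URLs that use it."""
    grouped = defaultdict(list)
    for resource in resources:
        if resource.url:  # skip entries without a URL
            grouped[resource.protocol or "unknown"].append(resource.url)
    return dict(grouped)


# Stand-ins for record.distribution.online
online = [
    SimpleNamespace(
        description="EMODNET_SEA_LEVEL_TREND",
        protocol="OGC:WMS",
        url="https://prod-erddap.emodnet-physics.eu/ncWMS/wms?SERVICE=WMS"
            "&REQUEST=GetCapabilities&VERSION=1.3.0&DATASET=EMODNET_SEA_LEVEL_TREND",
    ),
    # Invented entry showing a resource without a declared protocol
    SimpleNamespace(description="hypothetical download", protocol=None,
                    url="https://example.org/data.zip"),
]

grouped = group_by_protocol(online)
print(sorted(grouped))  # ['OGC:WMS', 'unknown']
```

In a live session, `group_by_protocol(record.distribution.online)` would apply the same grouping to the selected record.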


### Store queried data and save as a CSV file
The example below brings together the concepts covered above. It runs a simple title query (any of the previously showcased queries would work), stores the returned metadata in a DataFrame, and writes it to a .csv file that can be opened in Excel, for example.

In [15]:
import pandas as pd
query_word = 'Baltic'

# Set the CSW query properties
title_query = PropertyIsLike('apiso:Title', f'%{query_word}%', wildCard="%")
csw.getrecords2(constraints=[title_query], maxrecords=300, esn='full', outputschema='http://www.isotc211.org/2005/gmd')
print(csw.results)

# Create a dictionary of empty lists, one per output column
data_dict = {'Title': [], 'Abstract': [], 'URLs': [], 'Protocols': []}

# Loop through the retrieved records
for rec in csw.records:
    record = csw.records[rec]
    # Extract relevant information from the record
    title = record.identification[0].title
    abstract = record.identification[0].abstract
    
    # Initialize lists to store URLs and protocols for each record title
    urls = []
    protocols = []
    
    # Collect URLs and protocols associated with each record title
    for resource in record.distribution.online:
        if resource.url:
            urls.append(resource.url)
        if resource.protocol:
            protocols.append(resource.protocol)
    
    # Combine URLs and protocols into single strings
    urls_str = '; \n'.join(urls)  
    protocols_str = '; \n'.join(protocols)
    
    # Append data to dictionary
    data_dict['Title'].append(title)
    data_dict['Abstract'].append(abstract)
    data_dict['URLs'].append(urls_str)
    data_dict['Protocols'].append(protocols_str)

# Create a DataFrame from the dictionary
df = pd.DataFrame(data_dict)

# Sort the DataFrame by 'Title' column and reset index
df_sorted = df.sort_values(by='Title').reset_index(drop=True)

# Specify the CSV file path
csv_file_path = 'data/CSW_data.csv'

# Open the file with newline='' so that the CSV writer controls line endings
# (this avoids blank rows between records on Windows)
with open(csv_file_path, 'w', newline='') as csv_file:
    # Write the DataFrame to the CSV file; the file is closed automatically
    df_sorted.to_csv(csv_file, index=False)

print(f"Data saved to {csv_file_path}")

{'matches': 112, 'returned': 112, 'nextrecord': 0}
Data saved to data/CSW_data.csv
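
Because URLs and protocols are joined with `'; \n'`, some CSV cells span several lines. The sketch below uses the standard `csv` module and a made-up record to show that such cells survive a write and read round trip as long as the file is opened with `newline=''` both times. It writes to a temporary directory rather than to `data/CSW_data.csv`.

```python
import csv
import os
import tempfile

# Made-up record with a multi-line URLs cell, joined as in the cell above
rows = [
    {"Title": "Example record (made up)",
     "URLs": "https://example.org/wms; \nhttps://example.org/wfs"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "CSW_data.csv")

    # newline='' lets the csv module handle line endings itself
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Title", "URLs"])
        writer.writeheader()
        writer.writerows(rows)

    # Read the file back; the quoted multi-line cell is restored intact
    with open(path, newline="", encoding="utf-8") as handle:
        loaded = list(csv.DictReader(handle))

urls = loaded[0]["URLs"].split("; \n")
print(urls)  # ['https://example.org/wms', 'https://example.org/wfs']
```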


### [>> Next: Visualize data using the OGC Web Map Service (WMS)](Tutorial_Part_2_WMS.ipynb) 

<hr>

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img style="float: right; border-width: 0" alt="Creative Commons License" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>