# 4.3 Accessing metadata via Python from a Jupyter Notebook
Now that we know how searching for metadata works through geoportals, let's look behind the HTML interface. Whenever you filter datasets in a geoportal, requests are sent to the CSW in the background, and these requests are constructed the same way for every CSW. Before we start, let's quickly set up the other components we need. OWSLib is a Python library for client programming with the OGC web service interface standards. It lets users access and use geospatial data from online services such as WMS or CSW via Python. After installing the library, you also need to import the CSW class from OWSLib and some filter classes for later use.

In [2]:
%pip install OWSLib
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsEqualTo, BBox, PropertyIsLike





Note: you may need to restart the kernel to use updated packages.


Now we need the URL of the INSPIRE CSW. If you search for "INSPIRE CSW" in your browser, the catalogue service should be one of the first links (https://inspire-geoportal.ec.europa.eu/GeoportalProxyWebServices/resources/OGCCSW202?request=GetCapabilities&service=CSW). This link is the GetCapabilities request of the CSW, and its response contains all the metadata about the catalogue service.
To use the INSPIRE catalogue service with the CSW class from OWSLib, you only need the first part of the URL, up to the "?". With that and the CatalogueServiceWeb() constructor, we can create a connection to the CSW.

In [3]:
CSW_URL = 'https://inspire-geoportal.ec.europa.eu/GeoportalProxyWebServices/resources/OGCCSW202'
csw = CatalogueServiceWeb(CSW_URL)
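
The split between the part of the URL before the "?" and the request parameters after it can be sketched with the standard library alone. This is only an illustration of how such a key-value request is put together, not what OWSLib does internally. The `version` value in the second URL is an assumption derived from the "202" in the endpoint name.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

capabilities_url = (
    "https://inspire-geoportal.ec.europa.eu/GeoportalProxyWebServices/"
    "resources/OGCCSW202?request=GetCapabilities&service=CSW"
)

parts = urlsplit(capabilities_url)

# Endpoint: everything before the "?"
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

# Query string: key-value pairs naming the service and the requested operation
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(base_url)
print(params)

# The same endpoint with another "request" value addresses another operation.
# Assumption: version 2.0.2, guessed from "OGCCSW202" in the endpoint name.
describe_url = base_url + "?" + urlencode(
    {"service": "CSW", "version": "2.0.2", "request": "DescribeRecord"}
)
print(describe_url)
```

The `base_url` value is the same string that is passed to `CatalogueServiceWeb()` above.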

Let's first see which operations the INSPIRE CSW has.

In [11]:
[op.name for op in csw.operations]

['GetCapabilities', 'DescribeRecord', 'GetRecords', 'GetRecordById']

To search for a dataset we use the GetRecords operation, so we first need its metadata: we want to know which values we can search on.

In [10]:
for constraint in csw.constraints:
    print(f"  {constraint}")


  IsoProfiles
  PostEncoding


Now look at the SupportedISOQueryables constraint of GetRecords and print its values. The constraint is listed twice in GetRecords, but we only want to print it once to get a better overview.

In [12]:
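# Look up the GetRecords operation in the capabilities metadata
op = next(op for op in csw.operations if op.name == 'GetRecords')

# SupportedISOQueryables is listed twice, so stop after the first occurrence
for constraint in op.constraints:
    if 'SupportedISOQueryables' in constraint.name:
        print(constraint.values)
        break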

There are quite a few values we can search on, but not all of them are helpful for us.

In [32]:
# getrecords2() returns None; it stores the hits in csw.records
csw.getrecords2(constraints=[PropertyIsEqualTo('csw:AnyText', 'Hannover')])
if csw.records:
    # Print the title of each returned record
    for rec in csw.records.values():
        print(rec.title)
else:
    print("No results found.")


In [79]:
# Define the search criteria
search_text = "Hannover"
search_attribute = "Title"

# Build the filter (named title_filter to avoid shadowing the built-in filter())
title_filter = PropertyIsLike(propertyname=search_attribute, literal=f"*{search_text}*", escapeChar="\\", wildCard="*")

# Run the GetRecords request with the filter; the hits are stored in csw.records
csw.getrecords2(constraints=[title_filter], maxrecords=10)

# Check whether any records were returned
if csw.records:
    # Print the title of each returned record
    for rec in csw.records.values():
        print(rec.title)
else:
    print("No results found.")


In [21]:
# Bounding box around Hannover, given as [min lat, min lon, max lat, max lon]
bbox_query = BBox([52.244094, 9.212036, 52.574420, 10.318909])
csw.getrecords2(constraints=[bbox_query])
print(csw.results)

# csw.results only holds the counters; the records themselves are in csw.records
for rec in csw.records.values():
    print(rec.title)


In [15]:
# getrecords2() returns None, so print the result counters instead of its return value
csw.getrecords2(constraints=[PropertyIsEqualTo('dc:identifier', '412966c0-a49f-40d2-8419-d7eca338da0f')], maxrecords=1)
print(csw.results)


In [22]:
# csw.records is a dict keyed by record identifier, so it cannot be indexed with 0
for rec in csw.records.values():
    print(rec.title)
    print(rec.references)

In [23]:
# The capabilities are already fetched when the CatalogueServiceWeb object is created,
# so no separate call is needed (the class has no getcapabilities() method)

# Print the available search criteria
for op in csw.operations:
    if 'GetRecords' in op.name:
        print(f"\nSearch criteria for {op.name}:")
        for constraint in op.constraints:
            print(f"  {constraint.name}")

In [36]:
csw.getrecords2()
print(csw.results)

{'matches': 250203, 'returned': 10, 'nextrecord': 11}


In [52]:
hannover_query = PropertyIsLike('csw:title', '%Hannover%')
csw.getrecords2(constraints=[hannover_query])
csw.results

{'matches': 35, 'returned': 10, 'nextrecord': 11}

In [63]:
for recid in csw.records:
    record = csw.records[recid]
    print(record.title)
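
The result counters above report 35 matches but only 10 returned records, so a single request does not show every hit. A minimal sketch of paging through the remaining records follows. The helper name `collect_titles` and its parameters are hypothetical, and the sketch assumes that the server sets `nextrecord` to 0 once the last page has been delivered.

```python
def collect_titles(csw, constraints, page_size=10, max_total=50):
    """Collect record titles page by page (hypothetical helper, not part of OWSLib)."""
    titles = []
    start = 0  # default start position of getrecords2()
    while len(titles) < max_total:
        # Each call replaces csw.records and csw.results with the current page
        csw.getrecords2(constraints=constraints,
                        startposition=start,
                        maxrecords=page_size)
        titles.extend(rec.title for rec in csw.records.values())

        returned = csw.results.get('returned', 0)
        next_record = csw.results.get('nextrecord', 0)
        matches = csw.results.get('matches', 0)

        # Stop when the page is empty or no further page is announced
        # (assumption: nextrecord == 0 marks the end of the result set)
        if returned == 0 or next_record == 0 or next_record > matches:
            break
        start = next_record
    return titles[:max_total]
```

With the connection and filter from the cells above, `collect_titles(csw, [hannover_query])` would request the pages one after another. The `max_total` cap keeps an unfiltered query with hundreds of thousands of matches from being downloaded by accident.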