# 4.3 Accessing metadata via Python from a Jupyter Notebook
Now that we know what an implemented interface of a catalog service looks like, we want to take a look behind geoportals and software such as QGIS. We will now write our own software in a Jupyter notebook that interacts with OGC CSW catalog services.  
But before we can start, let's quickly set up the other components we need. OWSLib is a Python library for client programming with OGC web services. It allows users to access and use geospatial data from online services such as WMS or CSW via Python. After installing the library, you also need to import the CatalogueServiceWeb class from OWSLib and some filter classes for later use.  
If you're interested in OWSLib and want to look into it further, you can find detailed documentation at https://owslib.readthedocs.io/en/latest/.

In [1]:
%pip install OWSLib
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsEqualTo, BBox, PropertyIsLike


Note: you may need to restart the kernel to use updated packages.





Now we need the URL of a CSW. Let's use the catalog service from the previous examples. If you search for "geodatenkatalog csw" in your preferred browser, the catalog service should be among the first links (https://gdk.gdi-de.org/gdi-de/srv/ger/csw?service=CSW&version=2.0.2&REQUEST=GetCapabilities&SERVICE=CSW). This URL is the GetCapabilities request of the CSW; remember that it returns all the metadata about the catalog service itself. You could interact with the service in your browser, too, but that is trickier because you have to write every request into the URL.  
To use the catalog service with the CatalogueServiceWeb class from OWSLib, you only need the first part of the URL, up to but not including the "?". With that and CatalogueServiceWeb(), we can create a connection to the CSW.
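
Trimming the URL by hand works, but the split can also be done with Python's standard library. The following is a minimal sketch; the names `full_url`, `base_url` and `params` are only illustrative and not part of OWSLib.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

# GetCapabilities URL as found via the browser search
full_url = ("https://gdk.gdi-de.org/gdi-de/srv/ger/csw"
            "?service=CSW&version=2.0.2&REQUEST=GetCapabilities&SERVICE=CSW")

parts = urlsplit(full_url)

# Keep scheme, host and path; drop the query string and the fragment
base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# The query string holds the parameters of the GetCapabilities request
params = parse_qs(parts.query)

print(base_url)
print(params["REQUEST"])
```

The value of `base_url` is what gets passed to CatalogueServiceWeb(); the parameters in `params` are the part that OWSLib builds for us.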

In [2]:
CSW_URL = 'https://gdk.gdi-de.org/gdi-de/srv/ger/csw'
csw = CatalogueServiceWeb(CSW_URL)

Yeah we got a connection!    

------

At the beginning of the module we learned about the operations of a CSW. Can you remember any besides the GetCapabilities() operation? If not, here is a reminder of the main operations:

1. GetCapabilities(): This operation allows retrieving the available functions and properties of the CSW, including supported operations and data schema.

2. DescribeRecord(): This operation allows retrieving information about the available metadata records, including the metadata schema and supported metadata elements. 

3. GetRecords(): This operation enables retrieving metadata records based on specific criteria such as keywords, spatial extents, or time intervals.

4. GetRecordById(): This operation allows retrieving metadata records by their unique identification number. 

5. Transaction(): This operation supports adding, updating, or deleting metadata records in the CSW.


Let's take a look at the operations that the catalog service offers.

In [3]:
op = [op.name for op in csw.operations]
print(op)

['GetCapabilities', 'DescribeRecord', 'GetDomain', 'GetRecords', 'GetRecordById', 'Transaction', 'Harvest']


-------
The GetRecords() operation of the CSW is implemented by the getrecords2() method of OWSLib. You can pass "constraints" to the method to filter the catalog. With the maxrecords argument you can also set how many records are returned at most, starting from the first match (the default is 10).
To build filters, there are classes such as PropertyIsLike() or PropertyIsEqualTo(), which we imported at the beginning. Each takes one string naming the element to search in and one string with the keyword to search for. Elements you can use include, for example:
- csw:Title: searches for a specific title
- csw:AnyText: finds all records that contain the given string anywhere.

After that, you can read the result summary from the results attribute of the csw object.


In [4]:
NSG_filter = PropertyIsLike('csw:Title', '%Naturschutzgebiete%')
csw.getrecords2(constraints=[NSG_filter], maxrecords=20)
csw.results

{'matches': 89, 'returned': 20, 'nextrecord': 21}
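
The result summary shows that only 20 of the 89 matches were returned. One way to fetch the remaining records is to call getrecords2() repeatedly and move the start position forward with its startposition argument. The helper below is a hedged sketch: `iter_all_records`, `page_size` and `limit` are hypothetical names introduced here, and the stop condition assumes that the server reports either 0 or a position beyond the match count in 'nextrecord' on the last page.

```python
def iter_all_records(csw, constraints, page_size=20, limit=100):
    """Yield records page by page until `limit` records or the last page is reached."""
    start = 1  # CSW record positions are 1-based
    yielded = 0
    while yielded < limit:
        # Each call replaces csw.records and csw.results with the current page
        csw.getrecords2(constraints=constraints,
                        startposition=start,
                        maxrecords=page_size)
        if not csw.records:
            break
        for rec in csw.records.values():
            yield rec
            yielded += 1
            if yielded >= limit:
                return
        next_record = csw.results["nextrecord"]
        # Assumption: 0 or a position past the match count marks the last page
        if next_record == 0 or next_record > csw.results["matches"]:
            break
        start = next_record


# Usage with the connection and filter from above (needs network access):
# for rec in iter_all_records(csw, [NSG_filter], page_size=20, limit=89):
#     print(rec.title)
```

The `limit` argument is a safety net so that a very broad filter does not download the whole catalog by accident.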

To get more informative output from your results, for example the titles, you can loop over the records and print them.

In [8]:
for rec in csw.records:
    print(csw.records[rec].title)
    print(csw.records[rec].abstract)
    print(" ")

Naturschutzgebiete in den Landkreisen Nordvorpommern und Rügen in Übersicht
Das StAUN HST verwaltet 32  NSG in den LK Norrdvorpommern und Rügen, die in der Karte "Naturschutzflächen Mecklenburg-Vorpommern", Ausgabe 1995 des Landesvermessungsamtes M-V
(1:250000) anhand von Schutzgebietsnummern registriert sind.
Schutzgebietsnummern für NSG mit Verordnung: 4, 9, 13, 14, 16, 18, 43 B, 210, 252, 253, 254, 255,
256, 257, 273, 276, 285, 286, 292, 311.
Schutzgebietsnummern für NSG mit Behandlungsrichtlinien: 4, 9, 13, 14, 16, 18, 21, 22, 23, 43 A, 46, 62, 80, 83, 128, 129, 130, 294, 295.
Für einige Schutzgebiete liegen Gutachten zur Renaturierung bzw. Studien zum gegenwärtigen Zustand vor.
 
Dez.210 Arten- und Biotopschutz, Naturschutzgebiete Amtsbereich StALU MS Neubrandenburg
Das Dezernat 210 "Arten- und Biotopschutz, Naturschutzgebiete" hat als Hauptaufgabe die Erhaltung des vorhandenen Arteninventars der freilebenden Tiere und Pflanzen und der Ökosysteme zu sichern.

Das Dezernat 210 ist 
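
As the output shows, full abstracts can become quite long. If a compact overview is enough, the abstracts can be shortened before printing. This is a small sketch using only the standard library; `summarize` and `width` are hypothetical names, and it only assumes that each record has a `title` and an `abstract` attribute, as used in the loop above.

```python
import textwrap


def summarize(records, width=120):
    """Return one 'title: shortened abstract' line per record."""
    lines = []
    for rec in records.values():
        # The abstract may be missing, so fall back to an empty string
        short = textwrap.shorten(rec.abstract or "", width=width, placeholder=" ...")
        lines.append(f"{rec.title}: {short}")
    return lines


# Usage with the records from the previous query:
# for line in summarize(csw.records, width=100):
#     print(line)
```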

You can also filter with BBox(), which makes it possible to search for datasets in a specific area. BBox() takes a list of coordinates describing two corner points, here in this order: [latMin, longMin, latMax, longMax]. Keep in mind that the axis order a server expects can depend on the coordinate reference system, which BBox() accepts as an optional crs argument.

In [9]:
bbox_query = BBox([52.839976, 7.474823, 53.098018, 7.911530])
csw.getrecords2(constraints=[bbox_query])
csw.results

{'matches': 3051, 'returned': 10, 'nextrecord': 11}
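
Because it is easy to mix up the corner coordinates, it can help to check them before building the filter. The sketch below is not part of OWSLib; `make_bbox` is a hypothetical helper, and it returns the values in the order used in this section (latitude first), which is an assumption that may not hold for every server or coordinate reference system.

```python
def make_bbox(lat_min, lon_min, lat_max, lon_max):
    """Validate two corner points and return them as a list for BBox()."""
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError("latitudes must satisfy -90 <= lat_min < lat_max <= 90")
    if not (-180 <= lon_min < lon_max <= 180):
        raise ValueError("longitudes must satisfy -180 <= lon_min < lon_max <= 180")
    # Order as used in this section: [latMin, longMin, latMax, longMax]
    return [lat_min, lon_min, lat_max, lon_max]


bbox_values = make_bbox(52.839976, 7.474823, 53.098018, 7.911530)
print(bbox_values)

# The validated list can then be passed on: BBox(bbox_values)
```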

If you already know the identifier of the dataset you want and would like to check whether it is in the catalog, you can search for it with the getrecordbyid() method. It corresponds to the GetRecordById operation of the CSW.

In [10]:
csw.getrecordbyid(id=['EE85FE8F-BD05-4A6D-813B-6ABC4514B18B'])
csw.records['EE85FE8F-BD05-4A6D-813B-6ABC4514B18B'].title


'Naturschutzgebiete (NSG)'
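
For comparison, this is roughly what such a request looks like when it is written into the URL by hand, as mentioned earlier for the browser. The sketch only builds the URL and does not send it. The parameter names follow the key-value encoding of CSW 2.0.2 as far as we know it; `build_getrecordbyid_url` is a hypothetical helper, and whether this particular server accepts exactly these parameters has not been checked here.

```python
from urllib.parse import urlencode

CSW_URL = "https://gdk.gdi-de.org/gdi-de/srv/ger/csw"


def build_getrecordbyid_url(base_url, record_id):
    """Build a key-value-pair GetRecordById request URL (not sent here)."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecordById",
        "id": record_id,
        "elementSetName": "full",  # ask for the complete record
    }
    return f"{base_url}?{urlencode(params)}"


url = build_getrecordbyid_url(CSW_URL, "EE85FE8F-BD05-4A6D-813B-6ABC4514B18B")
print(url)
```

OWSLib assembles and sends an equivalent request for us, which is why a single call to getrecordbyid() is enough in the notebook.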

-------
So now that you know how to search a CSW with OWSLib, try to find the dataset that we found in the geoportal in the last exercise, this time with the getrecords2() method.

In [None]:
# define filters
# my_filter = ...  (fill in your own filter here)

# GetRecords() method; getrecords2() returns None and stores
# its results on the csw object
csw.getrecords2(constraints=[my_filter], maxrecords=10)

# If there are results, print their titles
if csw.results['matches'] > 0:
    # Show results
    for rec in csw.records.values():
        print(rec.title)
else:
    print("No results found. Please try adjusting your filters.")

------
# Example Answer
------

In [None]:
# define filters
# my_filter = ...  (fill in your own filter here)

# GetRecords() method; getrecords2() returns None and stores
# its results on the csw object
csw.getrecords2(constraints=[my_filter], maxrecords=10)

# If there are results, print their titles
if csw.results['matches'] > 0:
    # Evaluate the results
    for rec in csw.records.values():
        print(rec.title)
else:
    print("No results found. Please try adjusting your filters.")

-------
## Testing  

-------


In [12]:
csw.getrecords2()
print(csw.results)

{'matches': 600007, 'returned': 10, 'nextrecord': 11}


In [None]:
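# The capabilities document is already fetched and parsed when
# CatalogueServiceWeb(CSW_URL) is created, so no extra request is needed here.

# print the available search criteria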
for op in csw.operations:
    if 'GetRecords' in op.name:
        print(f"\nSearch criteria for {op.name}:")
        for constraint in op.constraints:
            print(f"  {constraint.name}")