#### **Open-Data-Management-SH**

#### `Name & Surname: Abdulkadir Arslan`

##### Note: I converted all outputs to JSON files to keep them human-readable and included them in the zip file. The only exception is the output of Question-2, which is saved as a JSON-LD file, as requested.

##### Introduction: Please use the portal “Open-Data Schleswig-Holstein” (https://opendata.schleswig-holstein.de) to develop Python programs that demonstrate the use of data schemas, RDF files, SPARQL, and CKAN according to the following guidelines:

##### Question-1: Select current data sets that are in the public domain (“gemeinfrei”), support linked data, and are available as RDF files. Download the newest versions of the RDF files for data processing.

##### Answer-1: I selected current data sets by applying the filters 'gemeinfrei', 'linked data', and 'RDF' on “Open-Data Schleswig-Holstein” (https://opendata.schleswig-holstein.de). I then downloaded the newest versions of the following RDF files, which the remaining questions use: 'schulen-2023-11-01.rdf' and 'polizeidienststellen-2023-10-26.rdf'.

##### Question-2: Develop a Python program to demonstrate the use of RDF files by using the Python package RDFLib. Examine the RDF file for schools (“Schulen”). Which ‘schema.org’ data schema is used? What does the data schema look like in JSON-LD format?

##### Answer-2:

In [1]:
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

file_path_schulen = 'schulen-2023-11-01.rdf'
graph_schulen = Graph()
graph_schulen.parse(file_path_schulen, format='xml')

schema_ns_schulen = Namespace('http://schema.org/')

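# First pass: report the schema.org namespace if any predicate belongs to it.
schema_used = None
for s, p, o in graph_schulen.triples((None, None, None)):
    if str(p).startswith(str(schema_ns_schulen)):
        schema_used = schema_ns_schulen
        break

# Fallback: look for an rdf:type object in the schema.org namespace
# and report its local name (for example 'School').
if schema_used is None:
    for s, p, o in graph_schulen.triples((None, RDF.type, None)):
        if str(o).startswith(str(schema_ns_schulen)):
            schema_used = str(o)[len(str(schema_ns_schulen)):]
            break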

print(f'Data schema used: {schema_used}')

jsonld_data_schulen = graph_schulen.serialize(format='json-ld', indent=2)

print('\nJSON-LD format:')
print(jsonld_data_schulen)

output_file_path_schulen = 'schulen_sh.jsonld'
with open(output_file_path_schulen, 'w', encoding='utf-8') as output_file:
    output_file.write(jsonld_data_schulen)

print(f'\nJSON-LD data is saved')

Data schema used: http://schema.org/

JSON-LD format:
[
  {
    "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9115925",
    "@type": [
      "http://schema.org/School"
    ],
    "http://schema.org/address": [
      {
        "@id": "_:N9b4824494d5049c8a7dcd775e3ffab5a"
      }
    ],
    "http://schema.org/areaServed": [
      {
        "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_gebiet_id=9007406"
      }
    ],
    "http://schema.org/email": [
      {
        "@value": "foez-ge.Norderstedt@schule.landsh.de"
      }
    ],
    "http://schema.org/employee": [
      {
        "@id": "_:N7783b87563c748bda6f5ed185a410e23"
      }
    ],
    "http://schema.org/faxNumber": [
      {
        "@value": "+49 40 5224010"
      }
    ],
    "http://schema.org/location": [
      {
        "@id": "_:N0bc795cb53d449a19596357b2ea7ba95"
      }
    ],
    "http://schema.org/makesOffer": [
      {
        "@id": "https://zufish.schleswig-holstein.de/portald

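The expanded JSON-LD above already answers the schema question: each node carries an `@type` and a set of property IRIs. As a rough sketch, a file like `schulen_sh.jsonld` could be summarised with the standard `json` module alone. The sample document below is a shortened, hand-written stand-in for the real output (the blank-node id `_:b0` and its `PostalAddress` type are assumptions made for illustration), and `summarise_jsonld` is a hypothetical helper name.

```python
import json

SCHEMA = "http://schema.org/"

# Shortened stand-in for the expanded JSON-LD written by rdflib;
# the real file contains many more nodes and properties.
sample_jsonld = """
[
  {
    "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9115925",
    "@type": ["http://schema.org/School"],
    "http://schema.org/address": [{"@id": "_:b0"}],
    "http://schema.org/email": [{"@value": "foez-ge.Norderstedt@schule.landsh.de"}]
  },
  {
    "@id": "_:b0",
    "@type": ["http://schema.org/PostalAddress"]
  }
]
"""

def summarise_jsonld(text):
    """Return (types, properties) as sorted lists of schema.org local names."""
    data = json.loads(text)
    # Accept either a bare list of nodes or a document with an '@graph' key.
    nodes = data.get("@graph", []) if isinstance(data, dict) else data
    types, properties = set(), set()
    for node in nodes:
        for type_iri in node.get("@type", []):
            if type_iri.startswith(SCHEMA):
                types.add(type_iri[len(SCHEMA):])
        for key in node:
            if key.startswith(SCHEMA):
                properties.add(key[len(SCHEMA):])
    return sorted(types), sorted(properties)

types, properties = summarise_jsonld(sample_jsonld)
print("Types:", types)
print("Properties:", properties)
```

Run against the real file, the same function would list every schema.org type and property that the school data set uses.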
##### Question-3: Examine the RDF file for police stations (“Polizeidienststellen”). Use SPARQL to query all police stations located in “Kiel”. Provide another example query of your choice.

##### Answer-3.1: Querying all police stations located in "Kiel"

In [2]:
from rdflib import Graph, Namespace, Literal
from rdflib.plugins.sparql import prepareQuery
import json

file_path_polizeidienststellen = 'polizeidienststellen-2023-10-26.rdf'
graph_polizeidienststellen = Graph()
graph_polizeidienststellen.parse(file_path_polizeidienststellen, format='xml')

sparql_query_polizeidienststellen = prepareQuery(
    """
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX schema: <http://schema.org/>
    SELECT ?policeStation ?name ?locality
    WHERE {
        ?policeStation rdf:type schema:PoliceStation ;
                       schema:address [ schema:addressLocality ?locality ] .
        FILTER (regex(?locality, "Kiel", "i"))
        OPTIONAL { ?policeStation schema:name ?name }
    }
    """
)

results_polizeidienststellen = graph_polizeidienststellen.query(sparql_query_polizeidienststellen)

print("Police Stations in Kiel:")
for row in results_polizeidienststellen:
    print(f"Police Station: {row.policeStation}")
    print(f"Name: {row.name}")
    print(f"Locality: {row.locality}")
    print("\n")

output_list_polizeidienststellen = []
for row in results_polizeidienststellen:
    output_list_polizeidienststellen.append({
        'PoliceStation': str(row.policeStation),
        'Name': str(row.name) if row.name else None,
        'Locality': str(row.locality),
    })

output_file_path_polizeidienststellen = 'kiel_polizeidienststellen.json'
with open(output_file_path_polizeidienststellen, 'w') as json_file:
    json.dump(output_list_polizeidienststellen, json_file, indent=2)

print(f"Results are saved as a json file")


Police Stations in Kiel:
Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9092250
Name: 3. Polizeirevier Kiel
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9092804
Name: 4. Polizeirevier Kiel
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093224
Name: Kriminalpolizeistelle Kiel
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093825
Name: Landeskriminalamt
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9094379
Name: Polizei-Bezirksrevier Kiel
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9095329
Name: Polizeidirektion Kiel
Locality: Kiel


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9095973
Name: Polizeistation Dietrichsdorf
Locality: Kiel


Police Station: https://zufish.
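
One detail of the Kiel query is worth spelling out: `regex(?locality, "Kiel", "i")` is an unanchored, case-insensitive match, so it would also accept localities that merely contain "kiel". The plain-Python sketch below mimics that behaviour with the `re` module and contrasts it with an anchored pattern. The sample locality strings are invented for illustration and are not taken from the RDF file.

```python
import re

# Invented locality values, used only to illustrate the matching rules.
localities = ["Kiel", "kiel", "Kiel-Holtenau", "Kielseng", "Lübeck"]

def matches_unanchored(value):
    # Same idea as FILTER(regex(?locality, "Kiel", "i")): substring match.
    return re.search("Kiel", value, re.IGNORECASE) is not None

def matches_anchored(value):
    # Same idea as FILTER(regex(?locality, "^Kiel$", "i")): whole-string match.
    return re.search("^Kiel$", value, re.IGNORECASE) is not None

unanchored = [v for v in localities if matches_unanchored(v)]
anchored = [v for v in localities if matches_anchored(v)]

print(unanchored)
print(anchored)
```

If only the city itself is wanted, anchoring the pattern with `^` and `$` (or comparing `str(?locality)` with `"Kiel"`) is the stricter choice.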

##### Answer-3.2: As a second example, I used SPARQL to query all police stations in Schleswig-Holstein that are criminal investigation departments ("Kriminalpolizeistelle").

In [3]:
file_path_kriminalpolizeistelle = 'polizeidienststellen-2023-10-26.rdf'
graph_kriminalpolizeistelle = Graph()
graph_kriminalpolizeistelle.parse(file_path_kriminalpolizeistelle, format='xml')

sparql_query_kriminalpolizeistelle = prepareQuery(
    """
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX schema: <http://schema.org/>
    SELECT ?policeStation ?name ?locality
    WHERE {
        ?policeStation rdf:type schema:PoliceStation ;
                       schema:name ?name ;
                       schema:address [ schema:addressLocality ?locality ] .
        FILTER (regex(?name, "Kriminalpolizeistelle", "i"))
    }
    """
)

results_kriminalpolizeistelle = graph_kriminalpolizeistelle.query(sparql_query_kriminalpolizeistelle)

print("Kriminalpolizeistelle Police Stations:")
for row in results_kriminalpolizeistelle:
    print(f"Police Station: {row.policeStation}")
    print(f"Name: {row.name}")
    print(f"Locality: {row.locality}")
    print("\n")

output_list_kriminalpolizeistelle = []
for row in results_kriminalpolizeistelle:
    output_list_kriminalpolizeistelle.append({
        'PoliceStation': str(row.policeStation),
        'Name': str(row.name),
        'Locality': str(row.locality),
    })

output_file_path_kriminalpolizeistelle = 'kriminalpolizeistelle_sh.json'
with open(output_file_path_kriminalpolizeistelle, 'w') as json_file:
    json.dump(output_list_kriminalpolizeistelle, json_file, indent=2)

print(f"Results are saved as a json file")


Kriminalpolizeistelle Police Stations:
Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9092998
Name: Kriminalpolizeistelle Bad Schwartau
Locality: Bad Schwartau


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093028
Name: Kriminalpolizeistelle Eckernförde
Locality: Eckernförde


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093041
Name: Kriminalpolizeistelle Elmshorn
Locality: Elmshorn


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093056
Name: Kriminalpolizeistelle Geesthacht
Locality: Geesthacht


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093069
Name: Kriminalpolizeistelle Niebüll
Locality: Niebüll


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9093082
Name: Kriminalpolizeistelle Norderstedt
Locality: Norderstedt


Police Station: https://zufish.schleswig-holstein.de/portaldeeplink?tsa_o

##### Question-4: Use the CKAN Action API endpoint of the portal “Open-Data Schleswig-Holstein” (https://opendata.schleswig-holstein.de/api/action) to explore the organizations, packages, groups, resources, licenses, and other data resources in a Python program. Demonstrate the use of the CKAN Action API with examples. How can the data sets of the portal be found and accessed via CKAN?

##### Answer-4: To answer Question-4, I first opened the [SH Thema Open Data Portal](https://www.schleswig-holstein.de/DE/landesregierung/themen/digitalisierung/open-data/Infos/Entwicklerdoku/entwicklerdoku_node.html), where I found the link to the [CKAN API Guide](https://docs.ckan.org/en/2.8/api/). I then worked through the CKAN API Guide to build the examples below.
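
The cells in this answer all follow the same request pattern against the Action API. As a sketch of how that pattern could be factored out, the two hypothetical helpers below build an action URL and unwrap the `result` field of a response. The names `build_action_url` and `unwrap_result` are my own, and the sample response body is hand-written rather than fetched from the portal.

```python
import json
import urllib.parse

BASE_URL = "https://opendata.schleswig-holstein.de/api/action"

def build_action_url(action, **params):
    """Return the URL for a CKAN action, with optional query parameters."""
    url = f"{BASE_URL}/{action}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url

def unwrap_result(body):
    """Decode a CKAN JSON response (bytes) and return its 'result' field."""
    payload = json.loads(body.decode("utf-8"))
    if not payload.get("success"):
        raise RuntimeError(f"CKAN call failed: {payload.get('error')}")
    return payload["result"]

# Hand-written sample body in the shape CKAN returns.
sample_body = b'{"success": true, "result": ["educ", "tran"]}'

print(build_action_url("organization_show", id="kunsthalle-kiel"))
print(unwrap_result(sample_body))
```

With a live connection, the bytes for `unwrap_result` would come from `urllib.request.urlopen(build_action_url("group_list")).read()`.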

In [4]:
#4.1 Exploring the 'organizations' by the use of CKAN Action API

import json
import urllib.request
import urllib.parse
import pprint

url_organizationlist = 'http://opendata.schleswig-holstein.de/api/action/organization_list'
data_organizationlist = urllib.parse.urlencode({}).encode('utf-8')
response_organizationlist = urllib.request.urlopen(url_organizationlist, data_organizationlist)
assert response_organizationlist.code == 200

response_dict_organizationlist = json.loads(response_organizationlist.read().decode('utf-8'))

assert response_dict_organizationlist['success'] is True
result_organizationlist = response_dict_organizationlist['result']
pprint.pprint(result_organizationlist)

output_file_path_organizationlist = 'organizations.json'
with open(output_file_path_organizationlist, 'w') as json_file:
    json.dump(result_organizationlist, json_file, indent=2)

print(f"\nResults are saved as a json file")

['awsh-abfallwirtschaft-sudholstein-gmbh',
 'amt-bad-oldesloe-land',
 'amt-buechen',
 'amt-eidertal',
 'amt-elmshorn-land',
 'amt-haddeby',
 'amt-nortorfer-land',
 'amt-schlei-ostsee',
 'amt-suederbrarup',
 'antikensammlung',
 'bkzsh',
 'bkg',
 'bast',
 'bundesnetzagentur',
 'coworkland',
 'compgen',
 'delfi',
 'landesmuseum-dithmarschen',
 'fairtrade-deutschland',
 'finanzministerium',
 'fgho',
 'ammersbek',
 'buechen',
 'stockelsdorf',
 'glueckstadt',
 'luebeck',
 'histsem',
 'klimaschutzagentur-rendsburg-eckernfoerde',
 'kreis-herzogtum-lauenburg',
 'nordfriesland',
 'kreis-ostholstein',
 'kreis-pinneberg',
 'rendsburg-eckernforde',
 'kreis-schleswig-flensburg',
 'kreis-stormarn',
 'kreisarchiv-stormarn',
 'kunsthalle-kiel',
 'landesamt-fur-denkmalpflege',
 'llnl',
 'llur',
 'lfu',
 'lvermgeo',
 'lazuf',
 'landesamt-fur-soziale-dienste',
 'landesarchiv',
 'lbv',
 'lkn',
 'landeshauptstadt-kiel',
 'landesjagdverband',
 'landeskriminalamt-schleswig-holstein',
 'landesmeldestelle',
 'm

In [5]:
#4.2 Exploring the 'packages' / Finding the datasets of the portal by the use of
#CKAN Action API

#The datasets of the portal are called packages. In early CKAN versions, datasets were called
#“packages”, and this name has stuck in some places, especially internally and in API calls.
#“Package” has exactly the same meaning as “dataset”.
#To find and access datasets on a CKAN portal, we can use the CKAN API. CKAN provides
#a set of API endpoints that allow us to search for datasets, retrieve information about
#specific datasets, and access the data resources associated with them.
#(Source: CKAN API Guide)

url_packagelist = 'http://opendata.schleswig-holstein.de/api/action/package_list'
data_packagelist = urllib.parse.urlencode({}).encode('utf-8')
response_packagelist = urllib.request.urlopen(url_packagelist, data_packagelist)
assert response_packagelist.code == 200

response_dict_packagelist = json.loads(response_packagelist.read().decode('utf-8'))

assert response_dict_packagelist['success'] is True
result_packagelist = response_dict_packagelist['result']
pprint.pprint(result_packagelist)

output_file_path_packagelist = 'packages(datasets).json'
with open(output_file_path_packagelist, 'w') as json_file:
    json.dump(result_packagelist, json_file, indent=2)

print(f"\nResults are saved as a json file")

['01001000_bp_030_hildebrandstrasse_urschrift',
 '01001000_bp_033_friedrich_ebert_strasse_urschrift',
 '01001000_bp_034_muerwiker_strasse_1-aenderung',
 '01001000_bp_034_muerwiker_strasse_urschrift',
 '01001000_bp_035_strandfrieden_4-vereinfachte_aenderung',
 '01001000_bp_036_schoene_aussicht_noerdlicher_teil_2-aenderung',
 '01001000_bp_036_schoene_aussicht_noerdlicher_teil_urschrift',
 '01001000_bp_036i_schoene_aussicht_suedlicher_teil_1-aenderung',
 '01001000_bp_036i_schoene_aussicht_suedlicher_teil_urschrift',
 '01001000_bp_037_johannismuehle_1-aenderung',
 '01001000_bp_037_johannismuehle_2-aenderung',
 '01001000_bp_037_johannismuehle_urschrift',
 '01001000_bp_038_travestrasse_1-aenderung',
 '01001000_bp_038_travestrasse_1-vereinfachte_aenderung',
 '01001000_bp_038_travestrasse_2-aenderung',
 '01001000_bp_038_travestrasse_2-vereinfachte_aenderung',
 '01001000_bp_038_travestrasse_3-aenderung',
 '01001000_bp_038_travestrasse_3-vereinfachte_aenderung',
 '01001000_bp_038_travestrasse_4-
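
To go from a dataset name to its downloadable files, the Action API offers `package_search` (full-text search with a `q` parameter) and `package_show` (full metadata for one dataset, including its `resources`). The sketch below builds both URLs and extracts download links from a response. The sample dictionary is a hand-written, heavily shortened stand-in for a `package_show` result, and its dataset name and URLs are placeholders.

```python
import urllib.parse

BASE_URL = "https://opendata.schleswig-holstein.de/api/action"

def search_url(query, rows=10):
    """URL for a full-text dataset search (CKAN 'package_search')."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    return f"{BASE_URL}/package_search?{params}"

def show_url(dataset_name):
    """URL for the metadata of a single dataset (CKAN 'package_show')."""
    params = urllib.parse.urlencode({"id": dataset_name})
    return f"{BASE_URL}/package_show?{params}"

def resource_links(package, wanted_format=None):
    """Return (format, url) pairs for the resources of one package dict."""
    links = []
    for resource in package.get("resources", []):
        fmt = resource.get("format", "")
        if wanted_format is None or fmt.upper() == wanted_format.upper():
            links.append((fmt, resource.get("url")))
    return links

# Placeholder package in the shape of a 'package_show' result.
sample_package = {
    "name": "example-dataset",
    "resources": [
        {"format": "RDF", "url": "https://example.org/data.rdf"},
        {"format": "CSV", "url": "https://example.org/data.csv"},
    ],
}

print(search_url("Schulen"))
print(show_url("example-dataset"))
print(resource_links(sample_package, wanted_format="rdf"))
```

Each `url` returned this way points at the file itself, so it can be downloaded with `urllib.request.urlretrieve` or opened in a browser.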

In [6]:
#4.3 Exploring the 'groups' by the use of CKAN Action API

url_grouplist = 'http://opendata.schleswig-holstein.de/api/action/group_list'
data_grouplist = urllib.parse.urlencode({}).encode('utf-8')
response_grouplist = urllib.request.urlopen(url_grouplist, data_grouplist)
assert response_grouplist.code == 200

response_dict_grouplist = json.loads(response_grouplist.read().decode('utf-8'))

assert response_dict_grouplist['success'] is True
result_grouplist = response_dict_grouplist['result']
pprint.pprint(result_grouplist)

output_file_path_grouplist = 'groups.json'
with open(output_file_path_grouplist, 'w') as json_file:
    json.dump(result_grouplist, json_file, indent=2)

print(f"\nResults are saved as a json file")

['soci',
 'educ',
 'ener',
 'heal',
 'intr',
 'just',
 'agri',
 'gove',
 'regi',
 'envi',
 'tran',
 'econ',
 'tech']

Results are saved as a json file


In [7]:
#4.4 Exploring the 'resources' by the use of CKAN Action API

url_resource = 'http://opendata.schleswig-holstein.de/api/action/current_package_list_with_resources'
data_resource = urllib.parse.urlencode({}).encode('utf-8')
response_resource = urllib.request.urlopen(url_resource, data_resource)
assert response_resource.code == 200

response_dict_resource = json.loads(response_resource.read().decode('utf-8'))

assert response_dict_resource['success'] is True
result_resource = response_dict_resource['result']
pprint.pprint(result_resource)

output_file_path_resource = 'resources.json'
with open(output_file_path_resource, 'w') as json_file:
    json.dump(result_resource, json_file, indent=2)

print(f"\nResults are saved as a json file")


[{'author': None,
  'author_email': None,
  'creator_user_id': '3d0ba916-bc51-4bb7-a011-f2c5b35b1175',
  'extras': [{'key': 'issued', 'value': '2023-05-30T00:00:00'},
             {'key': 'licenseAttributionByText', 'value': ''},
             {'key': 'spatial',
              'value': '{"type": "Polygon", "coordinates": [[[10.0327, '
                       '54.2508], [10.0327, 54.4314], [10.2186, 54.4314], '
                       '[10.2186, 54.2508], [10.0327, 54.2508]]]}'},
             {'key': 'temporal_end', 'value': '2023-10-26T00:00:00'},
             {'key': 'temporal_start', 'value': '2023-07-26T00:00:00'}],
  'groups': [{'description': '',
              'display_name': 'Verkehr',
              'id': 'tran',
              'image_display_url': '',
              'name': 'tran',
              'title': 'Verkehr'}],
  'id': '1afa7e98-902d-40eb-ab09-9e636371abcc',
  'isopen': True,
  'license_id': 'http://dcat-ap.de/def/licenses/cc-zero',
  'license_title': 'Creative Commons CC Zero L
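
Because `current_package_list_with_resources` returns each dataset together with its resources, the saved result can be summarised offline. As an illustration, the sketch below counts how often each resource format occurs. The two sample packages are hand-written placeholders that only mimic the `groups` and `resources` fields, and the code assumes each resource carries a `format` key as in standard CKAN output.

```python
from collections import Counter

# Placeholder packages; real entries carry many more fields.
sample_packages = [
    {"groups": [{"name": "tran"}], "resources": [{"format": "CSV"}, {"format": "RDF"}]},
    {"groups": [{"name": "educ"}], "resources": [{"format": "RDF"}]},
]

def count_formats(packages):
    """Count resource formats across a list of package dicts."""
    counts = Counter()
    for package in packages:
        for resource in package.get("resources", []):
            # Resources without a usable format are grouped under 'unknown'.
            counts[resource.get("format") or "unknown"] += 1
    return counts

format_counts = count_formats(sample_packages)
print(format_counts.most_common())
```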

In [8]:
#4.5 Exploring the 'licenses' by the use of CKAN Action API

url_licenselist = 'http://opendata.schleswig-holstein.de/api/action/license_list'
data_licenselist = urllib.parse.urlencode({}).encode('utf-8')
response_licenselist = urllib.request.urlopen(url_licenselist, data_licenselist)
assert response_licenselist.code == 200

response_dict_licenselist = json.loads(response_licenselist.read().decode('utf-8'))

assert response_dict_licenselist['success'] is True
result_licenselist = response_dict_licenselist['result']
pprint.pprint(result_licenselist)

output_file_path_licenselist = 'licenses.json'
with open(output_file_path_licenselist, 'w') as json_file:
    json.dump(result_licenselist, json_file, indent=2)

print(f"\nResults are saved as a json file")


[{'id': 'http://dcat-ap.de/def/licenses/ccpdm/1.0',
  'od_conformance': 'approved',
  'osd_conformance': 'not reviewed',
  'status': 'active',
  'title': 'gemeinfrei',
  'url': 'http://creativecommons.org/publicdomain/mark/1.0/'},
 {'id': 'http://dcat-ap.de/def/licenses/dl-zero-de/2.0',
  'od_conformance': 'approved',
  'osd_conformance': 'approved',
  'status': 'active',
  'title': 'Datenlizenz Deutschland – Zero – Version 2.0',
  'url': 'https://www.govdata.de/dl-de/zero-2-0'},
 {'id': 'http://dcat-ap.de/def/licenses/dl-by-de/2.0',
  'od_conformance': 'approved',
  'osd_conformance': 'not reviewed',
  'status': 'active',
  'title': 'Datenlizenz Deutschland Namensnennung 2.0',
  'url': 'https://www.govdata.de/dl-de/by-2-0'},
 {'id': 'http://dcat-ap.de/def/licenses/officialWork',
  'od_conformance': 'approved',
  'osd_conformance': 'not reviewed',
  'status': 'active',
  'title': 'Amtliches Werk, lizenzfrei nach §5 Abs. 1 UrhG',
  'url': 'http://www.gesetze-im-internet.de/urhg/__5.html
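
The license list links back to Question-1: the title 'gemeinfrei' maps to a license id, and each package dict carries a `license_id` (as seen in the resources output). A sketch of that lookup follows, using two license entries from the output above (reduced to `id` and `title`) and two hand-written placeholder packages; the helper names are hypothetical.

```python
licenses = [
    {"id": "http://dcat-ap.de/def/licenses/ccpdm/1.0", "title": "gemeinfrei"},
    {"id": "http://dcat-ap.de/def/licenses/dl-by-de/2.0",
     "title": "Datenlizenz Deutschland Namensnennung 2.0"},
]

# Placeholder packages; only 'name' and 'license_id' are modelled here.
packages = [
    {"name": "dataset-a", "license_id": "http://dcat-ap.de/def/licenses/ccpdm/1.0"},
    {"name": "dataset-b", "license_id": "http://dcat-ap.de/def/licenses/cc-zero"},
]

def license_id_for(title, license_list):
    """Return the id of the license with the given title, or None."""
    for entry in license_list:
        if entry["title"] == title:
            return entry["id"]
    return None

def packages_with_license(package_list, license_id):
    """Return the names of all packages published under license_id."""
    return [p["name"] for p in package_list if p.get("license_id") == license_id]

public_domain_id = license_id_for("gemeinfrei", licenses)
print(public_domain_id)
print(packages_with_license(packages, public_domain_id))
```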

In [9]:
#4.6 Exploring the 'tags' by the use of CKAN Action API
#(The first additional data resource, beyond those requested in the question)

url_taglist = 'http://opendata.schleswig-holstein.de/api/action/tag_list'
data_taglist = urllib.parse.urlencode({}).encode('utf-8')
response_taglist = urllib.request.urlopen(url_taglist, data_taglist)
assert response_taglist.code == 200

response_dict_taglist = json.loads(response_taglist.read().decode('utf-8'))

assert response_dict_taglist['success'] is True
result_taglist = response_dict_taglist['result']
pprint.pprint(result_taglist)

output_file_path_taglist = 'tags.json'
with open(output_file_path_taglist, 'w') as json_file:
    json.dump(result_taglist, json_file, indent=2)

print(f"\nResults are saved as a json file")

['PM25',
 'Einsammlung von Verpackungen in Schleswig-Holstein',
 'Walkerbach',
 'Krukow',
 'Hemme',
 'witzeeze-am-31.12.',
 'Marine',
 'Ellerbek',
 'Bad Oldesloe',
 'ostenfeld-rendsburg',
 'Antike',
 'advmis',
 'nindorf-am-31.12.',
 'Dollerup',
 'duvensee',
 'Oster-Ohrstedt',
 'Harbour Seal',
 'warnau-am-31.12.',
 'Trockenrasen',
 'hingstheide',
 'Teilhabe',
 'mucheln-am-31.12.',
 'Trinkwasseranlagen',
 'Gartenstadt Weiche Teilplan C',
 'Mode',
 'Trennewurther Fleth',
 'Hanerau-Hademarschen',
 'Vogelschutzgebiet',
 'sulfatsaure Böden',
 'schretstaken',
 'Anteile der Real-',
 'Bundestagswahl am ZEITSTEMPEL',
 'corine-land-cover',
 'Siedlung Elisenhof',
 'heiligenstedten',
 'brokstedt-am-31.12.',
 'Direktzahlungen',
 'jersbek-am-31.12.',
 'Bahnhofsumfeld',
 'witzwort-am-31.12.',
 'Seth',
 'martensrade-am-31.12.',
 'Baumaßnahmen)',
 'Rotwildwegeplan',
 'ulsnis',
 'raa-besenbek-am-31.12.',
 'Kationenaustauschkapazität',
 'wasbek',
 'Naturpark',
 'Wahlergebnis Bundestagswahlen',
 'Westllich

In [10]:
#4.7 Exploring the details of an organization ('kunsthalle-kiel') by the use of CKAN
#Action API (The second additional data resource, beyond those requested in the question)

url_orgdet = 'http://opendata.schleswig-holstein.de/api/action/organization_show?id=kunsthalle-kiel'
data_orgdet = urllib.parse.urlencode({}).encode('utf-8')
response_orgdet = urllib.request.urlopen(url_orgdet, data_orgdet)
assert response_orgdet.code == 200

response_dict_orgdet = json.loads(response_orgdet.read().decode('utf-8'))

assert response_dict_orgdet['success'] is True
result_orgdet = response_dict_orgdet['result']
pprint.pprint(result_orgdet)

output_file_path_orgdet = 'kunsthalle-kiel_details.json'
with open(output_file_path_orgdet, 'w') as json_file:
    json.dump(result_orgdet, json_file, indent=2)

print(f"\nResults are saved as a json file")

{'approval_status': 'approved',
 'created': '2021-03-24T06:25:07.605107',
 'description': 'Die Kunsthalle ist Museum mit eigener Sammlung und '
                'Ausstellungshalle sowie ein Universitätsinstitut der '
                'Chrisitan-Albrechts-Universität zu Kiel',
 'display_name': 'Kunsthalle Kiel',
 'extras': [{'group_id': '24b059dd-75f1-4a49-8327-9034b31df62f',
             'id': '4adcea74-c747-476f-9d0d-bb3e1ab4961d',
             'key': 'gnd',
             'state': 'active',
             'value': 'https://d-nb.info/gnd/8513-3'},
            {'group_id': '24b059dd-75f1-4a49-8327-9034b31df62f',
             'id': '808dcdb9-f510-4548-aaa3-a2c0ad2c9ed5',
             'key': 'location',
             'state': 'active',
             'value': '24105 Kiel'},
            {'group_id': '24b059dd-75f1-4a49-8327-9034b31df62f',
             'id': '7c0de4d2-5d1e-4fd3-9b20-cb9488b703aa',
             'key': 'mail',
             'state': 'active',
             'value': ''},
            {'g
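
The `extras` field arrives as a list of key/value records, which is awkward to query directly. A small sketch that flattens it into a dictionary, using the three entries visible in the output above (with `id` and `group_id` omitted for brevity); `extras_to_dict` is a hypothetical helper name.

```python
def extras_to_dict(extras):
    """Map each extra's 'key' to its 'value', skipping non-active entries."""
    return {
        item["key"]: item["value"]
        for item in extras
        if item.get("state", "active") == "active"
    }

sample_extras = [
    {"key": "gnd", "state": "active", "value": "https://d-nb.info/gnd/8513-3"},
    {"key": "location", "state": "active", "value": "24105 Kiel"},
    {"key": "mail", "state": "active", "value": ""},
]

org_extras = extras_to_dict(sample_extras)
print(org_extras["location"])
```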