### Q1. Select data sets: public domain, RDF file and Linked Data


* RDF Datasets: "Gerichte" (courts), "Finanzämter" (tax offices), "Schulen" (schools), "Polizeidienststellen" (police stations)

Sources:
* https://opendata.schleswig-holstein.de/dataset/gerichte-2024-01-28
* https://opendata.schleswig-holstein.de/dataset/schulen-2024-01-31
* https://opendata.schleswig-holstein.de/dataset/polizeidienststellen-2024-01-28
* https://opendata.schleswig-holstein.de/dataset/finanzamter-2024-01-28




In [1]:
from rdflib import Graph


g = Graph()


g.parse("gerichte-2024-01-28.rdf", format="xml")


print(f"The graph of Gerichte has {len(g)} triples.")

The graph of Gerichte has 895 triples.


In [2]:

g = Graph()


g.parse("schulen-2024-01-31.rdf", format="xml")


print(f"The graph of Schulen has {len(g)} triples.")

The graph of Schulen has 24754 triples.


In [4]:
g = Graph()

g.parse("finanzamter-2024-01-28.rdf", format="xml")

print(f"The graph of Finanzämter has {len(g)} triples.")

The graph of Finanzämter has 305 triples.


In [5]:
g = Graph()

g.parse("polizeidienststellen-2024-01-28.rdf", format="xml")

print(f"The graph of Polizeidienststellen has {len(g)} triples.")

The graph of Polizeidienststellen has 4189 triples.


### Q2. Schulen RDF File 

The dataset which I have selected is: https://opendata.schleswig-holstein.de/dataset/schulen-2024-01-31

In [6]:
# Load the RDF file
rdf_file_path = "schulen-2024-01-31.rdf" 

# Create an RDF Graph and parse the RDF file
graph = Graph()
graph.parse(rdf_file_path, format="xml")  

# Print the number of triples in the graph
print(f"Graph has {len(graph)} triples.\n")

# Analyze namespaces in the RDF file
print("Namespaces detected in the RDF file:")
for prefix, namespace in graph.namespaces():
    print(f"Prefix: {prefix}, Namespace: {namespace}")


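# Check whether schema.org terms are actually used as predicates.
# graph.namespaces() also lists rdflib's default prefix bindings (including
# https://schema.org/), so a bound prefix alone does not prove usage.
schema_namespaces = ("http://schema.org/", "https://schema.org/")
if any(str(p).startswith(schema_namespaces) for _, p, _ in graph):
    print("\nThe data schema 'schema.org' is used in this RDF file.")
else:
    print("\nThe data schema 'schema.org' is NOT used in this RDF file.")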



Graph has 24754 triples.

Namespaces detected in the RDF file:
Prefix: brick, Namespace: https://brickschema.org/schema/Brick#
Prefix: csvw, Namespace: http://www.w3.org/ns/csvw#
Prefix: dc, Namespace: http://purl.org/dc/elements/1.1/
Prefix: dcat, Namespace: http://www.w3.org/ns/dcat#
Prefix: dcmitype, Namespace: http://purl.org/dc/dcmitype/
Prefix: dcterms, Namespace: http://purl.org/dc/terms/
Prefix: dcam, Namespace: http://purl.org/dc/dcam/
Prefix: doap, Namespace: http://usefulinc.com/ns/doap#
Prefix: foaf, Namespace: http://xmlns.com/foaf/0.1/
Prefix: geo, Namespace: http://www.opengis.net/ont/geosparql#
Prefix: odrl, Namespace: http://www.w3.org/ns/odrl/2/
Prefix: org, Namespace: http://www.w3.org/ns/org#
Prefix: prof, Namespace: http://www.w3.org/ns/dx/prof/
Prefix: prov, Namespace: http://www.w3.org/ns/prov#
Prefix: qb, Namespace: http://purl.org/linked-data/cube#
Prefix: schema, Namespace: https://schema.org/
Prefix: sh, Namespace: http://www.w3.org/ns/shacl#
Prefix: skos, Na

The dataset contains a variety of details about each school, including:

**Geographic information:** Several schools have specific geographic data such as zip codes and coordinates listed to show where the school is located. 

**Offerings and Services:** Some schools have specific offerings and services linked to specific service identifiers through the schema:makesOffer relation, indicating that the schools provide various educational and support services.

**Administrative and functional data:** Data such as the job title Head of School shows the administrative and functional aspects of the school organization. The use of email addresses for communication purposes is also documented.

**Coverage areas:** The predicate schema:areaServed indicates that information is available on the spatial responsibility of schools serving specific local or regional areas.

This information provides a basic insight into the structural and operational organization of schools in Schleswig-Holstein, based on the data presented in RDF format.

#### Investigation of the RDF dataset Schemas for schools

The dataset uses several types (classes) from the schema.org vocabulary to structure the information. The most important results of the analysis are:

* Schema.org/School: the dataset explicitly classifies 1027 entries as schools, emphasizing the primary use of this schema. Each entry under this schema provides specific information about individual schools, including their educational offerings and organizational details.

* Schema.org/PostalAddress: Each of the 1027 schools has a detailed postal address listed in the dataset. This schema plays an important role in the localization of schools and communication.

* Schema.org/Place: With 973 entries, almost every school is also defined as a place, which shows that the dataset contains not only education-specific but also extensive geographical information.

* Schema.org/GeoCoordinates: The geographical coordinates are given for 973 schools, which emphasizes the importance of spatial data processing and use in the context of education data.

* Schema.org/Person: Information on 928 persons associated with the schools, such as teaching staff or administrative staff, is also part of the dataset. This indicates a comprehensive coverage of stakeholders associated with the schools.
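
Counts like the ones above can be reproduced by tallying the objects of rdf:type triples. The following is a minimal sketch and not necessarily how the figures in this notebook were obtained. The helper name `count_types` is hypothetical, and the sample triples are made up for illustration. The function accepts any iterable of (subject, predicate, object) triples, which includes an rdflib `Graph`.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def count_types(triples):
    """Count rdf:type triples per class URI."""
    counts = Counter()
    for _subject, predicate, obj in triples:
        if str(predicate) == RDF_TYPE:
            counts[str(obj)] += 1
    return counts


# Made-up sample triples; with the notebook's data, pass `graph` instead.
sample = [
    ("ex:school1", RDF_TYPE, "http://schema.org/School"),
    ("ex:school2", RDF_TYPE, "http://schema.org/School"),
    ("ex:addr1", RDF_TYPE, "http://schema.org/PostalAddress"),
    ("ex:school1", "http://schema.org/name", "Example School"),
]

# Print the classes from most to least frequent.
for type_uri, n in count_types(sample).most_common():
    print(f"{type_uri}: {n}")
```

With the graph loaded earlier, calling `count_types(graph)` should list the schema.org classes discussed in this section together with their frequencies.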

Conclusion:
- Together, these schemas provide a comprehensive overview of educational institutions in Schleswig-Holstein. They not only enable a detailed representation of each school, but also the use of this data for further analysis and applications. The use of schema.org facilitates the integration and interoperability of the data across different platforms.



#### JSON-LD Format

In [7]:
# Serialize the RDF data into JSON-LD format
json_ld_data = graph.serialize(format="json-ld", indent=4)

# Print the serialized JSON-LD data
print(json_ld_data)

# Optionally, save the JSON-LD data to a file
output_file = "output.jsonld"
with open(output_file, "w", encoding="utf-8") as f:
    f.write(json_ld_data)

print(f"JSON-LD data saved to {output_file}")

[
    {
        "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9099793",
        "@type": [
            "http://schema.org/School"
        ],
        "http://schema.org/address": [
            {
                "@id": "_:N5ded461304af4bda9bddcc498921f587"
            }
        ],
        "http://schema.org/areaServed": [
            {
                "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_gebiet_id=9006402"
            }
        ],
        "http://schema.org/email": [
            {
                "@value": "Goethe-gemeinschaftsschule.kiel@schule.landsh.de"
            }
        ],
        "http://schema.org/employee": [
            {
                "@id": "_:N707a15e38ddc41659226fb4b283ed357"
            }
        ],
        "http://schema.org/faxNumber": [
            {
                "@value": "+49 431 26042869"
            }
        ],
        "http://schema.org/location": [
            {
                "@id": "_:N1b6ec6885dee47dda

##### Explanation of the JSON-LD representation:
* **@id:** This is the unique identifier for the school within the Linked Data Framework, in this case a URL that uniquely identifies the school.

* **@type:** This specifies the type of entity, here http://schema.org/School, which indicates that the entity is a school.

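* **http://schema.org/address:** This refers to the address of the school. The address is represented as a blank node (an identifier starting with _:, such as _:N5ded461304af4bda9bddcc498921f587 in the output above), which carries the detailed address information as its own properties. These identifiers are generated by rdflib when the file is parsed, so they can differ between runs.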

* **http://schema.org/email:** The e-mail address of the school is specified as a literal value.

* **http://schema.org/name:** The name of the school is also specified as a literal value.

This JSON-LD representation shows how linked data can be used to provide structured and linked information. The use of URIs and types from the Schema.org vocabulary enables a standardized and interoperable representation of the data.
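
Since the output above is expanded JSON-LD (full property URIs as keys, values wrapped in lists of `@value` or `@id` objects), it can be read back with the standard `json` module. The sketch below is illustrative only: `first_value` is a hypothetical helper, and the sample document is a shortened copy of the first entry printed above.

```python
import json

SCHEMA = "http://schema.org/"

# Shortened copy of the first node from the JSON-LD output above.
sample_json_ld = """
[
    {
        "@id": "https://zufish.schleswig-holstein.de/portaldeeplink?tsa_oe_id=9099793",
        "@type": ["http://schema.org/School"],
        "http://schema.org/address": [{"@id": "_:N5ded461304af4bda9bddcc498921f587"}],
        "http://schema.org/email": [
            {"@value": "Goethe-gemeinschaftsschule.kiel@schule.landsh.de"}
        ]
    }
]
"""


def first_value(node, prop):
    """Return the first literal (@value) or reference (@id) of a property, or None."""
    for entry in node.get(prop, []):
        if "@value" in entry:
            return entry["@value"]
        if "@id" in entry:
            return entry["@id"]
    return None


nodes = json.loads(sample_json_ld)

# Keep only the nodes typed as schema.org schools.
schools = [n for n in nodes if SCHEMA + "School" in n.get("@type", [])]

for school in schools:
    print(school["@id"])
    print("  email:  ", first_value(school, SCHEMA + "email"))
    print("  address:", first_value(school, SCHEMA + "address"))
```

To run this on the full dataset, replace `sample_json_ld` with the `json_ld_data` string produced by the serialization cell.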



### Q3. RDF File for Police Stations 

https://opendata.schleswig-holstein.de/dataset/polizeidienststellen-2024-01-28

In [8]:
# Load the RDF file
rdf_file_police_path = "polizeidienststellen-2024-01-28.rdf"  
graph_police = Graph()
graph_police.parse(rdf_file_police_path, format="xml")  

# SPARQL Query 1: Police Stations in Kiel
query = """
PREFIX schema: <http://schema.org/>

SELECT ?name ?streetAddress ?postalCode
WHERE {
    ?policeStation a schema:PoliceStation .
    ?policeStation schema:name ?name .
    ?policeStation schema:address ?address .
    ?address schema:addressLocality "Kiel" .
    ?address schema:streetAddress ?streetAddress .
    ?address schema:postalCode ?postalCode .
}
"""

# Execute the query
results_1 = graph_police.query(query)

# Print results for Query 1
print("Total Number of Police Stations in Kiel in this RDF File: ", len(results_1), '\n')
print("Police Stations in Kiel:")
for row in results_1:
    print(f"Name: {row.name}, Address: {row.streetAddress},Postal Code:  {row.postalCode}")

Total Number of Police Stations in Kiel in this RDF File:  21 

Police Stations in Kiel:
Name: 3. Polizeirevier Kiel, Address: Von-der-Tann-Straße 34,Postal Code:  24114
Name: 4. Polizeirevier Kiel, Address: Werftstraße 217,Postal Code:  24143
Name: Kriminalpolizeistelle Kiel, Address: Blumenstraße 2 - 4,Postal Code:  24103
Name: Landeskriminalamt, Address: Mühlenweg 166,Postal Code:  24116
Name: Polizei-Bezirksrevier Kiel, Address: Mühlenweg 166,Postal Code:  24116
Name: Polizeidirektion Kiel, Address: Gartenstraße 7,Postal Code:  24103
Name: Polizeistation Dietrichsdorf, Address: Ivensring 27,Postal Code:  24149
Name: Polizeistation Friedrichsort, Address: Fritz-Reuter-Straße 94,Postal Code:  24159
Name: Polizeistation Hassee, Address: Rendsburger Landstraße 206,Postal Code:  24113
Name: Polizeistation Holtenau, Address: Kanalstraße 46,Postal Code:  24159
Name: Polizeistation Mettenhof, Address: Skandinaviendamm 251,Postal Code:  24114
Name: Polizeistation Schilksee, Address: Langenf

In [9]:
# SPARQL Query 2: Police Stations in Neumünster with contact details
query_neumunster = """
PREFIX schema: <http://schema.org/>

SELECT ?name ?postalCode ?telephone ?email
WHERE {
    ?policeStation a schema:PoliceStation .
    ?policeStation schema:name ?name .
    ?policeStation schema:address ?address .
    ?address schema:addressLocality "Neumünster" .
    ?address schema:postalCode ?postalCode .
    OPTIONAL { ?policeStation schema:telephone ?telephone . }
    OPTIONAL { ?policeStation schema:email ?email . }
}
"""

# Execute the query
results_neu = graph_police.query(query_neumunster)

print("Number of Police Stations in Neumünster", len(results_neu))

# Print the results
print("Police Stations data in Neumünster: \n")
for row in results_neu:
    print(f"Name: {row.name},{row.postalCode}, "
          f"Telephone: {row.telephone}, Email: {row.email}")


Number of Police Stations in Neumünster 12
Police Stations data in Neumünster: 

Name: 2. Polizeirevier Neumünster,24539, Telephone: +49 4321 945-1202, Email: Neumuenster.PRev02@polizei.landsh.de
Name: Polizeistation Neumünster-Mitte,24539, Telephone: +49 4321 690360, Email: Mitte.PSt.Neumuenster@polizei.landsh.de
Name: Polizeidirektion Neumünster,24539, Telephone: +49 4321 945-0, Email: Neumuenster.PD@polizei.landsh.de
Name: Polizeistation Einfeld,24536, Telephone: +49 4321 52703, Email: Einfeld.PSt@polizei.landsh.de
Name: Polizeistation Neumünster-Südost, Zweigstelle Gadeland,24534, Telephone: +49 4321 945250, Email: nms.suedost..PSt@polizei.landsh.de
Name: Polizeistation Neumünster-Nord, Zweigstelle Tungendorf,24536, Telephone: +49 4321 301924, Email: nms-nord.PSt@polizei.landsh.de
Name: Polizeistation Wittorf,24539, Telephone: +49 4321 7079946, Email: Wittorf.PSt@polizei.landsh.de
Name: 1. Polizeirevier Neumünster,24539, Telephone: +49 4321 945-1111, Email: Neumuenster.PRev01@poliz

### Q4. CKAN Action API

In [10]:
import requests

# CKAN Action API endpoint that lists all organizations
url = "https://opendata.schleswig-holstein.de/api/action/organization_list"

# Send the HTTP GET request
response = requests.get(url, timeout=30)


if response.status_code == 200:
    organizations = response.json()['result']
    print("Organizations:")
    for org in organizations:
        print(org)
else:
    print(f"Error: {response.status_code}")

Organizations:
adac
awsh-abfallwirtschaft-sudholstein-gmbh
aktivregion-uthlande
amt-bad-oldesloe-land
amt-buechen
amt-daenischenhagen
amt-eidertal
amt-elmshorn-land
amt-haddeby
amt-huettener-berge
amt-marne-nordsee
amtmittelangeln
amt-nortorfer-land
amt-schlei-ostsee
schrevenborn
amt-suederbrarup
antikensammlung
alsh
bkzsh
bkg
bast
bundesnetzagentur
coworkland
compgen
delfi
landesmuseum-dithmarschen
fh-westkueste
fairtrade-deutschland
finanzministerium
fgho
ammersbek
buechen
stockelsdorf
glueckstadt
luebeck
histsem
klimaschutzagentur-rendsburg-eckernfoerde
kreis-herzogtum-lauenburg
nordfriesland
kreis-ostholstein
kreis-pinneberg
rendsburg-eckernforde
kreis-schleswig-flensburg
kreis-stormarn
kreisarchiv-stormarn
kunsthalle-kiel
landesamt-fur-denkmalpflege
llnl
llur
lfu
lvermgeo
lazuf
landesamt-fur-soziale-dienste
landesarchiv
lbv
lkn
landeshauptstadt-kiel
landesjagdverband
landeskriminalamt-schleswig-holstein
landesmeldestelle
mbwfk
mbwk
mekun
melund
mikws
milig
mjg
mllev
msgjfs
msjfsi

In [11]:
# CKAN Action API endpoint that lists all datasets (packages)
url = "https://opendata.schleswig-holstein.de/api/action/package_list"

# Send the HTTP GET request
response = requests.get(url, timeout=30)


if response.status_code == 200:
    datasets = response.json()['result']
    print("\nDatasets:")
    # Show only the first 10 dataset names
    for dataset in datasets[:10]:
        print(dataset)
else:
    print(f"Error: {response.status_code}")


Datasets:
01001000_bp_030_hildebrandstrasse_urschrift
01001000_bp_033_friedrich_ebert_strasse_urschrift
01001000_bp_034_muerwiker_strasse_1-aenderung
01001000_bp_034_muerwiker_strasse_urschrift
01001000_bp_035_strandfrieden_4-vereinfachte_aenderung
01001000_bp_036_schoene_aussicht_noerdlicher_teil_2-aenderung
01001000_bp_036_schoene_aussicht_noerdlicher_teil_urschrift
01001000_bp_036i_schoene_aussicht_suedlicher_teil_1-aenderung
01001000_bp_036i_schoene_aussicht_suedlicher_teil_urschrift
01001000_bp_037_johannismuehle_1-aenderung


In [12]:
# CKAN Action API endpoint that lists all groups
url = "https://opendata.schleswig-holstein.de/api/action/group_list"

# Send the HTTP GET request
response = requests.get(url, timeout=30)


if response.status_code == 200:
    groups = response.json()['result']
    print("\nGroups:")
    for group in groups:
        print(group)
else:
    print(f"Error: {response.status_code}")


Groups:
soci
educ
ener
heal
intr
just
agri
gove
regi
envi
tran
econ
tech


* How can the data sets of the portal be found and accessed via CKAN?

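The portal's datasets can be found and retrieved via the CKAN Action API: consult the API documentation to identify the available endpoints, then call them either through the ckanapi client library or with direct HTTP requests. Read-only actions such as the list endpoints used above can be called with HTTP GET; POST with a JSON body is also accepted.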

Available endpoints:
* organization_list: Lists all organizations.
* package_list: Lists all data sets (packages).
* group_list: Lists all groups.
* license_list: Lists all licenses.
* package_show: Shows details of a specific package.

Documentation: https://docs.ckan.org/en/2.8/api/
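
The endpoints listed above all follow the same URL pattern, so a small helper can cover them. The sketch below uses only the standard library. The names `action_url` and `call_action` are hypothetical, and the dataset id `schulen-2024-01-31` is assumed to match the slug in the dataset URL used earlier. The `limit` parameter of `package_list` and the `id` parameter of `package_show` are part of the CKAN Action API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://opendata.schleswig-holstein.de"


def action_url(action, **params):
    """Build the URL of a CKAN action, with optional query parameters."""
    url = f"{BASE_URL}/api/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url


def call_action(action, **params):
    """Call a read-only CKAN action via HTTP GET and return its 'result'.

    urlopen raises an HTTPError for error status codes; the 'success'
    flag of the JSON response is checked in addition.
    """
    with urlopen(action_url(action, **params), timeout=30) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action {action!r} failed: {payload.get('error')}")
    return payload["result"]


# Building the URLs needs no network access.
print(action_url("package_list", limit=10))
print(action_url("package_show", id="schulen-2024-01-31"))

# The actual call needs network access, for example:
# dataset = call_action("package_show", id="schulen-2024-01-31")
# for resource in dataset["resources"]:
#     print(resource["format"], resource["url"])
```

Keeping URL construction separate from the request makes the pattern easy to inspect and reuse for any other action, such as `license_list`.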