# Linked Geodata poster
- Author: David Swinkels
- Date: 24 August 2018


# Introduction

<div style="text-align: justify"> Linked data is a set of design principles for publishing data on the web and for managing data by storing the relations between features in a graph database. It is a widely used way to make links in data understandable to both machines and humans, on the web as well as locally. Linked geodata adds a spatial component to linked data. </div>

This notebook covers some of the theory and shows how to make linked geodata.

In [2]:
# Import all libraries
import rdflib
import geopandas
import pandas
import requests


from rdflib import Graph, Namespace, RDF, RDFS, OWL

# Concept
<div style="text-align: justify"> The concept of linked data is elegantly simple: store relations and share individual resources via unique IRIs. Linked data is often visualized as a graph to show relations (see figure below), and the data is stored as triples that describe relations instead of as values in columns and rows. A triple consists of three resources: a subject, a predicate and an object. The predicate defines the type of relation between the subject and the object. Furthermore, each resource, whether it is a subject, predicate or object, is stored as an IRI. An IRI is an Internationalized Resource Identifier. It is similar to a Uniform Resource Identifier [URI] or Uniform Resource Locator [URL], but an IRI also allows international characters. Objects can optionally be stored as literals, such as a string, integer or date. The framework that stores relations as triples is called the Resource Description Framework [RDF]. RDF supports merging data that use different schemas and supports schemas that evolve over time. For example, a linked data query can find out who the grandfather of Angie is by following semantic relations. </div>

<img src="./images/LinkedDataGraph.png">

<div style="text-align: justify"> An RDF graph stores a vocabulary and instances. A vocabulary defines the concepts and relationships in an area of concern. Vocabularies store classes and properties. Instances are entities of classes. </div>

| Triple notation | subject | predicate | object |
|------|------|------|------|
| Class definition | `dbo:Person` | `rdf:type` | `rdfs:Class` |
| Property definition | `dbo:spouse` | `rdf:type` | `owl:ObjectProperty` |
| Entity instance | `person:Maggie` | `rdf:type` | `dbo:Person` |
| Relation instance | `person:Maggie` | `dbo:spouse` | `person:Peter` |
| Value instance | `person:Angie` | `dbo:birthDate` | `"2001-05-15"^^xsd:date` |




<div style="text-align: justify"> Long IRIs that link to unique resources are shortened with a prefix to improve readability and save storage space. The IRI 'http://dbpedia.org/ontology/Person' is shortened to dbo:Person. Here dbo is the prefix that stands for the namespace 'http://dbpedia.org/ontology/' and Person is the local name that identifies the class within that namespace. Instead of having to write 'http://dbpedia.org/ontology/Mother', you can refer to the IRI as dbo:Mother. The prefixes and namespaces are defined in the RDF file before the triples, so the graph is stored as triples that use those prefixes. Namespaces and prefixes can be re-used. This is very powerful, because the semantics are persistent and can be used across domains. One public ontology is the building ontology from the land administration office in the Netherlands. This ontology stores all classes, properties, concepts, regulations and restrictions of building data, as can be seen below. </div>


In [4]:
# Download the building ontology in Turtle format
tripleHeader = {'Accept': 'text/turtle'}
response = requests.get("http://bag.basisregistraties.overheid.nl/def/bag", headers=tripleHeader, timeout=30)
response.raise_for_status()  # Stop early if the download failed
#print(response.text)

The downloaded building ontology is extensive. We will get more hands-on experience with linked data by making our own ontology. The goal is to define classes and relations so that multiple types of data can be integrated. The building data of the BAG is separated into buildings, addresses and building functions. An ontology makes it possible to create entities that are connected.

In [6]:
# Create simple building ontology
g = Graph()

# Namespaces for instance IRIs (buildings, addresses, building functions)
building = Namespace('https://bag.basisregistraties.overheid.nl/bag/doc/pand/')
address = Namespace('https://bag.basisregistraties.overheid.nl/bag/doc/adres/')
buildingFunction = Namespace('https://bag.basisregistraties.overheid.nl/bag/doc/verblijfsobject/')
# Namespace for the vocabulary (classes and properties)
bag = Namespace('http://bag.basisregistraties.overheid.nl/def/bag#')
g.bind(prefix="bag", namespace=bag)

# Create building, address and building function class
g.add((bag.Pand, RDF.type, OWL.Class))
g.add((bag.Address, RDF.type, OWL.Class))
g.add((bag.BuildingFunction, RDF.type, OWL.Class))

# Create ObjectProperties: pandrelatering links a building function to a building,
# hoofdadres links a building to an address
g.add((bag.pandrelatering, RDF.type, OWL.ObjectProperty))
g.add((bag.pandrelatering, RDFS.domain, bag.BuildingFunction))
g.add((bag.pandrelatering, RDFS.range, bag.Pand))
g.add((bag.hoofdadres, RDF.type, OWL.ObjectProperty))
g.add((bag.hoofdadres, RDFS.domain, bag.Pand))
g.add((bag.hoofdadres, RDFS.range, bag.Address))