# Exercise 1: RDFLib Basics

In this exercise you will learn how to use RDFLib to:
- Create an RDF Graph.
- Load and Save an RDF Graph.
- Iterate through the triples in an RDF Graph.
- Update and Delete RDF triples in an RDF Graph.

[RDFLib](https://rdflib.readthedocs.io/en/stable/index.html) is a pure Python package that contains [most functionality](https://en.wikipedia.org/wiki/RDFLib) you need to work with RDF, including:
- *Parsers* and *Serializers* for RDF/XML, N3, N-Triples, N-Quads, Turtle and JSON-LD (via a plugin).
- A Graph interface which can be backed by any one of a number of Store implementations.
- A SPARQL 1.1 implementation, supporting SPARQL 1.1 Queries and Update Statements.

As a first step, let's import the Python modules we need.

In [1]:
# pip install rdflib
import rdflib
from rdflib import Graph, URIRef, BNode, Literal, Namespace
from rdflib.namespace import RDF, RDFS, OWL, XSD, FOAF, DCTERMS, SDO, SKOS

What have we just imported from RDFLib?

The namespaces (RDF, RDFS, OWL, XSD, FOAF, DCTERMS, SDO, SKOS) we have imported are the BASE URIs for the following vocabularies (ontologies):
- Resource Description Framework (RDF)
- Resource Description Framework Schema (RDFS)
- Web Ontology Language (OWL)
- XML Schema (XSD)
- Friend of a Friend (FOAF)
- Dublin Core Terms (DCTERMS)
- Schema.Org (SDO)
- Simple Knowledge Organization System (SKOS)

We can also create a namespace for our own Base URI. As a best practice, this URI should be a URL with an actual
web page behind it, but in this case, it is just a URI without any web page.
- Let's just make up our own BASE URI: http://example.org/
- Let's call the namespace variable 'EX'.  
`EX = Namespace('http://example.org/')`

Let's print out the Base URIs for these namespaces so we see what they look like.

In [2]:
# First, let's create our own Base URI namespace
EX = Namespace('http://example.org/')

# Now, let's print out the BASE URIs for these namespaces,
# including the imported namespaces and our own namespace
print(f'EX      : {EX}')
print(f'RDF     : {RDF}')
print(f'RDFS    : {RDFS}')
print(f'OWL     : {OWL}')
print(f'XSD     : {XSD}')
print(f'FOAF    : {FOAF}')
print(f'DCTERMS : {DCTERMS}')
print(f'SDO     : {SDO}')
print(f'SKOS    : {SKOS}')

EX      : http://example.org/
RDF     : http://www.w3.org/1999/02/22-rdf-syntax-ns#
RDFS    : http://www.w3.org/2000/01/rdf-schema#
OWL     : http://www.w3.org/2002/07/owl#
XSD     : http://www.w3.org/2001/XMLSchema#
FOAF    : http://xmlns.com/foaf/0.1/
DCTERMS : http://purl.org/dc/terms/
SDO     : https://schema.org/
SKOS    : http://www.w3.org/2004/02/skos/core#


From the lecture, we know that e.g.:
- The RDF namespace has a resource called `type`.
- The RDF namespace has a resource called `Property`.
- The RDFS namespace has a resource called `label`.
- The RDFS namespace has a resource called `range`.
- The RDFS namespace has a resource called `domain`.
- The RDFS namespace has a resource called `Class`.
- The FOAF namespace has a resource called `name`.

Let's check this:

In [3]:
print(RDF.type)
print(RDF.Property)
print(RDFS.label)
print(RDFS.range)
print(RDFS.domain)
print(RDFS.Class)
print(FOAF.name)

http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/2000/01/rdf-schema#label
http://www.w3.org/2000/01/rdf-schema#range
http://www.w3.org/2000/01/rdf-schema#domain
http://www.w3.org/2000/01/rdf-schema#Class
http://xmlns.com/foaf/0.1/name


We also know that, for example:
- The RDF namespace **does not** have a resource called `typexxx`.
- The RDFS namespace **does not** have a resource called `labelxxx`.
- The FOAF namespace **does not** have a resource called `namexxx`.

In [4]:
try: print(RDF.typexxx)
except Exception as e: print(e)

try: print(RDFS.labelxxx)
except Exception as e: print(e)

try: print(FOAF.namexxx)
except Exception as e: print(e)

term 'typexxx' not in namespace 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
term 'labelxxx' not in namespace 'http://www.w3.org/2000/01/rdf-schema#'
term 'namexxx' not in namespace 'http://xmlns.com/foaf/0.1/'


We can also add a resource, e.g. `mason`, to our own namespace `EX`.

In this case (for our own namespace without an actual RDF web page behind it), we can do this in two different ways:
- `EX.mason`
- `EX['mason']`

Let's add a resource called `mason` and a resource called `shyla` to namespace `EX`.

In [5]:
print(EX.mason)
print(EX['shyla'])

http://example.org/mason
http://example.org/shyla


The remaining Python imports are classes representing:
- An RDF Graph (Graph)
- A URI (URIRef)
- A Blank Node (BNode)
- A literal (Literal)

To create an RDF Graph, and add RDF triples to it, we could do the following.

In [6]:
# Create an RDF Graph
g = Graph()

# Create a RDF URI node to use as the subject for multiple triples
mason = URIRef('http://example.org/mason')

# Add triples using the Graph object's add() method
g.add( (mason, RDF.type, FOAF.Person) )
g.add( (mason, RDFS.label, Literal('Mason Carter')) )
g.add( (mason, FOAF.name, Literal('Mason Carter')) )
g.add( (mason, FOAF.nick, Literal('masonc', lang='en')) )
g.add( (mason, FOAF.mbox, URIRef('mailto:mason@example.org')) )
g.add( (mason, FOAF.mbox, URIRef('mailto:masonc@example.org')) )

# Add another person
# This time, let's add the person using our
# EX namespace instead of the full URI
#shyla = URIRef('http://example.org/shyla')

# Add triples using the Graph's add() method
g.add( (EX.shyla, RDF.type, FOAF.Person) )
g.add( (EX.shyla, RDFS.label, Literal('Shyla Sharples')) )
g.add( (EX.shyla, FOAF.name, Literal('Shyla Sharples')) )
g.add( (EX.shyla, FOAF.nick, Literal('shyla', datatype=XSD.string)) )
g.add( (EX.shyla, FOAF.mbox, URIRef('mailto:shyla@example.org')) )

<Graph identifier=N451c94a566e34653a53b9fc2ab50b82a (<class 'rdflib.graph.Graph'>)>

We can iterate over all the triples in our graph and print them out as follows.

In [7]:
# Iterate over triples in the graph and print them out
for s, p, o in g:
    print(s, p, o)

http://example.org/mason http://xmlns.com/foaf/0.1/name Mason Carter
http://example.org/mason http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://xmlns.com/foaf/0.1/Person
http://example.org/shyla http://xmlns.com/foaf/0.1/name Shyla Sharples
http://example.org/mason http://www.w3.org/2000/01/rdf-schema#label Mason Carter
http://example.org/shyla http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://xmlns.com/foaf/0.1/Person
http://example.org/mason http://xmlns.com/foaf/0.1/mbox mailto:mason@example.org
http://example.org/shyla http://xmlns.com/foaf/0.1/mbox mailto:shyla@example.org
http://example.org/mason http://xmlns.com/foaf/0.1/mbox mailto:masonc@example.org
http://example.org/shyla http://xmlns.com/foaf/0.1/nick shyla
http://example.org/mason http://xmlns.com/foaf/0.1/nick masonc
http://example.org/shyla http://www.w3.org/2000/01/rdf-schema#label Shyla Sharples


Let's:

1. Iterate over all the subjects that have the predicate `RDF.type` and the object `FOAF.Person`.
2. Then, iterate over all objects that have a subject from step 1 and the predicate `FOAF.nick`.
3. Finally, let's print out the objects from step 2.

In [8]:
# For each foaf:Person in the graph, print out their nickname's value
for person in g.subjects(RDF.type, FOAF.Person):
    for nick in g.objects(person, FOAF.nick):
        print(nick)

masonc
shyla


We can serialize our graph to a string in a specific RDF format. Let's use the *Turtle* format we have seen during the lecture.

In [9]:
# Serialize the graph to a string in the Turtle format
ttl = g.serialize(format='ttl')

# Print all the triples (in the Turtle format)
print(ttl)

@prefix ns1: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/mason> a ns1:Person ;
    rdfs:label "Mason Carter" ;
    ns1:mbox <mailto:mason@example.org>,
        <mailto:masonc@example.org> ;
    ns1:name "Mason Carter" ;
    ns1:nick "masonc"@en .

<http://example.org/shyla> a ns1:Person ;
    rdfs:label "Shyla Sharples" ;
    ns1:mbox <mailto:shyla@example.org> ;
    ns1:name "Shyla Sharples" ;
    ns1:nick "shyla"^^xsd:string .




Notice that:
- The prefix for the FOAF namespace is called `ns1`.
- We get the full URIs for the resources in our own namespace `EX`.

Let's fix this so that:
- The prefix for the FOAF namespace is called `foaf`.
- We use the prefix `ex` for our own namespace `EX`.
  - Where the resources use the `ex` prefix instead of the full URIs.

In [10]:
# Bind the FOAF and EX namespaces to a prefix for more readable output
g.bind('foaf', FOAF)
g.bind('ex', EX)

# Serialize the graph to the Turtle format, and print it out
print(g.serialize(format='ttl'))

@prefix ex: <http://example.org/> .
@prefix ns1: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:mason a ns1:Person ;
    rdfs:label "Mason Carter" ;
    ns1:mbox <mailto:mason@example.org>,
        <mailto:masonc@example.org> ;
    ns1:name "Mason Carter" ;
    ns1:nick "masonc"@en .

ex:shyla a ns1:Person ;
    rdfs:label "Shyla Sharples" ;
    ns1:mbox <mailto:shyla@example.org> ;
    ns1:name "Shyla Sharples" ;
    ns1:nick "shyla"^^xsd:string .




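Oops! The prefix for FOAF is still `ns1`. It looks like the prefix that was generated for the namespace during the first serialization was not replaced by our later `bind()` call, so we had better bind our prefixes before we serialize the graph for the first time. Let's try this.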

In [11]:
# Create an RDF Graph
g = Graph()

mason = URIRef('http://example.org/mason')
g.add( (mason, RDF.type, FOAF.Person) )
g.add( (mason, RDFS.label, Literal('Mason Carter')) )
g.add( (mason, FOAF.name, Literal('Mason Carter')) )
g.add( (mason, FOAF.nick, Literal('masonc', lang='en')) )
g.add( (mason, FOAF.mbox, URIRef('mailto:mason@example.org')) )
g.add( (mason, FOAF.mbox, URIRef('mailto:masonc@example.org')) )

shyla = URIRef('http://example.org/shyla')
g.add( (shyla, RDF.type, FOAF.Person) )
g.add( (shyla, RDFS.label, Literal('Shyla Sharples')) )
g.add( (shyla, FOAF.name, Literal('Shyla Sharples')) )
g.add( (shyla, FOAF.nick, Literal('shyla', datatype=XSD.string)) )
g.add( (shyla, FOAF.mbox, URIRef('mailto:shyla@example.org')) )

g.bind('foaf', FOAF)
g.bind('ex', EX)

print(g.serialize(format='ttl'))

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:mason a foaf:Person ;
    rdfs:label "Mason Carter" ;
    foaf:mbox <mailto:mason@example.org>,
        <mailto:masonc@example.org> ;
    foaf:name "Mason Carter" ;
    foaf:nick "masonc"@en .

ex:shyla a foaf:Person ;
    rdfs:label "Shyla Sharples" ;
    foaf:mbox <mailto:shyla@example.org> ;
    foaf:name "Shyla Sharples" ;
    foaf:nick "shyla"^^xsd:string .




Now the RDF graph looks better, but:
- We've given Mason the nickname `masonc` instead of just `mason`.
- We've also added a faulty email address using this nickname.

Let's fix this.

In [12]:
# Replace Literal value
g.set( (mason, FOAF.nick, Literal('mason', lang='en')) )

# Remove triple from the graph
g.remove( (mason, FOAF.mbox, URIRef('mailto:masonc@example.org')) )

print(g.serialize(format='ttl'))

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:mason a foaf:Person ;
    rdfs:label "Mason Carter" ;
    foaf:mbox <mailto:mason@example.org> ;
    foaf:name "Mason Carter" ;
    foaf:nick "mason"@en .

ex:shyla a foaf:Person ;
    rdfs:label "Shyla Sharples" ;
    foaf:mbox <mailto:shyla@example.org> ;
    foaf:name "Shyla Sharples" ;
    foaf:nick "shyla"^^xsd:string .




Now the graph looks good, so let's save it to the file system as `example.ttl`.

In [13]:
g.serialize('example.ttl', format='ttl')

<Graph identifier=N1d3fbebc179948de8770f4a07adc284d (<class 'rdflib.graph.Graph'>)>

Let's also make sure we can reload it.

In [14]:
# Parse in an RDF file from the file system
h = Graph()
rdf_format = 'ttl'
# If we don't know which RDF format the file is written in,
# we can use the following function to try to guess the correct format
rdf_format = rdflib.util.guess_format('example.ttl')
h.parse('example.ttl', format=rdf_format)

print(h.serialize(format='ttl'))

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:mason a foaf:Person ;
    rdfs:label "Mason Carter" ;
    foaf:mbox <mailto:mason@example.org> ;
    foaf:name "Mason Carter" ;
    foaf:nick "mason"@en .

ex:shyla a foaf:Person ;
    rdfs:label "Shyla Sharples" ;
    foaf:mbox <mailto:shyla@example.org> ;
    foaf:name "Shyla Sharples" ;
    foaf:nick "shyla"^^xsd:string .




Finally, let's delete Mason from the loaded RDF graph.

In [15]:
# Remove all triples that match the subject, predicate and object.
# If we use the value 'None', it means it will match any value.
# In this case, we are removing all triples that match 'mason'
# as the subject, and match any value for the predicate and object.
h.remove( (mason, None, None) )

print(h.serialize(format='ttl'))

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:shyla a foaf:Person ;
    rdfs:label "Shyla Sharples" ;
    foaf:mbox <mailto:shyla@example.org> ;
    foaf:name "Shyla Sharples" ;
    foaf:nick "shyla"^^xsd:string .


