# RDFlib Core
### A notebook demonstrating basic core functionality of the RDFlib toolkit

Parts:

A. Loading & Serializing
B. Native RDFlib accessing of graph data
C. SPARQL querying graphs
D. Namespaces
E. Creating data
F. Merging graphs


## A. Loading & Serializing
A.1. Import the main rdflib class, Graph

In [54]:
from rdflib import Graph

A.2. Create and load the graph from an RDF file

In [55]:
g = Graph()
g.parse("data/bdm.ttl", format="turtle")

<Graph identifier=Nda59456b2f844730abec20f1e7242663 (<class 'rdflib.graph.Graph'>)>

A.3. Print the number of triples in the graph - graph length - to confirm load

In [56]:
print(len(g))

180


A.4. Load data from the web
No format argument is needed here: RDFlib determines the RDF format from the Media Type (Content-Type) returned in the HTTP response headers

In [None]:
g2 = Graph()
g2.parse(location="http://pid.geoscience.gov.au/sample/AU1000005?_view=igsn-o&_format=text/turtle")
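# RDFlib 5.x returns bytes from serialize(), hence .decode(); RDFlib 6+ returns str, so drop .decode() there
print(g2.serialize(format="turtle").decode())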

A.5. Serialize the in-memory graph to a file using another RDF format (RDF/XML)

In [57]:
g.serialize("data/bdm.rdf", format="xml")

A.6. Show the contents of the newly created file for comparison

In [58]:
with open("data/bdm.rdf") as f:
    print(f.read())

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
   xmlns:dcterms="http://purl.org/dc/terms/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:sdo="https://schema.org/"
   xmlns:skos="http://www.w3.org/2004/02/skos/core#"
>
  <rdf:Description rdf:about="http://linked.data.gov.au/def/borehole-drilling-method/kastern-core">
    <skos:inScheme rdf:resource="http://linked.data.gov.au/def/borehole-drilling-method"/>
    <skos:definition xml:lang="en">Large volume, relatively undisturbed sediment core (~3m long) from a Kasten Corer. The corer is constructed of stainless steel and is square in cross-section. A weight of several hundred kilograms on top of the corer pushes it 2-3 metres into the seabed.</skos:definition>
    <skos:prefLabel xml:lang="en">Kastern Core</skos:prefLabel>
    <skos:broader rdf:resource="http://linked.data.gov.au/def/borehole-drilling-method/direct-push"/>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>
  ...

## B. Native RDFlib accessing of graph data

B.1. Loop through the graph, printing the subject of every triple - no filtering, so a subject is printed once for each triple it appears in

In [59]:
for s, p, o in g.triples((None, None, None)):
    print(s)

http://linked.data.gov.au/def/borehole-drilling-method/kastern-core
http://linked.data.gov.au/def/borehole-drilling-method/hand-auger
http://linked.data.gov.au/def/borehole-drilling-method/auger
http://linked.data.gov.au/def/borehole-drilling-method/vibrocore
http://linked.data.gov.au/def/borehole-drilling-method/sonic
http://linked.data.gov.au/def/borehole-drilling-method/percussion-drilling
http://linked.data.gov.au/def/borehole-drilling-method/percussion-drilling
http://linked.data.gov.au/def/borehole-drilling-method/air-core
http://linked.data.gov.au/def/borehole-drilling-method/cable-tool-drilling
http://linked.data.gov.au/def/borehole-drilling-method/geoprobe-core
http://linked.data.gov.au/def/borehole-drilling-method
http://linked.data.gov.au/def/borehole-drilling-method/delft-sampler
http://linked.data.gov.au/def/borehole-drilling-method/delft-sampler
http://linked.data.gov.au/def/borehole-drilling-method/hand-auger
...

B.2. Get just the SKOS Concepts in the graph - filter by type
First import the RDF and SKOS namespaces from RDFlib

In [60]:
from rdflib.namespace import RDF, SKOS

for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    print(s)

http://linked.data.gov.au/def/borehole-drilling-method/hand-auger
http://linked.data.gov.au/def/borehole-drilling-method/window-sampler
http://linked.data.gov.au/def/borehole-drilling-method/auger
http://linked.data.gov.au/def/borehole-drilling-method/vibrocore
http://linked.data.gov.au/def/borehole-drilling-method/sidewall-core-bullet
http://linked.data.gov.au/def/borehole-drilling-method/diamond-core
http://linked.data.gov.au/def/borehole-drilling-method/reverse-circulation-drilling
http://linked.data.gov.au/def/borehole-drilling-method/core-drilling
http://linked.data.gov.au/def/borehole-drilling-method/delft-sampler
http://linked.data.gov.au/def/borehole-drilling-method/piston-core
http://linked.data.gov.au/def/borehole-drilling-method/cable-tool-drilling
http://linked.data.gov.au/def/borehole-drilling-method/gravity-core
http://linked.data.gov.au/def/borehole-drilling-method/power-auger
http://linked.data.gov.au/def/borehole-drilling-method/sidewall-core
...

B.3. Print out labels for Concepts, not their URIs

In [61]:
for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    for s2, p2, o2 in g.triples((s, SKOS.prefLabel, None)):
        print(o2)

Hand Auger
Window Sampler
Auger
Vibrocore
Sidewall Core Bullet
Diamond Core
Reverse Circulation Drilling
Core Drilling
Delft Sampler
Piston Core
Cable Tool Drilling
Gravity Core
Power Auger
Sidewall Core
Sidewall Core Mechanical
Conventional Core
Geoprobe Core
Rotary Hammer Drilling
Mackintosh Probe
Hydraulic Rotary Drilling
Direct Push
Kastern Core
Percussion Drilling
Rab Drilling
Box Core
Air Core
Sonic
Mackereth Core
Probe
Cone Penetrometer Test Probe


B.4. Print out only Concepts with "Core" in the label

In [62]:
for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    for s2, p2, o2 in g.triples((s, SKOS.prefLabel, None)):
        if "Core" in str(o2):
            print(o2)

Sidewall Core Bullet
Diamond Core
Core Drilling
Piston Core
Gravity Core
Sidewall Core
Sidewall Core Mechanical
Conventional Core
Geoprobe Core
Kastern Core
Box Core
Air Core
Mackereth Core


## C. SPARQL querying graphs

Emulating the last query above - Concepts with "Core" in the label

C.1. Formulate the query

In [63]:
q = """
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

    SELECT ?uri ?pl
    WHERE {
        ?uri rdf:type skos:Concept .
        ?uri skos:prefLabel ?pl .

        FILTER REGEX(?pl, "Core")
    }
    ORDER BY ?pl
    """

C.2. Pose the query to the graph

In [64]:
results = g.query(q)

C.3. Loop through results and print

In [65]:
for r in results:
    print(str(r["uri"]), str(r["pl"]))

http://linked.data.gov.au/def/borehole-drilling-method/air-core Air Core
http://linked.data.gov.au/def/borehole-drilling-method/box-core Box Core
http://linked.data.gov.au/def/borehole-drilling-method/conventional-core Conventional Core
http://linked.data.gov.au/def/borehole-drilling-method/core-drilling Core Drilling
http://linked.data.gov.au/def/borehole-drilling-method/diamond-core Diamond Core
http://linked.data.gov.au/def/borehole-drilling-method/geoprobe-core Geoprobe Core
http://linked.data.gov.au/def/borehole-drilling-method/gravity-core Gravity Core
http://linked.data.gov.au/def/borehole-drilling-method/kastern-core Kastern Core
http://linked.data.gov.au/def/borehole-drilling-method/mackereth-core Mackereth Core
http://linked.data.gov.au/def/borehole-drilling-method/piston-core Piston Core
http://linked.data.gov.au/def/borehole-drilling-method/sidewall-core Sidewall Core
http://linked.data.gov.au/def/borehole-drilling-method/sidewall-core-bullet Sidewall Core Bullet
...

## D. Namespaces

D.1. Remove all triples from the graph

In [66]:
g.remove((None, None, None))  # removal with no filter, so all triples gone!
print(len(g))  # should print zero

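# close() releases any resources held by the graph's store; it does not by itself empty the graph
g.close()

# the usual way to start over is simply to create a new Graph
g = Graph()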

0


D.2. Add data to the new, empty graph by parsing RDF text in the Turtle format. The Turtle here is deliberately verbose: full URIs, no prefixes and the subject repeated on every line

In [67]:
data = """
        <http://example.com/p1> a <http://example.com/Person> .
        <http://example.com/p1> <http://example.com/name> "Nick"@en .
        <http://example.com/p1> <http://example.com/name> "Mikolajek"@pl .
        """
g.parse(data=data, format="turtle")
print(len(g))  # should print 3


3


D.3. Serialize the graph back to Turtle and note the compact form: triples are grouped by subject and an auto-generated prefix (ns1) replaces the full URIs

In [None]:
print(g.serialize(format="turtle").decode())


D.4. Import Namespace from RDFlib and declare a test namespace

In [None]:
from rdflib import Namespace

EG = Namespace("http://example.com/")

D.5. Bind this namespace to a prefix for this graph - this must be done per-graph

In [None]:
g = Graph()
g.bind("eg", EG)
data = """
        <http://example.com/p1> a <http://example.com/Person> .
        <http://example.com/p1> <http://example.com/name> "Nick"@en .
        <http://example.com/p1> <http://example.com/name> "Mikolajek"@pl .
        """
g.parse(data=data, format="turtle")

D.6. Serialize graph again, now notice the Namespace prefix "eg" in use

In [71]:
print(g.serialize(format="turtle").decode())


@prefix eg: <http://example.com/> .

eg:p1 a eg:Person ;
    eg:name "Nick"@en,
        "Mikolajek"@pl .




## E. Creating data

E.1. Reset graph

In [36]:
g = Graph()  # reset graph
g.bind("eg", EG)
print(len(g))  # should print 0

0


E.2. Import URIRef & Literal - RDFlib classes for RDF things

In [37]:
from rdflib import URIRef, Literal

E.3. Create triple by triple - the data above

In [38]:
g.add((
    URIRef("http://example.com/p1"),    # subject
    RDF.type,                           # predicate, same as 'a'
    URIRef("http://example.com/Person")
))
g.add((URIRef("http://example.com/p1"), URIRef("http://example.com/name"), Literal("Nick", lang="en")))
g.add((
    EG.p1,
    EG.name,
    Literal("Mikolajek", lang="pl")
))
print(g.serialize(format="turtle").decode())

@prefix eg: <http://example.com/> .

eg:p1 a eg:Person ;
    eg:name "Nick"@en,
        "Mikolajek"@pl .




E.4. Remove just the triple for the Polish name Mikolajek and re-serialise contents

In [39]:
g.remove((
    EG.p1,
    EG.name,
    Literal("Mikolajek", lang="pl")
))
print(g.serialize(format="turtle").decode())

@prefix eg: <http://example.com/> .

eg:p1 a eg:Person ;
    eg:name "Nick"@en .




## F. Merging graphs

F.1. Create a new graph, g2, and put in some data with a subject URI in common with our existing data, EG.p1

In [None]:
from rdflib.namespace import XSD

g2 = Graph()
g2.add((
    EG.p1,
    EG.birthdate,
    Literal("1982-05-11", datatype=XSD.date)
))
g2.add((
    EG.p1,
    EG.age,
    Literal("38", datatype=XSD.integer)
))
print(g2.serialize(format="turtle").decode())

F.2. Merge the contents of g & g2 with the + operator, which returns a new graph and leaves g and g2 unchanged

In [None]:
print("No. triples in g: {}".format(len(g)))
print("No. triples in g2: {}".format(len(g2)))
g3 = g + g2
print("No. triples in g3: {}".format(len(g3)))

F.3. Print the merged graph

In [None]:
g3.bind("eg", EG)
print(g3.serialize(format="turtle").decode())