# RDFlib Core
### A notebook demonstrating basic core functionality of the RDFlib toolkit

Parts:

* A. Loading & Serializing
* B. Native RDFlib accessing of graph data
* C. SPARQL querying graphs
* D. Namespaces
* E. Creating data
* F. Merging graphs


## A. Loading & Serializing
A.1. Import the main rdflib class, Graph

In [None]:
from rdflib import Graph

A.2. Create and load the graph from an RDF file

In [None]:
g = Graph()
g.parse("data/bdm.ttl", format="turtle")

A.3. Print the number of triples in the graph - graph length - to confirm load

In [None]:
print(len(g))

A.4. Load data from the web
No format argument is needed: RDFlib works out the RDF format from the Media Type returned in the HTTP response headers

In [None]:
g2 = Graph()
g2.parse(location="http://pid.geoscience.gov.au/sample/AU1000005?_view=igsn-o&_format=text/turtle")
# rdflib 6+ returns a str here; on rdflib 5, serialize() returns bytes, so append .decode()
print(g2.serialize(format="turtle"))

A.5. Serialize the in-memory graph using another RDF format (XML)

In [None]:
g.serialize(destination="data/bdm.rdf", format="xml")

A.6. Show the contents of the newly created file for comparison

In [None]:
with open("data/bdm.rdf") as f:
    print(f.read())

## B. Native RDFlib accessing of graph data

B.1. Loop through graph, printing the subjects of all triples - no filtering

In [None]:
for s, p, o in g.triples((None, None, None)):
    print(s)

B.2. Getting just SKOS Concepts in the graph - filter by type
First import namespaces from RDFlib

In [None]:
from rdflib.namespace import RDF, SKOS

for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    print(s)

B.3. Print out labels for Concepts, not their URIs

In [None]:
for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    for s2, p2, o2 in g.triples((s, SKOS.prefLabel, None)):
        print(o2)

B.4. Print out only Concepts with "Core" in the label

In [None]:
for s, p, o in g.triples((None, RDF.type, SKOS.Concept)):
    for s2, p2, o2 in g.triples((s, SKOS.prefLabel, None)):
        if "Core" in str(o2):
            print(o2)

## C. SPARQL querying graphs

Emulating the last query above - Concepts with "Core" in the label

C.1. Formulate the query

In [None]:
q = """
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

    SELECT ?uri ?pl
    WHERE {
        ?uri rdf:type skos:Concept .
        ?uri skos:prefLabel ?pl .

        FILTER REGEX(?pl, "Core")
    }
    ORDER BY ?pl
    """

C.2. Pose the query to the graph

In [None]:
results = g.query(q)

C.3. Loop through results and print

In [None]:
for r in results:
    print(str(r["uri"]), str(r["pl"]))

## D. Namespaces

D.1. Remove all triples from the graph

In [None]:
g.remove((None, None, None))  # removal with no filter, so all triples gone!
print(len(g))  # should print zero

# optionally close the graph to release its store's resources
g.close()

# or simply start again with a new, empty Graph
g = Graph()

D.2. Add data to the new, empty graph by parsing RDF text in the Turtle format - written verbosely here, with full URIs and a repeated subject

In [None]:
data = """
        <http://example.com/p1> a <http://example.com/Person> .
        <http://example.com/p1> <http://example.com/name> "Nick"@en .
        <http://example.com/p1> <http://example.com/name> "Mikolajek"@pl .
        """
g.parse(data=data, format="turtle")
print(len(g))  # should print 3


D.3. Serialize the graph back to Turtle and note the compact form - the repeated subject is written only once

In [None]:
# rdflib 6+ returns a str here; on rdflib 5, append .decode()
print(g.serialize(format="turtle"))


D.4. Import Namespace from RDFlib and declare a test namespace

In [None]:
from rdflib import Namespace

EG = Namespace("http://example.com/")

D.5. Bind this namespace to a prefix for this graph - this must be done per-graph

In [None]:
g = Graph()
g.bind("eg", EG)
data = """
        <http://example.com/p1> a <http://example.com/Person> .
        <http://example.com/p1> <http://example.com/name> "Nick"@en .
        <http://example.com/p1> <http://example.com/name> "Mikolajek"@pl .
        """
g.parse(data=data, format="turtle")

D.6. Serialize graph again, now notice the Namespace prefix "eg" in use

In [None]:
print(g.serialize(format="turtle"))


## E. Creating data

E.1. Reset graph

In [None]:
g = Graph()  # reset graph
g.bind("eg", EG)
print(len(g))  # should print 0

E.2. Import URIRef & Literal - RDFlib classes for RDF things

In [None]:
from rdflib import URIRef, Literal

E.3. Create triple by triple - the data above

In [None]:
g.add((
    URIRef("http://example.com/p1"),     # subject
    RDF.type,                            # predicate, same as 'a'
    URIRef("http://example.com/Person")  # object
))
g.add((URIRef("http://example.com/p1"), URIRef("http://example.com/name"), Literal("Nick", lang="en")))
# the EG namespace builds the same URIs more concisely
g.add((
    EG.p1,
    EG.name,
    Literal("Mikolajek", lang="pl")
))
print(g.serialize(format="turtle"))

E.4. Remove just the triple for the Polish name Mikolajek and re-serialize the contents

In [None]:
g.remove((
    EG.p1,
    EG.name,
    Literal("Mikolajek", lang="pl")
))
print(g.serialize(format="turtle"))

## F. Merging graphs

F.1. Create a new graph, g2, and put in some data with a subject URI in common with our existing data, EG.p1

In [None]:
from rdflib.namespace import XSD

g2 = Graph()
g2.add((
    EG.p1,
    EG.birthdate,
    Literal("1982-05-11", datatype=XSD.date)
))
g2.add((
    EG.p1,
    EG.age,
    Literal("38", datatype=XSD.integer)
))
print(g2.serialize(format="turtle"))

F.2. Merge the contents of g & g2 by adding the graphs

In [None]:
print("No. triples in g: {}".format(len(g)))
print("No. triples in g2: {}".format(len(g2)))
g3 = g + g2
print("No. triples in g3: {}".format(len(g3)))

F.3. Print the merged graph

In [None]:
g3.bind("eg", EG)
print(g3.serialize(format="turtle"))