# Linked Data with Omeka S Provided Example

Linked data can be retrieved from the Omeka S API,
and you can process it in Python with the `rdflib` library.

## Setup

This section imports `requests` for working with the REST API. More important, however,
are the import statements for `rdflib`.
Of particular note are the imports of the core `rdflib` classes, including `Graph`, `URIRef`, `Literal`, and `BNode` (for blank nodes), along with the `Namespace` class for defining custom vocabularies.
The serializer and parser plugins assist in converting graph data to and from various transport formats, including RDF/XML, JSON-LD, and Turtle, among others.
Finally, note the last import line, which imports several built-in namespaces,
including the core Resource Description Framework vocabularies (both RDF and RDFS), Friend of a Friend (FOAF), Dublin Core terms (DCTERMS), and schema.org (SDO).

In [None]:
import requests
import rdflib
from rdflib import Graph, URIRef, Literal, BNode, Namespace, plugin, Variable
from rdflib.serializer import Serializer
from rdflib.plugin import register, Parser
from rdflib.namespace import RDF, RDFS, FOAF, DCTERMS, SDO

In [None]:
# create sample data to add to the graph
newData = {
    'Jane Austen' : {
        'https://schema.org/deathDate' : 1817,
        'https://schema.org/birthDate' : 1775,
        'https://schema.org/deathPlace': 'https://en.wikipedia.org/wiki/England'
    },
    'Octavia E. Butler' : { 
        'https://schema.org/deathDate' : 2006,
        'https://schema.org/birthDate' : 1947,
        'https://schema.org/deathPlace': 'https://en.wikipedia.org/wiki/Lake_Forest_Park,_Washington'
        },
    'Herman Melville' : { 
        'https://schema.org/deathDate' : 1891,
        'https://schema.org/birthDate' : 1819,
        'https://schema.org/deathPlace' : 'https://en.wikipedia.org/wiki/New_York_City'
        }
}

Add namespace information for Omeka S's scheme:

In [None]:
omekas_ns = Namespace('http://omeka.org/s/vocabs/o#')

## Retrieve Data from Omeka S

Search for all the items in the specified item set:

In [None]:
url = 'http://jajohnst.si676.si.umich.edu/omeka-s/api'

action = '/items'

# if you create items in your Omeka S site,
# your item set will have a different id (specific to your site)
parameters = {
    'item_set_id':311,
}

In [None]:
r = requests.get(url + action, params=parameters)

print(r.url)
print(r.status_code)
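
To see how the base URL, the action, and the parameters combine, the sketch below rebuilds the query URL with the standard library alone, without contacting the server. It is an approximation of what `requests` assembles, so the value printed by `r.url` is the authoritative one.

```python
from urllib.parse import urlencode

url = 'http://jajohnst.si676.si.umich.edu/omeka-s/api'
action = '/items'
parameters = {'item_set_id': 311}

# the parameters become a query string appended after '?'
preview = f'{url}{action}?{urlencode(parameters)}'
print(preview)
```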

In [None]:
r.json()

## Parse data with the RDFLib module

Using the `rdflib` module capabilities, parse this data.

First, create an RDF graph from it:

In [None]:
g = Graph().parse(data=r.text, format='json-ld')

Add the Omeka S namespace (`omekas_ns`, prefixed as `o`):

In [None]:
g.bind('o',omekas_ns)

Now, look through the graph. The graph is a set of "triples",
which are subject-predicate-object tuples. These can be modified. For example, after the initial look, you can remove all of those whose predicate is in the Omeka S namespace (`o`).
Note that a node exists in the graph only as part of a triple, so a subject or object left orphaned when its last triple is removed no longer appears in the graph.

In [None]:
for s, p, o in g:
    print(f'{s} -> {p} -> {o} .')

### Outputting, saving, and serializing

Convert the graph to Turtle format:

In [None]:
ser = g.serialize(format='turtle')

print(ser)

Save it to a file:

In [None]:
with open('item-set-graph-1.ttl', 'w', encoding='utf-8') as f:
    f.write(ser)

## Parsing, Modifying, and Adding to the Graph

Now remove the Omeka-specific triples to get a closer look
at the collection-specific data.

In [None]:
# remove the Omeka-specific data; iterate over a snapshot (list(g))
# so the graph is not modified while it is being iterated
for triple in list(g):
    if str(triple[1]).startswith(str(omekas_ns)):
        g.remove(triple)

In [None]:
for s, p, o in g:
    print(f'{s} -> {p} -> {o}')

### Adding the "newData"

In [None]:
# add the "newData" by looping (iterating) through the data
# and attaching it to the matching subjects

# Note: this only works if the keys (author names) already appear as
# foaf:name values in the data on the site,
# so the data must be uploaded and added first

for author_name in newData:
    for s, p, o in g.triples((None, FOAF.name, Literal(author_name))):
        deathDate = newData[o.value]['https://schema.org/deathDate']
        deathPlace = newData[o.value]['https://schema.org/deathPlace']
        g.add((s, URIRef('https://schema.org/deathDate'), Literal(deathDate)))
        g.add((s, URIRef('https://schema.org/deathPlace'), URIRef(deathPlace)))

To demonstrate how the graph changed, serialize the new graph:

In [None]:
ser2 = g.serialize(format='turtle')

with open('item-set-graph-2.ttl', 'w', encoding='utf-8') as f:
    f.write(ser2)