First we'll call the OMDb API to retrieve a JSON representation of "Fear and Loathing in Las Vegas".

*(note that this deviates from the course slides, which use the 'old' imdbapi.org service)*

See <https://www.omdbapi.com> for examples of how to use this service.

In [12]:
import requests

omdb_api_url = "http://www.omdbapi.com/"

params = {
    't': 'Fear and Loathing in Las Vegas',
    'plot': 'short',
    'r': 'json',
    'y': ''
}
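
To make it concrete how such a `params` dictionary ends up in the request, here is a minimal sketch (Python 3, standard library only) that builds the query string by hand without sending anything. The helper name `build_omdb_url` is hypothetical, and the empty `y` parameter is left out for brevity.

```python
from urllib.parse import urlencode

OMDB_API_URL = "http://www.omdbapi.com/"


def build_omdb_url(title, plot="short", fmt="json"):
    """Return the full URL for a title lookup (no request is sent)."""
    # urlencode escapes the values and joins them as key=value pairs
    query = urlencode({"t": title, "plot": plot, "r": fmt})
    return "{}?{}".format(OMDB_API_URL, query)


url = build_omdb_url("Fear and Loathing in Las Vegas")
print(url)
```

Printing the URL before calling the service is a cheap way to check that the title is escaped as expected.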



Retrieve the response, interpret it as JSON, and show it

In [13]:
response = requests.get(omdb_api_url, params=params).json()

response

{u'Actors': u'Johnny Depp, Benicio Del Toro, Tobey Maguire, Ellen Barkin',
 u'Awards': u'1 win & 3 nominations.',
 u'Country': u'USA',
 u'Director': u'Terry Gilliam',
 u'Genre': u'Adventure, Comedy, Drama',
 u'Language': u'English',
 u'Metascore': u'41',
 u'Plot': u'An oddball journalist and his psychopathic lawyer travel to Las Vegas for a series of psychedelic escapades.',
 u'Poster': u'http://ia.media-imdb.com/images/M/MV5BNjA2ZDY3ZjYtZmNiMC00MDU5LTgxMWEtNzk1YmI3NzdkMTU0XkEyXkFqcGdeQXVyNjQyMjcwNDM@._V1_SX300.jpg',
 u'Rated': u'R',
 u'Released': u'22 May 1998',
 u'Response': u'True',
 u'Runtime': u'118 min',
 u'Title': u'Fear and Loathing in Las Vegas',
 u'Type': u'movie',
 u'Writer': u'Hunter S. Thompson (book), Terry Gilliam (screenplay), Tony Grisoni (screenplay), Tod Davies (screenplay), Alex Cox (screenplay)',
 u'Year': u'1998',
 u'imdbID': u'tt0120669',
 u'imdbRating': u'7.7',
 u'imdbVotes': u'214,224'}
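
The payload carries its own `Response` flag, so it can be worth checking it before converting anything. The sketch below is one way to do that. The helper name `require_success` is hypothetical, and reading the message from an `Error` field is an assumption about what a failed lookup returns.

```python
def require_success(payload):
    """Return the payload unchanged, or raise if the lookup was not successful."""
    if payload.get("Response") != "True":
        # Assumption: a failed lookup carries a human-readable 'Error' field
        raise ValueError(payload.get("Error", "OMDb lookup failed"))
    return payload


ok = require_success({"Response": "True", "Title": "Fear and Loathing in Las Vegas"})
print(ok["Title"])
```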

This is JSON, but not very 'neat': multi-valued fields such as 'Writer' and 'Actors' are plain comma-separated strings, so we still need to do some parsing ourselves.
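
As a minimal sketch of that parsing step, the snippet below splits a comma-separated value and, for writer credits, separates the name from the role in parentheses. Both helper names are hypothetical. Splitting on commas is a simplification that would break on names that themselves contain a comma.

```python
import re


def split_values(raw):
    """Split a comma-separated string and drop surrounding whitespace."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def split_credit(credit):
    """Split 'Name (role)' into (name, role); role is None when absent."""
    match = re.match(r"^(.*?)\s*\(([^)]*)\)\s*$", credit)
    if match:
        return match.group(1), match.group(2)
    return credit, None


raw_writers = "Hunter S. Thompson (book), Terry Gilliam (screenplay)"
writers = [split_credit(w) for w in split_values(raw_writers)]
print(writers)
```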

Let's go over this response, and create a simple RDF representation for it:

* Create an RDF Graph
* Define our namespace, and add a prefix binding to the graph. We'll use the IMDB domain, though this means the URIs we create cannot be dereferenced.

In [26]:
from rdflib import Graph, Namespace, URIRef, Literal, XSD, RDF

# create an RDF graph
g = Graph()


IMDB = Namespace("http://imdb.com/")

g.bind('imdb',IMDB)

The most straightforward way to convert to RDF is to iterate over the keys/values in the dataset

But because we want to do some additional parsing of the values, we'll do it one by one:

* The imdbID is the basis for the URI of our movie
* We use the IMDB namespace object, and the value of the 'imdbID' field to create the URI

In [27]:
fear_URI = IMDB[response['imdbID']]

print("Created {}".format(fear_URI))

Created http://imdb.com/tt0120669


We now start adding triples where that URI is the subject:

* We invent the properties/predicates as we go along, but stay close to the original name from the response.
* Some of the 'objects' of our triples are literals, and some are URIs. We create URIs where possible.
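
As an illustration of creating URIs where possible, the sketch below (Python 3) derives a URI from a person's name instead of using a literal. The helper name `person_uri` and the `person/` path are hypothetical. Name-based URIs are also an assumption that only holds while no two people share a name.

```python
from urllib.parse import quote


def person_uri(name, base="http://imdb.com/person/"):
    """Build a URI string from a person's name (hypothetical URI scheme)."""
    # Replace spaces by underscores, then percent-encode anything unsafe
    local_name = quote(name.strip().replace(" ", "_"))
    return base + local_name


print(person_uri("Benicio Del Toro"))
```

The resulting string can be wrapped in rdflib's `URIRef` and used as the object of a triple in place of the `Literal`.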

In [30]:
# Every result we get is of type 'imdb:Movie'
g.add((fear_URI, RDF.type, IMDB['Movie']))

# The value for imdb:title is an English-language literal
# (mind the double parentheses: add() takes a single tuple)
g.add((fear_URI, IMDB['title'], Literal(response['Title'], lang='en')))

# The value for imdb:actor is, for each actor, a Literal string
# (though we could have generated URIs for every actor)

# Split the 'Actors' value by comma, and then strip surrounding whitespace from every element:
actors = [a.strip() for a in response['Actors'].split(',')]

# Iterate over the 'actors' we found
for actor in actors:
    g.add((fear_URI, IMDB['actor'], Literal(actor)))
    
# The language is again an English-language literal
g.add((fear_URI, IMDB['language'], Literal(response['Language'], lang='en')))

# The runtime should be an XSD duration, so we strip the ' min' part and replace it with 'M'.
# Note: this yields '118M', whereas the strict xsd:duration lexical form would be 'PT118M'.
# See e.g. <http://www.w3schools.com/xml/schema_dtypes_date.asp>
duration = response['Runtime'].replace(' min', 'M')
g.add((fear_URI, IMDB['runtime'], Literal(duration, datatype=XSD['duration'])))

# The genres are again a comma-separated list
genres = [genre.strip() for genre in response['Genre'].split(',')]
for genre in genres:
    g.add((fear_URI, IMDB['genre'], Literal(genre, lang='en')))
    
# The rating is a literal value
g.add((fear_URI, IMDB['rated'], Literal(response['Rated'])))
    
# The writers are a comma-separated list, so here we go again:
writers = [w.strip() for w in response['Writer'].split(',')]
for writer in writers:
    g.add((fear_URI, IMDB['writer'], Literal(writer)))

# The director is a single literal in this case (though a film can have more than one director):
g.add((fear_URI, IMDB['director'], Literal(response['Director'])))

# The plot is an English-language literal
g.add((fear_URI, IMDB['plot'], Literal(response['Plot'], lang='en')))

# The year is an XSD gYear
g.add((fear_URI, IMDB['year'], Literal(response['Year'], datatype=XSD.gYear)))

# The IMDB rating is a double
g.add((fear_URI, IMDB['rating'], Literal(response['imdbRating'], datatype=XSD.double)))

# The poster is a URL
g.add((fear_URI, IMDB['poster'], URIRef(response['Poster'])))
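
The lexical form of an `xsd:duration` starts with `P`, with a `T` before the time components, so 118 minutes is written `PT118M`. A minimal sketch of a stricter conversion, assuming the 'Runtime' value always looks like `<number> min`. The helper name `runtime_to_duration` is hypothetical.

```python
import re


def runtime_to_duration(runtime):
    """Convert e.g. '118 min' to the xsd:duration lexical form 'PT118M'."""
    match = re.match(r"^\s*(\d+)\s*min\s*$", runtime)
    if match is None:
        raise ValueError("unexpected runtime format: {!r}".format(runtime))
    return "PT{}M".format(int(match.group(1)))


print(runtime_to_duration("118 min"))
```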

Let's see what this looks like

In [33]:
print(g.serialize(format='turtle'))

@prefix imdb: <http://imdb.com/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

imdb:tt0120669 a imdb:Movie ;
    imdb:actor "Benicio Del Toro",
        "Ellen Barkin",
        "Johnny Depp",
        "Tobey Maguire" ;
    imdb:director "Terry Gilliam" ;
    imdb:genre "Adventure"@en,
        "Comedy"@en,
        "Drama"@en ;
    imdb:language "English"@en ;
    imdb:plot "An oddball journalist and his psychopathic lawyer travel to Las Vegas for a series of psychedelic escapades."@en ;
    imdb:poster <http://ia.media-imdb.com/images/M/MV5BNjA2ZDY3ZjYtZmNiMC00MDU5LTgxMWEtNzk1YmI3NzdkMTU0XkEyXkFqcGdeQXVyNjQyMjcwNDM@._V1_SX300.jpg> ;
    imdb:rated "R" ;
    imdb:rating 7.7e+00 ;
    imdb:runtime "118M"^^xsd:duration ;
    imdb:title "Fear and Loathing in Las Vegas"@en ;
    imdb:writer "Alex Cox (screenplay)",
  

## Facebook

Let's do the same thing for a Facebook example

In [79]:
user_id = "594486635" # Replace this with your user id (or the id of the user you want to access)
access_token = "YOUR ACCESS TOKEN" # Replace this with your own access token
facebook_url = "https://graph.facebook.com/v2.7/{}/movies".format(user_id)

params = {'access_token': access_token, 'fields': 'category, name'}

response = requests.get(facebook_url, params=params).json()

print(response)

{u'paging': {u'cursors': {u'after': u'ODMwNTUyNDg1NAZDZD', u'before': u'Njg4MjU3MDE0NTg5OTU5'}}, u'data': [{u'category': u'Movie', u'name': u'Strawman - The Nature Of The Cage', u'id': u'688257014589959'}, {u'category': u'Movie', u'name': u'Human Flight 3D The Movie', u'id': u'289482145973'}, {u'category': u'Movie', u'name': u'Waking Life', u'id': u'109327925759703'}, {u'category': u'Movie', u'name': u'Fear and Loathing in Las Vegas', u'id': u'105638652803531'}, {u'category': u'Community', u'name': u'Zeitgeist', u'id': u'32985985640'}, {u'category': u'Movie', u'name': u'The Fountain', u'id': u'8305524854'}]}


For convenience (because we cannot share the access token), we provide the expected response here. Remove the cell below if you want to try it out with your own access token.

In [80]:
response = {
  "data": [
    {
      "category": "Movie",
      "name": "Strawman - The Nature Of The Cage",
      "id": "688257014589959"
    },
    {
      "category": "Movie",
      "name": "Human Flight 3D The Movie",
      "id": "289482145973"
    },
    {
      "category": "Movie",
      "name": "Waking Life",
      "id": "109327925759703"
    },
    {
      "category": "Movie",
      "name": "Fear and Loathing in Las Vegas",
      "id": "105638652803531"
    },
    {
      "category": "Community",
      "name": "Zeitgeist",
      "id": "32985985640"
    },
    {
      "category": "Movie",
      "name": "The Fountain",
      "id": "8305524854"
    }
  ],
  "paging": {
    "cursors": {
      "before": "Njg4MjU3MDE0NTg5OTU5",
      "after": "ODMwNTUyNDg1NAZDZD"
    }
  }
}

Add a Facebook namespace and prefix to a new Facebook-specific graph

In [81]:
fbg = Graph()

FB = Namespace("https://graph.facebook.com/")
fbg.bind('fb', FB)

We again generate a URI for our focal point, the Facebook user (using the `user_id` specified earlier), and iterate over all results to create movies. Since we now have Facebook identifiers for all movies, we can build URIs for them. Cool.

In [82]:
user_URI = FB["user-{}".format(user_id)] # QNames cannot start with a digit; old-fashioned.

# Every user is of type fb:User
fbg.add((user_URI, RDF.type, FB['User']))

print("Created {}".format(user_URI))

Created https://graph.facebook.com/user-594486635


In [83]:
for like in response['data']:
    # The user likes every movie we found, and every movie is a resource 
    # whose URI is based on its internal facebook ID
    movie_URI = FB["movie-{}".format(like['id'])]
    
    fbg.add((user_URI, FB['likes'], movie_URI))
    
    # Every movie is of the type specified by its 'category' field
    fbg.add((movie_URI, RDF.type, FB[like['category']]))
    # Every movie has a name (English-language literal)
    fbg.add((movie_URI, FB['name'], Literal(like['name'], lang='en')))
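
The loop above gives every like a `movie-` URI, including the 'Zeitgeist' entry whose category is 'Community'. One possible refinement, sketched here, derives the URI prefix from the category instead. The helper name `like_uri` is hypothetical, and treating a lowercased category as a usable prefix is an assumption.

```python
import re


def like_uri(like, base="https://graph.facebook.com/"):
    """Build a URI string whose prefix reflects the like's category."""
    # Lowercase the category and collapse anything non-alphanumeric to '-'
    prefix = re.sub(r"[^a-z0-9]+", "-", like["category"].lower()).strip("-")
    return "{}{}-{}".format(base, prefix, like["id"])


print(like_uri({"category": "Community", "name": "Zeitgeist", "id": "32985985640"}))
```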

Let's see what this looks like

In [84]:
print fbg.serialize(format='turtle')

@prefix fb: <https://graph.facebook.com/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

fb:user-594486635 a fb:User ;
    fb:likes fb:movie-105638652803531,
        fb:movie-109327925759703,
        fb:movie-289482145973,
        fb:movie-32985985640,
        fb:movie-688257014589959,
        fb:movie-8305524854 .

fb:movie-105638652803531 a fb:Movie ;
    fb:name "Fear and Loathing in Las Vegas"@en .

fb:movie-109327925759703 a fb:Movie ;
    fb:name "Waking Life"@en .

fb:movie-289482145973 a fb:Movie ;
    fb:name "Human Flight 3D The Movie"@en .

fb:movie-32985985640 a fb:Community ;
    fb:name "Zeitgeist"@en .

fb:movie-688257014589959 a fb:Movie ;
    fb:name "Strawman - The Nature Of The Cage"@en .

fb:movie-8305524854 a fb:Movie ;
    fb:name "The Fountain"@en .




Putting this together:

In [85]:
newgraph = g + fbg

print(newgraph.serialize(format='turtle'))

@prefix fb: <https://graph.facebook.com/> .
@prefix imdb: <http://imdb.com/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

imdb:tt0120669 a imdb:Movie ;
    imdb:actor "Benicio Del Toro",
        "Ellen Barkin",
        "Johnny Depp",
        "Tobey Maguire" ;
    imdb:director "Terry Gilliam" ;
    imdb:genre "Adventure"@en,
        "Comedy"@en,
        "Drama"@en ;
    imdb:language "English"@en ;
    imdb:plot "An oddball journalist and his psychopathic lawyer travel to Las Vegas for a series of psychedelic escapades."@en ;
    imdb:poster <http://ia.media-imdb.com/images/M/MV5BNjA2ZDY3ZjYtZmNiMC00MDU5LTgxMWEtNzk1YmI3NzdkMTU0XkEyXkFqcGdeQXVyNjQyMjcwNDM@._V1_SX300.jpg> ;
    imdb:rated "R" ;
    imdb:rating 7.7e+00 ;
    imdb:runtime "118M"^^xsd:duration ;
    imdb:title "Fear and Loathing in Las Vegas"@en ;