# Building a graph in RDF using `rdflib`

First we need to initialize a `Graph` object in `rdflib`:

In [22]:
import rdflib as rdf
import rdflib.namespace

In [23]:
g = rdf.Graph()

In RDF, a graph is constructed from *triples*, each of which has three components:

  * *subject*: the entity being annotated
  * *predicate*: a relation between the subject and the object
  * *object*: another entity or a literal value

We'll represent the **anytime crepes** recipe by making programmatic calls to `rdflib`, starting with a URL constructed from the recipe `id` as an initial node.
We'll show that as our first subject `s` to annotate using RDF triples.

In [24]:
uri = "https://www.food.com/recipe/327593"
s = rdf.URIRef(uri)
s

rdflib.term.URIRef('https://www.food.com/recipe/327593')

Then we'll use [`rdf:type`](https://www.w3.org/TR/rdf-schema/#ch_type) as the predicate `p` to describe the subject as an instance of `wtm:Recipe`:

In [25]:
from rdflib.namespace import RDF

p = RDF.type
p

rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type')

The subject is a plain URL and the predicate comes from the `RDF` vocabulary that is predefined in `rdflib`, but for the object we'll need to reference other vocabularies.
We'll use the [`NamespaceManager`](https://rdflib.readthedocs.io/en/stable/namespaces_and_bindings.html) in `rdflib` to bind and access the namespaces for those vocabularies, keeping a reference to it in the `nm` variable:

In [26]:
nm = g.namespace_manager

In [27]:
uri = "http://purl.org/heals/food/"
ns_wtm = rdf.Namespace(uri)

prefix = "wtm"
nm.bind(prefix, ns_wtm)

Now we can use this `wtm:` namespace to reference the object `o` as the `wtm:Recipe` entity:

In [28]:
o = ns_wtm.Recipe
o

rdflib.term.URIRef('http://purl.org/heals/food/Recipe')

Note how that object resolves to the URL <http://purl.org/heals/food/Recipe> -- which is a link to the vocabulary's RDF description.

Finally, we'll add the tuple `(s, p, o,)` to the graph:

In [29]:
g.add((s, p, o,))
g

<Graph identifier=N7a356e022da5426581dd3be3c6b9481a (<class 'rdflib.graph.Graph'>)>

Now let's add the remaining metadata for the **anytime crepes** recipe.
The required cooking time of "8 minutes" can be represented as a predicate `wtm:hasCookTime` and the literal `8` which we'll define as an `xsd:integer` value:

In [30]:
p = ns_wtm.hasCookTime
p

rdflib.term.URIRef('http://purl.org/heals/food/hasCookTime')

In [31]:
from rdflib.namespace import XSD

o = rdf.Literal("8", datatype=XSD.integer)
o

rdflib.term.Literal('8', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#integer'))

In [32]:
g.add((s, p, o,))

Now let's add the three ingredients (eggs, milk, and flour):

In [33]:
p = ns_wtm.hasIngredient
p

rdflib.term.URIRef('http://purl.org/heals/food/hasIngredient')

In [34]:
uri = "http://purl.org/heals/ingredient/"
ns_ind = rdf.Namespace(uri)

prefix = "ind"
nm.bind(prefix, ns_ind)

In [35]:
o = ns_ind.ChickenEgg
o

rdflib.term.URIRef('http://purl.org/heals/ingredient/ChickenEgg')

In [36]:
g.add((s, p, o,))

In [37]:
g.add((s, p, ns_ind.CowMilk,))
g.add((s, p, ns_ind.WholeWheatFlour,))

To confirm what we've built so far, we can iterate through each of the `(s, p, o,)` triples in the graph:

In [38]:
for s, p, o in g:
    print(s, p, o)

https://www.food.com/recipe/327593 http://purl.org/heals/food/hasCookTime 8
https://www.food.com/recipe/327593 http://purl.org/heals/food/hasIngredient http://purl.org/heals/ingredient/CowMilk
https://www.food.com/recipe/327593 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/heals/food/Recipe
https://www.food.com/recipe/327593 http://purl.org/heals/food/hasIngredient http://purl.org/heals/ingredient/ChickenEgg
https://www.food.com/recipe/327593 http://purl.org/heals/food/hasIngredient http://purl.org/heals/ingredient/WholeWheatFlour


## Serialization

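First let's show how to serialize the graph in `n3` format. For a graph of plain triples like this one, the output is also valid [*Turtle*](https://www.w3.org/TR/turtle/) (`ttl`), which is a subset of N3.
Whether `serialize()` returns a byte array or a string depends on the `rdflib` version. Passing `encoding` explicitly yields a byte array in either case, so we'll use a Unicode [*codec*](https://docs.python.org/3/library/codecs.html) to convert the serialized graph into a string:

In [39]:
ttl = g.serialize(format="n3", encoding="utf-8")
print(ttl.decode("utf-8"))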

@prefix ind: <http://purl.org/heals/ingredient/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wtm: <http://purl.org/heals/food/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://www.food.com/recipe/327593> a wtm:Recipe ;
    wtm:hasCookTime 8 ;
    wtm:hasIngredient ind:ChickenEgg,
        ind:CowMilk,
        ind:WholeWheatFlour .




Similarly, we can serialize the graph as `n3` triples to a file `tmp.ttl` in the local directory:

In [40]:
g.serialize(destination="tmp.ttl", format="n3", encoding="utf-8")

Try taking a look at the `tmp.ttl` file.
Is it the same as the serialization shown above?

Next, let's serialize the graph in [JSON-LD](https://json-ld.org/) format, stored in the `tmp.jsonld` file in the local directory:

In [41]:
data = g.serialize(
    format="json-ld",
    indent=2,
    encoding="utf-8",
    )

with open("tmp.jsonld", "wb") as f:
    f.write(data)

Try taking a look at the `tmp.jsonld` file.
Each entity, relation, and literal value has a full URL known as an *IRI* (Internationalized Resource Identifier) which [identifies a resource](https://www.w3.org/TR/json-ld11/#iris) used to define it.

We can make these JSON-LD files a bit more succinct by adding a `context` that defines prefixes for each of the vocabularies used:

In [28]:
context = {
    "@language": "en",
    "wtm": "http://purl.org/heals/food/",
    "ind": "http://purl.org/heals/ingredient/",
    }

In [29]:
context

{'@language': 'en',
 'wtm': 'http://purl.org/heals/food/',
 'ind': 'http://purl.org/heals/ingredient/'}

Now we'll serialize again as JSON-LD, this time using the context:

In [30]:
data = g.serialize(
    format="json-ld",
    context=context,
    indent=2,
    encoding="utf-8",
    )

with open("tmp.jsonld", "wb") as f:
    f.write(data)

See the difference in the resulting file?