# Knowledge Representation on the Web -- RDF tutorial

In this tutorial we'll learn the basics of working with RDF graphs in Python. We'll use rdflib, a widely used Python library for RDF (the full documentation can be found [here](https://rdflib.readthedocs.io/en/stable/index.html)).

## Imports
These are the main classes and types we'll be using from rdflib

In [170]:
import sys

from rdflib import Graph, ConjunctiveGraph, Literal, BNode, Namespace, RDF, URIRef, RDFS
from rdflib.namespace import DC, FOAF

import pprint

## Loading data remotely and from files

rdflib can load RDF data from a variety of sources: locally from a file (with support for many serializations) or remotely via a URI. The latter is a practical way of checking whether a URI returns RDF, as the 3rd Linked Data principle requires.

A Graph object is always required to load triples.
**Note**: to load quads, and thus support named graphs, you'll need to use an instance of ConjunctiveGraph instead.

**Exercise 1** 

For each step, use a different cell: 
1. create two graphs using rdflib:
    - load one with triples from the site https://csarven.ca/ and/or http://www.w3.org/People/Berners-Lee/card
    - load the other with triples from ./data/ingredients.rdf

In [171]:
#TIP: look at the documentation of the rdflib library for how to LOAD and PARSE a graph - https://rdflib.readthedocs.io/en/stable/gettingstarted.html

In [172]:
def check_graph(graph):
    '''Report whether the graph is empty and print the graph's length.'''
    # len(graph) is the number of triples; looping over an empty graph
    # never executes the loop body, so the emptiness check must not sit inside a loop
    if len(graph) == 0:
        print("There are no triples in the Graph")
    # Print the number of triples in the Graph
    print(f"The graph has {len(graph)} statements.")

In [173]:
g1 = Graph()

# Parse in an RDF file hosted on the Internet
g1.parse("https://csarven.ca/")

check_graph(g1)

The graph has 537 statements.


In [174]:
g2 = Graph()

# Parse in an RDF file hosted on the Internet
g2.parse("http://www.w3.org/People/Berners-Lee/card")

check_graph(g2)

The graph has 86 statements.


In [175]:
ingredients = Graph()

ingredients.parse("./data/ingredients.rdf")

check_graph(ingredients)

The graph has 837 statements.


## Serialising and saving RDF graphs

There are different formats for storing RDF triples. They are semantically equivalent and differ only in their syntax.


Use the function Graph.serialize(format). 

**Exercise 2**

1. serialise one of the graphs to the .ttl, .xml and .nt formats, and print the first n lines of each to compare the syntax
1. save your graph in the turtle format to the ./data/ folder

In [176]:
def print_n_lines(graph, format, n):
    serialized = graph.serialize(format=format)
    lines = serialized.splitlines()
    for i in range(min(n, len(lines))):
        print(lines[i])
    print("\n")

In [177]:
def compare_formats(graph, n):
    # Print out the first n lines in the RDF Turtle format
    print("------ Turtle format ------")
    print_n_lines(graph, "turtle", n)
    # Print out the first n lines in the RDF XML format
    print("------ XML format ------")
    print_n_lines(graph, "xml", n)
    # Print out the first n lines in the RDF N-Triples format
    print("------ N-Triples format ------")
    print_n_lines(graph, "nt", n)

In [178]:
# 1. serialize the chosen graph
compare_formats(ingredients, 20)

------ Turtle format ------
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ind: <http://purl.org/heals/ingredient/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix sm: <https://www.omg.org/techprocess/ab/SpecificationMetadata/> .
@prefix wtm: <http://purl.org/heals/food/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ind:AlmondMeal a obo:FOODON_03400662,
        obo:FOODON_03400685,
        wtm:Ingredient,
        owl:NamedIndividual ;
    rdfs:label "almond meal" ;
    dcterms:source "Wikipedia, \"Almond meal.\" [Online]. Available:https://en.wikipedia.org/wiki/Almond_meal. [Accessed: Nov. 10, 2018]" ;
    wtm:hasGluten false ;
    wtm:hasGlycemicIndex "25"^^xsd:nonNegativeInteger ;
    skos:definition "an ingredient made from ground up almonds" ;
    skos:scopeNote "used as an ingredient" .


------ X

In [179]:
# 2. save the graph in ttl format (the format is named explicitly rather than relying on the default)
ingredients.serialize(destination="./data/ing.ttl", format="turtle")

<Graph identifier=N2bfbca9180ee450cab235689fad0136c (<class 'rdflib.graph.Graph'>)>

In [180]:
# test
test = Graph()
test.parse("./data/ing.ttl")
check_graph(test)

The graph has 837 statements.


##  Merging graphs

Merging graphs can be done by parsing several sources into the same graph, or with the overloaded operator +.

**Note:** Set-theoretic graph semantics apply

The Food knowledge graph FoodKG contains a graph of statements about ingredients, as well as a graph with statements about recipes. 

**Exercise 3**: 

1. load ./data/ingredients.rdf and ./data/ghostbusters.ttl into a single graph, either by sequential parsing or using the operator +.

2. count the number of statements in each graph, and the intersection of the two graphs. 

3. check whether the combined graph is connected (using graph.connected()) 

4. load ./data/ingredients.rdf and ./data/recipes.rdf into a single graph, either by sequential parsing or using the operator +. 

5. count the number of statements in each graph, and the intersection of the two graphs. 

6. check whether the combined graph is connected (using graph.connected()). Explain the result with respect to point 3! 

In [181]:
#look at rdflib documentation - Navigating Graphs

# 1
ghost = Graph()
ghost.parse("./data/ghostbusters.ttl")

# The + operator returns a new graph, so no empty Graph() needs to be created first
union = ingredients + ghost

print("Total # of statements:")
check_graph(union)

Total # of statements:
The graph has 53174 statements.


In [182]:
# 2
print("# of statements in Ingredients graph:")
check_graph(ingredients)
print("# of statements in Ghostbusters graph:")
check_graph(ghost)

intersection = ingredients & ghost
print("# of statements in the intersection:")
check_graph(intersection)

# of statements in Ingredients graph:
The graph has 837 statements.
# of statements in Ghostbusters graph:
The graph has 52337 statements.
# of statements in the intersection:
The graph has 0 statements.


In [183]:
# 3
union.connected()

False

In [184]:
# 4
recipes = Graph()
recipes.parse("./data/recipes.rdf")

union_recipes = Graph()
union_recipes.parse("./data/ingredients.rdf")
union_recipes.parse("./data/recipes.rdf")

print("Total # of statements:")
check_graph(union_recipes)

Total # of statements:
The graph has 1299 statements.


In [185]:
# 5
print("# of statements in Ingredients graph:")
check_graph(ingredients)
print("# of statements in Recipes graph:")
check_graph(recipes)

intersection_recipes = ingredients & recipes
print("# of statements in the intersection:")
check_graph(intersection_recipes)

# of statements in Ingredients graph:
The graph has 837 statements.
# of statements in Recipes graph:
The graph has 480 statements.
# of statements in the intersection:
The graph has 18 statements.


In [186]:
# 6
union_recipes.connected()

False

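Both merged graphs are reported as not connected: connected() treats the graph as undirected and returns False when some node cannot be reached from the node where the search starts. The ingredients and ghostbusters graphs share no statements, so this is unsurprising there. The ingredients and recipes graphs do share 18 statements, but that overlap does not guarantee that every node can be reached from every other, and the merged graph is still not connected.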

## Namespaces 

Remind yourself what namespaces are. 

In RDFLib, the namespace module defines many common namespaces such as RDF, RDFS, OWL, FOAF, SKOS, etc., but you can also easily add URIs within a different namespace:


In [187]:
TEACH = Namespace("http://linkedscience.org/teach/ns#")
TEACH.Teacher

rdflib.term.URIRef('http://linkedscience.org/teach/ns#Teacher')

Check out the specification to see which other terms are defined within the TEACH namespace: http://linkedscience.org/teach/ns/#sec-specification
You can use a NamespaceManager to bind a prefix to a namespace: 

In [188]:
g = Graph()
g.namespace_manager.bind('TEACH', URIRef('http://linkedscience.org/teach/ns#'))
TEACH.Teacher.n3(g.namespace_manager)

'TEACH:Teacher'

In [189]:
KRW = Namespace("http://krw.vu.nl/data#")

#creating individuals within your namespace
KRW.Teacher
KRW.Student

rdflib.term.URIRef('http://krw.vu.nl/data#Student')

**Exercise 4:**
1. create your own namespace (can be made up) 

In [190]:
SPACE = Namespace("http://example.org/space/")

SPACE.Planet
SPACE.Star

rdflib.term.URIRef('http://example.org/space/Star')

In [191]:
instance = URIRef('http://example.org/space/Instance')
print(instance)

http://example.org/space/Instance


In [192]:
space_g = Graph()
space_g.namespace_manager.bind('space', SPACE)

In [193]:
print(SPACE.Planet.n3(space_g.namespace_manager))

space:Planet



## Creating RDF triples

Triples are added to the graph with the function Graph.add().

The parameter is a triple given as a Python **tuple** (subject, predicate, object).

Notice the namespace convenience syntax!

**Exercise 5:** 

1. create a new graph and add triples (~10) within your made-up namespace using Graph.add(). These triples can be about anything, for instance ingredients or recipes. Make sure they include the predicates RDF.type, RDFS.label and RDFS.subClassOf

2. open yourRDF.ttl, and write your triples out by hand in a syntax of your choice (turtle is recommended, notice the file extension!). Load the triples here with rdflib. 

In [194]:
# create graph
solar_g = Graph()

# example namespace
SOLAR = Namespace("http://example.org/solarsystem/")
solar_g.namespace_manager.bind('solar', SOLAR)

# add triples using the graph's add method
sun = URIRef("http://example.org/solarsystem/Sun")
earth = SOLAR.Earth
moon = SOLAR.Moon

star = SOLAR.Star # == URIRef("http://example.org/solarsystem/Star")
planet = SOLAR.Planet
satellite = SOLAR.Satellite

radius = Literal(695000)

solar_g.add((sun, RDF.type, star))
solar_g.add((sun, SOLAR.has_radius, radius))
solar_g.add((sun, RDFS.label, Literal("Sun")))
solar_g.add((earth, RDF.type, planet))
solar_g.add((moon, RDF.type, satellite))
solar_g.add((moon, SOLAR.satellite_of, earth))
solar_g.add((star, RDFS.subClassOf, SOLAR.CelestialBody))
solar_g.add((planet, RDFS.subClassOf, SOLAR.CelestialBody))
solar_g.add((satellite, RDFS.subClassOf, SOLAR.CelestialBody))

<Graph identifier=Nace9904861654a069fcb9bc144d8a755 (<class 'rdflib.graph.Graph'>)>

In [195]:
check_graph(solar_g)

The graph has 9 statements.


In [196]:
# save the graph to destination in ttl format - yourRDF.ttl (look at RDFLib documentation - Loading and saving RDF)
# solar_g.serialize(destination="./data/yourRDF.ttl")

In [197]:
# load the saved graph and print it in ttl format
my_rdf = Graph()
my_rdf.parse("./data/yourRDF.ttl")
check_graph(my_rdf)

The graph has 9 statements.


In [198]:
print(my_rdf.serialize(format="turtle"))

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix solar: <http://example.org/solarsystem/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

solar:Moon a solar:Satellite ;
    solar:satellite_of solar:Earth .

solar:Sun a solar:Star ;
    rdfs:label "Sun" ;
    solar:has_radius 695000 .

solar:Earth a solar:Planet .

solar:Planet rdfs:subClassOf solar:CelestialBody .

solar:Satellite rdfs:subClassOf solar:CelestialBody .

solar:Star rdfs:subClassOf solar:CelestialBody .




## Navigating graphs

rdflib uses iterators to navigate graphs. The methods for navigating subjects, predicates and objects are Graph.subjects, Graph.predicates and Graph.objects.

**Exercise 6:**

1. print all the triples in yourRDF.ttl
2. print all subjects in yourRDF.ttl
3. print all predicates in yourRDF.ttl
4. print all objects in yourRDF.ttl


In [199]:
#TIP you have to loop in the graph 
# 1
for subj, pred, obj in my_rdf:
    print(f"({subj.n3(solar_g.namespace_manager)}, {pred.n3(solar_g.namespace_manager)}, {obj.n3(solar_g.namespace_manager)})")

(solar:Sun, rdf:type, solar:Star)
(solar:Sun, rdfs:label, "Sun")
(solar:Planet, rdfs:subClassOf, solar:CelestialBody)
(solar:Star, rdfs:subClassOf, solar:CelestialBody)
(solar:Moon, solar:satellite_of, solar:Earth)
(solar:Earth, rdf:type, solar:Planet)
(solar:Satellite, rdfs:subClassOf, solar:CelestialBody)
(solar:Moon, rdf:type, solar:Satellite)
(solar:Sun, solar:has_radius, "695000"^^xsd:integer)


In [200]:
# 2
for subj in my_rdf.subjects():
    print(subj)

http://example.org/solarsystem/Sun
http://example.org/solarsystem/Sun
http://example.org/solarsystem/Planet
http://example.org/solarsystem/Star
http://example.org/solarsystem/Moon
http://example.org/solarsystem/Earth
http://example.org/solarsystem/Satellite
http://example.org/solarsystem/Moon
http://example.org/solarsystem/Sun


In [201]:
# 3
for pred in my_rdf.predicates():
    print(pred)

http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/2000/01/rdf-schema#label
http://www.w3.org/2000/01/rdf-schema#subClassOf
http://www.w3.org/2000/01/rdf-schema#subClassOf
http://example.org/solarsystem/satellite_of
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/2000/01/rdf-schema#subClassOf
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://example.org/solarsystem/has_radius


In [202]:
# 4
for obj in my_rdf.objects():
    print(obj)

http://example.org/solarsystem/Star
Sun
http://example.org/solarsystem/CelestialBody
http://example.org/solarsystem/CelestialBody
http://example.org/solarsystem/Earth
http://example.org/solarsystem/Planet
http://example.org/solarsystem/CelestialBody
http://example.org/solarsystem/Satellite
695000


We can also filter the subjects, predicates and objects we want to retrieve, and match their values much like in a database "join" operation.


**Exercise 7:**

1. print all subject types in yourRDF.ttl
2. print all subject labels in yourRDF.ttl

In [203]:
# 1
for subj, pred, obj in my_rdf.triples((None,  RDF.type, None)):
    print(obj.n3(solar_g.namespace_manager))
    # print(f"{subj.n3(solar_g.namespace_manager)} is a {obj.n3(solar_g.namespace_manager)}")

solar:Satellite
solar:Star
solar:Planet


In [204]:
# 2
for subj, pred, obj in my_rdf.triples((None,  RDFS.label, None)):
    print(obj.n3(solar_g.namespace_manager))
    # print(f"{subj.n3(solar_g.namespace_manager)} has label {obj.n3(solar_g.namespace_manager)}")

"Sun"


### Basic triple matching (almost querying!)

We use the method Graph.triples with a Python tuple that acts as a mask specifying our criteria.
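
Conceptually, each position of the mask either fixes a value or, when it is `None`, matches anything. The plain-Python sketch below mimics that idea on a list of tuples; it illustrates the matching rule only and is not how rdflib implements it. The `match` helper and the data are invented for the example.

```python
def match(triples, mask):
    """Yield the triples that agree with the mask; None in the mask is a wildcard."""
    for triple in triples:
        if all(m is None or m == t for m, t in zip(mask, triple)):
            yield triple


data = [
    ("Sun", "type", "Star"),
    ("Earth", "type", "Planet"),
    ("Moon", "satellite_of", "Earth"),
]

print(list(match(data, (None, "type", None))))       # all "type" statements
print(list(match(data, ("Moon", None, None))))       # everything about Moon
print(list(match(data, ("Earth", "type", "Star"))))  # fully specified, no match
```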

**Exercise 8:**

1. check whether a triple is in your graph -> print true or false
2. print all triples related to a certain subject in your graph
3. print all triples related to a certain object in your graph

In [205]:
# 1.1.
if (moon, None, None) in my_rdf:
    print(True)
    print("This graph contains triples about Moon.")
else:
    print(False)

True
This graph contains triples about Moon.


In [206]:
# 1.2.
if (earth, None, star) in my_rdf:
    print(True)
    print("Earth is a star.")
else:
    print(False)
    print("Earth is not a star.")

False
Earth is not a star.


In [207]:
# 2
for subj, pred, obj in my_rdf.triples((SOLAR.Sun,  None, None)):
    print(f"({subj.n3(solar_g.namespace_manager)}, {pred.n3(solar_g.namespace_manager)}, {obj.n3(solar_g.namespace_manager)})")

(solar:Sun, rdf:type, solar:Star)
(solar:Sun, rdfs:label, "Sun")
(solar:Sun, solar:has_radius, "695000"^^xsd:integer)


In [208]:
# 3
for subj, pred, obj in my_rdf.triples((None,  None, SOLAR.CelestialBody)):
    print(f"({subj.n3(solar_g.namespace_manager)}, {pred.n3(solar_g.namespace_manager)}, {obj.n3(solar_g.namespace_manager)})")

(solar:Planet, rdfs:subClassOf, solar:CelestialBody)
(solar:Satellite, rdfs:subClassOf, solar:CelestialBody)
(solar:Star, rdfs:subClassOf, solar:CelestialBody)


## Assignment part 1: your own web application

You are a chef in a restaurant, and you need to serve someone who is gluten intolerant.

1. load the ./data/recipes.rdf and ./data/ingredients.rdf datasets in one graph
2. query your graph (as we did in previous exercises) to retrieve all recipes without gluten
3. query your graph for all recipes that you can make for your gluten intolerant guest. 
4. the guest asks you whether there are more options. Can you find the recipes for which an ingredient with gluten can be replaced, solely using pattern matching? (Hint: you need to write several of these pattern matching queries, and check the predicate __substitutesFor__)
5. another guest is allergic to pecan nuts. Which recipes could you serve them (including those for which pecan nuts can be replaced)?

**Note that this is a bit tedious: later on, we will be querying more complicated patterns with SPARQL!**

In [209]:
# 1. load the ./data/recipes.rdf and ./data/ingredients.rdf datasets in one graph
ingredients = Graph()
recipes = Graph()
graph = Graph()
ingredients.parse("./data/ingredients.rdf")
recipes.parse("./data/recipes.rdf")
graph.parse("./data/ingredients.rdf")
graph.parse("./data/recipes.rdf")

print("Total # of statements:")
check_graph(graph)

Total # of statements:
The graph has 1299 statements.


In [210]:
FOOD = Namespace("http://purl.org/heals/food/")
INGREDIENT = Namespace("http://purl.org/heals/ingredient/")

In [211]:
print("Ingredients with gluten:")
for subj, pred, obj in graph.triples((None,  FOOD.hasGluten, Literal(True))):
    print(f"({subj}, {pred}, {obj})")

Ingredients with gluten:
(http://purl.org/heals/ingredient/AllPurposeFlour, http://purl.org/heals/food/hasGluten, true)
(http://purl.org/heals/ingredient/ChickenBroth, http://purl.org/heals/food/hasGluten, true)
(http://purl.org/heals/ingredient/Kamut, http://purl.org/heals/food/hasGluten, true)
(http://purl.org/heals/ingredient/KamutFlour, http://purl.org/heals/food/hasGluten, true)
(http://purl.org/heals/ingredient/Mayonnaise, http://purl.org/heals/food/hasGluten, true)
(http://purl.org/heals/ingredient/WholeWheatFlour, http://purl.org/heals/food/hasGluten, true)


In [212]:
dishes_with_gluten = set()

print("Recipes with gluten:")
for ingredient, pred, has_gluten in graph.triples((None,  FOOD.hasGluten, Literal(True))):
    for dish, has_ing, ing in graph.triples((None, FOOD.hasIngredient, ingredient)):
        # Print each dish only once, for the first gluten ingredient found
        if dish not in dishes_with_gluten:
            print(f"{dish} has ingredient with gluten {ing}")
        dishes_with_gluten.add(dish)

print(f"{len(dishes_with_gluten)} recipes contain gluten")

Recipes with gluten:
http://purl.org/heals/ingredient/AlmondBiscotti has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/BananaBread has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/Brownies has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/GoldenKamutBread has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/KamutMuffin has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/KamutPancake has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/WhiteBread has ingredient with gluten http://purl.org/heals/ingredient/AllPurposeFlour
http://purl.org/heals/ingredient/ThaiChicken has ingredient with gluten http://purl.org/heals/ingredient/ChickenBroth
http://purl.org/h

In [213]:
# 2. query your graph to retrieve all recipes without gluten

print("Recipes without gluten:")
recipes_without_gluten = []
count = 0
for recipe, pred, obj in graph.triples((None, RDF.type, FOOD.Recipe)):
    if recipe not in dishes_with_gluten:
        recipes_without_gluten.append(recipe)
        print(f"{recipe}")
        count += 1

print(f"{count} recipes")

Recipes without gluten:
http://purl.org/heals/ingredient/BakedChickenTender
http://purl.org/heals/ingredient/BananaBlueberryAlmondFlourMuffin
http://purl.org/heals/ingredient/BeefNilaga
http://purl.org/heals/ingredient/BeefStew
http://purl.org/heals/ingredient/BraisedBalsamicChicken
http://purl.org/heals/ingredient/CornedBeefHash
http://purl.org/heals/ingredient/FlourlessCoconutAndAlmondCake
http://purl.org/heals/ingredient/GlutenFreeCoconutCake
http://purl.org/heals/ingredient/GrilledChickenKabob
http://purl.org/heals/ingredient/PotRoastWithVegetables
http://purl.org/heals/ingredient/SaucyShepherdPie
http://purl.org/heals/ingredient/SmotheredChickenBreast
12 recipes


In [214]:
# 2. variant 2

print("Recipes without gluten:")
recipes_without_gluten2 = []
count2 = 0
for recipe, pred, obj in graph.triples((None, RDF.type, FOOD.Recipe)):
    no_gluten_recipe = True
    for dish, pred2, ingredient, in graph.triples((recipe, FOOD.hasIngredient, None)):
        if (ingredient, FOOD.hasGluten, Literal(True)) in graph:
            no_gluten_recipe = False
            break
    if not no_gluten_recipe:
        continue
    recipes_without_gluten2.append(recipe)
    print(recipe)
    count2 += 1

print(f"{count2} recipes")

Recipes without gluten:
http://purl.org/heals/ingredient/BakedChickenTender
http://purl.org/heals/ingredient/BananaBlueberryAlmondFlourMuffin
http://purl.org/heals/ingredient/BeefNilaga
http://purl.org/heals/ingredient/BeefStew
http://purl.org/heals/ingredient/BraisedBalsamicChicken
http://purl.org/heals/ingredient/CornedBeefHash
http://purl.org/heals/ingredient/FlourlessCoconutAndAlmondCake
http://purl.org/heals/ingredient/GlutenFreeCoconutCake
http://purl.org/heals/ingredient/GrilledChickenKabob
http://purl.org/heals/ingredient/PotRoastWithVegetables
http://purl.org/heals/ingredient/SaucyShepherdPie
http://purl.org/heals/ingredient/SmotheredChickenBreast
12 recipes


In [215]:
# 3. query your graph for all recipes that you can make for your gluten intolerant guest

print("Recipes with ingredients which we definitely know do not contain gluten:")
recipes_gluten_intolerant = []
count3 = 0

for recipe, pred, obj in graph.triples((None, RDF.type, FOOD.Recipe)):
    gluten_intolerant_recipe = True
    for dish, pred2, ingredient in graph.triples((recipe, FOOD.hasIngredient, None)):
        # Reject the recipe unless the ingredient is explicitly marked gluten-free;
        # this also rejects ingredients that have no hasGluten statement at all
        if (ingredient, FOOD.hasGluten, Literal(False)) not in graph:
            gluten_intolerant_recipe = False
            break
    if not gluten_intolerant_recipe:
        continue
    recipes_gluten_intolerant.append(recipe)
    print(recipe)
    count3 += 1

print(f"{count3} recipes")

Recipes with ingredients which we definitely know do not contain gluten:
http://purl.org/heals/ingredient/BakedChickenTender
http://purl.org/heals/ingredient/BananaBlueberryAlmondFlourMuffin
http://purl.org/heals/ingredient/BeefNilaga
http://purl.org/heals/ingredient/BraisedBalsamicChicken
http://purl.org/heals/ingredient/CornedBeefHash
http://purl.org/heals/ingredient/FlourlessCoconutAndAlmondCake
http://purl.org/heals/ingredient/GlutenFreeCoconutCake
http://purl.org/heals/ingredient/GrilledChickenKabob
http://purl.org/heals/ingredient/SaucyShepherdPie
http://purl.org/heals/ingredient/SmotheredChickenBreast
10 recipes


In [216]:
# 4. the guest asks you whether there are more options. Can you find the recipes for which an ingredient with gluten can be replaced, solely using pattern matching?
# (Hint: you need to write multiple of these pattern matching queries, and check the predicate substitutesFor)

print("Recipes for which an ingredient with gluten can be replaced:")
count_more = 0

for recipe, pred, obj in graph.triples((None, RDF.type, FOOD.Recipe)):
    has_gluten = False
    can_replaced = True
    for subj, has_ing, ingredient in graph.triples((recipe,  FOOD.hasIngredient, None)):
        if (ingredient, FOOD.hasGluten, Literal(True)) in graph:
            has_gluten = True
            can_replaced = False
            if (None, FOOD.substitutesFor, ingredient) not in graph:
                # print(f"{ingredient} has no substitutes")
                # print("\n")
                pass
            for s, p, o in graph.triples((None, FOOD.substitutesFor, ingredient)):
                if (s, FOOD.hasGluten, Literal(False)) in graph:
                    # print(f"{ingredient} can be replaced with {s} in {recipe}")
                    can_replaced = True
                    break
                else:
                    # print(f"{ingredient} cannot be replaced with {s} in {recipe} as it's not gluten-free")
                    # print("\n")
                    pass
    # ingredient with gluten cannot be replaced
    # skip recipes without gluten
    if not can_replaced or not has_gluten:
        continue
    count_more += 1
    print(f"{recipe}")
    # print("\n")

print(f"{count_more} recipes")

Recipes for which an ingredient with gluten can be replaced:
http://purl.org/heals/ingredient/AlmondBiscotti
http://purl.org/heals/ingredient/BananaBread
http://purl.org/heals/ingredient/Brownies
http://purl.org/heals/ingredient/WhiteBread
4 recipes


In [217]:
# 5. another guest is allergic to pecan nuts, which recipes could you serve them (including those for which pecan nuts can be replaced)

print("Recipes without pecan nuts, including the ones for which pecan nuts can be replaced:")
count_pecan = 0
comment = ""

for recipe, pred, obj in graph.triples((None, RDF.type, FOOD.Recipe)):
    can_replaced = True
    comment = ""
    for subj, has_ing, ingredient in graph.triples((recipe,  FOOD.hasIngredient, None)):
        if ingredient == INGREDIENT.Pecan:
            if (None, FOOD.substitutesFor, ingredient) in graph:
                comment = " - pecan can be replaced"
                pass
            else:
                # print(f"Pecan cannot be replaced in {recipe}")
                # print("\n")
                can_replaced = False
                break
    if not can_replaced:
        continue
    print(f"{recipe}{comment}")
    # print("\n")
    count_pecan += 1

print(f"{count_pecan} recipes")

Recipes without pecan nuts, including the ones for which pecan nuts can be replaced:
http://purl.org/heals/ingredient/AlmondBiscotti
http://purl.org/heals/ingredient/BakedChickenTender
http://purl.org/heals/ingredient/BananaBlueberryAlmondFlourMuffin
http://purl.org/heals/ingredient/BananaBread
http://purl.org/heals/ingredient/BeefNilaga
http://purl.org/heals/ingredient/BeefStew
http://purl.org/heals/ingredient/BraisedBalsamicChicken
http://purl.org/heals/ingredient/Brownies
http://purl.org/heals/ingredient/ChickenSalad - pecan can be replaced
http://purl.org/heals/ingredient/CornedBeefHash
http://purl.org/heals/ingredient/FlourlessCoconutAndAlmondCake
http://purl.org/heals/ingredient/GlutenFreeCoconutCake
http://purl.org/heals/ingredient/GoldenKamutBread
http://purl.org/heals/ingredient/GrilledChickenKabob
http://purl.org/heals/ingredient/KamutMuffin
http://purl.org/heals/ingredient/KamutPancake
http://purl.org/heals/ingredient/PotRoastWithVegetables
http://purl.org/heals/ingredient/S