# 📘 KG course SPARQL notebook

A notebook to run SPARQL queries for the KG course at UM DACS.

1. Update the `g.parse()` calls in the first cell to import your RDF files.
2. In the same folder as the notebook, create files containing your SPARQL queries (e.g. `q1.rq`) and execute them with `run_query(g, 'q1.rq')`.

Use the `.rq` file extension to get SPARQL syntax highlighting.
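
If you would rather create a query file from Python than from an editor, the following is a minimal sketch. The helper `write_query`, the file name `example.rq`, and the query text are all hypothetical and chosen only for illustration; they are not part of the course material.

```python
from pathlib import Path


def write_query(path, sparql):
    """Write SPARQL text to a file and return its Path (hypothetical helper)."""
    query_file = Path(path)
    # Strip surrounding blank lines and end the file with a single newline
    query_file.write_text(sparql.strip() + "\n", encoding="utf-8")
    return query_file


# Hypothetical query: list up to ten arbitrary triples
EXAMPLE_QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""

# Written next to the notebook, assuming the notebook folder is the working directory
written = write_query("example.rq", EXAMPLE_QUERY)
print(written.read_text(encoding="utf-8"))
```

Once the first cell has been run, a file written this way can be passed to `run_query(g, 'example.rq')` like any other query file.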

In [None]:
import sys
!{sys.executable} -m pip install pandas oxrdflib Pygments

import pandas as pd
from IPython.display import display, HTML
from pygments import highlight
from pygments.lexers import SparqlLexer
from pygments.formatters import HtmlFormatter
from rdflib import Graph

def run_query(graph, query_path):
    try:
        with open(query_path, 'r') as file:
            query = file.read()
    except OSError as e:
        # Report the actual reason (missing file, permissions, ...) instead of hiding it
        print(f"Could not read {query_path}: {e}")
        return
    results = graph.query(query)
    # Display the SPARQL query
    formatted_query = highlight(query, SparqlLexer(), HtmlFormatter(style='solarized-dark', full=True, nobackground=True))
    display(HTML(formatted_query))
    # Convert results to a Pandas DataFrame
    res_list = []
    for row in results:
        res_list.append([str(item) for item in row])
    df = pd.DataFrame(res_list, columns=[str(var) for var in results.vars]) if len(res_list) > 0 else pd.DataFrame()
    # Display the DataFrame as a table in Jupyter Notebook
    display(HTML(df.to_html()))

g = Graph(store="Oxigraph")

# TODO: modify/add paths to your RDF files
g.parse("./food_kg_altered.ttl")

print(f"Working with {len(g)} triples")

1. Identify one type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q1.rq')

2. Identify a second type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q2.rq')

3. Identify a third type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q3.rq')

4. Identify a fourth type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q4.rq')

5. Identify a fifth type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q5.rq')

6. Identify a sixth type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q6.rq')

7. Identify a seventh type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q7.rq')

8. Identify an eighth type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q8.rq')

9. Identify a ninth type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q9.rq')

10. Identify a final type of quality check different than above, write and run SPARQL to implement the check and return the violating entities.

In [None]:
run_query(g, 'q10.rq')