# KEN3140 (Semantic Web) Lab 10
### Semantic Data Integration

**Date:** 4 October 2021

**Authors:** Kody Moodley & Remzi Celebi

**Affiliation:** Institute of Data Science, Maastricht University

**License:** [GNU Affero General Public License v3.0](https://www.gnu.org/licenses/agpl-3.0.txt)

**Notebook description:**

As described in Lecture 10, different ontologies, vocabularies and knowledge graphs often define similar information and terms. When we examine the semantics (meaning) of terms across different vocabularies, we sometimes find that two terms with completely different IRIs actually have similar or equivalent meanings.

To query integrated RDF data that is described with different but semantically related terms from different ontologies, we sometimes have to **map** these terms, that is, explicitly define the semantic relationships between them. In this lab you will complete exercises that put this into practice on some example RDF data and vocabularies.
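To make the idea of a mapping concrete, here is a toy sketch in plain Java. It is not how Jena implements reasoning, and the prefixes `v1:`, `v2:` and `ex:` as well as the property names are hypothetical and not taken from the lab file. It only shows why a query written against one vocabulary can return data from both once a mapping between equivalent predicates has been applied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of term mapping; all names below are hypothetical. */
final class MappingSketch {

    /** A minimal subject-predicate-object statement. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /**
     * Returns the input triples plus, for every triple whose predicate has a
     * mapping, a copy that uses the mapped predicate. Only one direction is
     * applied here; a true equivalence would hold in both directions.
     */
    static List<Triple> applyMappings(List<Triple> triples, Map<String, String> equivalentPredicate) {
        List<Triple> result = new ArrayList<>(triples);
        for (Triple t : triples) {
            String mapped = equivalentPredicate.get(t.p);
            if (mapped != null) {
                result.add(new Triple(t.s, mapped, t.o));
            }
        }
        return result;
    }

    /** Counts triples with the given predicate, like a one-pattern SELECT. */
    static long countWithPredicate(List<Triple> triples, String predicate) {
        return triples.stream().filter(t -> t.p.equals(predicate)).count();
    }

    /** Toy data: one statement per (hypothetical) vocabulary. */
    static List<Triple> sampleData() {
        List<Triple> data = new ArrayList<>();
        data.add(new Triple("ex:alice", "v1:hasSibling", "ex:bob"));
        data.add(new Triple("ex:carol", "v2:siblingOf", "ex:dave"));
        return data;
    }

    /** Hypothetical mapping: v2:siblingOf means the same as v1:hasSibling. */
    static Map<String, String> sampleMapping() {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("v2:siblingOf", "v1:hasSibling");
        return mapping;
    }
}
```

Counting `v1:hasSibling` statements in `MappingSketch.sampleData()` finds only one of the two sibling pairs. After `MappingSketch.applyMappings(...)` with the sample mapping, the same count finds both. In the lab itself you do not write such code: you add mapping statements to the Turtle file, and Jena's OWL reasoner derives the additional statements.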

We will again use the Java RDF library [Apache Jena](https://jena.apache.org/) for this lab.

### Import and set up the Apache Jena library

In [None]:
%jars apache-jena-3.16.0/lib/*.jar

In [None]:
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.rdf.model.*;
import org.apache.jena.util.PrintUtil;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.query.ResultSet;
import org.apache.jena.reasoner.ReasonerRegistry;
import java.io.File;
import org.apache.commons.io.FileUtils;
import java.nio.charset.StandardCharsets;

### Load an RDF file about family relations

Load the RDF file and create a `runSparqlQuery()` function that runs a SPARQL query, given either as a string or as the path to a `.rq` file, in a single call.

In [None]:
Model baseModel = RDFDataMgr.loadModel("Lab10_OWL_familyrelations.ttl");
// Wrap the base model with Jena's built-in OWL reasoner so that queries also see inferred statements
InfModel model = ModelFactory.createInfModel(ReasonerRegistry.getOWLReasoner(), baseModel);
// The schema model can be provided separately:
// InfModel model = ModelFactory.createInfModel(ReasonerRegistry.getOWLReasoner(), schema, baseModel);

In [None]:
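// Runs a SPARQL SELECT query against the model and prints the result table.
// queryString is either the query text itself or the path to a ".rq" file containing it.
static void runSparqlQuery(String queryString, Model model) throws java.io.IOException {
    if (queryString.endsWith(".rq")) {
        queryString = FileUtils.readFileToString(new File(queryString), StandardCharsets.UTF_8);
    }
    System.out.println(queryString);
    Query query = QueryFactory.create(queryString);
    // try-with-resources closes the query execution and releases its resources
    try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
        ResultSetFormatter.out(qexec.execSelect(), model);
    }
}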
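The helper above decides between "query text" and "file path" purely by the `.rq` suffix. The sketch below isolates that one step using only the Java standard library. The class name `QuerySource` and the method `resolve` are hypothetical and are not part of the notebook or of Jena.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper that mirrors the suffix check in runSparqlQuery(). */
final class QuerySource {

    /**
     * Returns the file contents if the argument ends with ".rq";
     * otherwise returns the argument unchanged as the query text.
     */
    static String resolve(String queryOrPath) {
        if (!queryOrPath.endsWith(".rq")) {
            return queryOrPath;
        }
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(queryOrPath));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause
            throw new UncheckedIOException(e);
        }
    }
}
```

In practice this means you can also pass a query directly as a string, for example `runSparqlQuery("SELECT * WHERE { ?s ?p ?o } LIMIT 5", model)`, which is handy for quick checks while you edit the mappings. Because the notebook's helper calls `execSelect()`, it only handles SELECT queries.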

#### Task 1:

Study the contents of the file "Lab10_OWL_familyrelations.ttl" carefully. Notice that it contains two graphs about family members, and that the two graphs use different terminology to describe their content.

#### Task 2: 

Write a SPARQL query in "sparql-getSiblings.rq" to retrieve all sibling pairs from the **combined** graph. Your query should only use vocabulary from **one** of the graphs.

In order to complete the task, you need to add suitable mappings (statements) in the "Lab10_OWL_familyrelations.ttl" file. Please add these under "Task 3" in "Lab10_OWL_familyrelations.ttl".

Run the cell below to observe the results you get back.



In [None]:
runSparqlQuery("sparql-getSiblings.rq", model)

#### Task 3: 

Write a SPARQL query in "sparql-getAllMales.rq" to retrieve all persons with the male gender from the **combined** graph. Your query should only use vocabulary from **one** of the graphs.

In order to complete the task, you need to add suitable mappings (statements) in the "Lab10_OWL_familyrelations.ttl" file. Please add these under "Task 2" in "Lab10_OWL_familyrelations.ttl".

Run the cell below to observe the results you get back.

In [None]:
runSparqlQuery("sparql-getAllMales.rq", model)

#### Task 4: 

Write a SPARQL query in "sparql-getParents.rq" to retrieve all persons and their parents from the **combined** graph. Your query should only use vocabulary from **one** of the graphs.

In order to complete the task, you need to add suitable mappings (statements) in the "Lab10_OWL_familyrelations.ttl" file. Please add these under "Task 4" in "Lab10_OWL_familyrelations.ttl".

Run the cell below to observe the results you get back.

In [None]:
runSparqlQuery("sparql-getParents.rq", model)

#### Task 5: 

Write a SPARQL query in "sparql-getGivenAndFamilyNames.rq" to retrieve all persons in the **combined** graph and display their first and last names. Your query should only use vocabulary from **one** of the graphs. **Human-readable labels for the first and last names should be returned for every person**.

In order to complete the task, you need to add suitable mappings (statements) in the "Lab10_OWL_familyrelations.ttl" file. Please add these under "Task 5" in "Lab10_OWL_familyrelations.ttl".

Run the cell below to observe the results you get back.

In [None]:
runSparqlQuery("sparql-getGivenAndFamilyNames.rq", model)

#### Task 6 (If you don't finish - homework): 

Write a SPARQL query to transform (represent) **all** the statements in "RDF Graph 2" of "Lab10_OWL_familyrelations.ttl" using **only** the vocabulary from "RDF Graph 1".