# R2RML Tutorial using Morph-KGC

We start by using the `morph-kgc` library to reproduce the example from the overview of the W3C R2RML specification (https://www.w3.org/TR/r2rml/#overview).

First, we create a sample SQLite database called `simple.db` with two tables, `EMP` and `DEPT`, loaded from CSV files.

In [None]:
import sqlite3

import pandas as pd

# Load each CSV file into a table of the same name in simple.db.
conn = sqlite3.connect("simple.db")

for table in ["EMP", "DEPT"]:
    df = pd.read_csv(f"simple/data/{table.lower()}.csv")
    df.to_sql(table, conn, index=False, if_exists="replace")
    print(f"Table: {table}\n{df}\n")

conn.close()

Table: EMP
   EMPNO  ENAME    JOB  DEPTNO
0   7369  SMITH  CLERK      10

Table: DEPT
   DEPTNO      DNAME       LOC
0      10  APPSERVER  NEW YORK



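Next, we run `morph-kgc` with a configuration file that points to the database and its R2RML mapping. `materialize_set` returns the generated triples as a Python set of strings.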
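The contents of `config/simple.ini` are not shown in this notebook. As a rough sketch, a Morph-KGC configuration is an INI file with one section per data source, giving the mapping file under `mappings` and the database URL under `db_url`. The section name `DataSource1` and the mapping path `simple/mapping.ttl` below are assumptions made for illustration; the actual file may differ.

```python
import configparser
import io

# Hypothetical reconstruction of a minimal configuration.
# The mapping path is a placeholder, not the file used in this tutorial.
config = configparser.ConfigParser()
config["DataSource1"] = {
    "mappings": "simple/mapping.ttl",   # assumed location of the R2RML mapping
    "db_url": "sqlite:///simple.db",    # SQLAlchemy-style URL for the SQLite file
}

# Render the configuration as INI text, as it would appear on disk.
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()
print(config_text)
```

Writing `config_text` to an `.ini` file would give something that can be passed to `materialize_set` in the same way as the path used in the next cell.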

In [2]:
import morph_kgc

morph_kgc.materialize_set("config/simple.ini")

INFO | 2025-01-10 10:17:17,096 | Parallelization is not supported for darwin when running as a library. If you need to speed up your data integration pipeline, please run through the command line.
INFO | 2025-01-10 10:17:17,389 | 7 mapping rules retrieved.
INFO | 2025-01-10 10:17:17,395 | Mapping partition with 7 groups generated.
INFO | 2025-01-10 10:17:17,396 | Maximum number of rules within mapping group: 1.
INFO | 2025-01-10 10:17:17,396 | Mappings processed in 0.298 seconds.
INFO | 2025-01-10 10:17:17,511 | Number of triples generated in total: 7.


{'<http://data.example.com/department/10> <http://example.com/ns#location> "NEW YORK"',
 '<http://data.example.com/department/10> <http://example.com/ns#name> "APPSERVER"',
 '<http://data.example.com/department/10> <http://example.com/ns#staff> "1"',
 '<http://data.example.com/department/10> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.com/ns#Department>',
 '<http://data.example.com/employee/7369> <http://example.com/ns#department> <http://data.example.com/department/10>',
 '<http://data.example.com/employee/7369> <http://example.com/ns#name> "SMITH"',
 '<http://data.example.com/employee/7369> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.com/ns#Employee>'}
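The `ns#staff` triple has no matching column in `DEPT`. In the W3C example this value comes from an R2RML view, that is, a SQL query that counts the employees of each department. Assuming the mapping behind `config/simple.ini` follows that example, the query can be tried directly against an in-memory copy of the two tables:

```python
import sqlite3

# In-memory copy of the two sample tables shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (EMPNO INTEGER, ENAME TEXT, JOB TEXT, DEPTNO INTEGER);
    CREATE TABLE DEPT (DEPTNO INTEGER, DNAME TEXT, LOC TEXT);
    INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 10);
    INSERT INTO DEPT VALUES (10, 'APPSERVER', 'NEW YORK');
""")

# Query used as the R2RML view in the W3C example: one row per department,
# with the number of employees computed by a correlated subquery.
staff_query = """
    SELECT DEPTNO, DNAME, LOC,
           (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO = DEPT.DEPTNO) AS STAFF
    FROM DEPT
"""
rows = conn.execute(staff_query).fetchall()
conn.close()

print(rows)  # [(10, 'APPSERVER', 'NEW YORK', 1)]
```

The `STAFF` column of the single result row is the value that appears as the object of the `ns#staff` triple.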

Next, we work with a database that has a many-to-many relationship: the `EMP2DEPT` table links employees to departments.

In [None]:
# Load each CSV file into a table of the same name in many-to-many.db.
conn = sqlite3.connect("many-to-many.db")

for table in ["EMP", "DEPT", "EMP2DEPT"]:
    df = pd.read_csv(f"many-to-many/data/{table.lower()}.csv")
    df.to_sql(table, conn, index=False, if_exists="replace")
    print(f"Table: {table}\n{df}\n")

conn.close()

Table: EMP
   EMPNO  ENAME         JOB
0   7369  SMITH       CLERK
1   7369  SMITH  NIGHTGUARD
2   7400  JONES    ENGINEER

Table: DEPT
   DEPTNO      DNAME       LOC
0      10  APPSERVER  NEW YORK
1      20   RESEARCH    BOSTON

Table: EMP2DEPT
   EMPNO  DEPTNO
0   7369      10
1   7369      20
2   7400      10



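The first mapping turns each row of `EMP2DEPT` into its own resource that points to both the employee and the department: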

In [4]:
morph_kgc.materialize_set("config/many-to-many.ini")

INFO | 2025-01-10 10:17:17,528 | Parallelization is not supported for darwin when running as a library. If you need to speed up your data integration pipeline, please run through the command line.
INFO | 2025-01-10 10:17:17,698 | 2 mapping rules retrieved.
INFO | 2025-01-10 10:17:17,700 | Mapping partition with 2 groups generated.
INFO | 2025-01-10 10:17:17,701 | Maximum number of rules within mapping group: 1.
INFO | 2025-01-10 10:17:17,701 | Mappings processed in 0.172 seconds.
INFO | 2025-01-10 10:17:17,708 | Number of triples generated in total: 6.


{'<http://data.example.com/employee=7369/department=10> <http://example.com/ns#department> <http://data.example.com/department/10>',
 '<http://data.example.com/employee=7369/department=10> <http://example.com/ns#employee> <http://data.example.com/employee/7369>',
 '<http://data.example.com/employee=7369/department=20> <http://example.com/ns#department> <http://data.example.com/department/20>',
 '<http://data.example.com/employee=7369/department=20> <http://example.com/ns#employee> <http://data.example.com/employee/7369>',
 '<http://data.example.com/employee=7400/department=10> <http://example.com/ns#department> <http://data.example.com/department/10>',
 '<http://data.example.com/employee=7400/department=10> <http://example.com/ns#employee> <http://data.example.com/employee/7400>'}
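To see where these six triples come from, here is a hand-written sketch that expands the same IRI patterns for each row of `EMP2DEPT`. It only imitates the result with plain Python and `sqlite3`; it is not how Morph-KGC works internally, and the constants `BASE` and `NS` are names chosen for this illustration.

```python
import sqlite3

BASE = "http://data.example.com/"
NS = "http://example.com/ns#"

# In-memory copy of the link table shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP2DEPT (EMPNO INTEGER, DEPTNO INTEGER);
    INSERT INTO EMP2DEPT VALUES (7369, 10), (7369, 20), (7400, 10);
""")

triples = set()
for empno, deptno in conn.execute("SELECT EMPNO, DEPTNO FROM EMP2DEPT"):
    # One subject per row, built from both key columns.
    subject = f"<{BASE}employee={empno}/department={deptno}>"
    triples.add(f"{subject} <{NS}employee> <{BASE}employee/{empno}>")
    triples.add(f"{subject} <{NS}department> <{BASE}department/{deptno}>")
conn.close()

for triple in sorted(triples):
    print(triple)
```

Each of the three rows contributes two triples, which matches the total of six reported in the log.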

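Alternatively, a second mapping links each employee directly to its departments, which yields one triple per row of `EMP2DEPT`: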

In [5]:
morph_kgc.materialize_set("config/many-to-many2.ini")

INFO | 2025-01-10 10:17:17,714 | Parallelization is not supported for darwin when running as a library. If you need to speed up your data integration pipeline, please run through the command line.
INFO | 2025-01-10 10:17:17,918 | 1 mapping rules retrieved.
INFO | 2025-01-10 10:17:17,921 | Mapping partition with 1 groups generated.
INFO | 2025-01-10 10:17:17,921 | Maximum number of rules within mapping group: 1.
INFO | 2025-01-10 10:17:17,921 | Mappings processed in 0.205 seconds.
INFO | 2025-01-10 10:17:17,924 | Number of triples generated in total: 3.


{'<http://data.example.com/employee/7369> <http://example.com/ns#department> <http://data.example.com/department/10>',
 '<http://data.example.com/employee/7369> <http://example.com/ns#department> <http://data.example.com/department/20>',
 '<http://data.example.com/employee/7400> <http://example.com/ns#department> <http://data.example.com/department/10>'}
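Because the result is an ordinary Python set of strings, it can be post-processed without any RDF tooling. The sketch below groups the departments by employee. It assumes every triple has the `subject predicate object` shape shown above, with IRIs that contain no spaces; the set literal is copied from the output rather than produced by calling `morph_kgc`.

```python
from collections import defaultdict

# Copied from the output above.
triples = {
    "<http://data.example.com/employee/7369> <http://example.com/ns#department> <http://data.example.com/department/10>",
    "<http://data.example.com/employee/7369> <http://example.com/ns#department> <http://data.example.com/department/20>",
    "<http://data.example.com/employee/7400> <http://example.com/ns#department> <http://data.example.com/department/10>",
}

DEPARTMENT = "<http://example.com/ns#department>"

departments_by_employee = defaultdict(set)
for triple in triples:
    # Split into at most three parts so an object containing spaces stays intact.
    subject, predicate, obj = triple.split(" ", 2)
    if predicate == DEPARTMENT:
        departments_by_employee[subject.strip("<>")].add(obj.strip("<>"))

for employee, departments in sorted(departments_by_employee.items()):
    print(employee, sorted(departments))
```

For anything beyond a quick check like this, loading the triples into a proper RDF library is the more robust choice.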