## Materialization 101

An RDFS graph is a semantic multigraph: a collection of nodes connected by labeled edges, where a single node can have multiple outgoing edges.

The "semantic" part means that its nodes have _semantics_.

To understand the semantics of RDFS, we should first look at what the data actually looks like.

The data is usually stored in the N-Triples format, with the file extension `.nt`. It is a very simple format: one triple per line.
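To make the "one triple per line" idea concrete, here is a minimal sketch of reading a single `.nt` line. It is a simplification, not a full N-Triples parser: it only splits the line into three terms and does not validate IRIs or unescape literals. The function name `parse_nt_line` and the `example.org` IRIs are made up for illustration.

```python
def parse_nt_line(line):
    """Split one N-Triples line into (subject, predicate, object).

    Simplified sketch: returns None for blank lines and comments,
    and does no validation of the individual terms.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if not line.endswith("."):
        raise ValueError("an N-Triples statement must end with '.'")
    body = line[:-1].strip()
    # Subject and predicate contain no whitespace; the object may
    # (for example a quoted literal), so split at most twice.
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


# Hypothetical example line using made-up IRIs.
sample = (
    "<http://example.org/professor1> "
    "<http://example.org/lectures> "
    "<http://example.org/course1> ."
)
triple = parse_nt_line(sample)
```

The rest of this section uses the shorter `(subject, predicate, object)` notation with prefixed names instead of full IRIs.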

An RDFS graph has two parts.

The first is the _TBOX_, the ontology: the set of triples that encodes the hierarchy of the graph:

```
(employee, rdf:type, Class)
(faculty, rdfs:subClassOf, employee)
(professor, rdfs:subClassOf, faculty)
(teaches, rdf:type, rdf:Property)
(lectures, rdfs:subPropertyOf, teaches)
(teaches, rdfs:domain, professor)
(course, rdf:type, Class)
(teaches, rdfs:range, course)
```

In this graph, we declare that *employee* is a _Class_, *faculty* is a _subClass_ of *employee*, *professor* is a _subClass_ of *faculty*, *teaches* is a _Property_, and *lectures* is a _subProperty_ of *teaches*. We also declare that the _domain_ of *teaches* is *professor* and that its _range_ is *course*, which is itself a _Class_.

Next comes the _ABOX_, which holds assertions about individuals, made using the vocabulary defined in the _TBOX_:

```
(professor1, lectures, course1)
```

From this assertion, much implicit knowledge can be derived. For instance, `lectures` is a subproperty of `teaches`, hence we can say that `professor1` teaches `course1`. The domain of `teaches` is `professor`, so `professor1` must be a professor.

Now, Materialization.

RDFS has a set of _entailment_ rules which dictate its _semantics_.

Here they are (the ones that matter, for now):

```
:A(?y, rdf:type, ?x) :- :T(?a, rdfs:domain, ?x), :A(?y, ?a, ?z) . // 1
:A(?z, rdf:type, ?x) :- :T(?a, rdfs:range, ?x), :A(?y, ?a, ?z) . // 2
:T(?x, rdfs:subPropertyOf, ?z) :- :T(?x, rdfs:subPropertyOf, ?y), :T(?y, rdfs:subPropertyOf, ?z) . // 3
:T(?x, rdfs:subClassOf, ?z) :- :T(?x, rdfs:subClassOf, ?y), :T(?y, rdfs:subClassOf, ?z) . // 4
:A(?x, ?b, ?y) :- :T(?a, rdfs:subPropertyOf, ?b), :A(?x, ?a, ?y) . // 5
:A(?z, rdf:type, ?y) :- :T(?x, rdfs:subClassOf, ?y), :A(?z, rdf:type, ?x) . // 6
```

The way to read a rule is quite straightforward.

For instance, `:T(?x, rdfs:subClassOf, ?z) :- :T(?x, rdfs:subClassOf, ?y), :T(?y, rdfs:subClassOf, ?z) .` reads as: if the triples (?x, rdfs:subClassOf, ?y) and (?y, rdfs:subClassOf, ?z) exist in the TBOX, then (?x, rdfs:subClassOf, ?z) *must* exist in the TBOX as well.
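As a rough illustration of how such a rule can be applied, the sketch below runs one pass of rule 4 over a TBOX stored as a Python set of tuples. The set-of-tuples representation and the function name `apply_rule_4` are assumptions made for this example; they are not taken from any particular reasoner.

```python
SUB_CLASS = "rdfs:subClassOf"


def apply_rule_4(tbox):
    """Return the triples rule 4 derives from `tbox` in a single pass.

    One pass only: longer chains need the rule to be re-applied
    until nothing new appears.
    """
    derived = set()
    for (x, p1, y) in tbox:
        if p1 != SUB_CLASS:
            continue
        for (y2, p2, z) in tbox:
            # Body of the rule: (?x subClassOf ?y) and (?y subClassOf ?z)
            if p2 == SUB_CLASS and y2 == y:
                derived.add((x, SUB_CLASS, z))
    # Keep only triples that were not already present.
    return derived - tbox


tbox = {
    ("faculty", SUB_CLASS, "employee"),
    ("professor", SUB_CLASS, "faculty"),
}
new_triples = apply_rule_4(tbox)
# new_triples == {("professor", "rdfs:subClassOf", "employee")}
```

The nested loop is quadratic in the size of the TBOX, which is fine for a toy example but not how a production reasoner would join triples.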

To _materialize_ an RDFS graph means to add all triples that *must* exist.

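For instance, materializing the given _TBOX_ adds the following triples. Of these, `(professor, rdfs:subClassOf, employee)` follows from rule 4 above; the other three come from RDFS entailment rules that are not listed here.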

```
(faculty, rdf:type, Class)
(professor, rdf:type, Class)
(professor, rdfs:subClassOf, employee)
(lectures, rdf:type, rdf:Property)
```

And now for the _ABOX_, we get:

```
(professor1, rdf:type, professor)
(course1, rdf:type, course)
(professor1, teaches, course1)
(professor1, rdf:type, faculty)
(professor1, rdf:type, employee)
```

At this point, no more *knowledge* can be inferred: applying the rules again yields no new triples.
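The whole process can be sketched as a naive fixpoint loop: apply the six rules, add whatever is new, and stop once a pass adds nothing. The code below is a minimal sketch under that assumption, with hypothetical helper names (`step`, `materialize`) and the TBOX and ABOX stored as sets of tuples. It implements only the six rules listed above, so for the TBOX it derives just the transitive `rdfs:subClassOf` triple.

```python
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUB_PROP = "rdfs:subPropertyOf"
SUB_CLASS = "rdfs:subClassOf"


def step(tbox, abox):
    """Apply rules 1-6 once; return (new TBOX triples, new ABOX triples)."""
    new_t, new_a = set(), set()
    for (s, p, o) in tbox:
        if p in (SUB_PROP, SUB_CLASS):
            for (s2, p2, o2) in tbox:
                if p2 == p and s2 == o:
                    new_t.add((s, p, o2))        # rules 3 and 4
        for (y, a, z) in abox:
            if p == DOMAIN and a == s:
                new_a.add((y, TYPE, o))          # rule 1
            elif p == RANGE and a == s:
                new_a.add((z, TYPE, o))          # rule 2
            elif p == SUB_PROP and a == s:
                new_a.add((y, o, z))             # rule 5
            elif p == SUB_CLASS and a == TYPE and z == s:
                new_a.add((y, TYPE, o))          # rule 6
    return new_t - tbox, new_a - abox


def materialize(tbox, abox):
    """Repeat `step` until a pass produces no new triples."""
    tbox, abox = set(tbox), set(abox)
    while True:
        new_t, new_a = step(tbox, abox)
        if not new_t and not new_a:
            return tbox, abox
        tbox |= new_t
        abox |= new_a


# The TBOX and ABOX from the example above.
tbox = {
    ("employee", TYPE, "Class"),
    ("faculty", SUB_CLASS, "employee"),
    ("professor", SUB_CLASS, "faculty"),
    ("teaches", TYPE, "rdf:Property"),
    ("lectures", SUB_PROP, "teaches"),
    ("teaches", DOMAIN, "professor"),
    ("course", TYPE, "Class"),
    ("teaches", RANGE, "course"),
}
abox = {("professor1", "lectures", "course1")}

full_tbox, full_abox = materialize(tbox, abox)
inferred_tbox = full_tbox - tbox
inferred_abox = full_abox - abox
```

Here `inferred_abox` holds the five ABOX triples listed above. The loop terminates because the rules only combine terms that already occur in the graph, so there are finitely many triples that can ever be added.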

## Experiments | Thesis
In this article (https://ceur-ws.org/Vol-3337/semrec_paper4.pdf), the authors propose a way to __learn__ how to materialize RDFS graphs such that the learned model transfers to other, unseen RDFS graphs.

This is not as hard a task as it seems, because RDFS materialization relies on a __fixed__ set of semantic nodes (the ones with the prefixes rdf: and rdfs:). This is different from attempting to infer from natural language.

### Task 1

The code of that article lives here: https://github.com/Monireh2/kg-deductive-reasoner/tree/master

Convert what is in `memn2n` to use `tensorflow`, so that it can be dropped straight into https://github.com/Monireh2/kg-deductive-reasoner/blob/master/train_test_kg_reasoner.py

Run the tests and verify that your solution produces the same results as the from-scratch one.