This project is part of the course on databases taught by Mr. Sylvain Salvati, and it’s about first order logic as a query language. Throughout this project, we will first propose a naive representation of the evaluation of first order formulas in Python, and then, in order to push this approach further, we will look for a deep representation and evaluation of these formulas by compiling FOL formulas to a Non-recursive Datalog program. A useful description can be found in here.
Interesting concepts and reminders of database foundations can be found here
The data used during this project can be found here.
In this subsection we will try our first (’naive’) approach to represent and evaluate first order formulas. In order to do that, we will use the fact that most of modern languages are higher- order in the sense that they allow functions to take functions as argument. Therefore, we will implement first order quantifiers using this feature so that we can write in our programming language first order formulae as follow (this method is called shallow embedding. For example, the following query:
forall(lambda x: not artist(x) or (artist(x) and (actor(x) or director(x))))
refers to the statement: "Each artist is either an actor or a film director.", and this yields false, which means that there is an artist who is neither an actor nor a film director.
Rather than using simple function, the elements of the model will be represented as Python classes, in other words, each of the elements in our model will be a Python object. Each one of these objects will have a method accept, that will accept a visitor. In fact, we will be using Design - Visitor Pattern to evaluate first order formulas as well as in other processes, which we will see later, such as finding Free variables or Range-restricted variables of a given formula. Each one of those visitors can run visitor operations over any set of elements without figuring out their concrete classes. The accept operation directs a call to the appropriate operation in the visitor object.
The main steps of this section are:
- Implementation of Range Restricted Interpretation (RRI) and Free Variables (FV) visitors.
- Renoval of Forall & handling Negation (Not) : put Not as low as possible in a formulae !
- Represent Datalog programs.
- Compile FOL to datalog.
- Unfolding in datalog. (unfold rules that have variables that are not range restricted).
- Test range restrection for datalog rules.
- Compute the right datalog program.
- Execution of the datalog program.
For more details, check out the project report.