# Introduction to Probability Theory Using ProbLog

This notebook provides a gentle introduction to the **ProbLog** probabilistic logic programming framework by implementing basic concepts of **Probability Theory**.

The official reference for the ProbLog2 system is available at [https://dtai.cs.kuleuven.be/problog/index.html](https://dtai.cs.kuleuven.be/problog/index.html).

Documentation can be found at [http://problog.readthedocs.io](http://problog.readthedocs.io).

This tutorial is organized as follows:
1. [Installation](#1.-Installation)
2. [Hello, ProbLog!](#2.-Hello,-ProbLog!)
3. [Probability Theory](#3.-Probability-Theory)
4. [Sampling](#4.-Sampling)


## 1. Installation

Installing the ProbLog2 framework is straightforward with the pip tool:

In [1]:
!pip3 install problog



To check the installation, run the following command:

In [2]:
!problog --version

2.1.0.18


## 2. Hello, ProbLog!

First, to access the ProbLog2 API we need to import some modules:

In [3]:
from problog.program import PrologString
from problog import get_evaluatable

The next step is to define the probabilistic model. In this case, we'll use the ```PrologString``` class to create the model from a string. Later, we'll see how to build it from ProbLog's ```Term``` objects.

In [4]:
model = PrologString(
'''
0.8::p.
0.1::q.
r :- not(p), q.
evidence(q).
query(r).
'''
)

Finally, to run the model, we build an evaluatable structure from it and call its ```evaluate``` method.

In [5]:
get_evaluatable().create_from(model).evaluate()

{r: 0.19999999999999998}
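The value 0.2 can be checked by hand. Because ```evidence(q)``` conditions on ```q``` being true, the query returns P(r | q) = P(not p and q) / P(q) = (0.2 × 0.1) / 0.1 = 0.2. The following is a minimal plain-Python sketch of that calculation by brute-force enumeration. It does not use ProbLog, and the names ```facts```, ```world_probability``` and ```p_r_given_q``` are illustrative choices, not part of any API.

```python
from itertools import product

# Success probabilities of the two probabilistic facts in the model above.
facts = {"p": 0.8, "q": 0.1}

def world_probability(world):
    """Product of success/failure probabilities for one truth assignment."""
    prob = 1.0
    for name, value in world.items():
        prob *= facts[name] if value else 1.0 - facts[name]
    return prob

joint = 0.0     # accumulates P(r and q)
evidence = 0.0  # accumulates P(q)
for values in product([True, False], repeat=len(facts)):
    world = dict(zip(facts, values))
    weight = world_probability(world)
    r = (not world["p"]) and world["q"]  # r :- not(p), q.
    if world["q"]:                       # evidence(q).
        evidence += weight
        if r:
            joint += weight

p_r_given_q = joint / evidence
print(p_r_given_q)
```

Enumeration like this grows exponentially with the number of facts, so it is only useful as a sanity check on tiny models.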

In the rest of this tutorial, we shall use this helper function:

In [6]:
def run(model_str):
    """Parse a ProbLog program given as a string and print its query probabilities."""
    model = PrologString(model_str)
    print(get_evaluatable().create_from(model).evaluate())

## 3. Probability Theory

### 3.1 Random Variables and Probabilistic Facts

We start with one of the simplest concepts in probability theory: *random variables*.

In ProbLog, we define boolean random variables through **probabilistic facts**. A probabilistic fact is simply a logical fact annotated with a probability measure that relates to the likelihood of the logical fact being true in a random world. We call this probability the **success probability** of the boolean random variable.

In [7]:
m1 = '''
0.65::a.
0.10::b.
query(a).
query(b).
'''
run(m1)

{a: 0.65, b: 0.10000000000000002}


### 3.2 Possible Worlds and Logical Rules

Probabilistic facts are mutually independent. A complete truth assignment to all probabilistic facts in a ProbLog program defines a **possible world**. Its probability is the product of the success probabilities of the facts that are true in the assignment and the failure probabilities of the facts that are false in it.
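
In symbols (the notation is ours, introduced only for illustration): let $F$ be the set of probabilistic facts, $p_f$ the success probability of fact $f$, and $\omega \subseteq F$ the set of facts that are true in a given world. Then

$$P(\omega) = \prod_{f \in \omega} p_f \prod_{f \in F \setminus \omega} (1 - p_f)$$

For example, with the facts ```0.65::a.``` and ```0.10::b.```, the world in which ```a``` is true and ```b``` is false has probability $0.65 \times (1 - 0.10) = 0.585$.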

To compute the probability of a possible world, we need the joint probability of all probabilistic facts. The standard way to compute joint probabilities is to use **logical rules**. A logical rule consists of a **head** atom and a set of **body** atoms.

Note how the model ```m2``` defines the atom ```w1``` representing a possible world in which both atoms ```a``` and ```b``` are true. The rule ```w1 :- a, b.``` has the atom ```w1``` as its head and the atoms ```a``` and ```b``` as body atoms.

In [8]:
m2 = '''
0.65::a.
0.10::b.
w1 :- a, b.
query(w1).
'''
run(m2)

{w1: 0.06500000000000003}


### 3.3 Negation

So far, we've been able to compute the success probability of atoms. Nevertheless, one may be interested in computing the complementary probability of some random variables, which is sometimes called the **failure probability**. 

To compute the failure probability of a given random variable, we need to introduce a fundamental concept: **negation**. In ProbLog (as in Logic Programming), negation is **negation as failure**, meaning that a literal of the form ```not(p)``` is true only if we cannot assert the truth of atom ```p``` given all the program's facts and background knowledge.

Note that the model ```m3``` defines logical rules whose bodies contain negated atoms. This is the only way to compute the failure probability of logical atoms.

In [9]:
m3 = '''
0.65::a.
0.10::b.
not_a :- not(a).
not_b :- not(b).
query(not_a).
query(not_b).
'''
run(m3)

{not_a: 0.3499999999999999, not_b: 0.9}


With negation, we can now compute the whole joint distribution over logical atoms ```a``` and ```b```.

In [10]:
m4 = '''
0.65::a.
0.10::b.
w1 :- a, b.
w2 :- a, not(b).
w3 :- not(a), b.
w4 :- not(a), not(b).
query(w1).
query(w2).
query(w3).
query(w4).
'''
run(m4)

{w1: 0.06500000000000003, w2: 0.585, w3: 0.035, w4: 0.31499999999999995}
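
As a cross-check that does not depend on ProbLog, the same joint distribution can be reproduced in plain Python by enumerating every truth assignment and multiplying the success or failure probability of each fact. This is only a sketch. The names ```facts```, ```joint``` and ```total``` are illustrative.

```python
from itertools import product

# Success probabilities of the probabilistic facts in m4.
facts = {"a": 0.65, "b": 0.10}

joint = {}
for values in product([True, False], repeat=len(facts)):
    world = dict(zip(facts, values))
    prob = 1.0
    for name, value in world.items():
        # Success probability if the fact is true, failure probability otherwise.
        prob *= facts[name] if value else 1.0 - facts[name]
    joint[values] = prob

for values, prob in joint.items():
    print(dict(zip(facts, values)), round(prob, 3))

# The four possible worlds are exhaustive and mutually exclusive.
total = sum(joint.values())
print("total:", round(total, 3))
```

The keys of ```joint``` are ```(a, b)``` truth-value pairs, so ```joint[(True, False)]``` corresponds to ```w2```. Because the worlds partition the sample space, their probabilities add up to 1.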


### 3.4 Conditional Probability and Probabilistic Rules

In ProbLog, we can also define conditional probabilities using **probabilistic rules**. A probabilistic rule is a logical rule annotated with a probability. It can be shown that this is equivalent to defining a fresh auxiliary probabilistic fact (one per rule) and adding it to the body of the original logical rule.

In [11]:
m5 = '''
0.2::p.
0.3::q.
0.8::r :- p.
0.6::r :- not(p), q.
0.1::r :- not(p), not(q).
evidence(p, false).
evidence(q, true).
query(r).
'''
run(m5)

{r: 0.6}


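Model ```m5``` is equivalent to model ```m6```, which spells out the auxiliary facts explicitly. Note that ```m6``` below conditions on different evidence (```q``` is false rather than true), so the third rule applies instead of the second and the query returns 0.1 rather than 0.6: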

In [12]:
m6 = '''
0.2::p.
0.3::q.
0.8::aux1.
0.6::aux2.
0.1::aux3.
r :- p, aux1.
r :- not(p), q, aux2.
r :- not(p), not(q), aux3.
evidence(p, false).
evidence(q, false).
query(r).
'''
run(m6)

{r: 0.10000000000000002}
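
The auxiliary-fact reading of probabilistic rules can also be checked without ProbLog. The sketch below enumerates all worlds over the five facts of ```m6``` in plain Python and conditions on evidence by discarding inconsistent worlds. The helper names ```r_holds``` and ```prob_r_given``` are hypothetical and chosen for illustration.

```python
from itertools import product

# Probabilistic facts of m6, including one auxiliary fact per rule.
facts = {"p": 0.2, "q": 0.3, "aux1": 0.8, "aux2": 0.6, "aux3": 0.1}

def r_holds(w):
    """Disjunction of the three rule bodies that define r in m6."""
    return (
        (w["p"] and w["aux1"])
        or (not w["p"] and w["q"] and w["aux2"])
        or (not w["p"] and not w["q"] and w["aux3"])
    )

def prob_r_given(evidence):
    """P(r | evidence), summing over all worlds consistent with the evidence."""
    joint = 0.0
    normalizer = 0.0
    for values in product([True, False], repeat=len(facts)):
        w = dict(zip(facts, values))
        if any(w[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        weight = 1.0
        for name, value in w.items():
            weight *= facts[name] if value else 1.0 - facts[name]
        normalizer += weight
        if r_holds(w):
            joint += weight
    return joint / normalizer

print(prob_r_given({"p": False, "q": True}))   # evidence used in m5
print(prob_r_given({"p": False, "q": False}))  # evidence used in m6
```

With ```p``` false and ```q``` true only the second rule can fire, so the result reduces to the probability of ```aux2```. With both false it reduces to the probability of ```aux3```.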


## 4. Sampling

In addition to solving inference tasks, the ProbLog framework also supports sampling.

In [13]:
from problog.tasks import sample

model = '''
0.8::a.
0.2::b.
c :- a, b.
query(a).
query(b).
query(c).
'''
print("Exact:")
run(model)

result = list(sample.sample(PrologString(model), n=1000, format='dict'))
count = { atom: 0 for atom in result[0].keys() }
for model_sample in result:
    for atom, value in model_sample.items():
        count[atom] += int(value)
estimates = { atom: positive / len(result) for atom, positive in count.items() }

print("Approximate:")
print(estimates)

Exact:
{a: 0.8, b: 0.2, c: 0.15999999999999998}
Approximate:
{a: 0.784, b: 0.211, c: 0.169}
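
To make the idea behind the approximate estimates concrete, here is a minimal plain-Python sketch of Monte Carlo estimation for the same three atoms. It illustrates the general technique only and is not how ProbLog's sampler is implemented. The function ```sample_world```, the seed, and the sample size are assumptions made for this example.

```python
import random

def sample_world(rng):
    """Draw one truth assignment for the facts a and b, then derive c."""
    a = rng.random() < 0.8   # 0.8::a.
    b = rng.random() < 0.2   # 0.2::b.
    c = a and b              # c :- a, b.
    return {"a": a, "b": b, "c": c}

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
counts = {"a": 0, "b": 0, "c": 0}
for _ in range(n):
    for atom, value in sample_world(rng).items():
        counts[atom] += int(value)

# Relative frequency of each atom being true across all samples.
estimates = {atom: positive / n for atom, positive in counts.items()}
print(estimates)
```

The estimates approach the exact values 0.8, 0.2 and 0.16 as the number of samples grows, with an error that shrinks roughly in proportion to the inverse square root of ```n```.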
