# Logic Tensor Networks
This notebook goes through some key concepts in logic tensor networks. Some of the content is modified from the excellent tutorials and examples available at https://github.com/tommasocarraro/LTNtorch. We will start off by going through the basics of constants, predicates, variables, and training, and finish up with an exercise implementing the running example of friends and movie ratings.

In [22]:
import ltn
import torch
import numpy as np

## Constants

Constants are vectors, matrices, or tensors. To define a vector constant with value $(3.4, 6.2)$, we first create a PyTorch tensor and then wrap it in `ltn.Constant`.

In [23]:
c1_tensor = torch.tensor([3.4, 6.2])
c1 = ltn.Constant(c1_tensor)

We can also simply do that in one step, or with higher rank tensors:

In [24]:
c1 = ltn.Constant(torch.tensor([3.4, 6.2]))
c2 = ltn.Constant(torch.tensor([[3.4, 7.6], [6.2, 4.5]]))

Create a constant with value $(1.3, 9.8)$ and one with value $\begin{pmatrix}
4.5 & 8 & -2.5\\
2 & -1.5 & 2.1
\end{pmatrix}$

In [25]:
c3 = ltn.Constant(torch.tensor([1.3, 9.8]))

c4 = ltn.Constant(torch.tensor([[4.5, 8, -2.5],[2, -1.5, 2.1]]))

Sometimes we will want to train the constants (e.g. if we are learning embeddings). We can make a constant trainable with the argument `trainable=True`:

In [26]:
c1 =  ltn.Constant(torch.tensor([3.4, 6.2]), trainable=True)

Create a constant with value $(1.3, 9.8)$ and one with value $\begin{pmatrix}
4.5 & 8 & -2.5\\
2 & -1.5 & 2.1
\end{pmatrix}$ which are both set to be trainable.

We can get the value of a constant using the `value` attribute:

In [27]:
print(c1.value)

tensor([3.4000, 6.2000], requires_grad=True)


Get the value of `c2`. How is it different from `c1`?

In [28]:
print(c2.value)

tensor([[3.4000, 7.6000],
        [6.2000, 4.5000]])


## Predicates

The predicates we use will most often be feedforward neural networks. We can build these easily in PyTorch.

In [31]:
# We build a feedforward neural network by creating a class
# that extends `torch.nn.Module`

class ModelP2(torch.nn.Module):
    """For more info on how to use torch.nn.Module:
    https://pytorch.org/docs/stable/generated/torch.nn.Module.html"""
    # define how to initialise the network
    def __init__(self):
        super(ModelP2, self).__init__()
        # We set up the activation functions
        self.elu = torch.nn.ELU()
        self.sigmoid = torch.nn.Sigmoid()
        # We define the weight matrices
        self.dense1 = torch.nn.Linear(4, 5)
        self.dense2 = torch.nn.Linear(5, 1)  # single output; the sigmoid in forward() maps it into [0, 1]
    
    # Now we define the forward pass
    def forward(self, x):
        x = self.elu(self.dense1(x))
        return self.sigmoid(self.dense2(x))

# Set up an object as an instance of the class
modelP2 = ModelP2()
# Finally, wrap it with `ltn.Predicate`
P2 = ltn.Predicate(model=modelP2)

Create a predicate called `P3`, using a feedforward neural network, that takes a 5-dimensional input, has one hidden layer of size 8, another hidden layer of size 10, and an output of size 1. The activation function at the hidden layers is ELU, and at the output layer it is a sigmoid.

In [51]:
class ModelP3(torch.nn.Module):
    # define how to initialise the network
    def __init__(self):
        super(ModelP3, self).__init__()
        # We set up the activation functions
        self.elu = torch.nn.ELU()
        self.sigmoid = torch.nn.Sigmoid()
        # We define the weight matrices
        self.dense1 = torch.nn.Linear(5, 8)
        self.dense2 = torch.nn.Linear(8, 10)
        self.dense3 = torch.nn.Linear(10, 1)

    # Define the forward pass
    def forward(self, x):
        x = self.elu(self.dense1(x))
        x = self.elu(self.dense2(x))
        return self.sigmoid(self.dense3(x))
    
# Set up an object as an instance of the class
modelP3 = ModelP3()
# Finally, wrap it with `ltn.Predicate`
P3 = ltn.Predicate(model=modelP3)

## Variables

A variable is a sequence of individuals from a domain, stored as a tensor whose first dimension indexes the individuals. They can be implemented as below:

In [52]:
x = ltn.Variable('x', torch.randn((10, 2)))
y = ltn.Variable('y', torch.randn((5, 2)))

Create a variable called `z` containing 6 randomly initialised vectors of size 4.

In [53]:
z = ltn.Variable('z', torch.randn(6, 4))

Apply the predicate `P2` to the variable `z` and print the output. What do you notice about the shape of the output? Does it make sense to you?

In [57]:
P2(z)

LTNObject(value=tensor([0.5251, 0.6094, 0.4433, 0.4807, 0.5444, 0.6796],
       grad_fn=<ViewBackward0>), free_vars=['z'])

We can also create variables as stacks of trainable constants. For example:

In [58]:
var_dict = {}
for i, zi in enumerate(z.value):
    var_dict[i] = ltn.Constant(zi, trainable=True)
    
var_x = ltn.Variable("var_x", torch.stack([i.value for i in var_dict.values()]))
    

## Connectives

LTNtorch provides implementations of the common fuzzy operators in `ltn.fuzzy_ops`. We can construct connectives from them as follows:

In [59]:
Not = ltn.Connective(ltn.fuzzy_ops.NotStandard())
And = ltn.Connective(ltn.fuzzy_ops.AndProd())
Or = ltn.Connective(ltn.fuzzy_ops.OrProbSum())
Implies = ltn.Connective(ltn.fuzzy_ops.ImpliesReichenbach())
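To see what these four connectives compute, here is a minimal plain-Python sketch of the standard formulas behind the product configuration (standard negation, product t-norm, probabilistic sum, Reichenbach implication). The function names are illustrative only and are not part of LTNtorch; the library classes operate on tensors and may apply a small numerical-stability adjustment that this sketch omits.

```python
# Plain-Python versions of the product-configuration operators,
# applied to truth values in [0, 1]. Illustrative names, not LTNtorch API.

def not_standard(x):
    # Standard negation: 1 - x
    return 1.0 - x

def and_prod(x, y):
    # Product t-norm: x * y
    return x * y

def or_prob_sum(x, y):
    # Probabilistic sum: x + y - x * y
    return x + y - x * y

def implies_reichenbach(x, y):
    # Reichenbach implication: 1 - x + x * y
    return 1.0 - x + x * y

a, b = 0.8, 0.5
print("not a      :", not_standard(a))
print("a and b    :", and_prod(a, b))
print("a or b     :", or_prob_sum(a, b))
print("a implies b:", implies_reichenbach(a, b))
```

On crisp inputs (0 or 1) these reduce to the classical truth tables, which is a quick sanity check for any fuzzy operator.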

How would you construct a set of connectives that use the Lukasiewicz operators? Hint: look at documentation here: https://logictensornetworks.github.io/LTNtorch/fuzzy_ops.html
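Before looking up the class names, it can help to know what the Lukasiewicz operators compute. The sketch below writes out the standard Lukasiewicz formulas in plain Python; the function names are hypothetical, and the corresponding LTNtorch classes should be taken from the linked documentation rather than from this sketch.

```python
# Lukasiewicz operators on truth values in [0, 1].
# Illustrative names, not LTNtorch API.

def and_luk(x, y):
    # Lukasiewicz t-norm: max(0, x + y - 1)
    return max(0.0, x + y - 1.0)

def or_luk(x, y):
    # Lukasiewicz t-conorm: min(1, x + y)
    return min(1.0, x + y)

def implies_luk(x, y):
    # Lukasiewicz implication: min(1, 1 - x + y)
    return min(1.0, 1.0 - x + y)

a, b = 0.8, 0.5
print("a and b    :", and_luk(a, b))
print("a or b     :", or_luk(a, b))
print("a implies b:", implies_luk(a, b))
```

Unlike the product operators, these are piecewise linear and saturate at 0 or 1, so their gradients vanish on the saturated side. That is one reason a training setup might prefer the product configuration.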

## Learning embeddings with LTNs

We are going to build an extended model of the users/film example we have been looking at in class.

We will have 8 users: $a$, $b$, $c$, $d$, $e$, $f$, $g$, $h$, and 4 films: $j$, $k$, $l$, $m$.

Our **domains** are $people$ and $films$.

**Variables** are $x$ and $y$, both with domain $people$, and $u$, with domain $films$.

We will have predicates $F(x,y)$ for *friends* and $L(x, u)$ for *likes*

Question: What is the input domain of $F$? What is the input domain of $L$?

**Axioms**
$\mathcal{F} = \{(a, b), (a, c), (b,d), (c, d), (e, f), (e, g), (e, h), (f, h)\}$

$\mathcal{L} = \{(a, j), (a, l),  (b, l), (c, l), (e, k), (f, k), (g, k), (g, m), (h, m)\}$

- $F(x, y)$ for $(x, y) \in \mathcal{F}$
- $\neg F(x, y)$ for $(x, y) \not\in \mathcal{F}, x < y$
- $L(x, u)$ for $(x, u) \in \mathcal{L}$
- $\neg L(c, j)$, $\neg L(b, m)$, $\neg L(c, m)$, $\neg L(h, k)$
- $\forall x \neg F(x, x)$
- $\forall xyu (F(x, y) \land L(x, u)) \implies L(y, u)$

We can see this is not strictly logically satisfiable, since we have $F(a, c)$ and $L(a, j)$, which together imply $L(c, j)$ by the last axiom, but we also have $\neg L(c, j)$.
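The conflict can be checked mechanically. The following sketch treats the facts as crisp (true or false) and applies the rule $F(x, y) \land L(x, u) \implies L(y, u)$ only in the direction in which the pairs are listed in $\mathcal{F}$ (it does not assume symmetry). The variable names `friends`, `likes`, and `not_likes` are chosen here for illustration.

```python
# Facts transcribed from the axioms above.
friends = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
           ("e", "f"), ("e", "g"), ("e", "h"), ("f", "h")}
likes = {("a", "j"), ("a", "l"), ("b", "l"), ("c", "l"), ("e", "k"),
         ("f", "k"), ("g", "k"), ("g", "m"), ("h", "m")}
not_likes = {("c", "j"), ("b", "m"), ("c", "m"), ("h", "k")}

# A violation is a triple (x, y, u) where x and y are friends, x likes u,
# and the knowledge base explicitly asserts that y does not like u.
violations = sorted(
    (x, y, u)
    for (x, y) in friends
    for (liker, u) in likes
    if liker == x and (y, u) in not_likes
)

for x, y, u in violations:
    print(f"F({x},{y}) and L({x},{u}) imply L({y},{u}), "
          f"but not L({y},{u}) is asserted")
```

Under these assumptions the check reports the $(a, c, j)$ conflict mentioned in the text, together with two more involving $\neg L(h, k)$. Because LTN maximizes a degree of satisfaction rather than requiring every axiom to hold exactly, such conflicts lower the achievable satisfaction without making training impossible.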

#### Grounding
- $\mathcal{G}(people) = \mathbb{R}^5$
- $\mathcal{G}(films) = \mathbb{R}^3$
- $\mathcal{G}(a) = \vec{a},..., \mathcal{G}(h)= \vec{h} \in \mathbb{R}^5$
- $\mathcal{G}(j) = \vec{j},..., \mathcal{G}(m)= \vec{m} \in \mathbb{R}^3$
- $\mathcal{G}(x) = \mathcal{G}(y)= [\vec{a},...,\vec{h}]$
- $\mathcal{G}(u) = [\vec{j},...,\vec{m}]$
- $\mathcal{G}(F)$ is a function (feedforward neural network) from $x, y$ to $[0, 1]$. This is the truth value for whether $x$ and $y$ are friends.
- $\mathcal{G}(L)$ is a function (feedforward neural network) from $x, u$ to $[0, 1]$. This is the truth value for whether $x$ likes film $u$.
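One way to read the groundings of $F$ and $L$: if each network simply concatenates its arguments into a single input vector (an assumption; the notebook leaves this design choice to the exercise), then the width of the first layer follows directly from the embedding dimensions. A minimal sketch with placeholder vectors and hypothetical names:

```python
# Embedding sizes taken from the grounding above.
PEOPLE_DIM = 5
FILM_DIM = 3

# Placeholder embeddings (all zeros), used only to count dimensions.
person_x = [0.0] * PEOPLE_DIM
person_y = [0.0] * PEOPLE_DIM
film_u = [0.0] * FILM_DIM

# Assumption: a two-argument predicate sees its arguments concatenated.
friends_input = person_x + person_y   # input to F(x, y)
likes_input = person_x + film_u       # input to L(x, u)

f_in = len(friends_input)
l_in = len(likes_input)
print("input width for F:", f_in)
print("input width for L:", l_in)
```

With this choice the first linear layer of the friends network would take 10 inputs and that of the likes network 8.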

#### Data
We will start off by randomly initializing vectors for the people and the movies. 

We will then go on to define the knowledge we have about who is friends with who and who likes which movie.

In [60]:
import ltn
import torch

# Specify values for the dimensions of the people embeddings and the movie embeddings


# Initialize a dictionary of the form {'a': trainable ltn constant, ...} for each of the people

# Initialize a dictionary of the form {'j': trainable ltn constant, ...} for each of the movies

# Initialize two lists of tuples of strings, 'friends' and 'likes', that specify the relations
# F (who is friends with whom) and L (who likes which movie)


In [61]:
# We define the predicates F and L as feedforward neural networks
class ModelF(torch.nn.Module):
    def __init__(self):
        super(ModelF, self).__init__()
        # Specify ELU and sigmoid activation functions

        # Specify 3 layers, one with input dimension suitable for the 'friends' predicate
        # and output dim 16, one 16 to 16, and one 16 to 1

    def forward(self, *x):
        # Specify the forward pass with ELU on the hidden layers and sigmoid on the output
        raise NotImplementedError("Exercise: implement the forward pass of ModelF")


class ModelL(torch.nn.Module):
    def __init__(self):
        super(ModelL, self).__init__()
        # Specify ELU and sigmoid activation functions

        # Specify 3 layers, one with input dimension suitable for the 'likes' predicate
        # and output dim 16, one 16 to 16, and one 16 to 1

    def forward(self, *x):
        # Specify the forward pass with ELU on the hidden layers and sigmoid on the output
        raise NotImplementedError("Exercise: implement the forward pass of ModelL")

# Wrap the models in ltn.Predicate

# Define connectives, quantifiers, and SatAgg using the product configuration

Our aim is to modify the vectors in *people* and *movies*, and the parameters of the predicates, so that the axioms in the knowledge base are maximally satisfied.
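To make "maximally satisfied" concrete: LTN collapses the truth values of several formulas, or of one formula over many individuals, into a single number with an aggregator. The sketch below writes out in plain Python the generalized-mean formulas commonly used for this in the LTN literature. The names `pmean` and `pmean_error` are illustrative and not part of the notebook, and the library's own aggregators work on tensors rather than lists.

```python
def pmean(truths, p):
    """Generalized mean, commonly used for 'exists'.
    Larger p moves the result toward the maximum."""
    n = len(truths)
    return (sum(t ** p for t in truths) / n) ** (1.0 / p)


def pmean_error(truths, p):
    """One minus the generalized mean of the errors (1 - t), commonly used
    for 'forall' and for aggregating axioms.
    Larger p moves the result toward the minimum."""
    n = len(truths)
    return 1.0 - (sum((1.0 - t) ** p for t in truths) / n) ** (1.0 / p)


truths = [0.1, 0.9]
for p in (1, 2, 5):
    print(p, round(pmean(truths, p), 3), round(pmean_error(truths, p), 3))

# A training loss can then be taken as one minus the aggregated satisfaction.
loss = 1.0 - pmean_error(truths, 2)
print("loss:", round(loss, 3))
```

This is why the choice of `p` matters in training: a small `p` behaves like an average and tolerates outliers, while a larger `p` makes a universal quantifier stricter and an existential quantifier more focused on the best-scoring individual.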

We can also define some queries. For example, we can ask whether two people who like the same movie are friends. How would you write this out?

$\phi_1 = \forall x, y, u (L(x, u) \land L(y, u)) \implies F(x, y)$

Do you predict that the following will have a high or a low truth value in this knowledge base after training?

$\phi_2 = \forall x, y, u (F(x, y) \land \neg L(x, u)) \implies L(y, u)$


In [62]:
# this function returns the satisfaction level of the logical formula phi1
def phi1():
    # Create variables p, q, and r and initialize with the values from 'people' and 'movies'

    # Return the truth value of phi1
    raise NotImplementedError("Exercise: implement phi1")

# this function returns the satisfaction level of the logical formula phi2
def phi2():
    # Create variables p, q, and r and initialize with the values from 'people' and 'movies'

    # Return the truth value of phi2
    raise NotImplementedError("Exercise: implement phi2")

We now set up the training loop to train the embeddings for `people` and `movies`

In [63]:
# We have to optimize the parameters of the two predicates and also of the embeddings
params = list(F.parameters()) + list(L.parameters()) +\
            [i.value for i in people.values()] + [i.value for i in movies.values()]
optimizer = torch.optim.Adam(params, lr=0.001)

# Set up a training loop for 1000 epochs
for epoch in range(1000):
    # set a variable p_exist to be 1 up to epoch 200 and 6 thereafter

    
    optimizer.zero_grad()
    
    # create variables x_ and y_, grounded with values from the `people` dictionary,
    # and z_, grounded with values from the `movies` dictionary
    """
    NOTE: we update the embeddings at each step
        -> we should re-compute the variables.
    """
    

    # Set up a variable sat_agg which is the result of aggregating the truth values of all the axioms
    sat_agg = SatAgg(

        
        #Axioms about friends
 
        # Likes:

        # friendship is anti-reflexive (set p=5)

        # friendship is symmetric (set p=5)

        # everyone has a friend

        # Friends like similar movies
    )
    
    loss = 1. - sat_agg
    loss.backward()
    optimizer.step()

    # we print metrics every 20 epochs of training
    if epoch % 20 == 0:
        print(" epoch %d | loss %.4f | Train Sat %.3f | Phi1 Sat %.3f | Phi2 Sat %.3f" % (epoch, loss,
                    sat_agg, phi1(), phi2()))



### Visualizing results

We will now plot the starting state and the state after training.

In [64]:
import pandas as pd
import numpy as np
import math
import matplotlib.pyplot as plt

# Display options for pandas
pd.options.display.max_rows=999
pd.options.display.max_columns=999
pd.set_option('display.width',1000)
pd.options.display.float_format = '{:,.2f}'.format

# Heatmap function
def plt_heatmap(df, vmin=None, vmax=None):
    plt.pcolor(df, vmin=vmin, vmax=vmax)
    plt.yticks(np.arange(0.5,len(df.index),1),df.index)
    plt.xticks(np.arange(0.5,len(df.columns),1),df.columns)
    plt.colorbar()


In [65]:
# Create a dataframe holding facts about friends before training. Set the value to 1 if x < y and (x, y) is in friends,
# 0 if x < y and (x, y) is not in friends, and nan otherwise


# Create a dataframe holding facts about likes before training. Set the value to 1 if (x, y) is in likes, 
# and nan otherwise 


In [66]:
# Initialize variables p, q, and r with the values from people for p and q and movies for r


In [67]:
# Call the predicate F on variables p and q and set up a dataframe with the resulting value


# Call the predicate L on variables p and r and set up a dataframe with the resulting value


In [68]:
# Plot the facts about friends as a heatmap


In [69]:
# Plot the facts about likes as a heatmap


In [70]:
# Plot the truth values for friends after training as a heatmap


In [71]:
# Plot the truth values for likes after training as a heatmap
