# Knowledge Enhanced Neural Networks - Tutorial Notebook
With this notebook we present a simple application of KENN on the Citeseer Dataset, where relational logical knowledge is employed to improve the predictions of a baseline NN.

In [1]:
import pandas as pd 
import tensorflow as tf
import numpy as np 

from tensorflow import keras
import matplotlib.pyplot as plt
from tensorflow.keras import Model
from tensorflow.keras import layers
from KENN2.parsers import relational_parser
from tensorflow.keras.activations import softmax

In [2]:
# SET RANDOM SEED for tensorflow and numpy
random_seed = 0
tf.random.set_seed(random_seed)
np.random.seed(random_seed)

## The Citeseer Dataset
The Citeseer Dataset consists of 
- **3312 scientific publications** classified into one of six classes. 
- The citation network consists of **4732 links**. 
- Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of **3703 unique words**. 

The task is to **correctly classify each scientific publication**, given in input the features for each sample and the relational information provided by the citation network.

Here we use the **Inductive** paradigm: we consider only the edges (x,y) such that both x and y are in the Training/Validation/Test Set;

## Data Representation
![title](imgs/data_repr.png)
We represent all the features as a matrix with $n$ rows and $k$ columns, where $n$ is the number of nodes and $k$ is the number of features.

The relational data (i.e. the relations between the nodes of the network) are summarized in a table of index couples: for each edge in the network, we store the index of the first node and the index of the second node in a single row of the **indexes** matrix. The **relations** table contains the truth value of each ordered couple in the indexes matrix; note that we consider only couples of connected nodes. Furthermore, note that with "truth value" we mean a number in the range $[-\infty, \infty]$, since, inside the architecture of KENN, those are considered as the preactivations of the final predictions.

In [3]:
# IMPORT FEATURES
training_features = np.genfromtxt('dataset/CiteSeer/training_features.csv', delimiter=',')
validation_features = np.genfromtxt('dataset/CiteSeer/validation_features.csv', delimiter=',')
test_features = np.genfromtxt('dataset/CiteSeer/test_features.csv', delimiter=',')

# IMPORT LABELS
training_labels = np.genfromtxt('dataset/CiteSeer/training_labels.csv', delimiter=',')
validation_labels = np.genfromtxt('dataset/CiteSeer/validation_labels.csv', delimiter=',')
test_labels = np.genfromtxt('dataset/CiteSeer/test_labels.csv', delimiter=',')

# IMPORT EDGES INDEXES
indexes_training = np.genfromtxt('dataset/CiteSeer/indexes_training.csv', delimiter=',', dtype=int)
indexes_validation = np.genfromtxt('dataset/CiteSeer/indexes_validation.csv', delimiter=',', dtype=int)
indexes_test = np.genfromtxt('dataset/CiteSeer/indexes_test.csv', delimiter=',', dtype=int)

# IMPORT RELATIONS
relations_training = np.genfromtxt('dataset/Citeseer/relations_training.csv', delimiter=',')
relations_validations = np.genfromtxt('dataset/CiteSeer/relations_validation.csv', delimiter=',')
relations_test = np.genfromtxt('dataset/Citeseer/relations_test.csv', delimiter=',')

# make relations arrays column vectors
relations_training = np.expand_dims(relations_training, axis=1)
relations_validations = np.expand_dims(relations_validations, axis=1)
relations_test = np.expand_dims(relations_test, axis=1)

n_features = training_features.shape[1]

## Experiment Setup
![title](imgs/experiment_setup.png)


In this example, the relational data is injected directly from the citation network. KENN uses this data to increase the truth value of each clause that is given as input in the prior knowledge.
Specifically, in this case, the knowledge codifies the idea that papers cite works that are related to them (i.e. the topic of a paper is often the same of the paper it cites). 

For this reason we instantiate the clause:
$$\forall x \forall y \quad T(x) \land Cite(x,y) \rightarrow T(y)$$
multiple times, for all the topics $T$.

## Define the models
Here we define a Standard Sequential Model, and a Relational KENN model with Tensorflow Subclassing.

In [4]:
class Standard(Model):
    def __init__(self):
        super(Standard, self).__init__()

    def build(self, input_shape):
        self.h1 = layers.Dense(50, input_shape=input_shape, activation='relu')
        self.d1 = layers.Dropout(0.5)
        self.h2 = layers.Dense(50, input_shape=(50,), activation='relu')
        self.d2 = layers.Dropout(0.5)
        self.h3 = layers.Dense(50, input_shape=(50,), activation='relu')
        self.d3 = layers.Dropout(0.5)

        self.last_layer = layers.Dense(
            6, input_shape=(50,), activation='linear')

    def preactivations(self, inputs):
        x = self.h1(inputs)
        x = self.d1(x)
        x = self.h2(x)
        x = self.d2(x)
        x = self.h3(x)
        x = self.d3(x)

        return self.last_layer(x)
        
    def call(self, inputs, **kwargs):
        z = self.preactivations(inputs)

        return z, softmax(z)

In [5]:
class Kenn(Standard):
    """
    Model with 3 KENN layers.
    for each individual clause enhancer.
    """

    def __init__(self, knowledge_file, *args, **kwargs):
        super(Kenn, self).__init__(*args, **kwargs)
        self.knowledge = knowledge_file

    def build(self, input_shape):
        super(Kenn, self).build(input_shape)
        self.kenn_layer_1 = relational_parser(self.knowledge)
        self.kenn_layer_2 = relational_parser(self.knowledge)
        self.kenn_layer_3 = relational_parser(self.knowledge)

    def call(self, inputs, **kwargs):
        features = inputs[0]
        relations = inputs[1]
        sx = inputs[2]
        sy = inputs[3]

        z = self.preactivations(features)
        z, _ = self.kenn_layer_1(z, relations, sx, sy)
        z, _ = self.kenn_layer_2(z, relations, sx, sy)
        z, _ = self.kenn_layer_3(z, relations, sx, sy)

        return softmax(z)

## Train the model

In [6]:
# Training parameters
verbose = True
n_epochs = 300

# Early Stopping parameters
min_delta = 0.001
es_patience = 10

In [7]:
def accuracy(predictions, labels):
    correctly_classified = tf.equal(
        tf.argmax(predictions, 1), tf.argmax(labels, 1))
    return tf.reduce_mean(tf.cast(correctly_classified, tf.float32))

def callback_early_stopping(AccList, min_delta=min_delta, patience=es_patience):
    """
    Takes as argument the list with all the validation accuracies. 
    If patience=k, checks if the mean of the last k accuracies is higher than the mean of the 
    previous k accuracies (i.e. we check that we are not overfitting). If not, stops learning.
    """
    # No early stopping for 2*patience epochs
    if len(AccList)//patience < 2:
        return False
    # Mean loss for last patience epochs and second-last patience epochs
    mean_previous = np.mean(AccList[::-1][patience:2*patience])
    mean_recent = np.mean(AccList[::-1][:patience])
    delta = mean_recent - mean_previous

    if delta <= min_delta:
        print(
            "*CB_ES* Validation Accuracy didn't increase in the last %d epochs" % (patience))
        print("*CB_ES* delta:", delta)
        return True
    else:
        return False

In [8]:
# Trains the base NN
def train_standard_nn():
    """
    Trains Standard model with the Training Set, validates on Validation Set
    and evaluates accuracy on the Test Set.
    """
    optimizer = keras.optimizers.Adam()
    loss = keras.losses.CategoricalCrossentropy(from_logits=False)

    train_losses = []
    valid_losses = []
    valid_accuracies = []
    train_accuracies = []

    # TRAIN AND EVALUATE STANDARD MODEL
    for epoch in range(n_epochs):
        with tf.GradientTape() as tape:
            _, predictions = standard_model(training_features)
            training_loss = loss(predictions, training_labels)

            gradient = tape.gradient(training_loss, standard_model.variables)
            optimizer.apply_gradients(zip(gradient, standard_model.variables))


        _, t_predictions = standard_model(training_features)
        t_loss = loss(t_predictions,training_labels)

        _, v_predictions = standard_model(validation_features)
        v_loss = loss(v_predictions, validation_labels)

        train_losses.append(t_loss.numpy())
        valid_losses.append(v_loss.numpy())

        t_accuracy = accuracy(t_predictions, training_labels)
        v_accuracy = accuracy(v_predictions, validation_labels)

        train_accuracies.append(t_accuracy.numpy())
        valid_accuracies.append(v_accuracy.numpy())

        if verbose and epoch % 10 == 0:
            print(
                "Epoch {}: Training Loss: {:5.4f} Validation Loss: {:5.4f} | Train Accuracy: {:5.4f} Validation Accuracy: {:5.4f};".format(
                    epoch, t_loss, v_loss, t_accuracy, v_accuracy))
        
        # Early Stopping
        stopEarly = callback_early_stopping(valid_accuracies)
        if stopEarly:
            print("callback_early_stopping signal received at epoch= %d/%d" %
                    (epoch, n_epochs))
            print("Terminating training ")
            break

    return (train_losses,
            valid_losses,
            valid_accuracies,
            train_accuracies)

In [9]:
# Define and build model
standard_model = Standard()
standard_model.build((n_features,))

#Train
history_standard = train_standard_nn()

Epoch 0: Training Loss: 13.3542 Validation Loss: 13.4059 | Train Accuracy: 0.3598 Validation Accuracy: 0.2687;
Epoch 10: Training Loss: 12.2838 Validation Loss: 13.1026 | Train Accuracy: 0.8674 Validation Accuracy: 0.3881;
Epoch 20: Training Loss: 8.8543 Validation Loss: 11.9123 | Train Accuracy: 0.9432 Validation Accuracy: 0.4478;
Epoch 30: Training Loss: 3.1537 Validation Loss: 10.3593 | Train Accuracy: 0.9924 Validation Accuracy: 0.5672;
Epoch 40: Training Loss: 0.3242 Validation Loss: 9.5506 | Train Accuracy: 1.0000 Validation Accuracy: 0.5075;
*CB_ES* Validation Accuracy didn't increase in the last 10 epochs
*CB_ES* delta: -0.01791042
callback_early_stopping signal received at epoch= 44/300
Terminating training 


In [10]:
# plt.plot(history_standard[3], label='Train Accuracy')
# plt.plot(history_standard[2], label='Validation Accuracy')
# plt.legend(loc='best')
# plt.show()

In [11]:
_, predictions_test = standard_model(test_features)
test_accuracy = accuracy(predictions_test, test_labels)
print("Standard model Test Accuracy: {:.5f}%".format(test_accuracy.numpy() * 100))

Standard model Test Accuracy: 52.53271%


In [12]:
#### TRAIN KENN model
def train_kenn():
    optimizer = keras.optimizers.Adam()
    loss = keras.losses.CategoricalCrossentropy(from_logits=False)

    train_losses = []
    valid_losses = []
    valid_accuracies = []
    train_accuracies = []

    for epoch in range(n_epochs):
        with tf.GradientTape() as tape:

            predictions_KENN = kenn_model(
                [training_features, relations_training, np.expand_dims(indexes_training[:,0], axis=1), np.expand_dims(indexes_training[:,1], axis=1)])

            l = loss(predictions_KENN, training_labels)

            gradient = tape.gradient(l, kenn_model.variables)
            optimizer.apply_gradients(zip(gradient, kenn_model.variables))

        t_predictions = kenn_model(
                [training_features, relations_training, np.expand_dims(indexes_training[:,0], axis=1), np.expand_dims(indexes_training[:,1], axis=1)])
        t_loss = loss(t_predictions, training_labels)



        v_predictions = kenn_model([validation_features, relations_validations, np.expand_dims(indexes_validation[:,0], axis=1), np.expand_dims(indexes_validation[:,1], axis=1)])
        v_loss = loss(v_predictions, validation_labels)

        train_losses.append(t_loss)
        valid_losses.append(v_loss)

        t_accuracy = accuracy(t_predictions, training_labels)
        v_accuracy = accuracy(v_predictions, validation_labels)

        train_accuracies.append(t_accuracy)
        valid_accuracies.append(v_accuracy)

        if verbose and epoch % 10 == 0:
            print(
                "Epoch {}: Training Loss: {:5.4f} Validation Loss: {:5.4f} | Train Accuracy: {:5.4f} Validation Accuracy: {:5.4f};".format(
                    epoch, t_loss, v_loss, t_accuracy, v_accuracy))

            # Early Stopping
        stopEarly = callback_early_stopping(valid_accuracies)
        if stopEarly:
            print("callback_early_stopping signal received at epoch= %d/%d" %
                    (epoch, n_epochs))
            print("Terminating training ")
            break
    return (train_losses,
        valid_losses,
        valid_accuracies,
        train_accuracies)

In [13]:
kenn_model = Kenn('knowledge_base')
kenn_model.build((n_features,))

history_kenn = train_kenn()

Epoch 0: Training Loss: 13.3305 Validation Loss: 13.4524 | Train Accuracy: 0.3220 Validation Accuracy: 0.1940;
Epoch 10: Training Loss: 12.1357 Validation Loss: 13.2729 | Train Accuracy: 0.8561 Validation Accuracy: 0.2836;
Epoch 20: Training Loss: 8.8251 Validation Loss: 12.6146 | Train Accuracy: 0.9508 Validation Accuracy: 0.2836;
Epoch 30: Training Loss: 3.4365 Validation Loss: 11.2155 | Train Accuracy: 1.0000 Validation Accuracy: 0.4627;
Epoch 40: Training Loss: 0.3959 Validation Loss: 9.3138 | Train Accuracy: 1.0000 Validation Accuracy: 0.5821;
Epoch 50: Training Loss: 0.0516 Validation Loss: 8.1978 | Train Accuracy: 1.0000 Validation Accuracy: 0.6418;
Epoch 60: Training Loss: 0.0153 Validation Loss: 7.8132 | Train Accuracy: 1.0000 Validation Accuracy: 0.6269;
*CB_ES* Validation Accuracy didn't increase in the last 10 epochs
*CB_ES* delta: -1.1920929e-07
callback_early_stopping signal received at epoch= 64/300
Terminating training 


In [14]:
ind_x = np.expand_dims(indexes_test[:,0], axis=1)
ind_y = np.expand_dims(indexes_test[:,1], axis=1)

predictions_test_kenn = kenn_model(
    [test_features, relations_test, ind_x, ind_y])

test_accuracy_kenn = accuracy(predictions_test_kenn, test_labels)
print("KENN model Test Accuracy: {:.5f}%".format(test_accuracy_kenn.numpy() * 100))

KENN model Test Accuracy: 62.05971%


In [15]:
# plt.plot(history_kenn[3], label='Train Accuracy')
# plt.plot(history_kenn[2], label='Validation Accuracy')
# plt.legend(loc='best')
# plt.show()