<a href="https://colab.research.google.com/github/cric96/DL-exercise/blob/main/test_with_rnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural Networks applied in Aggregate Computing
In this notebook, I experiment with applying neural networks and recurrent neural networks (RNNs) in the context of Aggregate Computing (AC).

## Model

Usually, RNNs are trained on independent sequences. In Aggregate Computing, the temporal sequences of the nodes are correlated with each other according to a neighbourhood policy.

The key idea here is:
- express the system as a graph; offline, we can assume access to the entire system
- at each time step, aggregate the data coming from the neighbours of each node, then use an RNN to compute the expected output
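
The two steps above can be sketched in plain Python on a four-node line graph. This is a minimal illustration, not the notebook's model: the hand-written `update` rule (a source outputs 0, any other node outputs the minimum neighbour value plus 1) is an assumption that stands in for the trained network, and the names `neighbours`, `is_source`, and `step` are hypothetical.

```python
# Line graph 0 - 1 - 2 - 3, stored as neighbour lists (illustrative data).
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
is_source = {0: True, 1: False, 2: False, 3: False}
INF = float("inf")


def update(source, aggregated):
    # Stand-in for the neural network: an assumed hop-count rule.
    return 0.0 if source else aggregated + 1.0


def step(outputs):
    # Step 1: every node aggregates the previous outputs of its neighbours (min).
    # Step 2: every node computes its new output from its feature and the aggregate.
    return {
        node: update(is_source[node], min(outputs[m] for m in neighbours[node]))
        for node in neighbours
    }


outputs = {0: 0.0, 1: INF, 2: INF, 3: INF}
for _ in range(3):
    outputs = step(outputs)
print(outputs)  # {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
```

Each call to `step` reads only the outputs of the previous time step, so the sequences of different nodes depend on each other through the neighbourhood.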

# Imports

In [1]:
import tensorflow as tf
from datetime import datetime
from numpy.random import seed

# Simple Graph

In this notebook, I model the graph as:
$G = (N, E)$

where each node in $N$ has a feature vector and an output vector.

$E$ is expressed as an adjacency matrix, so $E_{i,j} = 1$ iff $i$ is a neighbour of $j$.

The input is a matrix that contains, for each node, the concatenation of its feature vector and its output vector: $I(n) = \langle f(n), o(n) \rangle$,
where $\langle a, b, c \rangle$ denotes the column-wise vector concatenation.
For instance, given the vectors $a = [1, 2]$ and $b = [2, 3]$, $\langle a, b \rangle = [1, 2, 2, 3]$.
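
As a small illustration of this notation, the snippet below builds the concatenation in plain Python. The helper name `concat` is hypothetical; the feature and output values are the ones used for the first graph of this notebook.

```python
def concat(*vectors):
    # Join the given vectors end to end into one flat list.
    return [x for vector in vectors for x in vector]


a = [1, 2]
b = [2, 3]
print(concat(a, b))  # [1, 2, 2, 3]

# Per-node input I(n) = <f(n), o(n)> for a four-node graph.
features = [[1], [0], [0], [0]]
outputs = [[0], [50000], [50000], [50000]]
inputs = [concat(f, o) for f, o in zip(features, outputs)]
print(inputs)  # [[1, 0], [0, 50000], [0, 50000], [0, 50000]]
```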

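The neighbour aggregation can be computed with matrix operations, which speeds up training:
$$ reduction(E * F) $$
where $F$ contains the data of all nodes, $*$ is the element-wise product of each row of $E$ with $F$ (as done with `tf.multiply` in the forward pass), and $reduction$ is a multiset function applied to each row (e.g. sum, mean, minimum).
In the code of this notebook, `neigh` stores 1 for neighbours and a large constant `n` for every other pair, so that the minimum reduction favours the values of the neighbours.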
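
A minimal sketch of the idea with a 0/1 adjacency matrix and interchangeable reductions is shown below. It uses plain Python lists instead of tensors; the function name `aggregate` and the node values are illustrative assumptions.

```python
from statistics import mean

# Adjacency matrix of the line graph 0 - 1 - 2 - 3.
E = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
F = [0.0, 5.0, 7.0, 9.0]  # one scalar value per node (illustrative)


def aggregate(E, F, reduction):
    # For every row, keep the values selected by the 1 entries and reduce them.
    return [reduction([F[j] for j, e in enumerate(row) if e == 1]) for row in E]


by_min = aggregate(E, F, min)
by_sum = aggregate(E, F, sum)
by_mean = aggregate(E, F, mean)
print(by_min)   # [5.0, 0.0, 5.0, 7.0]
print(by_sum)   # [5.0, 7.0, 14.0, 7.0]
print(by_mean)  # [5.0, 3.5, 7.0, 7.0]
```

Only the `reduction` argument changes between the three calls, which is what makes the reduction a pluggable strategy.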

Note that this model can easily be used in a decentralised setting.
Indeed, the data aggregation can be done without the global adjacency matrix, by only retrieving the data of the neighbours. The dense layers already work locally.
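
The following sketch shows what such a decentralised round could look like: every node only sees the values it receives and never reads a global matrix. The `Node` class, `run_round`, and the `hop_rule` stand-in for the network are hypothetical names introduced for this illustration.

```python
class Node:
    def __init__(self, is_source, output):
        self.is_source = is_source
        self.output = output
        self.inbox = []  # values received from neighbours in the current round

    def receive(self, value):
        self.inbox.append(value)

    def compute(self, model):
        # Local aggregation followed by the local model evaluation.
        aggregated = min(self.inbox)
        self.inbox = []
        self.output = model(self.is_source, aggregated)


def run_round(nodes, links, model):
    # Every node sends its previous output over its own links, then all compute.
    for sender, receiver in links:
        nodes[receiver].receive(nodes[sender].output)
    for node in nodes:
        node.compute(model)
    return [node.output for node in nodes]


def hop_rule(is_source, aggregated):
    # Assumed stand-in for the trained network.
    return 0.0 if is_source else aggregated + 1.0


nodes = [Node(True, 0.0), Node(False, 50000.0), Node(False, 50000.0), Node(False, 50000.0)]
links = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
for _ in range(3):
    outputs = run_round(nodes, links, hop_rule)
print(outputs)  # [0.0, 1.0, 2.0, 3.0]
```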

In [66]:
n = 50000
feature = tf.constant([
                     [1], 
                     [0],
                     [0],
                     [0],
                    ], dtype=tf.float32)
output = tf.constant([
                     [0], 
                     [n],
                     [n],
                     [n],
                    ], dtype=tf.float32)

input = tf.concat([feature, output], axis = 1)
input = input[:,tf.newaxis, :]

neigh = tf.constant([
                     [n, 1, n, n], 
                     [1, n, 1, n],
                     [n, 1, n, 1],
                     [n, n, 1, n],
                    ], dtype=tf.float32)

ground = tf.constant([
                     [0], 
                     [1],
                     [2],
                     [3],
                    ], dtype=tf.float32)

# Forward Pass

Given the output of the previous step, the adjacency matrix, and the feature vector, this function computes the values of the next evaluation.
So it will:
1. compute the neighbourhood feature by aggregating the previous outputs
2. concatenate the neighbourhood feature with the local feature
3. perform a forward pass of the neural network.
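
Step 1 can be illustrated without TensorFlow. The sketch below mirrors the encoding of `neigh` used in this notebook (weight 1 for neighbours, a large constant `n` otherwise) and the multiply-then-minimum aggregation of `forward`. The helper name `aggregate_min` and the sample output values are illustrative assumptions.

```python
n = 50000

# Line graph 0 - 1 - 2 - 3: weight 1 for neighbours, n for every other pair.
neigh = [
    [n, 1, n, n],
    [1, n, 1, n],
    [n, 1, n, 1],
    [n, n, 1, n],
]


def aggregate_min(neigh, outputs):
    # Row i holds neigh[i][j] * outputs[j]; the minimum selects the smallest
    # neighbour output as long as the non-neighbour products stay larger.
    return [min(w * o for w, o in zip(row, outputs)) for row in neigh]


# With small positive outputs the weights act as a mask.
masked = aggregate_min(neigh, [0.5, 1.0, 2.0, 3.0])
print(masked)  # [1.0, 0.5, 1.0, 2.0]

# An output of exactly 0 is not changed by the weight n, so it reaches every row.
initial = aggregate_min(neigh, [0, n, n, n])
print(initial)  # [0, 0, 0, 0]
```

The second call uses the initial outputs of the notebook, where the source node starts at 0: in that first step every node, neighbour or not, aggregates the value 0.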


In [54]:
def forward(result, neigh, feature, model, log_enable=False):
  if log_enable:
    print("Input = ", result)
  # previous output of every node, flattened to shape (nodes,)
  previous_output = tf.reshape(result[:, :, 1], feature.shape[0])
  # row i holds neigh[i, j] * output[j]; the row-wise minimum is the neighbourhood value
  neigh_evaluation = tf.reduce_min(tf.multiply(neigh, previous_output), 1) ## pass reduction strategy
  # network input: local feature followed by the aggregated neighbourhood value
  input_network = tf.concat([result[:, :, 0], neigh_evaluation[:, tf.newaxis]], 1)
  if log_enable:
    print("Neighbour Aggregation = ", input_network[:, tf.newaxis])
  result = model(input_network[:, tf.newaxis])
  result = tf.reshape(result, feature.shape[:2]) ## adapt the shape in order to work both in the linear model and in the recurrent model
  # re-attach the (constant) feature so that the result can be fed to the next step
  result = tf.concat([feature, result], axis=1)
  if log_enable:
    print("Output = ", result)
  return tf.expand_dims(result, 1)

# Model creation
Creates a sequential model from the given layers and input shape.

In [5]:
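def create_model(input_shape, layers):
  # the first dimension is the batch size (one entry per node), the rest is the per-node shape
  input_layer = tf.keras.layers.InputLayer(input_shape=input_shape[1:], batch_size=input_shape[0])
  # build a new list so that the list passed by the caller is not modified
  return tf.keras.Sequential([input_layer] + layers)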

## Recurrent model
A simple network made of multiple recurrent layers.


In [7]:
def instantiate_layers(output_count = 1):
  return [
    tf.keras.layers.GRU(units = 4, activation='relu', return_sequences=True, stateful=True),
    tf.keras.layers.GRU(units = output_count, activation='relu', return_sequences=False, stateful=True, bias_initializer='ones'),
  ]

## Linear Model
A simple network made of multiple dense layers.

In [8]:
def instantiate_linear_layers(output_count = 1):
  return [
    tf.keras.layers.Dense(units = 16, activation='relu'),
    tf.keras.layers.Dense(units = 8, activation='relu'),
    tf.keras.layers.Dense(units = output_count, activation='relu', bias_initializer='ones')
  ]

## Model instantiation


In [9]:
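linear = "linear"
recurrent = "recurrent"
same = "same"

## build a new model whose input shape matches the given input tensor
## note: only the linear and recurrent modes are handled, any other mode (including the default) returns None
def instatiate(input, mode = "same", output_count = 1):
  if mode == linear:
    return create_model(input.shape, instantiate_linear_layers(output_count))
  elif mode == recurrent:
    return create_model(input.shape, instantiate_layers(output_count))

## adapt a trained model to a new input:
## a linear model is returned as is, a recurrent one is rebuilt for the new input shape and receives the trained weights
def copy_model(model, mode = "same", input = "", output_count = 1):
  if mode == linear:
    return model
  elif mode == recurrent:
    change_model = instatiate(input, recurrent, output_count)
    change_model.set_weights(model.get_weights())
    return change_model

mode = linear
model = instatiate(input, linear)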

# Train function

In [55]:
def train(model, forward_fn, data, iteration, stabilise_in, stabilisation_check, loss, optimizer, each=100, verbose=False):
  input, neigh, feature, ground = data
  for j in range(iteration):
    with tf.GradientTape() as tape:
      result = input
      to_backprop = 0
      # warm-up steps: recorded on the tape, but they add no term to the loss
      for i in range(stabilise_in):
        result = forward_fn(result, neigh, feature, model)
      # checked steps: the loss is averaged over these evaluations
      for i in range(stabilisation_check):
        result = forward_fn(result, neigh, feature, model)
        to_backprop += 1 / stabilisation_check * loss(ground, result[:, 0, 1:2])
    model.reset_states() # clear the recurrent state (if any) before the next iteration
    gradient = tape.gradient(to_backprop, model.weights)
    optimizer.apply_gradients(zip(gradient, model.weights))
    if j % each == 0:
      if verbose:
        partial_result = result[:, 0, 1:2]
        tf.print("Ground truth: \n", tf.reshape(ground, ground.shape[0]))
        tf.print("Current prediciton \n", tf.reshape(partial_result, result.shape[0]))
        tf.print("Full output \n", result)
      print("Epoch ", j ,"Loss = ", tf.reduce_sum(to_backprop).numpy())


## Train loop
Here I only want to find a function that overfits, i.e. one that solves this specific problem.
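
Reading the `train` function above with `tf.losses.mse` and a single output per node, the quantity that is differentiated and printed as `Loss` can be written as follows. The symbols are introduced here only for illustration: $S$ is `stabilise_in`, $K$ is `stabilisation_check`, $g_i$ is the ground truth of node $i$, and $o_i^{(t)}$ is the output of node $i$ after $t$ forward passes.

$$ L = \frac{1}{K} \sum_{k=1}^{K} \sum_{i \in N} \left( g_i - o_i^{(S+k)} \right)^2 $$

The first $S$ passes add no term to the sum: they only give the values time to propagate through the graph before the error is measured.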


In [58]:
#seed(42)
#tf.random.set_seed(42)
model = instatiate(input, mode) ## comment to avoid the recomputation of weights

iteration = 2000
stabilise_in = 2
stabilisation_check = 10

loss = tf.losses.mse
optimizer = tf.optimizers.Adam()
data = (input, neigh, feature, ground)
train(model, forward, data, iteration, stabilise_in, stabilisation_check, loss, optimizer, 50)

Epoch  0 Loss =  5.618208
Epoch  50 Loss =  3.8985057
Epoch  100 Loss =  2.0568602
Epoch  150 Loss =  0.51155466
Epoch  200 Loss =  0.05412233
Epoch  250 Loss =  0.006758415
Epoch  300 Loss =  0.0030281323
Epoch  350 Loss =  0.0023959447
Epoch  400 Loss =  0.0026372876
Epoch  450 Loss =  0.0030181608
Epoch  500 Loss =  0.0034021153
Epoch  550 Loss =  0.0050702016
Epoch  600 Loss =  0.0067576747
Epoch  650 Loss =  0.0013892739
Epoch  700 Loss =  0.000451607
Epoch  750 Loss =  0.00015032403
Epoch  800 Loss =  4.7090678e-05
Epoch  850 Loss =  0.07242025
Epoch  900 Loss =  0.05615524
Epoch  950 Loss =  0.04536298
Epoch  1000 Loss =  0.04049966
Epoch  1050 Loss =  0.038068235
Epoch  1100 Loss =  0.036590878
Epoch  1150 Loss =  0.035115395
Epoch  1200 Loss =  0.033584524
Epoch  1250 Loss =  0.032018617


KeyboardInterrupt: 

# Validation
Check the model result

In [74]:
result = input
for i in range(2):
  result = forward(result, neigh, feature, model)
print(result)
model.reset_states()

tf.Tensor(
[[[1.        0.553976 ]]

 [[0.        1.0857401]]

 [[0.        1.0857401]]

 [[0.        1.0857401]]], shape=(4, 1, 2), dtype=float32)


## Save model

In [None]:
model.save('model_dense_2')

# Check result, generalisation
In this part, I apply the same network to other graphs, to check whether it generalises to different topologies.
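
To judge the outputs on the new graphs, it helps to know the values one would expect. The sketch below assumes that the target function is the hop count from the source node, which is consistent with the ground truth `[0, 1, 2, 3]` used for training but is not stated explicitly in the notebook. The function `hop_count` and the neighbour-list encoding are hypothetical helpers; the two graphs are the five-node line and the square used in this section.

```python
from collections import deque


def hop_count(adjacency, source):
    # Breadth-first search: distance in hops from the source to every node.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for other in adjacency[node]:
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    return [distance[node] for node in sorted(adjacency)]


line_5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(hop_count(line_5, 0))  # [0, 1, 2, 3, 4]
print(hop_count(square, 0))  # [0, 1, 1, 2]
```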

## Linear network

In [12]:
n = 50000
feature_validation = tf.constant([
                     [1], 
                     [0],
                     [0],
                     [0],
                     [0]
                    ], dtype=tf.float32)
output_validation = tf.constant([
                     [0], 
                     [n],
                     [n],
                     [n],
                     [n]
                    ], dtype=tf.float32)
input_validation = tf.concat([feature_validation, output_validation], axis = 1)
input_validation = input_validation[:,tf.newaxis, :]
neigh_validation = tf.constant([
                     [n, 1, n, n, n], 
                     [1, n, 1, n, n],
                     [n, 1, n, 1, n],
                     [n, n, 1, n, 1],
                     [n, n, n, 1, n],
                    ], dtype=tf.float32)

change_model = copy_model(model, mode, input_validation)

result_validation = input_validation
model.reset_states()
for i in range(4):
  result_validation = forward(result_validation, neigh_validation, feature_validation, change_model, False)
print(result_validation)

tf.Tensor(
[[[1.        1.1871182]]

 [[0.        1.3581653]]

 [[0.        1.3921162]]

 [[0.        1.399143 ]]

 [[0.        1.3996701]]], shape=(5, 1, 2), dtype=float32)


## Square like

In [None]:
n = 50000
feature_validation = tf.constant([
                     [1], 
                     [0],
                     [0],
                     [0],
                    ], dtype=tf.float32)
output_validation = tf.constant([
                     [0], 
                     [n],
                     [n],
                     [n],
                    ], dtype=tf.float32)
input_validation = tf.concat([feature_validation, output_validation], axis = 1)
input_validation = input_validation[:,tf.newaxis, :]
neigh_validation = tf.constant([
                     [n, 1, 1, n], 
                     [1, n, n, 1],
                     [1, n, n, 1],
                     [n, 1, 1, n],
                    ], dtype=tf.float32)

change_model = copy_model(model, mode, input_validation)

result_validation = input_validation
model.reset_states() ## if has recurrent layers
for i in range(2):
  result_validation = forward(result_validation, neigh_validation, feature_validation, change_model, False)
print(result_validation)

## Local test

In [None]:
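## the input is [feature, neighbourhood value]: a non-source node should return the neighbourhood value plus one
print(model(tf.constant([[0.0, 100.0]]))) ## should be ~ 101
print(model(tf.constant([[0.0, 10.0]]))) ## should be ~ 11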

# Advanced graph data
In this case, we assume that there is a collective state that evolves along with the execution of the nodes.
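
One way to read the state evolution used in this section: besides the minimum over the neighbour outputs, each node also receives the sum of the states of its neighbours. The sketch below reproduces that sum in plain Python, deriving a 0/1 mask from the weighted matrix in the same way as `forward_with_state` (the largest weight marks the non-neighbours). The sample state values are illustrative assumptions.

```python
n = 100

# Line graph 0 - 1 - 2 - 3: weight 1 for neighbours, n for every other pair.
neigh = [
    [n, 1, n, n],
    [1, n, 1, n],
    [n, 1, n, 1],
    [n, n, 1, n],
]
states = [1.0, 2.0, 3.0, 4.0]  # illustrative state of each node

# The largest weight marks non-neighbours: turn the matrix into a 0/1 mask.
max_value = max(max(row) for row in neigh)
mask = [[0.0 if w == max_value else 1.0 for w in row] for row in neigh]

# Each node sums the states of its neighbours only.
sum_state = [sum(m * s for m, s in zip(row, states)) for row in mask]
print(sum_state)  # [2.0, 4.0, 6.0, 3.0]
```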

In [48]:
n = 100
feature_state = tf.constant([[1],[0],[0],[0]], dtype=tf.float32)
output_state = tf.constant([[0],[n],[n],[n]], dtype=tf.float32)
state = tf.constant([[0],[0],[0],[0]], dtype=tf.float32)
#state = tf.random.uniform((8, 1), minval=0, maxval=1)

input_state = tf.concat([feature_state, output_state, state], axis = 1)
input_state = input_state[:,tf.newaxis, :]

neigh_state = tf.constant([
                     [n, 1, n, n],
                     [1, n, 1, n],
                     [n, 1, n, 1],
                     [n, n, 1, n]
                    ], dtype=tf.float32)
ground_state = tf.constant([[0],[1],[2],[3]], dtype=tf.float32)
## normalisation
# feature_state = feature_state / n
# output_state = output_state / n
# neigh_state = neigh_state / n
# ground_state = ground_state / n

## Forward pass revisited


In [45]:
def forward_with_state(result, neigh, feature, model_eval, log_enable=False):
  input_shape_data = result[:, :, 1:] # output and state columns, used below to restore the shape
  if log_enable:
    print("Input = ", result)
  old_output = tf.reshape(result[:, :, 1], feature.shape[0])
  old_state = tf.reshape(result[:, :, 2], feature.shape[0])
  # the largest weight marks the non-neighbours: build a mask that is 1 only for neighbours
  max_value = tf.reduce_max(neigh)
  neigh_with_zero = tf.where(neigh == max_value, 0.0, 1.0)
  neigh_output_evaluation = tf.reduce_min(tf.multiply(neigh, old_output), 1) ## pass reduction strategy
  neigh_state_evaluation = tf.reduce_sum(tf.multiply(neigh_with_zero, old_state), 1) ## state evolution strategy
  # network input: local feature, aggregated output, aggregated state
  input_network = tf.concat([result[:, :, 0], neigh_output_evaluation[:, tf.newaxis], neigh_state_evaluation[:, tf.newaxis]], 1)
  if log_enable:
    print("Neighbour Aggregation = ", input_network[:, tf.newaxis])
  result = model_eval(input_network[:, tf.newaxis])
  result = tf.reshape(result, (input_shape_data.shape[0], input_shape_data.shape[2])) ## adapt the shape in order to work both in the linear model and in the recurrent model
  # re-attach the (constant) feature so that the result can be fed to the next step
  result = tf.concat([feature, result], axis=1)
  if log_enable:
    print("Output = ", result)
  return tf.expand_dims(result, 1)

# Train revisited

In [46]:
#seed(42)
#tf.random.set_seed(42)
model_state = instatiate(input_state, mode, 2) ## comment to avoid the recomputation of weights
iteration = 2000
stabilise_in = 4
stabilisation_check = 10

loss = tf.losses.mse
optimizer = tf.optimizers.Adam()
data_state = (input_state, neigh_state, feature_state, ground_state)
train(model_state, forward_with_state, data_state, iteration, stabilise_in, stabilisation_check, loss, optimizer, 50)

result = input_state
for i in range(10):
  result = forward_with_state(result, neigh_state, feature_state, model_state)
print(result)
model_state.reset_states()

Ground truth: 
 [0 1 2 3]
Current prediciton 
 [0.744017601 0.847516596 0.865683258 1.00766766]
Full output 
 [[[1 0.744017601 1.00143075]]

 [[0 0.847516596 1.00442922]]

 [[0 0.865683258 0.999147832]]

 [[0 1.00766766 1.0463928]]]
Epoch  0 Loss =  5.832711
Ground truth: 
 [0 1 2 3]
Current prediciton 
 [0.312663794 1.1858381 1.35088348 1.20530963]
Full output 
 [[[1 0.312663794 1.3992821]]

 [[0 1.1858381 0.952123523]]

 [[0 1.35088348 1.13881898]]

 [[0 1.20530963 1.10157347]]]
Epoch  50 Loss =  3.774544
Ground truth: 
 [0 1 2 3]
Current prediciton 
 [0.151855707 1.36951458 1.84292841 1.73007667]
Full output 
 [[[1 0.151855707 1.69233215]]

 [[0 1.36951458 1.15353513]]

 [[0 1.84292841 1.43383014]]

 [[0 1.73007667 1.34738517]]]
Epoch  100 Loss =  1.7968051
Ground truth: 
 [0 1 2 3]
Current prediciton 
 [0.243330419 1.40868688 2.19076133 2.34219766]
Full output 
 [[[1 0.243330419 1.72881305]]

 [[0 1.40868688 1.07939732]]

 [[0 2.19076133 1.50950706]]

 [[0 2.34219766 1.50494123]]]


KeyboardInterrupt: 

# Validation revisited

In [52]:
result = input_state
for i in range(10):
  result = forward_with_state(result, neigh_state, feature_state, model_state, False)
  
print(result)

tf.Tensor(
[[[0.        4.4675436 1.7458205]]

 [[0.        4.3244076 1.6575043]]

 [[0.        4.299906  1.6525524]]

 [[0.        4.4896474 1.7541965]]], shape=(4, 1, 3), dtype=float32)
