## Before training the network

##### 1. Download the QM9 data at: https://figshare.com/collections/Quantum_chemistry_structures_and_properties_of_134_kilo_molecules/978904
##### 2. Run "file-parser_v3.py". Before you do, change the directory in the code to the directory where you saved the dataset.
##### 3. Run "read_hdf5_v3.py". This takes a while and creates a file called "qm9-own-features.h5".
##### 4. Run "read_qm9_adjacency_v2.py".
##### 5. Run "custom_graph_network.py". Remember to change the directory to the directory where you want to work
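
The steps above can be chained in a small driver script. This is only a minimal sketch: it assumes all four scripts sit in the current working directory and that the directories inside them have already been edited. The helpers `build_commands` and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script names are taken from the steps above, in the order they must run.
PIPELINE = [
    "file-parser_v3.py",
    "read_hdf5_v3.py",
    "read_qm9_adjacency_v2.py",
    "custom_graph_network.py",
]


def build_commands(scripts, python=sys.executable):
    """Return one [interpreter, script] command per pipeline step."""
    return [[python, script] for script in scripts]


def run_pipeline(scripts=PIPELINE):
    """Run each step in order; check=True aborts at the first failing script."""
    for cmd in build_commands(scripts):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Call run_pipeline() once the directories inside the scripts are set.
```

Stopping at the first failure matters here because each script depends on the files written by the previous one.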

## After training the network

##### 1. After training, go to the ECCConv file in the Python 3.9 folder at ./local/lib/python3.9/site-packages/spektral/layers/convolutional.
##### 2. At the return statement on line 187, also return the messages: "return output" -> "return output, messages".
##### 3. Now go to the custom_graph_network.py file and add the variable "msg" to each conv call. That is, "x = self.conv1([x,a,e])" -> "x, msg = self.conv1([x,a,e])" and so on.
##### 4. Then, in custom_graph_network.py, uncomment "self.msg_list.append(self.extract_msg(msg,e))" after every conv call you want the messages from.
##### 5. Next, run the cells in the "Extracting the messages" section in order. This yields a NumPy array with the message data.
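
To illustrate what the patched layer hands back, here is a toy NumPy sketch of an edge-conditioned convolution that returns both the node output and the per-edge messages. It is not Spektral's implementation: the function `toy_ecc_layer`, its arguments, and the single linear `kernel` standing in for the kernel network are all assumptions made for illustration, and root weights, bias and activation are left out. It is not part of the run sequence.

```python
import numpy as np


def toy_ecc_layer(x, sources, targets, e, kernel):
    """Toy edge-conditioned convolution that also returns per-edge messages.

    x:       (N, F_in) node features
    sources: (E,) index of the sending node of each edge
    targets: (E,) index of the receiving node of each edge
    e:       (E, S) edge features
    kernel:  (S, F_in * F_out) linear stand-in for the kernel network
    """
    n_nodes, f_in = x.shape
    f_out = kernel.shape[1] // f_in
    # One (F_in, F_out) weight matrix per edge, generated from the edge features
    weights = (e @ kernel).reshape(-1, f_in, f_out)
    # Message on each edge: sender features times that edge's weight matrix
    messages = np.einsum("ef,efo->eo", x[sources], weights)
    # Sum the incoming messages at each receiving node
    output = np.zeros((n_nodes, f_out))
    np.add.at(output, targets, messages)
    return output, messages  # the patched return: output plus messages


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 2))           # 3 nodes, 2 features each
sources = np.array([0, 1, 2, 0])      # sending node of each of the 4 edges
targets = np.array([1, 0, 0, 2])      # receiving node of each edge
e = rng.normal(size=(4, 5))           # 5 features per edge
kernel = rng.normal(size=(5, 2 * 4))  # F_in = 2, F_out = 4

output, msg = toy_ecc_layer(x, sources, targets, e, kernel)
print(output.shape, msg.shape)
```

The second return value has one row per edge, which is why every conv call in the network has to unpack two values once the layer is patched.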

### Extracting the messages

In [None]:
from custom_qm9_dataset_v2 import QM9, Net
from spektral.data import DisjointLoader
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import math

In [None]:
# Build the training split of QM9 using the QM9 class imported above

dataset = QM9(raw_h5file='qm9-raw.h5',
              processed_h5file='qm9-own-features.h5',
              dataset='train',
              test_fraction=0.9,
              col=10)

# Load the data using Spektral's DisjointLoader
loader = DisjointLoader(dataset, batch_size=1)

# Get one batch of data from the disjoint loader
batch = next(loader)

# Unpack the inputs and the targets from the loaded batch
inputs, target = batch

In [None]:
# Load the trained model weights from where you stored them
model = Net()
model.load_weights("/home/sebastian/thesisnn/models/my_model_init_zero")

In [None]:
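# Loop over the dataset so that each forward pass appends its messages to model.msg_list.
# Samples pass through the trained net one at a time, so this takes a while.
# epochs=1 stops the loader after one pass (by default it iterates indefinitely);
# shuffle=False keeps the samples in dataset order.
loader = DisjointLoader(dataset, batch_size=1, epochs=1, shuffle=False)

for batch in loader:
    inp, _ = batch
    _ = model(inp)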

In [None]:
# Save the messages to a NumPy array (written to msg_data.npy)
msg_data = np.array(model.msg_list)
np.save('msg_data', msg_data)

In [None]:
# Split the messages and their corresponding inputs by layer.
# msg_list is filled in the order ecc1, ecc2, ecc3, ecc1, ..., hence the stride of 3.
# This only works if you uncommented "self.msg_list.append(self.extract_msg(msg,e))" after EVERY convolutional layer.
ecc1_data = msg_data[0::3]
np.save('ecc1_data', ecc1_data)
ecc2_data = msg_data[1::3]
np.save('ecc2_data', ecc2_data)
ecc3_data = msg_data[2::3]
np.save('ecc3_data', ecc3_data)
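
The stride-based split above can be checked on a toy list. This is a minimal sketch, assuming the message list is filled in layer order on every forward pass; `split_by_layer` is a hypothetical helper, and the strings stand in for real message arrays.

```python
def split_by_layer(msg_list, n_layers):
    """De-interleave a flat list recorded as layer1, ..., layerN, layer1, ..."""
    if len(msg_list) % n_layers != 0:
        raise ValueError("msg_list length is not a multiple of n_layers")
    return [msg_list[i::n_layers] for i in range(n_layers)]


# Toy stand-in for model.msg_list: two molecules passed through three layers
msg_list = [
    "mol0-ecc1", "mol0-ecc2", "mol0-ecc3",
    "mol1-ecc1", "mol1-ecc2", "mol1-ecc3",
]

ecc1, ecc2, ecc3 = split_by_layer(msg_list, n_layers=3)
print(ecc1)
```

The length check catches the case where the append line was uncommented after only some of the convolutional layers, in which case a stride of 3 would silently mix layers.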

### PySR

##### 1. Go to the "PySR_discovery.py" file
##### 2. Make sure to specify the correct ECCConv-message you want to approximate.
##### 3. Run "PySR_discovery.py"
##### 4. Read the best equation from the fitted model (printing the model lists the candidate equations).

In [None]:
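# Sample PySR code
from pysr import PySRRegressor

msg_data = np.load('ecc3_data.npy')

y = msg_data[:, 0]   # Column 0 holds the messages (regression target)
X = msg_data[:, 1:]  # Remaining columns hold the inputs

model = PySRRegressor(
    niterations=5,
    binary_operators=["+", "-", "*", "/"],
    unary_operators=[
        "exp",
        "cos",
        "sin",
        "xsq(x) = x^2",  # Custom operator (julia syntax)
        "inv(x) = 1/x",  # Custom operator (julia syntax)
    ],
    # Custom operators need a SymPy equivalent so the equations can be exported
    extra_sympy_mappings={"xsq": lambda x: x**2, "inv": lambda x: 1 / x},
    model_selection="best",
    loss="loss(x, y) = (x - y)^2",  # Custom loss function (julia syntax)
)
model.fit(X, y)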
