Load dependencies

In [None]:
import numpy as np
import pandas as pd
import os
import sys
import json
import tensorflow as tf
import NN_Predictor

In [None]:
from pathlib import Path
path = Path.cwd()
sys.path.append(str(path))  # sys.path entries should be strings, not Path objects
file_path = os.path.join(path.parent,'files', 'NN')
output_path = os.path.join(path.parent,'files', 'custom scores')
rxn_ids = np.load(file_path+'/rxn_ids.npy').astype('str')

# TensorFlow emits warnings because the NN is not perfectly optimized, so only log errors
tf.get_logger().setLevel('ERROR')

## Introduction

This notebook shows some examples of making predictions with the neural network (NN).

## Making predictions using the NN

Load your favourite neural network:

In [None]:
NN = NN_Predictor.load_NN(path=file_path+'/NN_full.h5')

Load the data. The current NN requires a binary list indicating the presence of each reaction, and this list must be in the right order (maybe something that I can solve later?).
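
Because the encoding and length of the input matter, a quick sanity check can catch mistakes before they reach the network. The following is only a minimal sketch: `check_binary_input` is a hypothetical helper that is not part of `NN_Predictor`, and the toy vector stands in for real data.

```python
import numpy as np

def check_binary_input(vector, n_reactions):
    """Return the input as an int array after basic sanity checks."""
    arr = np.asarray(vector)
    # The last axis must have one entry per reaction
    if arr.shape[-1] != n_reactions:
        raise ValueError(f"expected {n_reactions} entries, got {arr.shape[-1]}")
    # Only 0 (absent) and 1 (present) are allowed
    if not np.isin(arr, (0, 1)).all():
        raise ValueError("input must contain only 0 and 1")
    return arr.astype(int)

# Toy data: 5 reactions, 3 of them present (illustrative only)
toy_input = np.array([1, 0, 1, 1, 0])
checked = check_binary_input(toy_input, n_reactions=5)
```

In the notebook, `n_reactions` would presumably be `len(rxn_ids)`. Note that a check like this cannot detect a wrong order, only a wrong length or wrong values.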

First, let's try it with a single reaction set:

In [None]:
input_data = np.load(file_path+'/example_binary.npy')

This example already has the right order, so we can make our prediction immediately:

In [None]:
prediction = NN_Predictor.make_prediction(input_data,NN)

We can then save this as a dictionary (.json), so we know which prediction corresponds to which reaction:

In [None]:
pdic = dict(zip(rxn_ids, prediction.astype('float')))
# The with-block closes the file even if json.dump raises
with open(output_path+"/prediction_example.json", 'w') as p_file:
    json.dump(pdic, p_file)
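
To use the saved scores later, the JSON file can be read back into a dictionary and, for instance, ranked by score. This is a small sketch with made-up reaction ids and scores written to a temporary directory, not the notebook's actual output.

```python
import json
import os
import tempfile

# Made-up reaction ids and scores standing in for a real prediction
toy_scores = {"rxn_a": 0.12, "rxn_b": 0.97, "rxn_c": 0.55}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "prediction_example.json")
    # Write the dictionary the same way the notebook does
    with open(json_path, "w") as handle:
        json.dump(toy_scores, handle)
    # Read it back into a plain dict of id -> score
    with open(json_path) as handle:
        loaded = json.load(handle)

# Reaction ids ordered from highest to lowest score
ranked = sorted(loaded, key=loaded.get, reverse=True)
```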

What if we only have a list of reaction ids? (The following part could maybe be integrated into the make_prediction function; this was a bit of a random idea, but it seems to work.)

In [None]:
#creates a random list of 500 reactions
rand_reaction_set = np.random.choice(rxn_ids, 500, replace=False)

#converts this to the right format
b_list = NN_Predictor.convert_reaction_list(rand_reaction_set)
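
To illustrate the idea behind such a conversion, here is a sketch of how a list of ids can be turned into a binary presence vector with plain NumPy. It is not necessarily how `convert_reaction_list` is implemented; `reference_ids` and the ids in it are made up, and in the notebook the reference order would presumably be `rxn_ids`.

```python
import numpy as np

# Hypothetical reference order of reactions
reference_ids = np.array(["rxn_a", "rxn_b", "rxn_c", "rxn_d", "rxn_e"])
# Reaction ids present in the model, in arbitrary order
present = ["rxn_d", "rxn_a"]

# 1 where a reference reaction occurs in the list, 0 otherwise
binary = np.isin(reference_ids, present).astype(int)

# Ids missing from the reference order would be dropped silently, so list them
unknown = sorted(set(present) - set(reference_ids))
```

The result always follows the reference order, regardless of the order of the input list, which is why the order of `present` does not matter here.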


## Multiple predictions

We can also make predictions for multiple models at the same time, starting with a reaction presence dataframe where the rows are the different reactions and the columns are the model ids:

In [None]:
#multiple binary reaction lists (csv)
input_path = file_path+'/Sample_reaction_presence.csv'
df = pd.read_csv(input_path, index_col=0)
model_ids = df.columns
rxn_ids = df.index
input_data = df.to_numpy().T  # The neural network makes predictions per row, so we need to transpose the input
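
If the rows of a presence table are not guaranteed to be in the order the network expects, they can be aligned before transposing. The sketch below uses a toy table and a made-up `reference_ids` list; it assumes that a reaction missing from the table should be treated as absent (0).

```python
import pandas as pd

# Hypothetical reference order expected by the network
reference_ids = ["rxn_a", "rxn_b", "rxn_c", "rxn_d"]

# Toy presence table: rows are reactions (shuffled, one missing), columns are models
toy_df = pd.DataFrame(
    {"model_1": [1, 0, 1], "model_2": [0, 1, 1]},
    index=["rxn_c", "rxn_a", "rxn_b"],
)

# Reorder rows to the reference order; reactions absent from the table become 0
aligned = toy_df.reindex(reference_ids, fill_value=0)

# One row per model, one column per reaction
toy_input = aligned.to_numpy().T
```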

We can then create an array of prediction scores with the same order of reaction and model ids:

In [None]:
# make predictions with the network loaded earlier
prediction = NN_Predictor.make_prediction(input_data,NN)

Finally, we can create a dataframe of predictions, so we can see which score corresponds to which model and reaction id:

In [None]:
t_prediction = prediction.T  # Transpose the data back; this is convoluted, I will see if I can make this better
df_p = pd.DataFrame(index=rxn_ids, columns=model_ids, data=t_prediction)
df_p.to_csv(output_path+'/Multiple_NN_predictions.csv')
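
One possible way to hide the two transposes is a small wrapper that takes the presence table and returns a table of scores with the same layout. This is only a sketch: `predict_frame` is a hypothetical helper, and `toy_predict` is a stand-in for the real network that simply halves its input.

```python
import numpy as np
import pandas as pd

def predict_frame(presence_df, predict_fn):
    """Apply a row-wise predictor to a reactions x models table; keep the layout."""
    # Transpose so that each row is one model, as the predictor expects
    scores = predict_fn(presence_df.to_numpy().T)
    # Transpose back and reattach the original reaction and model ids
    return pd.DataFrame(scores.T, index=presence_df.index, columns=presence_df.columns)

# Stand-in predictor for illustration only: halves every input value
def toy_predict(batch):
    return np.asarray(batch, dtype=float) * 0.5

toy_df = pd.DataFrame({"model_1": [1, 0], "model_2": [1, 1]}, index=["rxn_a", "rxn_b"])
toy_scores = predict_frame(toy_df, toy_predict)
```

In this notebook, the predictor could presumably be passed as `lambda x: NN_Predictor.make_prediction(x, NN)`, assuming it returns one row of scores per model as in the cells above.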

In [None]:
df_p