[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/toby-htx/ONNX-Sharing-Session/blob/main/Demo1_Interoperability.ipynb)

# **Demo 1: Interoperability**

**PyTorch -> ONNX -> TensorFlow**

In this demo, we convert a model written in PyTorch to the ONNX format, then convert the resulting ONNX model into a TensorFlow model. Specifically, our TensorFlow BiLSTM model, which takes Google's Word2Vec embeddings as input, has been rewritten in PyTorch and was used for the Fine Grained Sentiment Analysis workstream. We will convert this PyTorch model into an ONNX model, and subsequently into a TensorFlow model.

You will need to change the **Runtime** to have a **TPU hardware accelerator**, then select '**Run All**'.


##Section 1: PyTorch Model##

1) **Import the Word2Vec embeddings**. This takes around **20 minutes** because the download is huge.

In [None]:
import gensim.downloader as api

w2v = api.load("word2vec-google-news-300") 

2) Import the dataset and preprocess it.

In [None]:
!git clone https://github.com/toby-htx/Onnx-Sharing-Session.git

In [None]:
import pandas as pd
from sklearn import preprocessing
import re

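def process_text(document):

    # Make sure we are working with a string before applying any regex
    document = str(document)

    # Replace all the special (non-word) characters with a space
    document = re.sub(r'\W', ' ', document)

    # Collapse extra white space last, so the spaces added above are cleaned up too
    document = re.sub(r'\s+', ' ', document).strip()

    return document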

df = pd.read_csv('./Onnx-Sharing-Session/Data/Isear(Fear&Joy).csv')
df = df[['Emotion','Statement']]
df['preprocessedStatement'] = df.Statement.apply(process_text)

le = preprocessing.LabelEncoder()
df['Emotion'] = le.fit_transform(df['Emotion'])

X = df['preprocessedStatement'].tolist()
Y = df.pop('Emotion').tolist()

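# Show which emotion each encoded label (0, 1) stands for
print(list(le.inverse_transform([0, 1])))

# (label, text) pairs; this iterator is consumed once when building the vocab
train_iter = zip(Y, X)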
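
To make the effect of the cleaning step concrete, here is a minimal standalone sketch. The function name `clean_text` and the sample sentence are illustrative only; they are not part of the notebook or the ISEAR data.

```python
import re

def clean_text(document):
    """Illustrative stand-in for the notebook's cleaning step."""
    document = str(document)                          # tolerate non-string cells
    document = re.sub(r'\W', ' ', document)           # special characters -> space
    document = re.sub(r'\s+', ' ', document).strip()  # collapse repeated whitespace
    return document

sample = "I   passed my exam!!  (finally)"
cleaned = clean_text(sample)
print(cleaned)
```

Punctuation is replaced by spaces first and whitespace is collapsed afterwards, so no double spaces are left behind.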

3) Build the **vocab** and the **embedding matrix** the **PyTorch** way.


In [None]:
import torch
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

tokenizer = get_tokenizer('basic_english')

def yield_tokens(data_iter):
    for _, text in data_iter:
        yield tokenizer(text)

vocab = build_vocab_from_iterator(yield_tokens(train_iter), specials=["<unk>"])
vocab.set_default_index(vocab["<unk>"])
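
Conceptually, the vocab is a mapping from tokens to integer indices, with a default index for words that were never seen. The sketch below hand-rolls that idea in plain Python. `build_toy_vocab` and `lookup` are hypothetical helpers for illustration, and torchtext's exact ordering rules may differ from the ones assumed here.

```python
from collections import Counter

def build_toy_vocab(token_lists, specials=("<unk>",)):
    """Hand-rolled illustration: specials first, then tokens by descending frequency."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    # Sort by frequency (descending), then alphabetically for a stable order
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    itos = list(specials) + ordered
    stoi = {tok: i for i, tok in enumerate(itos)}
    return stoi, itos

def lookup(stoi, tokens, default_index=0):
    """Map tokens to indices, falling back to the default index for unknown words."""
    return [stoi.get(tok, default_index) for tok in tokens]

stoi, itos = build_toy_vocab([["i", "was", "happy"], ["i", "was", "afraid"]])
ids = lookup(stoi, ["i", "was", "terrified"])
print(itos)
print(ids)
```

The unseen word "terrified" falls back to index 0, which is the role `set_default_index` plays for `<unk>` in the cell above.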

In [None]:
import numpy as np

vocab_size = len(vocab)
weights_matrix = np.zeros((vocab_size, 300))
words_found = 0

for i, word in enumerate(vocab.get_itos()):
    try:
        weights_matrix[i] = w2v[word]
        words_found += 1
    except KeyError:
        # Word not in Word2Vec: keep the all-zero row.
        # Alternative: weights_matrix[i] = np.random.rand(300)
        pass
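
The cell above counts `words_found` but never reports it. As a minimal sketch of how the lookup and the coverage figure fit together, the example below uses a toy 3-dimensional embedding table. `build_weights`, `toy_vectors`, and `toy_itos` are made-up names and values, not the real Word2Vec data.

```python
import numpy as np

def build_weights(itos, vectors, dim):
    """Return (matrix, words_found); rows for missing words stay all-zero."""
    matrix = np.zeros((len(itos), dim))
    found = 0
    for i, word in enumerate(itos):
        if word in vectors:
            matrix[i] = vectors[word]
            found += 1
    return matrix, found

# Toy stand-ins for the pretrained vectors and the vocab's index-to-string list
toy_vectors = {"happy": np.array([0.1, 0.2, 0.3]), "afraid": np.array([-0.3, 0.0, 0.5])}
toy_itos = ["<unk>", "happy", "afraid", "shiba"]

matrix, found = build_weights(toy_itos, toy_vectors, dim=3)
coverage = found / len(toy_itos)
print(f"{found}/{len(toy_itos)} words found ({coverage:.0%})")
```

In the notebook, the equivalent check would be comparing `words_found` with `vocab_size` after the loop.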

4) Split the data into training, validation, and test sets.


In [None]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.05, stratify=Y)

# Carve the first 100 training examples off as the validation set
x_val, y_val = x_train[:100], y_train[:100]
x_train, y_train = x_train[100:], y_train[100:]

train_data = list(zip(y_train,x_train))
valid_data = list(zip(y_val,x_val))
test_data = list(zip(y_test,x_test))
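
The validation set is simply the first 100 examples of the training split. This works because `train_test_split` shuffles by default, but unlike the test split the slice is not explicitly stratified, so it can be worth checking its class balance. A small sketch with toy data (the helper name `carve_validation` and the labels are made up):

```python
from collections import Counter

def carve_validation(x_train, y_train, n_val):
    """Split off the first n_val training examples as a validation set."""
    x_val, y_val = x_train[:n_val], y_train[:n_val]
    return (x_train[n_val:], y_train[n_val:]), (x_val, y_val)

# Toy stand-in for an already shuffled training split
x = [f"sentence {i}" for i in range(10)]
y = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1]

(x_tr, y_tr), (x_va, y_va) = carve_validation(x, y, n_val=4)
print(len(x_tr), len(x_va))
print(Counter(y_va))  # class balance of the validation slice
```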

In [None]:
text_pipeline = lambda x: vocab(tokenizer(x))
label_pipeline = lambda x: int(x)

5) The datasets should be loaded using **PyTorch's DataLoader**, and prepared with a **collate_fn**.


In [None]:
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

def collate_batch(batch):

    max_len = 131  # Diff from PyTorch notebook: fixed input length

    # Diff from PyTorch notebook: label_list, text_list, text_len = [], [], []
    label_list, text_list = [], []

    for (_label, _text) in batch:
        label_list.append(_label)
        processed_text = torch.tensor(text_pipeline(_text), dtype=torch.int64)
        text_list.append(processed_text)
        # Diff from PyTorch notebook: text_len.append(len(processed_text))

    label_list = torch.tensor(label_list, dtype=torch.int64)

    # Diff from PyTorch notebook: text_len = torch.tensor(text_len, dtype=torch.int64)

    # Pad the first sequence to max_len so that pad_sequence below
    # pads every other sequence in the batch up to the same length
    text_list[0] = nn.ConstantPad1d((0, max_len - text_list[0].shape[0]), 0)(text_list[0])

    text_list_padded = pad_sequence(text_list, batch_first=True, padding_value=0)

    return label_list, text_list_padded  # Diff from PyTorch notebook: , text_len
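
The point of the collate function is that every batch ends up with the same sequence length (131), which matches the fixed input shape used later for the ONNX export. The plain-Python sketch below shows the same idea without torch. `pad_to_fixed_length` is a hypothetical helper, and truncating overlong sequences is an assumption of this sketch rather than something the notebook does for every sequence.

```python
def pad_to_fixed_length(sequences, max_len, pad_value=0):
    """Right-pad (or truncate) each list of token ids to exactly max_len entries."""
    padded = []
    for seq in sequences:
        seq = list(seq)[:max_len]
        padded.append(seq + [pad_value] * (max_len - len(seq)))
    return padded

batch = [[5, 8, 2], [7]]
fixed = pad_to_fixed_length(batch, max_len=6)
print(fixed)
```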

6) Prepare the BiLSTM model architecture with the PyTorch framework. 
The model architecture was inspired by that used in *Z. Hameed and B. Garcia-Zapirain, "Sentiment classification using a single-layered BiLSTM model", IEEE Access, vol. 8, pp. 73992-74001, 2020.*


In [None]:
class LSTM_W2V(torch.nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, weights):
        super().__init__()

        self.hidden_dim = hidden_dim
        self.embeddings = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.embeddings.weight.data.copy_(torch.from_numpy(weights))
        self.embeddings.weight.requires_grad = False
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.maxpool = nn.MaxPool1d(1)
        self.avgpool = nn.AvgPool1d(1)
        self.linear = nn.Linear(hidden_dim*2, hidden_dim*2)
        self.linear2 = nn.Linear(hidden_dim*2, 2)

    def forward(self, x): #Diff from PyTorch notebook: text_len
        
        h0 = torch.zeros(2, x.size(0), self.hidden_dim)
        c0 = torch.zeros(2, x.size(0), self.hidden_dim)

        x = self.embeddings(x)
        #Diff from PyTorch notebook: packed_embedded = pack_padded_sequence(input=x, lengths=text_len, batch_first=True, enforce_sorted=False)
        lstm_out, (ht, ct) = self.lstm(x, (h0,c0)) # Diff from PyTorch notebook: packed_embedded
        #Diff from PyTorch notebook: lstm_out, output_lengths = pad_packed_sequence(lstm_out, batch_first=True)

        out_max_pool=self.maxpool(lstm_out)
        out_avg_pool=self.avgpool(lstm_out)

        out = torch.cat((out_max_pool, out_avg_pool), 1)
        out = out[:, -1, :]

        out = F.relu(self.linear(out))
        preds = self.linear2(out)
            
        return preds

7) Instantiate the BiLSTM model.


In [None]:
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
import torch.nn.functional as F

embedding_dim=300
hidden_dim=32

model = LSTM_W2V(vocab_size, embedding_dim, hidden_dim, weights_matrix)

8) Prepare **DataLoaders** to pass datasets into the model.


In [None]:
from torch.utils.data import DataLoader

BATCH_SIZE = 1

train_dl = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch)
val_dl = DataLoader(valid_data, batch_size=BATCH_SIZE,collate_fn=collate_batch)
test_dl = DataLoader(test_data, batch_size=1,collate_fn=collate_batch)

9) Prepare functions to train, evaluate, and test the model.


In [None]:
def train_model(model, epochs=10, lr=0.001):
    parameters = filter(lambda p: p.requires_grad, model.parameters())
    optimizer = torch.optim.Adam(parameters, lr=lr)
        
    for i in range(epochs):
        model.train()
        sum_loss = 0.0
        total = 0
        for y, x in train_dl: #Diff from PyTorch notebook: len
            y = y.long()
            x = x.long()
            y_pred = model(x) #Diff from PyTorch notebook: len
            optimizer.zero_grad()
            loss = F.cross_entropy(y_pred, y)
            loss.backward()
            optimizer.step()
            sum_loss += loss.item()*y.shape[0]
            total += y.shape[0]
        val_loss, val_acc = validation_metrics(model, val_dl)
        if i % 5 == 1:
            print("train loss %.3f, val loss %.3f, val accuracy %.3f" % (sum_loss/total, val_loss, val_acc))
        
def validation_metrics(model, valid_dl):
    model.eval()
    correct = 0
    total = 0
    sum_loss = 0.0
    with torch.no_grad():  # no gradients needed for evaluation
        for y, x in valid_dl: #Diff from PyTorch notebook: len
            y = y.long()
            x = x.long()
            y_hat = model(x) #Diff from PyTorch notebook: len
            loss = F.cross_entropy(y_hat, y)
            pred = torch.max(y_hat, 1)[1]
            correct += (pred == y).sum().item()
            total += y.shape[0]
            sum_loss += loss.item()*y.shape[0]
    return sum_loss/total, correct/total

def predict_test_cases(model, test_dl):
    model.eval()
    pred_list = []
    with torch.no_grad():
        for _, x in test_dl: #Diff from PyTorch notebook: len
            x = x.long()
            y_hat = model(x) #Diff from PyTorch notebook: len
            pred = torch.max(y_hat, 1)[1]
            pred_list.extend(pred.tolist())  # plain ints, ready for sklearn metrics
    return pred_list
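
The training and validation loops feed raw logits to `F.cross_entropy` and take the arg-max as the predicted class. For a single example, that loss is the negative log of the softmax probability of the true class. The sketch below computes it in plain Python for a made-up pair of logits; the helper names are illustrative only.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])

logits = [2.0, 0.0]  # made-up scores for the two classes
loss_correct = cross_entropy(logits, target=0)
loss_wrong = cross_entropy(logits, target=1)
pred = argmax(logits)
print(round(loss_correct, 4), round(loss_wrong, 4), pred)
```

The loss is small when the true class has the largest logit and grows as the model becomes confidently wrong.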

10) Train the model. Note that model performance will not be ideal: to keep the demo fast, we train for only 2 epochs.


In [None]:
train_model(model, epochs=2, lr=0.1)

11) Test the model.


In [None]:
pred_list = predict_test_cases(model, test_dl)

In [None]:
from sklearn.metrics import classification_report

print(classification_report(y_test, pred_list))

12) Save the model.


In [None]:
torch.save(model.state_dict(), 'saved_weights.pt')

##Section 2: ONNX Model##


13) Export the PyTorch model as an **ONNX model**.


In [None]:
path='./saved_weights.pt'
model.load_state_dict(torch.load(path))
model.eval()

In [None]:
# Dummy token indices with the same shape and dtype as a real batch: (1, 131), int64
dummy_input = torch.randint(0, vocab_size, (1, 131), dtype=torch.int64)

torch.onnx.export(model,                     # model being run
                  dummy_input,               # model input (or a tuple for multiple inputs)
                  "model_trial.onnx",        # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  verbose=True
                  )

14) Install the packages required to use **ONNX** and to convert an ONNX model to **TensorFlow**. Note that some TensorFlow versions might not be compatible. This is a disadvantage of using ONNX: **you need to make sure the versions of ONNX and the DL frameworks are compatible**.


In [None]:
!pip install onnx onnx-tf onnxruntime
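
Because version compatibility is the main pitfall here, it helps to record which versions are actually installed before converting. A minimal sketch using the standard library's `importlib.metadata` (the package list is our assumption of what matters for this demo):

```python
from importlib import metadata

def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

report = installed_versions(["torch", "onnx", "onnx-tf", "onnxruntime", "tensorflow"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If a conversion fails, this list is the first thing to compare against the compatibility notes of the converter you are using.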

##Section 3: TensorFlow Model##


15) Convert the ONNX model into a TensorFlow model.


In [None]:
import onnx
from onnx_tf.backend import prepare

model_onnx = onnx.load('./model_trial.onnx')

tf_rep = prepare(model_onnx)

#Export tensorflow model as .pb file
tf_rep.export_graph('./model')

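16) Use **TensorFlow framework/code** to prepare your inputs to the model. Note that the Keras `Tokenizer` is fitted on the input text itself, so its word indices do not match the torchtext vocab used during training; the run therefore checks that the converted model executes, not that the prediction is meaningful.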


In [None]:
import tensorflow as tf
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

def tf_prepare(text):
    t = Tokenizer()
    t.fit_on_texts(text)
    encoded_text = t.texts_to_sequences(text)
    x = pad_sequences(encoded_text, maxlen=131, padding='post')
    x = x.astype(np.int64)
    x_tf = tf.convert_to_tensor(x)
    return x_tf

text = ['I love shiba inus']

x_tf = tf_prepare(text)
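
For predictions that reflect the training, the inference side needs the same word-to-index mapping that was used in PyTorch. One way to share it, sketched below under assumptions, is to serialize the mapping to JSON after training and reload it wherever inputs are prepared. The mapping `train_stoi` is a made-up example (in the notebook it could come from the torchtext vocab, for instance via `vocab.get_stoi()`), and the whitespace tokenizer here is a simplification of `basic_english`.

```python
import json

def encode(text, stoi, max_len, unk_index=0, pad_index=0):
    """Lowercase, whitespace-tokenize, map to training indices, then pad or truncate."""
    ids = [stoi.get(tok, unk_index) for tok in text.lower().split()][:max_len]
    return ids + [pad_index] * (max_len - len(ids))

# Hypothetical training-time mapping, for illustration only
train_stoi = {"<unk>": 0, "i": 1, "love": 2, "dogs": 3}

# Serialize once after training, reload on the inference side
payload = json.dumps(train_stoi)
loaded_stoi = json.loads(payload)

row = encode("I love shiba inus", loaded_stoi, max_len=8)
print(row)
```

The resulting list can then be wrapped in an int64 array of shape (1, 131) when `max_len=131` is used, matching the input the exported model expects.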

17) Run the **TensorFlow** model.

In [None]:
import time

start_time_tf = time.time()
tf_outputs = tf_rep.run(x_tf)._0
print("Time taken by Tensorflow model: ", time.time() - start_time_tf)
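
A single `time.time()` measurement includes one-off start-up costs and can be noisy. If you want a steadier number, one option is a warm-up call followed by several timed repeats. The helper `time_call` and the placeholder workload below are illustrative and not part of the notebook.

```python
import time

def time_call(fn, *args, warmup=1, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls, after warm-up."""
    for _ in range(warmup):
        fn(*args)  # warm-up runs are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

def dummy_inference(n):
    """Placeholder workload standing in for a model call."""
    return sum(i * i for i in range(n))

best = time_call(dummy_inference, 10_000)
print(f"best of 5: {best:.6f} s")
```

In this notebook the same helper could be called as `time_call(tf_rep.run, x_tf)`.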

In [None]:
tf_outputs_clean = np.argmax(tf_outputs, 1)
print(tf_outputs_clean)

##Extra: ONNX Model Test##

18) Run the **ONNX** model and compare.

In [None]:
#test if ONNX conversion worked
import onnxruntime as rt

text = ['I love shiba inus']

def onnx_prepare(text):
    t = Tokenizer()
    t.fit_on_texts(text)
    encoded_text = t.texts_to_sequences(text)
    x = pad_sequences(encoded_text, maxlen=131, padding='post')
    x = x.astype(np.int64)
    return x

x_onnx = onnx_prepare(text)

onnx_path = 'model_trial.onnx'  # separate name so the PyTorch `model` is not overwritten
start_time = time.time()
session = rt.InferenceSession(onnx_path)
input_name = session.get_inputs()[0].name
label_name = session.get_outputs()[0].name
onnx_predictions = session.run([label_name], {input_name: x_onnx})[0]

onnx_pred_clean = np.argmax(onnx_predictions, 1)

print(onnx_pred_clean)
print("Time taken by ONNX model: ", time.time() - start_time)
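
To actually compare the two runtimes, it is more informative to compare the raw logits than to eyeball the printed labels. The sketch below checks both numerical closeness and agreement of the predicted classes. The logits are made-up values and the tolerances are arbitrary assumptions, so adjust them to your own accuracy needs.

```python
import numpy as np

def outputs_match(a, b, rtol=1e-3, atol=1e-5):
    """Check that two logit arrays agree numerically and pick the same classes."""
    a, b = np.asarray(a), np.asarray(b)
    same_values = np.allclose(a, b, rtol=rtol, atol=atol)
    same_labels = np.array_equal(np.argmax(a, axis=1), np.argmax(b, axis=1))
    return same_values, same_labels

# Made-up logits standing in for the outputs of two runtimes
logits_a = np.array([[1.2000, -0.7000]])
logits_b = np.array([[1.2001, -0.6999]])

same_values, same_labels = outputs_match(logits_a, logits_b)
print(same_values, same_labels)
```

In this notebook the call would be `outputs_match(tf_outputs, onnx_predictions)`.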

**Question**: Why not just import the ONNX model in Demo 2 and convert it into PyTorch?

**Answer**: There is a library called onnx2pytorch, but it is full of bugs.
