<a href="https://www.kaggle.com/code/kacperkodo/sarcasm-detection-using-the-ivy-library?scriptVersionId=171170703" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# DEPENDENCIES AND SETUP

Installing kaggle and uploading the API key necessary to use it.

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/sarcasm/train-balanced-sarc.csv.gz
/kaggle/input/sarcasm/train-balanced-sarcasm.csv
/kaggle/input/sarcasm/test-balanced.csv
/kaggle/input/sarcasm/test-unbalanced.csv


In [2]:
!pip install -q kaggle
# from google.colab import files
# from google.colab import userdata
import os
# files.upload(); #Upload kaggle.json - you can get from the kaggle account settings, from the API section.

In [3]:
# UNCOMMENT BELOW IF YOU'RE RUNNING THE NOTEBOOK OUTSIDE KAGGLE

# kaggle_api_key = open('kaggle.json', "w+")
# kaggle_api_key.write('<this is where you copy the contents of your kaggle.json>') # kaggle.json - you can get it from the kaggle account settings, from the API section.
# !mkdir ~/.kaggle
# !cp kaggle.json ~/.kaggle/
# !chmod 600 ~/.kaggle/kaggle.json
# !kaggle datasets list

Installing packages necessary to use torch's transformers.

In [None]:
!pip install tqdm boto3 requests regex sentencepiece sacremoses "botocore>=1.34.79"

To use the API, credentials need to be copied into the kaggle folder. If everything works, the output will show the list of available datasets.
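
The credential step can also be done from Python instead of shell commands. The sketch below is only an illustration and not part of the original notebook: `install_kaggle_credentials` and its parameters are hypothetical names, and it assumes the Kaggle CLI reads `kaggle.json` from a `.kaggle` folder in the home directory, as the commented shell commands in the earlier cell imply. The demonstration writes placeholder values into a throwaway directory rather than the real home folder.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def install_kaggle_credentials(username: str, key: str, home: Path) -> Path:
    """Write kaggle.json into <home>/.kaggle and restrict it to the owner."""
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    target.write_text(json.dumps({"username": username, "key": key}))
    # Same intent as `chmod 600`: read/write permission for the owner only.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


# Demonstration against a temporary directory with placeholder values.
demo_home = Path(tempfile.mkdtemp())
credentials_path = install_kaggle_credentials("<username>", "<api-key>", demo_home)
print(credentials_path.name)
```

Passing `Path.home()` as `home` would place the file where the shell commands put it.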

Preparing the ivy library.

In [None]:
#Insert the correct user when cloning the repos. Make sure that they are up-to-date.

!git clone "https://github.com/Kacper-W-Kozdon/demos.git"
# !git clone "https://github.com/Kacper-W-Kozdon/ivy.git"
!pip install -U -q paddlepaddle ivy "accelerate>=0.21.0" 2>/dev/null # If run in a notebook with a GPU enabled, change "paddlepaddle" to "paddlepaddle-gpu"

Next: import the ivy library and get the dataset.

In [None]:
import ivy

Import the libraries required by the model that is to be transpiled.

In [None]:
# Import necessary libraries
import pandas as pd  # For data manipulation and analysis
import gc  # For garbage collection to manage memory
import re  # For regular expressions
import numpy as np  # For numerical operations and arrays
import tensorflow as tf
import torch  # PyTorch library for deep learning
import paddle

In [None]:
# Libraries to accompany torch's transformers
import tqdm
import boto3
import requests
import regex
import sentencepiece
import sacremoses

import warnings  # For handling warnings
warnings.filterwarnings("ignore")  # Ignore warning messages

from transformers import AutoModel, AutoTokenizer  # Transformers library for natural language processing
# from transformers import TextDataset, LineByLineTextDataset, DataCollatorForLanguageModeling, \
# pipeline, Trainer, TrainingArguments, DataCollatorWithPadding  # Transformers components for text processing
from transformers import TextDataset, LineByLineTextDataset, DataCollatorForLanguageModeling, \
pipeline, TrainingArguments, DataCollatorWithPadding
from transformers import AutoModelForSequenceClassification  # Transformer model for sequence classification

import accelerate

# from nlp import Dataset  # Import custom 'Dataset' class for natural language processing tasks
from imblearn.over_sampling import RandomOverSampler  # For oversampling to handle class imbalance
# import datasets  # Import datasets library
# from datasets import Dataset, Image, ClassLabel  # Import custom 'Dataset', 'ClassLabel', and 'Image' classes
from transformers import pipeline  # Transformers library for pipelines
from bs4 import BeautifulSoup  # For parsing HTML content

import matplotlib.pyplot as plt  # For data visualization
import itertools  # For working with iterators
from sklearn.metrics import (  # Import various metrics from scikit-learn
    accuracy_score,  # For calculating accuracy
    roc_auc_score,  # For ROC AUC score
    confusion_matrix,  # For confusion matrix
    classification_report,  # For classification report
    f1_score  # For F1 score
)

# from datasets import load_metric  # Import load_metric function to load evaluation metrics

from tqdm import tqdm  # For displaying progress bars

tqdm.pandas()  # Enable progress bars for pandas operations

In [None]:
device = "gpu:0" if paddle.device.cuda.device_count() else "cpu" # Either "gpu" or "gpu:0".
ivy.set_default_device(device)
ivy.set_soft_device_mode(True)


In [None]:
print(ivy.default_device())
print(ivy.num_gpus())
print(torch.cuda.is_available())

Set the seeds.

In [None]:
tf.keras.utils.set_random_seed(0)
torch.manual_seed(0)
paddle.seed(0)

Get the API key for the ivy transpiler from your account and upload it to the project. Move it to the correct directory.

In [None]:
pwd

First, we load the tokenizer through torch hub. All of the basic set-up instructions can be found here: https://colab.research.google.com/github/pytorch/pytorch.github.io/blob/master/assets/hub/huggingface_pytorch-transformers.ipynb#scrollTo=72d8f2de

In [None]:
tokenizer = torch.hub.load('huggingface/pytorch-transformers', 'tokenizer', 'bert-base-cased')


In [None]:
from ivy.stateful.module import Module
from ivy.stateful.sequential import Sequential
from ivy.stateful.layers import *
from ivy.stateful.losses import *
from ivy.stateful.optimizers import *
from ivy.stateful.activations import *
from ivy.stateful.initializers import *
from ivy.stateful.norms import *


In [None]:
# df = pd.read_csv("/content/demos/Contributor_demos/Sarcasm Detection/train-balanced-sarcasm.csv")
df = pd.read_csv("/kaggle/input/sarcasm/train-balanced-sarcasm.csv")
df = df.drop_duplicates()
df = df.rename(columns={'comment': 'title'})
df = df[['label', 'title']]
df = df[~df['label'].isnull()]
df = df[~df['title'].isnull()]
df.sample(5)

# DATASET AND MODEL OVERVIEW

In [None]:
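# Replace API_KEY with your own key; the .ivy folder is created first in case it does not exist yet.
!mkdir -p .ivy && echo -n API_KEY > .ivy/key.pem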

In [None]:
def count_words(text: str) -> int:
    return len(text.split())

def count_symbols(text: str) -> int:
    return len("".join(text.split()))

def symbol_to_word_ratio(text: str) -> float:
    # Guard against comments that contain only whitespace (zero words).
    return count_symbols(text)/max(count_words(text), 1)

def upper_lower_ratio(text: str) -> float:
    text = "".join(text.split())
    return sum(1 for c in text if c.isupper())/(max([sum(1 for c in text if c.islower()), 1]))

df['word_count'] = df["title"].apply(count_words)
df['symbol_count'] = df["title"].apply(count_symbols)
df["upper_lower_ratio"] = df["title"].apply(upper_lower_ratio)
df["symbol_to_word_ratio"] = df["title"].apply(symbol_to_word_ratio)
df.sample(5)

A few plots to show some characteristics of the data.

In [None]:
df_no_sarc = df.where(df["label"] == 0)
df_no_sarc = df_no_sarc.where(df_no_sarc["word_count"] <= 51)
df_sarc = df.where(df["label"] == 1)
df_sarc = df_sarc.where(df_sarc["word_count"] <= 51)
df_no_sarc = df_no_sarc[np.isfinite(df_no_sarc["word_count"])]
df_sarc = df_sarc[np.isfinite(df_sarc["word_count"])]
plt.style.use('_mpl-gallery-nogrid')

hist_df_no_sarc, bin_edges_no = np.histogram(df_no_sarc["word_count"].values, density=True)
hist_df_sarc, bin_edges = np.histogram(df_sarc["word_count"].values, density=True)
# plot:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

bin_mids_no = [(bin_edges_no[i+1] + bin_edges_no[i])/2 for i in range(len(bin_edges_no) - 1)]
bin_mids = [(bin_edges[i+1] + bin_edges[i])/2 for i in range(len(bin_edges) - 1)]
ax1.bar(bin_mids_no, hist_df_no_sarc, width=bin_edges_no[1] - bin_edges_no[0])
ax2.bar(bin_mids, hist_df_sarc, width=bin_edges[1] - bin_edges[0])
ax1.set_title("Hist no sarcasm")
ax1.set_ylabel("density")
ax1.set_xlabel("word count")
ax1.set_xticks(bin_edges_no)
ax1.grid(True)
ax2.set_title("Hist sarcasm")
ax2.set_xlabel("word count")
ax2.set_xticks(bin_edges)
ax2.grid(True)
plt.show()

In [None]:
df_no_sarc = df.where(df["label"] == 0)
df_no_sarc = df_no_sarc.where(df_no_sarc["symbol_count"] <= 201)
df_sarc = df.where(df["label"] == 1)
df_sarc = df_sarc.where(df_sarc["symbol_count"] <= 201)
df_no_sarc = df_no_sarc[np.isfinite(df_no_sarc["symbol_count"])]
df_sarc = df_sarc[np.isfinite(df_sarc["symbol_count"])]
plt.style.use('_mpl-gallery-nogrid')

hist_df_no_sarc, bin_edges_no = np.histogram(df_no_sarc["symbol_count"].values, density=True)
hist_df_sarc, bin_edges = np.histogram(df_sarc["symbol_count"].values, density=True)
# plot:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

bin_mids_no = [(bin_edges_no[i+1] + bin_edges_no[i])/2 for i in range(len(bin_edges_no) - 1)]
bin_mids = [(bin_edges[i+1] + bin_edges[i])/2 for i in range(len(bin_edges) - 1)]
ax1.bar(bin_mids_no, hist_df_no_sarc, width=bin_edges_no[1] - bin_edges_no[0])
ax2.bar(bin_mids, hist_df_sarc, width=bin_edges[1] - bin_edges[0])
ax1.set_title("Hist no sarcasm")
ax1.set_ylabel("density")
ax1.set_xlabel("symbol count")
ax1.set_xticks(bin_edges_no)
ax1.grid(True)
ax2.set_title("Hist sarcasm")
ax2.set_xlabel("symbol count")
ax2.set_xticks(bin_edges)
ax2.grid(True)
plt.show()

In [None]:
df_no_sarc = df.where(df["label"] == 0)
df_no_sarc = df_no_sarc.where(df_no_sarc["upper_lower_ratio"] <= 0.3)
df_sarc = df.where(df["label"] == 1)
df_sarc = df_sarc.where(df_sarc["upper_lower_ratio"] <= 0.3)
df_no_sarc = df_no_sarc[np.isfinite(df_no_sarc["upper_lower_ratio"])]
df_sarc = df_sarc[np.isfinite(df_sarc["upper_lower_ratio"])]
plt.style.use('_mpl-gallery-nogrid')

hist_df_no_sarc, bin_edges_no = np.histogram(df_no_sarc["upper_lower_ratio"].values, density=True)
hist_df_sarc, bin_edges = np.histogram(df_sarc["upper_lower_ratio"].values, density=True)
# plot:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

bin_mids_no = [(bin_edges_no[i+1] + bin_edges_no[i])/2 for i in range(len(bin_edges_no) - 1)]
bin_mids = [(bin_edges[i+1] + bin_edges[i])/2 for i in range(len(bin_edges) - 1)]
ax1.bar(bin_mids_no, hist_df_no_sarc, width=bin_edges_no[1] - bin_edges_no[0])
ax2.bar(bin_mids, hist_df_sarc, width=bin_edges[1] - bin_edges[0])
ax1.set_title("Hist no sarcasm")
ax1.set_ylabel("density")
ax1.set_xlabel("upper/lower ratio")
ax1.set_xticks(bin_edges_no)
ax1.grid(True)
ax2.set_title("Hist sarcasm")
ax2.set_xlabel("upper/lower ratio")
ax2.set_xticks(bin_edges)
ax2.grid(True)
plt.show()

In [None]:
df_no_sarc = df.where(df["label"] == 0)
df_no_sarc = df_no_sarc.where(df_no_sarc["symbol_to_word_ratio"] <= 11)
df_sarc = df.where(df["label"] == 1)
df_sarc = df_sarc.where(df_sarc["symbol_to_word_ratio"] <= 11)
df_no_sarc = df_no_sarc[np.isfinite(df_no_sarc["symbol_to_word_ratio"])]
df_sarc = df_sarc[np.isfinite(df_sarc["symbol_to_word_ratio"])]
plt.style.use('_mpl-gallery-nogrid')

hist_df_no_sarc, bin_edges_no = np.histogram(df_no_sarc["symbol_to_word_ratio"].values, density=True)
hist_df_sarc, bin_edges = np.histogram(df_sarc["symbol_to_word_ratio"].values, density=True)
# plot:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

bin_mids_no = [(bin_edges_no[i+1] + bin_edges_no[i])/2 for i in range(len(bin_edges_no) - 1)]
bin_mids = [(bin_edges[i+1] + bin_edges[i])/2 for i in range(len(bin_edges) - 1)]
ax1.bar(bin_mids_no, hist_df_no_sarc, width=bin_edges_no[1] - bin_edges_no[0])
ax2.bar(bin_mids, hist_df_sarc, width=bin_edges[1] - bin_edges[0])
ax1.set_title("Hist no sarcasm")
ax1.set_ylabel("density")
ax1.set_xlabel("symbols/words ratio")
ax1.set_xticks(bin_edges_no)
ax1.grid(True)
ax2.set_title("Hist sarcasm")
ax2.set_xlabel("symbols/words ratio")
ax2.set_xticks(bin_edges)
ax2.grid(True)
plt.show()
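
Each plotting cell above calls `np.histogram(..., density=True)`, which rescales the bin counts so that the bars integrate to 1. The following is a pure-Python sketch of that normalisation for equal-width bins. `density_histogram` is a hypothetical helper written for illustration, the word counts are invented, and the sketch is not meant to reproduce NumPy's edge handling exactly.

```python
def density_histogram(values, bins=10):
    """Return (density, edges) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("need at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls into the last bin (right edge included).
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    # Dividing by n * width makes the total bar area equal to 1.
    density = [c / (n * width) for c in counts]
    edges = [lo + i * width for i in range(bins + 1)]
    return density, edges


word_counts = [1, 2, 2, 3, 5, 8, 13, 21]  # invented example values
density, edges = density_histogram(word_counts, bins=4)
area = sum(d * (edges[1] - edges[0]) for d in density)
print(density, area)
```

Because both panels are normalised this way, the sarcastic and non-sarcastic histograms can be compared by shape even if the two classes differ in size.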

In [None]:
gc.collect()

# BUILDING LSTM ON CORE IVY

In [None]:
# dir(tokenizer)

Setting up the device for the computations.

In [None]:
import cudf

In [None]:
train_test_ratio = 0.9
frac_dataset = 0.2

In [None]:
df = cudf.read_csv("/kaggle/input/sarcasm/train-balanced-sarcasm.csv")
df = df.drop_duplicates()
df = df.rename(columns={'comment': 'title'})
df = df[['label', 'title']]
df = df[~df['label'].isnull()]
df = df[~df['title'].isnull()]
df = df.sample(frac=1).reset_index(drop=True)  # shuffle the rows; the result must be assigned back to take effect
df.sample(5)


In [None]:
df_full = df
df_size = len(df_full)
split = int(df_size * train_test_ratio * frac_dataset)
cutoff = int(df_size * frac_dataset)
df = df_full.iloc[:split,:]
df_eval = df_full.iloc[split:cutoff,:]
print(len(df))
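
The index arithmetic in the cell above can be checked on a plain list. This is a small sketch with a made-up dataset size; `split_indices` is a hypothetical helper, while `train_test_ratio` and `frac_dataset` carry the same meaning as in the notebook.

```python
def split_indices(n_rows: int, train_test_ratio: float, frac_dataset: float):
    """Return (split, cutoff): rows [0, split) train, rows [split, cutoff) eval."""
    split = int(n_rows * train_test_ratio * frac_dataset)
    cutoff = int(n_rows * frac_dataset)
    return split, cutoff


rows = list(range(1000))  # stand-in for the dataframe rows
split, cutoff = split_indices(len(rows), train_test_ratio=0.9, frac_dataset=0.2)
train_rows = rows[:split]
eval_rows = rows[split:cutoff]
unused_rows = rows[cutoff:]
print(len(train_rows), len(eval_rows), len(unused_rows))
```

Rows from `cutoff` onward are left unused whenever `frac_dataset` is below 1.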

In [None]:
print(torch.cuda.is_available())
device = ivy.as_native_dev("gpu:0")
ivy.set_default_device("gpu:0")
print(ivy.default_device())
ivy.set_soft_device_mode(True)
print(device)

In [None]:
print(tokenizer.vocab_size)
print(tokenizer.all_special_tokens_extended)
print(tokenizer.all_special_ids)
print(tokenizer.pad_token_id)

In [None]:
sample = list(df.sample(8)["title"].to_pandas())
print(sample)
tokenizer(sample, add_special_tokens=True, padding=True, truncation=True)
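
The `padding` and `truncation` arguments are what make the tokenised comments in a batch the same length. As a simplified sketch of that idea on plain lists of ids: `pad_batch` is a hypothetical helper, the token ids are invented, and the real tokenizer additionally takes care of special tokens when truncating, which is ignored here.

```python
def pad_batch(batch, pad_token_id=0, max_length=None):
    """Clip each id list to max_length (if given), then pad to the longest one."""
    # Slicing with None as the upper bound keeps the whole list.
    clipped = [list(ids[:max_length]) for ids in batch]
    longest = max(len(ids) for ids in clipped)
    return [ids + [pad_token_id] * (longest - len(ids)) for ids in clipped]


batch = [[101, 7, 8, 102], [101, 5, 102], [101, 1, 2, 3, 4, 5, 6, 102]]
padded = pad_batch(batch)                  # pad to the longest sequence
clipped = pad_batch(batch, max_length=6)   # clip first, then pad
print(padded)
print(clipped)
```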

In [None]:
ivy.set_backend("torch")
num_embeddings = tokenizer.vocab_size
embedding_dim = 3
pad_token_id = tokenizer.pad_token_id
input_channels = embedding_dim
num_classes = 2
output_channels = 1
num_layers = 1
linear_input_channels = 2
max_length = 13
tokenizer.model_max_length = max_length
eps = 1e-05
testing_input = df.sample(8)["title"]
batch_size = 64
linear_input_channels = (tokenizer.model_max_length + 3) * batch_size # 3 comes from the hidden states of the LSTM
linear_output_channels = num_classes * batch_size
normalized_shape = (num_classes)

class LSTM_postproc(Module):

    def __init__(self):
        super(LSTM_postproc, self).__init__()

    def __call__(self, args):

        lstm_output, lstm_state = args
        lstm_state_latest, lstm_state_hidden = lstm_state
        lstm_state_latest = ivy.array(lstm_state_latest)
        # print(lstm_state_hidden, lstm_state_latest)
        lstm_state_hidden = ivy.array([state for state in lstm_state_hidden][0])

        lstm_state = ivy.concat((lstm_state_latest, lstm_state_hidden), axis=0).reshape((batch_size, -1, 1))
        # print(lstm_output.shape, lstm_state.shape)
        out = ivy.concat([lstm_output, lstm_state], axis=1)
        out = out.flatten()
        return out

class Tokenizer(Module):

    def __init__(self, tokenizer):
        super(Tokenizer, self).__init__()
        self.tokenizer = tokenizer

    def __call__(self, args):
        args = list(args)
        return self.tokenizer(args, add_special_tokens=True, max_length=max_length, padding="max_length", truncation=True)["input_ids"]

class Reshaper(Module):

    def __init__(self):
        super(Reshaper, self).__init__()

    def __call__(self, args):
        return args.reshape((batch_size, num_classes))

ivy_LSTM = Sequential(
    Tokenizer(tokenizer),
    Embedding(num_embeddings, embedding_dim, pad_token_id),
    LSTM(input_channels, output_channels, num_layers=1, return_sequence=True, return_state=True, device=None, v=None, dtype=None),
    LSTM_postproc(),
    Linear(linear_input_channels, linear_output_channels, with_bias=True),
    Reshaper(),
    Sigmoid(),
    Softmax(),
)
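
The sizes passed to `Linear` follow from the constants at the top of the cell. The short calculation below spells out that arithmetic. It assumes, as the inline comment states, that the LSTM state contributes 3 extra values per sample on top of the `max_length` sequence outputs; `state_values` and `features_per_sample` are names introduced here for illustration.

```python
max_length = 13    # tokens per comment after padding/truncation
state_values = 3   # extra values per sample from the LSTM state (per the notebook's comment)
batch_size = 64
num_classes = 2

features_per_sample = max_length + state_values
linear_input_channels = features_per_sample * batch_size
linear_output_channels = num_classes * batch_size
print(features_per_sample, linear_input_channels, linear_output_channels)  # 16 1024 128
```

These values agree with the `Linear(in_features=1024, out_features=128, with_bias=True)` entry in the model summary printed later in the notebook. Because the whole batch is flattened into a single vector, the layer only accepts batches of exactly `batch_size` comments.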

In [None]:
print(dir(ivy_LSTM))
print(ivy_LSTM.device)

In [None]:
from torch.utils.data import Dataset

In [None]:
class ivy_Dataset(Dataset):
    def __init__(self, df):
        self.num_samples = df['title'].size
        self.data = [[entry[0], entry[1]] for entry in zip(df["title"].to_pandas(), df["label"].to_pandas())]

    def __getitem__(self, idx):
        title = self.data[idx][0]
        label = self.data[idx][1]
        return title, label

    def __len__(self):
        return self.num_samples
    


In [None]:
training_data = ivy_Dataset(df)


In [None]:
df_sample = df.sample(10)
data_sample = [[entry[0], entry[1]] for entry in zip(df_sample["title"].to_pandas(), df_sample["label"].to_pandas())]
data_sample[9][1]

In [None]:
from torch.utils.data import DataLoader
train_dataloader = DataLoader(training_data, batch_size=batch_size, shuffle=True)

In [None]:
def ivy_train_loader(dataset = df, batch_size = 4):
    # Lazily yield (titles, labels) batches as pandas Series; an incomplete final batch is dropped.
    num_batches = len(dataset) // batch_size
    starts = (batch_idx * batch_size for batch_idx in range(num_batches))
    out = ((dataset["title"][s : s + batch_size].to_pandas(), dataset["label"][s : s + batch_size].to_pandas()) for s in starts)
    return out

loader = ivy_train_loader(batch_size=batch_size)
for batch_id, data in tqdm(enumerate(loader)):
    x_data = data[0]
    y_data = data[1]
    ivy_LSTM_test_out = ivy_LSTM(x_data)
    # print()
    # print(ivy.sum(ivy_LSTM_test_out, axis=1))
    if batch_id == 10:
        break

In [None]:
for batch_id, data in tqdm(enumerate(train_dataloader)):
    x_data = data[0]
    y_data = data[1]
    ivy_LSTM_test_out = ivy_LSTM(x_data)
    if batch_id == 10:
        break

In this case, a simple generator appears to be comparable to, or slightly faster than, the torch `DataLoader`.
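
For reference, the generator approach boils down to slicing the data into consecutive fixed-size chunks. Here is a stand-alone sketch on plain lists; `iter_batches` is a hypothetical name, the data is invented, and like `ivy_train_loader` it drops a trailing incomplete batch.

```python
def iter_batches(titles, labels, batch_size):
    """Yield (titles, labels) chunks of exactly batch_size items."""
    num_batches = len(titles) // batch_size
    for batch_idx in range(num_batches):
        start = batch_idx * batch_size
        yield titles[start:start + batch_size], labels[start:start + batch_size]


titles = [f"comment {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]
batches = list(iter_batches(titles, labels, batch_size=4))
print(len(batches))  # 2 full batches; the last 2 comments are dropped
```

Unlike `DataLoader(..., shuffle=True)`, a generator like this visits the rows in their stored order, so any shuffling has to happen beforehand.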

In [None]:
def one_hot(args, num_classes = 2):
    # Encode each integer label as a list with a single 1 at the label's index.
    out = [[1 if idx == elem else 0 for idx in range(num_classes)] for elem in args]
    return out

def argmax(args):
    out = [ivy.argmax(elem) for elem in args]
    return out

print(one_hot([0, 0, 1, 0]))
print(argmax(ivy.array([[0.49967843, 0.50032151],
       [0.49986687, 0.50013322],
       [0.49912587, 0.50087422],
       [0.50080854, 0.4991914 ],
       [0.50049627, 0.4995037 ],
       [0.4998956 , 0.50010443],
       [0.50008798, 0.49991205],
       [0.50053447, 0.49946556]])))

In [None]:
ivy.set_backend("torch")
num_embeddings = tokenizer.vocab_size
embedding_dim = 3
pad_token_id = tokenizer.pad_token_id
input_channels = embedding_dim
num_classes = 2
output_channels = 1
num_layers = 1
linear_input_channels = 2
max_length = 13
tokenizer.model_max_length = max_length
eps = 1e-05
batch_size = 64
testing_input = df.sample(batch_size)["title"]
testing_labels = df.sample(batch_size)["label"]

linear_input_channels = (tokenizer.model_max_length + 3) * batch_size # 3 comes from the hidden states of the LSTM
linear_output_channels = num_classes * batch_size
normalized_shape = (num_classes)

class LSTM_postproc(Module):

    def __init__(self):
        super(LSTM_postproc, self).__init__()

    def _forward(self, args):

        lstm_output, lstm_state = args
        lstm_state_latest, lstm_state_hidden = lstm_state
        lstm_state_latest = ivy.array(lstm_state_latest)
        # print(lstm_state_hidden, lstm_state_latest)
        lstm_state_hidden = ivy.array([state for state in lstm_state_hidden][0])

        lstm_state = ivy.concat((lstm_state_latest, lstm_state_hidden), axis=0).reshape((batch_size, -1, 1))
        # print(lstm_output.shape, lstm_state.shape)
        out = ivy.concat([lstm_output, lstm_state], axis=1)
        out = out.flatten()
        return out

class Tokenizer(Module):

    def __init__(self, tokenizer):
        super(Tokenizer, self).__init__()
        self.tokenizer = tokenizer

    def _forward(self, args):
        args = list(args)
        return self.tokenizer(args, add_special_tokens=True, max_length=max_length, padding="max_length", truncation=True)["input_ids"]

class Reshaper(Module):

    def __init__(self):
        super(Reshaper, self).__init__()

    def _forward(self, args):
        return args.reshape((batch_size, num_classes))

class Argmax(Module):

    def __init__(self):
        super(Argmax, self).__init__()

    def _forward(self, args):
        return ivy.argmax(args, axis=-1)

class ivy_Embed(Module):
    
    def __init__(self, embedding):
        super(ivy_Embed, self).__init__()
        self.embedding = embedding
        
    def _forward(self, args):
        out = self.embedding(args).float()
        return out

embedding = Embedding(num_embeddings, embedding_dim, pad_token_id)

ivy_LSTM = Sequential(
    Tokenizer(tokenizer),
    ivy_Embed(embedding),
    LSTM(input_channels, output_channels, num_layers=1, return_sequence=True, return_state=True, device=None, v=None, dtype=None),
    LSTM_postproc(),
    Linear(linear_input_channels, linear_output_channels, with_bias=True),
    Reshaper(),
    Sigmoid(),
    Softmax(),
    Argmax(),
)

In [None]:
testing_labels = ivy.array(testing_labels)

In [None]:
testing_labels = one_hot(testing_labels)

In [None]:

ivy_LSTM(testing_input.to_pandas())

In [None]:
testing_labels = ivy.array(testing_labels).flatten()
print(testing_labels)

In [None]:
dir(ivy_LSTM)

In [None]:
v = ivy_LSTM.v
learning_rate = 3e-5
opt = SGD(lr=learning_rate, inplace=True, stop_gradients=True, trace_on_next_step=False)
# print(v)
ivy_LSTM.train(mode=True)
ivy.set_backend("torch")

# print(tokens)
# print(tokens.requires_grad, ivy_LSTM.training)
# print(ivy_LSTM(tokens))
loss_fn = CrossEntropyLoss(axis=-1, epsilon=1e-07, reduction='sum')
predictions = ivy_LSTM(testing_input.to_pandas()).flatten()
predictions.requires_grad = True
loss = loss_fn(list(testing_labels.to_pandas()), predictions.float()).float()
# print(loss, list(testing_labels.to_pandas()))
loss.backward()
print(loss.grad)

# opt.step(v, loss)


<bound method _wrap_function.<locals>.new_function of {
    grad: null,
    submodules: {
        v1: {
            embedding: {
                w: (<class ivy.data_classes.array.array.Array> shape=[28996, 3])
            }
        },
        v2: {
            input: {
                layer_0: {
                    w: (<class ivy.data_classes.array.array.Array> shape=[3, 4])
                }
            },
            recurrent: {
                layer_0: {
                    w: ivy.array([[0.83389348, 0.16461061, 1.13583314, -0.35058311]], dev=gpu:0)
                }
            }

In [92]:
def train_ivy(model):
    logs = []
    learning_rate = 3e-5
    opt = SGD(lr=learning_rate, inplace=True, stop_gradients=True, trace_on_next_step=False)
    loss_fn = CrossEntropyLoss(axis=-1, epsilon=1e-07, reduction='sum')
    epochs = 2
    grads = ivy.zeros_like(model.v)
    classifier = model
    

    for epoch in range(epochs):
        train_loader = ivy_train_loader(dataset = df, batch_size = batch_size)
        for batch_id, data in tqdm(enumerate(train_loader)):
            model.v.grad = model.v
            x_data = data[0]
            y_data = list(data[1])
            # print(y_data)
            # The transpiled model seems to have problems with inputs, so instead of feeding it a container, we map onto one.
            predictions = classifier(x_data)

            loss = loss_fn(y_data, predictions).float()
            loss.requires_grad = True
#             print(f"LOSS: {loss}")

            # acc = paddle.metric.accuracy(predicts, y_data) # This needs to be corrected.
            loss.backward()
#             print(f"GRAD: {model.v.grad}")
            # update parameters
            grads = x_data.grad
            model.v = opt.step(model.v, grads)

            if batch_id % 100 == 0:
                # print("\nepoch: {}, batch_id: {}, loss is: {}, acc is: {}".format(epoch, batch_id, loss.numpy(), acc))
                logs.append([[epoch, batch_id, loss]])

      # opt.clear_grad()
    gc.collect()


    return logs, model

In [93]:
logs, ivy_LSTM = train_ivy(ivy_LSTM)

TypeError: must be real number, not NoneType
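
One thing worth checking when debugging the loop above: the model ends in `Argmax`, and integer class indices carry no gradient, so a loss is normally computed on the probabilities (or logits) instead. The sketch below is a pure-Python illustration of a single gradient step on softmax cross-entropy for one sample. It is not the notebook's training code: it treats the logits themselves as the parameters, and every name in it is hypothetical.

```python
import math


def softmax(logits):
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    return -math.log(probs[true_class])


def sgd_step(logits, true_class, lr):
    probs = softmax(logits)
    # For softmax followed by cross-entropy, d(loss)/d(logit_i) = p_i - one_hot_i.
    grads = [p - (1.0 if i == true_class else 0.0) for i, p in enumerate(probs)]
    return [z - lr * g for z, g in zip(logits, grads)]


logits = [0.2, -0.1]
true_class = 1
loss_before = cross_entropy(softmax(logits), true_class)
logits_after = sgd_step(logits, true_class, lr=0.5)
loss_after = cross_entropy(softmax(logits_after), true_class)
print(loss_before > loss_after)  # True
```

The same step is impossible on the output of an argmax, since changing a logit slightly leaves the selected index unchanged almost everywhere.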

In [229]:
ivy_LSTM.save("Ivy_Sarcasm_Detection_Demo")
!cp "Ivy_Sarcasm_Detection_Demo" "/kaggle/working/demos/Contributor_demos/Sarcasm Detection"

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [230]:
ivy_LSTM.eval()
ivy_LSTM.train(False)


Sequential(v0=Tokenizer(), v1=Embedding(num_embeddings=28996, embedding_dim=3, padding_idx=0), v2=LSTM(3, 1), v3=LSTM_postproc(), v4=Linear(in_features=1024, out_features=128, with_bias=True), v5=Reshaper(), v6=Sigmoid(complex_mode=jax), v7=Softmax(axis=-1, complex_mode=jax), v8=Argmax())

In [261]:
def eval_ivy(model):
    logs = []
    learning_rate = 3e-5
    opt = SGD(lr=learning_rate, inplace=True, stop_gradients=True, trace_on_next_step=False)
    loss_fn = CrossEntropyLoss(axis=-1, epsilon=1e-07, reduction='sum')
    epochs = 2
    grads = ivy.zeros_like(model.v)
    classifier = model
    train_loader = ivy_train_loader(dataset = df_eval, batch_size = batch_size)


    for batch_id, data in tqdm(enumerate(train_loader)):

        x_data = data[0]
        y_data = list(data[1])
        # print(y_data)
        # The transpiled model seems to have problems with inputs, so instead of feeding it a container, we map onto one.
        predictions = classifier(x_data).float()
        acc = ivy.matmul(predictions, y_data).float()/batch_size
        loss = loss_fn(predictions, y_data).float()
        

        logs.append([loss, acc])

      # opt.clear_grad()
    gc.collect()

    return ivy.mean(logs, axis=0)
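
Note that a dot product of 0/1 predictions with 0/1 labels, divided by the batch size, counts only the positions where both values are 1. As a point of comparison, a conventional accuracy also counts correctly predicted zeros. The stand-alone sketch below shows the difference on invented values; `accuracy` and `dot_over_n` are hypothetical helpers.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the label."""
    matches = sum(1 for p, y in zip(predictions, labels) if p == y)
    return matches / len(labels)


def dot_over_n(predictions, labels):
    """Dot product divided by length: counts only matching ones."""
    return sum(p * y for p, y in zip(predictions, labels)) / len(labels)


predictions = [1, 0, 0, 1, 0, 1, 0, 0]
labels = [1, 0, 1, 1, 0, 0, 0, 0]
print(accuracy(predictions, labels), dot_over_n(predictions, labels))  # 0.75 0.25
```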

    

In [262]:
logs_eval = eval_ivy(ivy_LSTM)

36it [01:02,  1.73s/it]


In [1]:
print(logs_eval)

NameError: name 'logs_eval' is not defined

In [249]:
print(len(df_eval))

2329


In [None]:
train_test_ratio = 0.95
frac_dataset = 1
df_size = len(df_full)
split = int(df_size * train_test_ratio * frac_dataset)
cutoff = int(df_size * frac_dataset)
df = df_full.iloc[:split,:]
df_eval = df_full.iloc[split:cutoff,:]

In [None]:
def train_ivy(model):
    logs = []
    learning_rate = 3e-5
    opt = SGD(lr=learning_rate, inplace=True, stop_gradients=True, trace_on_next_step=False)
    loss_fn = CrossEntropyLoss(axis=-1, epsilon=1e-07, reduction='sum')
    epochs = 1
    grads = ivy.zeros_like(model.v)
    classifier = model
    

    for epoch in range(epochs):
        train_loader = ivy_train_loader(dataset = df, batch_size = batch_size)
        for batch_id, data in tqdm(enumerate(train_loader)):

            x_data = data[0]
            y_data = list(data[1])
            # print(y_data)
            # The transpiled model seems to have problems with inputs, so instead of feeding it a container, we map onto one.
            predictions = classifier(x_data)

            loss = loss_fn(predictions, y_data).float()
            loss.requires_grad = True
            # print(f"LOSS: {loss}")

            # acc = paddle.metric.accuracy(predicts, y_data) # This needs to be corrected.
            loss.backward()
            grads = x_data.grad
            # update parameters
            model.v = opt.step(model.v, grads)

            if batch_id % 300 == 0:
                # print("\nepoch: {}, batch_id: {}, loss is: {}, acc is: {}".format(epoch, batch_id, loss.numpy(), acc))
                logs.append([[epoch, batch_id, loss]])

      # opt.clear_grad()
    gc.collect()


    return logs, model

In [None]:
logs, ivy_LSTM = train_ivy(ivy_LSTM)

In [None]:
ivy_LSTM.save("Ivy_Sarcasm_Detection_Demo")
!cp "Ivy_Sarcasm_Detection_Demo" "/kaggle/working/demos/Contributor_demos/Sarcasm Detection"