## Google's T5 base fine-tuned on the Twitter Sarcasm Dataset for sequence classification (as a text generation downstream task)

In [1]:
from transformers import AutoTokenizer, AutoModelWithLMHead




In [2]:
import torch
# Check whether PyTorch can use MPS (Metal Performance Shaders, Apple's GPU compute framework)
print(f"Is MPS (Metal Performance Shader) built? {torch.backends.mps.is_built()}")
print(f"Is MPS available? {torch.backends.mps.is_available()}")

# Set the device      
device = "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using device: {device}")

Is MPS (Metal Performance Shader) built? True
Is MPS available? False
Using device: cpu


The T5 model was introduced in the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer".

### Here we use a pre-trained T5 model that has already been fine-tuned for sarcasm detection

In [3]:

tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")

model = AutoModelWithLMHead.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")




In [4]:

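def eval_conversation(text):
    """Return 1 if the model labels `text` as sarcastic, otherwise 0."""
    # Append T5's end-of-sequence token and encode the text as a batch of one
    input_ids = tokenizer.encode(text + '</s>', return_tensors='pt')

    # The label is only a few tokens long, so a short max_length suffices
    output = model.generate(input_ids=input_ids, max_length=3)

    # Decode the generated ids back to text; special tokens are kept in the string
    dec = [tokenizer.decode(ids) for ids in output]
    label = dec[0]

    # 'derison' (sic) is the label string this check expects for sarcastic input
    return 1 if 'derison' in label else 0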
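
The substring test in `eval_conversation` works even though the decoded string still contains special tokens around the label. As a minimal sketch of that mapping made explicit, assuming the checkpoint emits `derison` for sarcastic input and that special tokens appear in the decoded text as `<pad>` and `</s>`: `parse_label` and `SARCASM_LABEL` are hypothetical names, and the `normal` string in the demo is only an illustrative non-sarcastic output.

```python
import re

# Assumption: label string emitted for sarcastic input (spelling as used in the notebook).
SARCASM_LABEL = "derison"


def parse_label(decoded: str) -> int:
    """Map a decoded generation such as '<pad> derison</s>' to 1 (sarcastic) or 0."""
    # Remove T5-style special tokens (<pad>, </s>, <unk>), then trim whitespace.
    cleaned = re.sub(r"</?\w+>", "", decoded).strip().lower()
    return 1 if SARCASM_LABEL in cleaned else 0


print(parse_label("<pad> derison</s>"))  # 1
print(parse_label("<pad> normal</s>"))   # 0
```

Keeping the parsing separate from the model call makes it possible to check the label logic without loading the model.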


In [5]:
import pandas as pd

data = pd.read_csv('download.csv')

In [9]:
y_1k = data.label[:1000]

In [10]:
ypred_1k = data.comment[:1000].apply(eval_conversation)
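
Calling `apply` runs one `generate` call per row, which is likely a large part of the processing cost. One possible way to reduce it is to group the comments into batches. The sketch below shows only the batching logic in plain Python; `chunks`, `predict_in_batches`, and `fake_predict_batch` are hypothetical helpers, and the stand-in predictor replaces the real tokenizer and model call, which would need padding to handle texts of different lengths.

```python
from typing import Callable, Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(
    texts: Sequence[str],
    predict_batch: Callable[[List[str]], List[int]],
    batch_size: int = 32,
) -> List[int]:
    """Run `predict_batch` on each chunk and concatenate the predictions in order."""
    preds: List[int] = []
    for batch in chunks(texts, batch_size):
        preds.extend(predict_batch(list(batch)))
    return preds


def fake_predict_batch(batch: List[str]) -> List[int]:
    """Stand-in for the model call, for illustration only."""
    return [1 if "!" in text else 0 for text in batch]


demo = predict_in_batches(["great!", "ok", "sure!"], fake_predict_batch, batch_size=2)
print(demo)  # [1, 0, 1]
```

In the notebook's setting, `predict_batch` would tokenize the whole batch, call `model.generate` once, and map each decoded output to 0 or 1.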

## Determine f1-score

### NOTE: Only the first 1000 records of the dataset are evaluated because of limited processing time and compute resources

In [13]:
from sklearn.metrics import f1_score, accuracy_score

f1_score(y_true=y_1k, y_pred=ypred_1k)

0.6590538336052203
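
For reference, the binary F1 score is the harmonic mean of precision and recall for the positive class (label 1), which simplifies to 2·TP / (2·TP + FP + FN). The following is a minimal pure-Python sketch of that formula; `confusion_counts` and `binary_f1` are hypothetical helper names, and the toy labels are made up for illustration and are not taken from the dataset.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))     # (2, 2, 1, 1)
print(round(binary_f1(y_true, y_pred), 4))  # 0.5714
```

Because true negatives do not enter the formula, F1 and accuracy can differ noticeably on the same predictions.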

### Accuracy score

In [14]:
accuracy_score(y_true=y_1k, y_pred=ypred_1k)

0.582

## Conclusion

We used a pre-trained model, Google's T5 base fine-tuned on the Twitter Sarcasm Dataset, which treats sequence classification as a text generation task.
This let us apply transfer learning to our problem without training a model ourselves.

This model yielded an F1 score of 0.66 and an accuracy of 0.58 on the first 1000 records, which is better than the metrics obtained in our previous attempts at building classical baseline models.