<a href="https://colab.research.google.com/github/AlaFalaki/tutorial_notebooks/blob/main/classification/hf_evaluate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sample code: how to use Huggingface's Evaluate library
This code is the supplementary material to the story published on the NLPiation Medium blog. Follow [the link](https://medium.com/@nlpiation/how-to-use-the-huggingface-evaluate-library-in-action-with-batching-2948929015bf) for a detailed explanation of the following code.

# Download and Load Libraries

In [1]:
!pip install -q torch==1.13.1 datasets==2.9.0 evaluate==0.4.0 transformers==4.26.0

## Import Libraries

In [2]:
import torch
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer
import evaluate
from datasets import load_dataset
from datasets import Dataset

from tqdm import tqdm
import pandas as pd
from sklearn.model_selection import train_test_split

# Load The Dataset

In [3]:
sentiment140 = load_dataset("sentiment140", cache_dir="./ds_sentiment140")




We first need to convert the dataset to a Dataframe and split it, since the dataset does not have a fixed test or validation set.

In [4]:
df = sentiment140["train"].to_pandas()

In [5]:
df['sentiment'] = df['sentiment'].replace(4, 1)

In [6]:
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['sentiment'], test_size=0.2, random_state=1)

In [7]:
X_test, X_val, y_test, y_val  = train_test_split(X_test, y_test, test_size=0.5, random_state=1)

In [8]:
print("Train:", len(y_train), " / Test:", len(y_test), " / Val:", len(y_val))

Train: 1280000  / Test: 160000  / Val: 160000
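
The split sizes printed above follow from applying the two `test_size` fractions one after the other. A minimal arithmetic sketch, assuming the row counts shown in the output and fractions that divide evenly (the variable names are illustrative and not part of the notebook):

```python
# Illustrative arithmetic only; the names below are hypothetical.
n_total = 1_280_000 + 160_000 + 160_000  # total rows, from the sizes printed above

holdout_fraction = 0.2  # test_size in the first train_test_split call
val_fraction = 0.5      # test_size in the second call, applied to the holdout

n_holdout = round(n_total * holdout_fraction)  # rows set aside for test + val
n_train = n_total - n_holdout
n_val = round(n_holdout * val_fraction)
n_test = n_holdout - n_val

print("Train:", n_train, " / Test:", n_test, " / Val:", n_val)
```

In other words, the two calls together produce an 80/10/10 split.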


Now, combine the separated texts and labels back into a joint Dataframe for each split.

In [9]:
train_df = pd.DataFrame(X_train,columns=['text'])
train_df['sentiment'] = y_train
train_df.head()

|  | text | sentiment |
|---|---|---|
| 1556092 | @JessicaKnows I use it and do like it. | 1 |
| 868905 | Almost home aaand I need to pee rather badly.... | 1 |
| 218471 | dropping the marmite and cheese covered bread ... | 0 |
| 620327 | Having issues with Xfire broadcast. Cancelled ... | 0 |
| 981867 | @kunaldua ask for Hermes Heritage complex. Its... | 1 |


In [10]:
test_df = pd.DataFrame(X_test,columns=['text'])
test_df['sentiment'] = y_test
test_df.head()

|  | text | sentiment |
|---|---|---|
| 932067 | @door_kicker hey tofu is super good for u...an... | 1 |
| 909762 | Caps lost. ARGH! But HP game evening was much ... | 1 |
| 1275248 | @Ellen_F OF has already been on in Oz. Not sur... | 1 |
| 1274799 | @alexpham4 with teleporation, I wouldn't need ... | 1 |
| 1530405 | Omg! i have 16 followers! thank u thank u thaa... | 1 |


In [11]:
valid_df = pd.DataFrame(X_val,columns=['text'])
valid_df['sentiment'] = y_val
valid_df.head()

|  | text | sentiment |
|---|---|---|
| 60473 | It is still raining and more storms are moving... | 0 |
| 1174268 | @JaydyGaGa ... Was well suprised ... I was li... | 1 |
| 1404666 | @ddlovato I'm sure you will Demi. | 1 |
| 380353 | @LightFoundDark yes Geographie and i dont know... | 0 |
| 470328 | Everyone follow @truthtweet, shows which celeb... | 0 |


Lastly, convert the Dataframes to Huggingface Dataset objects.

In [12]:
hf_train = Dataset.from_pandas(train_df)
hf_test = Dataset.from_pandas(test_df)
hf_valid = Dataset.from_pandas(valid_df)

The following cell is optional; use it if you want to save the Dataset objects for later.

In [13]:
# hf_train.save_to_disk('./hf-cache/processed_sentiment140/train')
# hf_test.save_to_disk('./hf-cache/processed_sentiment140/test')
# hf_valid.save_to_disk('./hf-cache/processed_sentiment140/valid')

# from datasets import load_from_disk

# train_set = load_from_disk('./hf-cache/processed_sentiment140/train')
# valid_set = load_from_disk('./hf-cache/processed_sentiment140/valid')
# test_set = load_from_disk('./hf-cache/processed_sentiment140/test')

⚠️ Bonus: You should also consider renaming the "sentiment" column to "label" if you want to train on this dataset with Huggingface's Trainer class.

In [14]:
# train_set = hf_train.rename_column("sentiment", "label")
# valid_set = hf_valid.rename_column("sentiment", "label")

⚠️ Comment out the following cell if this is not a test run. It selects the first 32 datapoints from the test set for faster prediction.

In [15]:
hf_test = hf_test.select( range(32) )

Iterating over the Huggingface dataset directly yields one example at a time, so the PyTorch DataLoader takes care of batching.

In [16]:
ds_loader = torch.utils.data.DataLoader(
    hf_test,
    batch_size=16,
    num_workers=4,
    pin_memory=True,
)
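
Conceptually, batching groups consecutive rows and merges each column into one list per batch. A rough pure-Python sketch of that idea follows. `iter_batches` and the toy rows are hypothetical stand-ins, and this is not how the PyTorch DataLoader is implemented (it also converts numeric columns to tensors and can load data in parallel).

```python
from typing import Dict, Iterator, List


def iter_batches(rows: List[dict], batch_size: int) -> Iterator[Dict[str, list]]:
    """Yield dict-of-lists batches from a list of row dicts."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Merge the rows column by column: {"text": [...], "sentiment": [...]}
        yield {key: [row[key] for row in chunk] for key in chunk[0]}


# Toy rows with the same column names as the notebook's test set.
toy_rows = [{"text": f"tweet {i}", "sentiment": i % 2} for i in range(5)]

batches = list(iter_batches(toy_rows, batch_size=2))
for batch in batches:
    print(batch)
```

The last batch is smaller than `batch_size` when the number of rows is not a multiple of it. That is not an issue for the 32-example test run with a batch size of 16.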



# Load Tokenizer and Model

I chose a RoBERTa model from the Huggingface hub that is fine-tuned on this dataset.

In [17]:
tokenizer = AutoTokenizer.from_pretrained("pig4431/Sentiment140_roBERTa_5E", cache_dir="./hf-cache/roberta")

In [18]:
model = AutoModelForSequenceClassification.from_pretrained("pig4431/Sentiment140_roBERTa_5E", cache_dir="./hf-cache/roberta")

Put the model on GPU if available.

In [19]:
if torch.cuda.is_available():
  model.to('cuda')

# Load the Metrics

In [20]:
metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])

# Prediction Loop

In [21]:
for batch in tqdm( ds_loader ):
  # Tokenize and move the input tensors to the same device as the model
  inputs = tokenizer(batch['text'], return_tensors="pt", padding=True).to(model.device)

  # Make Predictions
  with torch.no_grad():
      logits = model(**inputs).logits

  # Find the Predicted Label (moved back to the CPU for the metrics)
  predicted_class_id = logits.argmax(dim=-1).cpu()

  # Add the batch result to Evaluator object
  metrics.add_batch(references=batch['sentiment'], predictions=predicted_class_id)

100%|██████████| 2/2 [00:08<00:00,  4.36s/it]
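
The loop relies on two ideas: `argmax` picks the index of the largest logit as the predicted class, and `add_batch` stores each batch's predictions and references until `compute()` is called. Below is a minimal pure-Python sketch of that pattern. `RunningAccuracy`, `argmax_rows`, and the toy logits are hypothetical and only mimic the accumulate-then-compute flow; this is not the Evaluate library's implementation.

```python
from typing import List


def argmax_rows(logits: List[List[float]]) -> List[int]:
    """Return the index of the largest value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


class RunningAccuracy:
    """Hypothetical accumulator: collect batches first, score at the end."""

    def __init__(self) -> None:
        self.predictions: List[int] = []
        self.references: List[int] = []

    def add_batch(self, predictions: List[int], references: List[int]) -> None:
        self.predictions.extend(predictions)
        self.references.extend(references)

    def compute(self) -> dict:
        correct = sum(p == r for p, r in zip(self.predictions, self.references))
        return {"accuracy": correct / len(self.references)}


# Two toy batches of (logits, labels); the values are made up.
toy_batches = [
    ([[2.0, -1.0], [0.1, 0.9]], [0, 1]),
    ([[-0.5, 0.5], [1.5, 0.2]], [0, 0]),
]

tracker = RunningAccuracy()
for logits, labels in toy_batches:
    tracker.add_batch(predictions=argmax_rows(logits), references=labels)

result = tracker.compute()
print(result)
```

Scoring once at the end means the final score is computed over all examples, rather than averaged over per-batch scores.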


In [22]:
metrics.compute()

{'accuracy': 0.8125,
 'f1': 0.8333333333333334,
 'precision': 0.9375,
 'recall': 0.75}
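
As a sanity check, the four scores above can be reproduced by hand. The confusion counts below are not printed by the notebook. They are inferred as the counts over 32 examples that are consistent with the reported scores, treating label 1 as the positive class.

```python
# Confusion counts inferred from the reported scores (an assumption, not notebook output).
tp, fp, fn, tn = 15, 1, 5, 11

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print({"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall})
```

Precision is noticeably higher than recall here: on this 32-example sample, the model's positive predictions are mostly right, but it misses a quarter of the true positives.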