# Intro

This notebook is part of a series that reuses open-source LLMs to perform a binary classification task.

Each notebook can be run independently of the others; apart from dataset_utils.py, they share no local dependencies. (As a result,
you can expect a little code redundancy between notebooks.)

**The task is to detect toxic comments among text comments collected from different news websites.**

For more information, see dataset_utils.py or search for 'Civil Comments dataset' online.

-----
This notebook **performs zero-shot classification** via the remote HF Serverless API, a useful free service for testing LLMs.
(Usage is limited to a small number of daily requests.)

In [1]:
import os  # needed to read the API token from the environment

from tqdm import tqdm

import evaluate
from huggingface_hub import InferenceClient

from utils import dataset_utils

Datasets cache is False


In [3]:
token = os.environ["HF_TOKEN_SERVERLESS_API"] # ADD YOUR TOKEN TO YOUR ENV! (It's a free service)
client = InferenceClient(
    token=token,
)

# Load Dataset

In [4]:
comments_dataset = dataset_utils.load_sampled_ds(ds_size=200)

Map:   0%|          | 0/200 [00:00<?, ? examples/s]

Map:   0%|          | 0/200 [00:00<?, ? examples/s]

Map:   0%|          | 0/200 [00:00<?, ? examples/s]

In [5]:
# Our dataset already has 3 splits ready

# Our target is the 'is_toxic' binary column
# The main feature we'll use is the free text 'text' column
comments_dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'toxicity', 'severe_toxicity', 'obscene', 'threat', 'insult', 'identity_attack', 'sexual_explicit', 'is_toxic'],
        num_rows: 200
    })
    validation: Dataset({
        features: ['text', 'toxicity', 'severe_toxicity', 'obscene', 'threat', 'insult', 'identity_attack', 'sexual_explicit', 'is_toxic'],
        num_rows: 200
    })
    test: Dataset({
        features: ['text', 'toxicity', 'severe_toxicity', 'obscene', 'threat', 'insult', 'identity_attack', 'sexual_explicit', 'is_toxic'],
        num_rows: 200
    })
})

In [12]:
model = "meta-llama/Meta-Llama-3-8B-Instruct"
# Other available models for chat_completion:
# model = "google/gemma-2-2b-it"
# model = "mistralai/Mistral-Nemo-Instruct-2407"

def fetch_zero_shot_classification_from_llm(model, comment):
    response = client.chat_completion(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful assistant. You will receive comments from users and you need to determine whether those would be commonly judged 'toxic' on a social media app or not. The definition of 'toxic comment' here would be: 'a rude, disrespectful, or unreasonable comment that is somewhat likely to make you leave a discussion or give up on sharing your perspective'. Please answer with 'Yes' or 'No' only and nothing else."},
            # f-string so that the actual comment text is inserted into the prompt
            {"role": "user", "content": f"Is this comment toxic? '{comment}'"}
        ],
        max_tokens=5,
    )
    # Note we can also retrieve the logprob for the returned token, but not for all available tokens.
    # That makes optimising on logprobs data harder to do.
    returned_message = response.choices[0]["message"]["content"]

    return {"prediction": 'yes' in returned_message.lower()}  # Note: any reply that does not contain 'yes' is treated as a no!

# test it before running it on the whole dataset
fetch_zero_shot_classification_from_llm(model, "I like you!")

{'prediction': False}
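
The substring check on the model's reply is permissive: any reply containing the letters 'yes' counts as toxic, and everything else, including refusals or truncated output, silently counts as non-toxic. Below is a minimal sketch of a stricter parser. The name `parse_yes_no` is hypothetical and not part of the notebook; it only looks at the first word of the reply and returns `None` when that word is neither 'yes' nor 'no', so ambiguous replies can be counted separately.

```python
import re


def parse_yes_no(reply):
    """Return True for 'yes', False for 'no', None if the reply is ambiguous."""
    # Only the first alphabetic word is considered, case-insensitively.
    match = re.search(r"[a-z]+", reply.lower())
    if match is None:
        return None  # empty reply or no letters at all
    first_word = match.group(0)
    if first_word == "yes":
        return True
    if first_word == "no":
        return False
    return None  # e.g. "I cannot answer that"
```

How to handle `None` (drop the row, retry, or default to non-toxic) is a design choice left to the reader.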

In [13]:
comments_dataset = comments_dataset.map(
    lambda comment_data: fetch_zero_shot_classification_from_llm(model, comment_data["text"]),
    keep_in_memory=True,
    load_from_cache_file=False,
    batched=False
)

Map:   0%|          | 0/200 [00:00<?, ? examples/s]

Map:   0%|          | 0/200 [00:00<?, ? examples/s]

HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions (Request ID: 6RW7GohCCNaxiFnNAe16g)

Rate limit reached. You reached free usage limit (reset daily). Please subscribe to PRO or Enterprise Hub to get a higher limit: https://hf.co/pricing
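
The 429 error above aborts the whole `map` call. Retrying cannot help once the daily free quota is used up, but it may smooth over short-lived rate limiting. The following is a minimal sketch of a generic retry helper with exponential backoff. `call_with_backoff` and its parameters are hypothetical names introduced here for illustration, not part of the notebook or of `huggingface_hub`.

```python
import time


def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep,
                      retry_on=(Exception,)):
    """Call fn(); on a retryable error, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2 ** attempt)
```

In this notebook, one could wrap the API call as `call_with_backoff(lambda: fetch_zero_shot_classification_from_llm(model, text), retry_on=(HfHubHTTPError,))`, assuming `HfHubHTTPError` is imported from `huggingface_hub.utils`. Mapping over a single split (for example only `comments_dataset["validation"]`) is another way to reduce the number of requests.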

# Evaluate

In [8]:
clf_metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])

In [9]:
clf_metrics.compute(
    references=comments_dataset["validation"]["is_toxic"],
    predictions=comments_dataset["validation"]["prediction"]
)

  _warn_prf(average, modifier, msg_start, len(result))


{'accuracy': 0.89, 'f1': 0.0, 'precision': 0.0, 'recall': 0.0}
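
A high accuracy combined with zero F1, precision and recall is what one would see if the classifier never predicts the toxic class on an imbalanced dataset. The sketch below reproduces that pattern with plain Python. The label counts (22 toxic out of 200) are an assumption chosen to match the accuracy above, not values read from the dataset, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(references, predictions):
    """Compute accuracy, precision, recall and F1 from binary labels."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for r, p in pairs if r and p)
    fp = sum(1 for r, p in pairs if not r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    tn = sum(1 for r, p in pairs if not r and not p)
    # Undefined ratios fall back to 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Assumed labels: 22 toxic comments out of 200, and a model that always says "No".
references = [1] * 22 + [0] * 178
predictions = [0] * 200
result = binary_metrics(references, predictions)
print(result)
# {'accuracy': 0.89, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0}
```

Because accuracy rewards the majority class, F1 and recall are the more informative numbers to watch here.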

# Final test

In [10]:
# When you're happy with your tuning, run the evaluation on the test set and report your results on the sheet!
clf_metrics.compute(
    references=comments_dataset["test"]["is_toxic"],
    predictions=comments_dataset["test"]["prediction"]
)

  _warn_prf(average, modifier, msg_start, len(result))


{'accuracy': 0.94, 'f1': 0.0, 'precision': 0.0, 'recall': 0.0}