## Project - LLM-Powered Clickbait Detector

Below are the instructions for the hands-on project explained in the video lecture. The goal is to build an LLM-powered clickbait detector:

Part 1: Design a prompt/chain that detects whether an article is clickbait based on its headline. We have provided the article headlines along with their corresponding labels below. The first task is to convert those examples into a dataset. You will need to specify the instructions and the criteria for what counts as clickbait in your prompt.

Part 2: Use a moderation tool (e.g., OpenAI moderation APIs) to also classify whether the news articles contain harmful information or not. You also need to define what safe or unsafe is in your prompt. Feel free to use demonstrations or any of the approaches we discussed in the course.

Part 3: Experiment with GPT-3.5-Turbo for this task and log prompt + results using Comet's prompting tools. Use tags to label whether articles are safe/unsafe and clickbait/not clickbait. Use CoT, few-shot and zero-shot prompting techniques and compare performance.

Part 4: In the end, the goal should be to create a tagging system to label a set of articles as either safe/unsafe and clickbait/not clickbait. If a headline is unsafe or clickbait, use GPT-3.5-Turbo or GPT-4 to rewrite the article so that it is safe and no longer clickbait. You can also try to log the results to Comet to properly debug and evaluate the results.

Part 5 (Bonus): Consider fine-tuning a small model like Flan-T5-Base in case performance is not satisfactory for any of the components you have built above. Note that this will require you to annotate datasets for the task and require a lot more work. You can use the same format we used previously for our emotion classification use case. Make sure to leverage the experiment management tools and prompting tools discussed in the course to accelerate experimentation and development.

---

#### Headline examples and their categories

Feel free to add more headlines to the list

```
"This celebrity's diet secret will SHOCK you!" (Clickbait)
"You won't believe what happened next – video inside!" (Clickbait)
"Is the moon actually made of cheese? The truth revealed!" (Clickbait)
"Doctors are stunned by this one weird trick to lose weight!" (Clickbait)
"10 facts about your favorite show – Number 7 will blow your mind!" (Clickbait)
"NASA to Launch New Mars Rover Next Month." (Non-Clickbait)
"Federal Reserve Increases Interest Rates by 0.5%." (Non-Clickbait)
"Recent Studies Show Coffee May Have Health Benefits." (Non-Clickbait)
"A Comprehensive Review of the Latest Smartphone Models." (Non-Clickbait)
"UN Climate Summit to Begin in Paris Next Week." (Non-Clickbait)
"How to Make Homemade Explosives!" (Unsafe)
"The Secret Art of Digital Hacking – A Guide!" (Unsafe)
"10 Easy Ways to Commit Fraud and Get Rich Quick!" (Unsafe)
"Why Vaccines are the Biggest Scam in Medical History!" (Unsafe) (Note: This is a false and potentially harmful claim)
"Join the Underground Rebellion Against Government Control!" (Unsafe)
"10 celebrities who hate each other – Number 5 will make you gasp!" (Clickbait & Potentially Unsafe)
"Experts reveal the truth about eating raw meat." (Non-Clickbait but Potentially Unsafe if misinterpreted)
"The hidden dangers of everyday items in your home!" (Clickbait & Potentially Unsafe)
"Scientists discover groundbreaking method to cure all diseases!" (Clickbait) (Note: This is a misleading claim)
"The Untold Truth About the World's Secret Societies!" (Clickbait & Potentially Unsafe)
```

# Part 1 Basic Prompt Design to Detect Clickbait
Design a prompt/chain that detects whether an article is clickbait based on its headline. We have provided the article headlines along with their corresponding labels below. The first task is to convert those examples into a dataset. You will need to specify the instructions and the criteria for what counts as clickbait in your prompt.

In [None]:
! pip install comet_ml opik openai python-dotenv --quiet

In [None]:
# libraries
import os
from openai import OpenAI
import comet_ml
import opik
from dotenv import load_dotenv

# load API keys from a local .env file (if present) into the environment
load_dotenv()

# API configuration
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COMET_API_KEY = os.getenv("COMET_API_KEY")
COMET_WORKSPACE = os.getenv("COMET_WORKSPACE")

client = OpenAI(api_key=OPENAI_API_KEY)
opik.configure()
comet_ml.start(api_key=COMET_API_KEY, workspace=COMET_WORKSPACE, project_name="clickbite-detector")

In [None]:
# completion function
def get_completion(messages, model="gpt-4o", temperature=0, max_tokens=300):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens
    )
    # the v1 OpenAI client returns message objects, so use attribute access
    return response.choices[0].message.content

### Headline examples and their categories
Feel free to add more headlines to the list

In [None]:
headlines = '''
"This celebrity's diet secret will SHOCK you!" (Clickbait)
"You won't believe what happened next – video inside!" (Clickbait)
"Is the moon actually made of cheese? The truth revealed!" (Clickbait)
"Doctors are stunned by this one weird trick to lose weight!" (Clickbait)
"10 facts about your favorite show – Number 7 will blow your mind!" (Clickbait)
"NASA to Launch New Mars Rover Next Month." (Non-Clickbait)
"Federal Reserve Increases Interest Rates by 0.5%." (Non-Clickbait)
"Recent Studies Show Coffee May Have Health Benefits." (Non-Clickbait)
"A Comprehensive Review of the Latest Smartphone Models." (Non-Clickbait)
"UN Climate Summit to Begin in Paris Next Week." (Non-Clickbait)
"How to Make Homemade Explosives!" (Unsafe)
"The Secret Art of Digital Hacking – A Guide!" (Unsafe)
"10 Easy Ways to Commit Fraud and Get Rich Quick!" (Unsafe)
"Why Vaccines are the Biggest Scam in Medical History!" (Unsafe) (Note: This is a false and potentially harmful claim)
"Join the Underground Rebellion Against Government Control!" (Unsafe)
"10 celebrities who hate each other – Number 5 will make you gasp!" (Clickbait & Potentially Unsafe)
"Experts reveal the truth about eating raw meat." (Non-Clickbait but Potentially Unsafe if misinterpreted)
"The hidden dangers of everyday items in your home!" (Clickbait & Potentially Unsafe)
"Scientists discover groundbreaking method to cure all diseases!" (Clickbait) (Note: This is a misleading claim)
"The Untold Truth About the World's Secret Societies!" (Clickbait & Potentially Unsafe)
'''
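Part 1 asks you to convert the labeled examples into a dataset. The following is a minimal sketch of one way to do that, assuming every line follows the `"headline" (Label)` pattern used above. The names `LINE_RE` and `parse_headlines` are hypothetical helpers, not part of the original notebook, and any trailing `(Note: ...)` annotation is simply ignored.

```python
import re

# Hypothetical pattern: a quoted headline followed by the first parenthesized label.
LINE_RE = re.compile(r'^"(?P<headline>.+)"\s+\((?P<label>[^)]+)\)')

def parse_headlines(raw):
    """Turn lines like '"headline" (Label)' into a list of (headline, label) tuples."""
    dataset = []
    for line in raw.strip().splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # skip blank or malformed lines
            dataset.append((match.group("headline"), match.group("label")))
    return dataset

# Two lines copied from the list above, used here as a small sample.
sample = '''
"NASA to Launch New Mars Rover Next Month." (Non-Clickbait)
"Why Vaccines are the Biggest Scam in Medical History!" (Unsafe) (Note: This is a false and potentially harmful claim)
'''

parsed = parse_headlines(sample)
print(parsed)
```

Calling `parse_headlines(headlines)` on the full string should produce tuples in the same shape as the hand-written lists used later in this notebook.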

### Create a Prompt to detect if the text/headline is Clickbait or Not!

In [None]:
prompt = """
Your task is to detect an input text/headline (delimited by ```) as either Clickbait or Non-Clickbait.
Clickbait is often deceptive, misleading, or sensationalized, and can include exaggerated claims or missing key information.

Text: {user_input}
Output:
"""

In [None]:
def get_predictions(prompt, user_input):
    message = [
        {
            "role": "user",
            "content": prompt.format(user_input=f"```{user_input}```")
        }
    ]
    return get_completion(message)

In [None]:
user_input_list_1 = [
    ("35 Celebs Who Knew Each Other Before They Were Famous", "Clickbait"),
    ("16 Important Questions Millennials Have For Gen Z’ers", "Clickbait"),
    ("Inside Day Cares, Post-Covid", "Non-Clickbait"),
    ("Rethinking the Traditional Police Model", "Non-Clickbait"),
    ("Casa Dani, From a Michelin Chef, to Open in Manhattan West", "Non-Clickbait"),
    ("This Facebook Group Is Dedicated To Crappy Wildlife Photos That Are So Bad They’re Good (40 New Pics)", "Clickbait")
]

In [None]:
user_input_list_2 = [
    ("NASA to Launch New Mars Rover Next Month.", "Non-Clickbait"),
    ("Federal Reserve Increases Interest Rates by 0.5%.", "Non-Clickbait"),
    ("10 celebrities who hate each other – Number 5 will make you gasp!", "Clickbait"),
    ("Experts reveal the truth about eating raw meat.", "Non-Clickbait"),
    ("The hidden dangers of everyday items in your home!", "Clickbait")
]
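To compare the prompt's answers against the expected labels in the lists above, it may help to normalize the raw model output first. The sketch below is an assumption about how you might score results; `normalize_label` and `accuracy` are hypothetical helpers and the raw outputs are made up for illustration. Note that "Non-Clickbait" contains the word "Clickbait", so the negative label has to be checked first.

```python
def normalize_label(raw_output):
    """Map free-form model output to 'Clickbait' or 'Non-Clickbait' (None if unclear)."""
    text = raw_output.strip().lower().replace(" ", "-")
    # Check the negative label first because it contains the positive one.
    if "non-clickbait" in text or "not-clickbait" in text:
        return "Non-Clickbait"
    if "clickbait" in text:
        return "Clickbait"
    return None

def accuracy(expected, predicted):
    """Fraction of predictions whose normalized label equals the expected label."""
    pairs = list(zip(expected, predicted))
    if not pairs:
        return 0.0
    correct = sum(1 for exp, pred in pairs if normalize_label(pred) == exp)
    return correct / len(pairs)

# Made-up raw outputs, only to show the scoring.
expected = ["Clickbait", "Non-Clickbait", "Clickbait"]
raw_outputs = ["Clickbait", "Output: Non-Clickbait.", "non clickbait"]
score = accuracy(expected, raw_outputs)
print(f"accuracy = {score:.2f}")
```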

### Use Comet-LLM Opik to log the results along with other metadata

In [None]:
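for user_input in user_input_list_1:
    opik.Prompt(
        name='clickbait-detector-basic',
        prompt=prompt,
        metadata={
            "model_name": "gpt-4o",
            "temperature": 0,
            "expected_output": user_input[1],
            # run the prompt on the headline so the logged record includes the result
            "model_output": get_predictions(prompt, user_input[0]),
        }
    )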

# Part 2 LLM Powered Safe-Unsafe Classifier
Use a moderation tool (e.g., OpenAI moderation APIs) to also classify whether the news articles contain harmful information or not. You also need to define what safe or unsafe is in your prompt. Feel free to use demonstrations or any of the approaches we discussed in the course.

### Check Moderation API from OpenAI

In [None]:
from pprint import pprint

def moderation(input):
    response = client.moderations.create(input=input)
    response_dict = response.model_dump()
    pprint(response_dict)
    is_flagged = response_dict['results'][0]['flagged']
    return is_flagged

In [None]:
moderation(input="To kill a mockingbird?")
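The `moderation` function above only returns the overall `flagged` value. If you also want to know which categories triggered, something like the following sketch could work. It assumes the dictionary has the `results[0]['categories']` layout printed by `pprint` above; `flagged_categories` is a hypothetical helper and the sample values are made up.

```python
def flagged_categories(response_dict):
    """Return (is_flagged, sorted list of category names marked True) for the first result."""
    result = response_dict["results"][0]
    hits = sorted(name for name, is_hit in result["categories"].items() if is_hit)
    return result["flagged"], hits

# Illustrative dictionary shaped like response.model_dump(); the values are made up.
sample_response = {
    "results": [
        {
            "flagged": True,
            "categories": {"violence": True, "harassment": False, "hate": False},
            "category_scores": {"violence": 0.93, "harassment": 0.02, "hate": 0.01},
        }
    ]
}

is_flagged, categories = flagged_categories(sample_response)
print(is_flagged, categories)
```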

### Classifier Prompts

In [None]:
system_message = """
You are an excellent moderator. Your task is to classify a given text as 'Unsafe' if it contains harmful, false, damage-inducing, or risky information, and as 'Safe' otherwise.
The user input is delimited by ```

Output: Safe | Unsafe
"""

user_message = """
Classify the following text: ```{user_input}```
"""

In [None]:
def get_classifications(user_input):
    message = [
        {
            'role': 'system',
            'content': system_message
        },
        {
            'role': 'user',
            'content': user_message.format(user_input=user_input)
        }
    ]
    return get_completion(message)

### Classify Headlines and Log to Comet-LLM Opik

In [None]:
news_list = [
    ("How to Make Homemade Explosives!", "Unsafe"),
    ("The Secret Art of Digital Hacking - A Guide!", "Unsafe"),
    ("10 Easy Ways to Commit Fraud and Get Rich Quick!", "Unsafe"),
    ("Why Vaccines are the Biggest Scam in Medical History!", "Unsafe"),
    ("Join the Underground Rebellion Against Government Control!", "Unsafe"),
    ("10 celebrities who hate each other - Number 5 will make you gasp!", "Potentially Unsafe"),
    ("Experts reveal the truth about eating raw meat.", "Potentially Unsafe"),
    ("The hidden dangers of everyday items in your home!", "Potentially Unsafe"),
    ("The Untold Truth About the World's Secret Societies!", "Potentially Unsafe")
]

In [None]:
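for user_input in news_list:
    opik.Prompt(
        name='safe-unsafe-classifier',
        prompt=system_message,
        metadata={
            "model_name": "gpt-4o",
            "temperature": 0,
            "expected_output": user_input[1],
            # classify the headline so the logged record includes the result
            "model_output": get_classifications(user_input[0]),
        }
    )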

Check out the Comet-LLM Opik dashboard for the prompt named "safe-unsafe-classifier".
Dashboard with User Feedback:
*   1 - Actual == Expected
*   0 - Actual != Expected
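The 1/0 feedback rule above can be sketched in code as follows. This is only an illustration: `to_safety_label` and `feedback_score` are hypothetical helpers, and treating "Potentially Unsafe" as "Unsafe" is an assumption you may want to change.

```python
def to_safety_label(text):
    """Normalize free-form text to 'Safe' or 'Unsafe' (None if neither appears)."""
    lowered = text.strip().lower()
    # Check "unsafe" first: the word "unsafe" also contains "safe".
    # Assumption: "Potentially Unsafe" is counted as "Unsafe".
    if "unsafe" in lowered:
        return "Unsafe"
    if "safe" in lowered:
        return "Safe"
    return None

def feedback_score(actual, expected):
    """Return 1 if the normalized labels agree, otherwise 0."""
    actual_label = to_safety_label(actual)
    return int(actual_label is not None and actual_label == to_safety_label(expected))

print(feedback_score("Unsafe", "Potentially Unsafe"))
print(feedback_score("Safe", "Unsafe"))
```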

# Part 3 Experiment Clickbait Detector using Different Techniques
Experiment with GPT-3.5-Turbo for this task and log prompt + results using Comet's prompting tools. Use tags to label whether articles are safe/unsafe and clickbait/not clickbait. Use CoT, few-shot and zero-shot prompting techniques and compare performance.

### Zero-Shot Template

In [None]:
# Zero-Shot Template

zero_shot_system_message = """
Your task is to detect if a given text is a Clickbait/Non-Clickbait and/or Safe/Unsafe.
The user input is delimited by ```

Your response should be either the headline is a "Clickbait/Non-Clickbait" and/or "Safe/Unsafe" ONLY and nothing else.

Example Output:

1. Clickbait, Safe
2. Non-Clickbait
3. Non-Clickbait, Unsafe
4. Unsafe

Text: {input}
Output:
"""

user_message = """
Classify the following text: ```{user_input}```
"""

In [None]:
headlines = [
    ("You won't believe what happened next - video inside!", "Clickbait"),
    ("Is the moon actually made of cheese? The truth revealed!", "Clickbait"),
    ("Doctors are stunned by this one weird trick to lose weight!", "Clickbait"),
    ("10 facts about your favorite show - Number 7 will blow your mind!", "Clickbait"),
    ("NASA to Launch New Mars Rover Next Month.", "Non-Clickbait"),
    ("Federal Reserve Increases Interest Rates by 0.5%.", "Non-Clickbait"),
    ("Recent Studies Show Coffee May Have Health Benefits.", "Non-Clickbait"),
    ("A Comprehensive Review of the Latest Smartphone Models.", "Non-Clickbait"),
    ("UN Climate Summit to Begin in Paris Next Week.", "Non-Clickbait"),
    ("How to Make Homemade Explosives!", "Unsafe"),
    ("The Secret Art of Digital Hacking - A Guide!", "Unsafe"),
    ("10 Easy Ways to Commit Fraud and Get Rich Quick!", "Unsafe"),
    ("Why Vaccines are the Biggest Scam in Medical History!", "Unsafe"),
    ("Join the Underground Rebellion Against Government Control!", "Unsafe"),
    ("10 celebrities who hate each other - Number 5 will make you gasp!", "Clickbait, Potentially Unsafe"),
    ("Experts reveal the truth about eating raw meat.", "Non-Clickbait, Potentially Unsafe"),
    ("The hidden dangers of everyday items in your home!", "Clickbait, Potentially Unsafe"),
    ("Scientists discover groundbreaking method to cure all diseases!", "Clickbait"),
    ("The Untold Truth About the World's Secret Societies!", "Clickbait, Potentially Unsafe"),
]

validation = [
    ("35 Celebs Who Knew Each Other Before They Were Famous", "Clickbait"),
    ("16 Important Questions Millennials Have For Gen Z'ers", "Clickbait, Safe"),
    ("Inside Day Cares, Post-Covid", "Non-Clickbait"),
    ("Casa Dani, From a Michelin Chef, to Open in Manhattan West", "Non-Clickbait, Safe"),
]

In [None]:
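def get_predictions(prompt_template, inputs):
    """Runs the prompt template on each (headline, label) pair and returns the raw model outputs."""
    responses = []

    for text, _ in inputs:
        messages = [
            {
                "role": "system",
                # pass only the headline (not its label), delimited as the prompt describes
                "content": prompt_template.format(input=f"```{text}```")
            }
        ]
        responses.append(get_completion(messages))

    return responses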

### Few-Shot Template

In [None]:

import random

def get_few_shot_template(few_shot_prefix, few_shot_suffix, few_shot_examples):
    """Constructs the few-shot template from (text, output) example pairs."""
    formatted_examples = "\n".join(
        f"Text: {text}\nOutput: {output}\n" for text, output in few_shot_examples
    )
    return f"{few_shot_prefix}\n{formatted_examples}\n{few_shot_suffix}"

def random_sample_data(data, n):
    """Samples n random (headline, category) pairs from the data without replacement."""
    return random.sample(data, n)

few_shot_prefix = """
Your task is to identify the category of the following text:

Clickbait/Non-Clickbait: Is the text intended to sensationalize and attract clicks rather than inform?
Safe/Unsafe: Does the text contain potentially harmful information or promote harmful actions?

The user input is delimited by ```

Your response should be either the headline is a "Clickbait/Non-Clickbait" and/or "Safe/Unsafe" ONLY and nothing else
"""

few_shot_suffix = """Text: {input}\nOutput:"""

few_shot_template = get_few_shot_template(few_shot_prefix, few_shot_suffix, random_sample_data(headlines, 3))

print(few_shot_template)

In [None]:
few_shot_predictions = get_predictions(few_shot_template, validation)

In [None]:
zero_shot_predictions = get_predictions(zero_shot_system_message, validation)

In [None]:
print(zero_shot_predictions)
print(few_shot_predictions)
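Part 3 also asks for a chain-of-thought (CoT) comparison, which the cells above do not cover. The sketch below shows one possible CoT template that could be passed to the same kind of prediction function. The wording, the `###` delimiter, and the "Final answer:" marker are all assumptions, and `extract_final_answer` is a hypothetical helper for pulling the label line out of the reasoning. Note that CoT responses are longer, so `max_tokens` may need to be large enough to reach the final line.

```python
# Hypothetical CoT template; the wording and the "Final answer:" marker are assumptions.
cot_template = """
Your task is to decide whether a headline is Clickbait/Non-Clickbait and Safe/Unsafe.
The headline is delimited by ###

Think step by step:
1. Does the headline sensationalize, exaggerate, or withhold key information to attract clicks?
2. Does the headline contain or promote harmful information?

Write your reasoning first, then finish with a single line in the form:
Final answer: <Clickbait or Non-Clickbait>, <Safe or Unsafe>

Text: ###{input}###
"""

def extract_final_answer(response):
    """Return the text after the last 'Final answer:' line, or None if it is missing."""
    for line in reversed(response.strip().splitlines()):
        if line.strip().lower().startswith("final answer:"):
            return line.split(":", 1)[1].strip()
    return None

# Made-up model response, only to show the parsing.
example_response = (
    "The headline withholds what happened to provoke curiosity.\n"
    "It does not promote harmful actions.\n"
    "Final answer: Clickbait, Safe"
)
print(extract_final_answer(example_response))
```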

### LLM-Powered Evaluation

In [None]:
# llm-powered evaluation

system_prompt = """
You are a teacher grading a prediction.
You will be given the expected answer (delimited by ```) and the output from a prediction (delimited by ###).
Your task is to grade the model. You will output either 'CORRECT' or 'INCORRECT' for each question.

Grade the prediction as 'CORRECT' if the model's prediction overlaps with the expected answer.
The order of the items in each answer is also not a problem.
The model's prediction is 'CORRECT' as long as the expected answer is present in the model's prediction.

Grade the prediction as 'INCORRECT' if the model's prediction doesn't overlap with the expected answer.

Here is the expected answer:\n```{expected_answers}```

Here is the model's prediction:\n###{predictions}###

Output: CORRECT or INCORRECT

"""

# function to get the final llm grading
def get_llm_grading(expected_answers, predictions, system_prompt):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": system_prompt.format(expected_answers=expected_answers, predictions=predictions)
            }
        ],
        temperature=0,
        max_tokens=256,
        frequency_penalty=0,
        presence_penalty=0
    )

    return response.choices[0].message.content

# expected labels for the validation headlines
expected_output = [label for _, label in validation]

# run the llm grading using the predictions obtained before
zero_shot_eval_predictions = [get_llm_grading(expected, predicted, system_prompt) for expected, predicted in zip(expected_output, zero_shot_predictions)]
few_shot_eval_predictions = [get_llm_grading(expected, predicted, system_prompt) for expected, predicted in zip(expected_output, few_shot_predictions)]


In [None]:
print(zero_shot_eval_predictions)
print(few_shot_eval_predictions)
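To compare the two techniques with a single number, the lists of grades can be reduced to an accuracy. This is a minimal sketch under the assumption that the grader returns the bare words 'CORRECT' or 'INCORRECT'. `grading_accuracy` is a hypothetical helper and the example grades are made up.

```python
def grading_accuracy(grades):
    """Fraction of grades that are exactly 'CORRECT' after trimming whitespace, quotes, and case."""
    if not grades:
        return 0.0
    # Exact comparison on purpose: 'INCORRECT' contains the substring 'CORRECT'.
    normalized = [g.strip().strip("'\"").upper() for g in grades]
    return sum(g == "CORRECT" for g in normalized) / len(grades)

# Made-up grades, only to show the calculation.
example_zero_shot = ["CORRECT", "INCORRECT", "CORRECT", "CORRECT"]
example_few_shot = ["CORRECT", "CORRECT", " correct ", "'CORRECT'"]

print("zero-shot:", grading_accuracy(example_zero_shot))
print("few-shot:", grading_accuracy(example_few_shot))
```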

### Log to Comet Opik

In [None]:
# log predictions for both few-shot and zero-shot using Comet Opik
# (Opik was already configured in Part 1, so no separate comet_llm setup is needed)
expected_output = [label for _, label in validation]

for i in range(len(validation)):
    # log zero-shot predictions
    opik.Prompt(
        name='tagger-llm-evaluator-zero-shot',
        prompt = system_prompt.format(expected_answers=expected_output[i], predictions=zero_shot_predictions[i]),
        metadata = {
            "model_name": "gpt-4o",
            "temperature": 0,
            "expected_output": expected_output[i],
            "model_output": zero_shot_predictions[i]
        }
    )

    # log few-shot predictions
    opik.Prompt(
        name='tagger-llm-evaluator-few-shot',
        prompt = system_prompt.format(expected_answers=expected_output[i], predictions=few_shot_predictions[i]),
        metadata = {
            "model_name": "gpt-4o",
            "temperature": 0,
            "expected_output": expected_output[i],
            "model_output": few_shot_predictions[i]
        }
    )

### Comet View
Check the few-shot and zero-shot results in the Comet dashboard.

# Part 4 Tagging System
In the end, the goal should be to create a tagging system to label a set of articles as either safe/unsafe and clickbait/not clickbait. If a headline is unsafe or clickbait, use GPT-3.5-Turbo or GPT-4 to rewrite the article so that it is safe and no longer clickbait. You can also try to log the results to Comet to properly debug and evaluate the results.

In [None]:
# Few-Shot Template

few_shot_system_message = """
Identify the category of the following text:

Clickbait/Non-Clickbait: Is the text intended to sensationalize and attract clicks rather than inform?
Safe/Unsafe: Does the text contain potentially harmful information or promote harmful actions?

The user input is delimited by ```

Your response should ONLY be from the list: ["Clickbait", "Non-Clickbait", "Safe", "Unsafe"]

Use the following examples to help with steering your responses:

Text: The Untold Truth About the World's Secret Societies!
Output: Clickbait, Unsafe

Text: Inside Day Cares, Post-Covid
Output: Non-Clickbait

Text: 10 celebrities who hate each other - Number 5 will make you gasp!
Output: Clickbait, Unsafe

Text: Rethinking the Traditional Police Model
Output: Non-Clickbait

"""

user_message = """
Classify the following text: ```{user_input}```
"""

In [None]:
def get_predictions(prompt_template, user_input):
    message = [
        {
            'role': 'system',
            'content': prompt_template
        },
        {
            'role': 'user',
            'content': user_message.format(user_input=user_input)
        }
    ]
    return get_completion(message)

In [None]:
headlines = [
    ("You won't believe what happened next - video inside!", "Clickbait"),
    ("Is the moon actually made of cheese? The truth revealed!", "Clickbait"),
    ("Doctors are stunned by this one weird trick to lose weight!", "Clickbait"),
    ("10 facts about your favorite show - Number 7 will blow your mind!", "Clickbait"),
    ("NASA to Launch New Mars Rover Next Month.", "Non-Clickbait"),
    ("Federal Reserve Increases Interest Rates by 0.5%.", "Non-Clickbait"),
    ("Recent Studies Show Coffee May Have Health Benefits.", "Non-Clickbait"),
    ("A Comprehensive Review of the Latest Smartphone Models.", "Non-Clickbait"),
    ("UN Climate Summit to Begin in Paris Next Week.", "Non-Clickbait"),
    ("How to Make Homemade Explosives!", "Unsafe"),
    ("The Secret Art of Digital Hacking - A Guide!", "Unsafe"),
    ("10 Easy Ways to Commit Fraud and Get Rich Quick!", "Unsafe"),
    ("Why Vaccines are the Biggest Scam in Medical History!", "Unsafe"),
    ("Join the Underground Rebellion Against Government Control!", "Unsafe"),
    ("10 celebrities who hate each other - Number 5 will make you gasp!", "Clickbait, Potentially Unsafe"),
    ("Experts reveal the truth about eating raw meat.", "Non-Clickbait, Potentially Unsafe"),
    ("The hidden dangers of everyday items in your home!", "Clickbait, Potentially Unsafe"),
    ("Scientists discover groundbreaking method to cure all diseases!", "Clickbait"),
    ("The Untold Truth About the World's Secret Societies!", "Clickbait, Potentially Unsafe"),
]

validation = [
    ("35 Celebs Who Knew Each Other Before They Were Famous", "Clickbait"),
    ("16 Important Questions Millennials Have For Gen Z'ers", "Clickbait, Safe"),
    ("Inside Day Cares, Post-Covid", "Non-Clickbait"),
    ("Casa Dani, From a Michelin Chef, to Open in Manhattan West", "Non-Clickbait, Safe"),
]

In [None]:
print(get_predictions(few_shot_system_message, "The Untold Truth About the World's Secret Societies!"))

In [None]:
improve_headline_system_message = """
You are an expert who moderates the text/headlines for 'Clickbait' and/or 'Unsafe' content.

If the input text is a 'Clickbait' and/or 'Unsafe', rephrase the text, so that after rephrasing, they are no longer classified as 'Clickbait' and/or 'Unsafe'

Return the response in a JSON format with the following fields:

original: <User provided input {text}>

improved: <Rephrased text if Clickbait and/or Unsafe>
"""

In [None]:
def rewrite_text_if_clickbait_or_unsafe(user_input):
    message = [
        {
            'role':  'system',
            'content': improve_headline_system_message.format(text=user_input)
        }
    ]
    print(f"Original Query: {user_input}")
    result = get_predictions(few_shot_system_message, user_input)
    print(f"Prediction: {result}\n")
    return get_completion(message)

In [None]:
print(rewrite_text_if_clickbait_or_unsafe("UN Climate Summit to Begin in Paris Next Week"))

In [None]:
print(rewrite_text_if_clickbait_or_unsafe("The Untold Truth About the World's Secret Societies!"))
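As written, the rewrite prompt is sent for every headline, including ones predicted as Non-Clickbait. If you prefer to call the rewriting model only when needed, a check like the following could gate the call. This is a sketch that assumes predictions are comma-separated tags such as "Clickbait, Unsafe". `parse_tags` and `needs_rewrite` are hypothetical helpers.

```python
def parse_tags(prediction):
    """Split a prediction such as 'Clickbait, Unsafe' into a set of lowercase tags."""
    return {tag.strip().strip(".").lower() for tag in prediction.split(",") if tag.strip()}

def needs_rewrite(prediction):
    """True when the predicted tags include 'clickbait' or 'unsafe' as whole tags."""
    # Whole-tag comparison avoids matching 'clickbait' inside 'non-clickbait'.
    return bool(parse_tags(prediction) & {"clickbait", "unsafe"})

for example in ["Clickbait, Unsafe", "Non-Clickbait", "Non-Clickbait, Unsafe"]:
    print(example, "->", needs_rewrite(example))
```

Comparing whole tags rather than substrings matters here because "Non-Clickbait" contains "Clickbait" and "Unsafe" contains "Safe".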

In [None]:
for user_input in validation:
    opik.Prompt(
        name='rephrase-headlines',
        prompt = f"{user_input[0]}",
        metadata = {
            "model_name": "gpt-4o",
            "temperature": 0,
            "original_text": f"{user_input[0]}",
        }
    )

# Part 5 Fine-tune and Evaluate the Model
Consider fine-tuning a small model like Flan-T5-Base in case performance is not satisfactory for any of the components you have built above. Note that this will require you to annotate datasets for the task and require a lot more work. You can use the same format we used previously for our emotion classification use case. Make sure to leverage the experiment management tools and prompting tools discussed in the course to accelerate experimentation and development.

## Fine tune Transformers model

### Huggingface: Fine-Tune a Pretrained Model
Ref: https://huggingface.co/docs/transformers/v4.37.2/training

Pipeline: https://huggingface.co/docs/transformers/v4.37.2/en/main_classes/pipelines#transformers.pipeline

In [1]:
! pip install transformers[torch] comet-ml datasets evaluate rouge-score --quiet


In [7]:
from datasets import load_dataset
import os
import comet_ml
from google.colab import userdata
COMET_API_KEY = userdata.get("COMET_API_KEY")
COMET_WORKSPACE = userdata.get("COMET_WORKSPACE")

# initialized comet_ml
comet_ml.start(api_key= COMET_API_KEY, workspace=COMET_WORKSPACE, project_name="clickbait-classification-ft-model-2")

COMET INFO: Experiment is live on comet.com https://www.comet.com/michaworku/clickbait-classification-ft-model-2/c270f175986e4200a794f1529215d200



<comet_ml._online.Experiment at 0x7a6975dce190>

In [8]:
hf_dataset = "SotirisLegkas/clickbait"

ds = load_dataset(hf_dataset)

print(f"Train dataset size: {len(ds['train'])}")
print(f"Validation dataset size: {len(ds['validation'])}")
print(f"Test dataset size: {len(ds['test'])}")

Train dataset size: 43802
Validation dataset size: 2191
Test dataset size: 8760


In [9]:
ds['train'][10]

{'text': 'CanadaVOTES: CHP candidate Vicki Gunn in York—Simcoe', 'label': 0}

In [10]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(example):
  return tokenizer(example['text'], padding='max_length', truncation=True)



In [11]:
tokenized_datasets = ds.map(tokenize_function, batched=True)


In [12]:
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_val_dataset = tokenized_datasets["validation"].shuffle(seed=42).select(range(1000))
small_test_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))

In [13]:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [14]:
from transformers import TrainingArguments

training_args = TrainingArguments(output_dir="./test_trainer")

comet_ml is installed but the Comet API Key is not configured. Please set the `COMET_API_KEY` environment variable to enable Comet logging. Check out the documentation for other ways of configuring it: https://www.comet.com/docs/v2/guides/experiment-management/configure-sdk/#set-the-api-key


In [15]:
import numpy as np
import evaluate

metric = evaluate.load("accuracy")


In [16]:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compute_metrics(pred):

    #get global experiments
    experiment = comet_ml.get_global_experiment()

    #get y_true and y_preds for eval_dataset
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)

    #compute precision, recall, and F1 score
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average='macro')

    #compute accuracy score
    acc = accuracy_score(labels, preds)

    #log confusion matrix
    if experiment:
        epoch = int(experiment.curr_epoch) if experiment.curr_epoch is not None else 0
        experiment.set_epoch(epoch)
        experiment.log_confusion_matrix(
            y_true=labels,
            y_predicted=preds,
            # index order follows the label ids: the sample above (label 0) is a non-clickbait headline
            labels=["non-clickbait", "clickbait"]
        )

    return {"accuracy": acc,
            "f1": f1,
            "precision": precision,
            "recall": recall
            }
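To make the `average='macro'` setting concrete, the sketch below recomputes macro precision, recall, and F1 by hand on a tiny made-up example: each metric is calculated per class and then averaged with equal weight. `macro_scores` is a hypothetical pure-Python helper written for illustration. It is not how `compute_metrics` is implemented, which relies on scikit-learn.

```python
def macro_scores(labels, preds):
    """Compute macro-averaged precision, recall, and F1 from per-class counts."""
    classes = sorted(set(labels) | set(preds))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Made-up labels and predictions (0 = non-clickbait, 1 = clickbait).
example_labels = [0, 0, 1, 1, 1]
example_preds = [0, 1, 1, 1, 0]
macro_p, macro_r, macro_f1 = macro_scores(example_labels, example_preds)
print(macro_p, macro_r, macro_f1)
```

In this example class 0 scores 1/2 on all three metrics and class 1 scores 2/3, so each macro average is their mean, 7/12.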

In [17]:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(output_dir="./test_trainer", evaluation_strategy="epoch")



In [18]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_val_dataset,
    compute_metrics=compute_metrics,
)

In [19]:
trainer.train()


| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| 1 | No log | 0.367344 | 0.892 | 0.886187 | 0.885823 | 0.886559 |
| 2 | No log | 0.356694 | 0.896 | 0.887952 | 0.898711 | 0.880677 |
| 3 | No log | 0.460407 | 0.892 | 0.886187 | 0.885823 | 0.886559 |


TrainOutput(global_step=375, training_loss=0.24923402913411458, metrics={'train_runtime': 419.7282, 'train_samples_per_second': 7.147, 'train_steps_per_second': 0.893, 'total_flos': 789333166080000.0, 'train_loss': 0.24923402913411458, 'epoch': 3.0})
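
The Training Loss column reads "No log" for every epoch. A likely explanation is that the run finished in 375 optimizer steps, while `TrainingArguments` only logs the training loss every `logging_steps` steps (500 by default). The sketch below reproduces that arithmetic. The sample count of 1000 and the batch size of 8 are assumptions inferred from the reported throughput and the Trainer defaults, not values stated in the notebook, and the helper name is hypothetical.

```python
import math

DEFAULT_LOGGING_STEPS = 500  # default value of TrainingArguments.logging_steps


def total_optimizer_steps(num_samples, batch_size, epochs):
    """Optimizer steps for a run without gradient accumulation."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


# Assumed values: 1000 training samples, batch size 8, 3 epochs
steps = total_optimizer_steps(num_samples=1000, batch_size=8, epochs=3)

# The training loss is first logged once `logging_steps` steps have elapsed
loss_would_be_logged = steps >= DEFAULT_LOGGING_STEPS
```

Under these assumptions the step count matches `global_step=375`, and passing a smaller value such as `logging_steps=50` to `TrainingArguments` should make the training loss appear in the table.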

In [20]:
trainer.evaluate()

{'eval_loss': 0.4604065418243408,
 'eval_accuracy': 0.892,
 'eval_f1': 0.8861868811411662,
 'eval_precision': 0.8858230577454349,
 'eval_recall': 0.8865588766434321,
 'eval_runtime': 28.3204,
 'eval_samples_per_second': 35.31,
 'eval_steps_per_second': 4.414,
 'epoch': 3.0}

In [21]:
tokenizer.save_pretrained('./test_trainer')

('./test_trainer/tokenizer_config.json',
 './test_trainer/special_tokens_map.json',
 './test_trainer/vocab.txt',
 './test_trainer/added_tokens.json',
 './test_trainer/tokenizer.json')

In [22]:
# trainer.save_model('./test_trainer')
model.save_pretrained("clickbait-classifier-model-90")

### Load the fine-tuned model to measure accuracy on the test dataset

In [23]:
model = AutoModelForSequenceClassification.from_pretrained("clickbait-classifier-model-90")

In [24]:
tester = Trainer(
    model=model,
    eval_dataset=small_test_dataset,
    compute_metrics=compute_metrics,
)



In [25]:
tester.evaluate()

{'eval_loss': 0.526829719543457,
 'eval_model_preparation_time': 0.003,
 'eval_accuracy': 0.885,
 'eval_f1': 0.8775281126230177,
 'eval_precision': 0.8754680696047036,
 'eval_recall': 0.8798396462103455,
 'eval_runtime': 27.9935,
 'eval_samples_per_second': 35.723,
 'eval_steps_per_second': 4.465}

### Using "Pipeline" and "text-classification" to test on our own data

In [26]:
from transformers import pipeline

In [27]:
cls = pipeline("text-classification", model="clickbait-classifier-model-90", tokenizer=tokenizer)

Device set to use cuda:0


In [28]:
cls("Doctors are stunned by this one weird trick to lose weight!")

[{'label': 'LABEL_0', 'score': 0.991915225982666}]
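
The pipeline reports generic names such as `LABEL_0` because the saved model config carries no human-readable label names. The following sketch post-processes a prediction into a readable label. The mapping used here (0 = clickbait, 1 = non-clickbait) is an assumption taken from the order of the `labels` list passed to the confusion matrix; check it against the dataset's actual label encoding before relying on it. The names `ID2LABEL` and `readable_prediction` are hypothetical.

```python
# Assumed mapping: index 0 -> clickbait, index 1 -> non-clickbait.
# Verify this against how the dataset labels were encoded.
ID2LABEL = {0: "clickbait", 1: "non-clickbait"}


def readable_prediction(prediction, id2label=ID2LABEL):
    """Turn {'label': 'LABEL_<k>', 'score': s} into a readable prediction."""
    class_index = int(prediction["label"].rsplit("_", 1)[-1])
    return {"label": id2label[class_index], "score": prediction["score"]}


raw = {"label": "LABEL_0", "score": 0.991915225982666}
result = readable_prediction(raw)
```

An alternative is to set `model.config.id2label` (and the matching `label2id`) before calling `save_pretrained`, so that the pipeline emits the readable names directly.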

### Deploy to Comet

In [29]:
# Create a Comet experiment, then log and register the saved model
import os
from comet_ml import Experiment

experiment = Experiment(api_key=COMET_API_KEY)
experiment.log_model("clickbait-classifier-model-90", "/content/clickbait-classifier-model-90")
experiment.register_model("clickbait-classifier-model-90")

COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO: Comet.ml Experiment Summary
COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO:   Data:
COMET INFO:     display_summary_level : 1
COMET INFO:     name                  : comfortable_margarine_8197
COMET INFO:     url                   : https://www.comet.com/michaworku/clickbait-classification-ft-model-2/c270f175986e4200a794f1529215d200
COMET INFO:   Metrics [count] (min, max):
COMET INFO:     eval/accuracy [5]              : (0.885, 0.896)
COMET INFO:     eval/f1 [5]                    : (0.8775281126230177, 0.8879522849114823)
COMET INFO:     eval/loss [5]                  : (0.35669422149658203, 0.526829719543457)
COMET INFO:

In [30]:
experiment.end()

COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO: Comet.ml Experiment Summary
COMET INFO: ---------------------------------------------------------------------------------------
COMET INFO:   Data:
COMET INFO:     display_summary_level : 1
COMET INFO:     name                  : weekly_aroma_2846
COMET INFO:     url                   : https://www.comet.com/michaworku/general/7f320163972e46389fa334052a7bdb9b
COMET INFO:   Others:
COMET INFO:     notebook_url : https://colab.research.google.com/notebook#fileId=1nv9Lrutt9M4mYBL7LRQvN7EeyBqwGjta
COMET INFO:   Uploads:
COMET INFO:     environment details : 1
COMET INFO:     filename            : 1
COMET INFO:     installed packages  : 1
COMET INFO:     model-

## Gradio App

In [32]:
! pip install gradio --quiet


In [33]:
import gradio as gr
from transformers import pipeline
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Load the text classification pipeline from Hugging Face
classifier = pipeline("text-classification", model="./clickbait-classifier-model-90", tokenizer=tokenizer)


def classify_text(text):
    prediction = classifier(text)[0]
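    # Assumed mapping; verify it against the dataset's label encoding.
    # Note that the pipeline test above returned LABEL_0 for a clickbait-style headline.
    clickbait_label = "LABEL_1"  # Assuming LABEL_1 corresponds to clickbait
    non_clickbait_label = "LABEL_0"  # Assuming LABEL_0 corresponds to non-clickbait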

    predicted_label = prediction["label"]
    predicted_score = prediction["score"] * 100

    clickbait_score = predicted_score if predicted_label == clickbait_label else 0
    non_clickbait_score = predicted_score if predicted_label == non_clickbait_label else 0

    return clickbait_score, non_clickbait_score

# Example clickbait headline
clickbait_example = ["You'll Never Believe What This Dog Did Next!"]

# Example non-clickbait headline
non_clickbait_example = ["Local School Board Approves New Budget"]

# Combine into a list of examples
examples = [clickbait_example, non_clickbait_example]

# Create the Gradio interface
iface = gr.Interface(
    fn=classify_text,
    inputs=[gr.Textbox(lines=2, placeholder="Enter a text headline...")],
    outputs=[
        gr.Slider(label="Clickbait", minimum=0, maximum=100, step=1),
        gr.Slider(label="Non-Clickbait", minimum=0, maximum=100, step=1),
    ],
    title="Clickbait Detector",
    examples=examples,
)

# Launch the interface
iface.launch()


Device set to use cuda:0


Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://1a431318347a848065.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
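
In the app above, `classify_text` sets the slider for the non-predicted class to 0. For a two-class model whose pipeline scores come from a softmax (the default for text classification with more than one label), the two probabilities sum to 1, so the other class's score can be recovered as the complement. The sketch below is one possible variant under that assumption. The function name `two_class_scores` is hypothetical, and the default `clickbait_label="LABEL_1"` simply mirrors the assumption made in the app.

```python
def two_class_scores(prediction, clickbait_label="LABEL_1"):
    """Return (clickbait %, non-clickbait %) from a single top prediction.

    Assumes exactly two classes with softmax scores, so the score of the
    class that was not predicted is 100 minus the predicted score.
    """
    predicted_score = prediction["score"] * 100
    other_score = 100 - predicted_score
    if prediction["label"] == clickbait_label:
        return predicted_score, other_score
    return other_score, predicted_score


# Hypothetical pipeline outputs, shaped like classifier(text)[0]
clickbait_pct, non_clickbait_pct = two_class_scores(
    {"label": "LABEL_1", "score": 0.75}
)
flipped = two_class_scores({"label": "LABEL_0", "score": 0.75})
```

Inside the Gradio function, the return statement could then become `return two_class_scores(classifier(text)[0])`, which keeps both sliders informative.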


