<a href="https://colab.research.google.com/github/fginter/FoamCutterSW/blob/master/01_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DSPy tutorial part 1:

*   Grabs a dataset of climate change claims
*   Carries out the dataset's classification task using DSPy
*   Compares the model's outputs with the dataset labels

IMPORTANT

Before you start, click the "key" symbol on the left, create a "new secret"
named "openai-api-key", and set its value to the key you get from me. Then enable "Notebook access" for it.
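
For reference, a secret stored this way can be read from code with `userdata.get`. The following is only a minimal sketch: the helper name `read_api_key` and the `OPENAI_API_KEY` environment-variable fallback are assumptions made for illustration, not part of this notebook.

```python
import os


def read_api_key(secret_name="openai-api-key", env_var="OPENAI_API_KEY"):
    """Return the API key from Colab secrets, or from an environment variable.

    Hypothetical helper: the env_var fallback is an assumption so that the
    function also works outside Colab.
    """
    try:
        # Only available inside Google Colab
        from google.colab import userdata
    except ImportError:
        # Not running in Colab: fall back to the environment (may be None)
        return os.environ.get(env_var)
    # Raises an error if the secret is missing or notebook access is disabled
    return userdata.get(secret_name)


api_key = read_api_key()
```

If this returns `None`, the key was found neither in Colab secrets nor in the environment.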

In [13]:
!pip3 install -q 'datasets<4.0.0'
!pip3 install -q dspy

#Get the API key
!wget -O api-key.txt http://epsilon-it.utu.fi/dight-api-key-1.txt
api_key=open("api-key.txt").read().strip()

#Backup option:
#api_key="sk_...."

--2025-08-28 07:53:32--  http://epsilon-it.utu.fi/dight-api-key-1.txt
Resolving epsilon-it.utu.fi (epsilon-it.utu.fi)... 130.232.253.13
Connecting to epsilon-it.utu.fi (epsilon-it.utu.fi)|130.232.253.13|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 165 [text/plain]
Saving to: ‘api-key.txt’


2025-08-28 07:53:32 (21.0 MB/s) - ‘api-key.txt’ saved [165/165]



In [14]:
from datasets import load_dataset
from google.colab import userdata
import random



# Load the Frugal AI Challenge dataset
dataset = load_dataset("QuotaClimat/frugalaichallenge-text-train")

# Access the training split
train_data = dataset["train"]

# Pick 30 random items
sampled_items = random.sample(list(train_data), 30)

# Print out a few samples to see the structure
for item in sampled_items[:10]: #prints 10 samples
    print(" Quote:", item["quote"])
    print(" Label:", item["label"])
    print()

 Quote: Gavin’s a great friend and a great talent,” Rebel Media founder Ezra Levant told Canadaland in an email. “We tried to keep him, but he was lured away by a major competitor that we just couldn’t outbid.
 Label: 0_not_relevant

 Quote: Because it’s getting warmer, there’s more CO2 coming out which means it’s going to get warmer which means there’s more CO2 coming out[…] And it will just run away with itself.”
 Label: 2_not_human

 Quote: We had a Medieval warm period. We had a little ice age. Now the little ice age is over and it’s getting warming. It’s not surprising; you end an ice age, temperatures do get warmer. Climate change is standard. The concern should really be: in the past, mankind adjusted to climate change. We need to adjust to the change in climate. […] We need to be thinking in terms of shifting with the climate, because the climate shifts. The climate changes.
 Label: 2_not_human

 Quote: Liberals know very well..the sun solar flares are at the end of a hot cycle

In [15]:
import dspy


class ClimateViewClassifier(dspy.Signature):
    """Classify a quote or claim about climate change using one of the following labels:

    Not-Relevant: No relevant claim detected or claims that don't fit other categories

    Not-Happening: Claims denying the occurrence of global warming and its effects - Global warming is not happening. Climate change is NOT leading to melting ice (such as glaciers, sea ice, and permafrost), increased extreme weather, or rising sea levels. Cold weather also shows that climate change is not happening

    Not-Human: Claims denying human responsibility in climate change - Greenhouse gases from humans are not causing climate change.

    Not-Bad: Claims minimizing or denying negative impacts of climate change - The impacts of climate change will not be bad and might even be beneficial.

    Solutions-Harmful-Unnecessary: Claims against climate solutions - Climate solutions are harmful or unnecessary

    Science-is-Unreliable: Claims questioning climate science validity - Climate science is uncertain, unsound, unreliable, or biased.

    Proponents-Biased: Claims attacking climate scientists and activists - Climate scientists and proponents of climate action are alarmist, biased, wrong, hypocritical, corrupt, and/or politically motivated.

    Fossil-Fuels-Needed: Claims promoting fossil fuel necessity - We need fossil fuels for economic growth, prosperity, and to maintain our standard of living.
    """

    quote = dspy.InputField(desc="Quote or claim about climate change")
    label = dspy.OutputField(
        desc="Label of the quote",
        choices=["Not-Relevant","Not-Happening","Not-Human","Not-Bad","Solutions-Harmful-Unnecessary","Science-is-Unreliable","Proponents-Biased","Fossil-Fuels-Needed"],
    )

# Configure DSPy to use an OpenAI model with the API key loaded above
lm = dspy.LM("openai/gpt-4.1-mini", api_key=api_key)
dspy.configure(lm=lm)

# Instantiate classifier
classifier = dspy.Predict(signature=ClimateViewClassifier)

# Run predictions on our 30 samples
for item in sampled_items:
    quote = item["quote"]
    true_label = item["label"]
    pred = classifier(quote=quote)
    print("Text:", quote.replace("\n", " "))
    print("True:", true_label)
    print("Pred:", pred.label)
    print("-" * 40)


Text: Gavin’s a great friend and a great talent,” Rebel Media founder Ezra Levant told Canadaland in an email. “We tried to keep him, but he was lured away by a major competitor that we just couldn’t outbid.
True: 0_not_relevant
Pred: Not-Relevant
----------------------------------------
Text: Because it’s getting warmer, there’s more CO2 coming out which means it’s going to get warmer which means there’s more CO2 coming out[…] And it will just run away with itself.”
True: 2_not_human
Pred: Not-Human
----------------------------------------
Text: We had a Medieval warm period. We had a little ice age. Now the little ice age is over and it’s getting warming. It’s not surprising; you end an ice age, temperatures do get warmer. Climate change is standard. The concern should really be: in the past, mankind adjusted to climate change. We need to adjust to the change in climate. […] We need to be thinking in terms of shifting with the climate, because the climate shifts. The climate changes.

# Follow-up task(s)

1.   Is the model accurate?
2.   When the model disagrees with the dataset annotation, is the model actually wrong?
3.   If you change the model from gpt-4.1-mini to gpt-4.1, is there any difference? (Do not rerun the whole notebook, so that you keep your random sample.)
4.   Think of another classification scheme for these claims and try to modify the DSPy code to implement it. Did it work?
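
As a starting point for the accuracy question, here is a minimal sketch of one way to compare the two label styles seen in the output above (`0_not_relevant` vs. `Not-Relevant`). The helpers `normalize_label` and `accuracy` are hypothetical names introduced here. The normalization assumes a dataset label and a model label differ only by a numeric prefix, case, and hyphens vs. underscores; any category that is worded differently in the dataset would need an explicit mapping.

```python
import re


def normalize_label(label: str) -> str:
    """Reduce a dataset or model label to a comparable key.

    Assumption: labels differ only by a leading "<digits>_" prefix,
    letter case, and "-" vs "_".
    """
    key = label.strip().lower().replace("-", "_")
    return re.sub(r"^\d+_", "", key)


def accuracy(pairs):
    """Fraction of (true_label, predicted_label) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(normalize_label(t) == normalize_label(p) for t, p in pairs)
    return hits / len(pairs)


# The two label pairs visible in the output above
example_pairs = [("0_not_relevant", "Not-Relevant"), ("2_not_human", "Not-Human")]
print(accuracy(example_pairs))
```

In the notebook, the pairs could be collected inside the prediction loop as `(item["label"], pred.label)`. Keep in mind that a mismatch does not by itself say which side is wrong.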
