# 60 · Tweet-Stance Classification Demo  
_Last updated: 2025-05-03_

Workflow today  

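1. **Load** a table of **12 tweets** spanning three topics, sprinkled with sarcasm, emojis, and unrelated chatter.  
2. **Label** each tweet four times (temperatures 0 / 0.5 / 1 / 1.5); the labelling cell also runs an extra pass at temperature 2.0 that the vote does not use.  
3. **Aggregate** with a modal vote and compute a tiny agreement metric (1.0, 0.66, 0.33 or 0.0, depending on how many distinct labels the four passes produce).  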
4. **Explore**: confusion matrix, temperature impact, and a quick few-shot variant.

Everything runs live in < 90 seconds; scaling is just a matter of using the Batch API.
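
As a rough illustration of that scaling step, the sketch below writes one request per tweet in the JSONL shape the OpenAI Batch API expects (`custom_id`, `method`, `url`, `body`). The helper name `build_batch_lines`, the placeholder system prompt and the two sample rows are assumptions made for illustration; uploading the file and creating the batch job are left out.

```python
import json

def build_batch_lines(rows, system_prompt, model="gpt-4o-mini", temperature=0.0):
    """Return one JSONL string per tweet (hypothetical helper, not part of the notebook)."""
    lines = []
    for row in rows:
        request = {
            # custom_id lets you match each result back to its tweet and temperature
            "custom_id": f"{row['status_id']}_t{temperature}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "temperature": temperature,
                "max_tokens": 15,
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user",
                     "content": f"Topic: {row['topic']}\nTweet: {row['text']}"},
                ],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines

# Illustrative rows only; in the notebook these would come from the tweet table
sample_rows = [
    {"status_id": "w1", "topic": "Women's Right",
     "text": "Equal pay for equal work should be a given."},
    {"status_id": "c2", "topic": "Climate Action",
     "text": "Climate change is exaggerated to justify taxes."},
]
batch_lines = build_batch_lines(sample_rows, "Classify the stance.")
print("\n".join(batch_lines))
```

Each line is an independent request, so the same builder can be called once per temperature and the outputs concatenated into one input file.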



### If you use this code, please cite the paper: 

- Garg, P. and Fetzer, T., 2025. **Political expression of academics on Twitter**. Nature Human Behaviour. DOI: 10.1038/s41562-025-02199-1

(The above paper is forthcoming. Pre-print available at [https://www.researchsquare.com/article/rs-4480504/v1](https://www.researchsquare.com/article/rs-4480504/v1).)



## API key

* Checks `OPENAI_API_KEY` in the environment.  
* Else reads `key/openai_key.txt`.  
* Raises an error if missing (same as earlier notebooks).


In [1]:
# %pip -q install --upgrade openai python-dotenv pandas

import os, pathlib, json, random, pandas as pd
from openai import OpenAI


# Fall back to a key file when the environment variable is not set
key_path = pathlib.Path("key/openai_key.txt")
if os.getenv("OPENAI_API_KEY") is None and key_path.exists():
    os.environ["OPENAI_API_KEY"] = key_path.read_text().strip()

if not os.getenv("OPENAI_API_KEY"):
    raise ValueError("Supply OPENAI_API_KEY in env or create key/openai_key.txt")

client = OpenAI()


## 1 · Craft a toy—but tricky—tweet table

* One clear support, one clear opposition, **one neutral** and **one unrelated** per topic.  
* Some contain hashtags, emojis, sarcasm, or mixed sentiments to test the classifier.


In [2]:
random.seed(42)
data = [
    # Women's rights
    dict(status_id="w1", topic="Women's Right",
         text="Equal pay for equal work should be a given."),
    dict(status_id="w2", topic="Women's Right",
         text="The 'wage gap' myth again 🙄 Everyone’s tired of this false narrative."),
    dict(status_id="w3", topic="Women's Right",
         text="Interesting report on workforce participation (no stance)."),
    dict(status_id="w4", topic="Women's Right",
         text="🏀 Can't wait for tonight's WNBA finals!"),  # unrelated
    # Climate action
    dict(status_id="c1", topic="Climate Action",
         text="Renewables are the future—let's invest more in solar!"),
    dict(status_id="c2", topic="Climate Action",
         text="Climate change is exaggerated to justify taxes."),
    dict(status_id="c3", topic="Climate Action",
         text="Not sure what to think about carbon offsets 🤔"),
    dict(status_id="c4", topic="Climate Action",
         text="Just tried Ozone coffee ☕️🔥 Love the flavor!"),  # unrelated
    # Elon Musk
    dict(status_id="e1", topic="Elon Musk",
         text="Musk's innovations are incredible; can't wait for Starship 🚀"),
    dict(status_id="e2", topic="Elon Musk",
         text="He manipulates markets with a single tweet—dangerous."),
    dict(status_id="e3", topic="Elon Musk",
         text="¯\\_(ツ)_/¯ Musk being Musk, guess we'll see what happens."),
    dict(status_id="e4", topic="Elon Musk",
         text="Looking for a good musky aftershave recs?"),  # unrelated
]
df = pd.DataFrame(data)
df


| | status_id | topic | text |
|---|---|---|---|
| 0 | w1 | Women's Right | Equal pay for equal work should be a given. |
| 1 | w2 | Women's Right | The 'wage gap' myth again 🙄 Everyone’s tired o... |
| 2 | w3 | Women's Right | Interesting report on workforce participation ... |
| 3 | w4 | Women's Right | 🏀 Can't wait for tonight's WNBA finals! |
| 4 | c1 | Climate Action | Renewables are the future—let's invest more in... |
| 5 | c2 | Climate Action | Climate change is exaggerated to justify taxes. |
| 6 | c3 | Climate Action | Not sure what to think about carbon offsets 🤔 |
| 7 | c4 | Climate Action | Just tried Ozone coffee ☕️🔥 Love the flavor! |
| 8 | e1 | Elon Musk | Musk's innovations are incredible; can't wait ... |
| 9 | e2 | Elon Musk | He manipulates markets with a single tweet—dan... |


## 2 · Single-call (zero-shot) classification helper


In [3]:
stance_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "tweet_stance",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "stance": {"type": "string",
                           "enum": ["pro", "anti", "neutral", "unrelated"]}
            },
            "required": ["stance"],
            "additionalProperties": False
        }
    }
}

system_prompt = (
    "For the provided topic and tweet, classify the writer's stance as:\n"
    "• pro (supports)\n"
    "• anti (opposes)\n"
    "• neutral (mentions without clear stance)\n"
    "• unrelated (tweet not about the topic)\n"
    "Return JSON {\"stance\": pro|anti|neutral|unrelated} **only**."
)

def classify_once(topic: str, tweet: str,
                  temp: float = 0.3,
                  model: str = "gpt-4o-mini"):
    """Call the LLM once; return stance label."""
    user = f"Topic: {topic}\nTweet: {tweet}"
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user",   "content": user}
        ],
        temperature=temp,
        max_tokens=15,
        response_format=stance_schema
    )
    return json.loads(resp.choices[0].message.content)["stance"]
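
The helper above calls `json.loads` directly, which raises if a reply is cut off by `max_tokens` or the content is missing. A defensive parser along the following lines is one way to guard against that. The name `parse_stance` and the choice to return `None` for unusable replies are assumptions for this sketch, not part of the original helper.

```python
import json

VALID_STANCES = {"pro", "anti", "neutral", "unrelated"}

def parse_stance(raw):
    """Return the stance label from a raw JSON string, or None if it is unusable."""
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Truncated or malformed JSON, or raw is None
        return None
    if not isinstance(payload, dict):
        return None
    stance = payload.get("stance")
    # Accept only the four labels listed in the schema's enum
    if isinstance(stance, str) and stance in VALID_STANCES:
        return stance
    return None

print(parse_stance('{"stance": "anti"}'))   # anti
print(parse_stance('{"stance": "unrel'))    # None (truncated reply)
print(parse_stance(None))                   # None (no content)
```

Rows that come back as `None` can then be retried or dropped before the vote, rather than stopping the whole run.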


## 3 · Four passes (temperature 0, 0.5, 1, 1.5)


In [4]:
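# Five temperature settings. The first four (0 to 1.5) feed the vote in section 4;
# the pass at 2.0 is stored as stance_5 but is not used in the vote.
temps = [0.0, 0.5, 1.0, 1.5, 2.0]

# Classify every tweet once per temperature; pass i is stored in column stance_i
for idx, t in enumerate(temps, start=1):
    col = f"stance_{idx}"
    df[col] = df.apply(lambda r: classify_once(r.topic, r.text, temp=t), axis=1)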

df


| | status_id | topic | text | stance_1 | stance_2 | stance_3 | stance_4 | stance_5 |
|---|---|---|---|---|---|---|---|---|
| 0 | w1 | Women's Right | Equal pay for equal work should be a given. | pro | pro | pro | pro | pro |
| 1 | w2 | Women's Right | The 'wage gap' myth again 🙄 Everyone’s tired o... | anti | anti | anti | anti | anti |
| 2 | w3 | Women's Right | Interesting report on workforce participation ... | neutral | neutral | neutral | neutral | neutral |
| 3 | w4 | Women's Right | 🏀 Can't wait for tonight's WNBA finals! | unrelated | unrelated | unrelated | unrelated | unrelated |
| 4 | c1 | Climate Action | Renewables are the future—let's invest more in... | pro | pro | pro | pro | pro |
| 5 | c2 | Climate Action | Climate change is exaggerated to justify taxes. | anti | anti | anti | anti | anti |
| 6 | c3 | Climate Action | Not sure what to think about carbon offsets 🤔 | neutral | neutral | neutral | neutral | neutral |
| 7 | c4 | Climate Action | Just tried Ozone coffee ☕️🔥 Love the flavor! | unrelated | unrelated | unrelated | unrelated | unrelated |
| 8 | e1 | Elon Musk | Musk's innovations are incredible; can't wait ... | pro | pro | pro | pro | pro |
| 9 | e2 | Elon Musk | He manipulates markets with a single tweet—dan... | anti | anti | anti | anti | anti |



## 4 · Modal vote and agreement


In [5]:
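VOTE_COLS = [f"stance_{i}" for i in range(1, 5)]  # passes at temperature 0 to 1.5

def modal(row):
    """Most frequent label across the four passes; ties go to the alphabetically first label."""
    votes = row[VOTE_COLS].tolist()
    # Sorting the candidates makes the tie-break deterministic across runs
    return max(sorted(set(votes)), key=votes.count)

def agreement(row):
    """Score by number of distinct labels: 1 -> 1.0, 2 -> 0.66, 3 -> 0.33, 4 -> 0.0."""
    uniq = len(set(row[VOTE_COLS]))
    return {1: 1.0, 2: 0.66, 3: 0.33, 4: 0.0}[uniq]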

df["modal_stance"] = df.apply(modal, axis=1)
df["agree_score"] = df.apply(agreement, axis=1)

display(df[["status_id", "topic", "modal_stance", "agree_score"]])
print(f"Average agreement: {df.agree_score.mean():.2f}")


| | status_id | topic | modal_stance | agree_score |
|---|---|---|---|---|
| 0 | w1 | Women's Right | pro | 1.0 |
| 1 | w2 | Women's Right | anti | 1.0 |
| 2 | w3 | Women's Right | neutral | 1.0 |
| 3 | w4 | Women's Right | unrelated | 1.0 |
| 4 | c1 | Climate Action | pro | 1.0 |
| 5 | c2 | Climate Action | anti | 1.0 |
| 6 | c3 | Climate Action | neutral | 1.0 |
| 7 | c4 | Climate Action | unrelated | 1.0 |
| 8 | e1 | Elon Musk | pro | 1.0 |
| 9 | e2 | Elon Musk | anti | 1.0 |


Average agreement: 1.00
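
Because every tweet above received identical labels, the run never exercises the lower agreement scores. The standalone sketch below applies the same two rules to hand-made vote lists so the behaviour on disagreement is visible. The function names `modal_vote` and `agreement_score`, the alphabetical tie-break and the example votes are all assumptions for illustration.

```python
from collections import Counter

# Same mapping as the notebook: number of distinct labels -> score
AGREEMENT = {1: 1.0, 2: 0.66, 3: 0.33, 4: 0.0}

def modal_vote(votes):
    """Most frequent label; ties go to the alphabetically first label."""
    counts = Counter(votes)
    top = max(counts.values())
    return sorted(label for label, n in counts.items() if n == top)[0]

def agreement_score(votes):
    """Score four votes by how many distinct labels they contain."""
    return AGREEMENT[len(set(votes))]

examples = {
    "unanimous":     ["pro", "pro", "pro", "pro"],
    "one dissent":   ["pro", "pro", "pro", "neutral"],
    "even split":    ["anti", "neutral", "anti", "neutral"],
    "all different": ["pro", "anti", "neutral", "unrelated"],
}
for name, votes in examples.items():
    print(f"{name}: modal={modal_vote(votes)}, agreement={agreement_score(votes)}")
# unanimous: modal=pro, agreement=1.0
# one dissent: modal=pro, agreement=0.66
# even split: modal=anti, agreement=0.66
# all different: modal=anti, agreement=0.0
```

Note that a 3-1 vote and a 2-2 split both score 0.66, because the metric counts distinct labels rather than the share held by the winning label.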


## 5 · Temperature impact — crude confusion matrix

How often does **temp 0** disagree with **temp 1.5**?


In [8]:
crosstab = pd.crosstab(df["stance_1"], df["stance_4"], rownames=["temp 0"], colnames=["temp 1.5"])
crosstab

# also add the fifth stance
# df["stance_5"] = df.apply(lambda r: classify_once(r.topic, r.text, temp=0.0), axis=1)
# df["modal_stance"] = df.apply(modal, axis=1)

| temp 0 ↓ / temp 1.5 → | anti | neutral | pro | unrelated |
|---|---|---|---|---|
| anti | 3 | 0 | 0 | 0 |
| neutral | 0 | 3 | 0 | 0 |
| pro | 0 | 0 | 3 | 0 |
| unrelated | 0 | 0 | 0 | 3 |
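
The question of how often two passes disagree can also be answered with a single number. The sketch below computes a disagreement rate and lists the rows that changed, using a small made-up table in place of the notebook's `df`. The toy labels are invented for illustration and are not results from the run above.

```python
import pandas as pd

# Made-up labels: the third tweet flips between the two passes
toy = pd.DataFrame({
    "status_id": ["a1", "a2", "a3", "a4"],
    "stance_1":  ["pro", "anti", "neutral", "unrelated"],   # temp 0
    "stance_4":  ["pro", "anti", "pro", "unrelated"],       # temp 1.5
})

# Boolean mask of rows where the two passes differ
differs = toy["stance_1"] != toy["stance_4"]

disagreement_rate = differs.mean()   # share of rows that changed label
changed = toy[differs]               # the rows worth inspecting by hand

print(f"Disagreement rate: {disagreement_rate:.2f}")   # Disagreement rate: 0.25
print(changed)
```

Applied to the table above, the same mask would be all `False`, since the crosstab has no off-diagonal counts.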


## 6 · Quick extension — few-shot example (optional run)

Below we pass **two labeled examples** as user/assistant message pairs before the actual query.
Run on one tweet to see the effect.


In [16]:
few_shot_messages = [
    {"role":"system","content":system_prompt},
    # Two labeled examples, each a user message followed by the assistant's answer
    {"role":"user","content":"Topic: Climate Action\nTweet: Solar power is awesome #GoGreen"},
    {"role":"assistant","content":'{"stance":"pro"}'},
    {"role":"user","content":"Topic: Climate Action\nTweet: Climate hysteria is just a money grab."},
    {"role":"assistant","content":'{"stance":"anti"}'},
]

tweet_demo = "Electric cars are overrated hype!"
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=few_shot_messages + [
        {"role":"user","content":f"Topic: Climate Action\nTweet: {tweet_demo}"}
    ],
    temperature=0.3,
    max_tokens=15,
    response_format=stance_schema
)
print("Few-shot label ➜", json.loads(resp.choices[0].message.content)["stance"])


Few-shot label ➜ anti
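
If you want to vary the examples, the message list can be assembled from plain tuples instead of being written out by hand. The helper below is a sketch of that idea. The name `build_few_shot_messages`, the tuple layout and the placeholder system prompt are assumptions for illustration; the result can be passed as `messages` to the same chat-completions call used above.

```python
import json

def build_few_shot_messages(system_prompt, examples, topic, tweet):
    """Build system + (user, assistant) example pairs + the final user query."""
    messages = [{"role": "system", "content": system_prompt}]
    for ex_topic, ex_tweet, ex_label in examples:
        messages.append(
            {"role": "user", "content": f"Topic: {ex_topic}\nTweet: {ex_tweet}"}
        )
        # The assistant turn shows the model the exact JSON shape expected back
        messages.append(
            {"role": "assistant", "content": json.dumps({"stance": ex_label})}
        )
    messages.append({"role": "user", "content": f"Topic: {topic}\nTweet: {tweet}"})
    return messages

examples = [
    ("Climate Action", "Solar power is awesome #GoGreen", "pro"),
    ("Climate Action", "Climate hysteria is just a money grab.", "anti"),
]
demo_messages = build_few_shot_messages(
    "Classify the stance.",   # placeholder; the notebook would pass system_prompt
    examples,
    "Climate Action",
    "Electric cars are overrated hype!",
)
print([m["role"] for m in demo_messages])
# ['system', 'user', 'assistant', 'user', 'assistant', 'user']
```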


## 7 · Next-level ideas

* **Active learning** — sample tweets with low agreement and hand-label to build a gold set.  
* **Weighted voting** — give lower weight to temperature 1.5 labels.  
* **Topic-aware unrelated filter** — do a keyword pre-pass to skip obviously off-topic tweets.  
* **Explainability** — ask the model to provide a justification alongside the label, store it for audits.  
* **Bias check** — run the same pipeline on demographic-coded tweets (see Notebook 70 & 71) to test fairness.
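
As one possible starting point for the weighted-voting idea, the sketch below sums a weight per label instead of counting votes. The weighting rule `1 / (1 + temperature)`, the function name `weighted_vote` and the example votes are assumptions chosen for illustration, not a recommendation from the notebook.

```python
from collections import defaultdict

def weighted_vote(votes, weights):
    """Return the label with the largest summed weight; ties go alphabetically."""
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    best = max(totals.values())
    return sorted(label for label, total in totals.items() if total == best)[0]

temps = [0.0, 0.5, 1.0, 1.5]
# Assumed weighting: hotter passes count for less (1.0, 0.67, 0.5, 0.4)
weights = [1.0 / (1.0 + t) for t in temps]

# A 2-2 split: the two low-temperature passes outweigh the two hot ones
print(weighted_vote(["pro", "pro", "neutral", "neutral"], weights))      # pro

# A 3-1 majority still wins even when the dissent comes from temperature 0
print(weighted_vote(["pro", "neutral", "neutral", "neutral"], weights))  # neutral
```

With these weights a 2-2 split is resolved in favour of the cooler passes rather than by the alphabetical tie-break.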
