# LLM for stance detection


In this notebook, we will use Large Language Models (LLMs) to perform stance detection, that is, to classify the stance of a given text towards a given target. As I mentioned, LLMs can perform this task with few-shot or even zero-shot learning. I will provide a zero-shot example in this notebook.

I will demonstrate how to use LLMs for stance detection with the `transformers` library, which provides a simple API for applying LLMs to various NLP tasks. We will use the `google/gemma-3-12b-it` model for this task.

We will use the `AutoModelForCausalLM` class to load the model and the `AutoTokenizer` class to load the tokenizer. A causal language model generates the next token given the previous tokens, and the tokenizer converts text into the tokens that are fed into the model.
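
To make "generate the next token given the previous tokens" concrete, here is a toy sketch of greedy autoregressive decoding. The `NEXT_TOKEN_SCORES` table and the `greedy_generate` helper are invented for illustration and have nothing to do with Gemma's real weights. A real model also conditions on the whole prefix, not only on the last token as this toy does.

```python
# Toy next-token scores: for each previous token, a score per candidate next token.
# These numbers are made up for illustration only.
NEXT_TOKEN_SCORES = {
    "<bos>": {"Stance": 0.9, "The": 0.1},
    "Stance": {":": 1.0},
    ":": {"AGAINST": 0.6, "FAVOR": 0.3, "NONE": 0.1},
    "AGAINST": {"<eos>": 1.0},
    "FAVOR": {"<eos>": 1.0},
    "NONE": {"<eos>": 1.0},
    "The": {"<eos>": 1.0},
}


def greedy_generate(prefix, max_new_tokens=10):
    """Repeatedly append the highest-scoring next token until <eos> or the limit."""
    tokens = list(prefix)
    for _ in range(max_new_tokens):
        candidates = NEXT_TOKEN_SCORES[tokens[-1]]
        next_token = max(candidates, key=candidates.get)
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens


print(greedy_generate(["<bos>"]))
```

The `max_new_tokens` argument plays the same role here as the parameter of the same name that we pass to `model.generate` later in the notebook: it caps how many tokens are appended to the prefix.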


## Load the data


In [1]:
!nvidia-smi

Sat Mar 22 20:28:08 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.05              Driver Version: 560.35.05      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A40                     On  |   00000000:1D:00.0 Off |                    0 |
|  0%   26C    P8             20W /  300W |       1MiB /  46068MiB |      0%   E. Process |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [None]:
!pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126 
!pip3 install -U transformers
!pip3 install -U accelerate


In [1]:
import pandas as pd

stance = pd.read_csv("../../data/SemEval2016-testdata-taskA-all-annotations.csv")
# Remove the #SemST hashtag from each tweet (literal match, not a regex)
stance["Tweet"] = stance["Tweet"].str.replace("#SemST", "", regex=False)

## Load the model and tokenizer


In [2]:
# Load Huggingface Transformers library
# https://huggingface.co/transformers/
# clear jupyter notebook output
from IPython.display import clear_output
from transformers import AutoModelForCausalLM, AutoTokenizer

device = 'cuda'

model_name = "google/gemma-3-12b-it"  # LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)  # Load tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name, device_map=device)  # Load model
clear_output()

## Specify the prompt template


Before we format the prompt, we need to understand what the model's input should look like and why the prompt template is formatted this way.

When a Large Language Model is trained (here, "trained" refers specifically to the human-alignment stage), each training example contains both the input text and the target text, and the model must be able to distinguish the human input from the desired output. As a result, we will roughly see two types of prompt templates: one distinguishes only between the human input and the model output, while the other also provides a separate slot for the instruction. The only difference between the two is that the former treats the instruction as part of the input text, whereas the latter treats it as a separate entity.

According to the [technical report](https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf) provided by Google, Gemma 3 uses the first type of prompt template, so we will follow the same format. This is also why the system message ends up inside the user turn when we decode the tokenized prompt later in this notebook.

The prompt template should look like this:

```
[BOS]<start_of_turn>user
Who are you?<end_of_turn>
<start_of_turn>model
My name is Gemma!<end_of_turn>
<start_of_turn>user
What is 2+2?<end_of_turn>
<start_of_turn>model
```

Here, `<start_of_turn>` and `<end_of_turn>` are special tokens that mark the start and end of each turn, and `[BOS]` is the beginning-of-sequence token. `user` and `model` are role labels that indicate whether the content comes from the user or the LLM.
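
The turn structure above can be reproduced by hand with a few lines of string formatting. The sketch below is a hand-written approximation for illustration only: `render_gemma_style` is a hypothetical helper, and it assumes plain-string message contents and `<bos>` as the textual form of the beginning-of-sequence token. In practice the real template ships with the tokenizer and is applied by `tokenizer.apply_chat_template`, so you should not need to write this yourself.

```python
def render_gemma_style(messages):
    """Render a list of {"role", "content"} dicts into the turn format shown above.

    Hand-written approximation; the real chat template is stored with the tokenizer.
    """
    parts = ["<bos>"]
    for message in messages:
        parts.append(
            f"<start_of_turn>{message['role']}\n{message['content']}<end_of_turn>\n"
        )
    # Open a model turn so the LLM knows it is its turn to write.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


chat = [
    {"role": "user", "content": "Who are you?"},
    {"role": "model", "content": "My name is Gemma!"},
    {"role": "user", "content": "What is 2+2?"},
]
print(render_gemma_style(chat))
```

Ending the string with an open `model` turn is what the `add_generation_prompt=True` argument does when we call `apply_chat_template` later.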

Fortunately, for inference-only tasks (such as zero-shot or few-shot learning), we don't need to provide the model with the target text; only the input text is required. By using the tokenizer's `apply_chat_template` method from the `transformers` library, we can describe the input as a list of role/content messages, a format that is applied uniformly across different LLMs even if they use different chat templates.

To reuse the prompt template for different inputs, we will create an f-string that takes the tweet and the target (topic) as input and returns the formatted prompt. If you are not familiar with f-strings, you can think of them as strings that take variables as input and return the formatted string. More details can be found [here](https://realpython.com/python-f-strings/).
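
As a quick illustration of how an f-string lets one template serve many inputs, here is a minimal sketch. The `build_user_text` helper and its shortened wording are hypothetical and are not the prompt used in the rest of this notebook; the example tweets are placeholders.

```python
def build_user_text(topic, tweet):
    """Fill the topic and tweet into a reusable instruction string."""
    return (
        f"Topic: {topic}\n"
        f"Tweet: {tweet}\n"
        'Your response should be formatted as follows: "Stance: [FAVOR, AGAINST, NONE]"'
    )


# The same template is reused for every (topic, tweet) pair.
examples = [
    ("Atheism", "first example tweet"),
    ("Atheism", "second example tweet"),
]
for example_topic, example_tweet in examples:
    print(build_user_text(example_topic, example_tweet))
    print("---")
```

Wrapping the f-string in a function means the variables are filled in at call time, so the template can be applied to each row of the dataset without redefining it.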


Let's take an example to see what the prompt template looks like for the stance detection task.


In [9]:
topic = stance.loc[0, "Target"]
tweet = stance.loc[0, "Tweet"]

In [10]:
print(f"Topic: {topic}\nTweet: {tweet}\n")

Topic: Atheism
Tweet: dear lord thank u for all of ur blessings forgive my sins lord give me strength and energy for this busy day ahead #blessed #hope 



In [14]:
# Define the prompt template
prompt = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": f"""
            Instruction: You have assumed the role of a human annotator. In this task, you will be presented with a tweet, delimited by triple backticks, concerning the {topic}. Please make the following assessment:
            (1) Determine whether the tweet discusses the topic of the {topic}. If so, please indicate whether the Twitter user who posted the tweet favors, opposes, or has no opinion on the {topic}.
                                         
            Your response should be formatted as follows: "Stance: [FAVOR, AGAINST, NONE]"

            Tweet: ```{tweet}```"""}
        ]
    }
]



In [15]:
from pprint import pprint
pprint(f"Prompt: {prompt}\n")

("Prompt: [{'role': 'system', 'content': [{'type': 'text', 'text': 'You are a "
 "helpful assistant.'}]}, {'role': 'user', 'content': [{'type': 'text', "
 "'text': '\\n            Instruction: You have assumed the role of a human "
 'annotator. In this task, you will be presented with a tweet, delimited by '
 'triple backticks, concerning the Atheism. Please make the following '
 'assessment:\\n            (1) Determine whether the tweet discusses the '
 'topic of the Atheism. If so, please indicate whether the Twitter user who '
 'posted the tweet favors, opposes, or has no opinion on the '
 'Atheism.\\n                                         \\n            Your '
 'response should be formatted as follows: "Stance: [FAVOR, AGAINST, '
 'NONE]"\\n\\n            Tweet: ```dear lord thank u for all of ur blessings '
 'forgive my sins lord give me strength and energy for this busy day ahead '
 "#blessed #hope ```'}]}]\n")


## LLM inference


### Tokenize the input text


Since we already have the raw input text, we will need to transform it into the format that the model can understand. We will use the `AutoTokenizer` class to convert the input text into tokens that can be fed into the model. The `AutoTokenizer` class will automatically select the appropriate tokenizer for the model.

For more information on how tokenizers work and how to use the `AutoTokenizer` class, you can refer to the [official documentation](https://huggingface.co/transformers/model_doc/auto.html#autotokenizer).


In [20]:
# Tokenize the prompt
inputs = tokenizer.apply_chat_template(prompt, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)
# Take a look at the tokens
print(f"Tokens: {inputs}\n")

Tokens: tensor([[     2,    105,   2364,    107,   3048,    659,    496,  11045,  16326,
         236761,    108,  74279, 236787,   1599,    735,  12718,    506,   3853,
            529,    496,   3246,  24740,   1277, 236761,    799,    672,   4209,
         236764,    611,    795,    577,   6212,    607,    496,  21866, 236764,
         215131,    684,  22178,   1063,  81982, 236764,  13899,    506, 234681,
           1929, 236761,   7323,   1386,    506,   2269,  10834, 236787,    107,
            148, 236769, 236770, 236768,  32814,   3363,    506,  21866,  39817,
            506,  10562,    529,    506, 234681,   1929, 236761,   1637,    834,
         236764,   5091,  10128,   3363,    506,   9526,   2430,   1015,  12551,
            506,  21866,  69656, 236764, 122139, 236764,    653,    815,    951,
           8737,    580,    506, 234681,   1929, 236761,    107,    167,    146,
            107,    148,  11069,   3072,   1374,    577,  45047,    618,   5238,
         236787,    

In [26]:
# Note: the tokenized prompt can always be decoded back to the original prompt
# Decode the tokenized prompt
decoded_prompt = tokenizer.decode(inputs[0])
print(decoded_prompt)

<bos><start_of_turn>user
You are a helpful assistant.

Instruction: You have assumed the role of a human annotator. In this task, you will be presented with a tweet, delimited by triple backticks, concerning the Atheism. Please make the following assessment:
            (1) Determine whether the tweet discusses the topic of the Atheism. If so, please indicate whether the Twitter user who posted the tweet favors, opposes, or has no opinion on the Atheism.
                                         
            Your response should be formatted as follows: "Stance: [FAVOR, AGAINST, NONE]"

            Tweet: ```dear lord thank u for all of ur blessings forgive my sins lord give me strength and energy for this busy day ahead #blessed #hope ```<end_of_turn>
<start_of_turn>model



### Feed the input text (tokens) into the model


In [27]:
outputs = model.generate(inputs, max_new_tokens=20)  # Generate the model output
# Decode the generated output
generated_output = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
# Print the generated output and compare with human annotation
print(
    f"Generated Output: {generated_output}\nHuman annotation: {stance.loc[0,'Stance']}"
)

Generated Output: Stance: AGAINST
Human annotation: AGAINST
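
Because the prompt asks for a fixed `Stance: ...` format, the generated text can be mapped back to a label programmatically. The following is a minimal sketch; `parse_stance` and `VALID_STANCES` are hypothetical names that are not part of the original notebook or of `transformers`, and the regular expression is only one of several reasonable ways to do this.

```python
import re

VALID_STANCES = ("FAVOR", "AGAINST", "NONE")

# Accept "Stance: AGAINST" as well as "Stance: [AGAINST]", in any letter case.
STANCE_PATTERN = re.compile(
    r"Stance:\s*\[?\s*(" + "|".join(VALID_STANCES) + r")\b",
    flags=re.IGNORECASE,
)


def parse_stance(generated_text):
    """Return the stance label found in the text, or None if no label is present."""
    match = STANCE_PATTERN.search(generated_text)
    return match.group(1).upper() if match else None


print(parse_stance("Stance: AGAINST"))
print(parse_stance("I am not sure."))
```

Returning `None` for unparseable outputs, rather than guessing a label, keeps malformed generations visible so they can be counted or inspected separately when comparing against the human annotations.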


There are several parameters that we can use to control the output of the model. We used the `max_new_tokens` parameter of `generate` to limit the number of newly generated tokens. Earlier, when tokenizing the prompt, we also set the `return_tensors` parameter to `pt` so that the tokenizer returns PyTorch tensors.

Besides these parameters, we can use `temperature` to control the randomness of the output, `top_k` and `top_p` to control its diversity, and `num_return_sequences` to control the number of output sequences. Note that the sampling parameters (`temperature`, `top_k`, `top_p`) only take effect when sampling is enabled with `do_sample=True`.

There is a great explanation of the temperature parameter (which you will also use with OpenAI's models) in this [blog post](https://lukesalamone.github.io/posts/what-is-temperature/).
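
To build some intuition for these parameters, here is a simplified, pure-Python sketch of temperature scaling and top-k/top-p filtering on a toy list of logits. The function names and the toy logits are made up for illustration; this is not how `transformers` implements sampling internally, and the library's own implementation differs in its details.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_top_p_filter(probs, top_k=None, top_p=None):
    """Keep the top_k most likely tokens and/or the smallest set whose
    cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
        order = kept
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}


toy_logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(toy_logits, t)])
print(top_k_top_p_filter(softmax_with_temperature(toy_logits), top_k=2))
```

Running the loop shows the most likely token taking a larger share of the probability as the temperature decreases, which is why low temperatures give more deterministic outputs.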

Feel free to play around with these parameters to see how they affect the output of the model.
