# Political DEBATE: Zero-shot NLI Classification

#### This tutorial demonstrates zero-shot classification with the DEBATE models. This includes:
1. How to download the models from the Hugging Face Hub.
2. How to pass the models to the GPU for accelerated classification.
3. How to use the NLI classification framework and the Transformers pipeline.

The Transformers library provides access to pre-trained language models as well as an easy-to-use pipeline for classification.

Read the [Transformers documentation](https://huggingface.co/docs/transformers/index)

Explore the [repository of pre-trained models](https://huggingface.co/models)

#### Requirements:
1. A very basic understanding of Python.
2. Access to a GPU is beneficial, but not necessarily required for smaller data sets. Free services like Google Colab can be used if you don't have a desktop GPU.

In [1]:
# Install the necessary libraries. In Google Colab only transformers and datasets need to be installed. If you are running this script locally you may need to install other libraries as well (for example pandas, torch, and scikit-learn, which are imported below).
# If you're using a MacBook's integrated GPU, make sure to install the "accelerate" library as well.
!pip install transformers datasets



In [2]:
import pandas as pd
import torch
from datasets import load_dataset
from transformers import pipeline
from sklearn.metrics import matthews_corrcoef

For this example we will use a random sample from the PolNLI test set. We can download it directly from the Hugging Face Hub and then subset it. The 'premise' column in the dataset is the document we're going to classify. The 'hypothesis' is a statement that the model judges to be true or false given the contents of the premise. If the hypothesis is true, the 'entailment' label is 0. If it is not true, the label is 1.

In [3]:
ds = load_dataset("mlburnham/Pol_NLI")
test = ds['test'].to_pandas()
# we'll use a random sample of 1,000 documents for this example
test = test[['premise', 'hypothesis', 'entailment', 'task']].sample(1000, random_state = 1)
test.head()


Unnamed: 0,premise,hypothesis,entailment,task
2973,The soldiers storming the beaches on D-Day may...,This text advocates for violence against jews.,0,hatespeech and toxicity
2593,Regime warplanes and helicopters targeted Al-L...,This text is about a protest with excessive fo...,1,event extraction
6737,rt @scottwalker first up this morning at #ncsc...,The author of this text opposes jim jordan.,1,stance detection
14940,"With protection from the Taliban, al Qaeda and...",This text is about terrorism.,0,topic classification
4088,@aiyegbayo @KadariaAhmed LOL what nigerian pro...,This text advocates for violence against peopl...,1,hatespeech and toxicity


The Transformers library offers a simple pipeline we can use to classify the data. All we need to do is specify the task and the model we will use. More information on the model can be found on its [Hugging Face page](https://huggingface.co/mlburnham/Political_DEBATE_base_v1.0).

We want to make sure we are using a GPU for fast inference, so here we quickly check whether one is available. If you're running this on a discrete NVIDIA GPU, leave the code as is. If you're running this on a MacBook with an integrated GPU, change "cuda" to "mps" and replace the `torch.cuda.is_available()` check with `torch.backends.mps.is_available()`. If this prints 'cpu' rather than 'cuda' or 'mps', no GPU was detected. If you're using a Colab notebook, make sure that the runtime is set to GPU. You can change this at the top by selecting a GPU under the "change runtime type" option.

In [4]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {device}")

Device: cuda
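
The check above only looks for CUDA. A minimal sketch of a check that also covers Apple's MPS backend is shown below. `pick_device` is a hypothetical helper name, not part of PyTorch, and the `importlib` guard is only there so the snippet still runs where torch is not installed.

```python
import importlib.util


def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


if importlib.util.find_spec("torch") is not None:
    import torch

    device = pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
else:
    device = "cpu"  # torch is not installed, so only the CPU is usable

print(f"Device: {device}")
```

The resulting `device` string can be passed to `pipeline(...)` in the same way as the variable defined above.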


Here we instantiate our classifier using the pipeline class. This is a fast and easy way to use any model on the Hugging Face Hub. We specify the device using the variable we defined above to make sure the model uses the GPU, and define the number of documents passed through the model at a time with the batch size variable. Lower batch sizes will take longer to classify, but higher batch sizes require more GPU memory.

In [5]:
pipe = pipeline("zero-shot-classification", model='mlburnham/Political_DEBATE_base_v1.0', device = device, batch_size = 32) # To use the base model
#pipe = pipeline("zero-shot-classification", model='mlburnham/Political_DEBATE_large_v1.0', device = device, batch_size = 32) # To use the large model

In our test set each document is paired with a different hypothesis. NLI classifiers work by pairing documents with "hypotheses" and determining if the hypothesis is true given the information in the text. To ensure that each document is paired with the correct hypothesis the code below will loop through each row of the dataframe, pairing documents with their associated hypothesis and then classifying one at a time. This is slower because it doesn't take advantage of batching, which classifies multiple documents in parallel. If all documents will be classified with the same hypothesis or set of hypotheses, or if you can group documents together by hypotheses, see the batching section for faster inference.

## Inference with a for loop for when each document has a unique hypothesis.

In [6]:
colname = 'debate_label' # the name of the column where we will store the predicted labels
test[colname] = 0

for i in test.index:
    hypothesis = test.loc[i, 'hypothesis']
    sample = test.loc[i, 'premise']
    res = pipe(sample, hypothesis, hypothesis_template = '{}', multi_label = False)
    test.loc[i, colname] = round(res['scores'][0])
# flip the values so they match the dataset's convention (0 = entailment, 1 = not entailment)
test[colname] = test[colname].replace({0:1, 1:0})
test[colname] = test[colname].astype(int)

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset

In [7]:

# Check the results
matthews_corrcoef(test['entailment'], test['debate_label'])

0.9004438415985251
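
When several rows share a hypothesis, the documents can be grouped by hypothesis so that each group is classified in one batched call rather than row by row. The sketch below is one possible way to do this, not the author's method. `classify_grouped` and `fake_classifier` are hypothetical names, and the stand-in classifier returns made-up scores purely so the example runs without downloading a model.

```python
from collections import defaultdict


def classify_grouped(premises, hypotheses, classifier, threshold=0.5):
    """Classify premise/hypothesis pairs, batching premises that share a hypothesis.

    `classifier` is assumed to behave like the zero-shot pipeline: called with a
    list of documents and a list of candidate labels, it returns one dict per
    document whose 'scores' list holds the entailment probability.
    Returns labels in the dataset's convention (0 = entailed, 1 = not entailed).
    """
    groups = defaultdict(list)
    for position, hypothesis in enumerate(hypotheses):
        groups[hypothesis].append(position)

    labels = [None] * len(premises)
    for hypothesis, positions in groups.items():
        docs = [premises[p] for p in positions]
        # Pass the hypothesis as a one-element list so commas inside it are
        # not treated as separators between several candidate labels.
        results = classifier(docs, [hypothesis], hypothesis_template="{}", multi_label=False)
        # Guard in case a single result comes back as a bare dict, not a list.
        if isinstance(results, dict):
            results = [results]
        for position, result in zip(positions, results):
            labels[position] = 0 if result["scores"][0] >= threshold else 1
    return labels


# Stand-in classifier with made-up scores, used only to show the call pattern.
def fake_classifier(docs, candidate_labels, hypothesis_template="{}", multi_label=False):
    keyword = candidate_labels[0].rstrip(".").split()[-1]
    return [
        {"sequence": doc, "labels": candidate_labels, "scores": [0.9 if keyword in doc else 0.1]}
        for doc in docs
    ]


premises = [
    "Congress raised taxes again.",
    "The match was great for sports fans.",
    "The weather was mild.",
]
hypotheses = [
    "This text is about taxes.",
    "This text is about sports.",
    "This text is about taxes.",
]
print(classify_grouped(premises, hypotheses, fake_classifier))  # [0, 0, 1]
```

With the real classifier, the equivalent call would be `classify_grouped(list(test['premise']), list(test['hypothesis']), pipe)`.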

## Batched inference for when all documents are classified with the same hypotheses.

For batched inference we create a list of all the documents we want to classify, a template for the hypothesis that each document will be paired with, and then a list of possible labels. The {} in the hypothesis template will be populated with each string in the list of labels.

In [18]:
samples = list(test['premise'])
template = 'The author of this tweet {} Trump.'
# candidate labels that will fill the {} slot in the template
labels = ['supports', 'opposes', 'is neutral towards']

Now we classify the data by passing our documents, labels, and template to the classifier. The model will pair each document with each of the three hypotheses:
* The author of this tweet supports Trump.
* The author of this tweet opposes Trump.
* The author of this tweet is neutral towards Trump.

It will then determine the probability that each hypothesis is true given the document. The assigned label will be the hypothesis that is most likely to be true.
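
To make the mechanics concrete, here is a small sketch of how the template is expanded and how per-hypothesis scores are turned into a label. The logits are made-up numbers for a single document, and the normalization follows the behavior the Transformers documentation describes for the `multi_label` argument. It is an illustration, not the pipeline's actual code.

```python
import math

template = "The author of this tweet {} Trump."
labels = ["supports", "opposes", "is neutral towards"]

# Each label is substituted into the template to form one hypothesis.
hypotheses = [template.format(label) for label in labels]

# Made-up (entailment, not-entailment) logits for one document,
# one pair per hypothesis. A real model would produce these.
logits = [(2.0, -1.0), (-0.5, 1.5), (0.3, 0.1)]

# multi_label=False: softmax over the entailment logits of all hypotheses,
# so the scores compete with each other and sum to 1.
exp_entail = [math.exp(entail) for entail, _ in logits]
single_label_scores = [e / sum(exp_entail) for e in exp_entail]

# multi_label=True: each hypothesis is scored independently by comparing
# its entailment logit with its not-entailment logit.
multi_label_scores = [
    math.exp(entail) / (math.exp(entail) + math.exp(other))
    for entail, other in logits
]

# The assigned label is the one with the highest single-label score.
best = labels[single_label_scores.index(max(single_label_scores))]

for hypothesis in hypotheses:
    print(hypothesis)
print("Assigned label:", best)
```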

In [19]:
# classify the documents
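# The multi_label option determines if more than one hypothesis can be true for the document.
# If False, the scores are normalized across labels so they sum to 1. If True, each label is scored independently.
# In both cases each result is a dictionary with the labels sorted from most to least probable, along with their scores.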
res = pipe(samples, labels, hypothesis_template = template, multi_label = False)
# return the most probable label and add it to our data frame
test['debate_label'] = [label['labels'][0] for label in res]
test.head()

Unnamed: 0,premise,hypothesis,entailment,dataset,task,augmented_hypothesis,debate_label
2973,The soldiers storming the beaches on D-Day may...,This text advocates for violence against jews.,0,mlburnham/violent_hatespeech_entailment,hatespeech and toxicity,This text promotes violence against Jewish peo...,supports
2593,Regime warplanes and helicopters targeted Al-L...,This text is about a protest with excessive fo...,1,mlburnham/acled_event_entailment,event extraction,This article covers a protest that involved th...,is neutral towards
6737,rt @scottwalker first up this morning at #ncsc...,The author of this text opposes jim jordan.,1,mlburnham/PoliStance_Affect,stance detection,The author of this text opposes jim jordan.,is neutral towards
14940,"With protection from the Taliban, al Qaeda and...",This text is about terrorism.,0,mlburnham/dem_rep_party_platform_topics,topic classification,this text is about terrorism.,is neutral towards
4088,@aiyegbayo @KadariaAhmed LOL what nigerian pro...,This text advocates for violence against peopl...,1,mlburnham/violent_hatespeech_entailment,hatespeech and toxicity,This text supports violent actions towards ind...,is neutral towards


Labels are returned as plain text, so we now recode them to binary labels to evaluate classification performance.

In [None]:
# recode the text labels to binary labels
test['debate_label'] = test['debate_label'].map({'supports': 1, 'opposes': 0, 'is neutral towards': 0})

In [None]:
matthews_corrcoef(test['entailment'], test['debate_label'])
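
Both evaluation cells rely on scikit-learn's `matthews_corrcoef`. For readers unfamiliar with the metric, the sketch below computes the binary case directly from the confusion matrix. `mcc_binary` is a hypothetical helper written for illustration, the labels are made-up toy data, and scikit-learn's function should be preferred in practice.

```python
import math


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when the coefficient is undefined (an empty row or column).
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy labels: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(mcc_binary(y_true, y_pred))
```

The coefficient ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect agreement), and it is unchanged if the 0 and 1 codes are swapped in both columns.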