
# Project 3: Pragmatic Analysis Pipeline

**Course:** Natural Language Processing  
**Institution:** Addis Ababa University  
**Project:** Pragmatic Analysis Pipeline  

## Objective
This project implements a two-stage pragmatic analysis system:
1. **Speech Act Classification** (statement, question, directive)
2. **Natural Language Inference (NLI)** for factual verification of statements

Only utterances classified as *statements* are passed to the NLI module.


## Step 1: Clone the Project Repository

We clone the provided repository that contains the Speech Act + NLI baseline implementation.


In [None]:
!git clone https://github.com/TSION2121/pragma-SpeechActNLI.git

Cloning into 'pragma-SpeechActNLI'...
remote: Enumerating objects: 23, done.
remote: Counting objects: 100% (23/23), done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 23 (delta 8), reused 12 (delta 3), pack-reused 0 (from 0)
Receiving objects: 100% (23/23), 28.90 KiB | 5.78 MiB/s, done.
Resolving deltas: 100% (8/8), done.


In [None]:
!ls pragma-SpeechActNLI

demo.ipynb  README.md  requirements.txt  src


## Step 2: Install Dependencies

We install all NLP libraries required for:
- Transformer-based classification
- NLI inference
- Dataset handling


In [None]:
!pip install transformers datasets torch scikit-learn nltk




## Step 3: Import Required Python Libraries


In [None]:
import torch
import numpy as np
from transformers import (
    DistilBertTokenizerFast,
    DistilBertForSequenceClassification,
    pipeline
)
from sklearn.metrics import accuracy_score, classification_report


## Step 4: Speech Act Classification (Stage 1)

We classify utterances into three pragmatic classes:
- **statement**
- **question**
- **directive**

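DistilBERT with a three-way classification head is used for this task. The cell below loads the `distilbert-base-uncased` checkpoint, whose classification head is newly initialized, so the model must be fine-tuned on labeled speech act data before its predictions are meaningful.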


In [None]:
speech_act_model_name = "distilbert-base-uncased"

tokenizer = DistilBertTokenizerFast.from_pretrained(speech_act_model_name)

speech_act_model = DistilBertForSequenceClassification.from_pretrained(
    speech_act_model_name,
    num_labels=3
)


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


## Step 5: Define Speech Act Labels


In [None]:
label_map = {
    0: "statement",
    1: "question",
    2: "directive"
}


## Step 6: Speech Act Prediction Function

This function takes an utterance and returns:
- predicted speech act
- confidence score


In [None]:
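def predict_speech_act(text):
    # Tokenize a single utterance into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    speech_act_model.eval()  # disable dropout for deterministic inference
    with torch.no_grad():    # no gradients are needed at prediction time
        outputs = speech_act_model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)

    # The highest class probability serves as the confidence score.
    confidence, prediction = torch.max(probs, dim=1)
    return label_map[prediction.item()], confidence.item()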


## Step 7: Test Speech Act Classification


In [None]:
examples = [
    "Can you open the window?",
    "Dolphins are marine mammals.",
    "Please submit the assignment tomorrow."
]

for text in examples:
    act, conf = predict_speech_act(text)
    print(f"Input: {text}")
    print(f"Speech Act: {act} (confidence={conf:.2f})\n")


Input: Can you open the window?
Speech Act: statement (confidence=0.35)

Input: Dolphins are marine mammals.
Speech Act: statement (confidence=0.36)

Input: Please submit the assignment tomorrow.
Speech Act: statement (confidence=0.36)
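
The confidences above sit close to 1/3, which is what a three-class softmax returns when all logits are nearly equal. The sketch below reproduces that behavior in plain Python. The logit values are hypothetical and chosen only to contrast a near-uniform case with a confident one; they are not taken from the model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

labels = ["statement", "question", "directive"]

# Hypothetical logits (assumptions for illustration only).
near_uniform = [0.05, 0.00, -0.02]  # roughly what an untrained head tends to produce
confident = [0.2, 3.5, -1.0]        # roughly what a trained head might produce

print(top_prediction(near_uniform, labels))
print(top_prediction(confident, labels))
```

A confidence barely above 1/3 therefore signals that the classifier has not yet learned to separate the three classes, regardless of which label happens to win.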



## Step 8: Natural Language Inference (NLI)

For utterances classified as **statements**, we verify their truth
against a small knowledge base using a pre-trained NLI model.


In [None]:
nli_pipeline = pipeline(
    "text-classification",
    model="roberta-large-mnli"
)


Some weights of the model checkpoint at roberta-large-mnli were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use cuda:0


## Step 9: Define Knowledge Base Facts

These are simple factual statements used for NLI comparison.


In [None]:
knowledge_base = [
    "Dolphins live in water",
    "Paris is the capital of France",
    "Dogs are mammals"
]


## Step 10: NLI Inference Function


In [None]:
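def run_nli(statement, kb_fact):
    # Pass a real sentence pair: RoBERTa's tokenizer does not treat the
    # literal string "[SEP]" as a separator token.
    result = nli_pipeline({"text": statement, "text_pair": kb_fact})
    # Depending on the transformers version, a dict input yields a dict or a one-item list.
    if isinstance(result, list):
        result = result[0]
    return result["label"], result["score"]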


## Step 11: End-to-End Pragmatic Pipeline


In [None]:
def pragmatic_pipeline(text):
    act, confidence = predict_speech_act(text)

    print(f"Speech Act: {act} (confidence={confidence:.2f})")

    if act != "statement":
        print("NLI not applicable.\n")
        return

    for fact in knowledge_base:
        label, score = run_nli(text, fact)
        print(f"Against KB fact: '{fact}' → {label} ({score:.2f})")
    print()


In [None]:
pragmatic_pipeline("Dolphins are marine mammals.")
pragmatic_pipeline("Can you pass the salt?")


Speech Act: statement (confidence=0.36)
Against KB fact: 'Dolphins live in water' → ENTAILMENT (0.86)
Against KB fact: 'Paris is the capital of France' → ENTAILMENT (0.54)
Against KB fact: 'Dogs are mammals' → CONTRADICTION (0.94)

Speech Act: statement (confidence=0.35)
Against KB fact: 'Dolphins live in water' → NEUTRAL (0.47)
Against KB fact: 'Paris is the capital of France' → ENTAILMENT (0.48)
Against KB fact: 'Dogs are mammals' → ENTAILMENT (0.62)
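
The pipeline prints one NLI label per knowledge-base fact but does not combine them into a single verdict. One possible way to do that is sketched below. The function `aggregate_verdict`, its verdict names, and the 0.8 threshold are all assumptions made for illustration and are not part of the repository.

```python
def aggregate_verdict(nli_results, threshold=0.8):
    """Collapse per-fact (label, score) pairs into one verdict.

    nli_results: list of (label, score) tuples, shaped like run_nli's output.
    threshold:   hypothetical minimum score for a label to count.
    """
    # Ignore low-confidence labels entirely.
    confident = [(label, score) for label, score in nli_results if score >= threshold]
    entailed = [score for label, score in confident if label == "ENTAILMENT"]
    contradicted = [score for label, score in confident if label == "CONTRADICTION"]

    if entailed and contradicted:
        return "CONFLICT"    # the knowledge base both supports and refutes the claim
    if entailed:
        return "SUPPORTED"
    if contradicted:
        return "REFUTED"
    return "UNVERIFIED"      # nothing confident enough either way

# The three results printed above for "Dolphins are marine mammals."
results = [("ENTAILMENT", 0.86), ("ENTAILMENT", 0.54), ("CONTRADICTION", 0.94)]
print(aggregate_verdict(results))  # CONFLICT
```

Under these assumptions the first example is flagged as a conflict, because an unrelated fact ("Dogs are mammals") yields a high-confidence contradiction. Filtering the knowledge base for relevant facts before running NLI would be one way to reduce such cases.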



## Step 12: Evaluation

### Speech Act Classification
Predictions are compared with gold labels using:
- Accuracy
- Precision, recall, and F1-score per class

### NLI Evaluation
- Test set: 20 manually created statement–fact pairs
- Gold labels: ENTAILMENT, CONTRADICTION, NEUTRAL
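
The metrics listed above can be computed as in the following sketch, written in plain Python to make the definitions explicit. The `accuracy_score` and `classification_report` functions imported in Step 3 report the same quantities. The gold labels and predictions here are made up for illustration and are not results from this project.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute accuracy plus precision, recall, and F1 for each label."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report

labels = ["statement", "question", "directive"]
# Made-up gold labels and predictions (illustrative assumptions only).
y_true = ["statement", "question", "directive", "statement", "question", "directive"]
y_pred = ["statement", "question", "question", "statement", "statement", "directive"]

accuracy, report = per_class_metrics(y_true, y_pred, labels)
print(f"accuracy={accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

The same function applies to the NLI evaluation by passing the three NLI labels and the gold and predicted labels for the statement–fact pairs.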


## Step 13: Failure Case Analysis

### Common Errors:
- Questions phrased as statements
- Polite directives misclassified as questions
- World knowledge gaps in NLI

### Example Failure:
"I wonder if dolphins are mammals."  
→ Misclassified because the question is indirect: the utterance has the form of a statement but the intent of a question.
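
To see which of these error types dominates, misclassifications can be tallied by (gold, predicted) pair. The sketch below is one way to do this; the helper `confusion_pairs` is hypothetical and the labels are made up for illustration, not results from this project.

```python
from collections import Counter

def confusion_pairs(y_true, y_pred):
    """Count (gold, predicted) pairs where the prediction was wrong."""
    return Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)

# Made-up gold labels and predictions (illustrative assumptions only).
y_true = ["directive", "directive", "question", "statement", "question"]
y_pred = ["question", "question", "statement", "statement", "question"]

# Most frequent error types first.
for (gold, pred), count in confusion_pairs(y_true, y_pred).most_common():
    print(f"{gold} -> {pred}: {count}")
```

In this toy data the most frequent pair is `directive -> question`, the pattern described above for polite directives.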


## Conclusion

This project demonstrates a modular pragmatic analysis pipeline that:
- Identifies speaker intent
- Verifies factual claims using inference
- Separates pragmatic intent from semantic truth

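Limitations include the small dataset and the difficulty of capturing implicit pragmatic cues such as indirect questions and polite directives. In addition, the speech act outputs shown above come from a classification head that has not been fine-tuned, so they do not reflect the accuracy a trained model would reach.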
