# 2024-06-15
So yeah, initially I started with just a pipeline, like:

```python
from transformers import pipeline
import torch

# Load Hugging Face pipeline for zero-shot classification
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
```

Then I was reading through Jake's article, [here](https://jaketae.github.io/study/zero-shot-classification/), and realized the pipeline is an abstraction: you can access the raw NLI output logits yourself, in the order `contradiction, neutral, entailment`.
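
As I understand it, what the pipeline does under the hood is turn each candidate label into a hypothesis via a template and pair it with the input text as the premise. A minimal sketch of just that step; `build_nli_pairs` is a name I made up, and the template string is what I believe the Hugging Face default to be, so treat both as assumptions:

```python
# Assumed default template of the Hugging Face zero-shot pipeline.
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_nli_pairs(text, candidate_labels, template=HYPOTHESIS_TEMPLATE):
    """Return (premise, hypothesis) pairs, one per candidate label (hypothetical helper)."""
    return [(text, template.format(label)) for label in candidate_labels]


pairs = build_nli_pairs(
    "I am looking for a physical therapist, close by",
    ["nearby", "medical"],
)
for premise, hypothesis in pairs:
    print(premise, "->", hypothesis)
```

Each pair then goes through the NLI model, which is the part I poke at by hand below.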

So cool, let me try this 

In [12]:
import torch

In [1]:

from transformers import BartForSequenceClassification, BartTokenizer

model_name = "facebook/bart-large-mnli"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForSequenceClassification.from_pretrained(model_name)



In [2]:
premise = "I am looking for a physical therapist who specializes in sports injuries, close by"

hypothesis = "this text is about something that is relatively near"

tokens = tokenizer(premise, hypothesis, return_tensors="pt")
outputs = model(**tokens)
outputs.keys()



odict_keys(['logits', 'past_key_values', 'encoder_last_hidden_state'])

In [3]:
logits = outputs.logits
logits.shape

torch.Size([1, 3])

In [6]:
logits

tensor([[-2.5674,  0.0570,  2.5532]], grad_fn=<AddmmBackward0>)

In [8]:
contradiction, neutral, entailment = logits[0][0], logits[0][1], logits[0][2]
(contradiction, neutral, entailment)

(tensor(-2.5674, grad_fn=<SelectBackward0>),
 tensor(0.0570, grad_fn=<SelectBackward0>),
 tensor(2.5532, grad_fn=<SelectBackward0>))

In [17]:
(torch.round(logits[0][0], decimals=3))

tensor(-2.5670, grad_fn=<RoundBackward1>)

In [22]:
f"{torch.round(logits, decimals=2)}" # .tolist()

'tensor([[-2.5700,  0.0600,  2.5500]], grad_fn=<RoundBackward1>)'

In [24]:
probs = logits.softmax(dim=1) ; probs

tensor([[0.0055, 0.0757, 0.9188]], grad_fn=<SoftmaxBackward0>)

In [26]:
print(f"{logits[0][0]:.3f}")
print(f"{probs[0][0]:.3f}")

-2.567
0.005
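
To convince myself what `softmax` is doing with those three logits, here is the same computation by hand in plain Python. The second part drops neutral and renormalizes over contradiction vs. entailment only, which as far as I understand is what the pipeline does in multi-label mode. This is a sketch for checking the arithmetic, not the library's implementation; the `softmax` helper and variable names are mine:

```python
import math


def softmax(values):
    # Subtract the max before exponentiating, for numerical stability.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the output above, in (contradiction, neutral, entailment) order.
raw_logits = [-2.5674, 0.0570, 2.5532]

# Three-way probabilities, should line up with `probs` above.
three_way = softmax(raw_logits)

# Drop neutral and renormalize over contradiction vs. entailment only.
two_way = softmax([raw_logits[0], raw_logits[2]])

print([round(p, 4) for p in three_way])
print(round(two_way[1], 4))  # entailment probability once neutral is ignored
```

Ignoring neutral pushes the entailment score even higher, so any threshold I pick later depends on which of the two normalizations I use.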


Thinking about negations: I think last time I was trying to add a negation into the "class", aka the hypothesis, and that was not working. This time, let's just let the contradiction, neutral, and entailment scores speak instead.

And an idea: if this doesn't work, then maybe we'll need something rule-based, or we could compare against what a more sophisticated, higher-order model says about the statements.
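
For reference, the rule-based fallback could look something like this. A rough sketch only: the phrase list, the two regexes, and the function name `location_kind` are all made up for illustration, and I have not validated them beyond the handful of test sentences I'm using here:

```python
import re

# Hypothetical list of "relative" phrases, illustrative only.
RELATIVE_PATTERN = re.compile(
    r"\b(close by|nearby|near me|close to me)\b", re.IGNORECASE
)
# Hypothetical "specific" pattern: a preposition followed by a
# capitalized name or a 5-digit zip code (case-sensitive on purpose).
SPECIFIC_PATTERN = re.compile(
    r"\b(?:in|around|close to|near)\s+(?:the zip code\s+)?(?:[A-Z]\w*|\d{5})"
)


def location_kind(text):
    """Classify text as 'relative', 'specific', or 'none' (hypothetical helper)."""
    if RELATIVE_PATTERN.search(text):
        return "relative"
    if SPECIFIC_PATTERN.search(text):
        return "specific"
    return "none"


base = "I am looking for a physical therapist who specializes in sports injuries"
for suffix in ["", ", close by", ", in Manhattan", ", around Union Square"]:
    print(repr(suffix), "->", location_kind(base + suffix))
```

Rules like these are brittle (any phrasing not in the list falls through to "none"), which is why I'd rather get the NLI route working first.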

In [31]:
from itertools import product
hypotheses = {
    "relatively_near": "this text is about something that is relatively near",
    "specific_address": "this text refers to a specific address",
    "specific_location": "this text refers to a specific location",
    "relative_location": "this text refers to a relative location"
             }
premises = [
    "I am looking for a physical therapist who specializes in sports injuries, close by",
    "I am looking for a physical therapist who specializes in sports injuries",
    "I am looking for a physical therapist who specializes in sports injuries, nearby",
    "I am looking for a physical therapist who specializes in sports injuries, near me",
    "I am looking for a physical therapist who specializes in sports injuries, in Manhattan",
    "I am looking for a physical therapist who specializes in sports injuries, who is in NY in Brooklyn",
    "I am looking for a physical therapist who specializes in sports injuries, in the zip code 10010",
    "I am looking for a physical therapist who specializes in sports injuries, close to me",
    "I am looking for a physical therapist who specializes in sports injuries, close to Columbus Circle",
    "I am looking for a physical therapist who specializes in sports injuries, around Union Square",
]
terms = ["contradiction", "neutral", "entailment"]  # label order of the model's logits
for premise, (hypothesis_brief, hypothesis) in product(premises, hypotheses.items()):
    tokens = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**tokens)
    # Softmax over the three NLI classes for this (premise, hypothesis) pair
    probs = outputs.logits.softmax(dim=1)
    print("")
    print(premise)
    print(hypothesis)
    print([f"{x}: {probs[0][i]:.2f}" for i, x in enumerate(terms)])


I am looking for a physical therapist who specializes in sports injuries, close by
this text is about something that is relatively near
['contradiction: 0.01', 'neutral: 0.08', 'entailment: 0.92']

I am looking for a physical therapist who specializes in sports injuries, close by
this text refers to a specific address
['contradiction: 0.02', 'neutral: 0.59', 'entailment: 0.39']

I am looking for a physical therapist who specializes in sports injuries, close by
this text refers to a specific location
['contradiction: 0.01', 'neutral: 0.17', 'entailment: 0.82']

I am looking for a physical therapist who specializes in sports injuries, close by
this text refers to a relative location
['contradiction: 0.00', 'neutral: 0.12', 'entailment: 0.88']

I am looking for a physical therapist who specializes in sports injuries
this text is about something that is relatively near
['contradiction: 0.24', 'neutral: 0.55', 'entailment: 0.21']

I am looking for a physical therapist who specializes in sp

### Hmm not separating "relative" vs "specific"
I'm seeing both the "specific" and the "relative" hypotheses come out as entailed for actual locations (e.g. Union Square).

### Better news though: for the statement without any location language, the location hypotheses come out more neutral
Not consistently (neutral only wins for two of the four hypotheses), but the entailment strength is lower here than for the statements that do have locations (relative or specific).
```
I am looking for a physical therapist who specializes in sports injuries
this text is about something that is relatively near
['contradiction: 0.24', 'neutral: 0.55', 'entailment: 0.21']

I am looking for a physical therapist who specializes in sports injuries
this text refers to a specific address
['contradiction: 0.03', 'neutral: 0.65', 'entailment: 0.32']

I am looking for a physical therapist who specializes in sports injuries
this text refers to a specific location
['contradiction: 0.05', 'neutral: 0.39', 'entailment: 0.56']

I am looking for a physical therapist who specializes in sports injuries
this text refers to a relative location
['contradiction: 0.06', 'neutral: 0.39', 'entailment: 0.55']
```
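
To make the "not consistently" concrete, here is a quick tally of which label wins for the no-location statement, using the probabilities pasted above. Just a sketch; the `results` dict is typed in by hand from that output and the names are mine:

```python
terms = ["contradiction", "neutral", "entailment"]

# Probabilities for the premise with no location language, copied from the
# output above, in (contradiction, neutral, entailment) order.
results = {
    "relatively_near": (0.24, 0.55, 0.21),
    "specific_address": (0.03, 0.65, 0.32),
    "specific_location": (0.05, 0.39, 0.56),
    "relative_location": (0.06, 0.39, 0.55),
}

# Pick the highest-probability label for each hypothesis.
winners = {
    name: terms[max(range(len(terms)), key=lambda i: probs[i])]
    for name, probs in results.items()
}

for name, label in winners.items():
    print(f"{name}: {label}")
```

So a plain argmax would still say "entailment" for two of the four hypotheses even though the statement has no location in it, which suggests I'd need a threshold on the entailment probability rather than just taking the top label.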