In [1]:
%%capture
%load_ext autoreload
%autoreload 2

# Windows Environment Simulation
from dotenv import load_dotenv
load_dotenv()

### (ExPred) Load the pretrained model

In [2]:
%%capture
from src.expred import (seeding, ExpredInput,
                        BertTokenizerWithSpans, ExpredConfig, Expred)
from src.expred.models import (prepare_for_cl, train_evidence_classifier,
                               train_mtl_token_identifier)
from transformers import BertTokenizer

# Configure ExPred to load the weights pretrained on the 'fever' dataset, running on CPU
expred_config = ExpredConfig(
    pretrained_dataset_name='fever',
    base_dataset_name='fever',
    device='cpu',
    load_from_pretrained=True)

# seeding
seeding(1234)

# Initialize tokenizer
tokenizer = BertTokenizerWithSpans.from_pretrained('bert-base-uncased')

# create the model
expred = Expred.from_pretrained(expred_config)
expred.eval()


## Evaluation of subsequences
ExPred, the black-box model that we test for short-cuts, can be applied to automated fact-checking.
A dataset curated for that purpose is FEVER (Fact Extraction and VERification).

In [3]:
# Read the subset of FEVER that contains all occurrences of a subsequence
import pandas as pd

evaluations = pd.read_json("has_yet_to.json")

print("Loaded", len(evaluations), "items")

Loaded 570 items
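
Before running the model, it can help to check that the loaded frame has the columns the later cells rely on. The sketch below assumes the JSON file holds `query` and `evidences` fields (the names used in the cells that follow). The inline sample records and the helper `check_columns` are hypothetical and only stand in for `has_yet_to.json`.

```python
import io
import pandas as pd

# Hypothetical sample records standing in for has_yet_to.json
sample_json = io.StringIO(
    '[{"query": "Susan Atkins has yet to be confined.",'
    '  "evidences": "placeholder evidence text"},'
    ' {"query": "Sean Combs has yet to release his debut album.",'
    '  "evidences": "placeholder evidence text"}]'
)
sample = pd.read_json(sample_json)

def check_columns(frame, required=("query", "evidences")):
    """Return the required column names that are missing from the frame."""
    return [col for col in required if col not in frame.columns]

missing = check_columns(sample)

# Literal (non-regex) match, so the check cannot be affected by special characters
has_subsequence = sample["query"].str.contains("has yet to", regex=False)

print("Missing columns:", missing)
print("Queries containing the subsequence:",
      int(has_subsequence.sum()), "of", len(sample))
```

If `missing` is non-empty, the later cells that index `evaluations['query']` and `evaluations['evidences']` would fail with a `KeyError`.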


In [4]:
# Create a function to simplify execution of ExPred
def evaluate_query_with_expred(current_query, evidence):
    # convert the input into the format ExPred accepts
    expred_input = ExpredInput(
        queries=[current_query.split()],
        docs=[evidence.split()],
        labels=['SUPPORTS', 'REFUTES'],
        config=expred_config,
        ann_ids=['spontan_1'],
        span_tokenizer=tokenizer)
    # don't forget to preprocess
    expred_input.preprocess()

    # the output is in the form of a dict:
    expred_output = expred(expred_input)

    # retrieve the evaluation label
    current_output = expred_input.get_decoded_cls_preds(expred_output)
    return [current_query, current_output[0]]

### Subsequence: 'has yet to'

First, we confirm that the pattern holds for every occurrence of the subsequence.
The observations show that ExPred refutes every query in the subset that contains the subsequence.

In [5]:
# Number of queries to evaluate
n = len(evaluations)

results = []

for i in range(n):
    # skip rows with an empty query
    if evaluations['query'][i]:
        current_query = evaluations['query'][i]
        current_doc = evaluations['evidences'][i]

        results.append(evaluate_query_with_expred(current_query, current_doc))

df = pd.DataFrame(results, columns = ['query', 'label'])
df


Unnamed: 0,query,label
0,Avengers: Age of Ultron has yet to premiere.,REFUTES
1,Avengers: Age of Ultron has yet to premiere.,REFUTES
2,Edward Norton has yet to co-write for films.,REFUTES
3,Betty Buckley has yet to receive a Tony Award ...,REFUTES
4,Wallander is a novel series that has yet to be...,REFUTES
...,...,...
565,Sean Combs has yet to release his debut album.,REFUTES
566,John Stewart has yet to appear in DC comics.,REFUTES
567,Susan Atkins has yet to be confined.,REFUTES
568,Susan Atkins has yet to be confined.,REFUTES
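
The loop above looks rows up by integer position, which works as long as the frame keeps its default index. A minimal sketch of the same pattern that iterates over the two columns directly is shown below. To keep it runnable without the model, it uses `stub_classifier`, a hypothetical keyword rule standing in for `evaluate_query_with_expred`; `evaluate_frame` and the sample data are also made up for illustration.

```python
import pandas as pd

def stub_classifier(query, evidence):
    # Hypothetical stand-in for evaluate_query_with_expred:
    # a keyword rule that mimics the suspected short-cut.
    label = "REFUTES" if "has yet to" in query else "SUPPORTS"
    return [query, label]

def evaluate_frame(frame, classify):
    """Apply classify(query, evidence) to every row that has a non-empty query."""
    rows = [
        classify(query, evidence)
        for query, evidence in zip(frame["query"], frame["evidences"])
        if query
    ]
    return pd.DataFrame(rows, columns=["query", "label"])

sample = pd.DataFrame({
    "query": ["Susan Atkins has yet to be confined.", "", "Anne Hathaway acts."],
    "evidences": ["placeholder evidence", "placeholder evidence", "placeholder evidence"],
})

sample_results = evaluate_frame(sample, stub_classifier)
print(sample_results)
```

Passing the classifier as an argument means the same helper can be reused for the original queries and for the modified ones.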


Percentage of confirmed cases:

In [9]:
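# Divide by the number of evaluated rows, which can differ from n if empty queries were skipped
(df.label.values == 'REFUTES').sum() / len(df) * 100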


100.0
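
The same percentage can be read off a full label breakdown, which also shows how the remaining predictions are distributed when the result is not 100%. A minimal sketch with pandas, using a small hypothetical frame `preds` in place of the `df` built above:

```python
import pandas as pd

# Hypothetical predictions standing in for the notebook's `df`
preds = pd.DataFrame({
    "query": ["q1", "q2", "q3", "q4"],
    "label": ["REFUTES", "REFUTES", "REFUTES", "SUPPORTS"],
})

# Share of each predicted label, in percent
label_share = preds["label"].value_counts(normalize=True) * 100

# .get avoids a KeyError when a label never occurs in the predictions
refutes_pct = label_share.get("REFUTES", 0.0)
supports_pct = label_share.get("SUPPORTS", 0.0)

print(label_share)
```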

Precision of the ExPred model on this subsequence

In [10]:
x = len(evaluations)  # total number of queries containing the subsequence (570)
(x/((df.label.values == 'REFUTES').sum()))*100

100.0
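
For reference, precision for a label is the number of correct predictions of that label divided by all predictions of that label, while recall divides by the number of gold instances of that label instead. The sketch below assumes gold labels are available alongside the predictions. The lists are made up for illustration, and `precision_recall` is a hypothetical helper, not part of ExPred.

```python
def precision_recall(gold, predicted, label):
    """Return (precision, recall) for one label, given aligned gold and predicted lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    true_pos = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
    predicted_pos = sum(1 for p in predicted if p == label)
    gold_pos = sum(1 for g in gold if g == label)

    # Guard against division by zero when a label never occurs
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    return precision, recall

# Illustrative labels only
gold = ["REFUTES", "REFUTES", "REFUTES", "SUPPORTS"]
predicted = ["REFUTES", "REFUTES", "SUPPORTS", "REFUTES"]

precision, recall = precision_recall(gold, predicted, "REFUTES")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

When every gold label in a subset is REFUTES, precision for REFUTES is trivially 100% whenever the model predicts REFUTES at all, so recall is the more informative of the two in that setting.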

## Adversarial Attack
We replace the subsequence in each query with another substring that is intended to flip the meaning of the query, while leaving the evidence unchanged.
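
A plain `str.replace` changes every occurrence and silently does nothing when the subsequence is absent. The sketch below uses only the standard library to make both cases explicit by returning the number of replacements. The helper name `swap_subsequence` is hypothetical.

```python
import re

def swap_subsequence(query, old, new):
    """Replace whole-word occurrences of `old` in `query`.

    Returns a tuple (new_query, number_of_replacements).
    """
    # re.escape treats `old` literally; \b keeps matches on word boundaries
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function replacement inserts `new` literally, without backslash processing
    return pattern.subn(lambda _match: new, query)

changed, count = swap_subsequence(
    "Susan Atkins has yet to be confined.", "has yet to", "has acted")
unchanged, zero = swap_subsequence(
    "Anne Hathaway acts.", "has yet to", "has acted")

print(changed, count)
print(unchanged, zero)
```

Queries with a count of zero can then be dropped from the attack set, so they do not dilute the measured effect.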

In [11]:
substring = "has yet to"
replacewith = "has acted"
n = 20

results = []

for i in range(n):
    current_query = evaluations['query'][i].replace(substring, replacewith)
    current_doc = evaluations['evidences'][i]

    results.append(evaluate_query_with_expred(current_query, current_doc))

df = pd.DataFrame(results, columns = ['query', 'label'])
df

Unnamed: 0,query,label
0,Avengers: Age of Ultron has acted premiere.,SUPPORTS
1,Avengers: Age of Ultron has acted premiere.,SUPPORTS
2,Edward Norton has acted co-write for films.,SUPPORTS
3,Betty Buckley has acted receive a Tony Award n...,SUPPORTS
4,Wallander is a novel series that has acted be ...,REFUTES
5,Anne Hathaway has acted act.,SUPPORTS
6,Wallander has acted be adapted from the Kurt W...,SUPPORTS
7,Iran has acted have any conflicts with the Rus...,SUPPORTS
8,Superman has acted be portrayed by George Reeves.,SUPPORTS
9,Paul Walker has acted star in the film Running...,SUPPORTS


## Conclusion

This experiment suggests that 'has yet to' is a <span style="color:green">valid short-cut for the ExPred model</span>: ExPred refutes every query containing this subsequence, and replacing the subsequence flips the prediction to SUPPORTS with very high probability. The probability is computed as follows:

In [13]:
# Divide by the number of evaluated rows rather than a separately set n
(df.label.values == 'SUPPORTS').sum() / len(df) * 100

90.0
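
The figure above counts SUPPORTS predictions after the replacement. A closely related measure is the flip rate: the share of queries whose prediction changed between the original run and the attacked run. A minimal sketch, assuming both sets of labels are kept in lists aligned by index; the labels below are illustrative and are not the notebook's results, and `flip_rate` is a hypothetical helper.

```python
def flip_rate(original, attacked):
    """Percentage of positions where the attacked label differs from the original."""
    if len(original) != len(attacked):
        raise ValueError("label lists must be aligned and of equal length")
    if not original:
        return 0.0
    flipped = sum(1 for o, a in zip(original, attacked) if o != a)
    return flipped / len(original) * 100

# Illustrative labels only
original_labels = ["REFUTES", "REFUTES", "REFUTES", "REFUTES"]
attacked_labels = ["SUPPORTS", "SUPPORTS", "REFUTES", "SUPPORTS"]

rate = flip_rate(original_labels, attacked_labels)
print(f"{rate:.1f}% of predictions flipped")
```

When every original prediction is REFUTES, as in the first experiment, the flip rate equals the share of SUPPORTS predictions after the attack.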