# Clickbait spoiling 🖱️
Task description: https://pan.webis.de/semeval23/pan23-web/clickbait-challenge.html (task 2)

Data: https://zenodo.org/record/6362726#.YsbdSTVBzrk

In [1]:
import pandas as pd
from transformers import pipeline

from src.data import read_data, save_df_to_jsonl

## Data preparation

In [2]:
train = read_data('data/train.jsonl')

In [3]:
train.head()

Unnamed: 0,uuid,title,question,context,spoiler
0,0af11f6b-c889-4520-9372-66ba25cb7657,"Wes Welker Wanted Dinner With Tom Brady, But P...","Wes Welker Wanted Dinner With Tom Brady, But P...","Wes Welker Wanted Dinner With Tom Brady, But P...",[how about that morning we go throw?]
1,b1a1f63d-8853-4a11-89e8-6b2952a393ec,Hole In Ozone Layer Expected To Make Full Reco...,NASA sets date for full recovery of ozone hole,Hole In Ozone Layer Expected To Make Full Reco...,[2070]
2,008b7b19-0445-4e16-8f9e-075b73f80ca4,Intellectual Stimulation Trumps Money For Empl...,This is what makes employees happy -- and it's...,Intellectual Stimulation Trumps Money For Empl...,[intellectual stimulation]
3,31ecf93c-3e21-4c80-949b-aa549a046b93,"‘Follow your passion’ is wrong, here are 7 hab...",Passion is overrated — 7 work habits you need ...,"‘Follow your passion’ is wrong, here are 7 hab...",[Purpose connects us to something bigger and i...
4,31b108a3-c828-421a-a4b9-cf651e9ac859,Revealed: The perfect way to cook rice so that...,The perfect way to cook rice so that it's perf...,Revealed: The perfect way to cook rice so that...,[in a rice cooker]


In [4]:
train.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3200 entries, 0 to 3199
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   uuid      3200 non-null   object
 1   title     3200 non-null   object
 2   question  3200 non-null   object
 3   context   3200 non-null   object
 4   spoiler   3200 non-null   object
dtypes: object(5)
memory usage: 125.1+ KB


## Question Answering model
First, we verify an existing approach: a question-answering pipeline with the `deepset/roberta-base-squad2` model.

In [5]:
qa_pipeline = pipeline(model="deepset/roberta-base-squad2")


### Verify sample spoilers and assess accuracy
For now, simply check whether the generated spoiler shares at least one word with the first true spoiler.

In [20]:
scores = []
preds = {"uuid": [], "spoiler": []}

print("correct | generated spoiler | true spoiler")
# Only the first 10 training examples are checked here.
sample = train.head(10)
for uuid, question, context, spoiler in zip(sample.uuid, sample.question, sample.context, sample.spoiler):
    answer = qa_pipeline(question=question, context=context)["answer"]
    # Count a hit if any generated word appears in the first true spoiler.
    reference_words = spoiler[0].split()
    hit = int(any(word in reference_words for word in answer.split()))
    scores.append(hit)
    print(hit, answer, spoiler)
    preds["uuid"].append(uuid)
    preds["spoiler"].append(answer)

pred_df = pd.DataFrame.from_dict(preds)
print(f"\nAccuracy: {sum(scores)/len(scores)}")

correct | generated spoiler | true spoiler
1 let’s go throw ['how about that morning we go throw?']
1 2070 ['2070']
0 money ['intellectual stimulation']
0 Adopting a peripheral perspective ['Purpose connects us to something bigger and in doing so makes us right sized', 'be ruthless with your "No’s."', 'Practice means greatness is doable ... one tiny step after another', 'planning of the SMART goal and number-crunching variety', 'Objectivity — the ability to see the world as it truly is']
0 I follow these steps ['in a rice cooker']
1 you'll have to buy new ones ["Apple says that if AirPods are lost or stolen, you'll have to buy new ones, just like any other Apple product."]
0 Is he constantly hungover ['"The more good games I had in them, the more I got used to them.']
1 -10 degrees Celsius," said Hänninen. ['rainbow colours in the sky and a halo spanning 360 degrees']
0 5/5 say yes ['Red wine is clearly the drink of choice if you are doing light to moderate drinking for your health, a

There is definitely room for improvement: some generated spoilers are perfectly correct, but others miss completely.
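
The one-word overlap check is lenient: in the output above, the eighth example is counted as correct only because both strings contain the word "degrees". A stricter check could use token-level F1, as commonly done for extractive question answering. The sketch below is an assumption on our side, not the official task metric; the function names `token_f1` and `best_token_f1` are hypothetical, and tokenization is a plain lowercase whitespace split.

```python
from collections import Counter
from typing import Sequence


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between two strings (lowercased, whitespace-split)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings match perfectly; one empty string matches nothing.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_token_f1(prediction: str, references: Sequence[str]) -> float:
    """Best F1 over all reference spoilers (0.0 if there are none)."""
    return max((token_f1(prediction, ref) for ref in references), default=0.0)
```

Inside the loop, `best_token_f1(answer, spoiler)` would replace the 0/1 hit and would also compare against every reference spoiler rather than only `spoiler[0]`.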

## Evaluation
We will use the script provided by the SemEval23 organizers to evaluate our approaches. The script takes data as JSONL files with the columns `uuid` and `spoiler`.

In [21]:
save_df_to_jsonl(train.iloc[:10], "data/test_true.jsonl")
save_df_to_jsonl(pred_df, "data/test_output.jsonl")
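
The `save_df_to_jsonl` helper comes from `src.data` and is not shown in this notebook. As a rough sketch of what writing such a file involves, assuming only that each line must be one JSON object with `uuid` and `spoiler` keys, the standard library is enough. The names `write_jsonl` and `read_jsonl` are hypothetical and are not part of the project code.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write an iterable of dicts to `path`, one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps characters such as "’" readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

With a DataFrame such as `pred_df`, the records could be obtained via `pred_df.to_dict(orient="records")` before calling `write_jsonl`.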

In [23]:
%run src/evaluation.py --input_run "data/test_output.jsonl" --ground_truth_spoilers "data/test_true.jsonl"

  [o] The file data/test_output.jsonl is in JSONL format.
  [o] The file data/test_true.jsonl is in JSONL format.
  [o] Spoiler generations have correct format. Found 10
Run evaluation for all-spoilers
  [o] Spoiler generations have correct format. Found 10


Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


CalledProcessError: Command '['java', '-Xmx2G', '-jar', '/meteor-1.5.jar', 'C:\\Users\\zgawrysi\\AppData\\Local\\Temp\\tmpcc9yg6n1/truths.txt', 'C:\\Users\\zgawrysi\\AppData\\Local\\Temp\\tmpcc9yg6n1/preds.txt', '-l', 'en', '-norm', '-t', 'adq']' returned non-zero exit status 1.

## Our approach
Retraining RoBERTa?