# Clickbait Spoiler Generation using GPT-3

In [1]:
# Add ../src to the module search path so the project modules can be imported
import os
import sys
sys.path.append(os.path.abspath(os.path.join('../src')))

The data was prepared with a custom function and validated with the OpenAI data preparation tool.

In [2]:
TRAIN_DATA = "../data/parsed/openai/train_prepared.jsonl"
VALIDATION_DATA = "../data/parsed/openai/validation_prepared.jsonl"
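
The preparation code itself lives in the project and is not shown in this notebook. As a rough illustration of what a prepared file for the legacy fine-tuning endpoint contains, the sketch below checks that every line is a JSON object with non-empty `prompt` and `completion` strings. The function name `find_invalid_lines` and the sample records are hypothetical; the authoritative check remains the `openai tools fine_tunes.prepare_data` command.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, reason) pairs for records that are not prompt/completion pairs."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # blank lines are skipped rather than reported
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value:
                problems.append((number, f"missing or empty '{key}'"))
    return problems


# Hypothetical records, made up for illustration only.
sample = [
    '{"prompt": "CLICKBAIT:\\n\\nTitle\\n\\n###\\n\\n", "completion": " spoiler"}',
    '{"prompt": "no completion here"}',
    'not json',
]
print(find_invalid_lines(sample))
```

To check the real files, the same function can be given an open file handle, for example `find_invalid_lines(open(TRAIN_DATA))`, since it only needs an iterable of lines.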

In [3]:
from data_parser import OPENAI_MAX_TOKENS_COMPLETION, OPENAI_END_OF_COMPLETION
from prepare_data_openai import OPENAI_MODEL

### Performing the fine-tune operation

In [4]:
EPOCHS = 5
SUFFIX = "clickbait_spoiler"

In [5]:
import utils.openai
utils.openai.estimate_costs_fine_tune_training(TRAIN_DATA, OPENAI_MODEL)

4.341978
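
The value above is the estimated training cost in US dollars; it matches the `$4.34` that the fine-tune log reports. How `estimate_costs_fine_tune_training` arrives at it is not shown here. A minimal sketch of one plausible approach follows, assuming the cost is proportional to the number of training tokens times the number of epochs. The function `estimate_training_cost`, the 4-characters-per-token heuristic, and the price used in the example are all assumptions for illustration, not the project's implementation or a current price.

```python
def estimate_training_cost(texts, price_per_1k_tokens, epochs, chars_per_token=4.0):
    """Rough cost estimate: approximate tokens x epochs x price per 1K tokens.

    Token counts are approximated from character counts; a real estimate
    would use the model's tokenizer instead of this heuristic.
    """
    total_chars = sum(len(text) for text in texts)
    approx_tokens = total_chars / chars_per_token
    return approx_tokens / 1000 * price_per_1k_tokens * epochs


# Two made-up examples of 4,000 characters each: about 2,000 tokens in total.
examples = ["a" * 4000, "b" * 4000]
# 0.0004 is a placeholder price, not a quoted rate.
cost = estimate_training_cost(examples, price_per_1k_tokens=0.0004, epochs=5)
print(round(cost, 6))
```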

In [6]:
!openai api fine_tunes.create --training_file $TRAIN_DATA --validation_file $VALIDATION_DATA --n_epochs $EPOCHS --model $OPENAI_MODEL --suffix $SUFFIX

Upload progress: 100%|████████████████████| 9.79M/9.79M [00:00<00:00, 12.2Git/s]
Uploaded file from ../data/parsed/openai/train_prepared.jsonl: file-KQfOLfnylBZfe1nAdk7rURtq
Upload progress: 100%|██████████████████████| 690k/690k [00:00<00:00, 1.15Git/s]
Uploaded file from ../data/parsed/openai/validation_prepared.jsonl: file-WvA9NWmP7KawNuOWY39pXJ7N
Created fine-tune: ft-YTgC43HLX5x00cwjjJ74odXM
Streaming events until fine-tuning is complete...

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2023-06-11 13:02:05] Created fine-tune: ft-YTgC43HLX5x00cwjjJ74odXM



In [7]:
!openai api fine_tunes.follow --id ft-YTgC43HLX5x00cwjjJ74odXM

[2023-06-11 13:02:05] Created fine-tune: ft-YTgC43HLX5x00cwjjJ74odXM
[2023-06-11 13:03:17] Fine-tune costs $4.34
[2023-06-11 13:03:18] Fine-tune enqueued. Queue number: 0

Stream interrupted (client disconnected).
To resume the stream, run:

  openai api fine_tunes.follow -i ft-YTgC43HLX5x00cwjjJ74odXM



In [8]:
# !openai api fine_tunes.cancel --id ft-YTgC43HLX5x00cwjjJ74odXM # Cancel if it gets expensive (hehe)

In [9]:
!openai api fine_tunes.results -i ft-YTgC43HLX5x00cwjjJ74odXM > ../data/results/openai/steps.csv
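
The results command writes per-step training metrics as CSV. A small sketch of how such a file could be inspected with the standard library is shown below. The column names `training_loss` and `validation_loss` are assumptions about the file's header, the helper `last_value` is hypothetical, and the sample rows are made up rather than taken from `steps.csv`.

```python
import csv
import io


def last_value(rows, column):
    """Return the last non-empty value of `column` as a float, or None if there is none."""
    for row in reversed(rows):
        value = row.get(column, "")
        if value not in ("", None):
            return float(value)
    return None


# Made-up sample; validation metrics are often only logged on some steps.
sample_csv = io.StringIO(
    "step,training_loss,validation_loss\n"
    "1,0.90,\n"
    "2,0.70,0.80\n"
    "3,0.60,\n"
)
rows = list(csv.DictReader(sample_csv))
print(last_value(rows, "training_loss"), last_value(rows, "validation_loss"))
```

For the real file, `sample_csv` would be replaced by `open("../data/results/openai/steps.csv", newline="")`.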

#### Testing

In [10]:
MODEL_ID = "ada:ft-personal:clickbait-spoiler-2023-06-11-16-56-34"

In [11]:
clickbait = """CLICKBAIT:\n\nA woman who interviewed over 100 people at Goldman Sachs says there's one question she always hoped candidates would ask her, but they never did\n\n\nARTICLE:\n\nAt some point toward the end of every job interview, the hiring manager will likely turn the tables and ask, \"Do you have any questions for me?\"\nThis is the time to ask smart, thoughtful questions — perhaps your final opportunity to assess whether the job would be a good fit, and your final chance to impress the hiring manager.\nBusiness Insider recently spoke with Becca Brown, cofounder of Solemates, a brand of women's shoe-care products, who knows a thing or two about interviewing.\nBefore launching her own business, Brown, who has a bachelor's from Harvard University and an MBA from Columbia, spent a lot of time interviewing job candidates at Goldman Sachs, where she held various roles, including analyst, wealth adviser, and chief of staff.\nShe was also part of the investment bank's Harvard recruiting team, she says.\n\"I interviewed anywhere from 20 to 30 job candidates a year, so in total, I interviewed over 100 people at Goldman Sachs,\" she tells Business Insider.\nShe says that candidates asked her some impressive questions — like \"What's the most challenging part of your job?\" and \"What's one of the most interesting projects you've worked on?\" — but there was one question she always hoped she'd be asked, but almost never was: \"Where do you see yourself in five years?\"\n\"I like this question — and yet no one ever asked it — because it's difficult to answer,\" she says. \"It's an important question for anyone to be asking him or herself, and so if ever a candidate were to ask this question, it would have stood out.\"\nShe continues:\nI think this is a good question for interviewees to ask because, as a candidate, if you see where the person interviewing you is headed, you can decide if that trajectory is in line with your career objectives. 
While they don't have to be completely correlated, it's helpful for the candidate to have some indication of the interviewer's direction.\nGet the latest Goldman Sachs stock price here.\n\n###\n\n"""
expected = """\"Where do you see yourself in five years?\""""

In [12]:
import models.gpt3 as gpt3
prediction = gpt3.predict([clickbait], MODEL_ID)[0]

In [13]:
import evaluate
meteor = evaluate.load("meteor")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

Downloading builder script: 100%|██████████| 6.81k/6.81k [00:00<00:00, 5.90MB/s]
[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/ddsantos/nltk_data...
[nltk_data] Downloading package punkt to /Users/ddsantos/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     /Users/ddsantos/nltk_data...
Downloading builder script: 100%|██████████| 5.94k/5.94k [00:00<00:00, 9.78MB/s]
Downloading extra modules: 4.07kB [00:00, 13.0MB/s]                   
Downloading extra modules: 100%|██████████| 3.34k/3.34k [00:00<00:00, 13.3MB/s]
Downloading builder script: 100%|██████████| 7.95k/7.95k [00:00<00:00, 9.98MB/s]


In [14]:
bertscore_results = bertscore.compute(predictions=prediction, references=[expected], lang="en")
meteor_results = meteor.compute(predictions=prediction, references=[expected])
bleu_results = bleu.compute(predictions=prediction, references=[expected])
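
The `compute` calls above pair each prediction with its reference by position, so both arguments should be lists of the same length. The sketch below shows one way to normalize the inputs before scoring. The helpers `as_text_list` and `check_aligned` are hypothetical and are not part of `evaluate` or of this project.

```python
def as_text_list(value):
    """Wrap a single string in a list; return lists and tuples as a new list."""
    if isinstance(value, str):
        return [value]
    return list(value)


def check_aligned(predictions, references):
    """Normalize both inputs to lists and verify they have the same length."""
    predictions = as_text_list(predictions)
    references = as_text_list(references)
    if len(predictions) != len(references):
        raise ValueError(
            f"{len(predictions)} predictions but {len(references)} references"
        )
    return predictions, references


# A bare string and a one-element list are both accepted.
preds, refs = check_aligned("a predicted spoiler", ["the expected spoiler"])
print(preds, refs)
```

The normalized lists can then be passed on unchanged, for example `meteor.compute(predictions=preds, references=refs)`.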

In [15]:
print(f"{clickbait}Expected Spoiler: {expected};\n\nSpoiler Predicted: {prediction[0]};\n\n###\n\n")
print(f"Meteor: {meteor_results['meteor']}\nBLEU-4: {bleu_results['bleu']}\nBERTscore Mean F1: {sum(bertscore_results['f1'])/len(bertscore_results['f1'])}")

CLICKBAIT:

A woman who interviewed over 100 people at Goldman Sachs says there's one question she always hoped candidates would ask her, but they never did


ARTICLE:

At some point toward the end of every job interview, the hiring manager will likely turn the tables and ask, "Do you have any questions for me?"
This is the time to ask smart, thoughtful questions — perhaps your final opportunity to assess whether the job would be a good fit, and your final chance to impress the hiring manager.
Business Insider recently spoke with Becca Brown, cofounder of Solemates, a brand of women's shoe-care products, who knows a thing or two about interviewing.
Before launching her own business, Brown, who has a bachelor's from Harvard University and an MBA from Columbia, spent a lot of time interviewing job candidates at Goldman Sachs, where she held various roles, including analyst, wealth adviser, and chief of staff.
She was also part of the investment bank's Harvard recruiting team, she says.
"