Before running, choose Runtime > Change runtime type and select GPU.


The dataset is available here. The goal is to classify whether an ArXiv paper is AI-relevant or not. There are train, dev, and test sets, each with 500 entries. The test set doesn’t have labels. We provide all these data points to help you evaluate your solution, but as described above, the solution should work with only 20 labeled examples.

Each entry contains information about an ArXiv paper:

+ label: Whether the paper is AI-relevant
+ text: The paper title and abstract, joined by a period
+ meta: Metadata about the paper

Given this dataset, the task is to perform as well as possible on it, given only 20 labeled examples:

+ Test how well GPT-2 performs on it when applied in a straightforward way (few-shot learning with examples in the prompt).
+ Experiment with changes that may improve it (e.g. adjustments to the prompt, using GPT-2 as part of more complex schemes, other models and training methods).
+ Don’t fine-tune GPT-2 or another model on the full training set, since in practice you will only have 20 labeled data points.

Deliverables:

+ Share a writeup with your findings on:
    + How well was GPT-2 able to perform on this task?
    + Which tweaks that you tried worked, and which didn’t?
    + What would you recommend based on these results?
    + What would be good next steps?
+ Classify the test set and share a jsonl with the classifications.
+ Share your code.
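
To make the deliverables concrete, here is a minimal sketch of scoring predictions against labels and writing classifications out as JSONL. The task description does not specify an output schema, so the `id` and `label` fields in the output file are an assumption, and the helper names `accuracy` and `write_predictions` are hypothetical. The toy records only mimic the dataset's format.

```python
import json

def accuracy(predicted_labels, gold_labels):
    """Fraction of predictions that match the gold labels."""
    if len(predicted_labels) != len(gold_labels):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted_labels, gold_labels))
    return correct / len(gold_labels)

def write_predictions(examples, predicted_labels, filename):
    """Write one JSON object per line: paper id plus predicted label (assumed schema)."""
    with open(filename, "w") as f:
        for example, label in zip(examples, predicted_labels):
            row = {"id": example["meta"]["id"], "label": label}
            f.write(json.dumps(row) + "\n")

# Toy records in the dataset's format (labels are the strings "True" / "False")
toy_examples = [
    {"label": "True", "text": "a title. an abstract", "meta": {"id": "0001.0001", "year": 2020}},
    {"label": "False", "text": "b title. b abstract", "meta": {"id": "0001.0002", "year": 2020}},
]
toy_predictions = ["True", "True"]
toy_accuracy = accuracy(toy_predictions, [e["label"] for e in toy_examples])
print(f"accuracy: {toy_accuracy}")
```

Since the test set has no labels, a score like this can only be computed on the train or dev sets.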

# Setup



In [5]:
# imports
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch
import json

In [1]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained('gpt2', pad_token_id=tokenizer.eos_token_id)
model.eval().cuda()


# GPT-2 Utility functions

Resources:

1.   https://huggingface.co/transformers/model_doc/gpt2.html#gpt2lmheadmodel
2.   https://github.com/huggingface/transformers/blob/master/examples/pytorch/text-generation/run_generation.py




In [None]:
def generate(prompt, max_length=5, stop_token=None):
    # Greedily generate up to `max_length` new tokens and return the prompt plus the continuation
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    generated_text_ids = model.generate(input_ids=input_ids.cuda(), max_length=max_length+len(input_ids[0]), do_sample=False)
    generated_text = tokenizer.decode(generated_text_ids[0], clean_up_tokenization_spaces=True)
    post_prompt_text = generated_text[len(tokenizer.decode(input_ids[0], clean_up_tokenization_spaces=True)):]
    # str.find returns -1 when the stop token is absent; keep the whole continuation in that case
    stop_index = post_prompt_text.find(stop_token) if stop_token else -1
    return prompt + (post_prompt_text if stop_index == -1 else post_prompt_text[:stop_index])

In [None]:
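# The returned logits are offset by one relative to the tokens: logits[i] scores tokens[i + 1],
# because the model has no prediction for the first token. The logits at the final position
# (the prediction for whatever follows the text) are dropped, so logits[-1] scores tokens[-1].
def get_logits_and_tokens(text):
    input_ids = tokenizer.encode(text, return_tensors="pt")
    tokens = [tokenizer.decode([input_id]) for input_id in input_ids[0]]
    with torch.no_grad():  # inference only, so skip gradient tracking
        output = model(input_ids.cuda())
    return output.logits[0][:-1], tokens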

# Example calls to GPT-2 Utility functions

In [None]:
EXAMPLE_PROMPT = """Horrible: negative
Great: positive
Bad:"""

generated_text = generate(EXAMPLE_PROMPT, stop_token="\n")
generated_text

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'Horrible: negative\nGreat: positive\nBad: negative'

In [None]:
logits, tokens = get_logits_and_tokens(generated_text)
last_token_probs = torch.softmax(logits[-1], dim=0)
negative_prob = last_token_probs[tokenizer.encode(" negative")[0]]
positive_prob = last_token_probs[tokenizer.encode(" positive")[0]]

print(f"tokens: {tokens}\nnegative prob: {negative_prob}\npositive prob: {positive_prob}")

tokens: ['Hor', 'rible', ':', ' negative', '\n', 'Great', ':', ' positive', '\n', 'Bad', ':', ' negative']
negative prob: 0.7252263426780701
positive prob: 0.11788520216941833
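
The two probabilities above do not sum to 1, because the rest of the mass is spread over every other token in the vocabulary. One way to turn them into a classification is to renormalize over just the candidate labels and take the larger one. The sketch below uses plain Python and the numbers printed above; the helper names `normalize_label_probs` and `pick_label` are hypothetical, not part of the notebook.

```python
def normalize_label_probs(label_probs):
    """Rescale the probabilities of the candidate labels so they sum to 1."""
    total = sum(label_probs.values())
    if total <= 0:
        raise ValueError("label probabilities must have positive total mass")
    return {label: prob / total for label, prob in label_probs.items()}

def pick_label(label_probs):
    """Return the most likely candidate label and the renormalized distribution."""
    normalized = normalize_label_probs(label_probs)
    best_label = max(normalized, key=normalized.get)
    return best_label, normalized

# Values copied from the output above
label_probs = {"negative": 0.7252263426780701, "positive": 0.11788520216941833}
label, normalized = pick_label(label_probs)
print(label, normalized)
```

Comparing label probabilities this way avoids relying on the generated text matching one of the labels exactly.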


# Loading the data

Note: before running this section, load `train.jsonl` into the runtime.

In [None]:
def load_jsonl(filename):
    # Parse one JSON object per non-empty line; the context manager closes the file afterwards
    with open(filename) as f:
        return [json.loads(line) for line in f if line.strip()]

In [None]:
train_examples = load_jsonl("train.jsonl")
train_examples[-1]

{'label': 'False',
 'meta': {'id': '1310.8601', 'year': 2013},
 'text': 'non relativistic approach for cosmological scalar field dark matter. we derive non relativistic equations of motion for the formation of cosmological structure in a scalar field dark matter sfdm model corresponding to a complex scalar field endowed with a quadratic scalar potential. starting with the full equations of motion written in the newtonian gauge of scalar perturbations we separate out the fields involved into relativistic and non relativistic parts and find the equations of motion for the latter that can be used to build up the full solution. one important assumption will also be that the sfdm field is in the regime of fast oscillations under which its behavior is exactly that of cold dark matter. the resultant equations are quite similar to the schr odinger poisson system of newtonian boson stars plus relativistic leftovers. we exploit that similarity to show how to simulate with minimum numerical effor
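
Because the solution should work with only 20 labeled examples, it can help to fix that subset once and reuse it across experiments. The sketch below draws a reproducible random sample; the helper name `sample_labeled_subset`, the seed, and the synthetic `toy_examples` list are assumptions for illustration, standing in for `train_examples`.

```python
import random
from collections import Counter

def sample_labeled_subset(examples, n=20, seed=0):
    """Draw a reproducible random subset of at most n labeled examples."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return rng.sample(examples, min(n, len(examples)))

# Synthetic stand-in for train_examples, in the same record format
toy_examples = [
    {"label": "True" if i % 3 == 0 else "False",
     "text": f"title {i}. abstract {i}",
     "meta": {"id": str(i)}}
    for i in range(500)
]

few_shot_pool = sample_labeled_subset(toy_examples)
label_counts = Counter(example["label"] for example in few_shot_pool)
print(len(few_shot_pool), dict(label_counts))
```

Checking the label counts of the sample is worthwhile, since a heavily skewed draw may bias a few-shot prompt toward one label.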

# Basic prompt building

In [None]:
def render_example(example):
    # The text is "<title>. <abstract>", so split on the first period only
    title, _, abstract = example["text"].partition(".")
    return f"""Title: {title.strip()}
Abstract: {abstract.strip()}
Label: {"AI" if example["label"] == "True" else "Not AI"}"""

In [None]:
def render_end_example(example):
    # Same layout as render_example, but the label is left blank for the model to fill in
    title, _, abstract = example["text"].partition(".")
    return f"""Title: {title.strip()}
Abstract: {abstract.strip()}
Label:"""

In [None]:
def make_prompt(instructions, train_examples, end_example):
    rendered_train_examples = "\n\n--\n\n".join([render_example(example) for example in train_examples])
    return f"""{instructions}

{rendered_train_examples}

--

{render_end_example(end_example)}"""

In [None]:
INSTRUCTIONS = "Classify the following examples based on whether they are AI-relevant or not:"

prompt = make_prompt(INSTRUCTIONS, train_examples[:4], train_examples[4])
print(prompt)

Classify the following examples based on whether they are AI-relevant or not:

Title: thermodynamic analysis of quantum error correcting engines
Abstract: quantum error correcting codes can be cast in a way which is strikingly similar to a quantum heat engine undergoing an otto cycle. in this paper we strengthen this connection further by carrying out a complete assessment of the thermodynamic properties of strokes operator based error correcting codes. this includes an expression for the entropy production in the cycle which as we show contains clear contributions stemming from the different sources of irreversibility. to illustrate our results we study a classical qubit error correcting code well suited for incoherent states and the qubit shor code capable of handling fully quantum states. we show that the work cost associated with the correction gate is directly associated with the heat introduced by the error. moreover the work cost associated with encoding decoding quantum informa
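
GPT-2 has a context window of 1024 tokens, so a prompt containing several full abstracts may not leave room for many examples. One possible adjustment is to shorten each abstract before rendering it. The sketch below truncates by word count, which is only a rough proxy for token count; the helper names `truncate_words` and `shorten_example` and the 60-word budget are assumptions, not part of the original notebook.

```python
def truncate_words(text, max_words):
    """Keep at most the first max_words whitespace-separated words."""
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words])

def shorten_example(example, max_abstract_words=60):
    """Return a copy of the example whose abstract is cut to a word budget."""
    # The text is "<title>. <abstract>", so split on the first period only
    title, _, abstract = example["text"].partition(".")
    shortened_text = f"{title.strip()}. {truncate_words(abstract.strip(), max_abstract_words)}"
    return {**example, "text": shortened_text}

# Toy record in the dataset's format
long_example = {
    "label": "False",
    "text": "some title. " + " ".join(f"w{i}" for i in range(100)),
    "meta": {"id": "0000.0000", "year": 2020},
}
short_example = shorten_example(long_example, max_abstract_words=10)
print(short_example["text"])
```

Because the title and the first period are preserved, the shortened records can still be passed to `render_example` and `make_prompt` unchanged.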

In [None]:
generated_text = generate(prompt, stop_token="\n")
print(generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Classify the following examples based on whether they are AI-relevant or not:

Title: thermodynamic analysis of quantum error correcting engines
Abstract: quantum error correcting codes can be cast in a way which is strikingly similar to a quantum heat engine undergoing an otto cycle. in this paper we strengthen this connection further by carrying out a complete assessment of the thermodynamic properties of strokes operator based error correcting codes. this includes an expression for the entropy production in the cycle which as we show contains clear contributions stemming from the different sources of irreversibility. to illustrate our results we study a classical qubit error correcting code well suited for incoherent states and the qubit shor code capable of handling fully quantum states. we show that the work cost associated with the correction gate is directly associated with the heat introduced by the error. moreover the work cost associated with encoding decoding quantum informa