# 1. Large Language Models

### 1.1 What might help the language model to get more context about the world?

Two possible answers are multi-modality or integration of a knowledge base.

### 1.2 Which metrics or patterns might help to detect AI-generated text?

Perplexity is a good first signal: text generated by an LM tends to have low perplexity under a similar LM. Beyond that, regularity and repetitiveness are useful patterns (LMs do not vary sentence length or complexity as often as humans do).
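To make the perplexity signal concrete, here is a minimal sketch that computes perplexity from per-token log-probabilities. The log-probability values are invented for illustration; in practice they would come from scoring the text with a language model.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probabilities (natural log) under some scoring LM.
predictable_text = [-0.5, -0.7, -0.4, -0.6]  # every token is fairly likely
surprising_text = [-2.5, -0.3, -4.0, -1.2]   # some tokens are very unlikely

print(perplexity(predictable_text))  # lower value
print(perplexity(surprising_text))   # higher value
```

A detector built on this idea would flag text whose perplexity falls below some threshold; the threshold itself has to be calibrated on known human and machine text.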

# 2. Reasoning about GPT-3 (Use the [GPT-3 Paper](https://papers.nips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf)!)

### 2.1 What is the minimum amount of RAM I would need to run a batch of 4 samples through a GPT-3 instance? (Hint: think about the no. of weights)

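The weights are shared across all samples in the batch, so they only need to be loaded once, regardless of the batch size. Every 32-bit float is 4 bytes => 175B weights * 4 bytes / 1024^3 = ~650GB of RAM at the very least. In practice you need considerably more, since the activations for the 4 samples also have to fit in memory.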
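The arithmetic behind this estimate can be checked with a few lines of Python. This is a sketch covering the weights only; the half-precision figure is an added assumption for comparison and is not part of the exercise.

```python
NUM_WEIGHTS = 175e9   # GPT-3 parameter count reported in the paper
BYTES_PER_FP32 = 4    # assumption: weights stored as 32-bit floats

def weight_memory_gib(num_weights, bytes_per_weight):
    """Memory needed to hold the weights, in GiB (1 GiB = 1024^3 bytes)."""
    return num_weights * bytes_per_weight / 1024**3

fp32_gib = weight_memory_gib(NUM_WEIGHTS, BYTES_PER_FP32)
fp16_gib = weight_memory_gib(NUM_WEIGHTS, 2)  # assumption: half precision

print(f"fp32: {fp32_gib:.0f} GiB, fp16: {fp16_gib:.0f} GiB")
```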

### 2.2 According to the paper, what do you think is a limitation of the training objective?

Paraphrasing from the limitations section: GPT-3 does not have a bidirectional training objective (unlike BERT), which is a drawback for tasks that empirically benefit from bidirectionality. These may include fill-in-the-blank tasks, tasks that involve looking back and comparing two pieces of content, or tasks that require re-reading or carefully considering a long passage and then generating a very short answer. In addition, the objective weights every predicted token equally, with no notion of which predictions matter most, which is undesirable.



# 3. HuggingFace Introduction

HuggingFace is a platform that provides a variety of pre-trained transformers and is in general a great resource for NLP.

## Installations & Imports 

In [1]:
! pip install transformers datasets evaluate

In [2]:
import torch
import torch.nn as nn

import transformers

from transformers import pipeline
from datasets import load_dataset

import random

## DistilBert for Language Modeling

In this tutorial, we will work with DistilBERT, a smaller LM than BERT with similar performance.
Note that we use `AutoTokenizer` and `AutoModel` instead of `DistilBertTokenizer` and `DistilBertModel`. For this checkpoint, the two pairs of classes are equivalent.


In [3]:
MODEL_TYPE = 'distilbert-base-uncased'

tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_TYPE)
model = transformers.AutoModel.from_pretrained(MODEL_TYPE)
print(f"# DistilBert Parameters: {round(model.num_parameters() / 1_000_000)}M (Remember from the lecture that BERT has around 110M parameters)")

text = "NLP2.0 is my favorite lecture" 
encoded_input = tokenizer(text, return_tensors='pt')

output = model(**encoded_input)

Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_layer_norm.weight', 'vocab_projector.bias', 'vocab_transform.bias', 'vocab_transform.weight', 'vocab_layer_norm.bias', 'vocab_projector.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


# DistilBert Parameters: 66M (Remember from the lecture that BERT has around 110M parameters)


Remember that BERT operates at the WordPiece level.

In [4]:
tokenizer.tokenize(text)

['nl', '##p', '##2', '.', '0', 'is', 'my', 'favorite', 'lecture']

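By default, the output of the base model contains only the last hidden state. Its sequence length of 11 corresponds to the 9 wordpieces above plus the special `[CLS]` and `[SEP]` tokens.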

In [5]:
last_hidden_state = output.last_hidden_state
last_hidden_state.shape # shape: [1, 11, 768]

torch.Size([1, 11, 768])

## 3.1 Masked Language Modeling



In [6]:
MODEL_TYPE = 'distilbert-base-uncased'
tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_TYPE)

####################################################################
# TODO find correct model head: AutoModelForMaskedLM
####################################################################
model = transformers.AutoModelForMaskedLM.from_pretrained(MODEL_TYPE)
####################################################################

text = "The new movie was [MASK]."
inputs = tokenizer(text, return_tensors="pt")
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]

logits = model(**inputs).logits
mask_token_logits = logits[0, mask_token_index, :]

top_3_tokens = torch.topk(mask_token_logits, 3, dim=1).indices[0].tolist()

for token in top_3_tokens:
    print(text.replace(tokenizer.mask_token, tokenizer.decode([token])))


The new movie was cancelled.
The new movie was filmed.
The new movie was released.


## 3.2 Language Generation
While the simple BERT variants can only predict one masked token at a time, there are approaches that try to generate multiple tokens at once. However, because of their bidirectional nature, encoder architectures perform worse at generation than autoregressive models.

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

####################################################################
# TODO select suitable auto-regressive model
####################################################################
MODEL_TYPE = "gpt2-large"
####################################################################

tokenizer = AutoTokenizer.from_pretrained(MODEL_TYPE)
model = AutoModelForCausalLM.from_pretrained(MODEL_TYPE)

prompt = "Today was an amazing day because"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, do_sample=True, max_new_tokens=100)
tokenizer.batch_decode(outputs, skip_special_tokens=True)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


['Today was an amazing day because I was the first to be in the room with the doctors, and what I heard was the most wonderful thing I had ever heard. It was a great day to feel normal."\n\nThis story originally appeared on GSN and was republished with permission from TIME.\n\nMORE READING\n\nIs this who Hillary Clinton really is? It is hard to tell and difficult to prove.\n\nAfter \'deplorables\': Clinton campaign turns to \'alt-right\' to defeat Trump']

## Pipeline as Alternative
`pipeline` is a powerful tool that automates many common tasks. We briefly show how to use it for masked language modeling and language generation:

In [None]:
text = "I want to eat [MASK]."
mask_filler = transformers.pipeline("fill-mask", "distilbert-base-uncased")
mask_filler(text, top_k=3)

[{'score': 0.03075120598077774,
  'token': 6350,
  'token_str': 'breakfast',
  'sequence': 'i want to eat breakfast.'},
 {'score': 0.02877492643892765,
  'token': 2242,
  'token_str': 'something',
  'sequence': 'i want to eat something.'},
 {'score': 0.02485204115509987,
  'token': 2009,
  'token_str': 'it',
  'sequence': 'i want to eat it.'}]

In [None]:
text = "Hugging Face is a community-based open-source platform for machine learning."
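# NOTE: t5-small is an encoder-decoder model, which the "text-generation" pipeline does not support
# (see the warning in the output below). A causal LM such as "gpt2" is the suitable choice here.
generator = transformers.pipeline("text-generation", "t5-small") # TODO 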
generator(text)  # doctest: +SKIP

For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-small automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
The model 'T5ForConditionalGeneration' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'Maria

[{'generated_text': 'Hugging Face is a community-based open-source platform for machine learning. learning.'}]

## Fine-Tune BERT on Sentence-Pair Classification (MRPC)

In [None]:
from transformers import AutoTokenizer, DataCollatorWithPadding

raw_datasets = load_dataset("glue", "mrpc")
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)


def tokenize_function(example):
    return tokenizer(example["sentence1"], example["sentence2"], truncation=True)


tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)




In [None]:
example = tokenized_datasets['train'][0]
print(example['sentence1'])
print(example['sentence2'])
print(example['label']) # in MRPC, 1 means that sentence1 and sentence2 are paraphrases of each other


Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence .
Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence .
1


In [None]:
tokenized_datasets = tokenized_datasets.remove_columns(["sentence1", "sentence2", "idx"])
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
tokenized_datasets.set_format("torch")
tokenized_datasets["train"].column_names

['labels', 'input_ids', 'token_type_ids', 'attention_mask']

In [None]:
from torch.utils.data import DataLoader

train_dataloader = DataLoader(
    tokenized_datasets["train"], shuffle=True, batch_size=8, collate_fn=data_collator
)
eval_dataloader = DataLoader(
    tokenized_datasets["validation"], batch_size=8, collate_fn=data_collator
)

In [None]:
for batch in train_dataloader:
    break
b = {k: v.shape for k, v in batch.items()}

You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


## 3.3 Create Model Head for Sequence Classification

In [None]:
import torch.nn as nn
from transformers import AutoModel
from transformers.modeling_outputs import SequenceClassifierOutput
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss

class MyBERTModel(nn.Module):
    def __init__(self, is_frozen=True):
        super(MyBERTModel, self).__init__()

        self.num_labels = 2
        checkpoint = 'bert-base-uncased' 
        self.base_model = AutoModel.from_pretrained(checkpoint)

        if is_frozen:
          self.freeze()

        ####################################################################
        # TODO: define your model head here
        ####################################################################
        self.dropout = nn.Dropout(0.5)
        self.linear = nn.Linear(768, self.num_labels) # BERT's hidden size is 768; one output per label
        ####################################################################

    def freeze(self):
      for param in self.base_model.parameters():
        param.requires_grad = False

    def forward(self, input_ids, attention_mask, token_type_ids, labels):
        ####################################################################
        # TODO: implement forward function
        ####################################################################
        outputs = self.base_model(input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        
        outputs = self.dropout(outputs[1])
        logits = self.linear(outputs)
        
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
        ####################################################################
        #return outputs, loss
        return SequenceClassifierOutput(
            loss=loss,
            logits=logits
        )

model = MyBERTModel()
model(**{'input_ids':batch['input_ids'], 'labels':batch['labels'],  'token_type_ids':batch['token_type_ids'], 'attention_mask':batch['attention_mask']})

# Note: your code should be equivalent to using the AutoModelForSequenceClassification class
'''
from transformers import AutoModelForSequenceClassification

checkpoint = 'distilbert-base-uncased'
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
'''


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


"\nfrom transformers import AutoModelForSequenceClassification\n\ncheckpoint = 'distilbert-base-uncased'\nmodel = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)\n"

In [None]:
model(**batch)

SequenceClassifierOutput(loss=tensor(0.8143, grad_fn=<NllLossBackward0>), logits=tensor([[ 0.6680,  0.8221],
        [ 0.1438, -0.1971],
        [ 0.9690,  0.7177],
        [-0.7547,  0.2342],
        [ 0.4335,  0.0127],
        [ 0.7449, -0.1288],
        [ 0.1940,  0.6175],
        [ 0.7134,  0.1328]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [None]:
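# transformers.AdamW is deprecated; the PyTorch implementation is the recommended replacement
# (note that its default weight_decay is 0.01 rather than 0.0)
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=5e-5)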

from transformers import get_scheduler

num_epochs = 3
num_training_steps = num_epochs * len(train_dataloader)
lr_scheduler = get_scheduler(
    "linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps,
)
print(num_training_steps)

import torch

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model.to(device)
device



1377


device(type='cuda')

In [None]:
import evaluate

def eval(model, loader):

  metric = evaluate.load("glue", "mrpc")
  model.eval()
  for batch in loader:
      batch = {k: v.to(device) for k, v in batch.items()}
      with torch.no_grad():
          outputs = model(**batch)


      logits = outputs.logits
      predictions = torch.argmax(logits, dim=-1)
      metric.add_batch(predictions=predictions, references=batch["labels"])

  return metric.compute()

eval(model, eval_dataloader)

{'accuracy': 0.6838235294117647, 'f1': 0.8122270742358079}

### 3.4: Implement Training Loop

In [None]:
from tqdm import tqdm

progress_bar = tqdm(range(num_training_steps))

model.train()
for epoch in range(num_epochs):
    for batch in train_dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        ####################################################################
        # TODO: implement training loop
        ####################################################################
        outputs = model(**batch)
        loss = outputs.loss
        loss.backward()

        optimizer.step()
        ####################################################################
        lr_scheduler.step()
        optimizer.zero_grad()
        progress_bar.update(1)

#eval(model, eval_dataloader)

100%|█████████▉| 1376/1377 [00:51<00:00, 26.03it/s]

In [None]:
eval(model, eval_dataloader)

{'accuracy': 0.6838235294117647, 'f1': 0.8122270742358079}
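The scores before and after training are identical, which deserves a sanity check. As a hedged illustration: assuming the MRPC validation split has 408 pairs of which 279 are labelled 1 (an assumption about the dataset, not something shown in the output above), a classifier that predicts 1 for every pair would obtain exactly these numbers. `accuracy_and_f1` is a hypothetical helper, not the `evaluate` implementation.

```python
def accuracy_and_f1(predictions, references):
    """Accuracy and F1 for the positive class (label 1) of a binary task."""
    pairs = list(zip(predictions, references))
    tp = sum(p == 1 and r == 1 for p, r in pairs)
    fp = sum(p == 1 and r == 0 for p, r in pairs)
    fn = sum(p == 0 and r == 1 for p, r in pairs)
    accuracy = sum(p == r for p, r in pairs) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1

# Assumed label distribution of the validation split: 279 positives, 129 negatives.
references = [1] * 279 + [0] * 129
predictions = [1] * len(references)  # always predict "paraphrase"

accuracy, f1 = accuracy_and_f1(predictions, references)
print(accuracy, f1)
```

If that is what happened here, the classification head has not yet learned to separate the two classes. With the BERT encoder frozen and a dropout of 0.5, common things to try are a larger learning rate for the head, more epochs, or unfreezing the encoder (`is_frozen=False`).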

## Zero-Shot Classification via Prompting




In [None]:
classifier = pipeline("zero-shot-classification")

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


### Topic Modeling

In [None]:
sequence = "Who are you voting for in 2020?"
candidate_labels = ["politics", "public health", "economics"]

classifier(sequence, candidate_labels)

In [None]:
sequence = "Who is more likely to live in a city?"
candidate_labels = ["sailor", "farmer", "mayor"]

classifier(sequence, candidate_labels)

### Sentiment Classification

In [None]:
# How to improve scores? Add 'hypothesis_template = "The sentiment of this review is {}."' 

sequences = [
    "I hated this movie. The acting sucked.",
    "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."
]
candidate_labels = ["positive", "negative"]

classifier(sequences, candidate_labels)

## 3.5 In-Context Learning

In [None]:
rotten_tomatoes = load_dataset("rotten_tomatoes")

In [None]:

subset = rotten_tomatoes['train']
examples = [
    {'text': "i can analyze this movie in three words : thumbs friggin' down .", 'label': 0},
    {'text': "sadly , 'garth' hasn't progressed as nicely as 'wayne . '", 'label': 0},
    {'text': 'make like the title and dodge this one .', 'label': 0},
    {'text': 'constantly touching , surprisingly funny , semi-surrealist exploration of the creative act .', 'label': 1},
    {'text': 'the journey is worth your time , especially if you have ellen pompeo sitting next to you for the ride .', 'label': 1},
    {'text': 'merci pour le movie .', 'label': 1}
]

# alternative
# examples = random.choices(subset, k=6)
print(examples)

test = {'text': 'if you sometimes like to go to the movies to have fun , wasabi is a good place to start .',
 'label': 1}

TEMPLATE = lambda x: x + " Overall, it was [MASK]. "

# for simplicity, we stick to a simple verbalizer so that the model has an easier time using the correct label
# VERBALIZER = {1: ["great", "good", "wonderful", "perfect"], 0: ["bad", "terrible", "horrible"]}
VERBALIZER = {1: ["good"], 0: ["bad"]}

def verbalize(label):
  # use the argument rather than the global loop variable `example`
  return random.choice(VERBALIZER[label])


####################################################################
# TODO: create pattern as function or lambda expression
####################################################################
PATTERN = lambda x: "Review: " + x 
####################################################################

prompt = ""

for example in examples: 
  ####################################################################
  # TODO create prompt
  ####################################################################
  out = PATTERN(example['text'])
  out = TEMPLATE(out)
  out = out.replace("[MASK]", verbalize(example['label']))
  ####################################################################
  prompt += out

prompt += TEMPLATE(PATTERN(test['text']))

ref = 'Review: i can analyze this movie in three words : thumbs friggin\' down . Overall, it was bad. Review: sadly , \'garth\' hasn\'t progressed as nicely as \'wayne . \' Overall, it was bad. Review: make like the title and dodge this one . Overall, it was bad. Review: constantly touching , surprisingly funny , semi-surrealist exploration of the creative act . Overall, it was good. Review: the journey is worth your time , especially if you have ellen pompeo sitting next to you for the ride . Overall, it was good. Review: merci pour le movie . Overall, it was good. Review: if you sometimes like to go to the movies to have fun , wasabi is a good place to start . Overall, it was [MASK]. '

assert ref == prompt, 'ref and prompt do not match '

prompt

[{'text': "i can analyze this movie in three words : thumbs friggin' down .", 'label': 0}, {'text': "sadly , 'garth' hasn't progressed as nicely as 'wayne . '", 'label': 0}, {'text': 'make like the title and dodge this one .', 'label': 0}, {'text': 'constantly touching , surprisingly funny , semi-surrealist exploration of the creative act .', 'label': 1}, {'text': 'the journey is worth your time , especially if you have ellen pompeo sitting next to you for the ride .', 'label': 1}, {'text': 'merci pour le movie .', 'label': 1}]


True

In [None]:
mask_filler = transformers.pipeline("fill-mask", "distilbert-base-uncased")
mask_filler(prompt, top_k=1)

# 4 End-2-End Design


### 4.1 You want to build a classification architecture on the [AG News dataset](https://huggingface.co/datasets/ag_news). Describe how you would use BERT to build a classification architecture?

It is sufficient to attach a fully-connected layer on top of BERT and feed it with [CLS] output embedding from BERT.

### 4.2 How many output neurons would your last layer have? What activation function would you use there?

As many as the classes, i.e. 4 (see dataset documentation), therefore the extra layer would have shape (768, 4). Softmax would be good as an output activation as we want to have output probabilities for our four mutually exclusive classes.
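As a small illustration of the output layer, the sketch below applies a softmax to four logits and picks the most probable AG News class. The logit values are hypothetical; in the real model they would be produced by the (768, 4) linear layer on top of the [CLS] embedding.

```python
import math

AG_NEWS_CLASSES = ["World", "Sports", "Business", "Sci/Tech"]

def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    highest = max(logits)
    exps = [math.exp(z - highest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.3, -1.2, 0.8, 2.1]  # hypothetical outputs of the final layer
probs = softmax(logits)
predicted = AG_NEWS_CLASSES[probs.index(max(probs))]

print([round(p, 3) for p in probs], predicted)
```

During training with a cross-entropy loss, the softmax is typically folded into the loss function, so the layer itself outputs raw logits.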

### 4.3 Suppose now we want to use GPT-3 instead and want to do zero-shot learning. Consider the input sentence $x_0=$ _"The Social Computing Group at TUM has just released GPT-5, that is impressive!"_  labelled as $y_0=$_"Sci/Tech"_. Design a reasonable prompt for $(x_0,y_0)$.

A suitable prompt could be "Observe the news article: $x_0$. The piece is about ______".

### 4.4 Now add demonstrations to your prompt to perform in-context learning. Are you still performing zero-shot learning? If not, what instead? Explicitly state which is the pattern $f$ and which is the verbalizer $v$.

Just imagine other news pieces. For instance, take two, $x_1$ and $x_2$, with their respective labels $y_1$ and $y_2$. The new prompt with demonstrations looks like this:
"Observe the news article: $x_1$. The piece is about $y_1$. Observe the news article: $x_2$. The piece is about $y_2$. Observe the news article: $x_0$. The piece is about ______".

We are now performing 2-shot (few-shot) learning, since we have two demonstrations. The pattern and the verbalizer are as follows:

Pattern $f(x)$ = "Observe the news article: $x$"

Verbalizer $v(y)$ = "The piece is about $y$"
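The pattern and verbalizer above can be turned into a prompt builder in a few lines. This is a sketch: `build_prompt` and the two demonstration articles are hypothetical placeholders, while the wording of the pattern and verbalizer follows the answer above.

```python
def pattern(x):
    """Pattern f(x): wraps the input article."""
    return f"Observe the news article: {x}."

def verbalizer(y):
    """Verbalizer v(y): turns a label into a natural-language continuation."""
    return f"The piece is about {y}."

def build_prompt(demonstrations, query):
    """Concatenate k labelled demonstrations, then the unlabelled query."""
    parts = [f"{pattern(x)} {verbalizer(y)}" for x, y in demonstrations]
    parts.append(f"{pattern(query)} The piece is about ______")
    return " ".join(parts)

# Placeholder demonstrations (x_1, y_1) and (x_2, y_2); real ones would come from AG News.
demos = [
    ("Placeholder article about a football match", "Sports"),
    ("Placeholder article about quarterly earnings", "Business"),
]
x_0 = "The Social Computing Group at TUM has just released GPT-5, that is impressive!"

prompt = build_prompt(demos, x_0)
print(prompt)
```

Passing an empty list of demonstrations reproduces the zero-shot prompt from 4.3.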

### 4.5. Suppose we are now given additional data to fine-tune your model, but retraining 175B parameters is absolutely unfeasible. How can parameter-efficient-tuning help us?

Parameter-efficient tuning techniques can achieve results that are on par or even superior to traditional fine-tuning while updating less than $1\%$ (sometimes even closer to $0.1\%$ or $0.01\%$) of the parameters.

### 4.7. Explain one possible way to produce an alternative (much smaller) architecture capable of matching GPT-3 in performance.

You could use RETRO, which has ~95% fewer parameters than GPT-3, and couple it with your training corpus so that it can use the corpus for retrieval at inference time.

Alternatively, you can use PET on a GPT-3 instance to create soft labels and use these to train a (much smaller) classifier.

Please note that using parameter-efficient tuning would still leave you with a large architecture at inference time.

### 4.6 Pick two parameter-efficient-tuning techniques, explain in detail how they work and how you would apply them.

Here you could pick BitFit and Adapter layers. 

The first only tunes the bias terms in self-attention and MLP layers, which are a negligible number compared to the overall amount of weights.
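To get a feeling for how small the set of bias terms is, the sketch below counts weights and biases in the linear maps of a transformer encoder. The sizes are assumed BERT-base-like values (12 layers, hidden size 768, feed-forward size 3072) rather than GPT-3's, and embeddings and layer norms are left out for simplicity.

```python
def layer_param_counts(d_model, d_ff):
    """Return (weights, biases) of the linear maps in one transformer layer."""
    attention_weights = 4 * d_model * d_model  # query, key, value, output projections
    attention_biases = 4 * d_model
    mlp_weights = 2 * d_model * d_ff           # up- and down-projection
    mlp_biases = d_ff + d_model
    return attention_weights + mlp_weights, attention_biases + mlp_biases

# Assumed BERT-base-like configuration.
NUM_LAYERS, D_MODEL, D_FF = 12, 768, 3072

layer_weights, layer_biases = layer_param_counts(D_MODEL, D_FF)
weights = NUM_LAYERS * layer_weights
biases = NUM_LAYERS * layer_biases
fraction = biases / (weights + biases)

print(f"{biases:,} bias terms vs {weights:,} weights ({fraction:.4%} trainable)")
```

Under these assumptions the bias terms make up roughly 0.1% of the counted parameters, which is in line with the fractions quoted in 4.5.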

The second one adds additional blocks composed of feedforward layers and skip connections. The feedforward layers first project down and then back up, which keeps the number of parameters in the adapter block small. The adapter blocks are inserted within the transformer blocks, and only they are fine-tuned while the rest of the model stays frozen.
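The adapter idea can be sketched without any deep-learning library. The class below is a minimal illustration, not a reference implementation: the ReLU nonlinearity, the tiny sizes and the zero-initialised up-projection (which makes the adapter start as the identity) are assumptions chosen for the example.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project, skip connection."""

    def __init__(self, d_model, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection (bottleneck x d_model) with small random weights.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                     for _ in range(bottleneck)]
        self.down_bias = [0.0] * bottleneck
        # Up-projection (d_model x bottleneck) starts at zero: adapter == identity.
        self.up = [[0.0] * bottleneck for _ in range(d_model)]
        self.up_bias = [0.0] * d_model

    def num_parameters(self):
        d_model, bottleneck = len(self.up), len(self.down)
        return 2 * d_model * bottleneck + bottleneck + d_model

    def forward(self, hidden):
        z = [a + b for a, b in zip(matvec(self.down, hidden), self.down_bias)]
        z = [max(0.0, v) for v in z]  # ReLU
        out = [a + b for a, b in zip(matvec(self.up, z), self.up_bias)]
        return [h + o for h, o in zip(hidden, out)]  # skip connection

adapter = Adapter(d_model=8, bottleneck=2)
hidden = [0.5, -1.0, 0.25, 2.0, 0.0, 1.5, -0.5, 1.0]

print(adapter.forward(hidden) == hidden)  # identity at initialisation
print(adapter.num_parameters())
```

With hidden size `d` and bottleneck size `r`, the block has `2*d*r + r + d` parameters, so choosing `r` much smaller than `d` keeps the trainable part tiny compared with a full `d x d` layer.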

# 5 Explainability

### 5.1 Consider again the [AG News dataset](https://huggingface.co/datasets/ag_news), and the input example ($x_0$,$y_0$) provided in 4.3. and a black-box classifier $f$. Name and briefly describe a method you could use to obtain a feature attribution explanation $\phi(x_0,f,y_0^*)$ for the predicted class $y_0^*$

I could use one of the SHAP methods, e.g. KernelSHAP, since it is model-agnostic and I don't know anything about the classifier $f$. SHAP attributes a score to each feature based on its marginal contribution across feature coalitions.

The result $\phi(x_0,f,y_0^*)$ would be a list of scores, each representing the relevance of a token in $x_0$.
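To illustrate what "marginal contribution across coalitions" means, the sketch below computes exact Shapley values for a toy black-box scorer by enumerating all token orderings. This is not KernelSHAP itself, which approximates these values by sampling coalitions; `toy_classifier` and its scores are invented for the example.

```python
from itertools import permutations

def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {feature: 0.0 for feature in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        coalition = set()
        for feature in ordering:
            before = value_fn(coalition)
            coalition.add(feature)
            totals[feature] += value_fn(coalition) - before
    return {feature: total / len(orderings) for feature, total in totals.items()}

def toy_classifier(tokens_present):
    """Hypothetical black-box score for the "Sci/Tech" class given the kept tokens."""
    score = 0.1
    if "GPT-5" in tokens_present:
        score += 0.5
    if "Computing" in tokens_present:
        score += 0.2
    if "GPT-5" in tokens_present and "released" in tokens_present:
        score += 0.1  # interaction between two tokens
    return score

tokens = ["Computing", "released", "GPT-5", "impressive"]
phi = shapley_values(tokens, toy_classifier)

for token, value in phi.items():
    print(f"{token}: {value:.3f}")
```

The attributions sum to the difference between the score with all tokens and the score with none, and the interaction bonus is split evenly between the two tokens involved. Exact enumeration grows factorially with the number of tokens, which is why sampling-based approximations are used in practice.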


### 5.2 Give an example of what a feature attribution explanation $\phi(x_0,f,y_0)$ could look like

Assuming $f($_"The Social Computing Group at TUM has just released GPT-5, that is impressive!"_$)$ = "Sci/Tech" = $y_0^*$, a feature attribution explanation could look like this:

[The, Social, Computing, Group, at, TUM, has, just, released, GPT-5, that, is, impressive, !]

[0.0, -0.2, 0.4, 0.2, 0.0, 0.5, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0] 

Each word has a corresponding relevance score for the class $y_0^*$. The most relevant words are _"Computing"_, _"TUM"_, and _"GPT-5"_.
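Reading off the most relevant words amounts to pairing tokens with scores and sorting. A short sketch using the values from the example above:

```python
tokens = ["The", "Social", "Computing", "Group", "at", "TUM", "has", "just",
          "released", "GPT-5", "that", "is", "impressive", "!"]
scores = [0.0, -0.2, 0.4, 0.2, 0.0, 0.5, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0]

# Pair each token with its relevance score and sort by descending relevance.
ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
top_3 = [token for token, _ in ranked[:3]]

print(top_3)  # ['GPT-5', 'TUM', 'Computing']
```

Negative scores, such as the one for "Social", indicate tokens that push the prediction away from the class.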

### 5.3 What could an adversarial example explanation look like in this case? Briefly sketch one. At what level did you apply the perturbation to generate the attack?

An adversarial example should be similar to the original sentence. For instance: 

_"The Social Compuing Group has just made available GPT_5, that is impressive!"_

Here we have two character-level perturbations (in "Computing" and "GPT-5") and one word-level perturbation (released -> made available). Note that the phrase "at TUM" has also been dropped.

### 5.4 What could an influential sample explanation look like in this case?  Briefly sketch one. Is this type of explanation local or global? Does it fall within model transparency or post-hoc explainability? 

An influential sample explanation would be a list of training samples that are similar to the one considered and are thus processed similarly by the model. One such sample could be: 

_"OpenAI is actually slightly ahead of the Social Computing Group in terms of releasing GPT-5"_ 

This is a post-hoc explainability method, more specifically a local explanation.

### 5.5 Make examples of 4 lexicon concepts, one for each class. What would TCAV scores tell you about those concepts? 

I will only give a few examples for the _"Sci/Tech"_ class; it is analogous for the other classes. Lexicon concepts for this class would be nouns/adjectives that are conceptually relevant to the class itself. For instance, "digital", "technological", and "programming" are concepts that we could test for.

TCAV scores measure how relevant such concepts are for the classifier when identifying the output class: the higher the score, the stronger the signal for the classifier. Testing against many concepts helps us find the most relevant ones and makes it less likely that we select tokens deemed relevant because of spurious correlations.


### 5.6 What could the explanation of 5.2 look like if the task was sentiment analysis instead of topic classification?

We would have high relevance scores for tokens that carry strong information about the sentence's sentiment. For instance, I would expect methods to attribute a high score to the word "impressive".

### Based on what criteria are you implicitly evaluating the explanations of 5.2, 5.4, and 5.6?

Solely based on whether they look plausible to us. So plausibility, not faithfulness.