In [1]:
!pip install transformers torch
from transformers import pipeline





In [2]:
models = {
    "BERT": "bert-base-uncased",
    "RoBERTa": "roberta-base",
    "BART": "facebook/bart-base"
}


Experiment 1: Text Generation
Task: Try to generate text using the prompt: "The future of Artificial Intelligence is"

Code Hint: pipeline('text-generation', model='...')
Hypothesis: Which models will fail? Why? (Hint: Can an Encoder generate new tokens easily?)


In [3]:
prompt = "The future of Artificial Intelligence is"

for name, model in models.items():
    try:
        generator = pipeline("text-generation", model=model)
        output = generator(prompt, max_new_tokens=20)
        print(f"\n{name} Output:")
        print(output[0]["generated_text"])
    except Exception as e:
        print(f"\n{name} Failed:")
        print(e)


If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`


Device set to use cuda:0



BERT Output:
The future of Artificial Intelligence is....................


If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`


Device set to use cuda:0



RoBERTa Output:
The future of Artificial Intelligence is


Some weights of BartForCausalLM were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Device set to use cuda:0



BART Output:
The future of Artificial Intelligence is ebook exhibitionshot pointed pointed563 Ame nutrientsHonginduced unlikelyinduced GibbsCarelogin Nuclear ogre martialisoft
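
One way to picture the hypothesis above is to compare which positions each token is allowed to attend to. The following is a minimal toy sketch in plain Python, not code from the `transformers` library; the helper name `attention_mask` and the shortened token list are made up for illustration. An encoder trained with full (bidirectional) visibility has never learned to predict a token from its left context alone, which is what text generation requires.

```python
def attention_mask(seq_len, causal):
    """Return a seq_len x seq_len grid of 0/1 flags.

    mask[i][j] == 1 means position i may attend to position j.
    (Hypothetical helper for illustration only.)
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Shortened stand-in for the prompt, split into words for readability.
tokens = ["The", "future", "of", "AI", "is"]

# Encoder-style attention: every position sees every other position.
bidirectional = attention_mask(len(tokens), causal=False)
# Decoder-style attention: each position sees only itself and earlier positions.
causal = attention_mask(len(tokens), causal=True)

for label, mask in (("bidirectional", bidirectional), ("causal", causal)):
    print(label)
    for token, row in zip(tokens, mask):
        print(f"  {token:>7}: {row}")
```

In the causal grid, the last row is the only one that sees the whole prompt, and the model is trained to predict the following token from exactly that view. The bidirectional grid has no such left-to-right structure.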


Experiment 2: Masked Language Modeling (Missing Word)
Task: Predict the missing word in: "The goal of Generative AI is to [MASK] new content."

Code Hint: pipeline('fill-mask', model='...')
Hypothesis: This is how BERT/RoBERTa were trained. They should perform well.


In [4]:
# Note: "[MASK]" is BERT's mask token; RoBERTa and BART tokenizers use "<mask>".
text = "The goal of Generative AI is to [MASK] new content."

for name, model in models.items():
    try:
        fill = pipeline("fill-mask", model=model)
        output = fill(text)
        print(f"\n{name} Predictions:")
        for pred in output[:3]:
            print(pred["token_str"])
    except Exception as e:
        print(f"\n{name} Failed:")
        print(e)


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0



BERT Predictions:
create
generate
produce


Device set to use cuda:0



RoBERTa Failed:
No mask_token (<mask>) found on the input


Device set to use cuda:0



BART Failed:
No mask_token (<mask>) found on the input
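
The two failures above come from the input string, not from the models: each tokenizer has its own mask token. A minimal sketch of one way to build a per-model input follows. The `{mask}` placeholder, the `MASK_TOKENS` table, and the helper `with_mask_token` are hypothetical names introduced here, not part of the notebook or the library.

```python
# Mask tokens used by the tokenizers of the three checkpoints in this notebook.
MASK_TOKENS = {
    "bert-base-uncased": "[MASK]",
    "roberta-base": "<mask>",
    "facebook/bart-base": "<mask>",
}


def with_mask_token(template, mask_token, placeholder="{mask}"):
    """Swap a neutral placeholder for a model-specific mask token.

    Hypothetical helper: raises if the template does not contain
    exactly one placeholder.
    """
    if template.count(placeholder) != 1:
        raise ValueError("template must contain exactly one placeholder")
    return template.replace(placeholder, mask_token)


template = "The goal of Generative AI is to {mask} new content."
for checkpoint, mask_token in MASK_TOKENS.items():
    print(f"{checkpoint}: {with_mask_token(template, mask_token)}")
```

Inside the notebook loop, the hard-coded table could be avoided by reading the token from the pipeline after it is built, for example `fill.tokenizer.mask_token`, and passing that to the helper.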


Experiment 3: Question Answering
Task: Answer the question "What are the risks?" based on the context: "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."

Code Hint: pipeline('question-answering', model='...')
Note: Using a "base" model (not fine-tuned for SQuAD) might yield random or poor results. Observe this behavior.

In [5]:
qa_input = {
    "question": "What are the risks?",
    "context": "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."
}

for name, model in models.items():
    try:
        qa = pipeline("question-answering", model=model)
        output = qa(qa_input)
        print(f"\n{name} Answer:")
        print(output["answer"])
    except Exception as e:
        print(f"\n{name} Failed:")
        print(e)


Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cuda:0
Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



BERT Answer:
deepfakes


Device set to use cuda:0
Some weights of BartForQuestionAnswering were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



RoBERTa Answer:
hallucinations, bias, and deepfakes


Device set to use cuda:0



BART Answer:
deepfakes


| Task | Model | Classification (Success / Partial Success / Failure) | Observation (What actually happened?) | Why did this happen? (Architectural Reason) |
|------|-------|----------------------------------|----------------------------------------|--------------------------------------------|
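| **Generation** | BERT | Failure | Output was the prompt followed by a run of dots, with no meaningful continuation. | BERT is encoder-only and was pretrained with masked language modeling over bidirectional context, not left-to-right next-token prediction. The pipeline also warned that `BertLMHeadModel` needs `is_decoder=True` for standalone use. |
|  | RoBERTa | Failure | Returned only the prompt; no new text was produced. | RoBERTa is also encoder-only and MLM-pretrained, so it was never trained to predict the next token. The same `is_decoder=True` warning appeared for `RobertaLMHeadModel`. |
|  | BART | Partial Success | Generated text, but the output was incoherent and noisy. | BART has an autoregressive decoder, but loading `facebook/bart-base` as `BartForCausalLM` left `lm_head.weight` and `model.decoder.embed_tokens.weight` newly initialized (per the load warning), so the output tokens came from untrained weights. |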
| **Fill-Mask** | BERT | Success | Correctly predicted words like “create”, “generate”, “produce”. | BERT is trained using Masked Language Modeling (MLM). |
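|  | RoBERTa | Failure | Error: No `<mask>` token found in input. | RoBERTa's tokenizer uses `<mask>` rather than `[MASK]`, so the pipeline rejected the input. This is an input-format error, not an architectural limitation; RoBERTa was also trained with MLM. |
|  | BART | Failure | Error: No `<mask>` token found in input. | BART's tokenizer also uses `<mask>`, so the same input-format error occurred. The result says nothing about BART's ability to fill masks. |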
| **QA** | BERT | Partial Success | Returned a relevant but incomplete answer (“deepfakes”). | Base BERT is not fine-tuned on QA datasets like SQuAD; the `qa_outputs` head was randomly initialized (per the load warning), so the selected span is largely arbitrary. |
|  | RoBERTa | Partial Success | Returned a more complete answer (“hallucinations, bias, and deepfakes”). | The `qa_outputs` head is randomly initialized here too, so the fuller span is not evidence of QA ability and may differ between runs. |
|  | BART | Partial Success | Returned a single relevant term (“deepfakes”). | The `qa_outputs` head is randomly initialized; the model lacks QA-specific training. |
