In [1]:
# Install the core libraries required for the session
!pip install torch tensorflow transformers pandas



In [2]:
from transformers import pipeline
import pandas as pd

# Setting up the model identifiers as per assignment instructions
models = {
    "BERT": "bert-base-uncased",
    "RoBERTa": "roberta-base",
    "BART": "facebook/bart-base"
}



In [3]:
gen_prompt = "The future of Artificial Intelligence is"

print("--- Running Experiment 1: Text Generation ---")
for name, model_path in models.items():
    print(f"\nTesting {name}...")
    try:
        # Initializing the text-generation pipeline
        gen_pipe = pipeline("text-generation", model=model_path)
        result = gen_pipe(gen_prompt, max_new_tokens=20)
        print(f"Output: {result[0]['generated_text']}")
    except Exception as e:
        print(f"Result: Model failed to generate. (Error: {e})")

--- Running Experiment 1: Text Generation ---

Testing BERT...





If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`



Device set to use cpu


Output: The future of Artificial Intelligence is....................

Testing RoBERTa...



If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`



Device set to use cpu


Output: The future of Artificial Intelligence is

Testing BART...



Some weights of BartForCausalLM were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Device set to use cpu


Output: The future of Artificial Intelligence is euph OLED OLED empirical empirical empirical Dustin Zel euph euph euph Duo stat stat 445 445 445ORYORY
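
The three outputs above differ because the `text-generation` task decodes autoregressively: each new token is picked from scores computed over everything generated so far. The block below is a minimal pure-Python sketch of greedy decoding under that assumption, not the `transformers` implementation. The names `greedy_decode`, `toy_score_fn`, and `TOY_SCORES` are hypothetical, and the score table is hand-written for illustration only.

```python
from typing import Callable, Dict, List


def greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    max_new_tokens: int,
    eos: str = "<eos>",
) -> List[str]:
    """Append the highest-scoring next token until eos or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = score_fn(tokens)  # scores are recomputed from the current prefix
        next_token = max(scores, key=scores.get)
        if next_token == eos:
            break
        tokens.append(next_token)
    return tokens


# Hypothetical hand-written scores keyed by the last token only (a toy bigram
# model); a real decoder conditions on the full prefix.
TOY_SCORES = {
    "is": {"bright": 0.7, "unclear": 0.2, "<eos>": 0.1},
    "bright": {"and": 0.6, "<eos>": 0.4},
    "and": {"open": 0.8, "<eos>": 0.2},
    "open": {"<eos>": 0.9, "and": 0.1},
}


def toy_score_fn(tokens: List[str]) -> Dict[str, float]:
    """Look up next-token scores for the last token; stop on unknown tokens."""
    return TOY_SCORES.get(tokens[-1], {"<eos>": 1.0})


print(greedy_decode(toy_score_fn, ["The", "future", "is"], max_new_tokens=20))
```

When the scores come from weights that were never trained, as the `lm_head.weight` warning for BART indicates, the highest-scoring token is effectively arbitrary, which is consistent with the gibberish recorded above.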


In [4]:
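# BERT uses [MASK]; RoBERTa and BART both use <mask>.
# Only RoBERTa is switched to the <mask> prompt below, so BART receives [MASK]
# and raises the "No mask_token" error shown in the output.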
mlm_prompt = "The goal of Generative AI is to [MASK] new content."
roberta_mlm_prompt = "The goal of Generative AI is to <mask> new content."

print("--- Running Experiment 2: Fill-Mask ---")
for name, model_path in models.items():
    prompt = roberta_mlm_prompt if name == "RoBERTa" else mlm_prompt
    try:
        mask_pipe = pipeline("fill-mask", model=model_path)
        result = mask_pipe(prompt)
        print(f"{name} top prediction: '{result[0]['token_str']}' (Score: {result[0]['score']:.4f})")
    except Exception as e:
        print(f"{name} failed: {e}")

--- Running Experiment 2: Fill-Mask ---


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


BERT top prediction: 'create' (Score: 0.5397)


Device set to use cpu


RoBERTa top prediction: ' generate' (Score: 0.3711)


Device set to use cpu


BART failed: No mask_token (<mask>) found on the input
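
The BART error above comes from the prompt rather than the model: BERT's tokenizer uses `[MASK]`, while the RoBERTa and BART tokenizers use `<mask>`. One way to avoid hardcoding the token is to write the prompt with a neutral placeholder and substitute each tokenizer's own mask token. The sketch below assumes the fill-mask pipeline exposes `tokenizer.mask_token`; `build_masked_prompt`, `top_fill_mask_prediction`, and the `{mask}` placeholder are hypothetical names introduced here, not part of the original notebook.

```python
def build_masked_prompt(template: str, mask_token: str) -> str:
    """Replace the neutral {mask} placeholder with a model-specific mask token."""
    if "{mask}" not in template:
        raise ValueError("template must contain the {mask} placeholder")
    return template.replace("{mask}", mask_token)


def top_fill_mask_prediction(model_path: str, template: str) -> dict:
    """Run fill-mask using the mask token reported by the model's own tokenizer."""
    # Imported lazily so the helper above works without transformers installed;
    # the first call downloads the model.
    from transformers import pipeline

    mask_pipe = pipeline("fill-mask", model=model_path)
    prompt = build_masked_prompt(template, mask_pipe.tokenizer.mask_token)
    return mask_pipe(prompt)[0]


template = "The goal of Generative AI is to {mask} new content."
print(build_masked_prompt(template, "[MASK]"))  # token used by bert-base-uncased
print(build_masked_prompt(template, "<mask>"))  # token used by roberta-base and facebook/bart-base
```

Calling `top_fill_mask_prediction(model_path, template)` inside the loop would replace the per-model prompt selection; the predictions it would return for BART were not recorded in this notebook.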


In [5]:
qa_context = "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."
qa_question = "What are the risks?"

print("--- Running Experiment 3: Question Answering ---")
for name, model_path in models.items():
    try:
        # Note: these base checkpoints have no QA head fine-tuned on a dataset such as SQuAD,
        # so qa_outputs is randomly initialized and the extracted answers are not reliable
        qa_pipe = pipeline("question-answering", model=model_path)
        result = qa_pipe(question=qa_question, context=qa_context)
        print(f"{name} Answer: {result['answer']}")
    except Exception as e:
        print(f"{name} failed: {e}")

--- Running Experiment 3: Question Answering ---


Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu


BERT Answer: , and deepfakes


Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu


RoBERTa Answer: such as hallucinations, bias, and deepfakes


Some weights of BartForQuestionAnswering were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu


BART Answer: such as hallucinations, bias, and deepfakes
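
Extractive question answering scores every token as a possible answer start and as a possible answer end, then returns the best-scoring span. The block below is a simplified sketch of that selection step, not the pipeline's actual code. `best_span` is a hypothetical helper, the whitespace-style token list is a simplification of real subword tokenization, and the logits are made up by hand to mimic what a fine-tuned head might emit.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 15,
) -> Tuple[int, int]:
    """Return (start, end) maximizing start_logits[s] + end_logits[e] with s <= e."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_logits):
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):  # the end may not precede the start
            score = s_score + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


tokens = ["Generative", "AI", "poses", "significant", "risks", "such", "as",
          "hallucinations", ",", "bias", ",", "and", "deepfakes", "."]

# Hypothetical logits: one clear peak for the start and one for the end.
start_logits = [0.0] * len(tokens)
end_logits = [0.0] * len(tokens)
start_logits[7] = 5.0   # "hallucinations"
end_logits[12] = 5.0    # "deepfakes"

s, e = best_span(start_logits, end_logits)
print(" ".join(tokens[s:e + 1]))
```

In the runs above, the warnings report that `qa_outputs.weight` and `qa_outputs.bias` were newly initialized for all three models, so the real start and end scores came from untrained weights and the selected spans should not be read as evidence of comprehension.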


| Task | Model | Classification (Success/Failure) | Observation (What actually happened?) | Why did this happen? (Architectural Reason) |
| :--- | :--- | :--- | :--- | :--- |
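| **Generation** | BERT | Failure | Appended only periods to the prompt: .......... | BERT is an encoder-only model pretrained with masked language modeling and bidirectional attention, not left-to-right next-token prediction; the pipeline loaded it as `BertLMHeadModel` without `is_decoder=True`. |
|  | RoBERTa | Failure | Returned the prompt unchanged | RoBERTa is also encoder-only and MLM-pretrained; it was loaded as `RobertaLMHeadModel` without `is_decoder=True`. |
|  | BART | Failure | Produced gibberish such as "euph OLED OLED empirical" | BART has a decoder, but the pipeline loaded it as `BartForCausalLM` and `lm_head.weight` and `model.decoder.embed_tokens.weight` were newly initialized, so the output scores came from untrained weights. |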
| **Fill-Mask** | BERT | Success | Top prediction "create" (score 0.5397) | BERT is pretrained with Masked Language Modeling (MLM), so the checkpoint ships a trained MLM head. |
|  | RoBERTa | Success | Top prediction " generate" (score 0.3711) | RoBERTa is an encoder pretrained with an optimized MLM recipe. |
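|  | BART | Failure | Error: No mask_token (`<mask>`) found on the input | The prompt used `[MASK]`, but BART's tokenizer expects `<mask>`. This is a prompt error rather than an architectural limit: BART's denoising pretraining includes infilling masked spans. |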
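| **QA** | BERT | Partial | Extracted only ", and deepfakes" | `bert-base-uncased` has no QA head fine-tuned on a dataset like SQuAD; `qa_outputs` was randomly initialized, so the selected span is essentially arbitrary. |
|  | RoBERTa | Success (unreliable) | Extracted "such as hallucinations, bias, and deepfakes" | The `qa_outputs` head was also randomly initialized, so this plausible span is likely coincidental rather than evidence of better span extraction. |
|  | BART | Success (unreliable) | Extracted "such as hallucinations, bias, and deepfakes" | The `qa_outputs` head was also randomly initialized; the result does not show that the encoder-decoder architecture helps here. A QA-fine-tuned checkpoint would be needed for a fair comparison. |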
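
The architectural reasons in the table can be condensed into one rule of thumb: a task is expected to work out of the box only when the checkpoint ships trained weights for the head that task needs. The lookup below is a hypothetical summary built from the load-time warnings recorded in this notebook; `PRETRAINED_HEADS` and `expected_to_work` are illustrative names. Listing `fill-mask` for BART is an assumption based on its denoising pretraining, since this notebook never ran BART with a `<mask>` prompt.

```python
# Hypothetical summary: tasks for which each checkpoint is assumed to ship
# trained head weights. None of the three has a trained QA head or was loaded
# as a trained causal language model in the runs above.
PRETRAINED_HEADS = {
    "bert-base-uncased": {"fill-mask"},
    "roberta-base": {"fill-mask"},
    "facebook/bart-base": {"fill-mask"},  # assumption: not verified in this notebook
}


def expected_to_work(model_path: str, task: str) -> bool:
    """Return True when the checkpoint is assumed to have a trained head for the task."""
    return task in PRETRAINED_HEADS.get(model_path, set())


for model_path in PRETRAINED_HEADS:
    for task in ("text-generation", "fill-mask", "question-answering"):
        status = "expected to work" if expected_to_work(model_path, task) else "needs fine-tuning or another checkpoint"
        print(f"{model_path:20s} {task:20s} {status}")
```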
