In [1]:
!pip install transformers



In [2]:
from transformers import pipeline



In [3]:
# Checkpoints to compare: two encoder-only models (BERT, RoBERTa)
# and one encoder-decoder model (BART).
models = {
    "BERT": "bert-base-uncased",
    "RoBERTa": "roberta-base",
    "BART": "facebook/bart-base"
}


# Experiment 1 — Text Generation

In [12]:
prompt = "The future of Artificial Intelligence is"

In [13]:
for name, model in models.items():
    print(f"\nModel: {name}")
    try:
        generator = pipeline("text-generation", model=model)
        output = generator(prompt, max_length=30, num_return_sequences=1)
        print(output[0]["generated_text"])
    except Exception as e:
        print("Error:", e)



Model: BERT


If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`


The future of Artificial Intelligence is................................................................................................................................................................................................................................................................

Model: RoBERTa


Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


The future of Artificial Intelligence is

Model: BART


Some weights of BartForCausalLM were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


The future of Artificial Intelligence is broavenaven Rohingavenavenaven activismavenavenpop particularlyaven pandavenaven hated concealedavenaven concealedaven mastersavenaven controversiesaven drillavenaven drillgradesavenaven shootingavenaven cleveraven TAMADRA TAMADRAgrades particularly controversies concealed concealed mastersaven hated controversiesIII drill drillaven controversiesawiavenpared drillavenalyses controversies drill drill drillgrades bro hated drill fracture shooting shooting shootingoca rangegradesavenCreatavenaven Telecomavengranaven controversies Garrisonavenaven strangely shootingaven belovedARSARS Alliedavenavengrades gossipavenaven fractureavenavenARSaven sandbox distributorsgrades controversiesgradesavenARS particularlyavenaven penetaven ridiculouslyaven Garrisonavengrades sandboxARSARSgradesARSARS controversiesavenaven gossipaven Bastavengrades Bast recycleaven belovedavenaven pesavenaven sandbox bulletavenARSARS bulletavenavensiteARS bulletARSARS ridiculously
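
The warnings above report that `max_new_tokens` (=256) took precedence over `max_length=30`, which is why the outputs run far past 30 tokens. A minimal sketch of one way to avoid the conflict is to pass only `max_new_tokens`. The names `GENERATION_KWARGS` and `generate` are hypothetical helpers, not part of the original notebook, and the sketch assumes the installed `transformers` version forwards `max_new_tokens` from the pipeline call to generation.

```python
# Hypothetical settings: only max_new_tokens is set, so there is no
# max_length value for it to conflict with.
GENERATION_KWARGS = {"max_new_tokens": 30, "num_return_sequences": 1}


def generate(model_id: str, prompt: str) -> str:
    """Run the text-generation pipeline with an explicit cap on new tokens."""
    # Imported lazily so the settings above can be inspected without
    # loading transformers or downloading a checkpoint.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    output = generator(prompt, **GENERATION_KWARGS)
    return output[0]["generated_text"]
```

Calling `generate(model, prompt)` inside the loop would stand in for the two pipeline lines. This only bounds the output length; it is not expected to make these three checkpoints produce coherent continuations.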

# Experiment 2 — Fill Mask

In [10]:
# BERT's tokenizer expects [MASK]; RoBERTa's and BART's expect <mask>.
# mask_prompt = "The goal of Generative AI is to [MASK] new content."
mask_prompt = "The goal of Generative AI is to <mask> new content."

In [11]:
for name, model in models.items():
    print(f"\nModel: {name}")
    try:
        fill_mask = pipeline("fill-mask", model=model)
        output = fill_mask(mask_prompt)
        for o in output[:3]:
            print(o["sequence"])
    except Exception as e:
        print("Error:", e)



Model: BERT


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


Error: No mask_token ([MASK]) found on the input

Model: RoBERTa


Device set to use cpu


The goal of Generative AI is to generate new content.
The goal of Generative AI is to create new content.
The goal of Generative AI is to discover new content.

Model: BART


Device set to use cpu


The goal of Generative AI is to create new content.
The goal of Generative AI is to help new content.
The goal of Generative AI is to provide new content.
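
The BERT error above comes from the hard-coded `<mask>` in the prompt rather than from the model itself. One possible way to run all three models in a single loop is to build the prompt from each tokenizer's own mask token. The sketch below assumes a hypothetical helper `build_mask_prompt` and a `{mask}` placeholder, neither of which appears in the original notebook.

```python
def build_mask_prompt(template: str, mask_token: str) -> str:
    """Insert a model-specific mask token into a template containing '{mask}'."""
    if "{mask}" not in template:
        raise ValueError("template must contain a '{mask}' placeholder")
    return template.replace("{mask}", mask_token)


template = "The goal of Generative AI is to {mask} new content."

# Mask tokens used by the checkpoints in this notebook.
bert_prompt = build_mask_prompt(template, "[MASK]")     # bert-base-uncased
roberta_prompt = build_mask_prompt(template, "<mask>")  # roberta-base, facebook/bart-base

print(bert_prompt)
print(roberta_prompt)
```

Inside the loop, the token could be read from `fill_mask.tokenizer.mask_token` instead of being typed by hand, so the same template would work for every model.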


# Experiment 3 — Question Answering

In [8]:
qa_input = {
    "question": "What are the risks?",
    "context": "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."
}

In [9]:
for name, model in models.items():
    print(f"\nModel: {name}")
    try:
        qa = pipeline("question-answering", model=model)
        output = qa(qa_input)
        print("Answer:", output["answer"])
    except Exception as e:
        print("Error:", e)


Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Model: BERT


Device set to use cpu
Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Answer: poses significant risks such as hallucinations

Model: RoBERTa


Device set to use cpu


Answer: risks such as hallucinations, bias, and

Model: BART


Some weights of BartForQuestionAnswering were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu


Answer: , and
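
All three logs above report `qa_outputs.weight` and `qa_outputs.bias` as newly initialized. An extractive QA head assigns a start score and an end score to every token and returns the best-scoring span, so random head weights yield an essentially arbitrary span. The sketch below is a simplified illustration of that selection step, not the pipeline's actual implementation; `best_span` and all score values are hypothetical.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 15) -> Tuple[int, int]:
    """Return (start, end) indices maximising start + end score with start <= end."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider spans of at most max_len tokens that end at or after s.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


tokens = ["risks", "such", "as", "hallucinations", ",",
          "bias", ",", "and", "deepfakes", "."]

# Hand-written scores standing in for a fine-tuned head (hypothetical values).
tuned_start = [0.1, 0.0, 0.0, 4.0, 0.0, 0.5, 0.0, 0.0, 0.3, 0.0]
tuned_end = [0.0, 0.0, 0.0, 0.4, 0.0, 0.6, 0.0, 0.0, 4.0, 0.0]

# Small, unstructured scores standing in for a randomly initialized head.
random_start = [0.2, -0.1, 0.3, 0.0, 0.1, -0.2, 0.9, 0.1, 0.0, 0.2]
random_end = [0.1, 0.0, -0.3, 0.2, 0.0, 0.1, 0.3, 0.8, 0.2, -0.1]

for label, (start, end) in [("tuned", best_span(tuned_start, tuned_end)),
                            ("random", best_span(random_start, random_end))]:
    print(label, "->", " ".join(tokens[start:end + 1]))
```

With the hand-written "tuned" scores the selected span covers the three risks, while the unstructured scores pick out a fragment. This suggests the answers above say more about the untrained heads than about the encoders underneath them.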


## Observation Table

| Task | Model | Success / Failure | Observation | Why did this happen? |
|------|--------|------------------|--------------|----------------------|
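| Generation | BERT | Failure | Output was the prompt followed by a long run of dots instead of meaningful text. | BERT is encoder-only and pretrained for masked-token prediction with bidirectional attention, not next-token prediction; the pipeline loaded it as `BertLMHeadModel` without `is_decoder=True`. |
|  | RoBERTa | Failure | Returned only the prompt with no continuation. | RoBERTa is also encoder-only and MLM-pretrained; it was never trained to predict the next token, so autoregressive decoding produced nothing useful. |
|  | BART | Partial Success | Generated a continuation, but the text was nonsensical. | The pipeline loaded `facebook/bart-base` as `BartForCausalLM` and reported `lm_head.weight` and `model.decoder.embed_tokens.weight` as newly initialized, so the output came from partly random weights; the base checkpoint is also not fine-tuned for open-ended generation. |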
| Fill-Mask | BERT | Failure | Raised `No mask_token ([MASK]) found on the input`. | The prompt used `<mask>`, but the BERT tokenizer's mask token is `[MASK]`. This is an input mismatch, not a model limitation: BERT is MLM-pretrained. |
|  | RoBERTa | Success | Predicted plausible words such as "generate" and "create". | RoBERTa is MLM-pretrained and uses the `<mask>` token. |
|  | BART | Partial Success | Produced reasonable predictions ("create", "provide") alongside a weaker one ("help"). | BART is pretrained with a denoising objective, so it can reconstruct missing text, but it is not specialized for MLM. |
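| QA | BERT | Failure | Returned "poses significant risks such as hallucinations", which includes extra words and omits bias and deepfakes. | `bert-base-uncased` is not fine-tuned for QA; the log shows the `qa_outputs` head was newly initialized, so the selected span is effectively arbitrary. |
|  | RoBERTa | Partial Success | Returned "risks such as hallucinations, bias, and", which covers most of the answer but stops before "deepfakes". | `roberta-base` is not QA fine-tuned and its `qa_outputs` head was newly initialized, so the overlap with the correct answer is likely coincidental. |
|  | BART | Failure | Returned the incoherent fragment ", and". | `facebook/bart-base` is not fine-tuned for extractive QA; its `qa_outputs` head was newly initialized. |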
