In [3]:
from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM


In [4]:
prompt = "The future of Artificial Intelligence is"

models = [
    "bert-base-uncased",
    "roberta-base",
    "facebook/bart-base"
]

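for m in models:
    print(f"\n=== {m} ===")
    try:
        # "text-generation" expects a causal LM; the encoder-only checkpoints
        # still load, but they were not pretrained for next-token prediction.
        pipe = pipeline("text-generation", model=m)
        # Per the warning in the output below, the default max_new_tokens (=256)
        # takes precedence over max_length here.
        print(pipe(prompt, max_length=30))
    except Exception as e:
        print("ERROR:", e)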



If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`



=== bert-base-uncased ===


Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`


[{'generated_text': 'The future of Artificial Intelligence is................................................................................................................................................................................................................................................................'}]

=== roberta-base ===


Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'The future of Artificial Intelligence is'}]

=== facebook/bart-base ===


Some weights of BartForCausalLM were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'The future of Artificial Intelligence isuddenly Ital Ital Ital Madd Madd Madd Medicare Medicare Ital maritalaliasalias Madd Madd marital irregular Italastery Charlie�� Salmon voc irregular Madd Charlie Ital Ital populatedischer seating Madd sticky Stuff reproduce Stuff Stuff sticky Stuff exclaim Stuff Stuff Stuff proclaiming epic reproduceED Stuff Galileo Stuff Stuffopolis Stuff Stuff Lex Stuff facilitated Stuff Stuff backups Stuff Stuff epic Stuff Stuff Baby scramble Stuff Stuff statistic Baby Traps Stuff Stuff Armenian Stuff Stuff scramble Stuff Baby Stuff statistic proclaiming Stuff Stuff facilitated Baby Baby statistic statistic circumvent Stuff Stuff too Stuff StuffED High reproduce Stuff Traps Stuff Beat facilitated Padres Stuff Stuffhabi Stuff Stuff Immortal scramble Stuff kidding Baby Stuff Mac Stuffopolis Traps� Stuff Stuff scores statistic simulac Mac Stuff Stuff Mac proclaiming Stuff tooovic Stuff irregular Stuff too Baby Stuff Stuff journalists Stuff In
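
The three generations above are easier to compare once the prompt is stripped off. The following is a minimal sketch, not part of the original run: `continuation` and `distinct_ratio` are hypothetical helper names, and the sample strings are shortened stand-ins for the recorded outputs rather than the full text.

```python
def continuation(prompt: str, generated_text: str) -> str:
    """Return only the text the model appended after the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


def distinct_ratio(text: str) -> float:
    """Share of distinct whitespace-separated tokens; 0.0 for empty text."""
    tokens = text.split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


prompt = "The future of Artificial Intelligence is"

# Shortened stand-ins for the recorded outputs (assumption: not the full strings).
samples = {
    "bert-base-uncased": prompt + "." * 20,
    "roberta-base": prompt,
    "facebook/bart-base": prompt + "uddenly Ital Ital Ital Madd Madd Madd",
}

for name, text in samples.items():
    tail = continuation(prompt, text)
    print(f"{name}: {len(tail)} new chars, distinct-token ratio {distinct_ratio(tail):.2f}")
```

An empty continuation (RoBERTa) and a continuation made of a single repeated character (BERT) are both quick signals that the model did not really generate anything.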

In [7]:
model_masks = {
    "bert-base-uncased": "[MASK]",
    "roberta-base": "<mask>",
    "facebook/bart-base": "<mask>"
}

base = "The goal of Generative AI is to {} new content."

for model_name, mask_token in model_masks.items():
    print(f"\n=== {model_name} ===")

    # build sentence for the model
    sentence = base.format(mask_token)

    try:
        pipe = pipeline("fill-mask", model=model_name)
        print("Input:", sentence)
        result = pipe(sentence)
        print(result)
    except Exception as e:
        print("ERROR:", e)



=== bert-base-uncased ===


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0


Input: The goal of Generative AI is to [MASK] new content.
[{'score': 0.5396888852119446, 'token': 3443, 'token_str': 'create', 'sequence': 'the goal of generative ai is to create new content.'}, {'score': 0.15575668215751648, 'token': 9699, 'token_str': 'generate', 'sequence': 'the goal of generative ai is to generate new content.'}, {'score': 0.054054468870162964, 'token': 3965, 'token_str': 'produce', 'sequence': 'the goal of generative ai is to produce new content.'}, {'score': 0.04451529309153557, 'token': 4503, 'token_str': 'develop', 'sequence': 'the goal of generative ai is to develop new content.'}, {'score': 0.01757732406258583, 'token': 5587, 'token_str': 'add', 'sequence': 'the goal of generative ai is to add new content.'}]

=== roberta-base ===


Device set to use cuda:0


Input: The goal of Generative AI is to <mask> new content.
[{'score': 0.3711293935775757, 'token': 5368, 'token_str': ' generate', 'sequence': 'The goal of Generative AI is to generate new content.'}, {'score': 0.36771273612976074, 'token': 1045, 'token_str': ' create', 'sequence': 'The goal of Generative AI is to create new content.'}, {'score': 0.08351442217826843, 'token': 8286, 'token_str': ' discover', 'sequence': 'The goal of Generative AI is to discover new content.'}, {'score': 0.021335095167160034, 'token': 465, 'token_str': ' find', 'sequence': 'The goal of Generative AI is to find new content.'}, {'score': 0.016521504148840904, 'token': 694, 'token_str': ' provide', 'sequence': 'The goal of Generative AI is to provide new content.'}]

=== facebook/bart-base ===


Device set to use cuda:0


Input: The goal of Generative AI is to <mask> new content.
[{'score': 0.0746147632598877, 'token': 1045, 'token_str': ' create', 'sequence': 'The goal of Generative AI is to create new content.'}, {'score': 0.06571780890226364, 'token': 244, 'token_str': ' help', 'sequence': 'The goal of Generative AI is to help new content.'}, {'score': 0.060879286378622055, 'token': 694, 'token_str': ' provide', 'sequence': 'The goal of Generative AI is to provide new content.'}, {'score': 0.03593532741069794, 'token': 3155, 'token_str': ' enable', 'sequence': 'The goal of Generative AI is to enable new content.'}, {'score': 0.03319435939192772, 'token': 1477, 'token_str': ' improve', 'sequence': 'The goal of Generative AI is to improve new content.'}]
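
To compare the fill-mask results side by side, it may help to reduce each result list to its top candidate. This is a small sketch under assumptions: `top_prediction` is a hypothetical helper, and `recorded` holds only the top two candidates per model, copied from the outputs above with scores rounded to four decimals.

```python
def top_prediction(results):
    """Return (token, score) for the highest-scoring fill-mask candidate."""
    best = max(results, key=lambda r: r["score"])
    # RoBERTa and BART token strings carry a leading space; strip it for display.
    return best["token_str"].strip(), best["score"]


# Top two candidates per model, copied from the outputs above (scores rounded).
recorded = {
    "bert-base-uncased": [
        {"token_str": "create", "score": 0.5397},
        {"token_str": "generate", "score": 0.1558},
    ],
    "roberta-base": [
        {"token_str": " generate", "score": 0.3711},
        {"token_str": " create", "score": 0.3677},
    ],
    "facebook/bart-base": [
        {"token_str": " create", "score": 0.0746},
        {"token_str": " help", "score": 0.0657},
    ],
}

for model_name, results in recorded.items():
    token, score = top_prediction(results)
    print(f"{model_name:20s} top-1: {token!r} (score {score:.3f})")
```

The same helper should work on the live `pipe(sentence)` output, since each entry there also carries `token_str` and `score` keys.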


In [6]:
question = "What are the risks?"
context = "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."

models = [
    "bert-base-uncased",
    "roberta-base",
    "facebook/bart-base"
]

for m in models:
    print(f"\n=== {m} ===")
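    try:
        # The base checkpoints have no fine-tuned QA head (see the
        # "newly initialized" warnings below), so scores are not meaningful.
        qa = pipeline("question-answering", model=m)
        print(qa(question=question, context=context))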
    except Exception as e:
        print("ERROR:", e)


Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



=== bert-base-uncased ===


Device set to use cuda:0
Some weights of RobertaForQuestionAnswering were not initialized from the model checkpoint at roberta-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


{'score': 0.01443921448662877, 'start': 14, 'end': 60, 'answer': 'poses significant risks such as hallucinations'}

=== roberta-base ===


Device set to use cuda:0
Some weights of BartForQuestionAnswering were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


{'score': 0.008587918244302273, 'start': 0, 'end': 42, 'answer': 'Generative AI poses significant risks such'}

=== facebook/bart-base ===


Device set to use cuda:0


{'score': 0.11816955544054508, 'start': 11, 'end': 81, 'answer': 'AI poses significant risks such as hallucinations, bias, and deepfakes'}
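
The `start` and `end` values in these results read as character offsets into `context`, which shows that all three models are doing extractive span selection. The sketch below checks that against the recorded outputs; `span_matches` is a hypothetical helper and the triples are copied from the results above.

```python
context = "Generative AI poses significant risks such as hallucinations, bias, and deepfakes."

# (start, end, answer) triples copied from the three outputs above.
recorded = {
    "bert-base-uncased": (14, 60, "poses significant risks such as hallucinations"),
    "roberta-base": (0, 42, "Generative AI poses significant risks such"),
    "facebook/bart-base": (
        11,
        81,
        "AI poses significant risks such as hallucinations, bias, and deepfakes",
    ),
}


def span_matches(context: str, start: int, end: int, answer: str) -> bool:
    """True if slicing the context by the reported offsets reproduces the answer."""
    return context[start:end] == answer


for name, (start, end, answer) in recorded.items():
    print(name, span_matches(context, start, end, answer))
```

Because every answer is a slice of the context, none of the models is composing new text here, BART included.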


| Task | Model | Classification (Success/Partial Success/Failure) | Observation (What actually happened?) | Why did this happen? (Architectural Reason) |
| :--- | :--- | :--- | :--- | :--- |
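| **Generation** | BERT | *Failure* | *Appended only a long run of periods to the prompt.* | *BERT is an encoder-only model pretrained with masked language modeling, not next-token prediction; the log also notes it was loaded without `is_decoder=True`.* |
| | RoBERTa | *Failure* | *Returned the prompt unchanged; no new text was generated.* | *RoBERTa is also encoder-only and pretrained with MLM, so it has no trained next-token objective.* |
| | BART | *Partial Success* | *Ran without error and produced tokens, but the text is incoherent and repetitive.* | *BART is an encoder-decoder, but the pipeline loaded it as `BartForCausalLM`, and the log reports `lm_head.weight` and `model.decoder.embed_tokens.weight` as newly initialized, so the output comes from partly untrained weights.* |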
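| **Fill-Mask** | BERT | *Success* | *Predicted 'create' (score 0.54), then 'generate' (0.16).* | *BERT is pretrained with Masked Language Modeling (MLM).* |
| | RoBERTa | *Success* | *Predicted 'generate' (0.37), closely followed by 'create' (0.37).* | *RoBERTa is pretrained with Masked Language Modeling (MLM).* |
| | BART | *Partial Success* | *Predicted 'create', but with low confidence (0.07); other candidates such as 'help' do not fit the sentence.* | *BART is pretrained as a denoising sequence-to-sequence model that reconstructs corrupted text, so it can fill masks, but it is not a dedicated MLM and its scores are spread more thinly.* |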
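| **QA** | BERT | *Partial Success* | *Selected 'poses significant risks such as hallucinations', which only partially answers the question (score 0.014).* | *An encoder can support extractive span selection, but the log shows `qa_outputs` was newly initialized, so the QA head is untrained.* |
| | RoBERTa | *Partial Success* | *Selected 'Generative AI poses significant risks such', a less precise span that is cut off (score 0.009).* | *Same setup: a pretrained encoder with a newly initialized, untrained `qa_outputs` head rather than a missing one.* |
| | BART | *Partial Success* | *Selected 'AI poses significant risks such as hallucinations, bias, and deepfakes', the closest to a full correct span (score 0.118).* | *`BartForQuestionAnswering` is also extractive, and its `qa_outputs` head is likewise newly initialized, so the better span is not evidence of QA training.* |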
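
The encoder versus decoder distinction in the last column comes down to which positions each token may attend to. The following is a simplified illustration in plain Python, not the models' actual implementation: `attention_mask` is a hypothetical helper, and real models combine such masks with padding masks and apply them inside the attention computation.

```python
def attention_mask(seq_len: int, causal: bool):
    """Return a seq_len x seq_len 0/1 matrix; row i marks positions token i may attend to."""
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


tokens = ["The", "future", "of", "AI"]

for label, causal in [
    ("bidirectional (encoder-style, as in BERT and RoBERTa)", False),
    ("causal (decoder-style, as needed for left-to-right generation)", True),
]:
    print(label)
    for tok, row in zip(tokens, attention_mask(len(tokens), causal)):
        print(f"  {tok:>6}: {row}")
```

With the bidirectional mask every token sees the whole sentence, which suits filling in a masked word. With the causal mask each token sees only itself and earlier positions, which is the setting next-token generation is trained under.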