In [1]:
from transformers import pipeline




**EXPERIMENT 1: TEXT GENERATION**

In [2]:
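# 1A. BERT – Text Generation
# bert-base-uncased is encoder-only; the pipeline loads it with an LM head (see the warning below).
# Note: max_length=30 is overridden by the pipeline default max_new_tokens=256 (see the warning below).
gen_bert = pipeline("text-generation", model="bert-base-uncased")
gen_bert("The future of Artificial Intelligence is", max_length=30)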


If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'The future of Artificial Intelligence is................................................................................................................................................................................................................................................................'}]
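The last warning above explains why the continuation runs far past 30 tokens: when both limits are set, `max_new_tokens` wins. The sketch below restates that precedence rule in plain Python, assuming the behavior is exactly as the warning describes. `effective_new_token_budget` is a hypothetical helper written for illustration, not part of `transformers`, and the prompt length of 8 is an assumed value.

```python
from typing import Optional


def effective_new_token_budget(prompt_len: int,
                               max_length: Optional[int] = None,
                               max_new_tokens: Optional[int] = None) -> int:
    """Toy model of the precedence rule quoted in the warning (not library code)."""
    if max_new_tokens is not None:
        # max_new_tokens counts only generated tokens and takes precedence.
        return max_new_tokens
    if max_length is not None:
        # max_length counts prompt tokens plus generated tokens.
        return max(max_length - prompt_len, 0)
    raise ValueError("set max_length or max_new_tokens")


# Assumed prompt length of 8 tokens, for illustration only.
both_set = effective_new_token_budget(8, max_length=30, max_new_tokens=256)
only_max_length = effective_new_token_budget(8, max_length=30)
print(both_set)         # 256
print(only_max_length)  # 22
```

Passing `max_new_tokens=30` instead of `max_length=30` in the pipeline call is the usual way to cap the continuation and avoid the warning.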

In [3]:
# 1B. RoBERTa – Text Generation
# roberta-base is also encoder-only; max_length=30 is again overridden by max_new_tokens=256.
gen_roberta = pipeline("text-generation", model="roberta-base")
gen_roberta("The future of Artificial Intelligence is", max_length=30)


If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'The future of Artificial Intelligence is'}]

In [4]:
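# 1C. BART – Text Generation
# The pipeline loads facebook/bart-base as BartForCausalLM; per the warning below,
# lm_head.weight and model.decoder.embed_tokens.weight are newly initialized.
gen_bart = pipeline("text-generation", model="facebook/bart-base")
gen_bart("The future of Artificial Intelligence is", max_length=30)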


Some weights of BartForCausalLM were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['lm_head.weight', 'model.decoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'The future of Artificial Intelligence ischio bumpchio skipping noteFWsitchiochio Thiefchiochio VACPUiker sag splitchiochiouminatiuminati eru bumpPHOTOSAMP launcher cat commenddesktop eru VAdesktopchioiliarydesktopchiochioConclusion 4096 ingestion PebbleAnything Pebble PebbleAbove GTchio 03 caulKenn� ingestion 4000 Witches narrativesAnything nounino PebbledesktopWhen lowly snaps snaps hypnot catiterraneanchiodesktopino narratives narrativesordanChWhen inhab Thiefdesktopchiodesktopdesktop protects Thief Thief narratives narrativesPHOTOS narratives narratives narratives Witches narrativesachusetts Pebble Pebble Pebble scrolling narratives Hopkins narratives chords narratives narratives Wagnerino eligibledesktopdesktop pairs�� ris note Pebble�zza ris narratives eligible narratives narrativesinourt risAbove Pebble ingestionachusetts eligiblemastdesktop scrollingchio eligible sag probably probablyursedesktop risiterranean narratives eligibledesktop Pebble ris ris noteAbo

**EXPERIMENT 2: FILL-MASK**

In [5]:
# 2A. BERT – Fill Mask
mask_bert = pipeline("fill-mask", model="bert-base-uncased")
mask_bert("The goal of Generative AI is to [MASK] new content.")


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


[{'score': 0.539692759513855,
  'token': 3443,
  'token_str': 'create',
  'sequence': 'the goal of generative ai is to create new content.'},
 {'score': 0.15575766563415527,
  'token': 9699,
  'token_str': 'generate',
  'sequence': 'the goal of generative ai is to generate new content.'},
 {'score': 0.05405496060848236,
  'token': 3965,
  'token_str': 'produce',
  'sequence': 'the goal of generative ai is to produce new content.'},
 {'score': 0.044515229761600494,
  'token': 4503,
  'token_str': 'develop',
  'sequence': 'the goal of generative ai is to develop new content.'},
 {'score': 0.017577484250068665,
  'token': 5587,
  'token_str': 'add',
  'sequence': 'the goal of generative ai is to add new content.'}]

In [6]:
# 2B. RoBERTa – Fill Mask

mask_roberta = pipeline("fill-mask", model="roberta-base")
mask_roberta("The goal of Generative AI is to <mask> new content.")


Device set to use cpu


[{'score': 0.37113118171691895,
  'token': 5368,
  'token_str': ' generate',
  'sequence': 'The goal of Generative AI is to generate new content.'},
 {'score': 0.3677138090133667,
  'token': 1045,
  'token_str': ' create',
  'sequence': 'The goal of Generative AI is to create new content.'},
 {'score': 0.08351466804742813,
  'token': 8286,
  'token_str': ' discover',
  'sequence': 'The goal of Generative AI is to discover new content.'},
 {'score': 0.02133519947528839,
  'token': 465,
  'token_str': ' find',
  'sequence': 'The goal of Generative AI is to find new content.'},
 {'score': 0.01652175933122635,
  'token': 694,
  'token_str': ' provide',
  'sequence': 'The goal of Generative AI is to provide new content.'}]
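The two fill-mask outputs above can be compared directly. The sketch below copies the top-5 tokens and scores from both results (rounded to four decimals) and checks which candidates the models share and how decisive each top prediction is. RoBERTa's `token_str` values carry a leading space because its tokenizer encodes the word boundary inside the token, so the tokens are stripped before comparison. The helper names `normalize` and `top_margin` are illustrative only.

```python
# Top-5 predictions copied from the two fill-mask outputs above (scores rounded).
bert_top5 = [("create", 0.5397), ("generate", 0.1558), ("produce", 0.0541),
             ("develop", 0.0445), ("add", 0.0176)]
roberta_top5 = [(" generate", 0.3711), (" create", 0.3677), (" discover", 0.0835),
                (" find", 0.0213), (" provide", 0.0165)]


def normalize(preds):
    """Map each stripped, lower-cased token to its score."""
    return {tok.strip().lower(): score for tok, score in preds}


def top_margin(preds):
    """Score gap between the best and second-best prediction."""
    scores = sorted(normalize(preds).values(), reverse=True)
    return round(scores[0] - scores[1], 4)


bert, roberta = normalize(bert_top5), normalize(roberta_top5)
shared = sorted(set(bert) & set(roberta))

print(shared)                    # ['create', 'generate']
print(top_margin(bert_top5))     # 0.3839
print(top_margin(roberta_top5))  # 0.0034
```

Both models agree on the same two leading verbs, but BERT clearly prefers *create* while RoBERTa treats *generate* and *create* as almost equally likely.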

In [None]:
# 2C. BART – Fill Mask
mask_bart = pipeline("fill-mask", model="facebook/bart-base")
mask_bart("The goal of Generative AI is to <mask> new content.")


Device set to use cpu


**EXPERIMENT 3: QUESTION ANSWERING**

In [None]:
# 3A. BERT – Question Answering
qa_bert = pipeline("question-answering", model="bert-base-uncased")

qa_bert({
    "context": "Generative AI poses significant risks such as hallucinations, bias, and deepfakes.",
    "question": "What are the risks?"
})


In [None]:
# 3B. RoBERTa – Question Answering
qa_roberta = pipeline("question-answering", model="roberta-base")

qa_roberta({
    "context": "Generative AI poses significant risks such as hallucinations, bias, and deepfakes.",
    "question": "What are the risks?"
})


In [None]:
# 3C. BART – Question Answering
qa_bart = pipeline("question-answering", model="facebook/bart-base")

qa_bart({
    "context": "Generative AI poses significant risks such as hallucinations, bias, and deepfakes.",
    "question": "What are the risks?"
})


**Summary of Experiments 1 to 3**

| Task           | Model       | Classification (Success/Failure) | Observation (What actually happened?)                              | Why did this happen? (Architectural Reason)                                                    |
| -------------- | ----------- | -------------------------------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------- |
| **Generation** | **BERT**    | Failure                          | Appends only a long run of periods to the prompt.                  | BERT is an **encoder-only** model pre-trained with masked language modeling, not autoregressive next-token prediction. |
|                | **RoBERTa** | Failure                          | Returns the prompt unchanged; no continuation is generated.        | RoBERTa is also **encoder-only**, optimized for understanding rather than left-to-right generation. |
|                | **BART**    | Failure                          | Appends incoherent token fragments (e.g. "chio", "Pebble", "narratives") instead of a fluent continuation. | BART is an **encoder–decoder** model, but the `text-generation` pipeline loads it as `BartForCausalLM`, whose `lm_head.weight` and `model.decoder.embed_tokens.weight` were newly initialized (see the load warning), so the output layer is untrained. |
| **Fill-Mask**  | **BERT**    | Success                          | Top predictions are *create* (0.54) and *generate* (0.16).         | BERT is pre-trained with **Masked Language Modeling (MLM)**, which is exactly this task. |
|                | **RoBERTa** | Success                          | Top predictions *generate* (0.37) and *create* (0.37) are nearly tied. | RoBERTa keeps the MLM objective and trains it on more data with dynamic masking. |
|                | **BART**    | Partial Success                  | Predicts plausible words but sometimes less precise.               | BART is pre-trained as a denoising autoencoder that reconstructs corrupted text, so it can fill `<mask>`, but single-token MLM is not its primary objective. |
| **QA**         | **BERT**    | Partial / Weak Success           | Returns short or vague answers; sometimes incorrect.               | `bert-base-uncased` is **not fine-tuned** on a QA dataset such as SQuAD, so its span-prediction head starts from random weights. |
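The architectural reasons in the table come down to which positions each token may attend to. The sketch below contrasts the two attention patterns in plain Python, as an illustration rather than the actual model code: an encoder-style mask lets every token see the whole sentence, which suits fill-mask, while a decoder-style (causal) mask hides future tokens, which is what left-to-right generation is trained under. The function names are illustrative.

```python
def bidirectional_mask(n):
    """Encoder-style: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


print("bidirectional (BERT, RoBERTa):")
for row in bidirectional_mask(4):
    print(row)

print("causal (autoregressive decoder):")
for row in causal_mask(4):
    print(row)
```

In the causal mask, row `i` contains ones only up to column `i`, so each token is predicted from the tokens before it. A model pre-trained only with the bidirectional pattern has never been trained on that prediction task.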
