This notebook uses Hugging Face's `pipeline()` to access powerful pre-trained models such as `gpt2`, `BERT`, and `T5`.

In [1]:
from transformers import pipeline


#### 1. Text Generation (GPT-2)

In [2]:
# !pip install tensorflow

In [3]:
# !pip install torch

In [4]:
# !pip install tf-keras

In [8]:
"""
Parameters used in the code below:

- prompt (str): "The future of AI is". The initial input text that the model continues.
- max_length (int): The maximum number of tokens in the output text, including the input prompt.
- num_return_sequences (int): The number of distinct text sequences to return for the given prompt
  (with num_return_sequences=2 and max_length=30, it returns 2 sequences of up to 30 tokens each).

Each sequence is generated up to max_length tokens, counting the input prompt.
Note: in the run shown here, the pipeline's default max_new_tokens (256) took precedence
over max_length, as the warning in the output reports, so the generated text runs past 30 tokens.
"""

generator = pipeline("text-generation", model="gpt2")   # Initializes a text generation pipeline using a specific pre-trained model.
result = generator("The future of AI is", max_length=30, num_return_sequences=1)
# print(result[0]["generated_text"])
print(result)

Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': "The future of AI is a question of how much of it is used in other areas, and whether it is being used as a means of improving the efficiency of AI.\n\nIn recent years, the number of AI jobs has grown to more than 20,000 a year from about 10,000. A lot of these jobs may be related to machines. The question is whether those jobs are being used for AI-related and related activities, or whether the jobs are being used for AI-related research.\n\nThese are the problems that I have identified as being at the heart of the problem. I don't think there are going to be big problems in the future.\n\nFor example, there will be big advances in the development of a computer. But there will be more people that are going to start working on AI-related problems. The problem is that the problem is not going to be solved by machines.\n\nThe question is if the problem is going to be solved by artificial intelligence. I'm not sure that will be the case.\n\nI think the problem is that 
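
The warning above says that `max_new_tokens` takes precedence over `max_length` when both are set, which is why the output is much longer than 30 tokens. The sketch below only illustrates that arithmetic in plain Python; `new_token_budget` is a hypothetical helper, not part of `transformers`, and the prompt length of 5 tokens is an assumed value for illustration.

```python
def new_token_budget(prompt_tokens, max_length=None, max_new_tokens=None):
    """Return how many new tokens may be generated after the prompt.

    Hypothetical helper mirroring the rule stated in the warning:
    max_new_tokens wins when both limits are given.
    """
    if max_new_tokens is not None:
        # max_new_tokens counts only generated tokens, not the prompt.
        return max_new_tokens
    if max_length is not None:
        # max_length counts prompt + generated tokens.
        return max(max_length - prompt_tokens, 0)
    return None  # no explicit limit given


prompt_tokens = 5  # assumed prompt length, for illustration only

print(new_token_budget(prompt_tokens, max_length=30))                      # 25
print(new_token_budget(prompt_tokens, max_length=30, max_new_tokens=256))  # 256
```

To cap the continuation itself, one option is to pass `max_new_tokens` directly instead of `max_length`, for example `generator("The future of AI is", max_new_tokens=25, num_return_sequences=1)`.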

#### 2. Sentiment Analysis (DistilBERT)

In [10]:
sentiment_analyzer = pipeline("sentiment-analysis") # Initialize a sentiment analysis pipeline using the default pre-trained model.
result = sentiment_analyzer("I love using Hugging Face transformers!")
print(result)

result = sentiment_analyzer("I gets bored after 20 mins of study")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9971315860748291}]
[{'label': 'NEGATIVE', 'score': 0.999782383441925}]
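
The log above notes that relying on the default model is not recommended outside experiments. A minimal sketch of pinning the checkpoint follows; the model name and revision are copied from that log message, while `SENTIMENT_CHECKPOINT` and `build_sentiment_analyzer` are hypothetical names introduced here. The import is done lazily so the configuration can be inspected without loading the library.

```python
# Model name and revision copied from the log message above.
SENTIMENT_CHECKPOINT = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_sentiment_analyzer():
    """Create the pipeline with an explicit model and revision (hypothetical helper)."""
    # Imported here so that defining the config does not require transformers.
    from transformers import pipeline

    return pipeline(
        SENTIMENT_CHECKPOINT["task"],
        model=SENTIMENT_CHECKPOINT["model"],
        revision=SENTIMENT_CHECKPOINT["revision"],
    )


# Usage (downloads the checkpoint on the first call):
# sentiment_analyzer = build_sentiment_analyzer()
# print(sentiment_analyzer("I love using Hugging Face transformers!"))
```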


#### 3. Question Answering (DistilBERT)

In [12]:
qa_pipeline = pipeline("question-answering")
result = qa_pipeline({
    "context": "Transformers are a deep learning architecture introduced in 2017.",
    "question": "When were transformers introduced?"
})
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


{'score': 0.926306962966919, 'start': 60, 'end': 64, 'answer': '2017'}
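
The `start` and `end` values in the result are character offsets into the context string, so the answer can be recovered by slicing. The sketch below uses the context and result copied from the cell above; `highlight` is a hypothetical helper added for illustration.

```python
# Context and result copied from the cell above.
context = "Transformers are a deep learning architecture introduced in 2017."
result = {'score': 0.926306962966919, 'start': 60, 'end': 64, 'answer': '2017'}

# Slicing the context with the offsets reproduces the answer text.
span = context[result["start"]:result["end"]]
print(span)  # 2017


def highlight(text, start, end, marker="**"):
    """Wrap the answer span in markers so it stands out in the context."""
    return text[:start] + marker + text[start:end] + marker + text[end:]


print(highlight(context, result["start"], result["end"]))
# Transformers are a deep learning architecture introduced in **2017**.
```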


#### 4. Summarization (T5)

In [15]:
summarizer = pipeline("summarization", model="t5-small")
text = """
The Transformer model, introduced in the paper "Attention is All You Need", 
has revolutionized natural language processing by becoming the foundational architecture behind many advanced models, 
including BERT, GPT, and T5, enabling significant progress in various NLP tasks.
"""

result = summarizer(text, max_length=30, min_length=10, do_sample=False)
print(result)

Device set to use cpu
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'summary_text': 'the Transformer model, introduced in the paper "Attention is all You Need" has revolutionized natural language processing . the model is the foundational architecture behind many advanced models .'}]


#### 5. Named Entity Recognition (NER)
`grouped_entities=True`:
* When set to True, it groups together tokens that belong to the same named entity.
* For example, without grouping, "Barack" and "Obama" might be separate entities; with grouping, they'll be combined as one: "Barack Obama".
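
The grouping idea can be sketched in plain Python. This is a simplified illustration, not the library's implementation: it assumes word-level tokens with hypothetical `B-`/`I-` style tags and ignores subword pieces, scores, and character offsets. Recent `transformers` releases also expose this behaviour through the `aggregation_strategy` argument.

```python
def group_entities(tagged_tokens):
    """Merge consecutive tokens of the same entity type into one entity."""
    groups = []
    current = None
    for word, tag in tagged_tokens:
        if tag == "O":
            # Outside any entity: close the current group, if any.
            current = None
            continue
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current["entity_group"] == label:
            # Continuation of the entity that is currently open.
            current["word"] += " " + word
        else:
            # A new entity starts here.
            current = {"entity_group": label, "word": word}
            groups.append(current)
    return groups


# Hypothetical per-token tags, for illustration only.
tokens = [
    ("Barack", "B-PER"), ("Obama", "I-PER"), ("was", "O"),
    ("born", "O"), ("in", "O"), ("Hawaii", "B-LOC"),
]
print(group_entities(tokens))
# [{'entity_group': 'PER', 'word': 'Barack Obama'}, {'entity_group': 'LOC', 'word': 'Hawaii'}]
```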

In [17]:
# NER identifies entities like people, locations, dates, etc.
ner_pipeline = pipeline("ner", grouped_entities=True)
result = ner_pipeline("Barack Obama was born in Hawaii and became the president of the United States.")
print(result)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


[{'entity_group': 'PER', 'score': 0.9992892, 'word': 'Barack Obama', 'start': 0, 'end': 12}, {'entity_group': 'LOC', 'score': 0.9995566, 'word': 'Hawaii', 'start': 25, 'end': 31}, {'entity_group': 'LOC', 'score': 0.99947286, 'word': 'United States', 'start': 64, 'end': 77}]
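
Each entry in the result is a dictionary, so the entities are easy to post-process. A small sketch, using the output copied from the cell above, buckets the entity text by type and keeps only confident predictions; `entities_by_type` and the 0.9 threshold are assumptions made for illustration.

```python
from collections import defaultdict

# Sentence and output copied from the cell above.
sentence = "Barack Obama was born in Hawaii and became the president of the United States."
entities = [
    {'entity_group': 'PER', 'score': 0.9992892, 'word': 'Barack Obama', 'start': 0, 'end': 12},
    {'entity_group': 'LOC', 'score': 0.9995566, 'word': 'Hawaii', 'start': 25, 'end': 31},
    {'entity_group': 'LOC', 'score': 0.99947286, 'word': 'United States', 'start': 64, 'end': 77},
]


def entities_by_type(items, min_score=0.0):
    """Collect entity text per entity_group, skipping low-confidence entries."""
    grouped = defaultdict(list)
    for item in items:
        if item["score"] >= min_score:
            grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


by_type = entities_by_type(entities, min_score=0.9)
print(by_type)
# {'PER': ['Barack Obama'], 'LOC': ['Hawaii', 'United States']}
```

As with question answering, `start` and `end` are character offsets, so `sentence[25:31]` gives `'Hawaii'`.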


#### 6. Translation (English → Hindi)

In [3]:
# !pip install sentencepiece

In [4]:
translator = pipeline("translation_en_to_hi", model="Helsinki-NLP/opus-mt-en-hi")
result = translator("I love learning about AI and machine learning.")
print(result)

Device set to use cpu


[{'translation_text': 'मुझे एआई और मशीन सीखने के बारे में सीखना अच्छा लगता है.'}]
