
# 🤗 Transformers API Examples

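This notebook demonstrates how to use the Hugging Face `transformers` pipeline API for:
- Text generation (GPT-2)
- Text classification (DistilBERT)
- Question answering (DistilBERT)
- Zero-shot classification (BART)
- Named entity recognition, translation, and summarization (follow-up cells after "Next Steps")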

These are **basic, reproducible examples** to get comfortable with the API.


In [1]:

# If running in Colab, uncomment:
# !pip install -q transformers torch


In [2]:

from transformers import pipeline

print("Transformers pipelines ready.")


Transformers pipelines ready.


## 🔮 Text Generation with GPT-2

In [3]:

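# Load GPT-2 and generate one continuation of the prompt.
generator = pipeline("text-generation", model="gpt2")
prompt = "The future of artificial intelligence is"
# Note: as the warning in the output below shows, the default `max_new_tokens` (256)
# takes precedence over `max_length=50`; pass `max_new_tokens=...` to cap the length.
output = generator(prompt, max_length=50, num_return_sequences=1)
print(output[0]['generated_text'])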


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


The future of artificial intelligence is still in its infancy and it's unclear what will happen to the world's most advanced AI.

"There's a lot of work to do to get it to understand the human brain better," said Thomas. "We're not going to know all the details, but we're going to be very excited if it is able to understand the human brain better."

The AI's neural network was recently upgraded to be able to run on hardware with "a variety of computational tools." However, the AI isn't ready for prime time yet.

"We're working on it on a regular basis, and we don't have any predictions about it yet, but we're working on it," said Thomas. "We're hoping to have it operational by the end of this year."

This article was originally published on Tech Insider. Read the original article.


## 🏷️ Text Classification with DistilBERT

In [4]:

classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")
result = classifier("I love using Hugging Face transformers!")
print(result)


Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9971315860748291}]
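
The classifier returns a list with one dictionary per input, each holding a `label` and a `score`. As a minimal sketch of post-processing that shape, the helper below treats low-confidence predictions as uncertain. The name `confident_label` and the 0.9 threshold are illustrative assumptions, not part of the library.

```python
def confident_label(result, threshold=0.9):
    """Return the predicted label, or 'UNCERTAIN' if the score is below the threshold.

    `result` is one dict in the shape printed by the text-classification cell above.
    """
    return result["label"] if result["score"] >= threshold else "UNCERTAIN"


# Sample copied from the output above, so no model is needed to run this.
sample = [{'label': 'POSITIVE', 'score': 0.9971315860748291}]
labels = [confident_label(r) for r in sample]
print(labels)
```

A threshold like this is a design choice: it trades coverage for precision, and a sensible value depends on the data.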


## ❓ Question Answering with DistilBERT

In [5]:

qa_pipeline = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad")
context = "Hugging Face is creating a tool that democratizes AI through open-source libraries."
question = "What is Hugging Face creating?"
result = qa_pipeline(question=question, context=context)
print(result)


Device set to use cpu


{'score': 0.575995683670044, 'start': 25, 'end': 82, 'answer': 'a tool that democratizes AI through open-source libraries'}
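
In the result above, `start` and `end` are character offsets into `context`. The short sketch below reuses the printed values to show that slicing the context recovers the answer, which is handy for highlighting the span. The `highlight` helper is a hypothetical name introduced here for illustration.

```python
context = "Hugging Face is creating a tool that democratizes AI through open-source libraries."
# Values copied from the printed result above.
result = {'score': 0.575995683670044, 'start': 25, 'end': 82,
          'answer': 'a tool that democratizes AI through open-source libraries'}

# Slicing the context with the returned offsets recovers the answer span.
span = context[result['start']:result['end']]


def highlight(text, start, end, marker="**"):
    """Wrap text[start:end] in a marker, e.g. for display in markdown."""
    return text[:start] + marker + text[start:end] + marker + text[end:]


highlighted = highlight(context, result['start'], result['end'])
print(span == result['answer'])
print(highlighted)
```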


## 🎯 Zero-Shot Classification with BART

In [6]:

zero_shot = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
sequence_to_classify = "I need to book a flight to San Francisco next week."
candidate_labels = ["travel", "cooking", "finance"]
result = zero_shot(sequence_to_classify, candidate_labels)
print(result)


Device set to use cpu


{'sequence': 'I need to book a flight to San Francisco next week.', 'labels': ['travel', 'finance', 'cooking'], 'scores': [0.9862432479858398, 0.011267457157373428, 0.002489235484972596]}
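
The zero-shot result holds two parallel lists, `labels` and `scores`. A small sketch of pairing them into a lookup table and picking the best label, using the values printed above (the variable names are illustrative):

```python
# Values copied from the printed result above.
result = {
    'sequence': 'I need to book a flight to San Francisco next week.',
    'labels': ['travel', 'finance', 'cooking'],
    'scores': [0.9862432479858398, 0.011267457157373428, 0.002489235484972596],
}

# Pair each label with its score for easy lookup.
scores_by_label = dict(zip(result['labels'], result['scores']))

# Pick the highest-scoring label without relying on the order of the lists.
best_label = max(scores_by_label, key=scores_by_label.get)
print(best_label, round(scores_by_label[best_label], 3))
```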



## ✅ Next Steps
- Try other tasks: summarization, translation, named entity recognition.
- Swap models in the pipeline to compare results (e.g., `facebook/mbart-large-50-many-to-many-mmt` for translation).
- Integrate these components into larger workflows (chatbots, RAG, analytics).


## 👤 Named Entity Recognition with dbmdz/bert-large-cased-finetuned-conll03-english

In [9]:
ner = pipeline("ner", model="dbmdz/bert-large-cased-finetuned-conll03-english")
text = "Hugging Face is a company based in Paris."
recognized_entities = ner(text)
print(recognized_entities)

Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Device set to use cpu


[{'entity': 'I-ORG', 'score': np.float32(0.99465585), 'index': 1, 'word': 'Hu', 'start': 0, 'end': 2}, {'entity': 'I-ORG', 'score': np.float32(0.8991443), 'index': 2, 'word': '##gging', 'start': 2, 'end': 7}, {'entity': 'I-ORG', 'score': np.float32(0.9700209), 'index': 3, 'word': 'Face', 'start': 8, 'end': 12}, {'entity': 'I-LOC', 'score': np.float32(0.9969995), 'index': 9, 'word': 'Paris', 'start': 35, 'end': 40}]


## 🌍 Text Translation with Helsinki-NLP/opus-mt-en-fr

In [8]:
translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")
english_text = "This is a sentence I would like to translate to French."
french_text = translator(english_text, max_length=50)
print(french_text[0]['translation_text'])

Device set to use cpu


C'est une phrase que j'aimerais traduire en français.


## 📝 Text Summarization with sshleifer/distilbart-cnn-12-6

In [12]:
from transformers import pipeline
from huggingface_hub import login
from google.colab import userdata

# Log in to the Hugging Face Hub with a token stored in Colab secrets, if one exists.
# Authentication is optional for public models such as the one used below.
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    if HF_TOKEN:
        login(token=HF_TOKEN)  # non-interactive login with the stored token
    else:
        print("HF_TOKEN not found in Colab secrets. Proceeding without authentication.")
except userdata.SecretNotFoundError:
    print("HF_TOKEN not found in Colab secrets. Proceeding without authentication.")

# Summarize with a DistilBART model fine-tuned on CNN/DailyMail
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
article = """
Hugging Face Inc. is an American company that develops tools for building applications using machine learning.
It is known for its Transformers library, a Python package for natural language processing, and for its platform that allows users to share and host machine learning models and datasets.
"""
summary = summarizer(article, max_length=50, min_length=10, do_sample=False)
print(summary[0]['summary_text'])

HF_TOKEN not found in Colab secrets. Proceeding without authentication.


Device set to use cpu


 Hugging Face Inc. develops tools for building applications using machine learning . It is known for its Transformers library, a Python package for natural language processing, and for its platform that allows users to share and host models and datasets .


## Integrating Components into Workflows

You can combine the Hugging Face `transformers` pipeline components to build larger workflows like chatbots, Retrieval Augmented Generation (RAG) systems, and data analytics pipelines.

### Chatbots

For chatbots, you can use:
- **Text Generation:** To generate responses based on user input and conversation history.
- **Text Classification:** To understand user intent or sentiment.
- **Named Entity Recognition:** To extract key information from user messages.

You would typically build a conversational flow that uses these models sequentially or conditionally based on the chatbot's logic.
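
The conditional flow described above can be sketched as follows. This is only an illustration under stated assumptions: `respond`, the 0.8 threshold, and the canned apology are hypothetical, and the two stand-in functions mimic the output shapes of the text-classification and text-generation pipelines so that the sketch runs without downloading a model.

```python
def respond(message, classify, generate, negative_threshold=0.8):
    """Route one user message: apologize on strong negative sentiment, else generate."""
    sentiment = classify(message)[0]
    if sentiment["label"] == "NEGATIVE" and sentiment["score"] >= negative_threshold:
        return "I'm sorry to hear that. Could you tell me more about the problem?"
    return generate(message)[0]["generated_text"]


# Stand-ins (assumptions) that return data shaped like the pipeline outputs.
def fake_classifier(text):
    label = "NEGATIVE" if "broken" in text.lower() else "POSITIVE"
    return [{"label": label, "score": 0.95}]


def fake_generator(text):
    return [{"generated_text": text + " ... (model continuation)"}]


reply_negative = respond("My order arrived broken.", fake_classifier, fake_generator)
reply_positive = respond("Tell me about transformers.", fake_classifier, fake_generator)
print(reply_negative)
print(reply_positive)
```

In a notebook like this one, you could pass the `classifier` and `generator` pipeline objects in place of the stand-ins, since both are callable on a string.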

### Retrieval Augmented Generation (RAG)

RAG systems combine retrieval (finding relevant information from a knowledge base) and generation (using a language model to generate a response based on the retrieved information). You can use:
- **Question Answering:** To extract precise answers from retrieved documents.
- **Text Generation:** To synthesize information from retrieved documents into a coherent response.
- **Text Classification:** To filter or categorize retrieved documents.

A RAG workflow usually involves:
1. Receiving a user query.
2. Using the query to retrieve relevant documents from a database or knowledge base.
3. Passing the query and retrieved documents to a language model for generation or question answering.
4. Returning the generated response to the user.
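
The four steps above can be sketched in a few lines. The retriever here is a deliberately naive word-overlap ranking, standing in for a real vector search, and `answer_fn` is any callable that accepts `question=` and `context=`, as the question-answering pipeline does. All function names are illustrative assumptions, and `fake_qa` is a stand-in so the sketch runs offline.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Step 2: rank documents by the number of words shared with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]


def answer_query(query, documents, answer_fn, top_k=1):
    """Steps 1-4: take a query, retrieve context, and pass both to the model."""
    context = " ".join(retrieve(query, documents, top_k))
    return answer_fn(question=query, context=context)


documents = [
    "Hugging Face is a company based in Paris.",
    "Hugging Face is creating a tool that democratizes AI through open-source libraries.",
]


def fake_qa(question, context):
    # Stand-in for a QA model: simply echoes the retrieved context.
    return {"answer": context, "score": 1.0}


result = answer_query("What is Hugging Face creating?", documents, fake_qa)
print(result["answer"])
```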

### Analytics

In analytics pipelines, you can use these components to process and analyze text data:
- **Text Classification:** To categorize documents or perform sentiment analysis.
- **Named Entity Recognition:** To extract entities like names, organizations, or locations for analysis.
- **Summarization:** To create concise summaries of large volumes of text data.

You would typically integrate these models into data processing pipelines using libraries like pandas or Spark to process text data in bulk and extract insights.
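
As a small illustration of the entity-extraction step, the sketch below merges word-piece tokens such as `Hu` and `##gging` into whole entities and counts them by type. The token records are copied from the NER output earlier in this notebook, with scores omitted for brevity. The helper `group_word_pieces` is a hypothetical name, and the grouping rule (same entity type, consecutive token index) is a simplifying assumption; the pipeline's own `aggregation_strategy` argument is likely the better choice in practice.

```python
from collections import Counter


def group_word_pieces(text, tokens):
    """Merge adjacent tokens of the same entity type into (surface text, type) pairs."""
    groups = []
    for tok in tokens:
        entity_type = tok["entity"].split("-")[-1]  # 'I-ORG' -> 'ORG'
        last = groups[-1] if groups else None
        if last and last["type"] == entity_type and tok["index"] == last["last_index"] + 1:
            # Extend the current entity to cover this token.
            last["end"] = tok["end"]
            last["last_index"] = tok["index"]
        else:
            groups.append({"type": entity_type, "start": tok["start"],
                           "end": tok["end"], "last_index": tok["index"]})
    # Recover each entity's surface form from the character offsets.
    return [(text[g["start"]:g["end"]], g["type"]) for g in groups]


text = "Hugging Face is a company based in Paris."
tokens = [
    {'entity': 'I-ORG', 'index': 1, 'word': 'Hu', 'start': 0, 'end': 2},
    {'entity': 'I-ORG', 'index': 2, 'word': '##gging', 'start': 2, 'end': 7},
    {'entity': 'I-ORG', 'index': 3, 'word': 'Face', 'start': 8, 'end': 12},
    {'entity': 'I-LOC', 'index': 9, 'word': 'Paris', 'start': 35, 'end': 40},
]

entities = group_word_pieces(text, tokens)
type_counts = Counter(entity_type for _, entity_type in entities)
print(entities)
print(type_counts)
```

The same function can be applied row by row over a column of documents, with the counters summed to give corpus-level statistics.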

This is just a high-level overview. The specific implementation will depend on the complexity of your workflow and the tools you are using.