✅ What is a pipeline in Hugging Face?
The pipeline is a high-level API provided by Hugging Face’s 🤗 transformers library.

It’s a ready-to-use abstraction that:

* Loads a model
* Preprocesses your input
* Runs inference
* Postprocesses the output

✅ All with just one line of code.

🧠 Analogy:
Think of a pipeline as a smart chatbot helper. You just say “Translate this” or “Summarize this,” and it automatically knows:

* Which model to use
* How to tokenize (preprocess)
* How to run the model
* How to decode the output
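
The stages listed above can be sketched with a toy stand-in. This is not how `transformers` implements `pipeline`; the `ToyPipeline` class, its tiny word-score table, and the scoring rule are all hypothetical, chosen only to make the flow from raw text to a labeled result visible.

```python
class ToyPipeline:
    """Hypothetical stand-in that mimics the preprocess -> forward -> postprocess flow."""

    def __init__(self):
        # "Loading a model": here just a hand-written word-score table (an assumption).
        self.weights = {"love": 2.0, "amazing": 1.5, "hate": -2.0, "awful": -1.5}

    def preprocess(self, text):
        # Tokenize: lowercase, split on whitespace, strip simple punctuation.
        return [tok.strip(".,!?") for tok in text.lower().split()]

    def forward(self, tokens):
        # "Inference": sum the scores of known tokens into one raw score.
        return sum(self.weights.get(tok, 0.0) for tok in tokens)

    def postprocess(self, score):
        # Turn the raw score into a human-readable label.
        return {"label": "POSITIVE" if score >= 0 else "NEGATIVE", "score": score}

    def __call__(self, text):
        # One call chains all stages, which is the convenience a pipeline offers.
        return self.postprocess(self.forward(self.preprocess(text)))


pipe = ToyPipeline()
print(pipe("I love this phone! It's amazing."))
```

A real pipeline replaces each stage with a tokenizer, a neural network, and task-specific decoding, but the calling pattern is the same.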



✅ Why Use pipeline?

| Without pipeline                                             | With pipeline                          |
| ------------------------------------------------------------ | -------------------------------------- |
| You must manually load model, tokenizer, run inference, etc. | Everything handled automatically       |
| More flexibility, more code                                  | Easier and faster for standard tasks   |
| Great for customization                                      | Great for productivity and prototyping |


✅ Types of Pipelines
You can instantiate pipelines for many built-in tasks, e.g.:


* `pipeline("text-classification")`
* `pipeline("summarization")`
* `pipeline("translation_en_to_fr")`
* `pipeline("question-answering")`
* `pipeline("fill-mask")`
* `pipeline("feature-extraction")`
* `pipeline("zero-shot-classification")`
* `pipeline("text-generation")`

Each one has:

* A default model (can be overridden with the `model` argument)
* A matching tokenizer
* A preprocessing + postprocessing wrapper

In [1]:
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2")

In [2]:
# Generate text
prompt = "Once upon a time"
result = pipe(prompt, max_length=50, num_return_sequences=1)

print(result[0]['generated_text'])

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


Once upon a time, you have had access to a game that had been cut with a feature called "The Final Quest". You are now forced to navigate through a completely new world, and each new adventure was created by a different team from around Europe


The sections below explain each task. All are common NLP (Natural Language Processing) tasks that transformer models (such as GPT, Mistral, and BERT) can perform, especially through Hugging Face pipelines or LangChain chains.

---

### ✅ 1. **Text Classification**

* **What it does**: Assigns labels to entire texts (e.g., sentiment analysis, topic detection).
* **Example**:
  Input: *"This product is amazing!"*
  Output: `label: positive`
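
A classifier produces one raw score (logit) per label, and the reported label and confidence typically come from a softmax over those scores. The sketch below shows that last step only; the logits and the `id2label` mapping are made-up values, not output from a real model.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one input (assumption: index 0 = NEGATIVE, 1 = POSITIVE).
logits = [-3.2, 4.1]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the highest probability
result = {"label": id2label[best], "score": probs[best]}
print(result)
```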

---

### ✅ 2. **Token Classification**

* **What it does**: Labels each **token** (word/subword) in the input.
* **Used for**: Named Entity Recognition (NER), Part-of-Speech tagging.
* **Example**:
  Input: *"Elon Musk founded SpaceX."*
  Output: `Elon Musk → PERSON`, `SpaceX → ORG`

---

### ✅ 3. **Table Question Answering**

* **What it does**: Answers questions based on **tabular data** (structured rows/columns).
* **Example**:
  Table: Employee salary table
  Question: *"Who earns the highest salary?"*
  Output: `"Alice Johnson"`

---

### ✅ 4. **Question Answering**

* **What it does**: Extracts answers from **contextual paragraphs**.
* **Example**:
  Context: *"Paris is the capital of France."*
  Question: *"What is the capital of France?"*
  Output: `"Paris"`
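
Extractive QA models usually score every token as a possible answer start and a possible answer end, and the answer is the span between the best start and the best end. The sketch below illustrates that selection with hand-written tokens and scores; all values are hypothetical.

```python
# Hypothetical tokens for "question [SEP] context" and made-up per-token scores.
tokens = ["[CLS]", "what", "is", "the", "capital", "of", "france", "?", "[SEP]",
          "paris", "is", "the", "capital", "of", "france", ".", "[SEP]"]
start_logits = [0.1, 0.0, 0.0, 0.0, 0.2, 0.0, 0.3, 0.0, 0.0,
                6.0, 0.1, 0.0, 0.4, 0.0, 0.5, 0.0, 0.0]
end_logits = [0.1, 0.0, 0.0, 0.0, 0.1, 0.0, 0.2, 0.0, 0.0,
              5.5, 0.2, 0.0, 0.3, 0.0, 0.9, 0.0, 0.0]

def argmax(values):
    # Index of the largest value.
    return max(range(len(values)), key=values.__getitem__)

start = argmax(start_logits)
end = argmax(end_logits)  # inclusive end index

# Guard against an end that precedes the start, which would not be a valid span.
answer = " ".join(tokens[start:end + 1]) if end >= start else ""
print(answer)
```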

---

### ✅ 5. **Zero-Shot Classification**

* **What it does**: Classifies text into user-defined categories **without any fine-tuning**.
* **Example**:
  Input: *"I love designing apps."*
  Labels: `["tech", "sports", "politics"]`
  Output: `label: tech (confidence: 0.94)`

---

### ✅ 6. **Translation**

* **What it does**: Translates text between languages.
* **Example**:
  Input: *"Bonjour tout le monde"*
  Output: `"Hello everyone"`

---

### ✅ 7. **Summarization**

* **What it does**: Compresses long text into a shorter summary while preserving key meaning.
* **Example**:
  Input: A 3-paragraph news article
  Output: `"NASA announced a new space telescope."`

---

### ✅ 8. **Feature Extraction**

* **What it does**: Converts text into a **numerical embedding** (vector).
* **Used in**: RAG, search, clustering, recommendation.
* **Example**:
  Input: *"How to bake a cake?"*
  Output: `[0.123, 0.434, -0.999, ...]`

---

### ✅ 9. **Text Generation**

* **What it does**: Autocompletes or continues a piece of text.
* **Example**:
  Prompt: *"Once upon a time,"*
  Output: *"there was a brave knight who fought dragons."*
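
Continuing a text means repeatedly picking a next token given what has been written so far. The sketch below uses greedy decoding (always take the most likely next token) over a hand-written lookup table; the `bigram` table is a hypothetical stand-in for a language model, which would condition on the whole prefix rather than only the last word.

```python
# Hypothetical next-word probabilities keyed by the previous word.
bigram = {
    "time,": {"there": 0.7, "the": 0.3},
    "there": {"was": 0.9, "is": 0.1},
    "was": {"a": 0.8, "the": 0.2},
    "a": {"brave": 0.6, "small": 0.4},
    "brave": {"knight": 0.9, "dog": 0.1},
    "knight": {"<eos>": 1.0},
}

def generate(prompt, max_new_tokens=10):
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        candidates = bigram.get(tokens[-1])
        if not candidates:
            break  # the toy table has no continuation for this word
        next_token = max(candidates, key=candidates.get)  # greedy choice
        if next_token == "<eos>":
            break  # end-of-sequence marker stops generation
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("Once upon a time,"))
```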

---

### ✅ 10. **Text2Text Generation**

* **What it does**: Transforms input into new text (paraphrasing, summarizing, Q\&A, etc.)
* **Common with**: T5, FLAN-T5, BART.
* **Example**:
  Input: *"Translate English to French: Hello."*
  Output: `"Bonjour"`

---

### ✅ 11. **Fill-Mask**

* **What it does**: Predicts masked words in a sentence.
* **Used for**: Cloze tests, sentence completion.
* **Example**:
  Input: *"The capital of France is `<mask>`."* (the mask token is model-specific; BERT-style models use `[MASK]`)
  Output: `"Paris"`

---

### ✅ 12. **Sentence Similarity**

* **What it does**: Measures how semantically similar two sentences are.
* **Used for**: Plagiarism detection, duplicate questions, search ranking.
* **Example**:
  Sentence A: *"How do I cook pasta?"*
  Sentence B: *"What's the process for making spaghetti?"*
  Output: `Similarity score: 0.89`
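
Similarity scores of this kind are commonly the cosine of the angle between two sentence embeddings. A minimal sketch of that formula follows; the three short vectors are invented for illustration and do not come from any model.

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); ranges from -1 (opposite) to 1 (same direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional embeddings (real ones have hundreds of dimensions).
vec_a = [0.2, 0.7, 0.1, 0.4]     # e.g. "How do I cook pasta?"
vec_b = [0.25, 0.6, 0.0, 0.5]    # e.g. a paraphrase of the same question
vec_c = [-0.6, 0.1, 0.8, -0.2]   # e.g. an unrelated sentence

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
print(f"paraphrase: {sim_ab:.3f}, unrelated: {sim_ac:.3f}")
```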

---

### ✅ 13. **Text Ranking**

* **What it does**: Ranks multiple candidate texts based on relevance to a query.
* **Used in**: Search engines, document retrieval, rerankers in RAG.
* **Example**:
  Query: *"What is BERT?"*
  Candidates:

  1. Article on transformers
  2. Blog about BERT
  3. Ad for kitchen appliances

  Output: `Rank: [2, 1, 3]`
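
Once each candidate has a relevance score, ranking is just a sort. The sketch below reproduces the ordering from the example; the scores are hypothetical placeholders for what a reranker model would return.

```python
query = "What is BERT?"
candidates = [
    "Article on transformers",
    "Blog about BERT",
    "Ad for kitchen appliances",
]

# Hypothetical relevance scores, one per candidate (higher = more relevant).
scores = [0.62, 0.97, 0.01]

# Sort candidate indices by descending score, then convert to 1-based positions.
order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
rank = [i + 1 for i in order]
print(rank)  # [2, 1, 3]
```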

---

## 🔍 Summary Table

| Task                     | Granularity     | Use Case                               |
| ------------------------ | --------------- | -------------------------------------- |
| Text Classification      | Full text       | Sentiment, topics                      |
| Token Classification     | Word-level      | NER, POS tagging                       |
| Table QA                 | Structured data | QA over Excel/CSV-like data            |
| Question Answering       | Paragraph       | SQuAD-style QA                         |
| Zero-Shot Classification | Any             | Classification with no labeled data    |
| Translation              | Sentence        | Multilingual applications              |
| Summarization            | Paragraph       | Article or report compression          |
| Feature Extraction       | Sentence/vector | Embeddings for ML, RAG                 |
| Text Generation          | Open-ended      | Creative or structured text generation |
| Text2Text Generation     | Input-to-output | QA, translation, summarization, etc.   |
| Fill-Mask                | Word-level      | Completion or correction               |
| Sentence Similarity      | Sentence pairs  | Duplicate detection, ranking           |
| Text Ranking             | Query + docs    | Search result reranking                |

---



In [3]:
## text classification
from transformers import pipeline

classifier = pipeline("text-classification",model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")
result = classifier("I love this phone! It's amazing.")
print(result)


[{'label': 'POSITIVE', 'score': 0.999883770942688}]


In [4]:
## Token Classification (NER)
ner = pipeline("token-classification", model="dslim/bert-base-NER", aggregation_strategy="simple")  # replaces the deprecated grouped_entities=True
result = ner("Hugging Face is based in New York.")
print(result)

Some weights of the model checkpoint at dslim/bert-base-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'entity_group': 'ORG', 'score': 0.8408983, 'word': 'Hugging Face', 'start': 0, 'end': 12}, {'entity_group': 'LOC', 'score': 0.99925005, 'word': 'New York', 'start': 25, 'end': 33}]




In [5]:
## Table Question Answering
from transformers import pipeline

qa = pipeline("table-question-answering", model="google/tapas-large-finetuned-wtq")
table = {
    "Name": ["Alice", "Bob"],
    "Age": ["24", "19"],
    "City": ["London", "Paris"]
}
query = "Who is older?"
print(qa(table=table, query=query))



{'answer': 'Bob', 'coordinates': [(1, 0)], 'cells': ['Bob'], 'aggregator': 'NONE'}

Note: the model's answer is incorrect here; per the table, Alice (24) is older than Bob (19).


In [6]:
## Question Answering (on Paragraphs)
qa = pipeline("question-answering", model="distilbert/distilbert-base-uncased-distilled-squad")
context = "Paris is the capital of France and one of the most beautiful cities."
question = "What is the capital of France?"
print(qa(question=question, context=context))

{'score': 0.9931169152259827, 'start': 0, 'end': 5, 'answer': 'Paris'}


In [7]:
## Zero-Shot Classification
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
sequence = "I enjoy painting landscapes and portraits."
labels = ["art", "sports", "politics"]
print(classifier(sequence, candidate_labels=labels))

{'sequence': 'I enjoy painting landscapes and portraits.', 'labels': ['art', 'sports', 'politics'], 'scores': [0.9935991168022156, 0.0039911591447889805, 0.0024097126442939043]}


In [9]:
## translation
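translator = pipeline("translation", model="google-t5/t5-base")
# No language pair is given here; this run produced German. Use a task name such as
# "translation_en_to_fr" to choose the pair explicitly.
print(translator("The weather is sunny today."))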

[{'translation_text': 'Das Wetter ist heute sonnig.'}]


In [10]:
## Summarization
summarizer = pipeline("summarization", model="Falconsai/text_summarization")
text = """Hugging Face is a company that develops tools for natural language processing. They created the Transformers library..."""
print(summarizer(text, max_length=30, min_length=10, do_sample=False))


Your max_length is set to 30, but your input_length is only 26. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=13)


[{'summary_text': 'Hugging Face is a company that develops tools for natural language processing . They created the Transformers library...'}]


In [11]:
## Feature Extraction (Embeddings)
feature_extractor = pipeline("feature-extraction", model="BAAI/bge-base-en-v1.5")
vector = feature_extractor("Hugging Face is awesome!")
print(vector[0])

[[-0.641159176826477, -0.197326198220253, 0.5568535327911377, -0.8168917298316956, -0.0997236892580986, 0.2702544927597046, 0.7345966100692749, -0.07503358274698257, -0.162505105137825, -0.5393748879432678, -0.709320604801178, -0.2532331347465515, -0.9462787508964539, 0.00427139550447464, -0.08178513497114182, 0.48399093747138977, -0.01694808155298233, 0.6189571619033813, -0.33883798122406006, 0.61031174659729, 8.425861597061157e-05, -0.6929726004600525, -0.16443412005901337, 0.34209513664245605, 0.6664635539054871, -0.36562618613243103, 0.8190702795982361, -0.08573944121599197, 0.1845795214176178, 0.24272564053535461, 0.48523086309432983, -0.12498156726360321, 0.15445536375045776, -0.9349124431610107, 0.4214376211166382, 0.2639595568180084, -0.044407546520233154, 0.37696272134780884, -0.4832245409488678, 0.34510233998298645, 0.052554864436388016, -0.6860820055007935, 0.05912695825099945, -0.34223437309265137, -0.925268828868866, -0.7785194516181946, -0.7367229461669922, 1.024753093719

In [12]:
## Text2Text Generation
text2text = pipeline("text2text-generation", model="google/flan-t5-base")
print(text2text("Translate English to French: I love you"))



[{'generated_text': "Je m'aime tu"}]


In [14]:
## Fill-Mask
fill_mask = pipeline("fill-mask", model="distilbert/distilbert-base-uncased")
print(fill_mask("The capital of France is [MASK]."))  # Uses models like BERT


[{'score': 0.14268764853477478, 'token': 16766, 'token_str': 'marseille', 'sequence': 'the capital of france is marseille.'}, {'score': 0.09020446240901947, 'token': 25387, 'token_str': 'nantes', 'sequence': 'the capital of france is nantes.'}, {'score': 0.0880827009677887, 'token': 17209, 'token_str': 'toulouse', 'sequence': 'the capital of france is toulouse.'}, {'score': 0.08617933094501495, 'token': 3000, 'token_str': 'paris', 'sequence': 'the capital of france is paris.'}, {'score': 0.07720646262168884, 'token': 10241, 'token_str': 'lyon', 'sequence': 'the capital of france is lyon.'}]


In [None]:
### Model loading without pipeline

In [15]:
# ✅ 1. Text Classification
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

text = "I love this phone! It's amazing."
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
    predicted_class = torch.argmax(logits).item()
    label = model.config.id2label[predicted_class]
    print(label)



POSITIVE


In [16]:
# ✅ 2. Token Classification (NER)
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

text = "Hugging Face is based in New York."
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

tokens = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**tokens).logits
predictions = torch.argmax(outputs, dim=2)

for token, pred in zip(tokenizer.convert_ids_to_tokens(tokens['input_ids'][0]), predictions[0]):
    label = model.config.id2label[pred.item()]
    print(f"{token}: {label}")




Some weights of the model checkpoint at dslim/bert-base-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[CLS]: O
Hu: B-ORG
##gging: I-ORG
Face: I-ORG
is: O
based: O
in: O
New: B-LOC
York: I-LOC
.: O
[SEP]: O
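
The per-token labels above still need to be merged into whole entities, which is what the pipeline's entity grouping does. The sketch below shows one simple way to do it for this BIO-tagged output: join `##` word pieces to the previous token and start a new entity at each `B-` tag. The `group_entities` helper is hypothetical and far simpler than the library's own aggregation strategies.

```python
# Token/label pairs copied from the per-token output above.
tagged = [
    ("[CLS]", "O"), ("Hu", "B-ORG"), ("##gging", "I-ORG"), ("Face", "I-ORG"),
    ("is", "O"), ("based", "O"), ("in", "O"),
    ("New", "B-LOC"), ("York", "I-LOC"), (".", "O"), ("[SEP]", "O"),
]

def group_entities(pairs):
    entities, current = [], None
    for token, label in pairs:
        if label == "O":
            current = None  # an "outside" token closes any open entity
            continue
        prefix, entity_type = label.split("-", 1)
        continues = (
            current is not None
            and prefix == "I"
            and current["entity_group"] == entity_type
        )
        if continues:
            # "##" marks a word piece that attaches to the previous token without a space.
            if token.startswith("##"):
                current["word"] += token[2:]
            else:
                current["word"] += " " + token
        else:
            word = token[2:] if token.startswith("##") else token
            current = {"entity_group": entity_type, "word": word}
            entities.append(current)
    return entities

entities = group_entities(tagged)
print(entities)
```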


In [39]:
# ✅ 3. Table Question Answering (TAPAS)
from transformers import AutoTokenizer, TapasForQuestionAnswering
import pandas as pd
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("google/tapas-base-finetuned-wtq")
model = TapasForQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

# Table (TAPAS expects every cell as a string)
data = {
    "Actors": ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"],
    "Age": ["56", "45", "59"],
    "Number of movies": ["87", "53", "69"],
}
table = pd.DataFrame.from_dict(data)

# Queries
queries = [
    "How many movies has George Clooney played in?",
    "How old is Brad Pitt?"
]

# Encode
inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    predictions = tokenizer.convert_logits_to_predictions(inputs, outputs.logits)

# 🔍 DEBUG: View predictions
print("Raw predictions:", predictions)

# Without aggregation logits the result is a 1-tuple; its only element
# holds one list of (row, column) coordinates per query
answer_coordinates = predictions[0]

# Format answers
for query, coords in zip(queries, answer_coordinates):
    if not coords:
        print(f"Q: {query}\nA: Not found\n")
        continue

    # Look up each predicted cell in the table
    answers = [str(table.iat[row, col]) for row, col in coords]
    print(f"Q: {query}\nA: {', '.join(answers)}\n")


Raw predictions: ([[(2, 2)], [(0, 1)]],)
Q: How many movies has George Clooney played in?
A: 69

Q: How old is Brad Pitt?
A: 56



In [23]:
# ✅ 4. Paragraph QA
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

context = "Paris is the capital of France."
question = "What is the capital of France?"
model_name = "distilbert-base-cased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    answer_start = torch.argmax(outputs.start_logits)
    answer_end = torch.argmax(outputs.end_logits) + 1
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(inputs['input_ids'][0][answer_start:answer_end]))
    print(answer)



Paris


In [24]:
# ✅ 5. Zero-shot Classification
# Note: zero-shot classification wraps an NLI model with label-scoring logic,
# so it is shown via pipeline() here too.
from transformers import pipeline

sequence = "I enjoy painting."
candidate_labels = ["art", "sports", "politics"]

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
print(classifier(sequence, candidate_labels))



{'sequence': 'I enjoy painting.', 'labels': ['art', 'sports', 'politics'], 'scores': [0.9947479367256165, 0.0031351102516055107, 0.0021169448737055063]}
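
The label-scoring logic behind this result can be sketched without a model. Each candidate label is turned into a hypothesis sentence, an NLI model scores how strongly the input entails it, and the entailment scores are normalized across labels. In the sketch below the hypothesis wording follows the pipeline's default template as far as I know, and the entailment logits are invented numbers, so treat both as assumptions.

```python
import math

sequence = "I enjoy painting."
candidate_labels = ["art", "sports", "politics"]

# One hypothesis per label; an NLI model would score each (sequence, hypothesis) pair.
hypotheses = [f"This example is {label}." for label in candidate_labels]

# Hypothetical entailment logits, one per hypothesis (not real model output).
entailment_logits = [5.1, -0.7, -1.1]

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

scores = softmax(entailment_logits)

# Sort labels by descending score, matching the shape of the output above.
ranked = sorted(zip(candidate_labels, scores), key=lambda pair: pair[1], reverse=True)
result = {
    "sequence": sequence,
    "labels": [label for label, _ in ranked],
    "scores": [score for _, score in ranked],
}
print(result)
```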


In [45]:
# ✅ 6. Translation
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load T5 tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Define the input (prefix + sentence for translation)
input_text = "translate English to French: I love learning AI with Hugging Face."
inputs = tokenizer(input_text, return_tensors="pt")

# Generate translation
outputs = model.generate(**inputs, max_length=40, num_beams=4, early_stopping=True)

# Decode output
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Show result
print("Input:", input_text)
print("🇫🇷 Translated:", translated_text)


Input: translate English to French: I love learning AI with Hugging Face.
🇫🇷 Translated: J'aime apprendre l'AI avec Hugging Face.


In [41]:
# ✅ 7. Summarization
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Falconsai/text_summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("Falconsai/text_summarization")

text = """Hugging Face is a company that develops tools for NLP. They created the Transformers library..."""

inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=30, min_length=10, do_sample=False)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)


Hugging Face is a company that develops tools for NLP. They created the Transformers library...


In [43]:

# ✅ 8. Feature Extraction
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-small-en-v1.5")

text = "Hugging Face is awesome!"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state
    print(embeddings.shape)



torch.Size([1, 7, 384])
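
The shape above is one 384-dimensional vector per token (7 tokens), not one vector for the sentence. To get a single sentence embedding the token vectors have to be pooled. The sketch below shows mean pooling with an attention mask on tiny made-up vectors; mean pooling is only one common choice, and some embedding models expect the first-token vector instead, so check the model card before relying on either.

```python
def mean_pool(token_vectors, attention_mask):
    # Average only the vectors whose mask is 1, so padding does not skew the result.
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

# Hypothetical 4 tokens x 3 dimensions; the last token is padding.
token_vectors = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
    [9.0, 9.0, 9.0],  # padding vector, ignored via the mask
]
attention_mask = [1, 1, 1, 0]

sentence_vector = mean_pool(token_vectors, attention_mask)
print(sentence_vector)  # [2.0, 2.0, 2.0]
```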


In [44]:
# ✅ 9. Text Generation
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")
# Pass the attention mask and an explicit pad token so generate() does not have to guess them
outputs = model.generate(**inputs, max_length=30, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))


Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was


In [46]:
# ✅ 10. Text2Text Generation (T5)
from transformers import T5Tokenizer, T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

input_text = "Translate English to French: I love you"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))



You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565


Je vous aime


In [47]:
# ✅ 11. Fill-Mask
from transformers import BertTokenizer, BertForMaskedLM

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

input_text = "The capital of France is [MASK]."
inputs = tokenizer(input_text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs).logits
    mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
    predicted_token_id = outputs[0, mask_token_index].argmax(axis=-1)
    print(tokenizer.decode(predicted_token_id))



BertForMaskedLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another archite

paris


In [48]:
# ✅ 12. Sentence Similarity
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = ["I like machine learning.", "I enjoy studying AI."]
embeddings = model.encode(sentences, convert_to_tensor=True)
score = util.pytorch_cos_sim(embeddings[0], embeddings[1])
print("Similarity:", score.item())



Similarity: 0.5766932368278503


In [49]:
# ✅ 13. Text Ranking
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "BAAI/bge-reranker-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

query = "Benefits of drinking water"
docs = [
    "Water helps regulate body temperature.",
    "Baking cookies requires flour and sugar."
]

scores = []
for doc in docs:
    inputs = tokenizer(query, doc, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
        score = torch.sigmoid(logits)[0].item()
        scores.append(score)

ranked = sorted(zip(docs, scores), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.3f} → {doc}")

0.980 → Water helps regulate body temperature.
0.000 → Baking cookies requires flour and sugar.
