Authored by: Aryan Mistry

# Transformers Pipeline Quickstart

Hugging Face’s `transformers` library provides high‑level *pipelines* that combine tokenization, model inference and decoding into a single call. Pipelines make it easy to perform common NLP tasks such as sentiment analysis, text generation, translation and question answering without deep knowledge of the underlying model architecture.

This updated notebook introduces several pipelines and includes exercises to practise using them. The code cells will not execute in this environment because the `transformers` package is not installed. On your own machine you can install it via `pip` and run the examples. [1]

## 1 – Installation

To install the `transformers` library along with PyTorch (needed for most models), run the following command in your terminal or notebook. The `--quiet` flag suppresses progress output.

In [1]:
# Install transformers and a backend (e.g. torch)
!pip install --quiet transformers torch
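
Before running the later cells, it can help to confirm that the packages are actually visible to the current Python environment. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is introduced here for illustration and is not part of `transformers`.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two packages installed by the command above.
for name in ("transformers", "torch"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, rerun the install command in the same environment that the notebook kernel uses.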

## 2 – Sentiment Analysis

The sentiment analysis pipeline classifies text by sentiment. The default model, `distilbert-base-uncased-finetuned-sst-2-english`, predicts only `POSITIVE` or `NEGATIVE`; a neutral class is available only from models trained with one. The pipeline downloads the pre‑trained model on first use. You can replace the model name with any compatible text-classification model. [3]

In [8]:
from transformers import pipeline

# Use a checkpoint fine-tuned for text classification
sentiment = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')
sentences = [
    'I love large language models, they are incredibly useful!',
    'This tutorial is confusing and frustrating.',
    'I thought the risotto was pretty average. Not the best, not the worst.'
]
for s in sentences:
    print(s, '->', sentiment(s))

### Sentiment Analysis Exercises

1. **Explore Tone:** Write your own sentences expressing different emotions (joy, sadness, anger, sarcasm) and run them through the sentiment pipeline. Do the predictions match your intuition?
2. **Custom Models:** Change the model to `cardiffnlp/twitter-roberta-base-sentiment` (or another sentiment model) and compare the outputs.
3. **Batch Processing:** Pass a list of sentences at once to the pipeline and time how long it takes compared to processing them one by one.
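
One way to approach the batch processing exercise is sketched below. The timing helper accepts any callable that returns a list of `{'label', 'score'}` dictionaries, which is the shape the sentiment pipeline produces. `fake_classify` is a hypothetical stand-in so the sketch runs without downloading a model; its labels and scores are made up.

```python
import time
from typing import Callable, Dict, List


def time_single_vs_batch(classify: Callable, sentences: List[str]) -> Dict[str, float]:
    """Time one call per sentence against a single call on the whole list."""
    start = time.perf_counter()
    single_results = [classify(s)[0] for s in sentences]
    single_seconds = time.perf_counter() - start

    start = time.perf_counter()
    batch_results = classify(sentences)
    batch_seconds = time.perf_counter() - start

    # Both approaches should yield one prediction per sentence.
    assert len(single_results) == len(batch_results) == len(sentences)
    return {"one_by_one": single_seconds, "batched": batch_seconds}


def fake_classify(inputs):
    """Hypothetical stand-in that mimics the pipeline's return shape."""
    texts = [inputs] if isinstance(inputs, str) else inputs
    return [
        {"label": "POSITIVE" if "love" in t else "NEGATIVE", "score": 1.0}
        for t in texts
    ]


timings = time_single_vs_batch(fake_classify, ["I love this.", "This is confusing."])
print(timings)
```

To measure a real model, pass the `sentiment` pipeline object in place of `fake_classify`. The numbers will depend on hardware and model size, so treat a single run as indicative only.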

## 3 – Text Generation

The text generation pipeline uses causal language models (like GPT‑2) to extend a prompt. The `max_length` parameter caps the total number of tokens (prompt plus generated text), and sampling settings such as `temperature`, `top_k` and `top_p` influence diversity. The cell below asks GPT‑2 for two continuations of the prompt "In a world where AI helps us learn". [2]

In [9]:
generator = pipeline('text-generation', model='gpt2')
prompt = 'In a world where AI helps us learn'
outputs = generator(prompt, max_length=40, num_return_sequences=2, do_sample=True, top_k=50, top_p=0.95)
for i, out in enumerate(outputs, 1):
    print(f'Generated {i}:', out['generated_text'])

### Generation Exercises

1. **Prompt Variations:** Try prompts from different domains (science, poetry, legal language). How does the style of the output change?
2. **Sampling Parameters:** Experiment with `temperature`, `top_k` and `top_p`. Higher temperatures and larger `top_p` values typically increase creativity but may reduce coherence.
3. **Controlled Length:** Use different `max_length` values. Observe how the model completes the prompt when allowed to generate more or fewer tokens.
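
To build intuition for the sampling parameters in exercise 2, the sketch below applies temperature, top‑k and top‑p to a toy next-token distribution in plain Python. It is a simplified illustration and not the library's implementation; the tokens and logits in `toy_logits` are invented for the example.

```python
import math
import random
from typing import Dict, Optional


def softmax(logits: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Divide logits by the temperature, then normalise to probabilities."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def filter_distribution(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
) -> Dict[str, float]:
    """Apply temperature, then top-k, then top-p, renormalising after each cut."""
    probs = softmax(logits, temperature)
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    if top_k is not None:
        # Keep only the k most likely tokens.
        ranked = ranked[:top_k]
        total = sum(p for _, p in ranked)
        ranked = [(tok, p / total) for tok, p in ranked]

    if top_p is not None:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    total = sum(p for _, p in ranked)
    return {tok: p / total for tok, p in ranked}


# Hypothetical logits for four candidate next tokens.
toy_logits = {"learn": 2.0, "grow": 1.0, "play": 0.5, "sleep": -1.0}

cold = filter_distribution(toy_logits, temperature=0.5)
hot = filter_distribution(toy_logits, temperature=2.0)
top2 = filter_distribution(toy_logits, top_k=2)
nucleus = filter_distribution(toy_logits, top_p=0.9)

print("temperature=0.5:", cold)
print("temperature=2.0:", hot)
print("top_k=2:", top2)
print("top_p=0.9:", nucleus)

# Draw one token from the filtered distribution with a seeded generator.
rng = random.Random(0)
sampled = rng.choices(list(nucleus), weights=list(nucleus.values()))[0]
print("sampled:", sampled)
```

A lower temperature concentrates probability on the most likely token, while a higher one flattens the distribution. Top‑k and top‑p then remove the unlikely tail before sampling.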

## 4 – Summarization

Summarization condenses a long passage into a shorter version that preserves the key information. Models like `t5-small` or `facebook/bart-large-cnn` are commonly used; the cell below passes no model name, so the pipeline falls back to its default checkpoint. [13]

In [4]:
summarizer = pipeline('summarization')
text = (
    'Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. ' +
    'The term may also be applied to any machine that exhibits traits associated with a human mind such as learning and problem-solving. ' +
    'The ideal characteristic of artificial intelligence is its ability to rationalize and take actions that have the best chance of achieving a specific goal.'
)
summary = summarizer(text, max_length=60, min_length=25, do_sample=False)
print(summary[0]['summary_text'])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cpu


 Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions . The ideal characteristic of artificial intelligence is its ability to rationalize and take actions that have the best chance of achieving a specific goal .


### Summarization Exercises

1. **Long Article:** Provide a multi‑paragraph article (e.g. copy from a blog post) to the summarizer. How concise is the summary?
2. **Control Length:** Adjust `max_length` and `min_length`. What happens if you set `max_length` too low or too high?
3. **Different Models:** Use `t5-small` instead of the default summarization model. Compare the results.
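
For the long article exercise, note that summarization models accept only a limited number of input tokens, so a multi‑paragraph article may need to be split first. The sketch below shows one simple approach: split on word count, summarize each chunk, and join the partial summaries. It assumes that word count is a rough proxy for token count, and the 300‑word limit is an arbitrary choice. `fake_summarize` is a hypothetical stand-in that mimics the pipeline's `[{'summary_text': ...}]` return shape.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_text(summarize: Callable, text: str, max_words: int = 300) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    partial = [summarize(chunk)[0]["summary_text"].strip() for chunk in chunks]
    return " ".join(partial)


def fake_summarize(chunk: str):
    """Hypothetical stand-in: keeps the first five words of each chunk."""
    return [{"summary_text": " ".join(chunk.split()[:5])}]


article = " ".join(f"word{i}" for i in range(700))
chunks = split_into_chunks(article)
combined = summarize_long_text(fake_summarize, article)
print(len(chunks), "chunks")
print(combined)
```

With a real model, pass a function such as `lambda chunk: summarizer(chunk, max_length=60, min_length=25, do_sample=False)` in place of `fake_summarize`. Splitting on word boundaries can cut sentences in half, so a sentence-aware splitter may give better summaries.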

## 5 – Translation

Translation pipelines convert text between languages using models like MarianMT or T5. You specify the task (e.g. `'translation_en_to_fr'`). [2]

In [5]:
translator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')
sentence = 'Machine learning is changing the world.'
result = translator(sentence)
print(result[0]['translation_text'])

Device set to use cpu


L'apprentissage automatique change le monde.


### Translation Exercises

1. **Multiple Sentences:** Translate a list of sentences in one call. Does the model handle context across sentences?
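2. **Other Languages:** Try translating from English to German (`translation_en_to_de`) or Chinese (`translation_en_to_zh`), and change the `model` argument to a checkpoint for that language pair (for example `Helsinki-NLP/opus-mt-en-de` or `Helsinki-NLP/opus-mt-en-zh`).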
3. **Back Translation:** Translate a sentence to another language and then back to English. Does the meaning change?
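
The back translation exercise can be structured as a small helper that chains two translators. The sketch below takes any two callables that return the pipeline's `[{'translation_text': ...}]` shape. The two `fake_*` functions are hypothetical dictionary lookups, seeded with the sentence pair from the example in this section, so the code runs without downloading models.

```python
from typing import Callable, Dict

# Hypothetical lookup tables standing in for real translation models.
EN_TO_FR = {
    "Machine learning is changing the world.": "L'apprentissage automatique change le monde.",
}
FR_TO_EN = {french: english for english, french in EN_TO_FR.items()}


def fake_en_to_fr(text: str):
    """Stand-in translator that mimics the pipeline's return shape."""
    return [{"translation_text": EN_TO_FR[text]}]


def fake_fr_to_en(text: str):
    """Stand-in translator for the reverse direction."""
    return [{"translation_text": FR_TO_EN[text]}]


def back_translate(forward: Callable, backward: Callable, sentence: str) -> Dict[str, str]:
    """Translate a sentence forward, then back, and return all three versions."""
    intermediate = forward(sentence)[0]["translation_text"]
    round_trip = backward(intermediate)[0]["translation_text"]
    return {"original": sentence, "intermediate": intermediate, "round_trip": round_trip}


result = back_translate(fake_en_to_fr, fake_fr_to_en, "Machine learning is changing the world.")
for key, value in result.items():
    print(f"{key}: {value}")
```

With real models, `forward` could be the `translator` pipeline from this section and `backward` a second pipeline for the reverse direction, for example `pipeline('translation_fr_to_en', model='Helsinki-NLP/opus-mt-fr-en')`. Unlike the lookup tables here, a real round trip will not always reproduce the original sentence.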

## 6 – Question Answering

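The question answering pipeline extracts an answer span from a passage given a question. Models fine-tuned on SQuAD, such as `bert-large-uncased-whole-word-masking-finetuned-squad`, are typical choices; the cell below passes no model name and relies on the pipeline's default. [3]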

In [6]:
question_answerer = pipeline('question-answering')
context = 'Caltech’s Center for Technology and Management Education offers professional development courses in AI.'
question = 'Who offers professional development courses?'
result = question_answerer(question=question, context=context)
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cpu


{'score': 0.7356677055358887, 'start': 0, 'end': 56, 'answer': 'Caltech’s Center for Technology and Management Education'}
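
The `start` and `end` fields in this output are character offsets into the context string, so the answer can be recovered by slicing. The short check below reuses the context and the result dictionary printed above, copied in as literals so it runs without loading a model.

```python
context = 'Caltech’s Center for Technology and Management Education offers professional development courses in AI.'
result = {
    'score': 0.7356677055358887,
    'start': 0,
    'end': 56,
    'answer': 'Caltech’s Center for Technology and Management Education',
}

# Slice the context with the reported character offsets.
span = context[result['start']:result['end']]
print(span)
print(span == result['answer'])
```

The offsets are useful for highlighting the answer inside the original passage.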


### Question Answering Exercises

1. **Your Own Passage:** Write a short paragraph and ask at least three questions about it. Does the model extract the correct answers?
2. **Ambiguous Questions:** Ask questions where the answer is not explicitly stated in the passage. How does the model respond?
3. **Multiple Choice:** Use the pipeline to locate answers in multiple passages (e.g. different sections of a document) and select the passage that contains the best answer.
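
For the third exercise, one possible approach is to run the same question against every passage and keep the result with the highest `score`. The sketch below assumes a callable that takes `question` and `context` keyword arguments and returns a dictionary with a `score` key, as the question answering pipeline does. `fake_qa` is a hypothetical stand-in that scores passages by word overlap, and the first passage is invented for the example.

```python
from typing import Callable, Dict, List


def best_answer(answer_fn: Callable, question: str, passages: List[str]) -> Dict:
    """Query each passage and return the highest-scoring result with its index."""
    candidates = []
    for index, passage in enumerate(passages):
        result = dict(answer_fn(question=question, context=passage))
        result["passage_index"] = index
        candidates.append(result)
    return max(candidates, key=lambda r: r["score"])


def fake_qa(question: str, context: str) -> Dict:
    """Hypothetical stand-in: scores a passage by word overlap with the question."""
    q_words = set(question.lower().rstrip("?").split())
    c_words = set(context.lower().rstrip(".").split())
    overlap = len(q_words & c_words)
    return {"score": overlap / len(q_words), "start": 0, "end": len(context), "answer": context}


passages = [
    "The library is open on weekdays.",
    "Caltech’s Center for Technology and Management Education offers professional development courses in AI.",
]
best = best_answer(fake_qa, "Who offers professional development courses?", passages)
print(best["passage_index"], round(best["score"], 2))
```

To try this with a real model, pass `question_answerer` in place of `fake_qa`. Scores from separate passages are not necessarily comparable in a strict sense, so treat the ranking as a heuristic.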

Foundational LLMs & Transformers

1. Vaswani, A., et al. (2017). Attention is All You Need. Advances in Neural Information Processing Systems (NIPS 2017).
2. Brown, T. B., et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020.
3. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT 2019.
4. OpenAI (2023). GPT-4 Technical Report. arXiv:2303.08774.
5. Touvron, H., et al. (2023). LLaMA 2: Open Foundation and Fine-Tuned Chat Models. Meta AI.

Generative AI & Sampling

6. Goodfellow, I., et al. (2014). Generative Adversarial Nets. NeurIPS 2014.
7. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
8. Neal, R. M. (1993). Probabilistic Inference Using Markov Chain Monte Carlo Methods. Technical Report CRG-TR-93-1, University of Toronto.

Retrieval-Augmented Generation (RAG) & Knowledge Grounding

9. Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP. NeurIPS 2020.
10. deepset ai (2023). Haystack: Open-Source Framework for Search and RAG Applications. https://haystack.deepset.ai
11. LangChain (2023). LangChain Documentation and Cookbook. https://python.langchain.com

Evaluation & Safety

12. Papineni, K., et al. (2002). BLEU: A Method for Automatic Evaluation of Machine Translation. ACL 2002.
13. Lin, C.-Y. (2004). ROUGE: A Package for Automatic Evaluation of Summaries. ACL Workshop 2004.
14. OpenAI (2024). Evaluating Model Outputs: Faithfulness and Grounding. OpenAI Docs.
15. Guardrails AI (2024). Open-Source Guardrails Framework. https://github.com/shreyar/guardrails

Prompt Engineering & Instruction Tuning

16. White, J. (2023). The Prompting Guide. https://www.promptingguide.ai
17. Ouyang, L., et al. (2022). Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022.

Agents & Tool Use

18. Yao, S., et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629.
19. LangChain (2024). LangChain Agents and Tools Documentation.
20. Microsoft (2023). Semantic Kernel Developer Guide. https://learn.microsoft.com/en-us/semantic-kernel/
21. Google DeepMind (2024). Gemini Technical Report. arXiv:2312.11805.

State, Memory & Orchestration

22. LangGraph (2024). Stateful Agent Orchestration Framework. https://langchain-langgraph.vercel.app
23. Park, J. S., et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. arXiv:2304.03442.

Pedagogical and Course Design References

24. fast.ai (2023). fast.ai Deep Learning Course Notebooks. https://course.fast.ai
25. Ng, A. (2023). DeepLearning.AI Short Courses on Generative AI.
26. MIT 6.S191, Stanford CS324, UC Berkeley CS294-158. (2022–2024). Course Materials and Public Notebooks for ML and LLMs.