# Introduction to Transformers with Hugging Face
In this notebook, you'll learn the basics of using pre-trained transformer models for sentiment analysis, summarisation, question answering, and masked-word prediction with the Hugging Face `transformers` library.

In [1]:
# 📦 Install dependencies
!pip install -q transformers datasets

## Task 1: Sentiment Analysis
Let's use a pre-trained model to analyse the sentiment of a sentence.

In [2]:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')

text = "I love the new design of your app!"
result = classifier(text)
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9998569488525391}]
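
The classifier returns a list with one dictionary per input, each holding a `label` and a `score`. Below is a minimal sketch of unpacking that structure. `describe` is a hypothetical helper name, the 0.9 threshold is arbitrary, and the sample list is copied from the output above rather than produced by a live model call.

```python
def describe(predictions, threshold=0.9):
    """Turn pipeline-style predictions into readable one-line strings.

    `threshold` is an arbitrary cut-off chosen for illustration only.
    """
    lines = []
    for pred in predictions:
        confidence = "high" if pred["score"] >= threshold else "low"
        lines.append(f"{pred['label']} ({pred['score']:.2%}, {confidence} confidence)")
    return lines


# Sample copied from the cell output above (not a fresh model call).
sample = [{'label': 'POSITIVE', 'score': 0.9998569488525391}]
for line in describe(sample):
    print(line)
```

With a live pipeline, `describe(classifier(["first text", "second text"]))` should work the same way, since passing a list of strings yields one dictionary per string.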


## Task 2: Text Summarisation
Summarise long passages of text using a summarisation model.

In [6]:
summariser = pipeline('summarization')

long_text = """
The Hugging Face Transformers library provides thousands of pre-trained models to perform tasks on text such as classification, information extraction, question answering, summarization, translation, and more. This allows developers to quickly build state-of-the-art NLP applications.
"""
summary = summariser(long_text, max_length=25, min_length=5, do_sample=False)
print(summary[0]['summary_text'])

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


 Hugging Face Transformers library provides thousands of pre-trained models to perform tasks on text such as classification, information
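
The summary above stops mid-sentence, most likely because `max_length` and `min_length` are measured in tokens rather than words, and generation halts once the cap of 25 is reached. One option is to derive the bounds from the input length instead of hard-coding them. The sketch below is an assumption, not part of the library: `summary_length_bounds` is a hypothetical helper, and the 50% ratio is an arbitrary starting point.

```python
def summary_length_bounds(n_input_tokens, ratio=0.5, floor=5):
    """Suggest (min_length, max_length) in tokens for a summarisation call.

    ratio: target summary size as a fraction of the input (illustrative default).
    floor: the largest min_length this helper will request.
    """
    # Cap the summary at a fraction of the input, but never below floor + 1.
    max_length = max(floor + 1, int(n_input_tokens * ratio))
    # Keep min_length strictly below max_length.
    min_length = min(floor, max_length - 1)
    return min_length, max_length


print(summary_length_bounds(60))  # (5, 30)
print(summary_length_bounds(8))   # (5, 6)
```

In the notebook, the token count can be obtained with `len(summariser.tokenizer(long_text)['input_ids'])`, and the two returned values passed as `min_length=` and `max_length=`.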


## Task 3: Question Answering
Use an extractive model to find the answer to a question within a given context passage.

In [4]:
qa = pipeline('question-answering')

context = "Hugging Face is a company that provides machine learning tools including the Transformers library."
question = "What does Hugging Face provide?"
answer = qa(question=question, context=context)
print(answer['answer'])

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cpu


machine learning tools
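
Besides `answer`, the result dictionary from the question-answering pipeline also carries a `score` and the `start` and `end` character offsets of the answer within the context. The sketch below shows how those offsets relate to the context string. The dictionary is built by hand from the output above, with the offsets located via `str.find`, so it runs without a model. It is a stand-in rather than real pipeline output, and the score is left out because the notebook does not print it.

```python
context = "Hugging Face is a company that provides machine learning tools including the Transformers library."
answer_text = "machine learning tools"  # printed by the cell above

# Hand-built stand-in for the pipeline result ('score' omitted: it was not shown).
start = context.find(answer_text)
answer = {"answer": answer_text, "start": start, "end": start + len(answer_text)}

# Slicing the context with the offsets recovers the answer text.
span = context[answer["start"]:answer["end"]]
print(answer["start"], answer["end"], repr(span))
```

The offsets are handy when the answer should be highlighted in the original passage instead of being shown on its own.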


## 🧠 Task 4: Fill-Mask
Predict missing words in a sentence.

In [7]:
fill_mask = pipeline('fill-mask')
result = fill_mask("Transformers are revolutionizing the field of <mask>.")
for prediction in result:
    print(f"{prediction['sequence']} (score: {prediction['score']:.4f})")

No model was supplied, defaulted to distilbert/distilroberta-base and revision fb53ab8 (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


Transformers are revolutionizing the field of robotics. (score: 0.1995)
Transformers are revolutionizing the field of computing. (score: 0.0961)
Transformers are revolutionizing the field of physics. (score: 0.0409)
Transformers are revolutionizing the field of science. (score: 0.0346)
Transformers are revolutionizing the field of mathematics. (score: 0.0256)
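
The `<mask>` placeholder is specific to RoBERTa-style tokenizers such as the default `distilroberta-base`; BERT-style models expect `[MASK]` instead. Below is a small sketch of keeping the prompt independent of the model. `with_mask` is a hypothetical helper, and the `{mask}` placeholder is a convention invented for this example.

```python
def with_mask(template, mask_token):
    """Replace the single '{mask}' placeholder with a model's mask token."""
    if template.count("{mask}") != 1:
        raise ValueError("template must contain exactly one '{mask}' placeholder")
    return template.replace("{mask}", mask_token)


template = "Transformers are revolutionizing the field of {mask}."
print(with_mask(template, "<mask>"))  # RoBERTa-style token
print(with_mask(template, "[MASK]"))  # BERT-style token
```

In the notebook, the token can be read from the loaded pipeline, as in `fill_mask(with_mask(template, fill_mask.tokenizer.mask_token))`. The number of predictions returned can be changed with the `top_k` argument.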


## 🧠 Customize Your Hugging Face Sentiment Model
By default, `pipeline('sentiment-analysis')` loads a DistilBERT model fine-tuned on SST-2, a dataset of English movie-review sentences. You can specify a different model to match your domain, language, or accuracy needs.

**Examples of specialized models you could use:**
- `distilbert/distilbert-base-uncased-finetuned-sst-2-english` (the default for sentiment analysis; English only)
- `cardiffnlp/twitter-roberta-base-sentiment` (trained on tweets, so suited to social media text)
- `nlptown/bert-base-multilingual-uncased-sentiment` (multilingual; predicts a rating from 1 to 5 stars)

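Make sure the model was fine-tuned for text classification: `sentiment-analysis` is an alias for the `text-classification` pipeline task, and a checkpoint without a trained classification head will produce meaningless labels.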
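
The log printed in Task 1 warns against relying on the default model and revision in production. Below is a minimal sketch of pinning both, using the identifiers reported in that log. The import sits inside the function so that defining it does not trigger a download; `load_pinned_classifier` and `PINNED` are hypothetical names.

```python
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",  # short commit hash reported in the Task 1 log
}


def load_pinned_classifier():
    """Build the pipeline with an explicit model and revision.

    The model files are downloaded on the first call.
    """
    from transformers import pipeline  # imported lazily, only when needed
    return pipeline(**PINNED)
```

Calling `classifier = load_pinned_classifier()` should then behave like the default pipeline, while continuing to resolve to the same weights even if the library's default changes.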

In [8]:
# 🔁 Specify a custom model from Hugging Face Hub
# Replace the model name below with one of your choice
custom_model_name = 'distilbert/distilbert-base-uncased-finetuned-sst-2-english'  # or try 'cardiffnlp/twitter-roberta-base-sentiment'

from transformers import pipeline
classifier = pipeline('sentiment-analysis', model=custom_model_name)

text = "I love the new design of your app!"
result = classifier(text)
print(result)

Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9998569488525391}]
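
Swapping models can change the label vocabulary. The default model returns `POSITIVE` or `NEGATIVE`, `cardiffnlp/twitter-roberta-base-sentiment` returns `LABEL_0`, `LABEL_1`, and `LABEL_2` (negative, neutral, and positive according to its model card), and `nlptown/bert-base-multilingual-uncased-sentiment` returns ratings from `1 star` to `5 stars`. The sketch below maps them onto one scheme. `normalise_label` is a hypothetical helper, and grouping the star ratings into three classes is an assumption made for illustration.

```python
LABEL_MAP = {
    # Default SST-2 model
    "POSITIVE": "positive",
    "NEGATIVE": "negative",
    # cardiffnlp/twitter-roberta-base-sentiment
    # (other models may reuse LABEL_n names with different meanings)
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
    # nlptown star ratings (assumed grouping, not defined by the model)
    "1 star": "negative",
    "2 stars": "negative",
    "3 stars": "neutral",
    "4 stars": "positive",
    "5 stars": "positive",
}


def normalise_label(prediction):
    """Return a copy of one prediction dict with a unified lowercase label."""
    label = prediction["label"]
    return {**prediction, "label": LABEL_MAP.get(label, label.lower())}


# Sample copied from the cell output above.
print(normalise_label({'label': 'POSITIVE', 'score': 0.9998569488525391}))
```

Because the mapping depends on which model produced the label, it is safer to keep one such table per model once more than a few models are in play.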


## ✅ You're Ready!
This notebook gave you a taste of what transformer models can do. Explore more models and tasks on the Hugging Face Hub at [https://huggingface.co/models](https://huggingface.co/models).