### Using Pretrained Models to Classify Text

In [2]:
# Installs the 'transformers' library from Hugging Face
# This library allows working with pre-trained natural language processing (NLP) models
# such as BERT, GPT, RoBERTa, etc.
%pip install transformers


Collecting transformers
  Downloading transformers-4.57.3-py3-none-any.whl.metadata (43 kB)
Collecting filelock (from transformers)
  Downloading filelock-3.20.1-py3-none-any.whl.metadata (2.1 kB)
Collecting huggingface-hub<1.0,>=0.34.0 (from transformers)
  Downloading huggingface_hub-0.36.0-py3-none-any.whl.metadata (14 kB)
Collecting regex!=2019.12.17 (from transformers)
  Downloading regex-2025.11.3-cp313-cp313-win_amd64.whl.metadata (41 kB)
Collecting tokenizers<=0.23.0,>=0.22.0 (from transformers)
  Downloading tokenizers-0.22.1-cp39-abi3-win_amd64.whl.metadata (6.9 kB)
Collecting safetensors>=0.4.3 (from transformers)
  Downloading safetensors-0.7.0-cp38-abi3-win_amd64.whl.metadata (4.2 kB)
Collecting fsspec>=2023.5.0 (from huggingface-hub<1.0,>=0.34.0->transformers)
  Downloading fsspec-2025.12.0-py3-none-any.whl.metadata (10 kB)
Downloading transformers-4.57.3-py3-none-any.whl (12.0 MB)

In [5]:
# Installs the 'torch' library (PyTorch)
# PyTorch is a popular deep learning framework used for building and training
# neural networks, handling tensors, and performing GPU-accelerated computations.
%pip install torch


Note: you may need to restart the kernel to use updated packages.
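
After restarting the kernel, it can help to confirm that both packages are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `installed_version` is an illustrative choice, not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report what the current environment sees for the two packages installed above
for name in ("transformers", "torch"):
    print(name, installed_version(name))
```

A value of `None` suggests the package was installed into a different environment than the one the kernel is using.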


In [2]:
# Imports the 'pipeline' function from the Hugging Face transformers library
# The pipeline function provides an easy way to use pre-trained NLP models
# for tasks like text classification, question answering, text generation, etc.
from transformers import pipeline


In [7]:
# Creates a sentiment analysis pipeline using Hugging Face transformers
# "sentiment-analysis" specifies the NLP task
# 'distilbert/distilbert-base-uncased-finetuned-sst-2-english' is the pre-trained model fine-tuned for sentiment analysis
# 'revision="714eb0f"' pins the model to a specific commit on the Hugging Face Hub, so results stay reproducible
# The resulting 'model' object can be used to analyze the sentiment of text (positive/negative)
model = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    revision="714eb0f"
)


Device set to use cpu
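
The message above reports that the pipeline runs on the CPU. As a hedged sketch, the helper below (the name `pick_device` is hypothetical) chooses a value for the `device` argument of `pipeline`, where `-1` means CPU and `0` means the first CUDA GPU. It assumes PyTorch is the backend and falls back to the CPU if PyTorch cannot be imported.

```python
def pick_device():
    """Return 0 (first CUDA GPU) when PyTorch reports one, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so stay on the CPU
        return -1
    return 0 if torch.cuda.is_available() else -1


print(pick_device())

# Possible usage when creating a pipeline (not executed here):
# model = pipeline("sentiment-analysis", device=pick_device())
```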


In [8]:
# Uses the previously created sentiment analysis pipeline to analyze the sentiment of the given text
# The text: "The long lines and poor customer service really turned me off" expresses a negative experience
# The output is a list with one dictionary per input, each holding a 'label' (POSITIVE or NEGATIVE) and a 'score' (confidence level)
model('The long lines and poor customer service really turned me off')


[{'label': 'NEGATIVE', 'score': 0.9995430707931519}]
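
Because the result is a list containing one dictionary, the label and score have to be unpacked before they can be used elsewhere. Here is a small sketch of one way to do that; `describe_prediction` is an illustrative helper, and the sample value is copied from the output above so the snippet runs without loading the model.

```python
def describe_prediction(predictions):
    """Format the first entry of a pipeline result as 'LABEL (xx.xx%)'."""
    first = predictions[0]
    return f"{first['label']} ({first['score']:.2%})"


# Result copied from the cell above
sample = [{'label': 'NEGATIVE', 'score': 0.9995430707931519}]
print(describe_prediction(sample))  # NEGATIVE (99.95%)
```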

In [9]:

# Load a text-classification pipeline backed by a multilingual sentiment model
pipe = pipeline("text-classification", model="tabularisai/multilingual-sentiment-analysis")



Device set to use cpu



In [12]:
# Classify a new sentence
sentence = "I love this product! It's amazing and works perfectly."
result = pipe(sentence)

# Print the result
print(result)

[{'label': 'Very Positive', 'score': 0.5586305260658264}]


In [13]:
# Classify a new sentence
sentence = "I didn't love this product! It's not amazing and doesn't work perfectly."
result = pipe(sentence)

# Print the result
print(result)

[{'label': 'Negative', 'score': 0.9187023639678955}]
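
The two cells above classify one sentence at a time. A pipeline can also be called with a list of strings, in which case it returns one result per input. The sketch below wraps that pattern in a hypothetical helper, `classify_batch`, which takes any callable that behaves like `pipe` and assumes the default of one label per sentence.

```python
def classify_batch(classifier, sentences):
    """Classify several sentences in one call and pair each with its label and score."""
    results = classifier(sentences)  # one dictionary per input sentence
    return [(text, res['label'], res['score']) for text, res in zip(sentences, results)]


# Possible usage with the pipeline created above (not executed here):
# for text, label, score in classify_batch(pipe, ["Great value.", "Never again."]):
#     print(f"{label:>13}  {score:.3f}  {text}")
```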


It's just as easy to analyze a text string for emotion by loading a different pretrained
model. To demonstrate, try this:

In [14]:
# Creates a text classification pipeline using Hugging Face transformers
# 'text-classification' specifies the NLP task of classifying text into categories
# 'bhadresh-savani/distilbert-base-uncased-emotion' is a pre-trained model fine-tuned for emotion detection
# 'return_all_scores=True' returns the scores for every emotion label, not just the top one
# (recent transformers releases deprecate this argument in favor of 'top_k=None')
# The resulting 'model' object replaces the earlier sentiment pipeline and analyzes emotions in text (e.g., joy, sadness, anger, fear)
model = pipeline(
    'text-classification',
    model='bhadresh-savani/distilbert-base-uncased-emotion',
    return_all_scores=True
)


Device set to use cpu


In [15]:
# Run the emotion classifier on the same sentence used in the sentiment example
model('The long lines and poor customer service really turned me off')

[[{'label': 'sadness', 'score': 0.10836926847696304},
  {'label': 'joy', 'score': 0.0023739372845739126},
  {'label': 'love', 'score': 0.0006029442301951349},
  {'label': 'anger', 'score': 0.8861261606216431},
  {'label': 'fear', 'score': 0.0019340685103088617},
  {'label': 'surprise', 'score': 0.0005936266970820725}]]
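
With all scores returned, the result is a nested list: one inner list per input, holding one dictionary per emotion. As a minimal sketch, the hypothetical helper `top_emotion` below picks the highest-scoring label; the scores are copied from the output above so the snippet runs without loading the model.

```python
def top_emotion(all_scores):
    """Return the (label, score) pair with the highest score for the first input."""
    best = max(all_scores[0], key=lambda item: item['score'])
    return best['label'], best['score']


# Scores copied from the output above
scores = [[{'label': 'sadness', 'score': 0.10836926847696304},
           {'label': 'joy', 'score': 0.0023739372845739126},
           {'label': 'love', 'score': 0.0006029442301951349},
           {'label': 'anger', 'score': 0.8861261606216431},
           {'label': 'fear', 'score': 0.0019340685103088617},
           {'label': 'surprise', 'score': 0.0005936266970820725}]]

label, score = top_emotion(scores)
print(label, round(score, 4))  # anger 0.8861
```

The six scores sum to roughly 1, so each can be read as the model's estimated probability for that emotion.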