# Introduction to Hugging Face Transformers

This notebook provides an introduction to the Hugging Face `transformers` library and demonstrates several key applications of transformer models.

## Topics Covered:
1. Introduction to Hugging Face and Transformers
2. Setting up the environment
3. Text Classification with BERT
4. Named Entity Recognition
5. Text Generation with GPT-2
6. Question Answering with a Transformer
7. Text Summarization
8. Fine-tuning a Transformer Model

## 1. Introduction to Hugging Face and Transformers

Hugging Face is an AI company that develops the `transformers` library, which provides thousands of pre-trained models for tasks on text, images, and audio. The library works with both PyTorch and TensorFlow and is widely used for loading, running, and fine-tuning transformer models.

Transformers are a type of neural network architecture introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017). They revolutionized NLP by enabling parallel processing of sequences and capturing long-range dependencies through self-attention mechanisms.
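
To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not the library's implementation: the toy input `x` is hypothetical, and the same vectors serve as queries, keys, and values, whereas a real transformer layer first applies learned projection matrices and uses several attention heads.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical 3-token sequence with 2-dimensional embeddings
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(x, x, x)
for row in attended:
    print([round(v, 3) for v in row])
```

Every output row is a weighted average of all value vectors, which is how each position can draw on information from any other position in the sequence, and all rows can be computed independently of one another.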

## 2. Setting up the environment

Let's start by installing the necessary libraries and importing the required modules.

In [1]:
# Install the transformers library
#!pip install transformers datasets torch evaluate

In [2]:
# Import necessary libraries
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoModelForCausalLM,
)
from transformers import pipeline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set a random seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)


## 3. Text Classification with BERT

One of the most common applications of transformer models is text classification. We'll use DistilBERT, a smaller distilled version of BERT (Bidirectional Encoder Representations from Transformers), to classify text sentiment.

In [3]:
# Load the sentiment analysis pipeline; with no model specified, it defaults to a DistilBERT checkpoint (a lighter version of BERT)
sentiment_classifier = pipeline("sentiment-analysis")

# Define some example texts
texts = [
    "I absolutely loved this movie! The acting was fantastic.",
    "The service at this restaurant was terrible, and the food was cold.",
    "The product works as expected, nothing special.",
    "I'm not sure if I'll recommend this book to others."
]



No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


In [4]:
# Analyze the sentiment of each text
results = sentiment_classifier(texts)

# Display the results
for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Sentiment: {result['label']}, Score: {result['score']:.4f}\n")

Text: I absolutely loved this movie! The acting was fantastic.
Sentiment: POSITIVE, Score: 0.9999

Text: The service at this restaurant was terrible, and the food was cold.
Sentiment: NEGATIVE, Score: 0.9997

Text: The product works as expected, nothing special.
Sentiment: NEGATIVE, Score: 0.9988

Text: I'm not sure if I'll recommend this book to others.
Sentiment: NEGATIVE, Score: 0.9441



### Custom Text Classification

Let's see how to load a specific checkpoint and its tokenizer directly, and run the tokenization and prediction steps that the pipeline otherwise handles for us.

In [5]:
# Load a DistilBERT model fine-tuned on SST-2, together with its tokenizer
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

In [6]:
# Custom example
text = "The new smartphone has excellent features, but the price is too high."

# Tokenize the input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
print(inputs)

# Get model prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Print results
positive_score = predictions[0, 1].item()
negative_score = predictions[0, 0].item()

print(f"Text: {text}")
print(f"Positive score: {positive_score:.4f}")
print(f"Negative score: {negative_score:.4f}")
print(f"Predicted sentiment: {'Positive' if positive_score > negative_score else 'Negative'}")

{'input_ids': tensor([[  101,  1996,  2047, 26381,  2038,  6581,  2838,  1010,  2021,  1996,
          3976,  2003,  2205,  2152,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
Text: The new smartphone has excellent features, but the price is too high.
Positive score: 0.0089
Negative score: 0.9911
Predicted sentiment: Negative
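
The softmax step in the cell above can be sketched without loading a model. The logits below are hypothetical values chosen for illustration, and the `id2label` mapping is an assumption that mirrors the index order used above (index 0 negative, index 1 positive).

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed label order, matching the indices used in the cell above
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

# Hypothetical raw model outputs for one sentence
logits = [2.3, -2.4]

probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
label = id2label[predicted]

print(f"Probabilities: {[round(p, 4) for p in probs]}")
print(f"Predicted label: {label}")
```

When working with a loaded checkpoint, reading the mapping from `model.config.id2label` avoids hard-coding which index means which class.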


## 4. Named Entity Recognition

Named Entity Recognition (NER) identifies entities such as persons, organizations, locations, etc., in text.

In [7]:
# Load a BERT checkpoint fine-tuned for named entity recognition
model_name = "dslim/bert-base-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create pipeline with the loaded model
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Example text for NER
text = "KAUST Academy is planning on creating a new program sponsored by the Ministry of Environment, Water and Agriculture, There is an Ali here"


Some weights of the model checkpoint at dslim/bert-base-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda:0


In [8]:
# Run NER
results = ner(text)

# Print results in a more readable format
for entity in results:
    print(f"Entity: {entity['word']}")
    print(f"Type: {entity['entity_group']}")
    print(f"Confidence: {entity['score']:.4f}")
    print("-" * 30)

Entity: KAUST Academy
Type: ORG
Confidence: 0.9959
------------------------------
Entity: Ministry of Environment, Water and Agriculture
Type: ORG
Confidence: 0.9980
------------------------------
Entity: Ali
Type: PER
Confidence: 0.4834
------------------------------
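
The grouped entities above come from merging per-token predictions into spans. The sketch below shows the general idea for BIO-style tags (`B-` begins an entity, `I-` continues it, `O` is outside any entity). It is a simplified illustration, not the pipeline's actual aggregation code: the `group_entities` helper and the token and tag lists are hypothetical, and subword handling and score averaging are left out.

```python
def group_entities(tokens, tags):
    """Merge consecutive B-/I- tagged tokens into (text, label) spans."""
    entities = []
    current_words, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag == "O":
            prefix, label = "O", None
        else:
            prefix, label = tag.split("-", 1)

        # A new entity starts on B-, or on I- with a different label
        starts_new = prefix == "B" or (prefix == "I" and label != current_label)

        if prefix == "O" or starts_new:
            # Close the entity collected so far, if any
            if current_words:
                entities.append((" ".join(current_words), current_label))
            current_words, current_label = [], None

        if prefix != "O":
            current_words.append(token)
            current_label = label

    # Flush an entity that runs to the end of the sentence
    if current_words:
        entities.append((" ".join(current_words), current_label))
    return entities


# Hypothetical word-level predictions
tokens = ["KAUST", "Academy", "is", "planning", "a", "program", "with", "Ali"]
tags = ["B-ORG", "I-ORG", "O", "O", "O", "O", "O", "B-PER"]

entities = group_entities(tokens, tags)
print(entities)
```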


### Visualizing NER Results

Let's create a visualization for our NER results.

In [9]:
# Install spaCy if needed
# !pip install spacy
# !python -m spacy download en_core_web_sm

from spacy import displacy

text = "KAUST Academy is planning on creating a new program sponsored by the Ministry of Environment, Water and Agriculture, There is an Ali here"

entities = ner(text)

# Convert to format for displaCy visualization
doc_entities = {"text": text, "ents": [], "title": None}
for entity in entities:
    #print(entity)
    doc_entities["ents"].append(
        {"start": entity["start"], "end": entity["end"], "label": entity["entity_group"]}
    )

# Display the visualization
displacy.render(doc_entities, style="ent", manual=True, jupyter=True)

## 5. Text Generation with GPT-2

Transformer models can generate coherent and contextually relevant text. Let's use GPT-2 for text generation.

In [10]:
# Load text generation pipeline with GPT-2
generator = pipeline("text-generation", model="gpt2")

# Generate text using different prompts
prompts = [
    "Artificial intelligence is",
    "The future of transportation involves",
    "Climate change will impact",
    "KAUST Is"
]

Device set to use cuda:0


In [11]:
for prompt in prompts:
    # Generate text
    generated_text = generator(
        prompt,
        max_length=1000,
        truncation=True,
        num_return_sequences=1,
        temperature=0.7,
        top_p=0.9,       # Nucleus sampling parameter
        do_sample=True,  # Use sampling instead of greedy decoding
    )

    print(f"Prompt: {prompt}")
    print(f"Generated: {generated_text[0]['generated_text']}\n")
    print("-" * 50)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Artificial intelligence is
Generated: Artificial intelligence is a new field that has been attracting a lot of attention in recent years, but it's still only a few years old. A lot of the work that is done is done in a few dozen different domains, so it's not clear if it's feasible to get all that work done in one place, or whether that's a good thing or a bad thing.

So, we decided to start with one of our own. We had an idea for a solution that we thought would be a great solution for a lot of people, and that's what we're doing.

One of the biggest challenges we faced was figuring out how to make it work. We started by writing a bunch of tests, and then we had to figure out how to write them. That took a lot of time. We spent a lot of time figuring out what the test should do, how to implement it, and then we had to figure out how to do it on the fly.

We had to figure out how to make it work.

We had to figure out how to make it work.

We had to figure out how to make it wo

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: The future of transportation involves
Generated: The future of transportation involves a more equitable distribution of economic resources, which means that public transportation should be considered a means to achieve that goal.

This is an important point, and one that has been made repeatedly by other states. The state of Indiana is well placed to make that case.

In 2012, Indiana became the first state in the nation to require its citizens to carry a concealed carry permit. As a result, the state has become the first state in the nation to require its citizens to carry a concealed carry permit.

The Indiana Legislature passed a law in November that requires all Indiana residents to carry a concealed carry permit, which will allow law enforcement to carry concealed weapons in public. The law was passed by a vote of 11 to 4.

It is important to note that this law was not just a measure to make Indiana law-abiding citizens safer, but also to ensure that law enforcement officer

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Climate change will impact
Generated: Climate change will impact on your life. If you live in a climate-controlled area, you should always be aware of the effects of climate change.

There are many factors that can affect your life, including the weather, climate, and weather patterns. The following are some of the most important factors that affect your life:

What kind of food you eat

What type of shelter you have

What kind of equipment you use

What type of health care you have

What type of clothes you have

What type of equipment you have

What type of home you have

What kind of food you have

What type of clothing you have

What type of home you have

What type of food you have

What type of equipment you have

What type of home you have

What type of health care you have

What type of food you have

What type of equipment you have

What type of home you have

What type of food you have

What type of home you have

What type of health care you have

What type of food y

### Controlling Text Generation Parameters

Let's explore how different parameters affect text generation.
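
Before running the model, it may help to see what `temperature` and `top_p` do to a next-token distribution. The sketch below is a plain-Python illustration with hypothetical logits for a four-token vocabulary; the helper names are made up for this example and are not part of the `transformers` API.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total
    probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


# Hypothetical next-token logits
logits = [2.0, 1.0, 0.5, -1.0]

for t in (0.2, 0.7, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

nucleus = top_p_filter(softmax_with_temperature(logits, 0.7), top_p=0.9)
print(f"after top_p=0.9: {[round(p, 3) for p in nucleus]}")
```

Low temperatures concentrate probability on the most likely token, which tends to produce repetitive text, while higher temperatures flatten the distribution and make unlikely tokens more probable. Nucleus sampling then removes the low-probability tail before sampling.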

In [12]:
# Load a larger GPT-2 model for better generation
generator_large = pipeline("text-generation", model="gpt2-medium")

# Single prompt with different temperature settings
prompt = "The key to successful machine learning is"
temperatures = [0.2, 0.7, 1.2]

for temp in temperatures:
    generated_text = generator_large(
        prompt, 
        max_length=100, 
        num_return_sequences=1,
        temperature=temp,
        do_sample=True
    )
    
    print(f"Temperature: {temp}")
    print(f"Generated: {generated_text[0]['generated_text']}\n")

Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Temperature: 0.2
Generated: The key to successful machine learning is to be able to predict what will happen in the future. This is why machine learning is so important for the future of the Internet.

The key to successful machine learning is to be able to predict what will happen in the future. This is why machine learning is so important for the future of the Internet.

The key to successful machine learning is to be able to predict what will happen in the future. This is why machine learning is so important for



Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Temperature: 0.7
Generated: The key to successful machine learning is to be aware of the various types of problems that can arise. We will then discuss the different categories of problems that can be solved and the tools that can be used to achieve them.


How Machine Learning Works

An example of a problem that can be solved with machine learning is predicting the likelihood of an event occurring. Given the following data sets:

X-Files episode 5.5.6

X-Files episode 5.5.

Temperature: 1.2
Generated: The key to successful machine learning is understanding the trade-offs between algorithms and human cognition. If you treat software as a commodity rather than the engine of cognition, there is a tendency to ignore or ignore those limitations, to put too much effort into solving difficult problems in ways human intuition can't. This reduces computing to algorithmic analysis, and to reduce the computational complexity of machine learning, making deep learning very much like neural nets or

## 6. Question Answering with a Transformer

Transformer models can be used for extractive question answering, where the answer is extracted from a given context.
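
Extractive QA models typically score every token as a possible answer start and as a possible answer end, and the answer is the span with the best combined score. The sketch below illustrates that selection step only. The tokens are taken from the context used in this section, but the logits, the `best_span` helper, and the `max_answer_len` limit are hypothetical. A real pipeline also tokenizes the inputs, masks out question tokens, and converts scores into a confidence.

```python
def best_span(start_logits, end_logits, max_answer_len=5):
    """Return the (start, end) token indices with the highest
    start + end score, with end >= start and a bounded length."""
    best, best_score = (0, 0), float("-inf")
    for start, s_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = s_logit + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best, best_score


# Context tokens, with hypothetical per-token scores for the question
# "When were transformers introduced?"
tokens = ["They", "were", "introduced", "in", "a", "2017", "paper"]
start_logits = [0.1, 0.0, 0.2, 0.3, 0.5, 4.0, 0.4]
end_logits = [0.0, 0.1, 0.1, 0.2, 0.3, 3.5, 1.0]

(start, end), score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start : end + 1])
print(f"Answer: {answer} (score: {score:.1f})")
```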

In [13]:
# Load question answering pipeline
qa_pipeline = pipeline("question-answering")

# Define context and questions
context = """
Transformers are neural network architectures that have revolutionized machine learning, particularly in the field of Natural Language Processing (NLP). 
They were introduced in a 2017 paper titled 'Attention Is All You Need' by researchers at Google. The key innovation was the self-attention mechanism, 
which allows the model to weigh the importance of different words in a sentence when making predictions. This addressed limitations in previous 
sequence-to-sequence models that used recurrent neural networks (RNNs). Popular transformer models include BERT (developed by Google), GPT (developed by OpenAI(Sadly, and I hate them)), 
and T5 (also by Google). These models have been pre-trained on massive text datasets and can be fine-tuned for specific tasks.
"""

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


In [14]:
questions = [
    "Who developed BERT?",
    "What is the key innovation in transformers?",
    "When were transformers introduced?",
    "What limitations did transformers address?",
    "Do i like openai ?"
]

# Get answers for each question
for question in questions:
    answer = qa_pipeline(question=question, context=context)
    print(f"Question: {question}")
    print(f"Answer: {answer['answer']}")
    print(f"Confidence: {answer['score']:.4f}\n")

Question: Who developed BERT?
Answer: Google
Confidence: 0.8852

Question: What is the key innovation in transformers?
Answer: self-attention mechanism
Confidence: 0.6998

Question: When were transformers introduced?
Answer: 2017
Confidence: 0.9215

Question: What limitations did transformers address?
Answer: sequence-to-sequence models
Confidence: 0.2421

Question: Do i like openai ?
Answer: I hate them
Confidence: 0.0294



## 7. Text Summarization

Transformers can generate concise summaries of longer texts.

In [15]:
# Load summarization pipeline
summarizer = pipeline("summarization")

# Text to summarize
article = """
    "Artificial intelligence (AI) has rapidly evolved in recent years, transforming various sectors including healthcare, finance, transportation, and entertainment. \n",
    "Machine learning algorithms, particularly deep learning models, have demonstrated remarkable capabilities in image recognition, natural language processing, \n",
    "and decision-making tasks. Companies worldwide are investing billions in AI research and development to gain competitive advantages.\n",
    "However, the rise of AI also brings significant challenges. Ethical concerns about bias in AI algorithms have emerged, as systems may perpetuate or amplify \n",
    "existing societal biases present in training data. Privacy issues are also prominent, especially with the vast amounts of personal data collected to train AI systems. \n",
    "Additionally, there are growing concerns about job displacement as automation capabilities increase.\n",
    "Regulatory frameworks are beginning to emerge globally to address these challenges. The European Union has proposed comprehensive AI regulations that categorize \n",
    "AI systems based on risk levels. The United States is developing sector-specific approaches, while China has implemented its own regulatory system focusing on \n",
    "algorithm transparency and data security.\n",
    "Despite these challenges, AI continues to advance rapidly. Researchers are working on more sophisticated models with enhanced reasoning capabilities, while also \n",
    "developing techniques to make AI more interpretable, fair, and aligned with human values. The future of AI will likely involve finding the right balance between \n",
    "technological innovation and ethical considerations.\n"
"""


No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0


In [16]:
# Generate summary
summary = summarizer(article, max_length=150, min_length=40, do_sample=False)

print("Original text length:", len(article))
print("Summary length:", len(summary[0]['summary_text']))
print("\nSummary:")
print(summary[0]['summary_text'])

Original text length: 1716
Summary length: 507

Summary:
 "Artificial intelligence (AI) has rapidly evolved in recent years, transforming various sectors including healthcare, finance, transportation, and entertainment . Ethical concerns about bias in AI algorithms have emerged, as systems may perpetuate or amplify societal biases present in training data . Privacy issues are also prominent, especially with the vast amounts of personal data collected to train AI systems . There are growing concerns about job displacement as automation capabilities increase .


## 8. Fine-tuning a Transformer Model

While pre-trained models are powerful, fine-tuning them on specific datasets often improves performance for domain-specific tasks. Let's see how to fine-tune DistilBERT, a distilled version of BERT, for sentiment analysis.

In [17]:
# Install required libraries if not already installed
#!pip install datasets evaluate transformers[torch]

In [18]:
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
import numpy as np
import evaluate

# Load a dataset (SST-2 for sentiment analysis)
dataset = load_dataset("glue", "sst2")


In [19]:
print(dataset)

# Show a few examples
for example in dataset["train"].select(range(5)):  # Just show the first five examples
    print(f"Text: {example['sentence']}")
    print(f"Label: {example['label']} ({['negative', 'positive'][example['label']]})\n")

DatasetDict({
    train: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 67349
    })
    validation: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 872
    })
    test: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 1821
    })
})
Text: hide new secretions from the parental units 
Label: 0 (negative)

Text: contains no wit , only labored gags 
Label: 0 (negative)

Text: that loves its characters and communicates something rather beautiful about human nature 
Label: 1 (positive)

Text: remains utterly satisfied to remain the same throughout 
Label: 0 (negative)

Text: on the worst revenge-of-the-nerds clichés the filmmakers could dredge up 
Label: 0 (negative)



In [20]:
# Load pre-trained tokenizer and model
model_checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint, num_labels=2)

# Tokenize the dataset
def tokenize_function(examples):
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Prepare for training
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))  # Using a subset for demonstration
small_eval_dataset = tokenized_datasets["validation"].shuffle(seed=42).select(range(200))
eval_dataset = small_eval_dataset  # Same 200 validation examples, reused for the evaluation before fine-tuning

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="epoch",  # Named `eval_strategy` in newer transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [21]:
# Define metric for evaluation
metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
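
To show what `compute_metrics` calculates, here is a dependency-free sketch of the same logic: take the argmax of each row of logits and compare it with the reference label. The logits and labels are hypothetical, and `accuracy_from_logits` is an illustrative helper rather than part of the `evaluate` library.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy_from_logits(logits, labels):
    """Fraction of examples whose highest-scoring class matches the label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Hypothetical logits for four examples (columns: negative, positive)
logits = [[1.2, -0.3], [-0.8, 0.9], [0.1, 0.4], [2.0, 1.5]]
labels = [0, 1, 0, 0]

result = accuracy_from_logits(logits, labels)
print(result)  # {'accuracy': 0.75}
```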



In [22]:
# Define evaluation arguments
eval_args = TrainingArguments(
    output_dir="./pre_training_results",
    per_device_eval_batch_size=16,
    logging_dir="./pre_training_logs",
    report_to="none", 
)

In [23]:
# Initialize Trainer for evaluation only
pre_trainer = Trainer(
    model=model,
    args=eval_args,
    compute_metrics=compute_metrics,
)

# Evaluate the model before fine-tuning
print("Evaluating model before fine-tuning...")
pre_training_results = pre_trainer.evaluate(eval_dataset=eval_dataset)

Evaluating model before fine-tuning...


In [24]:
print(f"Accuracy before fine-tuning: {pre_training_results['eval_accuracy']:.4f}")

Accuracy before fine-tuning: 0.4750


In [25]:
# Also test on some sample sentences
test_sentences = [
    "I really enjoyed this movie and would recommend it to anyone!",
    "The plot was confusing and the acting was terrible.",
    "It's neither great nor terrible, just average."
]


In [26]:
# Ensure model is in evaluation mode
model.eval()

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Tokenize and predict
test_encodings = tokenizer(test_sentences, padding=True, truncation=True, return_tensors="pt")
test_encodings = {k: v.to(device) for k, v in test_encodings.items()}

with torch.no_grad():
    outputs = model(**test_encodings)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Print results
print("\nModel predictions before fine-tuning:")
for sentence, prediction in zip(test_sentences, predictions):
    positive_score = prediction[1].item()
    sentiment = "positive" if positive_score > 0.5 else "negative"
    print(f"Sentence: {sentence}")
    print(f"Sentiment: {sentiment} (positive score: {positive_score:.4f})\n")



Model predictions before fine-tuning:
Sentence: I really enjoyed this movie and would recommend it to anyone!
Sentiment: negative (positive score: 0.4893)

Sentence: The plot was confusing and the acting was terrible.
Sentiment: negative (positive score: 0.4815)

Sentence: It's neither great nor terrible, just average.
Sentiment: negative (positive score: 0.4744)



In [27]:
# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

# Fine-tune the model
trainer.train()

| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1     | 0.3838        | 0.379482        | 0.845    |
| 2     | 0.1943        | 0.408333        | 0.85     |
| 3     | 0.0477        | 0.46442         | 0.845    |


TrainOutput(global_step=189, training_loss=0.27543795518774206, metrics={'train_runtime': 33.7994, 'train_samples_per_second': 88.759, 'train_steps_per_second': 5.592, 'total_flos': 397402195968000.0, 'train_loss': 0.27543795518774206, 'epoch': 3.0})
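
The `global_step=189` in the output above follows from the dataset size, batch size, and number of epochs. The arithmetic can be sketched as below, assuming a single device, no gradient accumulation, and that the final partial batch of each epoch is kept. The `total_optimizer_steps` helper is illustrative only.

```python
import math


def total_optimizer_steps(num_examples, batch_size, num_epochs):
    """Optimizer steps for a single device without gradient accumulation."""
    # The last batch of an epoch may be smaller, so round up
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


# Values from the training setup above: 1000 examples, batch size 16, 3 epochs
steps = total_optimizer_steps(num_examples=1000, batch_size=16, num_epochs=3)
print(steps)  # 189
```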

In [28]:
# Evaluate the model
eval_results = trainer.evaluate()

In [29]:

print(f"Evaluation accuracy: {eval_results['eval_accuracy']:.4f}")

# Test on some new examples
test_sentences = [
    "I really enjoyed this movie and would recommend it to anyone!",
    "The plot was confusing and the acting was terrible.",
    "It's neither great nor terrible, just average."
]

# Tokenize and predict
device = model.device
test_encodings = tokenizer(test_sentences, padding=True, truncation=True, return_tensors="pt")
test_encodings = {k: v.to(device) for k, v in test_encodings.items()}
with torch.no_grad():
    outputs = model(**test_encodings)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Print results
for sentence, prediction in zip(test_sentences, predictions):
    positive_score = prediction[1].item()
    sentiment = "positive" if positive_score > 0.5 else "negative"
    print(f"Sentence: {sentence}")
    print(f"Sentiment: {sentiment} (positive score: {positive_score:.4f})\n")

Evaluation accuracy: 0.8450
Sentence: I really enjoyed this movie and would recommend it to anyone!
Sentiment: positive (positive score: 0.9418)

Sentence: The plot was confusing and the acting was terrible.
Sentiment: negative (positive score: 0.1229)

Sentence: It's neither great nor terrible, just average.
Sentiment: negative (positive score: 0.3739)



## Conclusion

In this notebook, we've explored the Hugging Face `transformers` library and demonstrated several key applications of transformer models:

1. Text classification for sentiment analysis
2. Named Entity Recognition (NER)
3. Text generation with GPT-2
4. Question answering
5. Text summarization
6. Fine-tuning a pre-trained model

These applications demonstrate the versatility and power of transformer models in natural language processing tasks. The Hugging Face ecosystem makes it easy to leverage these models with minimal code, while also providing the flexibility to customize and fine-tune them for specific use cases.

For more information and advanced use cases, refer to the official Hugging Face documentation: https://huggingface.co/docs/transformers/index
