**Getting Started with Hugging Face Library: NLP Tasks using Transformers**

This notebook is a beginner-friendly guide to using the Hugging Face Transformers library for natural language processing (NLP) tasks. The library provides a wide range of pre-trained models and tools that let developers quickly apply state-of-the-art NLP techniques.

This notebook has been created as part of the Lazy Programmer course on transformers, which focuses on practical implementation and understanding of NLP tasks using Hugging Face. Whether you are a beginner or have some experience in NLP, this notebook will help you get started with Transformers and gain hands-on experience with different NLP tasks.

In this notebook, I will cover six essential NLP tasks: sentiment analysis, text generation, masked language modeling, text summarization, question answering, and zero-shot classification. For each task, I will walk through the process of importing a dataset, loading a pre-trained Transformer model, and applying the model to the dataset. For a more detailed explanation, check out the course on Transformers by Lazy Programmer.

**Table of Contents**
- Sentiment Analysis
- Text Generation
- Masked Language Modelling
- Text Summarization
- Question Answering
- Zero-shot Classification

In [1]:
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, confusion_matrix, roc_auc_score

from transformers import pipeline

import torch


**Sentiment Analysis**

In this section, we will explore sentiment analysis using a pre-trained Transformer model. The Hugging Face library provides a convenient pipeline function that allows us to easily perform sentiment analysis on text.

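First, we create a sentiment analysis pipeline using pipeline("sentiment-analysis"); the required imports were done in the first cell. When no model is specified, the pipeline falls back to a default pre-trained checkpoint, which here is distilbert-base-uncased-finetuned-sst-2-english, a DistilBERT model fine-tuned for sentiment classification.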

In [2]:
classifier = pipeline("sentiment-analysis")
type(classifier)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


transformers.pipelines.text_classification.TextClassificationPipeline
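
The warning above recommends naming the model and revision explicitly. A minimal sketch of that, reusing the checkpoint name and revision reported in the log. The constant names and the build_sentiment_classifier helper are hypothetical, introduced here only for illustration:

```python
# Checkpoint and revision copied from the warning printed by the default pipeline.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
MODEL_REVISION = "af0f99b"


def build_sentiment_classifier(device=-1):
    """Create a sentiment-analysis pipeline pinned to a fixed model and revision.

    Hypothetical helper; device=-1 runs on CPU, device=0 on the first GPU.
    """
    # Imported inside the function so that defining the helper stays cheap.
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        model=MODEL_NAME,
        revision=MODEL_REVISION,
        device=device,
    )
```

Calling build_sentiment_classifier() should load the same checkpoint the default pipeline selected here, and it keeps results reproducible if the library's default model changes later.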

**We can then pass a single sentence or a list of sentences to the classifier and get the predicted sentiment labels and associated confidence scores.**

In [3]:
# Output is a list with one dictionary per input; each dictionary has 'label' and 'score' keys
classifier("This is a great movie")

[{'label': 'POSITIVE', 'score': 0.9998798370361328}]

In [4]:
classifier(["This is a irrelevant movie", "People of this state are helpful"])

[{'label': 'NEGATIVE', 'score': 0.9997994303703308},
 {'label': 'POSITIVE', 'score': 0.9995457530021667}]
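
Because the classifier returns one dictionary per input, in the same order as the inputs, the predictions can be paired back with their sentences. A small sketch under that assumption; label_texts and min_score are hypothetical names, and the predictions are copied from the output above rather than recomputed:

```python
def label_texts(texts, predictions, min_score=0.5):
    """Pair each input text with its predicted label and score.

    `confident` is True when the score reaches `min_score`.
    """
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                "confident": pred["score"] >= min_score,
            }
        )
    return rows


texts = ["This is a irrelevant movie", "People of this state are helpful"]
# Predictions copied from the notebook output above.
predictions = [
    {"label": "NEGATIVE", "score": 0.9997994303703308},
    {"label": "POSITIVE", "score": 0.9995457530021667},
]

rows = label_texts(texts, predictions, min_score=0.99)
for row in rows:
    print(row["label"], row["confident"], row["text"])
```

In the notebook, predictions would simply be the value returned by classifier(texts).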

In [5]:
# Check for a GPU; note that current_device() raises an error when CUDA is unavailable
torch.cuda.is_available(), torch.cuda.current_device()

(True, 0)

In [6]:
# device=0 places the model on the first GPU
classifier = pipeline("sentiment-analysis", device=0)

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
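
Passing device=0 assumes a GPU is present. One way to keep the notebook runnable on CPU-only machines is to derive the index from the availability check. The pick_device helper below is hypothetical, and it relies on the pipeline convention that a device index of -1 means CPU:

```python
def pick_device(cuda_available: bool) -> int:
    """Return a pipeline device index: 0 for the first GPU, -1 for CPU."""
    return 0 if cuda_available else -1


# In the notebook this would be used as follows (assumes torch and pipeline
# are already imported, as in the first cell):
#   classifier = pipeline("sentiment-analysis",
#                         device=pick_device(torch.cuda.is_available()))
print(pick_device(False))  # prints -1
```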
