> Running powerful NLP models with the HuggingFace `transformers` library.

# Intro

Welcome to the third lesson of the course. Let's recap our progress so far:  

- Lesson 1: We made a python environment for LLMs.  
- Lesson 2: Set up a personal blog to track our progress.

Next we will use our first LLM. We'll start with a Natural Language Processing (NLP) model provided by the HuggingFace team.

## Notebook best practices  

First, let's set up our notebook to be fully interactive and easy to use. We can do this with a couple of "magic functions" built-in to Jupyter.  

Specifically, we use the magic `autoreload` and `matplotlib` functions. The cell below shows them in action:    

In [1]:
# best practice notebook magic
%load_ext autoreload
%autoreload 2
%matplotlib inline

Let's take a look at what these magic functions do.

`autoreload` dynamically reloads code libraries, even as they're changing under the hood. That means we do not have to restart the notebook after every change. We can instead code and experiment on the fly.   

`matplotlib inline` automatically displays any plots below the code cell that created them. The plots are also saved in the notebook itself, which is perfect for our blog posts.

All of our notebooks going forward will start with these magic functions.

Let's start with the `"hello, world!"` of NLP: sentiment analysis.

# Sentiment Analysis with HuggingFace

:::: callout-note
The code and examples below are based on the official HuggingFace tutorial, reworked to better suit the course.
::::

Imagine that we're selling some product. And we've gathered a bunch of reviews from a large group of users to find out both the good and bad things that people are saying. The bad reviews will point out where our product needs improving. Positive reviews will show what we're doing right.

Figuring out the tone of a statement (*positive vs. negative*) is an area of NLP known as `sentiment analysis`.

Going through each review would give us a ton of insight about our product. But, it would take a ton of intense and manual effort. Enter Machine Learning to the rescue! An NLP model can automatically analyze and classify the reviews in bulk.

## First, a Pipeline

Let's take a look at the HuggingFace NLP model that we'll run. At a high level, the model is built around three key pieces:  

1. A `Config` file.  
2. A `Preprocessor` file.   
3. `Model` file(s).   

The HuggingFace API has a handy, high-level `pipeline` that wraps up all three objects for us.

:::: callout-important
Before going forward, make sure that the `llm-env` environment from the first lesson is active. This environment has the HuggingFace libraries used below.
::::

The code below uses the `transformers` library to build a Sentiment Analysis `pipeline`.

In [11]:
# load in the pipeline object from HuggingFace
from transformers import pipeline

# create a sentiment analysis pipeline
classifier = pipeline("sentiment-analysis", model="nlptown/bert-base-multilingual-uncased-sentiment")


                                                                                                                                                                                                                        Since we didn't specify a model, you can see in the output above that HuggingFace picked a [distilbert model](distilbert-base-uncased-finetuned-sst-2-english) for us by default.  

We will learn more about what exactly `distilbert` is and how it works later on. For now, think of it as a useful NLP genie who can look at a sentence and tell us whether its has a positive or negative tone.

Next, let's find out what the model thinks about the sentence: `"HuggingFace pipelines are awesome!"`

In [66]:
# sentiment analysis on a simple, example sentence
example_sentence = "Si, Me Gusta AI!"
classifier(example_sentence)

[{'label': '5 stars', 'score': 0.4380775988101959}]

Not bad. We see a strong confident score for a `POSITIVE` label, as could be expected.

We can also pass many sentences at once, which starts to show the bulk processing power of these models. Let's process four sentences at once: three positive ones, and a clearly negative one. 

In [21]:
# many sentences at once, in a python list
many_sentences = [
    "A chicken is a chicken.",
    "Everybody knows that!",
    "A rose is a rose is a rose.",
    "I really like this course so far",
]

# process many sentences at once

results = classifier(many_sentences)

# check the tone of each sentence
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

label: 1 star, with score: 0.2874
label: 5 stars, with score: 0.626
label: 5 stars, with score: 0.33
label: 5 stars, with score: 0.6003


Congrats! You've now ran a HuggingFace pipeline and used it to analyze the tone of a few sentences. Next, let's take a closer look at the pipeline object.

# Going inside the `pipeline`

Under the hood, a pipeline handles three key HuggingFace NLP pieces: Config, Preprocessor, and Model.

To better understand each piece, let's take one small step down the ladder of abstraction and build our own simple pipeline.

We will use the same `distilbert` model from before. First we need the three key pieces mentioned above. Thankfully, we can import each of these pieces from the `transformers` library.

## Config class

The `config` class is a simple map with the options and configurations of a model. It has the key-value pairs that define a model's architecture and hyperparameters.

In [12]:
# config for the model
from transformers import DistilBertConfig

## Preprocessor class

The `preprocessor` object in this case is a `Tokenizer`. Tokenizers convert strings and characters into special tensor inputs for the LLM. 

:::: callout-note
Correctly pre-processing inputs is one of the most important and error-prone steps in using ML models. In other words, it's good to offload to a class that's already been tested and debugged.
::::

In [13]:
# input preprocessor to tokenize strings
from transformers import DistilBertTokenizer

The `model` class holds the weights and parameters for the actual LLM. It's the "meat and bones" of the setup, so to speak.

In [14]:
# the text classifier model
from transformers import DistilBertForSequenceClassification


## Naming the model

We need to know a model's full, proper name in to load it from HuggingFace. Its name is how we find the model on the [HuggingFace Model Hub](https://huggingface.co/docs/hub/models-the-hub).  

Once we know its full name, there is a handy `from_pretrained()` function that will automatically find and download the pieces for us.  

In this case, the distilbert model's full name is:  
> `distilbert-base-uncased-finetuned-sst-2-english`.

In [90]:
# sentiment analysis model name
model_name = 't5-base'

In the code below we can now load each of the three NLP pieces for this model.

In [16]:
# create the config
config = DistilBertConfig.from_pretrained(model_name)

# create the input tokenizer 
tokenizer = DistilBertTokenizer.from_pretrained(model_name)

# create the model
model = DistilBertForSequenceClassification.from_pretrained(model_name)

Next we will compose these three pieces together to mimic the original `pipeline` example.

## Putting together a `simple_pipeline`

### Preprocessing the inputs

First, we create a `preprocess` function to turn a given `text` string into the proper, tokenized inputs than an LLM expects.  

In [18]:
def preprocess(text: str):
    """
    Sends `text` through the model's tokenizer.  
    The tokenizer turns words and characters into proper inputs for an NLP model.
    """
    tokenized_inputs = tokenizer(text, return_tensors='pt')
    return tokenized_inputs


Let's test this preprocessing function on the example sentence from earlier. 

In [22]:
# manually preprocessing the example sentence: "HuggingFace pipelines are awesome!"
preprocess(example_sentence)

{'input_ids': tensor([[ 101, 4205, 2018, 1037, 2307, 2154,  999,  102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]])}

It turned an input string into numerical embeddings for the LLM. We'll breakdown what exactly this output means later on in the course. For now, think of it as sanitizing and formatting the text into a format that the LLM has been trained to work with. 

### Running the model

Next up, let's make our own `forward` function that run the LLM on preprocessed inputs. 

In [23]:
def forward(text: str):
    """
    First we preprocess the `text` into tokens.
    Then we send the `tokenized_inputs` to the model.
    """
    tokenized_inputs = preprocess(text)
    outputs = model(**tokenized_inputs)
    return outputs

In [52]:
tokenizer?

[0;31mSignature:[0m     
[0mtokenizer[0m[0;34m([0m[0;34m[0m
[0;34m[0m    [0mtext[0m[0;34m:[0m [0mUnion[0m[0;34m[[0m[0mstr[0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m][0m[0;34m][0m [0;34m=[0m [0;32mNone[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mtext_pair[0m[0;34m:[0m [0mUnion[0m[0;34m[[0m[0mstr[0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m][0m[0;34m,[0m [0mNoneType[0m[0;34m][0m [0;34m=[0m [0;32mNone[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mtext_target[0m[0;34m:[0m [0mUnion[0m[0;34m[[0m[0mstr[0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m,[0m [0mList[0m[0;34m[[0m[0mList[0m[0;34m[[0m[0mstr[0m[0;34m][0m[0;34m][0m[0;34m][0m [0;34m=[0m [0;32mNone[0m[0;34m,[0m[0;34m[0m


Let's check what this outputs for our running example sentence.

In [24]:
outputs = forward(example_sentence); outputs

SequenceClassifierOutput(loss=None, logits=tensor([[-4.2644,  4.6256]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

You'll see a lot going on in the `SequenceClassifierOutput` above. To be honest, this is where the original `pipeline` does most of the heavy-lifting for us. It takes the raw, detailed output from an LLM and converts it into a more human-readable format.

We'll mimic this heavy-lifting by using the `Config` class and model outputs to find out whether the sentence is positive or negative.

In [25]:
def process_outputs(outs):
    """
    Converting the raw model outputs into a human-readable result.

    Steps:
        1. Grab the raw "scores" from the model for Positive and Negative labels.  
        2. Find out which score is the highest (aka the model's decision).  
        3. Use the `config` object to find the class label for the highest score.  
        4. Turn the raw score into a human-readable probability value.  
        5. Print out the predicted labels with its probability.  
    """
    # 1. Grab the raw "scores" that from the model for Positive and Negative labels
    logits = outs.logits

    # 2. Find the strongest label score, aka the model's decision
    pred_idx = logits.argmax(1).item()

    # 3. Use the `config` object to find the class label
    pred_label = config.id2label[pred_idx]  

    # 4. Calculate the human-readable number for the score
    pred_score = logits.softmax(-1)[:, pred_idx].item()

    # 5. return the label and score in a dictionary
    return {
        'label': pred_label,
        'score': pred_score, 
    }


We can now put together a `simple_pipeline`, and check how it compares to the original `pipeline`.

In [26]:
def simple_pipeline(text):
    """
    Putting the NLP pieces and functions together into a pipeline.
    """
    # get the model's raw output
    model_outs = forward(text)
    # convert the raw outputs into a human readable result
    predictions = process_outputs(model_outs)
    return predictions

Calling the `simple_pipeline` on the example sentence, drumroll please...

In [31]:
# running out simple pipeline on the example text
simple_pipeline("HIIII")

{'label': 'POSITIVE', 'score': 0.9883008599281311}

And just like that, we too a small peek under the `pipeline` hood and built our own, simple working version.  

One pain point: we had to know the full, proper name of the different `Distilbert*` pieces to import the Config, Preprocessor, and Model. This gets overwhelming fast given the flood of LLM models released almost daily. Thankfully, HuggingFace has come up with a great solution to this problem: the `Auto` class.

# True HuggingFace magic: `Auto` classes

With `Auto` classes, we don't have to know the exact or proper name of the LLM's objects to import them. We only need the proper name of the model on the hub: 

In [1]:
# viewing our distilbert model's name
model_name

NameError: name 'model_name' is not defined

Run the cell below to import the Auto classes. Then we'll use them with the model name to create an even cleaner `simple_pipeline`.

In [49]:
# importing the Auto classes
from transformers import AutoConfig
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification

Next we create the three key NLP pieces with the Auto classes.

In [50]:
# building the pieces with `Auto` classes
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

We can now use these pieces to build a `simple_pipeline` class that's cleaner than before, and can handle any model_name:

In [36]:
class SentimentPipeline:
    def __init__(self, model_name: str):
        """
        Simple Sentiment Analysis pipeline.
        """
        self.model_name = model_name
        self.config = AutoConfig.from_pretrained(self.model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self.model = AutoModelForSequenceClassification.from_pretrained(self.model_name)

    def preprocess(self, text: str):
        """
        Sends `text` through the LLM's tokenizer.  
        The tokenizer turns words and characters into special inputs for the LLM.
        """
        tokenized_inputs = self.tokenizer(text, return_tensors='pt')
        return tokenized_inputs

    def forward(self, text: str):
        """
        First we preprocess the `text` into tokens.
        Then we send the `token_inputs` to the model.
        """
        token_inputs = self.preprocess(text)
        outputs = self.model(**token_inputs)
        return outputs

    def process_outputs(self, outs):
        """
        Here we mimic the post-processing that HuggingFace automatically does in its `pipeline`.  
        """
        # grab the raw scores from the model for Positive and Negative labels
        logits = outs.logits

        # find the strongest label score, aka the model's decision
        pred_idx = logits.argmax(1).item()

        # use the `config` object to find the actual class label
        pred_label = self.config.id2label[pred_idx]  

        # calculate the human-readable probability score for this class
        pred_score = logits.softmax(-1)[:, pred_idx].item()

        # return the predicted label and its score
        return {
            'label': pred_label,
            'score': pred_score, 
        }
    
    def __call__(self, text: str):
        """
        Overriding the call method to easily and intuitively call the pipeline.
        """
        model_outs = self.forward(text)
        preds = self.process_outputs(model_outs)
        return preds

## Using the custom `SentimentPipeline`

Let's leverage both the new class and a different model, to show the power of Auto classes.  

For fun, let's use BERT model that was trained specifically on tweets. The full model's name is [`finiteautomata/bertweet-base-sentiment-analysis`](https://huggingface.co/finiteautomata/bertweet-base-sentiment-analysis).

In [64]:
# using a different model
new_model_name = 'finiteautomata/bert-base-spanish-wwm-uncased-reranker-25'

In [84]:
new_model_name??

[0;31mType:[0m        str
[0;31mString form:[0m finiteautomata/bert-base-spanish-wwm-uncased-reranker-25
[0;31mLength:[0m      56
[0;31mDocstring:[0m  
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.

Now let's run it on our handy example sentence.

In [79]:
example_sentence2 = "Te odio!"

In [80]:
# calling our new, flexible pipeline
simple_pipeline(example_sentence2)

{'label': 'LABEL_1', 'score': 0.5055498480796814}

In [51]:
simple_pipeline.model

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(64001, 768, padding_idx=1)
      (position_embeddings): Embedding(130, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-11): 12 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
             

Congrats! You've now built a flexible pipeline for Sentiment Analysis that can leverage most NLP models on the HuggingFace hub.

In [94]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

class TextSummarizationPipeline:
    def __init__(self, model_name: str):
        """
        Simple Text Summarization pipeline.
        """
        self.model_name = model_name
        self.config = AutoConfig.from_pretrained(self.model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(self.model_name)

    def preprocess(self, text: str):
        """
        Preprocesses text for summarization.
        """
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        # Add an "end of summary" token if necessary for your model
        inputs['input_ids'] = self.add_end_of_summary_token(inputs['input_ids'])
        return inputs

    def add_end_of_summary_token(self, input_ids):
        """
        Adds the end of summary token to the input sequence.
        """
        # Modify this based on your model's specific end of summary token ID
        eos_token_id = self.tokenizer.eos_token_id
        input_ids = torch.cat((input_ids, torch.tensor([eos_token_id]).unsqueeze(0)), dim=-1)
        return input_ids

    def forward(self, preprocessed_text):
        """
        Passes preprocessed text through the summarization model.
        """
        # Include preprocessed_text['input_ids'] as decoder_input_ids
        outputs = self.model(**preprocessed_text, decoder_input_ids=preprocessed_text['input_ids'])
        return outputs

    def process_outputs(self, model_outputs):
        """
        Extracts the generated summary text.
        """
        summary_ids = model_outputs.logits.argmax(-1)  # Get most likely token IDs
        summary = self.tokenizer.decode(summary_ids[0], skip_special_tokens=True)  # Decode and remove special tokens
        return summary

    def __call__(self, text: str):
        """
        Summarizes the provided text.
        """
        preprocessed_text = self.preprocess(text)
        model_outputs = self.forward(preprocessed_text)
        summary = self.process_outputs(model_outputs)
        return summary


# Example usage
summarizer = TextSummarizationPipeline("t5-base")
text = "The first 4 points1 maximize the beauty - fully channel the beauty with in. Maybe ask what makes this moment beautiful? See if beauty can be increased in every situation. MtB also could be taken as A use of reason and also a disciplining of the senses to focus on beauty (i.e. all the pretty flowers, all the pretty birds).2 full expression - it takes a lot of effort for one to understand who they are when they are comfortable and how to channel the most real expressions of themselves What holds people back? Shyness, distraction (inability to focus on that which they want express)3 expect rising - this means our expectations are constantly rising. Kinda in line with give em an inch they'll take a mile also related to law of diminishing returns -The law of diminishing returns states that in all productive processes, adding more of one factor of production, while holding all others constant (ceteris paribus), will at some point yield lower incremental per-unit returns. Tooo much ice cream too much cash4 power of pettiness - is the idea that pettiness is the destroyer of all people. That even the best of us can’t truly be unaffected by the petty bullshit around us. They can however minimize its effects. In my own life I am obliterated by friends and coworker’s snide remarks and judgments. How do I minimize its effects? By talking myself down, deep breaths, weed, alchohol, revenge..."
summary = summarizer(text)
print(summary)


NameError: name 'torch' is not defined

In [1]:
model_name


NameError: name 'model_name' is not defined

In [99]:
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-base")

# Example usage
text = " maximize the beauty - fully channel the beauty with in. Maybe ask what makes this moment beautiful? See if beauty can be increased in every situation. MtB also could be taken as A use of reason and also a disciplining of the senses to focus on beauty (i.e. all the pretty flowers, all the pretty birds)"

summary = summarizer(text, max_length=30)

print(summary)


[{'summary_text': 'maximize the beauty - fully channel the beauty with in . also could be taken as a use of reason and also a disciplining'}]


# Conlusion

This notebook went through the basics of using a HuggingFace pipeline to run sentiment analysis on a few sentences. We then looked under the hood at the pipeline's three key pieces: Config, Preprocessor, and Model.   

Lastly, we built our own `simple_pipeline` from scratch to see how the pieces fit together.

The goal of this notebook was two fold. First, we wanted to gain hands-on experience with using the `transformers` API from HuggingFace. It's an incredibly powerful library, that lets us do what used to be difficult, research-level NLP tasks in a few lines of code.  

Second, we wanted to get some familiarity with downloading models. The model weights that we downloaded from HuggingFace are the same ones that we will be fine-tuning, quantizing, and deploying on our devices throughout the course. 

There are two appendixes below. The first one gives a handy way of counting the number of weights in a model. The second one goes into more details about how to interactively debug an analyze the code in a Jupyter notebook.  

# Appendix 1: Counting the number of parameters in a model

The following code snippet counts the number of trainable parameters in a model. It's a question that comes up often when working with LLMs, and having a quick reference to find out a rough model's size often comes in handy. 

In [41]:
def count_parameters(model):
    """
    Counts the number of trainable parameters in a `model`.
    """
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

Here we use it to count the number of parameters in the distilbert model from above. 

In [42]:
# view the number of parameters in the last model used
f"Number of trainable params: {count_parameters(model):,}"

'Number of trainable params: 66,955,010'

# Appendix 2: Inspecting the `classifier`, notebook style.

What is the `classifier` object, exactly? Jupyter has many powerful ways of inspecting and analyzing its code. 

One of the simplest ways of checking an object is to call it by itself in a code cell, as shown below.

In [43]:
# show the contents of the `classifier` object
classifier

<transformers.pipelines.text_classification.TextClassificationPipeline at 0x1282d2290>

We can see the `classifier` is a type of `TextClassification` pipeline. This makes sense: we fed it an input sentence and asked it to classify the statement as *positive* vs. *negative*.

There is also a tab-autocomplete feature to find the members and methods of an object. For example, to look up everything in `classifier`, hit tab after adding a `.`.  

Uncomment the cells below and hit the tab key to test the auto-complete feature.

In [47]:
## tab after the `.` to auto-complete all variables/methods
classifier.       

<transformers.pipelines.text_classification.TextClassificationPipeline at 0x1282d2290>

Let's say you vaguely remember the name of a variable or function, say for example the `forward()` method. In that case you can type the first few letters and hit tab to auto-complete the full set of options:

In [None]:
## tab after the `.for` to auto-complete the rest of the options
# classifier.for

## Asking questions: `?` and `??`

Lastly, we can literally interrogate an object in Jupyter for more information.  

If we tag a single `?` after an object, we'll get its basic documentation (docstring). Note that we omit it here to keep the notebook from getting too busy.

In [None]:
#| output: false
## the power of asking questions
classifier?

[0;31mSignature:[0m      [0mclassifier[0m[0;34m([0m[0;34m*[0m[0margs[0m[0;34m,[0m [0;34m**[0m[0mkwargs[0m[0;34m)[0m[0;34m[0m[0;34m[0m[0m
[0;31mType:[0m           TextClassificationPipeline
[0;31mString form:[0m    <transformers.pipelines.text_classification.TextClassificationPipeline object at 0x11cd918d0>
[0;31mFile:[0m           ~/mambaforge/envs/llm-env/lib/python3.11/site-packages/transformers/pipelines/text_classification.py
[0;31mDocstring:[0m     
Text classification pipeline using any `ModelForSequenceClassification`. See the [sequence classification
examples](../task_summary#sequence-classification) for more information.

Example:

```python
>>> from transformers import pipeline

>>> classifier = pipeline(model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")
>>> classifier("This movie is disgustingly good !")
[{'label': 'POSITIVE', 'score': 1.0}]

>>> classifier("Director tried too much.")
[{'label': 'NEGATIVE', 'score': 0.996}]
```

If we tag on *two* question marks: `??`, then we get the full source code of the object:

In [None]:
#| output: false
## really curious about classifier
classifier??

[0;31mSignature:[0m      [0mclassifier[0m[0;34m([0m[0;34m*[0m[0margs[0m[0;34m,[0m [0;34m**[0m[0mkwargs[0m[0;34m)[0m[0;34m[0m[0;34m[0m[0m
[0;31mType:[0m           TextClassificationPipeline
[0;31mString form:[0m    <transformers.pipelines.text_classification.TextClassificationPipeline object at 0x11cd918d0>
[0;31mFile:[0m           ~/mambaforge/envs/llm-env/lib/python3.11/site-packages/transformers/pipelines/text_classification.py
[0;31mSource:[0m        
[0;34m@[0m[0madd_end_docstrings[0m[0;34m([0m[0;34m[0m
[0;34m[0m    [0mbuild_pipeline_init_args[0m[0;34m([0m[0mhas_tokenizer[0m[0;34m=[0m[0;32mTrue[0m[0;34m)[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0;34mr"""[0m
[0;34m        return_all_scores (`bool`, *optional*, defaults to `False`):[0m
[0;34m            Whether to return all prediction scores or just the one of the predicted class.[0m
[0;34m        function_to_apply (`str`, *optional*, defaults to `"default"`):[0m
[0;34m 

Both `?` and `??` are excellent and quick ways to look under the hood of any object in Jupyter.

## Inspecting a specific `classifier` function

Let's take a look at the function that does the heavy lifting for our sentiment analysis task: `forward()`.

In [None]:
 # looking at what actually runs the inputs
classifier.forward

<bound method Pipeline.forward of <transformers.pipelines.text_classification.TextClassificationPipeline object at 0x11cd918d0>>

What does this function actually do? Let's find out.

In [None]:
# source code of the forward function
classifier.forward??

[0;31mSignature:[0m [0mclassifier[0m[0;34m.[0m[0mforward[0m[0;34m([0m[0mmodel_inputs[0m[0;34m,[0m [0;34m**[0m[0mforward_params[0m[0;34m)[0m[0;34m[0m[0;34m[0m[0m
[0;31mDocstring:[0m <no docstring>
[0;31mSource:[0m   
    [0;32mdef[0m [0mforward[0m[0;34m([0m[0mself[0m[0;34m,[0m [0mmodel_inputs[0m[0;34m,[0m [0;34m**[0m[0mforward_params[0m[0;34m)[0m[0;34m:[0m[0;34m[0m
[0;34m[0m        [0;32mwith[0m [0mself[0m[0;34m.[0m[0mdevice_placement[0m[0;34m([0m[0;34m)[0m[0;34m:[0m[0;34m[0m
[0;34m[0m            [0;32mif[0m [0mself[0m[0;34m.[0m[0mframework[0m [0;34m==[0m [0;34m"tf"[0m[0;34m:[0m[0;34m[0m
[0;34m[0m                [0mmodel_inputs[0m[0;34m[[0m[0;34m"training"[0m[0;34m][0m [0;34m=[0m [0;32mFalse[0m[0;34m[0m
[0;34m[0m                [0mmodel_outputs[0m [0;34m=[0m [0mself[0m[0;34m.[0m[0m_forward[0m[0;34m([0m[0mmodel_inputs[0m[0;34m,[0m [0;34m**[0m[0mforward_params[0m

We can see that it automatically handles whether we're running a TensorFlow (`tf`) or PyTorch (`pt`) model. Then, it makes sure the tensors are on the correct device. Lastly is calls another function, `_forward()` on the prepared inputs.  

We can follow the rabbit hole as far down as needed. Let's take a look at the source of `_forward`.

In [None]:
# going deeper
classifier._forward??

[0;31mSignature:[0m [0mclassifier[0m[0;34m.[0m[0m_forward[0m[0;34m([0m[0mmodel_inputs[0m[0;34m)[0m[0;34m[0m[0;34m[0m[0m
[0;31mDocstring:[0m
_forward will receive the prepared dictionary from `preprocess` and run it on the model. This method might
involve the GPU or the CPU and should be agnostic to it. Isolating this function is the reason for `preprocess`
and `postprocess` to exist, so that the hot path, this method generally can run as fast as possible.

It is not meant to be called directly, `forward` is preferred. It is basically the same but contains additional
code surrounding `_forward` making sure tensors and models are on the same device, disabling the training part
of the code (leading to faster inference).
[0;31mSource:[0m   
    [0;32mdef[0m [0m_forward[0m[0;34m([0m[0mself[0m[0;34m,[0m [0mmodel_inputs[0m[0;34m)[0m[0;34m:[0m[0;34m[0m
[0;34m[0m        [0;31m# `XXXForSequenceClassification` models should not use `use_cache=True` even

Ah, we can see it calls the `model` of the classifier. This is the `distilbert` model we saw earlier! Now we can peek under the hood at the actual Transformer LLM.

In [None]:
# the distilbert sentiment analysis model
classifier.model

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(105879, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12

We will breakdown the different pieces in this model later on in the course.  

The important takeaway for now is that this shows the main structure of most Transformer LLM models. The changes are mostly incremental from this foundation. 