# Introduction to the 🤗 `transformers` library

As mentioned in our glorious introduction, we are going to use the [Hugging Face Hub](https://huggingface.co/docs/hub/index) today.
Let's start with some numbers. At the time of writing, the Hugging Face Hub is a platform with over 350k models, 75k datasets, and 150k demo apps.
Everything is open source and publicly available.

Under the hood, the Hugging Face Hub uses git-based repositories. While you can directly interact with Hub repos as you would with any other git repo, we will be using the Hugging Face `transformers` library.

Installing the `transformers` library is as easy as:

```shell
pip install transformers
```

You also need your favourite deep learning framework (`torch` or `tensorflow`). Today you don't have to worry about dependencies: we've already added everything to the `environment.yaml`.

## Pipelines

With `transformers`, all (NLP) models are a line away from being used. You can choose between a high- and a low-level API, depending on how you want to use them.
On the highest level, there is the `pipeline` function. `pipeline` can be used for most tasks across different modalities. Here are some examples; you can find the complete list in the [`transformers` documentation](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline).

| Task                         | Description                                                           | Modality        | Pipeline identifier                             |
|------------------------------|-----------------------------------------------------------------------|-----------------|-------------------------------------------------|
| Text classification          | assign a label to a given sequence of text                            | NLP             | `pipeline(task="sentiment-analysis")`           |
| Text generation              | generate text given a prompt                                          | NLP             | `pipeline(task="text-generation")`              |
| Summarization                | generate a summary of a sequence of text or document                  | NLP             | `pipeline(task="summarization")`                |
| Image classification         | assign a label to an image                                            | Computer vision | `pipeline(task="image-classification")`         |
| Image segmentation           | assign a label to each individual pixel of an image                   | Computer vision | `pipeline(task="image-segmentation")`           |
| Object detection             | predict the bounding boxes and classes of objects in an image         | Computer vision | `pipeline(task="object-detection")`             |
| Audio classification         | assign a label to some audio data                                     | Audio           | `pipeline(task="audio-classification")`         |
| Automatic speech recognition | transcribe speech into text                                           | Audio           | `pipeline(task="automatic-speech-recognition")` |
| Visual question answering    | answer a question about the image, given an image and a question      | Multi-modal     | `pipeline(task="vqa")`                          |
| Document question answering  | answer a question about the document, given a document and a question | Multi-modal     | `pipeline(task="document-question-answering")`  |
| Image captioning             | generate a caption for a given image                                  | Multi-modal     | `pipeline(task="image-to-text")`                |

Let's use `sentiment-analysis` as an example.

In [None]:
from transformers import pipeline

classifier = pipeline('sentiment-analysis')

As you can see, `pipeline` downloads and caches a pretrained model as well as a tokenizer. You can now use the `classifier`:

In [None]:
classifier("I am thrilled to announce that coffee is the best caffeinated drink in the world.")

Let's take a closer look at `classifier`. `classifier.model` will tell you which model the pipeline is using:

In [None]:
classifier.model

Remember from the previous lab that there are two important concepts: the tokenizer and the model itself. If you need a refresher:

Tokenizing a text is splitting it into words or subwords, which then are converted to ids through a look-up table. You can learn more about tokenizers in the [Hugging Face docs](https://huggingface.co/docs/transformers/tokenizer_summary).
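To make the look-up idea concrete, here is a minimal sketch of a word-level tokenizer. The vocabulary `VOCAB` and both helper functions are made up for illustration; real tokenizers in `transformers` use learned subword vocabularies with tens of thousands of entries and more careful splitting rules.

```python
# Toy word-level tokenizer; the vocabulary is hypothetical and tiny on purpose.
VOCAB = {"[UNK]": 0, "coffee": 1, "is": 2, "the": 3, "best": 4, "drink": 5}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def convert_tokens_to_ids(tokens: list[str]) -> list[int]:
    """Look up each token in the vocabulary; unknown tokens map to [UNK]."""
    return [VOCAB.get(token, VOCAB["[UNK]"]) for token in tokens]


tokens = tokenize("Coffee is the best caffeinated drink")
ids = convert_tokens_to_ids(tokens)
print(tokens)
print(ids)
```

Note how "caffeinated" is not in the toy vocabulary and therefore ends up as the `[UNK]` id. Subword tokenizers reduce this problem by splitting rare words into smaller, known pieces.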

You can also access the pipeline's tokenizer:

In [None]:
classifier.tokenizer

Now, what if you want to use a particular model? You can pass an optional `model` parameter to `pipeline`. Let's say you wanted to analyze French text:

In [None]:
classifier = pipeline('sentiment-analysis', model="nlptown/bert-base-multilingual-uncased-sentiment")

In [None]:
classifier("J'ai le plaisir d'annoncer que le café est la meilleure boisson caféinée au monde.")

Of course, there's no guarantee that different models return output in the same format, so it's best to read the documentation page of a particular model. Speaking of documentation pages, models on the Hugging Face Hub come with what is known as a [_Model Card_](https://huggingface.co/docs/hub/model-cards).

> Model cards are files that accompany the models and provide handy information. Under the hood, model cards are simple Markdown files with additional metadata. Model cards are essential for discoverability, reproducibility, and sharing! You can find a model card as the README.md file in any model repo. ([docs](https://huggingface.co/docs/hub/model-cards))

Had we read [the model card for `bert-base-multilingual-uncased-sentiment`](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment) beforehand, we would not have been surprised by the return value. It was trained on reviews, so no wonder it returns star ratings!
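If you want to work with the rating as a number, a small helper can convert the label. This is only a sketch: it assumes the labels follow the pattern `"1 star"`, `"2 stars"`, ..., `"5 stars"`, and the example result below is a hand-written placeholder in the list-of-dicts shape that `classifier(...)` returns, not real model output.

```python
def stars_from_label(label: str) -> int:
    """Turn a label such as "4 stars" or "1 star" into the integer rating."""
    return int(label.split()[0])


# Hypothetical result for illustration; the score is a placeholder value.
example_result = [{"label": "5 stars", "score": 0.5}]

rating = stars_from_label(example_result[0]["label"])
print(rating)
```

Check the labels your model actually returns before relying on a helper like this; another sentiment model may use `POSITIVE`/`NEGATIVE` instead.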

## `AutoClass`es

As we learnt last week, for (large) language models to work, we need a _tokenizer_ and the _model_ itself. In `transformers`, these two concepts are implemented via tokenizer and model base classes (for example `PreTrainedTokenizer` and `PreTrainedModel`), which the concrete models and their tokenizers inherit from.

Whenever you need a particular model and want more control over it than you get by invoking `pipeline(...)`, you can instantiate it via its class. For instance, for the `nlptown/bert-base-multilingual-uncased-sentiment`:

In [None]:
from transformers import BertForSequenceClassification
from transformers import BertTokenizer

# The model...
BertForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")


In [None]:
# ... and its tokenizer
BertTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

This can be very tedious, however. Remember, there are some 350k models on the Hugging Face Hub and it's not always obvious which model class is the right one for your model. So, how do we overcome this issue?

Enter `AutoClasses`: an `AutoClass` is a shortcut that automatically retrieves the architecture of a pretrained model from its name or path. You only need to select the appropriate `AutoClass` for your task and its associated preprocessing class.
So, again, for `nlptown/bert-base-multilingual-uncased-sentiment`:

In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# The model...
AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

In [None]:
# ... and its tokenizer
AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

Isn't this much more convenient? `AutoClasses` also power `pipeline`!

In [None]:
classifier = pipeline(
    "sentiment-analysis",
    model=AutoModelForSequenceClassification.from_pretrained(
        "nlptown/bert-base-multilingual-uncased-sentiment"
    ),
    tokenizer=AutoTokenizer.from_pretrained(
        "nlptown/bert-base-multilingual-uncased-sentiment"
    ),
)

In [None]:
classifier("Sono lieto di annunciare che il caffè è la migliore bevanda a base di caffeina del mondo.")
# ... or so, my Italian is not that good
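When you call a sequence classification model directly instead of going through `pipeline`, you get raw scores (logits) rather than a label with a score. The following sketch shows the kind of post-processing that turns logits into a prediction. The logits and the label mapping are hypothetical values chosen for illustration; in practice the logits come from the model's output and the label names from its configuration (`id2label`).

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values, not real model output.
logits = [-1.0, 0.0, 0.5, 2.0, 1.0]
id2label = {0: "1 star", 1: "2 stars", 2: "3 stars", 3: "4 stars", 4: "5 stars"}

probabilities = softmax(logits)
# Pick the index of the most probable class and look up its label.
best = max(range(len(probabilities)), key=probabilities.__getitem__)
prediction = {"label": id2label[best], "score": probabilities[best]}
print(prediction)
```

This is the extra control the lower-level API gives you: you could just as well inspect the full probability distribution instead of keeping only the top label.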

This is everything you need to know about `transformers` for the moment. We will cover optimized inference and training large models later.

Now, let's engineer some prompts!

## Prompt Engineering 101

What is prompt engineering? According to Wikipedia:

> Prompt engineering is the process of structuring text that can be interpreted and understood by a generative AI model. A prompt is natural language text describing the task that an AI should perform.

In this section, we are going to engineer some prompts to make the LLM answer all our questions. We'll use the "tiny" LLM [`Qwen/Qwen2.5-0.5B`](https://huggingface.co/Qwen/Qwen2.5-0.5B).

In [None]:
qwen = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B", max_new_tokens=100)

So, what is the task that qwen should perform for us? Answering questions, of course. Let's describe this task:

In [24]:
prompt = \
"""
You are a helpful bot and are answering all questions the human has. 
You only answer the question and do not provide any additional information. 
You are not allowed to ask questions.

The human asks: "{question}"

Your answer:
"""

See the `{question}` placeholder? We can use Python's string formatting to fill it in.

In [None]:
print(prompt.format(question="What is the meaning of life?").strip())
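One caveat with `str.format`: every single curly brace in the template is treated as part of a placeholder. If your prompt needs literal braces, for example in a JSON-inspired layout, you have to double them. The template below is a made-up example to show the escaping, not a recommended prompt.

```python
# Literal braces are doubled ({{ and }}); the placeholder {question} is not.
json_prompt = '{{"question": "{question}", "answer": "'

filled = json_prompt.format(question="What is the meaning of life?")
print(filled)
```

The doubled braces come out as single braces in the filled prompt, while `{question}` is replaced by the actual question.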

Let's ask qwen:

In [None]:
print(qwen(prompt.format(question="What is the meaning of life?").strip())[0]['generated_text'])

Your turn: Tune the prompt such that it only answers the question and doesn't write anything beyond it.

### Leveraging Prompt Engineering to add context

How can we make qwen answer questions about facts? Again, we engineer the prompt!

In [27]:
prompt = \
"""
You are a helpful bot and are answering all questions the human has. 
You only answer the question and do not provide any additional information. 
You are not allowed to ask questions.

You are given the following context:
Big Thought's favorite color is blue.


The human asks: "{question}"

Your answer:
"""

In [None]:
print(qwen(prompt.format(question="What is Big Thought's favorite color?").strip())[0]['generated_text'])

It is often helpful to format your prompts using JSON-, XML-, or markdown-inspired formatting.

In [29]:
prompt = \
"""
<INSTRUCTION>
You are a helpful bot and are answering all questions the human has. 
You only answer the question and do not provide any additional information. 
You are not allowed to ask questions.
</INSTRUCTION>

<CONTEXT>
Big Thought's favorite color is blue.
</CONTEXT>

<QUESTION>
{question}
</QUESTION>

<ANSWER>
"""

In [None]:
print(qwen(prompt.format(question="What is Big Thought's favorite color?").strip())[0]['generated_text'])

This not only helps the model, it also makes it easy for you to parse the output!
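As a minimal sketch of such parsing: the helper below assumes that the generated text starts with the prompt (the `print` calls above show the prompt echoed back) and that the model closes its answer with `</ANSWER>`. The function name `extract_answer` and the example strings are hypothetical; the generation is hand-written, not real model output.

```python
def extract_answer(generated_text: str, prompt: str) -> str:
    """Return the text the model wrote after the prompt, up to </ANSWER>."""
    completion = generated_text[len(prompt):]  # drop the echoed prompt
    answer = completion.split("</ANSWER>")[0]  # ignore anything after the tag
    return answer.strip()


# Hypothetical prompt and generation for illustration.
prompt_text = "<QUESTION>\nWhat is Big Thought's favorite color?\n</QUESTION>\n\n<ANSWER>"
generated = prompt_text + "\nBlue.\n</ANSWER>\n\n<QUESTION>\nWhy?\n</QUESTION>"

print(extract_answer(generated, prompt_text))
```

If the model never emits the closing tag, the helper simply returns everything after the prompt, so it degrades gracefully rather than failing.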

### Your turn: Prompt engineering a simple chatbot

With everything you've learnt so far, you are able to build a simple chatbot.
Use the following prompt "style":
Update the prompt with the previous in- and output every time you query the LLM!

_Hint: Hugging Face `transformers` provides you with [helpful utilities](https://huggingface.co/docs/transformers/main/chat_templating) for prompt engineering chat models!_

We provide you with a simple ipywidgets-based UI.

_Can you prevent Qwen from hallucinating whole conversations?_

In [None]:
import ipywidgets as widgets
from IPython.display import display

prompt = """
<INSTRUCTION>
You are a helpful and friendly bot.
You provide short and concise answers.
You are not allowed to ask questions.
You do not hallucinate or provide false information.
</INSTRUCTION>
"""


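def process_message(message: str, prompt: str):
    # Append the new user turn and open a <BOT> tag for the model to complete.
    full_prompt = prompt + f"<HUMAN>\n{message}\n</HUMAN>\n\n<BOT>\n"
    generated = qwen(full_prompt)[0]['generated_text']
    # The generated text starts with the prompt, so keep only the new part.
    # Splitting on the first "<BOT>" would return the oldest reply once there is history.
    response = generated[len(full_prompt):].split("</BOT>")[0].strip()
    return response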


user_input = widgets.Text(placeholder="Type your message here")
send_button = widgets.Button(description="Send")
chat_area = widgets.Textarea(disabled=True)


def send_message(button):
    global prompt
    message = user_input.value
    chat_area.value = chat_area.value + f"\nHuman: {message}"
    response = process_message(message, prompt)
    chat_area.value = chat_area.value + f"\nBot: {response}"
    user_input.value = ""
    prompt = prompt + f"<HUMAN>\n{message}\n</HUMAN>\n\n<BOT>\n{response}\n</BOT>\n"
    print(prompt)


send_button.on_click(send_message)

layout = widgets.VBox([chat_area, widgets.HBox([user_input, send_button])])

display(layout)

If you are interested in LLMs and prompt engineering, we highly recommend taking a closer look at [`langchain`](https://www.langchain.com) and [LlamaIndex](https://www.llamaindex.ai/)!

----

Now that you have some experience with prompt engineering, it's time to wrap our chatbot (which I assume is working flawlessly at this point ;P) in a more appealing UI. Go back to the Lab02 README to learn about streamlit!