# Module 1 - Applications with LLMs

> **Hugging Face Overview - How to Select a Model**

If I want to summarize a text, and search for a model in hagging face, it has about 176.000 models, and if we filter by "summarize", we get into the thousend models. So what's next?

There are a lot of potential
requirements and needs, and a lot of
techniques to filter by them.

 1) Let's look at some easy choices first.
Filtering by task in the upper left, license,
language: these kinds of hard constraints
can be pretty easy and pretty useful.
For example if you need a commercially
permissive license,
that can be pretty clear-cut.
Look up how large these models are, either in Files and
Versions to get an idea of maybe
the number of gigabytes the PyTorch
representation takes, or if the model is
well documented the number of parameters
it has.
That can be important if you need to
limit hardware requirements for cost or
latency or whatever.
Updates matter; especially if you are
looking at a very old model, it might not
even load properly with the latest
Transformers library.
If you want more details about what is
in these updates to models, it can be useful
to check the git release history.
Well-documented models will document
that.

2) Let's talk about model variants, examples, and data.
On the left I'm recommending to pick
good variants of models for your tasks.
Here's what I mean. When a famous model,
say T5 is published, it's often published with
different sizes: base, a smaller version,
a larger version, maybe an even larger
version. Here I'd say **start prototyping with the
smallest one, just to get moving quickly,
keep costs low. You can always move to a
bigger version which will presumably be
more powerful**.
Not all models are well-documented,
so a good example of usage
will tell you not only what
parameters you may want to tweak or
whatever.
It can also help you avoid needing to
know about model architectures. You don't
need to be an LLM expert in order to
pick a good model, especially if you can
find where someone has already shown it
to be good for your task.

3) Look for fine-tuned variants of
base models, basically, if a
model you pick has been fine-tuned on
a dataset or a task very similar to yours,
it may perform better. Is the model a generalist (good
at everything) or was it fine-tuned to be
great at a specific task? Relatedly, which
datasets were used for pre-training
and/or fine-tuning?
Fine-tuned models in general are going
to be smaller and or perform better if
they match the task that you are doing.
Ultimately though, it's about your data
and users, so define KPIs and metrics,
test on your data and users.

4) The other part of selecting a model is recognizing famous good ones.

5) Other things which are really important
of course are model architecture,
what datasets were used for
pre-training and/or fine-tuning,
and these can cause major differences
between these models.
That said, a lot of these foundation models
really are interrelated, sharing
or selecting from sort of a shared
family of techniques or pre-training
datasets.

## NLP Tasks

> **Common NLP Tasks**

Here is a list of many of the regular tasks that NLP is used for (the ones in bold, are going to be reviewed in the course):
- **Summarization**
- **Sentiment Analysis**
- **Translation**
- **Zero-Shot Classification**
- **Few-Shot Learning**


- Conversation / Chat
- (Table) Question-Answering
- Text / Token Classificacion
- Text Generation


## Prompt

> **Prompt Engineering**

**Prompt engineering is model-specific**.
So prompts will guide a model to complete
the task in the way you wanted, but
different models may require different
prompts. And a lot of guidelines you'll
see out there are specific to one
of the most popular services, ChatGPT
and its underlying OpenAI models.
They may not work for non-ChatGPT
models, but a lot of the techniques do
carry over, even if the specific texts of
the prompts do not.
Different use cases may require
different prompts, and so iterative
development is key, hence engineering.



> **General Tips for Prompt Engineering**
- **A good prompt needs to be
clear and specific**.
Just like when you ask a human to do
something you need to be clear and
specific, that helps with LLMs as well.
A good prompt often consists of an
instruction, some context or background
information, an input or question,
output type or format.
You should describe the high level task
with clear commands. That may mean
specific keywords like Classify,
Translate, so on,
or including detailed instructions.
And finally, this is engineering, so test
different variations of the prompt
across different samples. Use a
data-driven approach here: what prompt does
better on average for your set of inputs.

- There are also techniques for helping
the model to reach a better answer,
to sort of think better, you can tell the **model not to**,
and that can help to delivered a better answer.
You can **ask the model not to make
things up**. You've probably heard of the
term hallucination, where models will
sometimes just spout nonsense or false
things.
You can also ask the **model not to assume
or probe for sensitive information**,
and finally this last one is very
powerful. Ask the **model not to rush to a
solution, but instead take more time to
think** using what's called chain of
thought reasoning. Things like: explain
how you would solve this problem, or do
this step-by-step.

- A technique for reducing prompt hacking,
You can post-process or filter.
Use another model to clean the output,
or tell the model to remove all
offensive words from the output.

- Another technique is to repeat instructions or sandwich
instructions at the end.
This can help them pay attention to what
you really wanted to do.
You can enclose user input with random
strings or tags. That makes it easier for
the model to distinguish what the user
input is versus your instructions.

- If all else fails, it can help to select a different model or restrict prompt length.



- Prompt formatting can also be important. **Use delimiters to distinguish between
the instruction and the context**.
Also use them to distinguish between the
user input, if this is a user-facing
application, and the prompt that you add
around it.
Ask the model to return structured
output, and provide a correct example. Prompt formatting can help prevent exploit vulnerabilities:
    - **Prompt Injection**: trying to get the LLM to ignore the real instruction which
the application wants it to follow,
and instead override it with a user input
instruction to add malicious content.
    - **Prompt Leaking**: extracting sensitive information from a model.
    - **Jailbraking**: bipassing a moderation rule. Here it's asking how to do something illegal, and the model first says, I can't tell you, and then some rephrasing, and the model actually answers the user question.

## Summary

- LMS have a ton of wide-ranging use cases such as: summaarization, translation, sentiment analysis, few-show learning, zero-shot classification, etc.

- Hugging Face provides many NLP components, plus a hub with downloadable models, datasets, and examples.

- To select a model, think about your task, think about hard constraints, soft constraints, model size, and so on.

- Prompt engineering is crucial for generating useful responses from these very powerful models. There are a lot of techniques and tips out there.