<a href="https://colab.research.google.com/github/victorouse/victorouse.github.io/blob/main/blog/exploring-llms/2-open-source-llms/index.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---

### HuggingFace

Models created by the open-source community are generally written in Python using frameworks like [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/).

Each model comes with its own instructions for how to set it up and run it.

One popular library that abstracts away downloading, loading, and calling compatible models behind a common API is the HuggingFace [transformers](https://github.com/huggingface/transformers) library.

The `transformers` library can download and load any compatible model published on the [HuggingFace Hub](https://huggingface.co/docs/hub/index).

First, we need to create a HuggingFace account and log in using the HuggingFace CLI.

In [8]:
!pip install huggingface_hub



In [9]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) n
Token is valid (permission: read).
Your token has been saved to /root
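
The interactive prompt above is awkward in scripts or CI jobs. A minimal sketch of a non-interactive alternative, assuming the token has been exported in an environment variable: the helper names `read_hub_token` and `login_from_env` are hypothetical, and `HF_TOKEN` is simply the variable name chosen here. The only library call used is `huggingface_hub.login(token=...)`.

```python
import os


def read_hub_token(env_var="HF_TOKEN"):
    """Return the token stored in `env_var`, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_from_env(env_var="HF_TOKEN"):
    """Log in with a token taken from the environment.

    Returns True if a login was attempted, False if no token was found.
    """
    token = read_hub_token(env_var)
    if token is None:
        return False
    # Imported lazily so read_hub_token works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` in a notebook cell would then take the place of the `huggingface-cli login` prompt, provided the variable is set before the kernel starts.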

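Once logged in, we can access the `transformers` [models](https://huggingface.co/models?library=transformers&sort=downloads) on HuggingFace, including gated or private ones that our account is allowed to use (public models can be downloaded without logging in).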

We can now use the `pipeline` function from the `transformers` library to download and load a model.

In [10]:
!pip install transformers



In [11]:
from transformers import pipeline

classifier = pipeline('sentiment-analysis')
print(classifier('I love cats'))
print(classifier('I hate dogs'))

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9995536208152771}]
[{'label': 'NEGATIVE', 'score': 0.9954286813735962}]


Note that we passed the string argument `sentiment-analysis`, which is not a model but a [task](https://huggingface.co/docs/transformers/main/en/task_summary).

Tasks in this context refer to common Natural Language Processing (NLP) tasks, with `sentiment-analysis` being one of them.

Each task has its own string identifier, e.g. `sentiment-analysis`, and a default model that is used for the task.

This is why we observed the following output from the previous step:

```shell
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
```

We can see that the task is using the [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) model downloaded from HuggingFace.
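
The warning printed alongside that message recommends naming the model and revision explicitly rather than relying on the task default. A minimal sketch of how that might look, using the model name and revision hash copied from the log message: the helper names `pinned_pipeline_kwargs` and `build_pinned_classifier` are hypothetical, and it is an assumption that the short hash `af0f99b` is accepted as a `revision` value.

```python
# Values copied from the "No model was supplied" log message.
DEFAULT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"
DEFAULT_REVISION = "af0f99b"


def pinned_pipeline_kwargs(model=DEFAULT_MODEL, revision=DEFAULT_REVISION):
    """Build the keyword arguments that pin a sentiment pipeline to one model version."""
    return {"task": "sentiment-analysis", "model": model, "revision": revision}


def build_pinned_classifier(**overrides):
    """Create a pipeline with an explicit model and revision.

    Downloads the model on first use, so this needs network access.
    """
    # Imported lazily so the kwargs helper works without transformers installed.
    from transformers import pipeline

    return pipeline(**{**pinned_pipeline_kwargs(), **overrides})
```

With the model pinned this way, `build_pinned_classifier()('I love cats')` should behave like the default pipeline, without the warning about an unspecified model.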

We can specify an alternative model, such as [finiteautomata/bertweet-base-sentiment-analysis](https://huggingface.co/finiteautomata/bertweet-base-sentiment-analysis), by providing a `model` argument to the `pipeline` function. This model's tokenizer uses the `emoji` package to handle emojis in tweets, so we install that first:

In [12]:
!pip install emoji



In [13]:
# This classifier is trained on Tweets!
classifier = pipeline(model="finiteautomata/bertweet-base-sentiment-analysis")

print(classifier('I love Elon Musk'))
print(classifier('I hate Elon Musk'))
print(classifier('OpenAI is good'))

[{'label': 'POS', 'score': 0.9926522970199585}]
[{'label': 'NEG', 'score': 0.9819341897964478}]
[{'label': 'POS', 'score': 0.9817819595336914}]
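
Note that the two models use different label names (`POSITIVE`/`NEGATIVE` versus `POS`/`NEG`), so code that swaps one model for another may need to normalise the results. A minimal sketch, applied to outputs copied from the cells above: the `LABEL_ALIASES` table and the `normalise` helper are hypothetical, and the `NEU` label does not appear in the outputs here but is assumed from the bertweet model card.

```python
# Maps each model's label vocabulary onto one shared scheme.
# "NEU" is an assumption: it is not shown in the outputs above.
LABEL_ALIASES = {
    "POSITIVE": "positive",
    "POS": "positive",
    "NEGATIVE": "negative",
    "NEG": "negative",
    "NEU": "neutral",
}


def normalise(prediction):
    """Return a copy of one pipeline prediction with a model-independent label."""
    label = LABEL_ALIASES.get(prediction["label"].upper())
    if label is None:
        raise ValueError(f"Unknown label: {prediction['label']!r}")
    return {"label": label, "score": prediction["score"]}


# Sample predictions copied from the outputs of the two classifiers.
distilbert_output = [{"label": "POSITIVE", "score": 0.9995536208152771}]
bertweet_output = [{"label": "NEG", "score": 0.9819341897964478}]

for prediction in distilbert_output + bertweet_output:
    print(normalise(prediction))
```

Raising on unknown labels, rather than passing them through, makes it obvious when a newly chosen model uses a vocabulary the table does not cover.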
