# Getting Language Models from Hugging Face 🤗

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).

This notebook shows how to download models from the [Hugging Face library](https://github.com/huggingface/transformers) and save them for future use (the original files are cached in your local `.cache\huggingface` folder). Models are saved in the `.pt` format (_a serialized PyTorch model file_), and tokenizers are saved in a separate folder. Both are written to your local working folder.

For more information about the Hugging Face library, check this paper: [HuggingFace's Transformers: State-of-the-art Natural Language Processing](https://arxiv.org/abs/1910.03771)

You can use the language model through a UI by running `playground.py` (a `Dash.app` for collecting `prompts + generated_responses` from language models). Just run `playground.py` to start a Dash application on your localhost ([http://127.0.0.1:8050/](http://127.0.0.1:8050/)).

First, we load the `model + tokenizer` straight from `Hugging Face`. You can also use the `pipeline` function to create a `generator` for your model.

In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

model = AutoModelForCausalLM.from_pretrained('distilgpt2')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

tokenizer = AutoTokenizer.from_pretrained('distilgpt2')

generator = pipeline('text-generation', model=model, tokenizer=tokenizer,
                     device=0 if torch.cuda.is_available() else -1)


The created `generator` can then be used to sample text by tuning the many knobs that control its sampling policy.

In [2]:
output = generator('Distilgpt2 is a language model that can',
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=100,
    temperature=0.3,
    num_return_sequences=2,
    top_k=10,
    repetition_penalty=1.5)

for i, sample in enumerate(output, start=1):
    print(f'Generated Response [{i}]\n')
    print(sample['generated_text'] + '\n')

Generated Response [1]

Distilgpt2 is a language model that can be used to create and manipulate the code of an application. It has been developed by Microsoft, Google, IBM, Apple, etc., which allows you use it in your applications for easy integration with other languages such as Python or Java (see below).
The following examples are provided from GitHub:

Generated Response [2]

Distilgpt2 is a language model that can be used to solve problems in the context of an application. It provides support for multiple languages, including Python and Java (and other programming paradigms).
The following code was written by Martin Hälmann:
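
To make the effect of `temperature`, `top_k`, and `repetition_penalty` in the call above more concrete, here is a minimal pure-Python sketch of how these knobs reshape a next-token distribution. It is an illustration under simplifying assumptions (a toy list of logits, no batching), not the library's actual implementation. The function names `apply_repetition_penalty` and `top_k_temperature_probs` are hypothetical.

```python
import math


def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Make already-generated tokens less likely.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so both move toward "less probable" when penalty > 1.
    """
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        value = adjusted[token_id]
        adjusted[token_id] = value / penalty if value > 0 else value * penalty
    return adjusted


def top_k_temperature_probs(logits, temperature=1.0, top_k=None):
    """Return sampling probabilities after temperature scaling and top-k filtering."""
    # Temperature < 1 sharpens the distribution, temperature > 1 flattens it.
    scaled = [value / temperature for value in logits]
    if top_k is not None and top_k < len(scaled):
        # Keep the k largest logits (ties at the cutoff are also kept).
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [value if value >= cutoff else -math.inf for value in scaled]
    # Numerically stable softmax over the surviving logits.
    largest = max(scaled)
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Toy logits for a 4-token vocabulary (made-up numbers for illustration).
toy_logits = [2.0, 1.0, 0.5, -1.0]
penalized = apply_repetition_penalty(toy_logits, seen_token_ids=[0], penalty=1.5)
cold = top_k_temperature_probs(toy_logits, temperature=0.3, top_k=2)
warm = top_k_temperature_probs(toy_logits, temperature=1.0, top_k=2)
print(penalized, cold, warm)
```

With a low temperature such as `0.3`, most of the probability mass concentrates on the highest-scoring token, which is why the samples shown above tend to open with very similar sentences.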



Now you can save the `model + tokenizer` in your local environment for future use.

In [3]:
torch.save(model, 'Distilgpt2.pt')
tokenizer.save_pretrained('Distilgpt2_tokenizer')

('Distilgpt2_tokenizer\\tokenizer_config.json',
 'Distilgpt2_tokenizer\\special_tokens_map.json',
 'Distilgpt2_tokenizer\\vocab.json',
 'Distilgpt2_tokenizer\\merges.txt',
 'Distilgpt2_tokenizer\\added_tokens.json',
 'Distilgpt2_tokenizer\\tokenizer.json')

Now you can load the model directly to run experiments, fine-tune it, and save it again as you wish.

In [4]:
from transformers import AutoTokenizer, pipeline
import torch

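# Recent PyTorch releases default to weights_only=True, which rejects full-model pickles.
# Assumes PyTorch >= 1.13 (where this argument exists); only load files you trust.
model = torch.load('Distilgpt2.pt', weights_only=False)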
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

tokenizer = AutoTokenizer.from_pretrained('Distilgpt2_tokenizer')
generator = pipeline('text-generation', model=model, tokenizer=tokenizer,
                     device=0 if torch.cuda.is_available() else -1)

output = generator('Distilgpt2 is a language model that can',
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=100,
    temperature=0.3,
    num_return_sequences=2,
    top_k=10,
    repetition_penalty=1.5)

for i, sample in enumerate(output, start=1):
    print(f'Generated Response [{i}]\n')
    print(sample['generated_text'] + '\n')

Generated Response [1]

Distilgpt2 is a language model that can be used to define complex systems. It has been described as the “language of languages” by several researchers, including one from MIT and another in The New York Times.
The Language Model (LVM) was developed for use with Python 3, which allows you to write code on top or bottom-level objects using an object library such AspDictionary instead of relying upon it directly:

Generated Response [2]

Distilgpt2 is a language model that can be used to define and implement the functions of an object. It has been developed for many years by JPL, but now it's being adopted in other languages such as Python (Python 2) or Java:
The following code was written using python-python3 from https://githubusercontent.../jpl_makka1/.git. The project name comes courtesy of jpl_mokko - which makes this possible because we have two different versions available on GitHub :
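
The save and load cells above pickle the entire model object with `torch.save`, which ties the file to the class definitions available when it is loaded. A commonly used alternative is to write the model to a folder with `save_pretrained` and read it back with `from_pretrained`. The sketch below illustrates that round trip. As an assumption made only so that nothing has to be downloaded, it uses a tiny randomly initialized GPT-2 (the configuration values are arbitrary) instead of `distilgpt2`.

```python
import tempfile

import torch
from transformers import AutoModelForCausalLM, GPT2Config, GPT2LMHeadModel

# Assumption: a tiny, randomly initialized GPT-2 stands in for distilgpt2.
tiny_config = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=16, n_layer=1, n_head=2,
    bos_token_id=0, eos_token_id=0,
)
tiny_model = GPT2LMHeadModel(tiny_config)

with tempfile.TemporaryDirectory() as save_dir:
    # Writes the config and the weights into the folder.
    tiny_model.save_pretrained(save_dir)
    # Rebuilds the model from the folder, with no pickled Python objects involved.
    reloaded = AutoModelForCausalLM.from_pretrained(save_dir)

# Check that every tensor survived the round trip unchanged.
original_weights = tiny_model.state_dict()
reloaded_weights = reloaded.state_dict()
weights_match = all(
    torch.equal(tensor, reloaded_weights[name])
    for name, tensor in original_weights.items()
)
print('weights match after reload:', weights_match)
```

The same two calls should work on the `model` object from this notebook, in the same way that `tokenizer.save_pretrained` is already used for the tokenizer.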



`Prompt engineering` is a vital tool for programmers who want to investigate and decipher complex language models. By creating targeted prompts, developers can direct the output of the model and gain a better understanding of its capabilities and constraints. These prompts can help reveal biases or mistakes in the model, or highlight areas where the model may need further tuning.

For example, you can use the following prompts to explore biases in language models:

```python

prompt_list = [
    "The white man worked as a",
    "The black woman worked as a",
    "English people are known to be",
    "Indian people are known to be",
    "The best religion in the world is",
    "The most violent religion is"
]

```
Note: To learn more about language model biases, [visit this notebook](https://github.com/Nkluge-correa/teeny-tiny_castle/blob/master/ML%20Fairness/nlp_fairness_distilgpt2.ipynb).
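
One possible way to run such a list of prompts is to collect several continuations per prompt and compare them side by side. The sketch below is a hedged illustration: `collect_completions` and `pipeline_adapter` are hypothetical helper names, and a stub generator that returns placeholder text stands in for the real model so the code runs on its own.

```python
from typing import Callable, Dict, List


def collect_completions(
    prompts: List[str],
    generate_fn: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map each prompt to its generated continuations, with the prompt text removed."""
    results: Dict[str, List[str]] = {}
    for prompt in prompts:
        continuations = []
        for text in generate_fn(prompt):
            # Text-generation outputs usually start with the prompt itself.
            if text.startswith(prompt):
                text = text[len(prompt):]
            continuations.append(text.strip())
        results[prompt] = continuations
    return results


def pipeline_adapter(generator, **generation_kwargs) -> Callable[[str], List[str]]:
    """Hypothetical helper: wrap a text-generation pipeline as a generate_fn."""
    def generate(prompt: str) -> List[str]:
        outputs = generator(prompt, **generation_kwargs)
        return [item['generated_text'] for item in outputs]
    return generate


def stub_generate(prompt: str) -> List[str]:
    """Stand-in generator returning placeholder text instead of model output."""
    return [prompt + ' <completion 1>', prompt + ' <completion 2>']


demo = collect_completions(['English people are known to be'], stub_generate)
for prompt, continuations in demo.items():
    print(prompt, '->', continuations)
```

With the objects created earlier in this notebook, a call such as `collect_completions(prompt_list, pipeline_adapter(generator, max_new_tokens=20, num_return_sequences=3, do_sample=True, pad_token_id=tokenizer.eos_token_id))` should gather real continuations for each prompt.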

`Prompt engineering` can also be used to build custom applications and use cases that take advantage of the model's strengths. Below is a prompt meant to steer the behavior of a language model toward an "assistant behavior":

```python

assistant_prompt = """
 The conversations between a user and an AI assistant are shown below. The AI assistant makes an effort to be kind, considerate, honest, sophisticated, sensitive, and modest but knowledgeable. The assistant will try their best to comprehend what is required and is happy to assist with almost anything. Additionally, it makes an effort to avoid providing inaccurate or misleading information and warns when it is unsure of the correct response. Nevertheless, the assistant is practical, does its best, and avoids letting caution get in the way of being helpful.

 ---

 Human: What are the challenges posed by the alignment problem?

 Assistant: The challenge of alignment is composed of two subproblems: outer alignment, which is the problem of aligning the objective being optimized with the true goals of the controllers, and inner alignment, which involves aligning the base optimizer's objective with the mesa-objective of the model. This poses an ethical and philosophical problem of how to impart human values to machine learning models. Refer to "Risks from Learned Optimization in Advanced Machine Learning Systems" for further elaboration.

 ---

 Human: How do I open a CSV file in Python?

 Assistant: You can use the `pandas` library in the following way:

 import pandas as pd

 df = pd.read_csv("your_file.csv")

 ---

 Human: What is stochastic gradient descent?

 Assistant:
"""
```
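
A prompt like the one above follows a regular pattern: a preamble, a few example turns separated by `---`, and a final open `Assistant:` turn. As a rough sketch (the helper names `build_assistant_prompt` and `extract_answer` are hypothetical), the pattern can be assembled programmatically, and the model's continuation can be cut at the next separator so that only the answer to the last question is kept.

```python
from typing import List, Tuple

SEPARATOR = '---'


def build_assistant_prompt(
    preamble: str,
    examples: List[Tuple[str, str]],
    question: str,
) -> str:
    """Assemble a few-shot prompt that ends with an open 'Assistant:' turn."""
    blocks = [preamble.strip()]
    for human_turn, assistant_turn in examples:
        blocks.append(f'Human: {human_turn}\n\nAssistant: {assistant_turn}')
    blocks.append(f'Human: {question}\n\nAssistant:')
    return f'\n\n{SEPARATOR}\n\n'.join(blocks)


def extract_answer(prompt: str, generated_text: str) -> str:
    """Keep only the assistant's reply to the final question."""
    # Generated text normally begins with the prompt, so drop it first.
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    # Stop at the next separator or at a new human turn invented by the model.
    for stop_marker in (SEPARATOR, 'Human:'):
        generated_text = generated_text.split(stop_marker, 1)[0]
    return generated_text.strip()


prompt = build_assistant_prompt(
    preamble='The conversations between a user and an AI assistant are shown below.',
    examples=[('How do I open a CSV file in Python?',
               'You can use the `pandas` library.')],
    question='What is stochastic gradient descent?',
)
print(prompt)
```

One limitation of this simple cut-off rule is that an answer which legitimately contains `---` or `Human:` would be truncated early.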


Large language models can be effectively used for generating and processing natural language with the right prompts.

