# Testing Mistral 7B Instruct

This notebook assesses how one may use the open-source 7B instruct LLM created by [Mistral AI](https://mistral.ai/).

More details on the motivation for this notebook can be found [here](https://github.com/Overtrained/contextual-qa-chat-app/issues/15).

## Establish connection to `git` repo

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Change to the repository directory on Google Drive.

In [5]:
%cd "/content/drive/MyDrive/Colab Notebooks/contextual-qa-chat-app"

/content/drive/MyDrive/Colab Notebooks/contextual-qa-chat-app


In [6]:
!git switch 15-basic-usage-of-the-mistral-7b-llm

Branch '15-basic-usage-of-the-mistral-7b-llm' set up to track remote branch '15-basic-usage-of-the-mistral-7b-llm' from 'origin'.
Switched to a new branch '15-basic-usage-of-the-mistral-7b-llm'


## Establish environment for running `mistral-7b-instruct`

Below are the installation calls needed to load the model into the workspace.

In [7]:
%pip install git+https://github.com/huggingface/transformers

Collecting git+https://github.com/huggingface/transformers
  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-25ecrsh7
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-25ecrsh7
  Resolved https://github.com/huggingface/transformers to commit e1cec43415e72c9853288d4e9325b734d36dd617
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


Install the `watermark` package to record the compute environment.

In [8]:
%pip install watermark[gpu]



In [9]:
import watermark

%load_ext watermark

%watermark --hostname --machine --gitbranch --gpu

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 5.15.120+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 12
Architecture: 64bit

Hostname: 8a31ad9fd70a

Git branch: 15-basic-usage-of-the-mistral-7b-llm

GPU Info: 
  GPU 0: NVIDIA A100-SXM4-40GB



## Simple Test of Mistral-7B Model

In [10]:
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

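# Render the conversation with the model's chat template and tokenize it
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Move both the inputs and the model weights to the GPU
model_inputs = encodeds.to(device)
model.to(device)

# Sample up to 1000 new tokens, then decode the full sequence (prompt + reply)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])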


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using sep_token, but it is not set yet.
Using pad_token, but it is not set yet.
Using cls_token, but it is not set yet.
Using mask_token, but it is not set yet.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


<s> [INST] What is your favourite condiment? [/INST]Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>  [INST] Do you have mayonnaise recipes? [/INST] Of course! Here's a simple recipe for homemade mayonnaise that you can easily make in a blender or food processor.

Ingredients:

* 1 egg yolk
* 1 Tbsp of white wine vinegar or lemon juice
* 1/2 tsp of Dijon mustard
* 1 clove of garlic, minced (optional)
* 1/4 tsp of salt and black pepper
* 1/2 cup of olive oil

Instructions:

1. In a blender or food processor, add the egg yolk, white wine vinegar or lemon juice, minced garlic (if using), salt, and black pepper.
2. Blend ingredients until well emulsified.
3. Slowly pour in the olive oil and blend on high speed.
4. Continue blending until the mixture thickens and reaches the desired consistency.
5. Adjust salt and black pepper to taste.
6. Serve chilled or at room temperature.

And t
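
The decoded output above shows how the conversation is laid out before it reaches the model: user turns are wrapped in `[INST] ... [/INST]` and assistant turns end with `</s>`. The sketch below rebuilds that layout by hand to make the structure explicit. The function name `format_mistral_chat` is hypothetical, and the exact whitespace is an assumption inferred from the output; the tokenizer's own `apply_chat_template` remains the authoritative source.

```python
def format_mistral_chat(messages, bos="<s>", eos="</s>"):
    """Approximate the [INST] prompt layout seen in the decoded output.

    Hypothetical helper for illustration only; whitespace is an assumption.
    """
    parts = [bos]
    for i, message in enumerate(messages):
        # Roles are expected to alternate user, assistant, user, ...
        expected_role = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate, starting with 'user'")
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence token
            parts.append(f"{message['content']}{eos} ")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(format_mistral_chat(messages))
```

Comparing this string with `tokenizer.apply_chat_template(messages, tokenize=False)` is a quick way to check whether the assumed layout matches the installed tokenizer.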

## Conclusions

Without quantizing the model, up to 29 GB of GPU RAM was required to load the model and tokenizer and complete a short chat prompt. In a new notebook, the same prompts will be attempted using a quantized version of the model.

_Note_: The compute environment from the last successful run of this notebook is shown in the `watermark` output above.
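
The 29 GB figure is close to what a back-of-the-envelope estimate of the weights alone would give. The sketch below multiplies a parameter count by the bytes per parameter for a few precisions. It assumes roughly 7.24 billion parameters and that the weights were loaded in 32-bit floats (no `torch_dtype` was passed above); both are assumptions, not measurements from this notebook, and the estimate ignores activations and the KV cache.

```python
# Bytes needed to store one parameter at each precision
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(n_params, dtype):
    """Estimate the memory taken by model weights alone, in GB (1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: approximate parameter count for a "7B" model
N_PARAMS = 7.24e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(N_PARAMS, dtype):5.1f} GB")
```

Under these assumptions the 32-bit estimate lands near the observed usage, and halving or quartering the bytes per parameter suggests why a quantized model should fit on much smaller GPUs.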