<a href="https://colab.research.google.com/github/MLField1/Mistral-7B-Fine-tune/blob/main/Test_Voice_and_Tone_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Select a T4 GPU in the resources menu before running anything else; the LLM needs a GPU to run.
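A quick way to confirm that a GPU is attached before installing anything is to ask PyTorch, which is normally preinstalled on Colab. This is a minimal sketch; the helper name `describe_gpu` is hypothetical and not part of the original notebook.

```python
def describe_gpu():
    """Return a one-line description of the attached CUDA GPU, if any."""
    try:
        import torch  # assumed to be preinstalled, as it usually is on Colab
    except ImportError:
        return "PyTorch is not installed in this environment."
    if not torch.cuda.is_available():
        return "No CUDA GPU detected; select a GPU runtime before continuing."
    name = torch.cuda.get_device_name(0)
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return f"GPU detected: {name} ({total_gib:.1f} GiB)"

print(describe_gpu())
```

If the message reports no GPU, switch the runtime to a T4 and rerun the cell.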

NOTE: You need a Hugging Face account to create the access token required to download the models. Store the token on the secrets page under the name HUGGINGFACE_TOKEN.

This installs the packages required to run the model.

In [1]:
# You only need to run this once per Colab runtime (rerun it after the runtime is reset)
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 119.8/119.8 MB 4.8 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 67.4 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for transformers (pyproject.toml) ... done
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for peft (pyproject.toml) ... done
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for accelerate (pyproject.toml) ... done


In [4]:
from google.colab import userdata
secret_hf = userdata.get('HUGGINGFACE_TOKEN')
!huggingface-cli login --token $secret_hf

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
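The login cell above only works inside Colab, because it reads the token from Colab secrets. The sketch below shows one way to make the lookup portable: try Colab secrets first, then fall back to an environment variable. The helper name `get_hf_token` and the reuse of `HUGGINGFACE_TOKEN` as the environment variable name are assumptions for illustration, not part of the original notebook.

```python
import os

def get_hf_token(secret_name="HUGGINGFACE_TOKEN"):
    """Return the Hugging Face token from Colab secrets or the environment."""
    try:
        # Only importable inside Colab; raises ImportError elsewhere.
        from google.colab import userdata
        token = userdata.get(secret_name)
        if token:
            return token
    except Exception:
        # Not running in Colab, or the secret is missing or not shared
        # with this notebook; fall through to the environment variable.
        pass
    token = os.environ.get(secret_name)
    if not token:
        raise RuntimeError(f"No token found; set the {secret_name} secret or environment variable.")
    return token

# Possible usage (requires the huggingface_hub package):
# from huggingface_hub import login
# login(token=get_hf_token())
```

Logging in through `huggingface_hub.login` also avoids passing the token on a shell command line.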


This loads the base model.

In [5]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-Instruct-v0.3"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,  # Mistral-7B-Instruct-v0.3, downloaded from the Hugging Face Hub
    quantization_config=bnb_config,  # 4-bit NF4 quantization defined above
    device_map="auto",  # let accelerate place the weights on the available GPU
    trust_remote_code=True,
)

eval_tokenizer = AutoTokenizer.from_pretrained(base_model_id, add_bos_token=True, trust_remote_code=True)

config.json:   0%|          | 0.00/601 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/23.9k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/3 [00:00<?, ?it/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/4.95G [00:00<?, ?B/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/4.55G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/138k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/587k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.96M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/414 [00:00<?, ?B/s]
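The download sizes above also explain why the model is loaded in 4-bit. The three weight shards total about 14.5 GB, which would leave very little headroom on a 16 GB T4 if loaded as-is. The sketch below is back-of-the-envelope arithmetic only: it assumes the shards store 16-bit weights and ignores quantization constants, layers that bitsandbytes leaves unquantized, activations, and the KV cache, so treat the 4-bit figure (about 3.6 GB) as a lower bound. The function names are hypothetical.

```python
# Shard sizes in GB, taken from the download progress output above.
SHARD_SIZES_GB = [4.95, 5.00, 4.55]
BYTES_PER_PARAM_16BIT = 2  # assumption: the checkpoint stores 16-bit weights

def estimate_params_billions(shard_sizes_gb, bytes_per_param=BYTES_PER_PARAM_16BIT):
    """Estimate the parameter count (in billions) from the checkpoint size."""
    return sum(shard_sizes_gb) / bytes_per_param

def weight_memory_gb(params_billions, bits_per_param):
    """Approximate memory for the weights alone at a given precision."""
    return params_billions * bits_per_param / 8

params_b = estimate_params_billions(SHARD_SIZES_GB)
print(f"Estimated parameters: ~{params_b:.2f}B")
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(params_b, bits):.1f} GB")
```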

This downloads the fine-tuned adapter parameters and loads them on top of the base model.

In [6]:
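from peft import PeftModel

# Optional: clone the adapter repository to inspect its files locally.
!git clone https://huggingface.co/masonlf/7BMistral_Tone_Stylev2InstructFinal

# from_pretrained is given the Hub repo ID, so it downloads the adapter weights from the Hub itself.
ft_model = PeftModel.from_pretrained(base_model, "masonlf/7BMistral_Tone_Stylev2InstructFinal")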

Cloning into '7BMistral_Tone_Stylev2InstructFinal'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 12 (delta 2), reused 0 (delta 0), pack-reused 3 (from 1)
Unpacking objects: 100% (12/12), 5.33 KiB | 1.77 MiB/s, done.


adapter_config.json:   0%|          | 0.00/752 [00:00<?, ?B/s]

adapter_model.safetensors:   0%|          | 0.00/877M [00:00<?, ?B/s]
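To see what the adapter actually contains, the small `adapter_config.json` in the cloned directory can be read directly. This is a sketch under assumptions: the helper name `summarize_adapter_config` is hypothetical, and the keys it looks up are the ones PEFT typically writes for LoRA adapters. Whether this particular adapter uses LoRA is not stated in the notebook, so missing keys simply come back as `None`.

```python
import json
from pathlib import Path

def summarize_adapter_config(adapter_dir):
    """Return a few common fields from a PEFT adapter_config.json."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    with config_path.open() as f:
        config = json.load(f)
    # Keys commonly present for LoRA adapters; absent keys map to None.
    keys = ("peft_type", "base_model_name_or_path", "r", "lora_alpha", "target_modules")
    return {key: config.get(key) for key in keys}

# Possible usage in the notebook, after the git clone above:
# print(summarize_adapter_config("7BMistral_Tone_Stylev2InstructFinal"))
```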

This prompts the model. Set eval_prompt to whatever you want to ask the fine-tuned model, then run the cell. The output appears below it.
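The prompt cell in this notebook decodes the full sequence returned by `generate`, which for a causal language model is the prompt tokens followed by the newly generated tokens. That is why the output repeats the prompt before the answer. A minimal sketch of keeping only the generated part is shown here; the helper name `completion_only` is hypothetical, and plain lists stand in for token ids so the example runs without a model.

```python
def completion_only(output_ids, prompt_length):
    """Return only the token ids generated after the prompt."""
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]

# Toy demonstration with plain lists standing in for token ids.
prompt_ids = [1, 3, 7]
generated = prompt_ids + [42, 43, 2]
print(completion_only(generated, len(prompt_ids)))  # [42, 43, 2]

# In the notebook this might look like (assumes the names used in the prompt cell):
# output = ft_model.generate(**model_input, max_new_tokens=4096, repetition_penalty=1.15)
# prompt_len = model_input["input_ids"].shape[1]
# print(eval_tokenizer.decode(completion_only(output[0], prompt_len), skip_special_tokens=True))
```

The same slicing works on a 1-D tensor of token ids, since tensors support `len` and slice indexing.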

In [7]:
eval_prompt = "Draft a short 200 word technical paragraph explaining the concept of edge computing, its advantages over cloud computing, and its importance in the Internet of Things (IoT) ecosystem."
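model_input = eval_tokenizer(eval_prompt, return_tensors="pt").to("cuda")

ft_model.eval()  # inference mode: disables dropout
with torch.no_grad():  # gradients are not needed for generation
    output_ids = ft_model.generate(**model_input, max_new_tokens=4096, repetition_penalty=1.15)
    # output_ids[0] holds the prompt tokens followed by the generated tokens
    print(eval_tokenizer.decode(output_ids[0], skip_special_tokens=True))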

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Draft a short 200 word technical paragraph explaining the concept of edge computing, its advantages over cloud computing, and its importance in the Internet of Things (IoT) ecosystem.

Edge computing is a decentralized approach to data processing that brings computation and data storage closer to where data is created or generated, rather than relying on centralized servers or cloud services for processing. This paradigm shift from cloud-based systems to edge-based systems aims to reduce latency, bandwidth usage, and improve system reliability by minimizing data transmission between devices and remote servers. In an IoT context, edge computing can process sensor data locally before sending it to the cloud, reducing the volume of data transmitted and enabling real-time analysis. Additionally, edge computing enhances privacy and security as sensitive data does not need to be sent across networks, mitigating potential cyber threats. The significance of edge computing lies in its ability t