# Using Llama 3 with Hugging Face
    
  M. Zia Rasa

* Note 1: To use Llama 3 with Hugging Face, it is recommended to create a Hugging Face account.
* Then go to your Hugging Face account, create an access token, and save it in a config.json file, or add it to your Google Colab secrets if you are using Google Colab.

### Installing dependencies

In [1]:
!pip install accelerate
!pip install bitsandbytes
!pip install torch

Collecting bitsandbytes
  Downloading bitsandbytes-0.44.1-py3-none-manylinux_2_24_x86_64.whl.metadata (3.5 kB)
Downloading bitsandbytes-0.44.1-py3-none-manylinux_2_24_x86_64.whl (122.4 MB)
Installing collected packages: bitsandbytes
Successfully installed bitsandbytes-0.44.1


### Importing packages

In [2]:
import json
import torch
from transformers import (AutoTokenizer,
                          AutoModelForCausalLM,
                          BitsAndBytesConfig,
                          pipeline)

### Load the Hugging Face token saved in a JSON file

In [4]:
# Read the token from config.json; the context manager closes the file afterwards
with open("config.json") as f:
    config_data = json.load(f)
HF_TOKEN = config_data["HF_Token"]
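
The cell above fails if config.json is missing. As a minimal sketch, the helper below first tries the JSON file and then falls back to an environment variable. The function name `load_hf_token` and the environment variable name `HF_TOKEN` are assumptions made for illustration; only the `"HF_Token"` key comes from this notebook.

```python
import json
import os
from pathlib import Path


def load_hf_token(config_path="config.json", key="HF_Token", env_var="HF_TOKEN"):
    """Return the token from a JSON config file, else from an environment variable."""
    path = Path(config_path)
    if path.is_file():
        with path.open() as f:
            token = json.load(f).get(key)
        if token:
            return token
    # Fallback: environment variable (the name is a hypothetical choice)
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(
            f"No token found in {config_path!r} or in the {env_var} environment variable"
        )
    return token
```

With this in place, `HF_TOKEN = load_hf_token()` could replace the two lines above. Raising an error early makes a missing token easier to diagnose than a failed download later on.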

### Choose your model

* Specify the name of the model you want to use. Browse Hugging Face and select your desired model:
    "https://huggingface.co/models"
  Before using a gated model, you need to accept its license agreement on the model page.

* Here, we are using the "Meta-Llama-3-8B" model:
    "https://huggingface.co/meta-llama/Meta-Llama-3-8B"

In [5]:
model_name = "meta-llama/Meta-Llama-3-8B"

**Quantisation Configuration**

In [6]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
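
To get a feel for why 4-bit loading is used here, the following back-of-the-envelope sketch estimates the memory needed for the weights alone. The parameter count of roughly 8 billion is an assumption read off the model name, and the estimate ignores quantisation constants, layers kept in higher precision, activations, and the KV cache, so real usage will be somewhat higher.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 8e9  # assumption: about 8 billion parameters, as the model name suggests

for label, bits in [("bfloat16", 16), ("4-bit NF4", 4)]:
    print(f"{label}: ~{weight_memory_gb(n_params, bits):.0f} GB")
```

The 16-bit figure is in line with the combined size of the four safetensors shards downloaded later in this notebook (about 16 GB), while the 4-bit estimate is roughly a quarter of that.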

**Loading the Tokenizer and the LLM**

In [7]:
tokenizer = AutoTokenizer.from_pretrained(model_name,
                                          token=HF_TOKEN)

tokenizer.pad_token = tokenizer.eos_token

tokenizer_config.json:   0%|          | 0.00/50.6k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.09M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/73.0 [00:00<?, ?B/s]

In [8]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=bnb_config,
    token=HF_TOKEN
)

config.json:   0%|          | 0.00/654 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/23.9k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/4 [00:00<?, ?it/s]

model-00001-of-00004.safetensors:   0%|          | 0.00/4.98G [00:00<?, ?B/s]

model-00002-of-00004.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00003-of-00004.safetensors:   0%|          | 0.00/4.92G [00:00<?, ?B/s]

model-00004-of-00004.safetensors:   0%|          | 0.00/1.17G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/177 [00:00<?, ?B/s]

In [9]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [10]:
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128
)

In [11]:
def get_response(prompt):
    """Run the pipeline on `prompt` and return the generated text (prompt included)."""
    sequences = text_generator(prompt)
    gen_text = sequences[0]["generated_text"]
    return gen_text

In [13]:
prompt = "What is Llama3 model language?"
llama3_response = get_response(prompt)
llama3_response

'What is Llama3 model language? Llama3 is a high-level language designed to be easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a'

In [14]:
print(llama3_response[len(prompt):])

 Llama3 is a high-level language designed to be easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a simple and powerful language that can be used to create games, animations, and simulations. Llama3 is a high-level language that is easy to learn and use. It is a
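
Slicing with `len(prompt)` works because the pipeline returns the prompt followed by the continuation. A slightly more defensive sketch is shown below; the helper name `strip_prompt` is hypothetical. It only removes the prompt when the text really starts with it, and it also drops the leading whitespace.

```python
def strip_prompt(generated_text, prompt):
    """Return only the continuation if `generated_text` begins with `prompt`."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # Leave the text untouched if the prompt is not a prefix
    return generated_text


print(strip_prompt("What is Llama3 model language? Llama3 is a", "What is Llama3 model language?"))
```

As an alternative, the text-generation pipeline accepts `return_full_text=False`, which should return the continuation without the prompt.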


In [15]:
prompt = "The typical color of the sky is:"

llama3_response = get_response(prompt)
llama3_response

'The typical color of the sky is: blue. The color of the sky is one of the most important colors of the world. The sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of the color blue. The color of the sky is a symbol of'

In [16]:
prompt = "Describe quantum physics in one short sentence of no more than 12 words"

llama3_response = get_response(prompt)
llama3_response

'Describe quantum physics in one short sentence of no more than 12 words.\nI was challenged by a friend of mine to describe quantum physics in one short sentence of no more than 12 words. I was unable to come up with a good answer, so I asked my friends on the Quantum Physics Facebook group. Here are some of the answers they came up with.\nThe best answer came from Paul H. Smith who said:\n“Quantum physics is the study of how the universe works on the smallest scales, where classical physics breaks down.”\nI think this is a good answer, and I would have given it a 5 out of 5, but I had to deduct 1 point for the fact that'
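
The outputs above loop or drift away from the question. This is common with greedy decoding on a base (non-instruct) model, which simply continues the text. One possible mitigation, sketched below, is to pass sampling and repetition settings when calling the pipeline. The specific values and the helper name `get_response_with_settings` are illustrative assumptions, not tuned recommendations.

```python
# Illustrative generation settings; the values are assumptions, not tuned
GENERATION_KWARGS = {
    "do_sample": True,          # sample instead of greedy decoding
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.15, # discourage repeating earlier tokens
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-gram
}


def get_response_with_settings(generator, prompt, **overrides):
    """Call a text-generation pipeline with the settings above, plus any overrides."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    sequences = generator(prompt, **kwargs)
    return sequences[0]["generated_text"]
```

It would be called as `get_response_with_settings(text_generator, prompt)`. For question answering, an instruction-tuned checkpoint is likely to follow prompts more closely than a base model.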

## Thank you