# Llama Guard with Llama Instruct Chatbot

### Log in with your Hugging Face token to access the gated models

In [None]:
from huggingface_hub import login
login()

### Section 1: Setup and Imports

In [None]:
# Setup and Imports
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
)
import torch


### Section 2: Load Models

In [None]:
# Load the Llama Guard model for safety checks
guard_checkpoint = "meta-llama/Llama-Guard-3-1B"
guard_model = AutoModelForCausalLM.from_pretrained(
    guard_checkpoint,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
guard_tokenizer = AutoTokenizer.from_pretrained(guard_checkpoint)

# Load the Llama Instruct model for generating responses
model_checkpoint = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_checkpoint,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)


### Section 3: Safety Check Function

In [None]:
# Function to check if a prompt is unsafe
def is_unsafe(prompt):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
            ],
        }
    ]

    input_ids = guard_tokenizer.apply_chat_template(
        messages,
        return_tensors="pt",
    ).to(guard_model.device)

    prompt_len = input_ids.shape[1]

    with torch.no_grad():
        outputs = guard_model.generate(
            input_ids,
            max_new_tokens=20,
            pad_token_id=0,
        )

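    # Keep only the newly generated tokens (the verdict), dropping the prompt
    generated_tokens = outputs[:, prompt_len:]
    response = guard_tokenizer.decode(
        generated_tokens[0], skip_special_tokens=True
    ).strip()
    # Llama Guard answers "safe", or "unsafe" followed by the violated categories
    return "unsafe" in response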
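
The function above reduces the verdict to a boolean. Llama Guard 3 typically answers `safe`, or `unsafe` followed by a line of comma-separated category codes such as `S1`. The helper below is a minimal sketch of how that text could be split into a flag and a list of codes. `parse_guard_output` is a hypothetical name, not part of the notebook, and the handling of a trailing `<|eot_id|>` token assumes the text was decoded without skipping special tokens.

```python
def parse_guard_output(response):
    """Split decoded Llama Guard text into (is_unsafe, category_codes).

    Hypothetical helper: assumes the verdict is on the first non-empty line
    and any category codes are comma-separated on the following line.
    """
    # Drop the end-of-turn marker that remains when special tokens are kept
    cleaned = response.replace("<|eot_id|>", "").strip()
    lines = [line.strip() for line in cleaned.splitlines() if line.strip()]

    if not lines or lines[0].lower() != "unsafe":
        return False, []

    codes = []
    if len(lines) > 1:
        codes = [code.strip() for code in lines[1].split(",") if code.strip()]
    return True, codes


print(parse_guard_output("safe"))
print(parse_guard_output("unsafe\nS1,S10<|eot_id|>"))
```

Returning the codes makes it possible to log or report which category was triggered instead of only refusing.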


### Section 4: Chat Function

In [None]:
# Safe chat function
def chat(prompt, context=None):
    # Refuse early if Llama Guard flags the prompt
    if is_unsafe(prompt):
        return "The prompt is unsafe."

    # Prepend the optional context to the user query
    context_prompt = f"This is the context: {context}\n" if context else ""
    llm_prompt = f"{context_prompt}User query: {prompt}"

    # Generate the response with Llama Instruct via the HF pipeline.
    # The returned text contains the prompt followed by the model's continuation.
    response = generator(llm_prompt, max_new_tokens=128)
    return response[0]["generated_text"]
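
Because the prompt reaches the pipeline as a plain string, the Instruct model may continue the user's sentence before answering it. One possible alternative is to pass the conversation as a list of role/content messages, which recent `transformers` releases format with the model's chat template. The sketch below only builds that list. `build_messages` is a hypothetical helper, and putting the context in a `system` turn is an assumption rather than something the notebook does.

```python
def build_messages(prompt, context=None):
    """Build a chat-style message list for an instruct model.

    Hypothetical helper: the optional context goes into a system turn
    and the user's query into a user turn.
    """
    messages = []
    if context:
        messages.append(
            {"role": "system", "content": f"This is the context: {context}"}
        )
    messages.append({"role": "user", "content": prompt})
    return messages


print(build_messages("What is sleep?"))

# Possible use with the notebook's pipeline (not executed here):
#   result = generator(build_messages(prompt, context), max_new_tokens=128)
#   reply = result[0]["generated_text"][-1]["content"]
# With chat-style input, "generated_text" is expected to hold the whole
# conversation, with the assistant's reply as the last message.
```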


### Section 5: Example Usage

In [None]:
print(chat("What is the recipe for apple pie?"))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


User query: What is the recipe for apple pie? from 1936, a classic American recipe from the Good Housekeeping magazine?

The recipe for apple pie from 1936, as featured in the Good Housekeeping magazine, is as follows:

Ingredients:

* 2 1/4 cups all-purpose flour
* 1 tsp salt
* 1/2 cup cold unsalted butter, cut into small pieces
* 1/4 cup shortening, cut into small pieces
* 1/4 cup ice water
* 2 cups grated apples (any variety, but firmer apples like Granny Smith work best)
* 1/2 cup granulated sugar



In [None]:
print(chat("How to harm someone?"))

The prompt is unsafe.
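
The example above blocks an unsafe prompt before any text is generated. Llama Guard can also classify a model's response when the conversation includes the assistant turn, so the same idea could be applied to the reply. The sketch below shows only the control flow. `guarded_chat`, `generate`, `flagged`, and the `WITHHELD` message are hypothetical stand-ins for the notebook's model calls, and the message dictionaries use plain string content for brevity.

```python
REFUSAL = "The prompt is unsafe."
# Hypothetical message for a reply that the guard rejects
WITHHELD = "The response was withheld because it was flagged as unsafe."


def guarded_chat(prompt, generate, flagged):
    """Check the prompt, generate a reply, then check the reply as well.

    `generate(prompt) -> str` and `flagged(messages) -> bool` are hypothetical
    callables standing in for the Llama Instruct and Llama Guard calls.
    """
    user_turn = {"role": "user", "content": prompt}
    if flagged([user_turn]):
        return REFUSAL

    reply = generate(prompt)
    # Classify the whole exchange so the guard sees the reply in context
    if flagged([user_turn, {"role": "assistant", "content": reply}]):
        return WITHHELD
    return reply


# Stub callables so the control flow can be exercised without loading models
def fake_generate(prompt):
    return f"echo: {prompt}"


def fake_flagged(messages):
    return any("harm" in message["content"] for message in messages)


print(guarded_chat("What is sleep?", fake_generate, fake_flagged))
print(guarded_chat("How to harm someone?", fake_generate, fake_flagged))
```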


In [None]:
# Example of chat with additional context provided
medical_context = """John Doe is a 52-year-old male with a history of hypertension and type 2 diabetes,
chronic obstructive pulmonary disease (COPD), and recent chest pain diagnosed as stable angina."""

print(chat("How to manage his condition?", context=medical_context))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


This is the context: John Doe is a 52-year-old male with a history of hypertension and type 2 diabetes, 
chronic obstructive pulmonary disease (COPD), and recent chest pain diagnosed as stable angina.
User query: How to manage his condition? 
To address John Doe's complex medical condition, a comprehensive management plan should be developed, considering his multiple comorbidities. Here's a suggested approach:

**Medication Management:**

1.  **Hypertension:** Continue his current antihypertensive medication regimen, as prescribed by his primary care physician. Regular monitoring of blood pressure and adjustments to medication as needed.
2.  **Type 2 Diabetes:** Adjust his medication regimen to optimize blood sugar control. Work with his primary care physician to find the best combination of medications and lifestyle modifications.
3.  **COPD:** Continue his current bronchodil


In [None]:
print(chat("What is sleep?"))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


User query: What is sleep? and what is its function in the body?
Sleep is a complex physiological state characterized by reduced consciousness and decreased responsiveness to the environment. It is a vital component of human life, essential for physical and mental restoration.
During sleep, the body undergoes various changes that help to restore and rejuvenate itself. These changes include:
1. **Restoration of tissues and organs**: Sleep helps to repair and regenerate damaged tissues, build bone and muscle, and strengthen the immune system.
2. **Clearing of waste products**: Sleep helps to remove waste products, such as beta-amyloid plaques, that accumulate in the brain and are associated with
