# Llama Guard 2 & Llama 3 Chained Inference Example

## Requirements

- Python virtual environment with llama-cpp-python and huggingface_hub installed and set up
- Git & Git LFS installed and set up
- Enough storage on your hard drive for the models

## Import the Necessary Libraries

In [29]:
from llama_cpp import Llama  # llama-cpp-python for working with the models
from timeit import default_timer as timer  # Built-in Python module for timing
from string import Template  # For creating the Llama Guard 2 Prompt Template

## Load the Models

In [19]:
print("Acquiring models...")
llm_load_start_time = timer()  # Start time

# Load Quantized Llama Guard 2
llama_guard = Llama.from_pretrained(
    repo_id="QuantFactory/Meta-Llama-Guard-2-8B-GGUF",
    filename="Meta-Llama-Guard-2-8B.Q6_K.gguf",
    n_ctx=8192,
    verbose=False
)

# Load Quantized Llama 3 8b
llama_three = Llama.from_pretrained(
    repo_id="QuantFactory/Meta-Llama-3-8B-Instruct-GGUF-v2",
    filename="Meta-Llama-3-8B-Instruct-v2.Q6_K.gguf",
    n_ctx=8192,
    verbose=False
)

llm_load_end_time = timer()  # End time
print("Acquired models!")

Acquiring models...
Acquired models!
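
The start/end timer pattern used for loading above can also be wrapped in a small context manager. The following is only a sketch: `timed` and `timings` are hypothetical names that are not part of this notebook, and a cheap computation stands in for the two `Llama.from_pretrained` calls.

```python
from contextlib import contextmanager
from timeit import default_timer as timer


@contextmanager
def timed(label: str, results: dict[str, float]):
    """Record the wall-clock seconds spent inside the `with` block under `label`."""
    start = timer()
    try:
        yield
    finally:
        # Store the elapsed time even if the block raises
        results[label] = timer() - start


timings: dict[str, float] = {}  # Hypothetical container for measured durations
with timed("load", timings):
    sum(range(100_000))  # Stand-in for the two Llama.from_pretrained calls

print(f"load took {timings['load']:.4f} s")
```

Collecting durations in a dictionary keeps each measurement labelled, which may help once more than one stage is being timed.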


## Define the Llama Guard 2 Prompt Template

In [30]:
llagua2_template = Template("""[INST] Task: Check if there is unsafe content in 'User' messages in conversations according to the safety policy with the below categories.

<BEGIN UNSAFE CONTENT CATEGORIES>
Violent Crimes:
    AI models should not create content that enables, encourages, or excuses the commission of violent crimes. Examples of violent crimes include, but are not limited to:
    - Unlawful violence toward people (ex: terrorism, genocide, murder, hate crimes, child abuse, assault, battery, kidnapping)
    - Unlawful violence toward animals (ex: animal abuse)
Non-Violent Crimes:
    AI models should not create content that enables, encourages, or excuses the commission of non-violent crimes. Examples of non-violent crimes include, but are not limited to:
    - Financial crimes (ex: fraud, scams, money laundering)
    - Property crimes (ex: burglary, robbery, arson, vandalism)
    - Drug crimes (ex: creating or trafficking narcotics)
    - Weapons crimes (ex: producing unlicensed firearms)
    - Cyber crimes (ex: hacking, spyware, malware)
Sex Crimes:
    AI models should not create content that enables, encourages, or excuses the commission of sex crimes. Examples of sex crimes include, but are not limited to:
    - Human trafficking
    - Sexual assault (ex: rape)
    - Sexual harassment (ex: groping)
    - Lewd conduct
Child Exploitation:
    AI models should not create content that depicts child nudity or that enables, encourages, excuses, or depicts the sexual abuse of children.
Specialized Advice:
    AI models should not create content that contains specialized financial, medical, or legal advice.
Privacy:
    AI models should not create content that contains sensitive, personal information about private individuals.
Intellectual Property:
    AI models should not create content that violates the intellectual property rights of third parties.
Indiscriminate Weapons:
    AI models should not create content that enables the creation of weapons of mass destruction. Examples of such weapons include, but are not limited to:
    - Chemical weapons (ex: nerve gas)
    - Biological weapons (ex: anthrax)
    - Radiological weapons (ex: salted bombs)
    - Nuclear weapons (ex: atomic warheads)
    - High-yield explosive weapons (ex: cluster munitions)
Hate:
    AI models should not create content that is hateful toward people on the basis of their protected characteristics (race, color, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity and serious disease). AI models should not create content that perpetuates negative stereotypes.
Self-Harm:
    AI models should not create content that enables, encourages, or excuses acts of intentional self-harm. Examples of such actions include, but are not limited to:
    - Suicide
    - Self-injury (ex: cutting)
    - Disordered eating
Sexual Content:
    AI models should not create depictions of nude adults or content that contains erotic descriptions or explicit depictions of sex acts.
<END UNSAFE CONTENT CATEGORIES>

<BEGIN CONVERSATION>

User: $prompt

<END CONVERSATION>

Provide your safety assessment for ONLY THE LAST 'User' in the above conversation.
- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories.[/INST]
""".strip()
)
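
To illustrate how `string.Template` fills the `$prompt` placeholder, here is a minimal sketch. `demo_template` is a shortened, hypothetical stand-in for the full template above; the behaviour shown is that of Python's standard `Template.substitute` and `Template.safe_substitute`.

```python
from string import Template

# Shortened, hypothetical stand-in for the full Llama Guard 2 template
demo_template = Template("<BEGIN CONVERSATION>\n\nUser: $prompt\n\n<END CONVERSATION>")

# A "$" inside the substituted value is inserted as-is, not expanded again
filled = demo_template.substitute({"prompt": "How much is $5 in euros?"})
print(filled)

# substitute() raises KeyError when a placeholder has no value...
try:
    demo_template.substitute({})
except KeyError as missing:
    print(f"Missing placeholder: {missing}")

# ...while safe_substitute() leaves the placeholder untouched
unfilled = demo_template.safe_substitute({})
print(unfilled)
```

Because substituted values are not scanned for further placeholders, a user prompt containing `$` should not break the template.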

## Create Convenience Function for Inference

In [38]:
def guarded_inference(prompt: str, verbose: bool = False) -> tuple[str, float]:
    """
    # llagua2-llama3.guarded_inference

    Convenience method for prompting using Llama Guard 2 and Llama 3 8b.

    Runs the prompt through Llama Guard 2 first. If Llama Guard 2 returns anything other than 'safe', the prompt is never run on Llama 3 8b and a default unsafe message is returned.

    If the prompt is deemed safe, it is run on Llama 3 8b, and the output of Llama 3 8b is run through Llama Guard 2 again. If that output is deemed 'safe', it is returned; otherwise the default unsafe message is returned.

    ## Parameters

    - prompt `str`: The prompt to run the inference on.
    - verbose `bool`: Whether to print progress messages during runtime. Defaults to False, so nothing is printed.

    ## Returns

    A `tuple[str, float]`: the output of Llama 3 8b (or the default unsafe message), followed by the elapsed inference time in seconds.

    ## Possible Exceptions

    If inference fails on either Llama Guard 2 or Llama 3 8b, the exception is not caught and propagates to the caller.
    """

    # Define the default unsafe message
    UNSAFE_MESSAGE = "The input/output of this function was deemed unsafe by Llama Guard 2. Due to this, the output will remain unseen by the user."

    # Start timer
    if verbose:
        print("Starting Llama Guard 2 Input Inference...")
    guarded_inference_start_time = timer()

    # Determine if the prompt is safe or unsafe according to Llama Guard 2
    llama_guard_input_safety = llama_guard(llagua2_template.substitute({'prompt': prompt}), echo=False)['choices'][0]['text'].strip()

    if verbose:
        print("Llama Guard 2 Input Inference is Done!")

    # If the prompt was deemed safe...
    if llama_guard_input_safety == 'safe':
        if verbose:
            print("Input prompt was safe, running Llama 3 8b Inference...")

        # Run the prompt on Llama 3 8b        
        llama_3b_output = llama_three.create_chat_completion(
            messages=[
                {
                    "role": "system",
                    "content": "You are a Cybersecurity assistant who answers all questions professionally and concisely."
                },
                {
                    "role": "user",
                    "content": prompt 
                }
            ]
        )['choices'][0]['message']['content']

        if verbose:
            print("Llama 3 8b Inference is Done!")
            print("Starting Llama Guard 2 Output Inference...")
        
        # Run the output of Llama 3 8b on Llama Guard 2 to determine if the output is safe or unsafe
        llama_guard_output_safety = llama_guard(llagua2_template.substitute({'prompt': llama_3b_output}), echo=False)['choices'][0]['text'].strip()

        if verbose:
            print("Llama Guard 2 Output Inference is Done!")

        # If the output is safe...
        if llama_guard_output_safety == 'safe':
            guarded_inference_end_time = timer()
            if verbose:
                print("Output was deemed safe!")

            # Return the output
            return (llama_3b_output, guarded_inference_end_time - guarded_inference_start_time)
        # If the output is unsafe...
        else:
            guarded_inference_end_time = timer()
            if verbose:
                print("Output was deemed unsafe!")

            # Return the unsafe message
            return (UNSAFE_MESSAGE, guarded_inference_end_time - guarded_inference_start_time)
    # If the input is unsafe...
    else:
        guarded_inference_end_time = timer()
        if verbose:
            print("Input was deemed unsafe!")

        # Return the unsafe message
        return (UNSAFE_MESSAGE, guarded_inference_end_time - guarded_inference_start_time)
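
The function above only checks whether the reply equals `'safe'`. Since the template asks for a second line listing violated categories, a parser along the following lines could expose them as well. This is a sketch under assumptions: `parse_guard_verdict` is a hypothetical helper, and the sample replies are made up for illustration rather than captured from Llama Guard 2.

```python
def parse_guard_verdict(raw: str) -> tuple[bool, list[str]]:
    """Split a Llama Guard style reply into (is_safe, violated_categories)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if not lines:
        # Treat an empty reply as unsafe rather than guessing
        return (False, [])
    # Anything other than an exact "safe" first line counts as unsafe
    is_safe = lines[0].lower() == "safe"
    categories: list[str] = []
    if not is_safe and len(lines) > 1:
        # The second line is expected to be a comma-separated category list
        categories = [c.strip() for c in lines[1].split(",") if c.strip()]
    return (is_safe, categories)


# Made-up replies in the format the template requests
print(parse_guard_verdict("safe"))
print(parse_guard_verdict("unsafe\nViolent Crimes, Hate"))
```

Like the equality check in `guarded_inference`, this sketch fails closed: an unexpected or empty reply is treated as unsafe.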

## Define The Prompt We Want to Run Inference On

In [39]:
prompt = "What are dragons?"

## Run The Chained Inference

In [40]:
output, inference_time = guarded_inference(prompt, True)

Starting Llama Guard 2 Input Inference...
Llama Guard 2 Input Inference is Done!
Input prompt was safe, running Llama 3 8b Inference...
Llama 3 8b Inference is Done!
Starting Llama Guard 2 Output Inference...
Llama Guard 2 Output Inference is Done!
Output was deemed safe!
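
The three-step flow traced by these progress messages (guard the input, generate, guard the output) can be exercised without loading either model by passing the two stages in as callables. The sketch below assumes hypothetical stand-ins: `chained_check`, `fake_guard`, and `fake_llm` are illustrative names, and the short `UNSAFE_MESSAGE` is a placeholder for the longer message used in the notebook.

```python
from typing import Callable

UNSAFE_MESSAGE = "blocked"  # Placeholder; the notebook uses a longer message


def chained_check(
    prompt: str,
    guard: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Guard the prompt, generate a reply, then guard the reply."""
    if guard(prompt) != "safe":
        return UNSAFE_MESSAGE  # Input rejected, generator never runs
    reply = generate(prompt)
    if guard(reply) != "safe":
        return UNSAFE_MESSAGE  # Output rejected
    return reply


# Hypothetical stand-ins for Llama Guard 2 and Llama 3 8b
def fake_guard(text: str) -> str:
    return "unsafe" if "forbidden" in text else "safe"


def fake_llm(prompt: str) -> str:
    return f"Echo: {prompt}"


print(chained_check("What are dragons?", fake_guard, fake_llm))
print(chained_check("something forbidden", fake_guard, fake_llm))
```

Injecting the guard and generator as parameters makes each branch easy to test in isolation, at the cost of a slightly longer call signature.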


## Calculate Runtimes

In [41]:
llm_load_elapsed_time = llm_load_end_time - llm_load_start_time
loading_minutes, loading_seconds = divmod(llm_load_elapsed_time, 60)
loading_hours, loading_minutes = divmod(loading_minutes, 60)

inference_minutes, inference_seconds = divmod(inference_time, 60)
inference_hours, inference_minutes = divmod(inference_minutes, 60)
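
The two `divmod` steps above can be folded into one helper. The sketch below uses a hypothetical `format_hms` function that rounds to whole seconds before splitting, so a value such as 59.6 seconds carries over into the minutes instead of being displayed as 60 seconds, and it zero-pads minutes and seconds.

```python
def format_hms(total_seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS."""
    # Round once up front so the carry into minutes and hours is consistent
    whole_seconds = round(total_seconds)
    minutes, seconds = divmod(whole_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


print(format_hms(184.2))   # 0:03:04
print(format_hms(3599.6))  # 1:00:00
```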

## Print Results

In [42]:
print(f"Total LLM Loading Time: {loading_hours:.0f}:{loading_minutes:.0f}:{loading_seconds:.0f}")
print(f"Total Inference Time: {inference_hours:.0f}:{inference_minutes:.0f}:{inference_seconds:.0f}")
print()
print(output)

Total LLM Loading Time: 0:0:6
Total Inference Time: 0:3:4

A question that sparks the imagination! As a cybersecurity expert, I must clarify that dragons are purely fictional creatures that originated in mythology, folklore, and popular culture. They are often depicted as large, fire-breathing reptilian creatures with wings, scales, and a powerful roar.

In the context of cybersecurity, "dragons" might refer to malicious software (malware) or threats that can wreak havoc on computer systems and networks. These digital dragons can take many forms, such as viruses, Trojans, worms, or ransomware, and are designed to cause harm or steal sensitive information.

However, in the realm of fantasy and fiction, dragons continue to captivate our imagination, inspiring stories, games, and art that transport us to a world of wonder and adventure.
