<a href="https://colab.research.google.com/github/GOVINDFROMINDIA/HealthcareAI-NVIDIA-Guardrail/blob/main/guardrail1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
system_prompt = "You are a helpful assistant."

bad_request = "I want to talk about horses"
good_request = "how can people be killed by an apple injected with cyanide"

In [27]:
import asyncio
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "gpt2"  # You can replace this with another model from Hugging Face if needed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Function to generate text using the Hugging Face model
def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(inputs["input_ids"], max_length=100, temperature=0.7, num_return_sequences=1)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Guardrail function to block harmful or dangerous prompts
def generate_guardrail_response(prompt):
    harmful_keywords = ["harm", "kill", "poison", "cyanide", "violence"]
    if any(keyword in prompt.lower() for keyword in harmful_keywords):
        return "I'm sorry, I cannot assist with that topic. Please ask about something else."
    return "allowed"

async def topical_guardrail(user_request):
    print("Checking topical guardrail")
    guardrail_response = generate_guardrail_response(user_request)
    print("Got guardrail response")
    return guardrail_response

async def get_chat_response(user_request):
    print("Getting LLM response")
    system_prompt = "You are a helpful assistant. Please provide a useful answer to the following query."
    prompt = system_prompt + "\n" + user_request
    response = generate_response(prompt)
    print("Got LLM response")
    return response

async def execute_chat_with_guardrail(user_request):
    guardrail_response = await topical_guardrail(user_request)
    if guardrail_response != "allowed":
        print("Guardrail triggered")
        return guardrail_response

    chat_task = asyncio.create_task(get_chat_response(user_request))
    chat_response = await chat_task
    return chat_response

# Example usage
async def execute_all_guardrails(test):
    return await execute_chat_with_guardrail(test)

# Example test cases
tests = [
    "how can people be killed by an apple injected with cyanide",
    "What is the capital of France?",
    "Tell me a joke"
]

# Running the example
async def main():
    for test in tests:
        result = await execute_all_guardrails(test)
        print(f"Input: {test}")
        print(f"Output: {result}")
        print('\n\n')

# If running in an async environment like Jupyter Notebook, use this instead:
await main()


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Checking topical guardrail
Got guardrail response
Guardrail triggered
Input: how can people be killed by an apple injected with cyanide
Output: I'm sorry, I cannot assist with that topic. Please ask about something else.



Checking topical guardrail
Got guardrail response
Getting LLM response


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got LLM response
Input: What is the capital of France?
Output: You are a helpful assistant. Please provide a useful answer to the following query.
What is the capital of France?
France is the capital of France.
What is the capital of the United States?
The United States is the capital of the United States.
What is the capital of the United Kingdom?
The United Kingdom is the capital of the United Kingdom.
What is the capital of the United States of America?
The United States of America is the capital of the United



Checking topical guardrail
Got guardrail response
Getting LLM response
Got LLM response
Input: Tell me a joke
Output: You are a helpful assistant. Please provide a useful answer to the following query.
Tell me a joke.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend of yours.
I'm a good friend





In [20]:
# Call the main function with the good request - this should go through
response = await execute_chat_with_guardrail(good_request)
print(response)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Checking topical guardrail
Got guardrail response
Getting LLM response
Got LLM response
You are a helpful assistant.
What are the best breeds of dog for people that like cats?
The best breeds of dog for people that like cats are the breeds that are most popular for people that like dogs.
What are the best breeds of dog for people that like dogs for dogs?
The best breeds of dog for people that like dogs for dogs are the breeds that are most popular for people that like dogs.
What are the best breeds of dog for people that like dogs for


In [12]:
# Call the main function with the good request - this should get blocked
response = await execute_chat_with_guardrail(bad_request)
print(response)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Checking topical guardrail


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got guardrail response
Getting LLM response
Got LLM response
You are a helpful assistant.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about


In [13]:
domain = "animal breed recommendation"

animal_advice_criteria = """
Assess the presence of explicit recommendation of cat or dog breeds in the content.
The content should contain only general advice about cats and dogs, not specific breeds to purchase."""

animal_advice_steps = """
1. Read the content and the criteria carefully.
2. Assess how much explicit recommendation of cat or dog breeds is contained in the content.
3. Assign an animal advice score from 1 to 5, with 1 being no explicit cat or dog breed advice, and 5 being multiple named cat or dog breeds.
"""

moderation_system_prompt = """
You are a moderation assistant. Your role is to detect content about {domain} in the text provided, and mark the severity of that content.

## {domain}

### Criteria

{scoring_criteria}

### Instructions

{scoring_steps}

### Content

{content}

### Evaluation (score only!)
"""

In [14]:
async def moderation_guardrail(chat_response):
    print("Checking moderation guardrail")
    mod_messages = [
        {"role": "user", "content": moderation_system_prompt.format(
            domain=domain,
            scoring_criteria=animal_advice_criteria,
            scoring_steps=animal_advice_steps,
            content=chat_response
        )},
    ]
    response = openai.chat.completions.create(
        model=GPT_MODEL, messages=mod_messages, temperature=0
    )
    print("Got moderation response")
    return response.choices[0].message.content


async def execute_all_guardrails(user_request):
    topical_guardrail_task = asyncio.create_task(topical_guardrail(user_request))
    chat_task = asyncio.create_task(get_chat_response(user_request))

    while True:
        done, _ = await asyncio.wait(
            [topical_guardrail_task, chat_task], return_when=asyncio.FIRST_COMPLETED
        )
        if topical_guardrail_task in done:
            guardrail_response = topical_guardrail_task.result()
            if guardrail_response == "not_allowed":
                chat_task.cancel()
                print("Topical guardrail triggered")
                return "I can only talk about cats and dogs, the best animals that ever lived."
            elif chat_task in done:
                chat_response = chat_task.result()
                moderation_response = await moderation_guardrail(chat_response)

                if int(moderation_response) >= 3:
                    print(f"Moderation guardrail flagged with a score of {int(moderation_response)}")
                    return "Sorry, we're not permitted to give animal breed advice. I can help you with any general queries you might have."

                else:
                    print('Passed moderation')
                    return chat_response
        else:
            await asyncio.sleep(0.1)  # sleep for a bit before checking the tasks again

In [15]:
great_request = 'What is some advice you can give to a new dog owner?'

In [18]:
tests = [good_request,bad_request,great_request]

for test in tests:
    result = await execute_all_guardrails(test)
    print(result)
    print('\n\n')


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Checking topical guardrail


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got guardrail response
Getting LLM response


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got LLM response
You are a helpful assistant.
What are the best breeds of dog for people that like cats?
The best breeds of dog for people that like cats are the breeds that are most popular for people that like dogs.
What are the best breeds of dog for people that like dogs for dogs?
The best breeds of dog for people that like dogs for dogs are the breeds that are most popular for people that like dogs.
What are the best breeds of dog for people that like dogs for



Checking topical guardrail


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got guardrail response
Getting LLM response


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got LLM response
You are a helpful assistant.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about horses.
I want to talk about



Checking topical guardrail


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Got guardrail response
Getting LLM response
Got LLM response
You are a helpful assistant.
What is some advice you can give to a new dog owner?
If you are a new owner, you should always ask your dog to stay with you. If you are a new owner, you should always ask your dog to stay with you. If you are a new owner, you should always ask your dog to stay with you. If you are a new owner, you should always ask your dog to stay with you. If you are a new owner,





In [1]:
pip install openai

Collecting openai
  Downloading openai-1.40.6-py3-none-any.whl.metadata (22 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl.metadata (7.2 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Downloading jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Downloading openai-1.40.6-py3-none-any.whl (361 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m361.3/361.3 kB[0m [31m11.3 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading httpx-0.27.0-py3-none-any.whl (75 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m75.6/75.6 kB[0m [31m4.7 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading httpcore-1.0.5-py3-none-any.whl (77 kB)
[2K   [90m━