## With guardrails

This notebook adds guardrails to a search application. Only the display code is changed.

In [1]:
#%pip install --quiet llama-index llama-index-retrievers-bm25 llama-index-llms-anthropic anthropic llama-index-llms-google-genai

In [2]:
GEMINI="gemini-2.0-flash"
#OPENAI="gpt-4o-mini"
CLAUDE="claude-3-7-sonnet-latest"

import os
from dotenv import load_dotenv
load_dotenv("../keys.env")
# .get() avoids a KeyError, so a missing key produces the message below
assert os.environ.get("GEMINI_API_KEY", "")[:2] == "AI",\
       "Please specify the GEMINI_API_KEY access token in keys.env file"
assert os.environ.get("ANTHROPIC_API_KEY", "")[:2] == "sk",\
       "Please specify the ANTHROPIC_API_KEY access token in keys.env file"
#assert os.environ.get("OPENAI_API_KEY", "")[:2] == "sk",\
#       "Please specify the OPENAI_API_KEY access token in keys.env file"

## Guardrails

Replace "PII"

In [3]:
## Custom guardrail, to replace all names by something generic
from llama_index.llms.google_genai import GoogleGenAI

def guardrail_replace_names(to_scan: str):
    llm = GoogleGenAI(model=GEMINI,
                      api_key=os.environ["GEMINI_API_KEY"], 
                      temperature=0)
    system_prompt="""
        I will give you a piece of text. In that piece of text, replace any personal names by a generic identifier.
        
        Example:
          Input:
            I met Sally in the store.
          Output:
            I met a woman in the store.
        
        Return only the modified text, with no preamble or special markers.
    """
    sanitized_output = llm.complete(system_prompt + "\n" + to_scan).text.strip()
    no_change = (sanitized_output == to_scan)
    
    return {
        "guardrail_type": "PII Removal",
        "activated": not no_change,
        "should_stop": False,
        "sanitized_output": sanitized_output,
    }

guardrail_replace_names("The killer was John Doe")

{'guardrail_type': 'PII Removal',
 'activated': True,
 'should_stop': False,
 'sanitized_output': 'The killer was a man'}

Banned topics

In [4]:
def guardrail_banned_topics(to_scan: str):
    banned_topics = [
        "religion", "politics", "sexual innuendo"
    ]
    llm = GoogleGenAI(model=GEMINI,
                      api_key=os.environ["GEMINI_API_KEY"], 
                      temperature=0)
    system_prompt=f"""
        I will give you a piece of text. Check whether the text touches on any of these topics.
        
        {banned_topics}
        
        Return True or False, with no preamble or special markers.
        Text:
    """
    response = llm.complete(system_prompt + "\n" + to_scan).text.strip()
    # Tolerate replies such as "true" or "True." rather than requiring an exact match
    is_banned = response.lower().startswith("true")
   
    return {
        "guardrail_type": "Banned Topic",
        "activated": is_banned,
        "should_stop": is_banned,
        "sanitized_output": to_scan,
    }

guardrail_banned_topics("Are priests allowed to marry?")

{'guardrail_type': 'Banned Topic',
 'activated': True,
 'should_stop': True,
 'sanitized_output': 'Are priests allowed to marry?'}
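The banned-topic guardrail depends on turning the model's free-text reply into a boolean. As a minimal sketch (the helper name `parse_verdict` and the fail-closed policy are assumptions, not part of the notebook), a tolerant parser could normalize the reply and treat anything unrecognized as a hit:

```python
def parse_verdict(response: str) -> bool:
    """Interpret an LLM's True/False reply, tolerating case, quotes and punctuation.

    Hypothetical helper: unrecognized replies are treated as True (fail closed),
    which is an assumed policy choice, not something the notebook prescribes.
    """
    cleaned = response.strip().strip("\"'`.!").lower()
    if cleaned in ("true", "yes"):
        return True
    if cleaned in ("false", "no"):
        return False
    # The model did not follow the output format: block rather than let it through
    return True


for reply in ["True", " true.\n", "False", "I think so"]:
    print(repr(reply), "->", parse_verdict(reply))
```

Failing closed trades a few false rejections for never letting a malformed reply pass as "not banned"; a more permissive application might prefer the opposite default.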

In [5]:
def apply_guardrails(to_scan, scanners):
    should_stop = False
    triggered_scanners = []  # Store results from triggered scanners

    sanitized_output = to_scan # start with the original string
    for scanner in scanners:
        result = scanner(sanitized_output)

        if result["activated"]:  # the scanner changed the text or found a threat
            # Stop if any activated scanner asks to; a later scanner must not reset this
            should_stop = should_stop or result["should_stop"]
            triggered_scanners.append(result)  # record every activated scanner
            sanitized_output = result["sanitized_output"]  # pass the sanitized text on

    result = {
        "should_stop": should_stop,
        "triggered": triggered_scanners,
        "sanitized": sanitized_output
    }
    return result

apply_guardrails("Are parish priests expected to be role models?", [guardrail_replace_names, guardrail_banned_topics])

{'should_stop': True,
 'triggered': [{'guardrail_type': 'Banned Topic',
   'activated': True,
   'should_stop': True,
   'sanitized_output': 'Are parish priests expected to be role models?'}],
 'sanitized': 'Are parish priests expected to be role models?'}

In [6]:
apply_guardrails("Is Mr. Darcy a good role model?", [guardrail_replace_names, guardrail_banned_topics])

{'should_stop': False,
 'triggered': [{'guardrail_type': 'PII Removal',
   'activated': True,
   'should_stop': False,
   'sanitized_output': 'Is a man a good role model?'}],
 'sanitized': 'Is a man a good role model?'}
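To explore how scanners chain without calling a model, here is a sketch that uses two hypothetical, deterministic stand-ins (`stub_replace_names` and `stub_banned_topics`, neither of which is in the notebook) and a variant of the pipeline, `run_scanners`, that stops if any activated scanner requests it. The keyword list and the title-based regex are illustrative assumptions only.

```python
import re


def stub_replace_names(text: str) -> dict:
    # Hypothetical stand-in: only recognizes "Mr./Mrs./Miss <Name>"
    sanitized = re.sub(r"\b(?:Mr|Mrs|Miss)\.? [A-Z][a-z]+", "a person", text)
    return {"guardrail_type": "PII Removal", "activated": sanitized != text,
            "should_stop": False, "sanitized_output": sanitized}


def stub_banned_topics(text: str) -> dict:
    # Hypothetical stand-in: keyword match instead of an LLM judgment
    banned_words = {"priest", "priests", "church", "election"}
    hit = bool(set(re.findall(r"[a-z]+", text.lower())) & banned_words)
    return {"guardrail_type": "Banned Topic", "activated": hit,
            "should_stop": hit, "sanitized_output": text}


def run_scanners(text: str, scanners) -> dict:
    triggered = []
    for scanner in scanners:
        result = scanner(text)  # each scanner sees the previous scanner's output
        if result["activated"]:
            triggered.append(result)
            text = result["sanitized_output"]
    return {"should_stop": any(r["should_stop"] for r in triggered),
            "triggered": triggered, "sanitized": text}


print(run_scanners("Is Mr. Darcy a good role model?",
                   [stub_replace_names, stub_banned_topics]))
print(run_scanners("Did Mr. Collins join the church?",
                   [stub_banned_topics, stub_replace_names]))
```

In the second call the banned-topic scanner fires first and the name scanner fires afterwards with `should_stop` set to `False`; using `any(...)` keeps the overall verdict at stop regardless of scanner order.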

## Guardrails around RAG


In [7]:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.base.response.schema import Response

class GuardedQueryEngine(RetrieverQueryEngine):
    def __init__(self, query_engine: RetrieverQueryEngine):
        self._query_engine = query_engine
    
    def query(self, query):
        # apply guardrails to inputs
        gd = apply_guardrails(query,
                              [guardrail_replace_names, guardrail_banned_topics])
        if not gd["should_stop"]:
            print(f"Modified Query: {gd['sanitized']}")
            query_response = self._query_engine.query(gd["sanitized"])     
            gd = apply_guardrails(str(query_response), [guardrail_banned_topics])
            if not gd["should_stop"]:
                return Response(gd["sanitized"],
                                source_nodes=query_response.source_nodes)
        return Response(str(gd))
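The wrapper follows a general pattern: guard the input, call the wrapped engine, then guard the output. The sketch below shows that pattern in plain Python with no LlamaIndex dependency. `EchoEngine`, `GuardedEngine`, and `block_if_contains` are hypothetical names introduced only for illustration, and the guards use a simplified result shape of `{"should_stop", "sanitized"}`.

```python
from typing import Callable, Protocol


class QueryEngine(Protocol):
    def query(self, query: str) -> str: ...


class EchoEngine:
    """Hypothetical stand-in for the real RAG engine."""

    def query(self, query: str) -> str:
        return f"Answer to: {query}"


Guard = Callable[[str], dict]  # returns {"should_stop": bool, "sanitized": str}


class GuardedEngine:
    def __init__(self, engine: QueryEngine, input_guard: Guard, output_guard: Guard):
        self._engine = engine
        self._input_guard = input_guard
        self._output_guard = output_guard

    def query(self, query: str) -> str:
        checked = self._input_guard(query)        # 1. guard the user's query
        if checked["should_stop"]:
            return "Query blocked by input guardrails."
        answer = self._engine.query(checked["sanitized"])  # 2. call the wrapped engine
        checked = self._output_guard(answer)      # 3. guard the generated answer
        if checked["should_stop"]:
            return "Response blocked by output guardrails."
        return checked["sanitized"]


def block_if_contains(word: str) -> Guard:
    # Toy guard: stop when the given word appears anywhere in the text
    def guard(text: str) -> dict:
        return {"should_stop": word in text.lower(), "sanitized": text}
    return guard


engine = GuardedEngine(EchoEngine(), block_if_contains("priest"), block_if_contains("secret"))
print(engine.query("Are priests allowed to marry?"))
print(engine.query("Can you give advice?"))
```

Composition rather than inheritance keeps the sketch independent of any particular engine class; anything with a `query` method can be wrapped.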

## Basic RAG application

This is the application that we want to protect.

In [8]:
from basic_rag import build_query_engine, print_response_to_query
query_engine = build_query_engine(CLAUDE, ["https://www.gutenberg.org/cache/epub/31100/pg31100.txt"], 100) # Jane Austen

2025-05-13 00:56:26,049 - INFO - Indexer initialized
2025-05-13 00:56:26,050 - INFO - Loading https://www.gutenberg.org/cache/epub/31100/pg31100.txt from cache
2025-05-13 00:56:26,121 - INFO - Cleaned Gutenberg text: removed 887 chars from start, 18518 chars from end
2025-05-13 00:56:26,122 - INFO - Successfully loaded text from https://www.gutenberg.org/cache/epub/31100/pg31100.txt.
2025-05-13 00:56:51,954 - INFO - Successfully loaded text from b395ceb2-141e-41a8-a5bd-fb1b9f152690 -- 24434 nodes created.
2025-05-13 00:56:53,735 - DEBUG - Building index from IDs objects


In [9]:
# wrap it in Guardrails
gd_query_engine = GuardedQueryEngine(query_engine)

### Good query

In [10]:
print_response_to_query(gd_query_engine, "Can you give advice without being resented for it?")

2025-05-13 00:56:54,388 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:56:54,391 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:56:54,995 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:56:54,997 - INFO - AFC remote call 1 is done.
2025-05-13 00:56:55,232 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:56:55,235 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:56:55,590 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:56:55,593 - INFO - AFC remote call 1 is done.


Modified Query: Can you give advice without being resented for it?


2025-05-13 00:57:00,294 - INFO - HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
2025-05-13 00:57:00,567 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:00,570 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:00,894 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:00,896 - INFO - AFC remote call 1 is done.


Yes, it is possible to give advice without being resented for it, as shown in the example where Elizabeth thanked her aunt for "the kindness of her hints" and they parted in what was described as "a wonderful instance of advice being given on such a point, without being resented."

However, the manner in which advice is offered seems important. When advice is perceived as kind or when it respects the other person's autonomy, it appears to be better received. Conversely, there are instances where people may resist advice, particularly when they feel their independence is being challenged, as suggested by the reference to someone being "wilful and perverse" and deciding for themselves "without any consideration or deference for those who have surely some right to guide you."

In some situations, like when Elinor was asked for advice, she declined to give it directly, instead suggesting that the person's "own judgment must direct you," which represents another approach to handling advice-

### Query that should be rejected

This query should be rejected because it touches on religion, which is (let's assume) a prohibited topic.

In [11]:
print_response_to_query(gd_query_engine, "Are parish priests expected to be role models?")

2025-05-13 00:57:01,134 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:01,137 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:01,622 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:01,624 - INFO - AFC remote call 1 is done.
2025-05-13 00:57:01,856 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:01,861 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:02,145 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:02,149 - INFO - AFC remote call 1 is done.


{'should_stop': True, 'triggered': [{'guardrail_type': 'Banned Topic', 'activated': True, 'should_stop': True, 'sanitized_output': 'Are parish priests expected to be role models?'}], 'sanitized': 'Are parish priests expected to be role models?'}


**Sources**:


### Query that should be modified

Let's say that queries that reference people by name should be made more generic.

In [12]:
print_response_to_query(gd_query_engine, "Would Mr. Darcy be an appealing match if he were not wealthy?")

2025-05-13 00:57:02,398 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:02,403 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:02,915 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:02,920 - INFO - AFC remote call 1 is done.
2025-05-13 00:57:03,178 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:03,183 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:03,557 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:03,561 - INFO - AFC remote call 1 is done.


Modified Query: Would a man be an appealing match if he were not wealthy?


2025-05-13 00:57:08,444 - INFO - HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
2025-05-13 00:57:08,678 - INFO - HTTP Request: GET https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash "HTTP/1.1 200 OK"
2025-05-13 00:57:08,681 - INFO - AFC is enabled with max remote calls: 10.
2025-05-13 00:57:09,082 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
2025-05-13 00:57:09,084 - INFO - AFC remote call 1 is done.


From the information provided, wealth appears to be a significant factor in determining a desirable match in this society. There are multiple references to wealth in relation to marriages - such as a woman who "married a very wealthy man" and a family described as "all rich together," with mention of "fifty thousand pounds." Additionally, there's mention of the Allens being "wealthy and childless" as "absolute facts" that seem to factor into someone's consideration.

However, there's also indication that other factors matter in relationships, as shown by the hope that a couple would be "united by mutual affection" and that "their dispositions were as exactly fitted to make them blessed in each other." In one case, a match is described as "quite good enough" even though it was eclipsed by another option that appears to have been more financially advantageous.

So while wealth seems to be a highly valued attribute in potential matches, there are suggestions that other qualities like mutu