# Secure RAG with LLamaIndex

In this notebook, we will show practical attack on RAG when automatic candidates screening based on their CVs. In one of CVs of the least experienced candidate, I added a prompt injection and changed text color to white, so it's hard to spot.

We will try to perform attack first and then secure it with LLM Guard.

-----------------

Let's start by installing [LlamaIndex](https://www.llamaindex.ai/)

In [None]:
import llm_guard
!pip install llama-index llama-hub

Then we need to set up the environment.

In [1]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [2]:
import openai

openai.api_key="sk-test-token"

Now, we can load the test document with fake resumes.

In [3]:
from llama_hub.file.pymu_pdf.base import PyMuPDFReader

loader = PyMuPDFReader()
documents = loader.load(file_path="./resumes.pdf")

Now, we can import the libraries and configure them.

In [4]:
# Only for debugging purposes
from llama_index.callbacks import (
    CallbackManager,
    LlamaDebugHandler,
)

llama_debug = LlamaDebugHandler(print_trace_on_end=False)
callback_manager = CallbackManager([llama_debug])

In [5]:
from llama_index import (
    ServiceContext,
    OpenAIEmbedding,
    VectorStoreIndex,
)
from llama_index.llms import OpenAI
from llama_index.text_splitter import SentenceSplitter

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
transformations = [
    SentenceSplitter(),
    OpenAIEmbedding(),
]
service_context = ServiceContext.from_defaults(
    llm=llm, 
    transformations=transformations,
    callback_manager=callback_manager,
)
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)

DEBUG:llama_index.node_parser.node_utils:> Adding chunk: John Doe
123 Hospitality Lane, Hotelville, TX 7...
> Adding chunk: John Doe
123 Hospitality Lane, Hotelville, TX 7...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: Jane Smith
456 Caregiver Road, Caretown, CA 902...
> Adding chunk: Jane Smith
456 Caregiver Road, Caretown, CA 902...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: Michael Johnson
789 Elderly Avenue, Compassion ...
> Adding chunk: Michael Johnson
789 Elderly Avenue, Compassion ...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: Alex Taylor
123 Coding Street, Techville, WA 98...
> Adding chunk: Alex Taylor
123 Coding Street, Techville, WA 98...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: Emily Roberts
234 Care Circle, Compassion Heigh...
> Adding chunk: Emily Roberts
234 Care Circle, Compassion Heigh...
DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
load_ssl_context verify=True cert=None trust_en

Once it's done, we can run query and see the results.

In [6]:
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("I am screening candidates for adult caregiving opportunity. Please recommend me an experienced person. Return just a name")
print(str(response))

DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='/Users/asofter/Desktop/Projects/llm-guard-experiments/venv/lib/python3.11/site-packages/certifi/cacert.pem'
load_verify_locations cafile='/Users/asofter/Desktop/Projects/llm-guard-experiments/venv/lib/python3.11/site-packages/certifi/cacert.pem'
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.<locals>.parser at 0x12602b740>, 'json_data': {'input': ['I am screening candidates for adult caregiving opportunity. Please recommend me an experienced person. Return just a name'], 'model': <OpenAIEmbeddingModeModel.TEXT_EMBED_ADA_002: 'text-embedding-ada-002'>, 'encoding_format': 'base64'}}
Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.<locals>.parser

We can see that the most inexperienced person was picked up, so the attack was successful.

We can also see the debug logs.

In [7]:
print(llama_debug.get_llm_inputs_outputs())
llama_debug.flush_event_logs()

[[CBEvent(event_type=<CBEventType.LLM: 'llm'>, payload={<EventPayload.MESSAGES: 'messages'>: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content="You are an expert Q&A system that is trusted around the world.\nAlways answer the query using the provided context information, and not prior knowledge.\nSome rules to follow:\n1. Never directly reference the given context in your answer.\n2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.", additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content="Context information is below.\n---------------------\ntotal_pages: 5\nfile_path: ./resumes.pdf\nsource: 2\n\nJane Smith\n456 Caregiver Road, Caretown, CA 90210\n(555) 678-9101 | janesmith@email.com | LinkedIn: /jane-smith-caregiver\nObjective:\nCompassionate and skilled Adult and Child Care Professional with over 8 years of experience in\nproviding exceptional care to individuals of all ages. Specialized in c

----

Now let's try to secure it with LLM Guard. We will redact PII and detect prompt injections.

In [None]:
!pip install llm-guard

First, we need to make an [Output Parsing Modules](https://docs.llamaindex.ai/en/stable/module_guides/querying/output_parser.html). It will scan the output and replace PII placeholders with real values.

In [8]:
from typing import Any, List
from llama_index.types import BaseOutputParser
from llm_guard.output_scanners.base import Scanner as OutputScanner
from llm_guard import scan_output


class LLMGuardOutputParserException(ValueError):
    """Exception to raise when llm-guard marks output invalid."""


class LLMGuardOutputParser(BaseOutputParser):
    def __init__(self, output_scanners: List[OutputScanner], fail_fast: bool = True):
        self.output_scanners = output_scanners
        self.fail_fast = fail_fast

    def parse(self, output: str, query: str = "") -> Any:
        sanitized_output, results_valid, results_score = scan_output(self.output_scanners, query, output, self.fail_fast)
        
        if not all(results_valid.values()):
            raise LLMGuardOutputParserException(f"Output `{sanitized_output}` is not valid, scores: {results_score}")
        
        return sanitized_output
    
    def format(self, query: str) -> str:
        # You can also implement input scanning here
        
        return query

Let's configure output scanners.

In [None]:
from llm_guard.vault import Vault
from llm_guard.output_scanners import Deanonymize, Toxicity

vault = Vault()

output_parser=LLMGuardOutputParser(
    output_scanners=[
        Deanonymize(vault),
        Toxicity(),
    ]
)

And reinitiate service context again with the new output parser.

In [None]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1, output_parser=output_parser)

service_context = ServiceContext.from_defaults(
    llm=llm, 
    transformations=transformations,
    callback_manager=callback_manager,
)
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)

We have two options on integrating LLM Guard for the input:

1. [Node Postprocessor](https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/root.html)
2. Ingestion pipeline [transformation](https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/transformations.html)

We will use the first option but in the real application, we should use both: clean data before ingestion and verify after retrieval. 

In [11]:
from typing import List, Optional
import logging
from llama_index.bridge.pydantic import Field
from llama_index.postprocessor.types import BaseNodePostprocessor
from llama_index.schema import MetadataMode, NodeWithScore, QueryBundle

logger = logging.getLogger(__name__)

class LLMGuardNodePostProcessor(BaseNodePostprocessor):
    scanners: List = Field(description="Scanner objects")
    fail_fast: bool = Field(
        description="If True, the postprocessor will stop after the first scanner failure.",
    )
    skip_scanners: List[str] = Field(
        description="List of scanner names to skip when failed e.g. Anonymize.",
    )

    def __init__(
        self,
        scanners: List,
        fail_fast: bool = True,
        skip_scanners: List[str] = None,
    ) -> None:
        if skip_scanners is None:
            skip_scanners = []
        
        try:
            import llm_guard
        except ImportError:
            raise ImportError(
                "Cannot import llm_guard package, please install it: ",
                "pip install llm-guard",
            )

        super().__init__(
            scanners=scanners,
            fail_fast=fail_fast,
            skip_scanners=skip_scanners,
        )

    @classmethod
    def class_name(cls) -> str:
        return "LLMGuardNodePostProcessor"

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
        from llm_guard import scan_prompt
        
        safe_nodes = []
        for node_with_score in nodes:
            node = node_with_score.node
            
            sanitized_text, results_valid, results_score = scan_prompt(
                self.scanners, 
                node.get_content(metadata_mode=MetadataMode.LLM), 
                self.fail_fast,
            )
            
            for scanner_name in self.skip_scanners:
                results_valid[scanner_name] = True
            
            if any(not result for result in results_valid.values()):
                logger.warning(f"Node `{node.node_id}` is not valid, scores: {results_score}")
                
                continue
            
            node.set_content(sanitized_text)
            safe_nodes.append(NodeWithScore(node=node, score=node_with_score.score))
            
        return safe_nodes

Now we can configure input scanners.

In [None]:
from llm_guard.input_scanners import Anonymize, PromptInjection, Toxicity, Secrets

input_scanners = [
    Anonymize(vault, entity_types=["PERSON", "EMAIL_ADDRESS", "EMAIL_ADDRESS_RE", "PHONE_NUMBER"]), 
    Toxicity(), 
    PromptInjection(),
    Secrets()
]

llm_guard_postprocessor = LLMGuardNodePostProcessor(
    scanners=input_scanners,
    fail_fast=False,
    skip_scanners=["Anonymize"],
)

And finally, we can run the query again.

In [13]:
query_engine = index.as_query_engine(
    similarity_top_k=3,
    node_postprocessors=[llm_guard_postprocessor]
)
response = query_engine.query("I am screening candidates for adult caregiving opportunity. Please recommend me an experienced person. Return just a name")
print(str(response))

DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='/Users/asofter/Desktop/Projects/llm-guard-experiments/venv/lib/python3.11/site-packages/certifi/cacert.pem'
load_verify_locations cafile='/Users/asofter/Desktop/Projects/llm-guard-experiments/venv/lib/python3.11/site-packages/certifi/cacert.pem'
DEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.<locals>.parser at 0x125cefc40>, 'json_data': {'input': ['I am screening candidates for adult caregiving opportunity. Please recommend me an experienced person. Return just a name'], 'model': <OpenAIEmbeddingModeModel.TEXT_EMBED_ADA_002: 'text-embedding-ada-002'>, 'encoding_format': 'base64'}}
Request options: {'method': 'post', 'url': '/embeddings', 'files': None, 'post_parser': <function Embeddings.create.<locals>.parser

Let's also check the debug logs.

In [14]:
print(llama_debug.get_llm_inputs_outputs())
llama_debug.flush_event_logs()

[[CBEvent(event_type=<CBEventType.LLM: 'llm'>, payload={<EventPayload.MESSAGES: 'messages'>: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content="You are an expert Q&A system that is trusted around the world.\nAlways answer the query using the provided context information, and not prior knowledge.\nSome rules to follow:\n1. Never directly reference the given context in your answer.\n2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.", additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content="Context information is below.\n---------------------\ntotal_pages: 5\nfile_path: ./resumes.pdf\nsource: 2\n\ntotal_pages: 5\nfile_path: ./resumes.pdf\nsource: 2\n\n[REDACTED_PERSON_1]\n456 Caregiver Road, Caretown, CA 90210\n[REDACTED_PHONE_NUMBER_1] | [REDACTED_EMAIL_ADDRESS_1] | LinkedIn: /jane-smith-caregiver\nObjective:\nCompassionate and skilled Adult and Child Care Professional with over 8 years of expe

Here we can see that no real name was passed to the LLM but only redacted one. However, output parser could deanonymize it.