# Protect your LangChain RAG Apps with ChainGuard

_Authored by: [Lakera](https://huggingface.co/lakera)_

In this tutorial we'll cover how you can protect your Retrieval Augmented Generation (RAG) LangChain apps from [indirect prompt injection](https://www.lakera.ai/blog/guide-to-prompt-injection#direct-prompt-injection-vs-indirect-prompt-injection) with [Lakera Guard](https://lakera.ai/), following the [LangChain RAG Quickstart tutorial](https://python.langchain.com/docs/use_cases/question_answering/quickstart/).

There's also a [video recording of this tutorial available on YouTube](https://youtu.be/MdZ6XnViY3o)


## Dependencies

First we'll start by installing and importing the necessary dependencies


### Installation


In [None]:
%%capture

%pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-openai chromadb bs4 python-dotenv

### Importing


In [None]:
import bs4
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

### Environment Variables

Because this tutorial leverages Lakera Guard and OpenAI, you'll need a [Lakera Guard API key](https://platform.lakera.ai/account/api-keys) and an [OpenAI API key](https://platform.openai.com/api-keys).

You can load them into your environment by creating a `.env` file in the same directory as this notebook or [export them in your local environment](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety#h_a1ab3ba7b2). It is possible, but not recommended to hardcode your API keys directly into a cell in the notebook.

If you're using a `.env` file, it should look like this:

```bash
LAKERA_GUARD_API_KEY="<your-lakera-api-key>"
OPENAI_API_KEY="<your-openai-api-key>"
```


#### Loading Environment Variables


In [None]:
# load LAKERA_GUARD_API_KEY and OPENAI_API_KEY from .env
from dotenv import load_dotenv

load_dotenv()

## Context URL

In order to demonstrate prompt injection, we're going to replace the URL from the LangChain tutorial with a specific URL that points to a page where we've included an example indirect prompt injection payload.

You can swap between the two `CONTEXT_URL` values below to see the difference in the generated responses.


In [None]:
# CONTEXT_URL = "http://lakeraai.github.io/chainguard/demos/benign-demo-page/" # benign page w/o prompt injection
CONTEXT_URL = "http://lakeraai.github.io/chainguard/demos/indirect-prompt-injection/"  # contains indirect prompt injection

**Note**: When you change the URL, you'll need to restart the notebook kernel and run the cells again in order to avoid the previous URL's context being cached in the local [Chroma vector database](https://www.trychroma.com/).


## Vector Database

Now we'll initiatilize our vector database, load the context from our URL, chunk the context, and insert the embeddings into our vector database.


In [None]:
loader = WebBaseLoader(
    web_paths=([CONTEXT_URL]),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()

## Model & Prompt Template

Next, we'll load the [RAG prompt template from the LangChain Hub](https://smith.langchain.com/hub/rlm/rag-prompt) and configure our LLM - in this case we're using `gpt-3.5-turbo`.


In [None]:
prompt = hub.pull("rlm/rag-prompt")

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)


# combine context from the vector database into a single string
# we can concatenate and pass  as the context to the prompt template
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

## Define the Chain

Now we'll define our chain, which will consist of the following steps:

1. **Retrieve**: Retrieve the most relevant context chunks from the vector database
2. **Prompt**: Generate a prompt using the RAG prompt template and the retrieved context chunks
3. **Generate**: Generate a response using the prompt and the LLM
4. **Parse**: Parse the generated response and return the answer to the user's question


In [None]:
rag_chain = (
    # Retrieve
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    # Prompt
    | prompt
    # Generate
    | llm
    # Parse
    | StrOutputParser()
)

## Invoke the Chain

Finally, we'll invoke the chain with a question related to the content from our `CONTEXT_URL` and display the generated response.


In [None]:
rag_chain.invoke("What is Lakera Guard?")

You might notice a link in the response, and if you navigate to our [example indirect prompt injection page](http://lakeraai.github.io/chainguard/demos/indirect-prompt-injection/), you might not see that same link anywhere on the page. Take a minute to explore the page and it's content and see if you can find the indirect injeciton payload.

<details>
<summary>Having trouble finding the paylaod?</summary>
<p>Try selecting all the text on the page with <kbd>CMD</kbd>+<kbd>A</kbd> or <kbd>CTRL</kbd>+<kbd>A</kbd> or inspect the page's source code and see if you can find the hidden text.</p>
</details>


## Inspecting the Chain

In this example, we can see the content in the URL that's causing this issue, but if we were working with many documents for context instead of just one, we might want to be able to see what's going on in real-time as our chain executes and our prompt is constructed.

Let's define a function that will allow us to inspect the chain as it executes and then create a new chain that uses this function to log the chain's progress as it executes.


In [None]:
# we'll use this to expose the content that's flowing through a step in the chain
def chain_inspector(content):
    # output the content that's been passed to this step
    print("Content:")
    print(content)

    # return the content so that the chain can continue exucuting
    return content


# use LangChain's RunnableLambda to wrap the function so that it can be used in the chain
chain_logger = RunnableLambda(chain_inspector)

### Logged Chain

Now we can add a step to the chain anywhere we want to see the content that's being passed from one step to the next.


In [None]:
logged_rag_chain = (
    # inspect the initial input to the chain
    chain_logger
    # Retrieve
    | {"context": retriever | format_docs, "question": RunnablePassthrough()}
    # inspect the context after it's been retrieved
    | chain_logger
    # Prompt
    | prompt
    # inspect the prompt after it's been generated
    | chain_logger
    # Generate
    | llm
    # Parse
    | StrOutputParser()
)

### Invoke the Logged Chain

Now we can invoke our chain with logging and inspect the content as it's passed from one step to the next.


In [None]:
logged_rag_chain.invoke("What is Lakera Guard?")

## Protecting your LangChain Apps with ChainGuard

[ChainGuard](https://lakeraai.github.io/chainguard/) allows you to secure Generative AI applications and agents built with LangChain from prompt injection, jailbreaks, and other risks with Lakera Guard.


### Install ChainGuard


In [None]:
%%capture

%pip install --upgrade lakera-chainguard

### Import ChainGuard

We'll also import the `LakeraGuardError` exception class that ChainGuard will raise when it detects prompt injection.


In [None]:
from lakera_chainguard import LakeraChainGuard, LakeraGuardError

### Initialize ChainGuard


In [None]:
chain_guard = LakeraChainGuard()

### Protecting the Chain


In [None]:
def indirect_prompt_injection_detector(input):
    # detect prompt injections in the RAG context
    chain_guard.detect(input["context"])

    return input


# use LangChain's RunnableLambda to wrap the function so that it can be used in the chain
detect_injections = RunnableLambda(indirect_prompt_injection_detector)

### Guarded Chain

Our guarded chain will have the following steps:

1. **Retrieve**: Retrieve the most relevant context chunks from the vector database
2. **Guard**: Protect the chain from indirect prompt injection with Lakera Guard
3. **Prompt**: Generate a prompt using the RAG prompt template and the retrieved context chunks
4. **Generate**: Generate a response using the prompt and the LLM
5. **Parse**: Parse the generated response and return the answer to the user's question


In [None]:
guarded_rag_chain = (
    # Retrieve
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    # Guard
    | detect_injections
    # Prompt
    | prompt
    # Generate
    | llm
    # Parse
    | StrOutputParser()
)

### Invoke the Guarded Chain

ChainGuard will raise an Exception if it detects any prompt injection in the generated response, so we can just add it as a step in the chain, wrap the invocation in a `try/except`, and catch the `LakeraGuardError` exception.


In [None]:
try:
    response = guarded_rag_chain.invoke("What is Lakera Guard?")

    # Jupyter Notebooks don't output from a `try` block, so we need to directly print the response
    print(response)
except LakeraGuardError as e:
    print(e)
    print(e.lakera_guard_response)

If you'd like to see how the guarded chain still executes when there's no prompt injection, you can try changing the `CONTEXT_URL` to the benign URL, restarting the kernel, and re-running the cells in this notebook.

**Note**: It's important to restart the kernel when you change the `CONTEXT_URL` to clear out the ChromaDB cache or the existing embeddings will still be present in the vector database and might be used when generating the response.


## Learn More About ChainGuard

ChainGuard is the eaisest way to protect your LangChain applications and agents from prompt injection, jailbreaks, and more with [Lakera Guard](https://lakera.ai/). It provides a simple interface for using any of Lakera Guard's detectors and options for customizing how ChainGuard reacts when Guard flags an input, like raising a warning instead of an exception.

There are more examples of how you can use ChainGuard in your Generative AI applications in our [guide to guarding your LangChain apps with Lakera](https://www.lakera.ai/blog/langchain-lakera-guard-integration).

If you'd like to learn more about integrating ChainGuard, the [ChainGuard documentation](https://lakeraai.github.io/chainguard/) includes [tutorials](https://lakeraai.github.io/chainguard/#tutorials), [how-to guides](https://lakeraai.github.io/chainguard/#how-to-guides), and an [API reference](https://lakeraai.github.io/chainguard/reference/).

Want to help protect LangChain apps? [Contribute to ChainGuard](https://github.com/lakeraai/chainguard/blob/main/CONTRIBUTING.md).
