# Pangea AI Security Tools

- Defend against prompt injection.
- Prevent the leakage and exposure of sensitive information, including:
  - Personally Identifiable Information (PII)
  - Protected Health Information (PHI)
  - Financial data
  - Secrets
  - Intellectual property
  - Profanity
- Remove malicious content from input and output, such as IP addresses, domains, and URLs.
- Monitor user inputs and model responses to enable comprehensive threat analysis, auditing, and compliance.

## Prerequisites

### OpenAI API key

The examples below use OpenAI models. To run them, get your [OpenAI API key](https://platform.openai.com/api-keys) and export it as an environment variable:

- `OPENAI_API_KEY`

### Pangea project

Sign up for a free [Pangea account](https://pangea.cloud/signup) to host the security services used in these tools.

After creating your account, click **Skip** on the **Get started with a common service** screen. This will take you to the Pangea User Console, where you can enable the individual services required for the tools.

To learn more about Pangea services and their capabilities, visit the Pangea website:
- [AI Guard](https://pangea.cloud/services/ai-guard/)
- [Redact](https://pangea.cloud/services/redact/)
- [Domain Intel](https://pangea.cloud/services/domain-intel/reputation/)
- [IP Intel](https://pangea.cloud/services/ip-intel/reputation/)
- [URL Intel](https://pangea.cloud/services/url-intel/)
- [Secure Audit Log](https://pangea.cloud/services/secure-audit-log/)

## Installation

In [None]:
%%bash
pip install langchain-community langgraph langchain-openai python-dotenv pangea-sdk==5.2.0b2

## Tools

Run Pangea tools using agents or invoke them as a Runnable within chains.

### AI Guard

#### Enable the AI Guard service

1. Open your [Pangea User Console](https://console.pangea.cloud).  
2. Click **AI Guard** in the left-hand sidebar and follow the prompts, accepting all defaults.  
3. When finished, click **Done** and then **Finish**. The enabled service will be marked with a green dot.
4. On the service **Overview** page, capture the **Default Token** and **Domain** values by clicking their respective tiles. Save these values in the appropriate environment variables:
    - `PANGEA_DOMAIN`
    - `PANGEA_AI_GUARD_TOKEN`

For more information on setting up the underlying service and its usage, visit the [AI Guard documentation](https://pangea.cloud/docs/ai-guard/).

#### Set up the environment

In [5]:
import os
from dotenv import load_dotenv
from pydantic import SecretStr

load_dotenv()

openai_api_key = SecretStr(os.getenv("OPENAI_API_KEY"))
pangea_domain = os.getenv("PANGEA_DOMAIN")
pangea_ai_guard_token = SecretStr(os.getenv("PANGEA_AI_GUARD_TOKEN"))
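
`os.getenv` returns `None` when a variable is unset, so a missing value may only surface later, at the first API call. A minimal sketch of a fail-fast check follows; the `require_env` helper is hypothetical and is not part of the Pangea SDK or LangChain.

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Return the named environment variables, raising if any is unset or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}


# Example (hypothetical usage):
# env = require_env("OPENAI_API_KEY", "PANGEA_DOMAIN", "PANGEA_AI_GUARD_TOKEN")
```

Calling it once, right after `load_dotenv()`, reports every missing name in a single error.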

#### Define the model

In [6]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

model = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=openai_api_key.get_secret_value(), temperature=0)

#### Use AI Guard tool with an agent

The example below demonstrates how the `pangea_llm_response_guard` recipe can be used to defang malicious IP addresses, domains, and URLs that may be included in an LLM response to the user. You can configure the exact defanging and redaction rules by updating the service recipes in your [Pangea User Console](https://console.pangea.cloud/service/ai-guard/recipes).

When creating an instance of AI Guard, apply a recipe to sanitize text. In the following example, use the [LLM Response](https://pangea.cloud/docs/ai-guard/recipes#llm-response) (`pangea_llm_response_guard`) recipe to prevent sensitive or high-risk information from being returned to the user. This recipe can:
- Defang malicious links (e.g., IPs, URLs, domains).
- Redact specific personally identifiable information (PII) and secrets in the response, based on the rules defined in the recipe.

Recipes can be customized by adding, removing, or modifying rules. You can also discover, create, and configure additional recipes in your [Pangea User Console](https://console.pangea.cloud/service/ai-guard/recipes).

In [18]:
from langchain_community.tools.pangea.ai_guard import PangeaAIGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain)
pangea_ai_guard_tool = PangeaAIGuard(token=pangea_ai_guard_token, config=pangea_config, recipe="pangea_llm_response_guard")

The example data mixes ordinary IP addresses with ones listed on the [IPsum Threat Intelligence Feed](https://github.com/stamparm/ipsum). The AI Guard tool defangs the IP addresses it identifies as dangerous, reducing the risk of users inadvertently using them.
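
Defanging rewrites an address so that it stays readable but can no longer be clicked or pasted by accident. The sketch below only illustrates the `[.]` format that appears in the agent's response in this example. It is not how AI Guard decides which addresses are dangerous, and `defang` and `flagged` are hypothetical names.

```python
def defang(address: str) -> str:
    """Replace each dot with [.] so the address cannot be used by accident."""
    return address.replace(".", "[.]")


# Hypothetical input: in the real tool, the service decides which addresses to flag.
flagged = {"47.84.32.175"}
ips = ["47.84.32.175", "34.201.186.27"]
print([defang(ip) if ip in flagged else ip for ip in ips])
# ['47[.]84[.]32[.]175', '34.201.186.27']
```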

The pre-built agent is instructed via a system message to apply the service recipe to the final result. Alternatively, you can create your own agent and implement a more deterministic approach to ensure the LLM's response is thoroughly sanitized by the service.

In [19]:
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def search_tool(data):
    """Call to perform search"""

    return """
    47.84.32.175
    37.44.238.68
    47.84.73.221
    47.236.252.254
    34.201.186.27
    52.89.173.88
    """

tools = [search_tool, pangea_ai_guard_tool]

query = """
Hi, I am Bond, James Bond. I monitor IPs found in MI6 network traffic.
Please find me the most recent ones, you copy?
"""

system_message="Always use AI Guard before your final response to keep it safe for the user."

langgraph_agent_executor = create_react_agent(model, tools, state_modifier=system_message)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

The most recent IPs found in MI6 network traffic are: 47[.]84[.]32[.]175, 37[.]44[.]238[.]68, 47[.]84[.]73[.]221, 47[.]236[.]252[.]254, 34.201.186.27, 52.89.173.88.
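
The more deterministic approach mentioned above can be sketched without an agent: instead of asking the model to call the guard, the application always passes the model's output through it. This is a minimal sketch under assumptions. `guarded` is a hypothetical helper, and the two stand-in functions take the place of a real model call and of a guard such as `pangea_ai_guard_tool.invoke`.

```python
from typing import Callable


def guarded(generate: Callable[[str], str], guard: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a text generator so every response passes through the guard."""

    def respond(prompt: str) -> str:
        # The guard runs unconditionally; the model cannot skip it.
        return guard(generate(prompt))

    return respond


# Stand-ins for illustration only.
def fake_model(prompt: str) -> str:
    return "Recent IP: 47.84.32.175"


def fake_guard(text: str) -> str:
    return text.replace("47.84.32.175", "47[.]84[.]32[.]175")


respond = guarded(fake_model, fake_guard)
print(respond("Which IPs were seen recently?"))
# Recent IP: 47[.]84[.]32[.]175
```

The trade-off is flexibility: the agent can choose when to call a tool, while this wrapper applies the guard on every response regardless of content.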


#### Use AI Guard as a Runnable in chains

In the following example, use the [LLM Prompt Pre-Send](https://pangea.cloud/docs/ai-guard/recipes#llm-prompt-pre-send) (`pangea_llm_prompt_guard`) recipe to prevent sensitive or high-risk information from being submitted to a public LLM, such as ChatGPT. This recipe can:
- Defang malicious links (e.g., IPs, URLs, domains).
- Redact specific personally identifiable information (PII) and secrets in the prompt, based on the rules defined in the recipe.

Recipes can be customized by adding, removing, or modifying rules. You can also discover, create, and configure additional recipes in your [Pangea User Console](https://console.pangea.cloud/service/ai-guard/recipes).

In [20]:
from langchain_community.tools.pangea.ai_guard import PangeaAIGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain)
pangea_ai_guard_tool = PangeaAIGuard(token=pangea_ai_guard_token, config=pangea_config, recipe="pangea_llm_prompt_guard")

The user prompt includes some personally identifiable information.

The chain invokes the AI Guard tool before submitting the user prompt to the LLM.

In [28]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([("human", "{input}")])

query = """
Hi, I am Bond, James Bond. I am looking for a job. Please write me a super short resume.

I am skilled in international espionage, covert operations, and seduction.

Include a contact header:
Email: j.bond@mi6.co.uk
Phone: +44 20 0700 7007
Address: Universal Exports, 85 Albert Embankment, London, United Kingdom
"""

from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | pangea_ai_guard_tool
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

Sure! Here’s a concise resume for you:

---

**[Your Name]**  
Email: <EMAIL_ADDRESS>  
Phone: +44 20 0700 7007  
Address: Universal Exports, 85 *****************, London, United Kingdom  

---

**Objective**  
Dynamic and skilled professional seeking a challenging position in international espionage and covert operations.

---

**Skills**  
- Expertise in international espionage  
- Proficient in covert operations  
- Exceptional skills in seduction and interpersonal communication  

---

**Experience**  
- Conducted high-stakes intelligence operations in various global locations.  
- Developed and maintained covert relationships to gather critical information.  
- Successfully executed missions requiring discretion and strategic planning.  

---

**Education**  
- [Your Degree/Field of Study]  
- [Your University/Institution]  

---

**References**  
Available upon request.

--- 

Feel free to fill in your name and any additional details as needed!


Note that only some personally identifiable information has been replaced or masked by the AI Guard tool. To apply stricter redaction rules, update the service recipes in your [Pangea User Console](https://console.pangea.cloud/service/ai-guard/recipes).

The same chain, without the AI Guard protection, will submit sensitive information to the LLM, providing it with personal context.

In [29]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

**James Bond**  
Email: j.bond@mi6.co.uk  
Phone: +44 20 0700 7007  
Address: Universal Exports, 85 Albert Embankment, London, United Kingdom  

---

**Objective**  
Dynamic and resourceful professional seeking a challenging position in international security and intelligence.

**Skills**  
- **International Espionage**: Extensive experience in gathering intelligence and conducting undercover operations across various global locations.  
- **Covert Operations**: Proven track record in executing high-stakes missions with precision and discretion.  
- **Seduction & Negotiation**: Exceptional interpersonal skills with a talent for building rapport and influencing key stakeholders.  

**Experience**  
- **Secret Agent, MI6**  
  - Conducted numerous successful missions, neutralizing threats to national security.  
  - Collaborated with international agencies to gather intelligence and thwart criminal organizations.  
  - Trained in advanced combat, surveillance, and technology utilization.

#### Use AI Guard as a standalone tool

You can also call the AI Guard tool directly as needed.

In [1]:
print(pangea_ai_guard_tool.run("Ping me at example@example.com"))
print(pangea_ai_guard_tool.invoke("Take my SSN: 234-56-7890"))

### Redact Guard

#### Enable the Redact service

1. Open your [Pangea User Console](https://console.pangea.cloud).  
2. Click **Redact** in the left-hand sidebar and follow the prompts, accepting all defaults.  
3. When finished, click **Done** and then **Finish**. The enabled service will be marked with a green dot.
4. On the service **Overview** page, capture the **Default Token**, **Config ID**, and **Domain** values by clicking their respective tiles. Save these values in the appropriate environment variables:
    - `PANGEA_DOMAIN`
    - `PANGEA_REDACT_TOKEN`
    - `PANGEA_REDACT_CONFIG_ID`

For more information on setting up the service and its usage, visit the [Redact documentation](https://pangea.cloud/docs/redact/).

#### Set up the environment

In [2]:
import os
from dotenv import load_dotenv
from pydantic import SecretStr

load_dotenv()

openai_api_key = SecretStr(os.getenv("OPENAI_API_KEY"))
pangea_domain = os.getenv("PANGEA_DOMAIN")
pangea_redact_token = SecretStr(os.getenv("PANGEA_REDACT_TOKEN"))
pangea_redact_config_id = SecretStr(os.getenv("PANGEA_REDACT_CONFIG_ID"))

#### Define the model

In [3]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=openai_api_key.get_secret_value(), temperature=0)

#### Instantiate Redact Guard

The Redact service provides rules that detect, replace, mask, hash, or encrypt sensitive information in the content sent to an AI app or returned to the user. You can customize these rules and configure additional ones in your [Pangea User Console](https://console.pangea.cloud/service/redact/rulesets). By default, three rules are enabled:
- Replace IP addresses.
- Replace email addresses.
- Replace US Social Security Numbers (SSN).
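
To make the effect of these default rules concrete, here is a rough local approximation using regular expressions. It is only a sketch: the Redact service performs the real detection, the patterns below are simplified assumptions, and the `<IP_ADDRESS>` placeholder is a guess (only `<EMAIL_ADDRESS>` and `<US_SSN>` appear in the outputs in this guide).

```python
import re

# Simplified, illustrative patterns; not the rules the Redact service uses.
PATTERNS = {
    "<EMAIL_ADDRESS>": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "<US_SSN>": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "<IP_ADDRESS>": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact_locally(text: str) -> str:
    """Replace each match of every pattern with its placeholder."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text


print(redact_locally("Ping me at example@example.com"))
# Ping me at <EMAIL_ADDRESS>
print(redact_locally("Take my SSN: 234-56-7890"))
# Take my SSN: <US_SSN>
```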

In [4]:
from langchain_community.tools.pangea.redact_guard import PangeaRedactGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain, config_id=pangea_redact_config_id)
pangea_redact_guard_tool = PangeaRedactGuard(token=pangea_redact_token, config=pangea_config)

#### Define the context data and the user query

In this example, we will emulate a helpful HR assistant trained to return employee records.

In [16]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import SystemMessage
from langchain_core.tools import tool

@tool
def search_tool(data):
    """Call to perform HR record search"""

    return """
    Name: Jason Bourne
    Title: Rogue Operative
    Department: Former CIA Black Ops

    Email: j.bourne@unknown.gov
    Social Security Numbers:
    - 234-56-7890
    - 345-67-8901
    - 456-78-9012

    Hobbies:
    - Traveling
    - Using books and rolled-up newspapers as weapons
    """

query = """
Hi, I am Jason Bourne. What do you have on me?
"""

#### Use Redact Guard as an agent tool

In the following example, Redact Guard removes sensitive information from the data returned by the search tool.

In [17]:
from langgraph.prebuilt import create_react_agent

tools = [search_tool, pangea_redact_guard_tool]

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

Here is the information I found on you, Jason Bourne:

- **Title:** Rogue Operative
- **Department:** Former CIA Black Ops
- **Email:** <EMAIL_ADDRESS>
- **Social Security Numbers:** 
  - <US_SSN>
  - <US_SSN>
  - <US_SSN>
- **Hobbies:**
  - Traveling
  - Using books and rolled-up newspapers as weapons

If you need more specific information or have any other questions, feel free to ask!


Note that only some personally identifiable information has been replaced by the Redact Guard tool. To apply stricter redaction rules, update the Redact rulesets in your [Pangea User Console](https://console.pangea.cloud/service/redact/rulesets).

Omitting the Redact Guard tool exposes the sensitive information to both the LLM and the user.

In [18]:
tools = [search_tool]

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

Here is the information I found on you, Jason Bourne:

- **Title:** Rogue Operative
- **Department:** Former CIA Black Ops
- **Email:** j.bourne@unknown.gov
- **Social Security Numbers:**
  - 234-56-7890
  - 345-67-8901
  - 456-78-9012
- **Hobbies:**
  - Traveling
  - Using books and rolled-up newspapers as weapons

If you need more specific information or have any other questions, feel free to ask!


#### Use Redact Guard in a chain

In the following example, Redact Guard will remove sensitive information from the additional context added to the user's query by a RAG system.

In [19]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompt_values import ChatPromptValue

def rag(input: ChatPromptValue) -> ChatPromptValue:
    """
    Emulates a Retrieval-Augmented Generation (RAG) process by appending an employee's HR record to the context of a chain.
    """

    messages = input.to_messages()
    # Call the tool through invoke(); calling a tool object directly is deprecated.
    message = SystemMessage(search_tool.invoke(query))
    messages.append(message)

    return ChatPromptValue(messages=messages)

# Define a chat prompt template for an HR assistant chain.
prompt = ChatPromptTemplate.from_messages([
  ("human", "{input}"),
  (
    "system", """
    You are an HR assistant.
    Show employees their HR records.
    Don't change anything, just read it back to them.
    Don't hide any sensitive info, it is obfuscated automatically before you receive it.
    """
  )
])

# Define a chain to retrieve relevant information from a RAG system,
# redact sensitive information using Pangea's Redact Guard,
# and respond to the user with the augmented content.
chain = (
  prompt
  | rag
  | pangea_redact_guard_tool
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

Here are your HR records, Jason:

- **Name:** Jason Bourne
- **Title:** Rogue Operative
- **Department:** Former CIA Black Ops

- **Email:** <EMAIL_ADDRESS>
- **Social Security Numbers:** 
  - <US_SSN>
  - <US_SSN>
  - <US_SSN>

- **Hobbies:**
  - Traveling
  - Using books and rolled-up newspapers as weapons

If you need any further information or assistance, feel free to ask!


The same chain, without the Redact Guard protection, will submit sensitive information to the LLM and potentially return it to the user.

In [20]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

Here are your HR records, Jason:

- **Name:** Jason Bourne
- **Title:** Rogue Operative
- **Department:** Former CIA Black Ops

- **Email:** j.bourne@unknown.gov
- **Social Security Numbers:** 
  - 234-56-7890
  - 345-67-8901
  - 456-78-9012

- **Hobbies:**
  - Traveling
  - Using books and rolled-up newspapers as weapons

If you need any further information or assistance, feel free to ask!


#### Use Redact Guard as a standalone tool

You can also call the Redact Guard tool directly as needed.

In [21]:
print(pangea_redact_guard_tool.run("Ping me at example@example.com"))
print(pangea_redact_guard_tool.invoke("Take my SSN: 234-56-7890"))

Ping me at <EMAIL_ADDRESS>
Take my SSN: <US_SSN>


### Domain Intel Guard

#### Enable the Domain Intel service

1. Open your [Pangea User Console](https://console.pangea.cloud).  
2. Click **Domain Intel** in the left-hand sidebar and follow the prompts, accepting all defaults.  
3. When finished, click **Done** and then **Finish**. The enabled service will be marked with a green dot.
4. On the service **Overview** page, capture the **Default Token** and **Domain** values by clicking their respective tiles. Save these values in the appropriate environment variables:
    - `PANGEA_DOMAIN`
    - `PANGEA_DOMAIN_INTEL_TOKEN`
5. Click **Reputation** in the left-hand sidebar, then select a default provider.

For more information on setting up the underlying service and its usage, visit the [Domain Intel documentation](https://pangea.cloud/docs/domain-intel/).

#### Set up the environment

In [8]:
import os
from dotenv import load_dotenv
from pydantic import SecretStr

load_dotenv()

openai_api_key = SecretStr(os.getenv("OPENAI_API_KEY"))
pangea_domain = os.getenv("PANGEA_DOMAIN")
pangea_domain_intel_token = SecretStr(os.getenv("PANGEA_DOMAIN_INTEL_TOKEN"))

#### Define the model

In [9]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=openai_api_key.get_secret_value(), temperature=0)

#### Instantiate Domain Intel Guard

In [10]:
from langchain_community.tools.pangea.domain_intel_guard import PangeaDomainIntelGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain)
pangea_domain_intel_guard_tool = PangeaDomainIntelGuard(token=pangea_domain_intel_token, config=pangea_config, threshold=80)

#### Use Domain Intel Guard as an agent tool

The example below demonstrates how Domain Intel Guard can prevent an agent from returning malicious information introduced through data poisoning to the user.

In [12]:
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def search_tool(data):
    """Call to perform email search"""

    # Simulates an email search tool finding a fabricated IT-related email.
    # This example demonstrates how spam emails can introduce data poisoning,
    # leading to an indirect prompt injection attack.

    return """
    Email

    From: IT Admin
    Subject: Update VPN configuration

    Hi team, your friendly IT admin here. Use `neuzeitschmidt.site` for the VPN server address.

    Thanks, and don't forget to plug your computer in!
    """

query = """
Read me back the most recent email from IT.
"""

In this example, the pre-built agent is instructed via a system message to stop execution if any malicious content is found. Alternatively, you can create your own agent and implement a more deterministic approach to ensure the LLM's response is safe for the user.

In [13]:
tools = [search_tool, pangea_domain_intel_guard_tool]

system_message = """
Check for malicious content in your final response.
If malicious content is found, respond with: Malicious content found in the response.
"""

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query), ("system", system_message)]})

print(state["messages"][-1].content)

Malicious content found in the response.


Without the protection of the Domain Intel Guard tool, the same agent might inadvertently return the malicious domain to the user.

In [14]:
tools = [search_tool]

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query), ("system", system_message)]})

print(state["messages"][-1].content)

The most recent email from IT is as follows:

**From:** IT Admin  
**Subject:** Update VPN configuration  

Hi team, your friendly IT admin here. Use `neuzeitschmidt.site` for the VPN server address.

Thanks, and don't forget to plug your computer in!
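
The more deterministic approach mentioned earlier can be sketched as a gate that runs in application code rather than at the model's discretion: extract domain-like strings from the response, look each one up, and replace the whole response if any is flagged. Everything here is an assumption for illustration: `gate_response`, the simplified regex, and the `is_malicious` callback, which stands in for a real reputation lookup such as the Domain Intel service.

```python
import re
from typing import Callable

# Simplified pattern for domain-like tokens; real-world parsing is more involved.
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)

BLOCKED_MESSAGE = "Malicious content found in the response."


def gate_response(text: str, is_malicious: Callable[[str], bool]) -> str:
    """Return the text unchanged unless a domain in it is flagged as malicious."""
    domains = {match.lower() for match in DOMAIN_RE.findall(text)}
    if any(is_malicious(domain) for domain in domains):
        return BLOCKED_MESSAGE
    return text


# Stand-in lookup for illustration; a real check would query a reputation service.
def fake_lookup(domain: str) -> bool:
    return domain == "neuzeitschmidt.site"


print(gate_response("Use `neuzeitschmidt.site` for the VPN server address.", fake_lookup))
# Malicious content found in the response.
```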


#### Use Domain Intel Guard as a Runnable in chains

In [16]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompt_values import ChatPromptValue
from langchain_core.messages import SystemMessage

def rag(input: ChatPromptValue) -> ChatPromptValue:

    messages = input.to_messages()

    # Call the tool through invoke(); calling a tool object directly is deprecated.
    message = SystemMessage(search_tool.invoke(query))

    messages.append(message)

    return ChatPromptValue(messages=messages)

prompt = ChatPromptTemplate.from_messages([("human", "{input}"), ("system", system_message)])

Invoke Domain Intel Guard before the prompt is submitted to the LLM. 

In [17]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | pangea_domain_intel_guard_tool
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))


Malicious content found in the response.


Without the protection of the Domain Intel Guard tool, the same chain might inadvertently return the malicious domain to the user.

In [18]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

The most recent email from IT is as follows:

**From:** IT Admin  
**Subject:** Update VPN configuration  

Hi team, your friendly IT admin here. Use `neuzeitschmidt.site` for the VPN server address.

Thanks, and don't forget to plug your computer in!


#### Use Domain Intel Guard as a standalone tool

You can also call the Domain Intel Guard tool directly as needed.

In [10]:
print(pangea_domain_intel_guard_tool.run("neuzeitschmidt.site"))
print(pangea_domain_intel_guard_tool.invoke("neuzeitschmidt.site"))

Malicious domains found in the provided input.
Malicious domains found in the provided input.


### IP Intel Guard

#### Enable the IP Intel service

1. Open your [Pangea User Console](https://console.pangea.cloud).  
2. Click **IP Intel** in the left-hand sidebar and follow the prompts, accepting all defaults.  
3. When finished, click **Done** and then **Finish**. The enabled service will be marked with a green dot.
4. On the service **Overview** page, capture the **Default Token** and **Domain** values by clicking their respective tiles. Save these values in the appropriate environment variables:
    - `PANGEA_DOMAIN`
    - `PANGEA_IP_INTEL_TOKEN`
5. Click **Reputation** in the left-hand sidebar, then select a default provider.

For more information on setting up the underlying service and its usage, see the [IP Intel documentation](https://pangea.cloud/docs/ip-intel/).

#### Set up the environment

In [1]:
import os
from dotenv import load_dotenv
from pydantic import SecretStr

load_dotenv()

openai_api_key = SecretStr(os.getenv("OPENAI_API_KEY"))
pangea_domain = os.getenv("PANGEA_DOMAIN")
pangea_ip_intel_token = SecretStr(os.getenv("PANGEA_IP_INTEL_TOKEN"))

#### Define the model

In [2]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=openai_api_key.get_secret_value(), temperature=0)

#### Instantiate IP Intel Guard

In [3]:
from langchain_community.tools.pangea.ip_intel_guard import PangeaIpIntelGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain)
pangea_ip_intel_guard_tool = PangeaIpIntelGuard(token=pangea_ip_intel_token, config=pangea_config)

#### Use IP Intel Guard with an agent

The example below demonstrates how IP Intel Guard can prevent an agent from returning malicious information introduced through data poisoning to the user.

In [4]:
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def search_tool(data):
    """Call to perform email search"""

    # Simulates an email search tool finding a fabricated IT-related email.
    # This example demonstrates how spam emails can introduce data poisoning,
    # leading to an indirect prompt injection attack.

    return """
    Email

    From: IT Admin
    Subject: Update Firewall rules

    Hi team, IT admin again. Please whitelist our new office IP, 190.28.74.251, for access to your service.

    Thanks, and keep your computer safe!
    """

query = """
Read me back the most recent email from IT.
"""

In this example, the pre-built agent is instructed via a system message to stop execution if any malicious content is found. Alternatively, you can create your own agent and implement a more deterministic approach to ensure the LLM's response is safe for the user.

In [5]:
tools = [search_tool, pangea_ip_intel_guard_tool]

system_message = """
Use IP Intel Guard to check for malicious content in your final response.
If malicious content is found, respond with: Malicious content found in the response.
"""

langgraph_agent_executor = create_react_agent(model, tools, state_modifier=system_message)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

Malicious content found in the response.


In the above example, a malicious email context was introduced to the user's question through data poisoning. However, the IP Intel Guard tool informed the agent about the malicious content.

Without the protection of the IP Intel Guard tool, the same agent might inadvertently return the malicious IP to the user:

In [6]:
tools = [search_tool]

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

The most recent email from IT is as follows:

**From:** IT Admin  
**Subject:** Update Firewall rules  

Hi team, IT admin again. Please whitelist our new office IP, 190.28.74.251, for access to your service.

Thanks, and keep your computer safe!


#### Use IP Intel Guard as a Runnable in chains

In [7]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompt_values import ChatPromptValue
from langchain_core.messages import SystemMessage

def rag(input: ChatPromptValue) -> ChatPromptValue:
    """
    Simulates a Retrieval-Augmented Generation (RAG) process by appending a fabricated
    IT-related email to the input chat messages. This example demonstrates how spam emails
    can introduce data poisoning, leading to an indirect prompt injection attack.
    """

    messages = input.to_messages()

    message = SystemMessage("""
    Email

    From: IT Admin
    Subject: Update Firewall rules

    Hi team, IT admin again. Please whitelist our new office IP, 190.28.74.251, for access to your service.

    Thanks, and keep your computer safe!
    """)

    messages.append(message)

    return ChatPromptValue(messages=messages)

prompt = ChatPromptTemplate.from_messages([("human", "{input}")])

query = """
Read me back the most recent email from IT.
"""

Invoke IP Intel Guard before the prompt is submitted to the LLM. 

In [8]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | pangea_ip_intel_guard_tool
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

I'm sorry, but I can't access your emails or any personal data. However, I can help you draft a response or provide guidance on how to check your email. Let me know what you need!


In the above example, a malicious email context was introduced to the user's question through data poisoning. However, the IP Intel Guard tool blocked the content containing the harmful IP, rendering the LLM's response harmless.

Without the protection of the IP Intel Guard tool, the same chain might inadvertently return the malicious IP to the user:

In [9]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

The most recent email from IT is as follows:

**From:** IT Admin  
**Subject:** Update Firewall rules  

Hi team, IT admin again. Please whitelist our new office IP, 190.28.74.251, for access to your service.

Thanks, and keep your computer safe!


#### Use IP Intel Guard as a standalone tool

You can also call the IP Intel Guard tool directly as needed.

In [10]:
print(pangea_ip_intel_guard_tool.run("190.28.74.251"))
print(pangea_ip_intel_guard_tool.invoke("190.28.74.251"))

Malicious IPs found in the provided input.
Malicious IPs found in the provided input.
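
Before an IP address can be checked, it has to be found in free text. As a rough sketch of that first step (not Pangea's implementation), the standard-library `ipaddress` module can validate candidates picked out by a loose regex. `extract_ipv4` is a hypothetical helper name.

```python
import ipaddress
import re

# Loose candidate pattern; ipaddress does the actual validation.
CANDIDATE_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def extract_ipv4(text: str) -> list[str]:
    """Return the valid IPv4 addresses in text, in order of appearance."""
    found = []
    for candidate in CANDIDATE_RE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 looks like an IP but is out of range
        found.append(candidate)
    return found


print(extract_ipv4("Please whitelist our new office IP, 190.28.74.251, for access."))
# ['190.28.74.251']
```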


### URL Intel Guard

#### Enable the URL Intel service

1. Open your [Pangea User Console](https://console.pangea.cloud).  
2. Click **URL Intel** in the left-hand sidebar and follow the prompts, accepting all defaults.  
3. When finished, click **Done** and then **Finish**. The enabled service will be marked with a green dot.
4. On the service **Overview** page, capture the **Default Token** and **Domain** values by clicking their respective tiles. Save these values in the appropriate environment variables:
    - `PANGEA_DOMAIN`
    - `PANGEA_URL_INTEL_TOKEN`

For more information on setting up the underlying service and its usage, see the [URL Intel documentation](https://pangea.cloud/docs/url-intel/).

#### Set up the environment

In [14]:
import os
from dotenv import load_dotenv
from pydantic import SecretStr

load_dotenv()

openai_api_key = SecretStr(os.getenv("OPENAI_API_KEY"))
pangea_domain = os.getenv("PANGEA_DOMAIN")
pangea_url_intel_token = SecretStr(os.getenv("PANGEA_URL_INTEL_TOKEN"))

#### Define the model

In [16]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=openai_api_key.get_secret_value(), temperature=0)

#### Instantiate URL Intel Guard

In [17]:
from langchain_community.tools.pangea.url_intel_guard import PangeaUrlIntelGuard, PangeaConfig

pangea_config = PangeaConfig(domain=pangea_domain)
pangea_url_intel_guard_tool = PangeaUrlIntelGuard(token=pangea_url_intel_token, config=pangea_config)

#### Use URL Intel Guard as an agent tool

The example below demonstrates how URL Intel Guard can prevent an agent from returning malicious information introduced through data poisoning or indirect prompt injection to the user.

In [19]:
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def search_tool(data: str) -> str:
    """Call to perform email search"""

    # Simulates a retrieval step by returning a fabricated Marketing email that
    # contains a malicious link. This example demonstrates how spam emails can
    # introduce data poisoning, leading to an indirect prompt injection attack.

    return """
    Email

    From: Marketing
    Subject: Product hunt

    Hi, everyone. We are doing a gift card raffle for those who upvote us on product hunt TODAY!

    Click here to upvote: http://113.235.101.11:54384

    Your participation is appreciated!
    """

query = """
Read me back the most recent email from Marketing.
"""

In this example, the pre-built agent is instructed via a system message to stop execution if any malicious content is found. Alternatively, you can create your own agent and implement a more deterministic approach to ensure the LLM's response is safe for the user.

In [23]:
tools = [search_tool, pangea_url_intel_guard_tool]

system_message = """
Use URL Intel Guard to check for malicious content in your final response.
If malicious content is found, respond with: Malicious content found in the response.
"""

langgraph_agent_executor = create_react_agent(model, tools, state_modifier=system_message)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

Malicious content found in the response.


In the above example, the search tool returns a malicious email to the agent. However, the URL Intel Guard tool flagged the malicious content, so the agent responded with the warning instead of the email.

Without the protection of the URL Intel Guard tool, the same agent might inadvertently return the malicious link to the user:

In [24]:
tools = [search_tool]

langgraph_agent_executor = create_react_agent(model, tools)

state = langgraph_agent_executor.invoke({"messages": [("human", query)]})

print(state["messages"][-1].content)

The most recent email from Marketing is as follows:

**From:** Marketing  
**Subject:** Product hunt  

Hi, everyone. We are doing a gift card raffle for those who upvote us on Product Hunt TODAY!

Click here to upvote: [http://113.235.101.11:54384](http://113.235.101.11:54384)

Your participation is appreciated!
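
The agent example above relies on the model following its system message. As a sketch of the more deterministic approach mentioned earlier, the final response can be passed through the guard in plain Python before it is shown to the user. The `guard_response` helper and `BLOCKED_MESSAGE` constant are hypothetical names. The check assumes that a flagged input produces a message starting with "Malicious", as in the outputs shown in this document; the guard's result for clean input is not shown here, so the helper only looks for that prefix.

```python
from typing import Callable

BLOCKED_MESSAGE = "Malicious content found in the response."


def guard_response(response: str, check: Callable[[str], str]) -> str:
    """Return the response unchanged unless the checker flags it as malicious."""
    verdict = check(response)
    # Assumption: a flagged input yields a message starting with "Malicious",
    # matching the guard outputs printed in this document.
    if verdict.strip().startswith("Malicious"):
        return BLOCKED_MESSAGE
    return response


# Hypothetical usage with the objects defined above:
# state = langgraph_agent_executor.invoke({"messages": [("human", query)]})
# print(guard_response(state["messages"][-1].content, pangea_url_intel_guard_tool.invoke))
```

Because the check runs outside the LLM, the response is blocked whenever the guard flags it, regardless of whether the agent chose to call the tool.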


#### Use URL Intel Guard as a Runnable in a chain

In [28]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompt_values import ChatPromptValue
from langchain_core.messages import SystemMessage

def rag(input: ChatPromptValue) -> ChatPromptValue:
    """
    Simulates a Retrieval-Augmented Generation (RAG) process by appending a fabricated
    Marketing email to the input chat messages. This example demonstrates how spam emails
    can introduce data poisoning, leading to an indirect prompt injection attack.
    """

    messages = input.to_messages()

    # Call the tool through the Runnable interface rather than calling the tool object directly
    message = SystemMessage(search_tool.invoke(query))

    messages.append(message)

    return ChatPromptValue(messages=messages)

prompt = ChatPromptTemplate.from_messages([("human", "{input}")])

query = """
Read me back the most recent Marketing email.
"""
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | pangea_url_intel_guard_tool
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

I'm sorry, but I can't access or read specific emails or external content. However, I can help you draft a marketing email or provide tips on effective email marketing strategies. Let me know how you'd like to proceed!


In the above example, a malicious email was added as context to the user's question. However, the URL Intel Guard tool blocked the content containing the harmful URL before it reached the model, rendering the LLM's response harmless.

Without the protection of the URL Intel Guard tool, the same chain might inadvertently return the malicious link to the user.

In [5]:
from langchain_core.output_parsers import StrOutputParser

chain = (
  prompt
  | rag
  | model
  | StrOutputParser()
)

print(chain.invoke({"input": query}))

Here's the most recent marketing email:

---

**From:** Marketing  
**Subject:** Product Hunt

Hi, everyone. We are doing a gift card raffle for those who upvote us on Product Hunt TODAY!

Click here to upvote: [http://113.235.101.11:54384](http://113.235.101.11:54384)

Your participation is appreciated!

--- 

Let me know if you need anything else!


#### Use URL Intel Guard as a standalone tool

You can also call the URL Intel Guard tool directly as needed.

In [10]:
print(pangea_url_intel_guard_tool.run("http://113.235.101.11:54384"))
print(pangea_url_intel_guard_tool.invoke("http://113.235.101.11:54384"))

Malicious URLs found in the provided input.
Malicious URLs found in the provided input.
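
When several inputs need checking, the direct call can be wrapped in a small loop. This is a minimal sketch; `find_flagged` is a hypothetical helper, and it assumes that flagged results start with "Malicious", as in the output above.

```python
from typing import Callable, Iterable, List


def find_flagged(texts: Iterable[str], check: Callable[[str], str]) -> List[str]:
    """Return the inputs for which the checker reports malicious content."""
    flagged = []
    for text in texts:
        # Assumption: flagged results start with "Malicious", as in the output above.
        if check(text).strip().startswith("Malicious"):
            flagged.append(text)
    return flagged


# Hypothetical usage with the tool instantiated earlier:
# find_flagged(["http://113.235.101.11:54384"], pangea_url_intel_guard_tool.invoke)
```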
