## LangChain application architecture

The example application will accept a user prompt and answer the query based on additional context from simulated enterprise data.

The secured LangChain application will use Pangea's AuthN service to authenticate users in their browser via its hosted login flow, providing a simple and secure way to add login functionality to your application.

Once authenticated, the user's identity will be used to retrieve the authorization policy associated with their login, as defined in Pangea's AuthZ service.

> You can integrate each service into your application with simple setup steps and a few lines of code, which will be covered later in this tutorial.

The application will check which resources the user is allowed to access before adding them to the context of the prompt submitted to the LLM. This will enable users to ask questions related to enterprise data while ensuring secure access. 

<figure>
  <img
    alt="Diagram illustrating prompt & response exchange between user, Generative AI app, Pangea services, and LLM within a RAG application with IAM implemented."
    title="Prompt & response diagram in a RAG application with identity and access management"
    src="./img/rag-iam-prompt-response-sequence-diagram.png"
    width="728"
  />
  <figcaption>Prompt & response diagram in a RAG application with identity and access management</figcaption>
</figure>
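
The ordering described above (decide what the user may read first, then build the prompt) can be sketched in plain Python. This is a minimal illustration, not the tutorial's implementation: `answer_with_access_control`, `can_read`, and `ask_llm` are hypothetical stand-ins for the roles played by AuthZ and the LLM.

```python
from typing import Callable

# Hypothetical document shape: {"resource_type": ..., "text": ...}
Document = dict[str, str]


def answer_with_access_control(
    username: str,
    question: str,
    candidates: list[Document],
    can_read: Callable[[str, str], bool],
    ask_llm: Callable[[str], str],
) -> str:
    # 1. Keep only the documents this user is permitted to read.
    permitted = [d for d in candidates if can_read(username, d["resource_type"])]
    # 2. Build the prompt from the permitted context only.
    context = "\n\n".join(d["text"] for d in permitted)
    prompt = f"Context: {context}\nQuestion: {question}"
    # 3. Only now send the prompt to the model.
    return ask_llm(prompt)


# Toy data and stand-ins (assumptions for illustration).
candidates = [
    {"resource_type": "finance", "text": "Tea Leaves cost 50 weekly."},
    {"resource_type": "engineering", "text": "Patent pending for a tune."},
]
allowed = {("alice", "finance")}
prompt_seen_by_llm = answer_with_access_control(
    "alice",
    "What is our biggest expense?",
    candidates,
    can_read=lambda user, rtype: (user, rtype) in allowed,
    ask_llm=lambda prompt: prompt,  # echo the prompt instead of calling a model
)
print(prompt_seen_by_llm)
```

Because filtering happens before the prompt is assembled, restricted text never reaches the model, so it cannot leak into an answer.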

## Build your secure LangChain app

By following the steps below, you will build a simple RAG app protected with Pangea's security services.

### Create a context-based chain for answering questions

The code below defines a simple chain that adds extra information to the context of the user prompt before generating a response.

In [13]:
import os
from dotenv import load_dotenv
from langchain_community.chat_models.cloudflare_workersai import ChatCloudflareWorkersAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain

load_dotenv()

model = ChatCloudflareWorkersAI(
    account_id=os.getenv("CLOUDFLARE_ACCOUNT_ID"),
    api_token=os.getenv("CLOUDFLARE_API_KEY"),
    model="@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    temperature=0.0,
)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant answering questions based on the provided context: {context}. Be concise. If you don't find relevant information, say: I can't tell you."),
    ("human", "Question: {input}"),
])

qa_chain = create_stuff_documents_chain(model, prompt)

The [create_stuff_documents_chain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.stuff.create_stuff_documents_chain.html#langchain.chains.combine_documents.stuff.create_stuff_documents_chain) function returns a runnable that accepts a `context` argument, which should be a list of document objects. You can manually build this context, for example:

In [14]:
from langchain_core.documents import Document

docs = [
    Document(page_content="""
    To enter Wonderland, follow the White Rabbit down the rabbit hole.
    """)
]


You can now test your enhanced chain with the added context.

In [15]:
print(qa_chain.invoke({"input": "Which entrance should I use?", "context": docs}))

2025-01-22 22:08:20 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': "role: system, content: You are a helpful assistant answering questions based on the provided context: \n    To enter Wonderland, follow the White Rabbit down the rabbit hole.\n    . Be concise. If you don't find relevant information, say: I can't tell you.\nrole: user, content: Question: Which entrance should I use?", 'tools': None}


{'result': {'response': 'To enter Wonderland, follow the White Rabbit down the rabbit hole.'}, 'success': True, 'errors': [], 'messages': []}
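
To make the "stuffing" step less opaque, here is a minimal sketch of what happens to the `context` list before the prompt is sent. It assumes the document texts are simply joined with blank lines and substituted into the `{context}` placeholder, which matches the logged prompts in this tutorial. `Doc`, `stuff_context`, and `SYSTEM_TEMPLATE` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (illustration only)."""

    page_content: str


def stuff_context(docs: list[Doc], separator: str = "\n\n") -> str:
    # Concatenate the text of every document into one context string.
    return separator.join(doc.page_content for doc in docs)


SYSTEM_TEMPLATE = "Answer based on the provided context: {context}. Be concise."

example_docs = [Doc("Follow the White Rabbit."), Doc("Mind the rabbit hole.")]
system_message = SYSTEM_TEMPLATE.format(context=stuff_context(example_docs))
print(system_message)
```

Since every document is pasted into the prompt verbatim, whatever ends up in `context` is visible to the model. This is why access control later in the tutorial is applied before this step.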


### Store context in a vector store

To efficiently search for relevant information within a large volume of application-specific data based on its semantic meaning, the data must be embedded and represented as vectors.

In the following example, we obtain context data from the local file system, organized by resource type, represented by folder names. We use the BGE model from Cloudflare Workers AI to embed this data and store the embeddings in a FAISS vector database. To maintain clear security boundaries, each vector is labeled with its corresponding resource type in the embedding metadata.
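
As a rough illustration of semantic search over labeled vectors, the sketch below ranks a toy index by cosine similarity. The three-dimensional vectors and the `index` and `search` names are made up for this example. Real embeddings have hundreds of dimensions, and a vector store such as FAISS may use a different metric, such as Euclidean distance.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", each labeled with its resource type.
index = [
    {"vector": [0.9, 0.1, 0.0], "metadata": {"resource_type": "engineering"}},
    {"vector": [0.1, 0.9, 0.0], "metadata": {"resource_type": "finance"}},
    {"vector": [0.0, 0.1, 0.9], "metadata": {"resource_type": "public"}},
]


def search(query_vector: list[float], k: int = 1) -> list[dict]:
    # Rank all entries by similarity to the query and keep the top k.
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vector, entry["vector"]),
        reverse=True,
    )
    return ranked[:k]


best = search([0.2, 0.8, 0.1])[0]
print(best["metadata"]["resource_type"])  # finance
```

The metadata travels with each vector, so a search hit always carries the label needed for a later permission check.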

The example data is organized in the `data` folder within this repository:

```bash
├── engineering
│   └── patents.md
├── finance
│   └── expenses.md
└── public
    └── public-events-coc.md
```

#### Embed data

Run the following code to load text content from the `data` folder and embed it into a vector store.

In [16]:
from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings.cloudflare_workersai import CloudflareWorkersAIEmbeddings

# Use the current working directory
data_path = os.path.join(os.getcwd(), "data")
docs_loader = DirectoryLoader(data_path, show_progress=True)
docs = docs_loader.load()

for doc in docs:
    assert doc.metadata["source"]
    doc.metadata["resource_type"] = os.path.basename(os.path.dirname(doc.metadata["source"]))

text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=20)
text_splits = text_splitter.split_documents(docs)

embeddings = CloudflareWorkersAIEmbeddings(
    account_id=os.getenv("CLOUDFLARE_ACCOUNT_ID"), 
    api_token=os.getenv("CLOUDFLARE_API_KEY"), 
    model_name="@cf/baai/bge-base-en-v1.5",
)
vectorstore = FAISS.from_documents(documents=text_splits, embedding=embeddings)

100%|██████████| 3/3 [00:00<00:00, 123.40it/s]


You can check which resource types were saved in the metadata.

In [17]:
print("\n".join([str(x.metadata["resource_type"]) for x in docs]))

public
engineering
finance
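
The labeling relies on the parent folder name of each source file. The following sketch isolates that step. The helper name `resource_type_from_source` and the `/app/data/...` paths are assumptions for illustration.

```python
import os


def resource_type_from_source(source: str) -> str:
    # The name of the file's parent folder doubles as the resource type label.
    return os.path.basename(os.path.dirname(source))


# Hypothetical absolute paths; only the parent folder name matters.
for source in [
    "/app/data/engineering/patents.md",
    "/app/data/finance/expenses.md",
    "/app/data/public/public-events-coc.md",
]:
    print(resource_type_from_source(source))
```

Note that a file placed directly in the top-level folder would be labeled with that folder's name (`data` in this layout), so it is worth keeping every file inside a resource-type subfolder.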


### Create a retrieval chain

Now, you can use the vector store to retrieve context relevant to the user's query. Use the [create_retrieval_chain](https://api.python.langchain.com/en/latest/chains/langchain.chains.retrieval.create_retrieval_chain.html) function to build a retrieval chain. When invoked, this chain will pass the additional context to the question-answering chain and return an object containing the original input, context, and answer. 
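
Conceptually, the retrieval chain composes two steps and merges their results. The sketch below mimics that shape with plain functions. `make_retrieval_chain` and its lambdas are hypothetical stand-ins, not LangChain's implementation.

```python
from typing import Callable


def make_retrieval_chain(
    retrieve: Callable[[str], list[str]],
    answer: Callable[[str, list[str]], str],
) -> Callable[[dict], dict]:
    def invoke(inputs: dict) -> dict:
        # 1. Fetch context relevant to the question.
        context = retrieve(inputs["input"])
        # 2. Ask the question-answering step, passing the context along.
        result = answer(inputs["input"], context)
        # 3. Return the original input together with the context and answer.
        return {**inputs, "context": context, "answer": result}

    return invoke


# Hypothetical stand-ins for the retriever and the question-answering chain.
chain = make_retrieval_chain(
    retrieve=lambda question: ["No cats allowed."],
    answer=lambda question, ctx: f"Based on {len(ctx)} chunk(s): {ctx[0]}",
)
result = chain({"input": "Can I bring a pet?"})
print(sorted(result))  # ['answer', 'context', 'input']
```

Because the retriever is a separate, swappable step, access control can be added by replacing only the retriever and leaving the question-answering chain untouched.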

Enable the question-answering chain to run with additional context from the vector store using the following code.

In [18]:
from langchain.chains import create_retrieval_chain

retriever = vectorstore.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, qa_chain)

At this point, there are no restrictions on the context data that can be added to the user's prompt. To verify this, let's submit questions that draw information from different resources.

In [19]:
def get_answers():
    # Each question targets a different resource type.
    questions = [
        "Do we have any inventions that don't have a patent yet?",
        "Can I bring a pet to public events?",
        "What is our biggest cumulative expense?",
    ]

    for question in questions:
        retrieval = retrieval_chain.invoke({"input": question})
        print(f"\nQuestion: {question}\nAnswer: {retrieval['answer']}")

get_answers()

2025-01-22 22:09:56 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': "role: system, content: You are a helpful assistant answering questions based on the provided context: Patents\n\nWhite Knight's Inventions\n\nInvention Name Patent Number Rain-Proof Deal Box Patent #11223 Tears-bringing tune Pending Portable Bee-Hive Patent #55678 Mobile Mouse Traps Patent #44556\n\nExpenses\n\nTea Party\n\nExpense Category Cost Frequency Tea Leaves £50 weekly Cups and Saucers £20 monthly Table Maintenance £15 weekly Sugar Cubes £5 daily Butter for Watches £10 monthly Nap Cushions £40 annually Wine £0 not served\n\nNo Cats Allowed - Any cats, particularly the Cheshire Cat, are strictly forbidden as they disrupt gameplay with their vanishing and reappearing antics.. Be concise. If you don't find relevant information, say: I can't tell you.\nrole: user, content: Question: Do we have any inventions that don't have a patent yet?", 'tools': None}



Question: Do we have any inventions that don't have a patent yet?
Answer: {'result': {'response': 'Yes, according to the provided context, the "Tears-bringing tune" invention is currently pending and does not have a patent number assigned to it yet.'}, 'success': True, 'errors': [], 'messages': []}


2025-01-22 22:09:59 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': "role: system, content: You are a helpful assistant answering questions based on the provided context: No Cats Allowed - Any cats, particularly the Cheshire Cat, are strictly forbidden as they disrupt gameplay with their vanishing and reappearing antics.\n\nCode of Conduct for Public Events\n\nCroquet-Ground Rules\n\nThe Queen Always Wins - It's the Queen's game.\n\nThe Queen Never Loses - If the Queen loses, see Rule #1.\n\nUse Flamingos as Mallets - No exceptions. Flamingos must be treated properly; if they refuse, it's your fault.\n\nHedgehogs as Balls - Hedgehogs are to be rolled gently, or else.\n\nPatents\n\nWhite Knight's Inventions\n\nInvention Name Patent Number Rain-Proof Deal Box Patent #11223 Tears-bringing tune Pending Portable Bee-Hive Patent #55678 Mobile Mouse Traps Patent #44556. Be concise. If you don't find relevant information, say: I can't tell you.\nrole: user, content: Question: Can 


Question: Can I bring a pet to public events?
Answer: {'result': {'response': "No, you cannot bring cats, specifically the Cheshire Cat, to public events as they are strictly forbidden due to their disruptive behavior. However, the rules do not explicitly mention other types of pets, but it's best to exercise caution and avoid bringing any pets that might disrupt the gameplay, especially if it involves the Queen's game of croquet."}, 'success': True, 'errors': [], 'messages': []}

Question: What is our biggest cumulative expense?
Answer: {'result': {'response': 'To determine the biggest cumulative expense, we need to calculate the annual cost of each item. \n\n- Tea Leaves: £50/week * 52 weeks = £2,600/year\n- Cups and Saucers: £20/month * 12 months = £240/year\n- Table Maintenance: £15/week * 52 weeks = £780/year\n- Sugar Cubes: £5/day * 365 days = £1,825/year\n- Butter for Watches: £10/month * 12 months = £120/year\n- Nap Cushions: £40/year (already annual)\n- Wine: £0/year (not ser

> Different models, content sources, and text splitters can yield varying results.

### Add access control

To implement authorization during the retrieval process, we will create a custom [VectorStoreRetriever](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStoreRetriever.html#langchain_core.vectorstores.VectorStoreRetriever). The custom retriever will call the [AuthZ](https://pangea.cloud/services/authz/) service to filter the vector search results based on the policies defined in the service.

#### Enable AuthZ

To use AuthZ and the other Pangea services that secure your chain, you'll need a free [Pangea account](https://pangea.cloud/signup). After creating your account, click **Skip** on the **Get started with a common service** screen. This will take you to the Pangea User Console, where you can enable the service.

To enable the service, click its name in the left-hand sidebar and follow the prompts, accepting all defaults. When you’re finished, click **Done**.

<figure>
  <img
    alt="Pangea Services in the Pangea User Console with the AuthZ service being highlighted"
    title="Pangea Services"
    src="./img/pangea-services-authz.png"
    width="728"
  />
  <figcaption>Pangea Services</figcaption>
</figure>

Once the service is enabled, you will be taken to its Overview page. Capture the **Configuration Details**:

- **Domain** (shared across all services in the project)
- **Default Token** (a token provided by default for each service)

You can copy these values by clicking on the respective property tiles.

<figure>
  <img
    alt="Pangea AuthZ Service Overview page with the service Configuration Details in the Pangea User Console"
    title="Pangea Service Configuration Details"
    src="./img/pangea-service-authz-overview.png"
    width="728"
  />
  <figcaption>Pangea Service Configuration Details</figcaption>
</figure>

Save the configuration values in your `.env`, for example:

```bash
# Cloudflare Workers AI
CLOUDFLARE_ACCOUNT_ID="..."
CLOUDFLARE_API_KEY="..."

# Pangea
PANGEA_DOMAIN="aws.us.pangea.cloud"
PANGEA_AUTHZ_TOKEN="pts_kwaun3...jhpqzf"
```

> Instead of storing secrets locally and potentially exposing them to the environment, you can securely store your credentials in Pangea's [Vault](https://pangea.cloud/services/vault/), optionally enable rotation, and retrieve them dynamically at runtime. Enable Vault the same way you enabled other services by selecting it in the left-hand sidebar of the Pangea User Console. The [Manage Secrets](https://pangea.cloud/docs/vault/manage-secrets/secret-overview) documentation provides guidance on storing and using secrets in Vault.
>
> For example, you can store your Cloudflare API key in Vault and retrieve it using the [Vault APIs](https://pangea.cloud/docs/api/vault/v1-general#/v1/get). When you enable a new Pangea service, its default token is stored in Vault automatically.

#### Set authorization policies

The AuthZ service enables access control based on roles (RBAC), relationships (ReBAC), and attributes (ABAC). In this tutorial, we will use a simple RBAC authorization schema to control access to data stored in the vector database, categorized by resource type.

> Centralizing the authorization policy allows real-time updates and enables all your apps to access the same policy dynamically.

1. Click **Resource Types** in the left-hand sidebar, type `engineering` into the **Name** input for your first resource type, and click **Save**.
1. Click **+ Resource Type** in the top right, and add another resource type named `finance`.

   <figure>
     <img
       alt="Pangea AuthZ Service Resource Types page in the Pangea User Console"
       title="AuthZ Resource Types"
       src="./img/pangea-service-authz-resource-types.png"
       width="728"
     />
     <figcaption>AuthZ Resource Types</figcaption>
   </figure>

1. Click **Roles & Access** in the left-hand sidebar, type `engineer` into the **Name** input for your first role, and click **Save**.
1. On the `engineer` role screen, check the `read` permission for the `engineering` resource type.
1. Click **+ Role** at the top right and add the `analyst` role.
1. On the `analyst` role screen, check the `read` permission for the `finance` resource type.
1. Click **Save**.

   <figure>
     <img
       alt="Pangea AuthZ Service Roles page in the Pangea User Console"
       title="AuthZ Roles"
       src="./img/pangea-service-authz-roles.png"
       width="728"
     />
     <figcaption>AuthZ Roles</figcaption>
   </figure>

1. Click **Assigned Roles & Relations**, and in the opened dialog, assign the `analyst` role to user `alice`.

   <figure>
     <img
       alt="Pangea AuthZ Service Assign Role or Relation dialog on the Assigned Roles & Relations page in the Pangea User Console"
       title="Assign Role or Relation"
       src="./img/pangea-service-authz-assignment.png"
       width="728"
     />
     <figcaption>Assign Role or Relation</figcaption>
   </figure>

1. Click **+ Assign** in the **Assign Role or Relation** dialog.

1. On the **Assigned Roles & Relations** page, click **+ Assign** to open the **Assign Role or Relation** dialog and assign the `engineer` role to user `bill`.

1. Click **+ Assign** in the **Assign Role or Relation** dialog.

1. On the **Assigned Roles & Relations** page, you will see the current assignments.

   <figure>
     <img
       alt="Pangea AuthZ Service Assigned Roles & Relations page in the Pangea User Console"
       title="Assigned Roles & Relations"
       src="./img/pangea-service-authz-assignments.png"
       width="728"
     />
     <figcaption>Assigned Roles & Relations</figcaption>
   </figure>

You can now check the permissions set in the authorization schema and the role assignments using AuthZ [APIs](https://pangea.cloud/docs/api/authz) or an [SDK](https://pangea.cloud/docs/sdk/python/authz) for the supported environments.

> For more information on setting up the advanced capabilities of the AuthZ service and how to use it, visit the [AuthZ documentation](https://pangea.cloud/docs/authz/overview).
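
To reason about the policy offline, the RBAC schema configured above can be mirrored with two dictionaries. This is only a local model for illustration: the actual decisions come from the AuthZ service, and the user `mallory` is a hypothetical user with no role assigned.

```python
# Role -> set of (resource_type, action) permissions, as configured above.
ROLE_PERMISSIONS = {
    "engineer": {("engineering", "read")},
    "analyst": {("finance", "read")},
}

# User -> assigned roles, as configured above.
USER_ROLES = {
    "alice": {"analyst"},
    "bill": {"engineer"},
}


def is_allowed(user: str, resource_type: str, action: str) -> bool:
    # A user is allowed if any of their roles grants the permission.
    roles = USER_ROLES.get(user, set())
    return any(
        (resource_type, action) in ROLE_PERMISSIONS.get(role, set())
        for role in roles
    )


for user in ("alice", "bill", "mallory"):
    readable = [rt for rt in ("engineering", "finance") if is_allowed(user, rt, "read")]
    print(user, readable)
```

Users without an assignment get no access by default, which is the behavior you want for anything that is not explicitly public.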

#### Create an AuthZ retriever

In the following example, we will extend the [VectorStoreRetriever](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStoreRetriever.html#langchain_core.vectorstores.VectorStoreRetriever) class with a custom filter based on permissions and assignments saved in AuthZ. The retriever will use Pangea's Python [SDK](https://pangea.cloud/docs/sdk/python/authz#perform-a-check-request) to interact with the service in your Pangea project.

The following code adds access control in the retrieval process.

In [33]:
from langchain_core.vectorstores import VectorStore, VectorStoreRetriever
from pangea import PangeaConfig
from pangea.services import AuthZ
from pangea.services.authz import Resource, Subject

class AuthzRetriever(VectorStoreRetriever):
    def __init__(
        self,
        vectorstore: VectorStore,
        subject_id: str,
    ):
        """
        Args:
            vectorstore: Vector store used for retrieval
            subject_id: Unique identifier for the subject, whose permissions will be used to filter documents
        """

        super().__init__(vectorstore=vectorstore)

        self._client = AuthZ(token=os.getenv("PANGEA_AUTHZ_TOKEN"), config=PangeaConfig(domain=os.getenv("PANGEA_DOMAIN")))
        self._subject = Subject(type="user", id=subject_id)
        self.search_kwargs["filter"] = self._filter

    def _filter(self, metadata: dict[str, str]) -> bool:
        """Filter documents based on the subject's permissions in AuthZ."""

        resource_type: str | None = metadata.get("resource_type")

        # Disallow access to untagged resources.
        if not resource_type:
            return False

        # Allow unrestricted access to public resources.
        if resource_type == "public":
            return True

        response = self._client.check(resource=Resource(type=resource_type), action="read", subject=self._subject)

        return response.result is not None and response.result.allowed
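
The filter's three branches can be exercised without a Pangea account by swapping the AuthZ call for a stand-in. The sketch below also remembers one decision per resource type so that several chunks of the same type trigger a single check. That memoization is an optional design choice added here, not part of the retriever above, and `make_metadata_filter` and `check_read` are hypothetical names.

```python
from typing import Callable, Optional


def make_metadata_filter(check_read: Callable[[str], bool]) -> Callable[[dict], bool]:
    """Build a filter; `check_read` stands in for the AuthZ check call."""
    decisions: dict[str, bool] = {}  # one remembered decision per resource type

    def metadata_filter(metadata: dict) -> bool:
        resource_type: Optional[str] = metadata.get("resource_type")
        if not resource_type:  # untagged content: deny
            return False
        if resource_type == "public":  # public content: allow without a check
            return True
        if resource_type not in decisions:
            decisions[resource_type] = check_read(resource_type)
        return decisions[resource_type]

    return metadata_filter


calls: list[str] = []


def fake_check(resource_type: str) -> bool:
    # Stand-in policy: this user may read finance data only.
    calls.append(resource_type)
    return resource_type == "finance"


allow = make_metadata_filter(fake_check)
chunks = [
    {"resource_type": "finance"},
    {"resource_type": "finance"},
    {"resource_type": "engineering"},
    {"resource_type": "public"},
    {},
]
results = [allow(metadata) for metadata in chunks]
print(results)  # [True, True, False, True, False]
print(calls)    # ['finance', 'engineering']
```

Remembered decisions can go stale if the policy changes, so a cache like this should live no longer than a single retriever instance or request.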

#### Create a retrieval chain with built-in access control

You can now create a retriever for a username specified in your AuthZ policies to use in a retrieval chain. Initialize the AuthZ retriever with the username `alice`, who was assigned the `analyst` role.

In [38]:
# Load updated AuthZ variables
load_dotenv(override=True)

retriever = AuthzRetriever(vectorstore=vectorstore, subject_id="alice")
retrieval_chain = create_retrieval_chain(retriever, qa_chain)


You will then receive answers based only on the information the user is authorized to access.

In [39]:
get_answers()

2025-01-22 22:16:13 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': "role: system, content: You are a helpful assistant answering questions based on the provided context: Expenses\n\nTea Party\n\nExpense Category Cost Frequency Tea Leaves £50 weekly Cups and Saucers £20 monthly Table Maintenance £15 weekly Sugar Cubes £5 daily Butter for Watches £10 monthly Nap Cushions £40 annually Wine £0 not served\n\nNo Cats Allowed - Any cats, particularly the Cheshire Cat, are strictly forbidden as they disrupt gameplay with their vanishing and reappearing antics.\n\nUse Flamingos as Mallets - No exceptions. Flamingos must be treated properly; if they refuse, it's your fault.\n\nHedgehogs as Balls - Hedgehogs are to be rolled gently, or else.. Be concise. If you don't find relevant information, say: I can't tell you.\nrole: user, content: Question: Do we have any inventions that don't have a patent yet?", 'tools': None}



Question: Do we have any inventions that don't have a patent yet?
Answer: {'result': {'response': "I can't tell you. The provided context appears to be related to a unique tea party setting with specific rules and expenses, but it does not mention any information about inventions or patents."}, 'success': True, 'errors': [], 'messages': []}


2025-01-22 22:16:15 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': "role: system, content: You are a helpful assistant answering questions based on the provided context: No Cats Allowed - Any cats, particularly the Cheshire Cat, are strictly forbidden as they disrupt gameplay with their vanishing and reappearing antics.\n\nCode of Conduct for Public Events\n\nCroquet-Ground Rules\n\nThe Queen Always Wins - It's the Queen's game.\n\nThe Queen Never Loses - If the Queen loses, see Rule #1.\n\nUse Flamingos as Mallets - No exceptions. Flamingos must be treated properly; if they refuse, it's your fault.\n\nHedgehogs as Balls - Hedgehogs are to be rolled gently, or else.\n\nExpenses\n\nTea Party. Be concise. If you don't find relevant information, say: I can't tell you.\nrole: user, content: Question: Can I bring a pet to public events?", 'tools': None}



Question: Can I bring a pet to public events?
Answer: {'result': {'response': "According to the rules, cats, particularly the Cheshire Cat, are strictly forbidden from public events as they disrupt gameplay. However, it doesn't mention other types of pets. But, considering the rules focus on specific creatures like flamingos and hedgehogs for the game, it's best to assume that pets, in general, might not be allowed to avoid disruptions. If you're unsure, it's best to leave your pet at home to ensure a smooth and enjoyable experience for everyone."}, 'success': True, 'errors': [], 'messages': []}


2025-01-22 22:16:20 - INFO - Sending prompt to Cloudflare Workers AI: {'prompt': 'role: system, content: You are a helpful assistant answering questions based on the provided context: Expenses\n\nTea Party\n\nExpense Category Cost Frequency Tea Leaves £50 weekly Cups and Saucers £20 monthly Table Maintenance £15 weekly Sugar Cubes £5 daily Butter for Watches £10 monthly Nap Cushions £40 annually Wine £0 not served\n\nCode of Conduct for Public Events\n\nCroquet-Ground Rules\n\nThe Queen Always Wins - It\'s the Queen\'s game.\n\nThe Queen Never Loses - If the Queen loses, see Rule #1.\n\n"Off with Their Heads!" Applies - Any form of disobedience or error may lead to immediate execution. Stay alert!. Be concise. If you don\'t find relevant information, say: I can\'t tell you.\nrole: user, content: Question: What is our biggest cumulative expense?', 'tools': None}



Question: What is our biggest cumulative expense?
Answer: {'result': {'response': "To determine the biggest cumulative expense, let's analyze the given information:\n\n1. Tea Leaves: £50/week\n2. Cups and Saucers: £20/month\n3. Table Maintenance: £15/week\n4. Sugar Cubes: £5/day\n5. Butter for Watches: £10/month\n6. Nap Cushions: £40/year\n7. Wine: £0 (not served)\n\nFirst, we need to convert all frequencies to a common unit, such as years, for easier comparison:\n\n1. Tea Leaves: £50/week * 52 weeks/year = £2600/year\n2. Cups and Saucers: £20/month * 12 months/year = £240/year\n3. Table Maintenance: £15/week * 52 weeks/year = £780/year\n4. Sugar Cubes: £5/day * 365 days/year = £1825/year\n5. Butter for Watches: £10/month * 12 months/year = £120/year\n6. Nap Cushions: £40/year (already in years)\n7. Wine: £0/year\n\nNow, let's rank these expenses from highest to lowest:\n1. Tea Leaves: £2600/year\n2. Sugar Cubes: £1825/year\n3. Table Maintenance: £780/year\n4. Cups and Saucers: £240/y


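Note that the patents context is no longer included in the prompts: the user has no `read` permission for the `engineering` resource type, so the first question cannot be answered, while the finance and public content remains available.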

### Add authentication

To reduce risks associated with public access to your application, you can require users to sign in. After authentication, the user’s ID can be used to retrieve the authorization policies stored in AuthZ.

You can easily add login functionality to your application using the [AuthN](https://pangea.cloud/services/authn/) service. It allows users to sign in to the Pangea-hosted authorization server using their browser. Your application will then perform an authorization code flow using the Pangea SDK you’ve already imported to communicate with the AuthZ service. For demonstration purposes, we will use Flask as the client server.

#### Enable AuthN

If you are on the AuthZ page, navigate to the list of services by clicking **Back to Main Menu** in the top left corner. This will return you to the project page in your Pangea User Console, where enabled services will be marked with a green dot.

Click **AuthN** in the left-hand sidebar and follow the prompts, accepting all defaults. When you’re finished, click **Done** and **Finish**.

When the service is enabled, you will be taken to its Overview page. Capture the AuthN **Configuration Details**:

- **Client Token** (the token your application will use to call AuthN, saved as `PANGEA_AUTHN_CLIENT_TOKEN` below)
- **Hosted Login** (the login URL that your application will redirect users to sign in)

You can copy these values by clicking on the respective property tiles.

<figure>
  <img
    alt="Pangea AuthN Service Overview page with the service Configuration Details in the Pangea User Console"
    title="AuthN Configuration Details"
    src="./img/pangea-service-authn-overview.png"
    width="728"
  />
  <figcaption>AuthN Configuration Details</figcaption>
</figure>

Save the configuration values in your `.env`, for example:

```bash
# Cloudflare Workers AI
CLOUDFLARE_ACCOUNT_ID="..."
CLOUDFLARE_API_KEY="..."

# Pangea
PANGEA_DOMAIN="aws.us.pangea.cloud"
PANGEA_AUTHZ_TOKEN="pts_kwaun3...jhpqzf"
PANGEA_AUTHN_HOSTED_LOGIN="https://pdn-lqcuqlhizxsjrpbewgdrpi53cc72gdit.login.aws.us.pangea.cloud"
PANGEA_AUTHN_CLIENT_TOKEN="pcl_pgd43k...yoy6kn"
```

#### Enable Hosted Login flow

An easy and secure way to authenticate your users with AuthN is to use its [Hosted Login](https://pangea.cloud/docs/authn/hosted-login/), which implements the OAuth 2 Authorization Code grant. The Pangea SDK will manage the flow, providing user profile information and allowing you to use the user's login to verify their permissions defined in AuthZ.

1. Click **General** in the left-hand sidebar.
1. On the **Authentication Settings** screen, click **Redirect (Callback) Settings**.
1. In the right pane, click **+ Redirect**.
1. Enter `http://localhost:3080` in the URL input field.
1. Click **Save** in the **Add redirect** dialog.
1. Click **Save** again in the **Redirect (Callback) Settings** pane on the right.

> For more information on setting up advanced capabilities of the AuthN service (such as sign-in and sign-up options, security controls, session management, and more) visit the [AuthN documentation](https://pangea.cloud/docs/authn/general).
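
In the Authorization Code grant, the application sends a random `state` value with the login request and must reject any callback whose `state` does not match. The sketch below shows that round trip with the standard library only. `HOSTED_LOGIN` is a placeholder URL, and `build_login_url` and `extract_code` are hypothetical helper names.

```python
import hmac
import secrets
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder values for illustration only.
HOSTED_LOGIN = "https://login.example.com"
REDIRECT_URI = "http://localhost:3080/callback"


def build_login_url(state: str) -> str:
    params = {"redirect_uri": REDIRECT_URI, "response_type": "code", "state": state}
    return f"{HOSTED_LOGIN}?{urlencode(params)}"


def extract_code(callback_url: str, expected_state: str) -> Optional[str]:
    query = parse_qs(urlparse(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Constant-time comparison; reject the callback if the state differs.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        return None
    codes = query.get("code")
    return codes[0] if codes else None


state = secrets.token_hex(32)
print(build_login_url(state))
print(extract_code(f"{REDIRECT_URI}?code=abc123&state={state}", state))  # abc123
print(extract_code(f"{REDIRECT_URI}?code=abc123&state=forged", state))   # None
```

Checking `state` ties the callback to the login request your application started, which protects the redirect endpoint against forged callbacks.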

#### Require users to sign in

Add login functionality to your application with the following code. This will open the hosted login page in your browser. After creating an account in AuthN and signing in, the script will request the user profile from AuthN and store the username in a variable. This variable will then be passed to the AuthZ retriever, ensuring that only authorized data is accessible during retrieval.

When you run the code below, sign up as `alice` in your browser, using your email during the sign-up process, and verify it. Once signed up and signed in successfully, you can close the page and return to your notebook. The script will receive the user profile and store the username in the corresponding variable.

<figure>
  <img
    alt="Pangea AuthN Sign Up page"
    title="AuthN Sign Up page"
    src="./img/authn-sign-up.png"
    width="512"
  />
  <figcaption>AuthN Sign Up page</figcaption>
</figure>

In [23]:
import threading
import urllib.parse
import webbrowser
from functools import partial
from queue import Queue
from secrets import token_hex
from pangea.services import AuthN
from flask import Flask, request, abort

# Queue to share data between threads
queue = Queue()

# Initialize AuthN client
authn = AuthN(token=os.getenv("PANGEA_AUTHN_CLIENT_TOKEN"), config=PangeaConfig(domain=os.getenv("PANGEA_DOMAIN")))

# Initialize Flask app
app = Flask(__name__)

# Generate a unique state token for verifying callback
state = token_hex(32)

# Define the redirect route
@app.route("/callback")
def callback():
    # Verify that the state param matches the original.
    if request.args.get("state") != state:
        return abort(401)

    auth_code = request.args.get("code")
    if auth_code is None:
        return abort(401)

    # Exchange the authorization code for the user's tokens and info.
    response = authn.client.userinfo(code=auth_code)

    if not response.success or response.result is None or response.result.active_token is None:
        return abort(401)

    # Hand the session token to the waiting main thread.
    queue.put(response.result.active_token.token)

    return "Authenticated, you can close this tab."

# Start the Flask server in a separate daemon thread
def run_server():
    app.run(port=3080, debug=False)

# Run Flask server as a daemon thread
server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()

# Open a browser to authenticate
authn_hosted_login = os.getenv("PANGEA_AUTHN_HOSTED_LOGIN")
redirect_uri = "http://localhost:3080/callback"
url_parameters = {
    "redirect_uri": redirect_uri,
    "response_type": "code",
    "state": state,
}
url = f"{authn_hosted_login}?{urllib.parse.urlencode(url_parameters)}"

print("Opening browser to authenticate...")
webbrowser.open_new_tab(url)

# Wait for the server to receive the auth code.
token = queue.get(block=True)
check_result = authn.client.token_endpoints.check(token).result
assert check_result
username = check_result.owner

print(f"Authenticated as {username}")

Opening browser to authenticate...
 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:3080
Press CTRL+C to quit
127.0.0.1 - - [26/Oct/2024 18:24:08] "GET /callback?code=pmc_rgcp2tvxmh7ccyqdmijwgpafp2nyte4p&state=e36a837f4279e397a2ff77f1f10597b055af64a2794231e513371ccf08523a78 HTTP/1.1" 200 -


Authenticated as alice


You can now pass the username dynamically to your AuthZ retriever using the value returned from AuthN, getting answers based on the information accessible to the authenticated user.

In [24]:
retriever = AuthzRetriever(vectorstore=vectorstore, subject_id=username)
retrieval_chain = create_retrieval_chain(retriever, qa_chain)

get_answers()


Question: Do we have any inventions that don't have a patent yet?
Answer: I can't tell you.

Question: Can I bring a pet to public events?
Answer: No cats are allowed at public events, as they disrupt gameplay. Other pets are not mentioned, so I can't tell you.

Question: What is our biggest cumulative expense?
Answer: The biggest cumulative expense is for Tea Leaves, which costs £50 weekly. Over a year, this amounts to £2,600 (£50 x 52 weeks).



Note that the analyst `alice` cannot include the engineering context in her conversation with the LLM. But rest assured, whether you're an engineer or an analyst, you must leave your cat at home when heading to a corporate party in this office.


## Conclusion

Adding authentication and authorization to your RAG app enables the secure use of business-specific information in a generative AI application. By making minor modifications to the existing code, you can create an identity- and authorization-aware experience in your LangChain apps using Pangea-hosted AuthN and AuthZ services.

For more examples and detailed implementations, explore the following GitHub repositories:

- [Authenticating Users for Access Control with RAG for LangChain in Python](https://github.com/pangeacyber/langchain-python-user-authn)
- [User-based Access Control with RAG for LangChain in Python](https://github.com/pangeacyber/langchain-python-rag-authz)