# Fine-Grained Authorization for Retrieval-Augmented Generation (RAG)

In the previous step we installed and launched an instance of SpiceDB. Now let's start building our RAG pipeline with fine-grained authorization.


This workshop simulates a simple use case: a user, Tim, has access to two documents. We query an LLM for information from one of the documents. We then remove Tim's permission to view that document and make the same query. If all goes well, the information should no longer be available to him.

### Part 1: Add Secrets & Setup

Download the `requirements.txt` file from this directory and run the following command to install all dependencies for the workshop. 

In [None]:
%pip install -r requirements.txt

The consolidated list of imports is added at the start of the workshop.

In [None]:
import os
from typing import List, Optional

from dotenv import load_dotenv

from authzed.api.v1 import (
    Client,
    WriteSchemaRequest,
    WriteRelationshipsRequest,
    RelationshipUpdate,
    Relationship,
    ObjectReference,
    SubjectReference,
    LookupResourcesRequest,
    CheckPermissionRequest,
    CheckPermissionResponse,
)

from grpcutil import insecure_bearer_token_credentials

from pinecone import Pinecone, ServerlessSpec
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore, PineconeEmbeddings
from langchain_core.runnables import (
    RunnableParallel,
    RunnableLambda,
    RunnablePassthrough,
)
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from openai import AsyncOpenAI

This walkthrough requires the following environment variables. Create a file named `.env` in your working directory and add these variables:

- ```OPENAI_API_KEY=<add OpenAI key>```
- ```PINECONE_API_KEY=<add Pinecone key>```
- ```SPICEDB_TOKEN=rag-rebac-walkthrough```
- ```SPICEDB_ENDPOINT=localhost:50051```

Load the secrets into your app from the `.env` file.

In [None]:
load_dotenv()

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
PINECONE_API_KEY = os.getenv('PINECONE_API_KEY')
SPICEDB_ENDPOINT = os.getenv('SPICEDB_ENDPOINT')
SPICEDB_TOKEN = os.getenv('SPICEDB_TOKEN')

REQUIRED = {
    'OPENAI_API_KEY': OPENAI_API_KEY,
    'PINECONE_API_KEY': PINECONE_API_KEY,
    'SPICEDB_ENDPOINT': SPICEDB_ENDPOINT,
    'SPICEDB_TOKEN': SPICEDB_TOKEN,
}
missing = [k for k, v in REQUIRED.items() if not v]
if missing:
    raise RuntimeError(
        'Missing env vars: ' + ', '.join(missing) +
        '\nSet them in your environment or a .env file in this directory.'
    )

# Make these available to downstream SDKs
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
os.environ['PINECONE_API_KEY'] = PINECONE_API_KEY

### Part 2: Write a schema to SpiceDB

In today's scenario, we will be authorizing fine-grained access to view blog articles.

The source of truth for permissions in the RAG pipeline is SpiceDB. Create an instance of the SpiceDB client that we'll use for the workshop.


In [None]:
def make_spicedb_client() -> Client:
    # For TLS environments, replace with bearer_token_credentials(...).
    return Client(
        target=SPICEDB_ENDPOINT,
        credentials=insecure_bearer_token_credentials(SPICEDB_TOKEN),
    )

_client = make_spicedb_client()
print('SpiceDB client ready:', isinstance(_client, Client))


Let's begin by defining the authorization logic for our example. To do this, we write a [schema](https://authzed.com/docs/spicedb/concepts/schema) to SpiceDB. The schema below defines two object types, ```user``` and ```article```. Users can relate to an article as a ```viewer```, and any user who is related to an article as a ```viewer``` can ```view``` that article. The rest of this workshop uses "document" and "article" interchangeably.

In [None]:
SCHEMA = """definition user {}

definition article {
    relation viewer: user

    permission view = viewer
}"""

client = make_spicedb_client()

try:
    resp = await client.WriteSchema(WriteSchemaRequest(schema=SCHEMA))
except Exception as e:
    print(f"Write schema error: {type(e).__name__}: {e}")
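
To build intuition for what this schema expresses, here is a minimal in-memory sketch in plain Python. It is not how SpiceDB evaluates permissions; the `relationships` set and the `can_view` helper are hypothetical names used only for illustration. The sketch models `permission view = viewer` as: a user can view an article exactly when a `viewer` relationship exists between them.

```python
# Toy model of the schema above (illustration only, not SpiceDB's implementation).
# Each relationship is a tuple of
# (resource_type, resource_id, relation, subject_type, subject_id).
relationships = {
    ("article", "123", "viewer", "user", "tim"),
}


def can_view(user_id: str, article_id: str) -> bool:
    """Mirror `permission view = viewer`: granted iff a viewer relationship exists."""
    return ("article", article_id, "viewer", "user", user_id) in relationships


print(can_view("tim", "123"))  # True
print(can_view("tim", "456"))  # False
```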

### Part 3: Write a Relationship to SpiceDB

Now, we write relationships to SpiceDB that specify that Tim is a viewer of documents 123 and 456.

After these relationships are written, any permission checks to SpiceDB will reflect that Tim can view documents 123 and 456.

In [None]:
try:
    resp = await (client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_TOUCH,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_TOUCH,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="456"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    ))
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")
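
Relationships are often written in a compact text form, `resource_type:resource_id#relation@subject_type:subject_id`. The sketch below renders the two relationships from this step in that notation, which can be handy for logging or debugging. `format_relationship` is a hypothetical helper written for this example, not part of the `authzed` client.

```python
def format_relationship(resource_type: str, resource_id: str, relation: str,
                        subject_type: str, subject_id: str) -> str:
    """Render a relationship as `resource:id#relation@subject:id` (hypothetical helper)."""
    return f"{resource_type}:{resource_id}#{relation}@{subject_type}:{subject_id}"


for article_id in ("123", "456"):
    print(format_relationship("article", article_id, "viewer", "user", "tim"))
# article:123#viewer@user:tim
# article:456#viewer@user:tim
```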

### Part 4: Simulate a real-world RAG scenario

We now define a Pinecone serverless index.

Pinecone is a specialized database designed for handling vector-based data. Their serverless product makes it easy to get started with a vector DB.

In [None]:
pc = Pinecone(api_key=PINECONE_API_KEY)

index_name = "documents"
namespace_name = "authzed"

pc.create_index(
    name=index_name,
    dimension=1024,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

We are simulating a real-world RAG (retrieval-augmented generation) scenario by embedding a completely fictional string: "Bill Gates won the 2025 Oscar for best football movie". Since LLMs don't "know" this fact (it's made up), this mimics a typical RAG case where private or unknown data augments prompts. A second made-up document about quarterly revenue is embedded alongside it.

In this example, we also specify metadata like article_id to track which article the string comes from. The article_id is important for linking embeddings to objects that users are authorized on.

In [None]:
# Create Document objects for our made-up articles and store each article_id as metadata.

documents = [
    Document(
        page_content="Bill Gates won the 2025 Oscar for best football movie",
        metadata={"article_id": "123"}
    ),
    Document(
        page_content="The revenue for Q4 2025 is one billion dollars!",
        metadata={"article_id": "456"}
    )
]


# Initialize LangChain embeddings
embeddings = PineconeEmbeddings(
    model="multilingual-e5-large",
    pinecone_api_key=PINECONE_API_KEY
)

# Create vector store and upsert both documents
docsearch = PineconeVectorStore.from_documents(
    documents=documents,
    index_name=index_name,
    embedding=embeddings,
    namespace=namespace_name
)

### Part 5: Make a request when the user is authorized to view the necessary contextual data using the Pre-Filter Method

Here is a high-level architecture diagram of the Pre-Filter method
![pre-filter architecture diagram](/secure-rag-pipelines/images/secure-rag.png)

We'll query SpiceDB for a list of documents that Tim is allowed to view using the [LookupResources API](https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.LookupResourcesRequest). 

In [None]:
subject = SubjectReference(
    object=ObjectReference(
        object_type="user",
        object_id="tim",
    )
)

def lookupArticles():
    return client.LookupResources(
        LookupResourcesRequest(
            subject=subject,
            permission="view",
            resource_object_type="article",
        )
    )

# Initialize the list first so it is defined even if the lookup fails.
authorized_articles = []

try:
    async for response in lookupArticles():
        authorized_articles.append(response.resource_object_id)
except Exception as e:
    print(f"Lookup error: {type(e).__name__}: {e}")

print("Article IDs that Tim is authorized to view:")
print(authorized_articles)
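
The IDs returned by the lookup become a metadata filter on the vector search. A small sketch of that step follows, assuming the `article_id` metadata key and the `$in` operator used in this workshop. `build_article_filter` is a hypothetical helper; it returns `None` for an empty list so the caller can skip retrieval entirely instead of relying on how an empty `$in` filter is handled.

```python
from typing import Dict, List, Optional


def build_article_filter(article_ids: List[str]) -> Optional[Dict]:
    """Build a metadata filter limiting a search to the given article IDs.

    Returns None when the user has no authorized articles, signalling that
    the caller should skip retrieval instead of querying the index.
    """
    if not article_ids:
        return None
    # De-duplicate while keeping order so the filter stays small and stable.
    unique_ids = list(dict.fromkeys(article_ids))
    return {"article_id": {"$in": unique_ids}}


print(build_article_filter(["123", "456", "123"]))
# {'article_id': {'$in': ['123', '456']}}
print(build_article_filter([]))
# None
```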

We can now issue a prompt to GPT-5, enhanced with relevant data that the user is authorized to access. This ensures that the response is based on information the user is permitted to view.

In [None]:
# Define the ask function
def ask():
    # Initialize a LangChain object for an OpenAI chat model.
    llm = ChatOpenAI(
        openai_api_key=OPENAI_API_KEY,
        model="gpt-5-nano-2025-08-07",
        temperature=1
    )

    # Initialize a LangChain vector store for the existing Pinecone index.
    # Queries must be embedded with the same model that indexed the documents,
    # so we reuse the PineconeEmbeddings object (`embeddings`) defined in Part 4.
    knowledge = PineconeVectorStore.from_existing_index(
        index_name=index_name,
        namespace=namespace_name,
        embedding=embeddings,
    )

    # Initialize a retriever with a filter that restricts the search to authorized documents.
    retriever = knowledge.as_retriever(
        search_kwargs={
            "filter": {
                "article_id": {"$in": authorized_articles},
            },
        }
    )

    # Initialize a string prompt template that lets us add context and a question.
    prompt = ChatPromptTemplate.from_template("""Answer the question below using the context:

    Context: {context}

    Question: {question}

    Answer: """)

    retrieval = RunnableParallel(
        {"context": retriever, "question": RunnablePassthrough()}
    )

    chain = retrieval | prompt | llm | StrOutputParser()

    question = """Who won the Oscar for best football movie?"""

    print("Prompt: \n")
    print(question)
    print(chain.invoke(question))

#invoke the ask function
ask()

You can also generate a summary of all the articles that Tim is authorized to view.

In [None]:
#Summarize only articles that the user is authorized to view

async def summarize_accessible_articles(user_id: str):

    # 1️⃣ Lookup articles
    subject = SubjectReference(
        object=ObjectReference(object_type="user", object_id=user_id)
    )
    response = client.LookupResources(
        LookupResourcesRequest(
            subject=subject,
            permission="view",
            resource_object_type="article",
        )
    )
    authorized_articles = [res.resource_object_id async for res in response]
    print(f"🔍 {user_id} can view articles: {authorized_articles}")

    if not authorized_articles:
        return "❌ No accessible articles."

    # 2️⃣ Setup LangChain retriever w/ filter
    # Reuse the embeddings model that indexed the documents (`embeddings`, defined in Part 4).
    knowledge = PineconeVectorStore.from_existing_index(
        index_name=index_name,
        namespace=namespace_name,
        embedding=embeddings,
    )

    retriever = knowledge.as_retriever(
        search_kwargs={
            "filter": {"article_id": {"$in": authorized_articles}},
            "k": 100  # Ensure we get all matches
        }
    )

    docs = await retriever.ainvoke("Give me all the contents to summarize")

    if not docs:
        return "❌ No content found."

    combined_text = "\n\n".join([d.page_content for d in docs])

    # 3️⃣ Summarize using OpenAI
    summary_prompt = (
        "You are an AI assistant. Based ONLY on the following articles, "
        "generate a concise summary of their contents. Do not use any outside knowledge.\n\n"
        + combined_text
        + "\n\nSummary:"
    )

    openai_client = AsyncOpenAI(api_key=OPENAI_API_KEY)
    chat_response = await openai_client.chat.completions.create(
        messages=[{"role": "user", "content": summary_prompt}],
        model="gpt-5-nano-2025-08-07",
        temperature=1
    )

    return chat_response.choices[0].message.content

In [None]:
summary = await summarize_accessible_articles("tim")
print("📄 Summary of accessible articles:")
print(summary)

### Part 6: Make a request when the user is NOT authorized to view the necessary contextual data

Now, let's see what happens when Tim is not authorized to view the document.

First, we will delete the relationship that related Tim as a viewer to document 123.

In [None]:
try: 
    resp = await client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_DELETE,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    )
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")

Next, we will update the list of documents that Tim is authorized to view.

In [None]:
# Reset the list first so it is defined even if the lookup fails.
authorized_articles = []

try:
    async for response in lookupArticles():
        authorized_articles.append(response.resource_object_id)
except Exception as e:
    print(f"Lookup error: {type(e).__name__}: {e}")

print("Documents that Tim can view:")
print(authorized_articles)

Now, we can run our query again. 

Note that we no longer receive a completion that answers our question because Tim is no longer authorized to view the document that contains the context required to answer the question.

In [None]:
#this function was defined above
ask()

## Post-Filter Method

We just completed the pre-filter method where we queried SpiceDB for all the documents that Tim was authorized to view. An alternate approach is to use the Post-Filter method where a [CheckPermissionRequest](https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.CheckPermissionRequest) is performed on every document ID that the vector database returns. The list of authorized documents is then passed on to the LLM for a response to the query.

Here is a high-level architecture diagram of the post-filter method
![post-filter architecture diagram](/secure-rag-pipelines/images/post-filter.png)

Choosing between the two depends on your use case. Typically, if you have a high positive hit rate from your vector database, a post-filter approach works well. Conversely, if you have a large corpus of documents in your RAG pipeline and a low positive hit rate, the pre-filter approach works better.
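
The trade-off can be made concrete with a toy comparison in plain Python. The `corpus`, the `allowed` set, and the keyword-based `search` function below are stand-ins for the vector database and SpiceDB, chosen only for illustration. Both methods return the same authorized documents; they differ in how many authorization calls they make.

```python
# Toy stand-ins: a tiny corpus and the set of article IDs the user may view.
corpus = {
    "123": "Bill Gates won the 2025 Oscar for best football movie",
    "456": "The revenue for Q4 2025 is one billion dollars!",
    "789": "The 2025 Oscar ceremony was held in March",
}
allowed = {"456", "789"}


def search(term, candidate_ids):
    """Stand-in for vector search: return IDs whose text contains the term."""
    return [i for i in candidate_ids if term.lower() in corpus[i].lower()]


def pre_filter(term):
    """One lookup for all authorized IDs, then search only within them."""
    auth_calls = 1  # a single LookupResources-style call
    return search(term, sorted(allowed)), auth_calls


def post_filter(term):
    """Search everything, then run one permission check per hit."""
    hits = search(term, sorted(corpus))
    auth_calls = len(hits)  # one CheckPermission-style call per hit
    return [i for i in hits if i in allowed], auth_calls


print(pre_filter("oscar"))   # (['789'], 1)
print(post_filter("oscar"))  # (['789'], 2)
```

In this toy run, the post-filter spends a check on article `123` only to discard it. When most hits are discarded this way, that wasted work grows, which is the situation where the pre-filter tends to be the better fit.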

### Part 7: Restore Tim's permissions

Let's restore Tim's permission to view document `123`.

In [None]:
try: 
    resp = await client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_TOUCH,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    )
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")

### Part 8: Checking for Permissions

Define the method that gets the `article_id` for all documents and checks whether the user has permissions for each article. Compare and contrast this with the Pre-Filter method where we performed a lookup to get a list of documents that the user had access to. 

In [None]:
async def filter_docs_with_spicedb(docs: List):
    filtered_docs = []
    for doc in docs:
        article_id = doc.metadata.get("article_id")
        resp = await client.CheckPermission(
            CheckPermissionRequest(
                subject=SubjectReference(
                    object=ObjectReference(
                        object_type="user",
                        object_id="tim",
                    ),
                ),
                resource=ObjectReference(
                    object_type="article",
                    object_id=str(article_id),
                ),
                permission="view",
            )
        )
        if resp.permissionship == CheckPermissionResponse.PERMISSIONSHIP_HAS_PERMISSION:
            filtered_docs.append(doc)
        
    return filtered_docs
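
The loop above awaits each check in turn. When the retriever returns many documents, the checks could instead be issued concurrently. The sketch below shows that pattern with `asyncio.gather`. `Doc` and `fake_check` are hypothetical stand-ins for the LangChain document and the SpiceDB `CheckPermission` call, so the example runs without any services.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict, List


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


async def fake_check(article_id: str) -> bool:
    """Stand-in for a CheckPermission call; only article 456 is viewable."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return article_id == "456"


async def filter_docs_concurrently(
    docs: List[Doc], check: Callable[[str], Awaitable[bool]]
) -> List[Doc]:
    """Run all permission checks at once and keep the permitted documents, in order."""
    results = await asyncio.gather(
        *(check(str(d.metadata.get("article_id"))) for d in docs)
    )
    return [doc for doc, ok in zip(docs, results) if ok]
```

In a notebook cell this would be called as `await filter_docs_concurrently(docs, fake_check)`; in a script, wrap the call in `asyncio.run(...)`.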

All that's left is to build a LangChain graph. This snippet sets up a retriever to fetch relevant documents, then applies a post-filter using SpiceDB to ensure only documents the user is authorized to view are included.

`RunnableLambda` allows you to wrap a custom Python function (such as your authorization filter) so it can be used as a step in the LangChain pipeline. `RunnablePassthrough` simply passes its input through unchanged, making it useful for forwarding data (like the user's question) to the next step in the chain.

In [None]:
# Build the LangChain graph
retriever = docsearch.as_retriever(search_kwargs={"k": 4})
llm = ChatOpenAI(api_key=OPENAI_API_KEY, 
                 model="gpt-5-nano-2025-08-07", 
                 temperature=1)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You answer strictly from the provided context. If insufficient, say so."),
    ("human", "Question: {question}\n\nContext:\n{context}")
])

# Combine: retrieve → post-filter → prompt → LLM
graph = (
    RunnableParallel(
        {
            "context": retriever | RunnableLambda(filter_docs_with_spicedb),
            "question": RunnablePassthrough(),
        }
    )
    | prompt
    | llm
    | StrOutputParser()
)

print("✅ Retrieval + chain wired up")

Run this code to ask the LLM about some data in document `123`. Since Tim does have permission to view this document, you should see the correct response. 

In [None]:
question = "Who won the 2025 Oscar for best football movie?"
result = await graph.ainvoke(question) 
print(result)

Let's remove Tim's permission to view the document and then ask the same question again. Since Tim doesn't have permission to view this document, the LLM isn't able to provide an answer. We performed this step while testing the pre-filter method as well.

In [None]:
# We remove Tim's permissions to view the contents of article "123"

try: 
    resp = await client.WriteRelationships(
        WriteRelationshipsRequest(
            updates=[
                RelationshipUpdate(
                    operation=RelationshipUpdate.Operation.OPERATION_DELETE,
                    relationship=Relationship(
                        resource=ObjectReference(object_type="article", object_id="123"),
                        relation="viewer",
                        subject=SubjectReference(
                            object=ObjectReference(
                                object_type="user",
                                object_id="tim",
                            )
                        ),
                    ),
                ),
            ]
        )
    )
except Exception as e:
    print(f"Write relationships error: {type(e).__name__}: {e}")

In [None]:
question = "Who won the 2025 Oscar for best football movie?"
result = await graph.ainvoke(question) 
print(result)

### Cleanup

You can now delete your Pinecone index if you'd like to.

In [None]:
pc.delete_index(index_name)

### Conclusion and Next Steps 

Congratulations! You learned how to secure your RAG pipelines with fine-grained authorization using SpiceDB. 

OpenAI uses SpiceDB and [AuthZed Dedicated](https://authzed.com/products/authzed-dedicated) to secure 37 billion documents for 5 million users who use ChatGPT Connectors. Read more about it here: https://authzed.com/customers/openai

You can also use [AuthZed Cloud](https://authzed.com/products/authzed-cloud) for low-latency authorization at scale.