# OBRAG Pipeline Demo

This notebook works out a consistent pattern for the pipeline before composing it into a single application.

First, install the dependencies:

In [1]:
%pip install --quiet --upgrade -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


We'll start by setting up our chat model, embedding model, and vector store.

In [2]:
# chat model
import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gpt-4o-mini", model_provider="openai")

In [3]:
# embedding model
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model="Qwen/Qwen3-Embedding-0.6B")



In [4]:
# vector store
from langchain_chroma import Chroma

vstore = Chroma(
    collection_name="demo",
    embedding_function=embeddings,
    persist_directory="./chroma_demo"
)

Next, we can use LangChain's built-in Obsidian loader to gather and chunk our Markdown notes.

In [5]:
from langchain_community.document_loaders import ObsidianLoader

loader = ObsidianLoader(
    path="/home/saafetensors/Documents/ollin/",
    collect_metadata=False
)
chunks = loader.load_and_split()
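
As a rough mental model of the chunking step, the sketch below cuts text into fixed-size windows with a small overlap, so that a sentence straddling a boundary appears in both neighbouring chunks. It is a simplified stand-in rather than LangChain's splitter: `chunk_text`, `chunk_size`, and `overlap` are hypothetical names, and a real text splitter would also try to break on paragraph and line boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters. Hypothetical helper,
    not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


note = "belief state " * 50  # placeholder text, 650 characters
pieces = chunk_text(note, chunk_size=200, overlap=40)
print(len(pieces), [len(p) for p in pieces])
```

Larger overlaps preserve more context across boundaries at the cost of storing and embedding more duplicated text.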

We can then add these chunks to the vector store.

In [6]:
from tqdm import tqdm

for chunk in tqdm(chunks):
    vstore.add_documents([chunk])

100%|██████████| 253/253 [00:15<00:00, 16.45it/s]
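
Adding one chunk per call means one embedding call and one write per document. A possible alternative is to group the chunks into batches and pass each batch to `add_documents`, which accepts a list. The helper below is a minimal sketch: `batched` and `batch_size` are hypothetical names, and the commented-out loop assumes the `vstore`, `chunks`, and `tqdm` objects from the cells above.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Assumed usage with the notebook's objects (not executed here):
# for batch in tqdm(list(batched(chunks, 32))):
#     vstore.add_documents(list(batch))

demo = list(batched(list(range(10)), 4))
print(demo)
```

The batch size of 32 in the comment is an arbitrary starting point; a suitable value depends on the embedding model and the available memory.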


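With the vector store populated, we can wire retrieval and generation into a two-step LangGraph pipeline.

In [7]: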
from langchain import hub
from typing_extensions import TypedDict, List
from langchain_core.documents import Document
from langgraph.graph import START, StateGraph

prompt = hub.pull("rlm/rag-prompt")

class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

def retrieve(state: State):
    docs = vstore.similarity_search(state["question"])
    return {"context": docs}

def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}

graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()
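
The query in the next cell asks the model to cite the documents it uses, but `generate` passes only `page_content` to the prompt, so the file names never reach the model. One possible adjustment is to prefix each chunk with its `source` metadata before joining. The sketch below uses a small stand-in class so that it runs without LangChain; `FakeDoc` and `format_context` are hypothetical names, and the `source` key is assumed from the metadata visible in the retrieved documents.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in exposing the two Document attributes used here (hypothetical)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs) -> str:
    """Join chunks into one string, labelling each with its source file name."""
    blocks = []
    for doc in docs:
        # Fall back to a placeholder when a chunk carries no source metadata.
        source = doc.metadata.get("source", "unknown source")
        blocks.append(f"[{source}]\n{doc.page_content}")
    return "\n\n".join(blocks)


docs = [
    FakeDoc("Placeholder chunk text.", {"source": "21 POMDP Search Methods.md"}),
    FakeDoc("A chunk with no metadata."),
]
context = format_context(docs)
print(context)
```

In `generate`, `docs_content = format_context(state["context"])` could then replace the plain join, giving the model something concrete to cite.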



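Finally, we can invoke the graph with a question and inspect both the retrieved context and the answer.

In [11]: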
response = graph.invoke({"question": "What do we do when we are in the POMDP setting and want to use the Bayesian approach? Cite the documents you use."})

print(f"Context: {response['context']}")
print(f"Answer: {response['answer']}")

Context: [Document(id='296fd158-b287-4bd3-87d5-5f50574cf7aa', metadata={'created': 1748955714.7666247, 'source': '21 POMDP Search Methods.md', 'path': '/home/saafetensors/Documents/ollin/archived/Notes/Lecture Notes/6.4110/21 POMDP Search Methods.md', 'last_accessed': 1750349477.3930354, 'last_modified': 1745948752.532999}, page_content="Generally, we want to choose actions to maximize expected discounted sum of rewards. Similar to last time, we'll capture the uncertainty by defining a *belief MDP*, we make a reduction from a POMDP to an MDP:\n- states are *distributions* over $\\mathcal S$, existing on a *probability simplex* consisting of the vectors $(p_a, p_b, p_c)$ with $p_a + p_b + p_c = 1$ and $p_i \\in [0, 1]$; belief states are points on this simplex, so the state space is now *continuous*\n- reward function looks like $$R(b, a) =  \\sum_s R(s, a) b(s)$$\n\t- simply just the expected reward based on the reward we already know in the original space, weighted by the state distri