In [1]:
# Let's perform traditional RAG on a mock Obsidian vault

In [2]:
from dotenv import load_dotenv

load_dotenv()  # load environment variables from a local .env file, if one exists

True

In [3]:
# Indexing: Let's load our full documents

from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader("./", glob="**/*.md", show_progress=True, use_multithreading=True)

full_documents = loader.load()
full_documents[0]


Document(metadata={'source': 'rsc/vault/Priority Queue.md'}, page_content='A priority queue is an abstract data structure, similar to a queue, in which each element has an associated priority and elements with high priority are served before elements with low priority.\n\n![[priority_queue_overview.png|500]]\n\nPriority queues are commonly implemented using [[Heap|heaps]], giving $O(\\log n)$ performance for inserts and removals, and $O(n)$ to build the heap initially from a set of $n$ elements.\n\nOperations\n\nPriority queues support the following operations:\n\nBasic - enqueue: add an element to the queue with an associated priority. - dequeue: remove the highest priority element from the queue, and return it. - delete: remove an element from the queue. - peek: return the highest priority element from the queue.\n\nInspection - size: return the number of elements in the queue. - is_empty: check whether the queue has no elements.\n\nEquivalence of priority queues and sorting algorith

In [4]:
# Indexing: Let's chunk and insert our documents into FAISS

from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# Split each document into chunks of at most 512 characters, with 64 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
documents = text_splitter.split_documents(full_documents)

# Embed each chunk with nomic-embed-text (served by Ollama) and index the vectors in FAISS
embedding_model = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = FAISS.from_documents(documents=documents, embedding=embedding_model)

documents[0]

Document(metadata={'source': 'rsc/vault/Priority Queue.md'}, page_content='A priority queue is an abstract data structure, similar to a queue, in which each element has an associated priority and elements with high priority are served before elements with low priority.\n\n![[priority_queue_overview.png|500]]\n\nPriority queues are commonly implemented using [[Heap|heaps]], giving $O(\\log n)$ performance for inserts and removals, and $O(n)$ to build the heap initially from a set of $n$ elements.\n\nOperations\n\nPriority queues support the following operations:')
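
To make `chunk_size=512` and `chunk_overlap=64` concrete, here is a minimal sketch of fixed-width chunking with overlap. It is a deliberate simplification and not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph breaks, which is why the chunk above ends at a sentence rather than at exactly 512 characters). The helper name `sliding_window_chunks` and the sample text are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=512, chunk_overlap=64):
    """Split text into fixed-width character windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 1000-character sample text
sample = "".join(chr(ord("a") + i % 26) for i in range(1000))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])          # [512, 512, 104]
print(chunks[0][-64:] == chunks[1][:64]) # True: neighbours share 64 characters
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some text twice.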

In [5]:
# Define a query

# query = "Given an integer array nums, handle multiple queries of the following types: 1. Update the value of an element in nums. 2. Calculate the sum of the elements of nums between indices left and right inclusive where left <= right. Give a broad strokes overview of how you'd solve this?"

# query = "What data structures and algorithms do you use to solve the trapping rainwater coding problem?"

# query = "What is a fenwick tree?"

# query = "What is the seam carving algorithm?"

# query = "What is a multiset good for?"

query = "What is the relationship between a prefix sum and an integral image?"

In [6]:
# Retrieval and generation: Let's set up a RAG chain

from langchain_ollama import OllamaLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template('''
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. A longer answer isn't necessarily better. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
''')

llm = OllamaLLM(model="llama3.2:1b")

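def format_docs(docs):
    # Join the retrieved chunks' text so the prompt receives plain text
    # rather than the repr of a list of Document objects
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {
        # The retriever embeds the query and returns the nearest chunks
        "context": vectorstore.as_retriever() | format_docs,
        # The raw query string passes through unchanged
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)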

rag_response = rag_chain.invoke(query)
print(rag_response)

A prefix sum and an integral image share the same underlying operation, which involves calculating the sum of elements up to a given index in both cases. In a prefix sum, this is typically done using a simple loop that sums adjacent elements; in an integral image, it's achieved by iteratively adding pixels above and to the left of each pixel.
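
The retriever hides the retrieval step itself: embed the query, then return the stored chunks whose embeddings lie closest to it. The sketch below imitates that with a brute-force Euclidean distance search over toy 3-dimensional vectors. Everything in it is made up for illustration (the vectors, the chunk texts, the helper name `top_k`); real embeddings come from `nomic-embed-text` and have far more dimensions, and FAISS searches them much faster than a Python loop. As far as I know, the LangChain FAISS wrapper defaults to Euclidean distance and the retriever to four results, but check this against your installed version.

```python
import math


def top_k(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are nearest to query_vec (Euclidean distance)."""
    ranked = sorted(indexed_chunks, key=lambda item: math.dist(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": hand-picked vectors, not real model output
toy_index = [
    ("Prefix sums store running totals of an array.", (0.9, 0.1, 0.0)),
    ("An integral image is a 2D table of cumulative sums.", (0.8, 0.3, 0.1)),
    ("Seam carving resizes images by removing low-energy seams.", (0.1, 0.2, 0.9)),
]
query_vec = (0.88, 0.15, 0.02)  # pretend embedding of the query

retrieved = top_k(query_vec, toy_index, k=2)
context = "\n\n".join(retrieved)  # plain-text context to place in the prompt
print(context)
```

Only the two nearest chunks reach the prompt, so the quality of the final answer depends heavily on how well the embedding space separates relevant from irrelevant notes.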


In [7]:
# Let's set up a baseline chain without retrieval to compare our RAG chain against

simple_prompt = PromptTemplate.from_template('''
Use three sentences maximum and keep the answer concise.
Question: {question}
Answer:
''')
simple_chain = simple_prompt | llm | StrOutputParser()

simple_response = simple_chain.invoke(query)
print(simple_response)

A prefix sum in the context of integers, particularly in binary arithmetic, calculates the weighted average of a set of digits. This process involves converting each digit into its equivalent value based on powers of 2 (e.g., 1 = 2^0, 10 = 2^1). The result is then added together to form the final integer sum.


In [8]:
# Use a stronger LLM to judge which response is better

judge_prompt = PromptTemplate.from_template('''
You will be given a question and two different answers to that question. Based on your knowledge, judge which answer is better, in terms of factual accuracy and relevance to the question. If they are about the same, then say so. Use three sentences maximum and keep the answer concise.

Question: {question}
Answer1: {answer1}
Answer2: {answer2}

Your response:
''')
judge_chain = (
        judge_prompt
        | OllamaLLM(model="gemma3n")
        | StrOutputParser()
)

judge_response = judge_chain.invoke({
    "question": query,
    "answer1": rag_response,
    "answer2": simple_response,
})
print(judge_response)

Answer1 is better because it accurately describes the core relationship between prefix sums and integral images: both compute cumulative sums. Answer2 describes a prefix sum in the context of weighted averages and binary arithmetic, which is not the primary connection to integral images and therefore less relevant.
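
The judge's verdict can also be checked without any LLM. The sketch below builds a 1D prefix sum and a 2D integral image (summed-area table) in plain Python, and shows that the integral image of a single-row grid is exactly that row's prefix sum. The function names and sample data are illustrative and not taken from the vault.

```python
from itertools import accumulate


def prefix_sum(values):
    """prefix[i] = values[0] + ... + values[i]."""
    return list(accumulate(values))


def integral_image(grid):
    """table[r][c] = sum of grid[i][j] over all i <= r and j <= c."""
    rows, cols = len(grid), len(grid[0])
    table = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            above = table[r - 1][c] if r > 0 else 0
            left = table[r][c - 1] if c > 0 else 0
            diagonal = table[r - 1][c - 1] if r > 0 and c > 0 else 0
            # Inclusion-exclusion: the diagonal block is counted in both `above` and `left`
            table[r][c] = grid[r][c] + above + left - diagonal
    return table


row = [3, 1, 4, 1, 5]
grid = [
    [1, 2, 3],
    [4, 5, 6],
]
print(prefix_sum(row))        # [3, 4, 8, 9, 14]
print(integral_image([row]))  # [[3, 4, 8, 9, 14]]
print(integral_image(grid))   # [[1, 3, 6], [5, 12, 21]]
```

So an integral image is the two-dimensional generalization of a prefix sum, which agrees with the judge's preference for Answer1.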

