In [1]:
# Let's perform traditional RAG on a mock Obsidian vault

In [2]:
from dotenv import load_dotenv

load_dotenv()  # load environment variables

True

In [3]:
# Indexing: Let's load our full documents

from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader("./", glob="**/*.md", show_progress=True, use_multithreading=True)

full_documents = loader.load()
full_documents[0]


Document(metadata={'source': 'rsc/vault/Priority Queue.md'}, page_content='A priority queue is an abstract data structure, similar to a queue, in which each element has an associated priority and elements with high priority are served before elements with low priority.\n\n![[priority_queue_overview.png|500]]\n\nPriority queues are commonly implemented using [[Heap|heaps]], giving $O(\\log n)$ performance for inserts and removals, and $O(n)$ to build the heap initially from a set of $n$ elements.\n\nOperations\n\nPriority queues support the following operations:\n\nBasic - enqueue: add an element to the queue with an associated priority. - dequeue: remove the highest priority element from the queue, and return it. - delete: remove an element from the queue. - peek: return the highest priority element from the queue.\n\nInspection - size: return the number of elements in the queue. - is_empty: check whether the queue has no elements.\n\nEquivalence of priority queues and sorting algorith

In [4]:
# Indexing: Let's chunk and insert our documents into FAISS

from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
documents = text_splitter.split_documents(full_documents)

# store in FAISS with nomic embeddings (ollama)
embedding_model = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = FAISS.from_documents(documents=documents, embedding=embedding_model)

documents[0]

Document(metadata={'source': 'rsc/vault/Priority Queue.md'}, page_content='A priority queue is an abstract data structure, similar to a queue, in which each element has an associated priority and elements with high priority are served before elements with low priority.\n\n![[priority_queue_overview.png|500]]\n\nPriority queues are commonly implemented using [[Heap|heaps]], giving $O(\\log n)$ performance for inserts and removals, and $O(n)$ to build the heap initially from a set of $n$ elements.\n\nOperations\n\nPriority queues support the following operations:')
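
The splitter above caps each chunk at 512 characters and lets neighbouring chunks share up to 64 characters. As a rough illustration of what those two parameters mean, here is a fixed-width sliding window. It is a simplified stand-in, not how RecursiveCharacterTextSplitter works internally: that class prefers to break on separators such as paragraph and line breaks, which is why documents[0] above ends at a paragraph boundary. The name sliding_window_chunks is a hypothetical helper, not part of LangChain.

```python
def sliding_window_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Cut text into fixed-width windows; consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Illustrative input: 1000 characters of repeating alphabet (placeholder data).
sample = "".join(chr(ord("a") + i % 26) for i in range(1000))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
print(chunks[0][-64:] == chunks[1][:64])  # the shared overlap region
```

Smaller chunks give the retriever more precise targets but less surrounding context per hit; the overlap reduces the chance that a sentence relevant to a query is cut in half at a boundary.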

In [58]:
# Define a query

# query = "Given an integer array nums, handle multiple queries of the following types: 1. Update the value of an element in nums. 2. Calculate the sum of the elements of nums between indices left and right inclusive where left <= right. Give a broad strokes overview of how you'd solve this?"

# query = "What data structures and algorithms do you use to solve the trapping rainwater coding problem?"

# query = "What is a fenwick tree?"

# query = "What is the seam carving algorithm?"

# query = "What is a multiset good for?"

query = "What is the relationship between a prefix sum and an integral image?"

In [59]:
# Retrieval and generation: Let's set up a RAG chain

from langchain_ollama import OllamaLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template('''
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
''')

llm = OllamaLLM(model="llama3.2:1b")

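rag_chain = (
    {
        # The retriever returns a list of Document objects, which the prompt
        # stringifies as-is (metadata included) when filling {context}.
        "context": vectorstore.as_retriever(),
        # The query string passes through unchanged as {question}.
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)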

rag_response = rag_chain.invoke(query)
print(rag_response)

A prefix sum typically represents a running sum of elements from an array up to a given index, whereas an integral image is the 2D extension of a prefix sum that can also represent sums in higher dimensions. The relationship between the two is analogous; the value for each pixel in an integral image corresponds to the sum of all pixels in the source image (or other dimension). This means that, conceptually, an integral image is a 2-dimensional analogue of a prefix sum.
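
In the chain above, the retriever's output is placed into {context} unchanged, so the prompt receives the string form of a list of Document objects, metadata included. A common refinement is to join only the page text before it reaches the prompt. A minimal sketch follows; FakeDocument is a hypothetical stand-in for LangChain's Document so the snippet runs without a vector store, format_docs is an illustrative helper name, and the two passages are made-up placeholder text.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each retrieved chunk, separated by blank lines, dropping metadata."""
    return "\n\n".join(doc.page_content for doc in docs)


# Placeholder chunks standing in for what a retriever might return.
retrieved = [
    FakeDocument("A prefix sum is a running total of an array.", {"source": "example_a.md"}),
    FakeDocument("An integral image stores cumulative sums over a 2D grid.", {"source": "example_b.md"}),
]
context = format_docs(retrieved)
print(context)
```

In the chain this would be wired as vectorstore.as_retriever() | format_docs for the "context" entry, since LCEL coerces a plain function into a runnable when it is piped from one. Whether the cleaner context changes the 1B model's answers is not something this notebook measures.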


In [60]:
# Let's set up a simple no-retrieval baseline chain to compare our RAG chain against

simple_prompt = PromptTemplate.from_template('''
Use three sentences maximum and keep the answer concise.
Question: {question}
Answer:
''')
simple_chain = simple_prompt | llm | StrOutputParser()

simple_response = simple_chain.invoke(query)
print(simple_response)

A prefix sum is the result of adding all the digits in a number, while integral image represents a single digit. The process of converting a decimal to another base involves both these operations. For example, 128 in base 10 can be converted to integral image (one with the digit 1) in base 6 by prefix summing the original value.
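
For reference when comparing the two answers, the standard definitions are easy to state in code: a prefix sum stores running totals of a 1D array, and an integral image (summed-area table) stores, for each cell, the sum of everything above and to the left of it, inclusive. A minimal pure-Python sketch with illustrative function names and toy inputs:

```python
from itertools import accumulate


def prefix_sums(values):
    """Running totals: result[i] is the sum of values[0..i]."""
    return list(accumulate(values))


def integral_image(grid):
    """table[r][c] is the sum of grid[0..r][0..c], inclusive on both axes."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    table = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_total = 0  # prefix sum along the current row
        for c in range(cols):
            row_total += grid[r][c]
            above = table[r - 1][c] if r > 0 else 0
            table[r][c] = row_total + above
    return table


print(prefix_sums([3, 1, 4, 1]))
print(integral_image([[1, 2], [3, 4]]))
# A one-row grid reduces to the 1D case.
print(integral_image([[3, 1, 4, 1]]) == [prefix_sums([3, 1, 4, 1])])
```

The last line shows the relationship the query asks about: restricted to a single row, the integral image is exactly the prefix sum of that row.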


In [61]:
# Use a stronger LLM to judge which response is better

judge_prompt = PromptTemplate.from_template('''
You will be given a question and two different answers to that question. Based on your knowledge, judge which answer is better, in terms of factual accuracy and relevance to the question. If they are about the same, then say so. Use three sentences maximum and keep the answer concise.

Question: {question}
Answer1: {answer1}
Answer2: {answer2}

Your response:
''')
judge_chain = (
    judge_prompt
    | OllamaLLM(model="gemma3n")
    | StrOutputParser()
)

judge_response = judge_chain.invoke({
    "question": query,
    "answer1": rag_response,
    "answer2": simple_response,
})
print(judge_response)

Answer1 is significantly better. It accurately describes the relationship between a prefix sum and an integral image, highlighting the 2D analogy and how integral images store cumulative sums. Answer2 provides a fundamentally incorrect definition of both concepts and a nonsensical example, demonstrating a lack of understanding. 
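
A single judge call can be sensitive to the order in which the answers are presented, a known weakness of LLM-as-judge setups. One inexpensive check is to ask twice with the answers swapped and accept a winner only when both verdicts agree. The sketch below assumes a hypothetical judge callable that returns "1", "2", or "tie"; the judge_chain above returns free text, so its output would first need to be mapped onto such labels. keyword_judge and first_slot_judge are toy stand-ins so the snippet runs without a model.

```python
from typing import Callable

# (question, answer1, answer2) -> "1", "2", or "tie"
Judge = Callable[[str, str, str], str]


def judge_both_orders(judge: Judge, question: str, answer_a: str, answer_b: str) -> str:
    """Return "a" or "b" only if the judge picks the same answer in both orders."""
    first = judge(question, answer_a, answer_b)   # answer_a shown in slot 1
    second = judge(question, answer_b, answer_a)  # answer_b shown in slot 1
    # Translate the second verdict back into the slot numbering of the first call.
    flipped = {"1": "2", "2": "1"}.get(second, second)
    if first == flipped and first in ("1", "2"):
        return "a" if first == "1" else "b"
    return "inconclusive"


def keyword_judge(question: str, answer1: str, answer2: str) -> str:
    """Toy judge: prefers whichever answer mentions "2D"."""
    hit1, hit2 = "2D" in answer1, "2D" in answer2
    if hit1 == hit2:
        return "tie"
    return "1" if hit1 else "2"


def first_slot_judge(question: str, answer1: str, answer2: str) -> str:
    """Toy judge with pure position bias: always prefers slot 1."""
    return "1"


print(judge_both_orders(keyword_judge, "q", "a 2D analogue", "sum of digits"))
print(judge_both_orders(first_slot_judge, "q", "a 2D analogue", "sum of digits"))
```

The position-biased judge contradicts itself once the order is swapped, so its verdict is reported as inconclusive rather than counted as a win for either answer.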

