# D&D Campaign RAG Experiment

This notebook lets you experiment with the RAG (Retrieval-Augmented Generation) system for your D&D campaign.

## What this notebook does:
1. Loads and chunks your campaign content
2. Creates embeddings using the nomic-embed-text model
3. Builds a FAISS vector store for similarity search
4. Lets you test queries and see what documents are retrieved
5. Shows the full RAG chain in action


In [1]:
# Import necessary libraries
import sys
import os
from pathlib import Path

# Add the project root to Python path
# If we're in notebooks/, go up one level to find the project root
if Path.cwd().name == 'notebooks':
    project_root = Path.cwd().parent
else:
    project_root = Path.cwd()

sys.path.append(str(project_root))
os.chdir(project_root)  # Change working directory to project root

print(f"Current working directory: {os.getcwd()}")
print(f"Project root: {project_root}")

from src.settings import CONTENT_DIR, INDEX_DIR, LLM_MODEL, EMBED_MODEL, CHUNK_SIZE, CHUNK_OVERLAP, TOP_K
from src.rag.index import create_rag_index
from src.rag.chain import make_graph_rag_chain
from src.rag.graph import ObsidianGraphBuilder

from langchain_community.chat_models import ChatOllama
from langchain.prompts import PromptTemplate
from langchain.schema import StrOutputParser

print(f"Content directory: {CONTENT_DIR}")
print(f"Index directory: {INDEX_DIR}")
print(f"LLM Model: {LLM_MODEL}")
print(f"Embedding Model: {EMBED_MODEL}")
print(f"Chunk size: {CHUNK_SIZE}")
print(f"Top K: {TOP_K}")

# LLM Interaction Test
def test_llm_interaction():
    print("\n--- LLM Interaction Test ---")
    
    # Initialize LLM
    llm = ChatOllama(model=LLM_MODEL)
    
    # Simple test prompt
    test_prompt = PromptTemplate.from_template("""
    You are a helpful assistant. 
    Respond to the following query concisely:
    
    {query}
    """)
    
    # Prepare chain
    chain = (
        test_prompt 
        | llm 
        | StrOutputParser()
    )
    
    # Test queries
    test_queries = [
        "What is your name?",
        "Can you help me understand something about D&D?",
        "Tell me a short joke about adventurers."
    ]
    
    for query in test_queries:
        print(f"\nQuery: {query}")
        try:
            result = chain.invoke({"query": query})
            print("Response:", result)
        except Exception as e:
            print(f"Error: {e}")

# Run LLM test
test_llm_interaction()


Current working directory: d:\DnD\1 - Campaign - THE KEEPERS\kob
Project root: d:\DnD\1 - Campaign - THE KEEPERS\kob


  from .autonotebook import tqdm as notebook_tqdm


Content directory: content
Index directory: .rag_index
LLM Model: llama3.1
Embedding Model: nomic-embed-text
Chunk size: 600
Top K: 10

--- LLM Interaction Test ---

Query: What is your name?


  llm = ChatOllama(model=LLM_MODEL)


Response: My name is Nova, nice to meet you!

Query: Can you help me understand something about D&D?
Response: I'd be happy to help with your Dungeons & Dragons (D&D) question.

What specific aspect of D&D would you like clarification on? Character creation, rules, campaign settings, or something else?

Query: Tell me a short joke about adventurers.
Response: Why did the adventurer bring a ladder? Because they wanted to take their journey to new heights!


## Step 1: Load and Chunk Content

This loads all your campaign documents and splits them into chunks for processing.


In [2]:
# Graph-based content loading is handled in the next step
print("=== GRAPH-BASED CONTENT LOADING ===")
print("Content loading will be performed during index creation...")


=== GRAPH-BASED CONTENT LOADING ===
Content loading will be performed during index creation...


## Step 2: Build Vector Store

This creates embeddings for all chunks and builds a FAISS index for fast similarity search.


In [3]:
# Create RAG index with vector store and graph
print("Creating RAG index...")
vector_store, graph_builder, graph_path = create_rag_index(
    CONTENT_DIR, INDEX_DIR, CHUNK_SIZE, CHUNK_OVERLAP, EMBED_MODEL
)

print(f"Vector store created with {vector_store.index.ntotal} vectors")
print(f"Vector dimension: {vector_store.index.d}")
print(f"Graph nodes: {len(graph_builder.graph.nodes)}")
print(f"Graph edges: {len(graph_builder.graph.edges)}")


Creating RAG index...
Loaded 516 document chunks
Loading existing FAISS index...


  embedding = OllamaEmbeddings(model=embed_model)


Graph built with 904 nodes and 2055 edges

Relationship Types:
- wikilink: 768 edges
- has_content_chunk: 516 edges
- adjacent_chunk: 329 edges
- parent_of: 88 edges
- child_of: 88 edges
- member_of: 50 edges
- has_member: 50 edges
- related_to: 48 edges
- originates_from: 29 edges
- origin_of: 29 edges
- associated_with: 27 edges
- associates: 27 edges
- appears_in: 3 edges
- contains: 3 edges

Unknown Entities Analysis:
Total Unknown Entities: 199
Sample Unknown Entities:
- kbα1
- ?
- kobcontent1 Keepers CompendiumimgsALExA
- ALExA 1
- aseir
- Asmodeus_AFR
- kbβ42
- Everywhere
- betsy2
- bianca micharle
- Wildspace
- tarto
- captain bral
- Krynn
- jordal
- H’catha
- darlax
- A Caótica Vastidão do Limbo
- feigarae2
- fraz-urblee
Vector store created with 741 vectors
Vector dimension: 768
Graph nodes: 904
Graph edges: 2055


The default value will be `edges="edges" in NetworkX 3.6.


  nx.node_link_data(G, edges="links") to preserve current behavior, or
  nx.node_link_data(G, edges="edges") for forward compatibility.


## Step 2.1: Analyze Your Content

Let's get some insights about your campaign content.


In [4]:
# Analyze the content using graph builder
print("CONTENT ANALYSIS:")

# Node type analysis
node_types = graph_builder.count_nodes_by_type()
print("\nNode Types:")
for node_type, count in node_types.items():
    print(f"  {node_type}: {count} nodes")

# Most connected nodes
most_connected = graph_builder.find_most_connected_nodes(top_k=10)
print("\nMost Connected Nodes:")
for node in most_connected:
    print(f"  {node}: {node.get('total_connections', 0)} connections")

# Relationship type analysis
relationship_types = {}
for u, v, data in graph_builder.graph.edges(data=True):
    rel_type = data.get('type', 'unknown')
    relationship_types[rel_type] = relationship_types.get(rel_type, 0) + 1

print("\nRelationship Types:")
for rel_type, count in sorted(relationship_types.items(), key=lambda x: x[1], reverse=True):
    print(f"  {rel_type}: {count} edges")

# Unknown entities
unknown_entities = graph_builder.explore_unknown_entities()
print("\nUnknown Entities:")
print(unknown_entities)

CONTENT ANALYSIS:

Node Types:
  character: 50 nodes
  unknown: 199 nodes
  faction: 21 nodes
  content_chunk: 516 nodes
  location: 84 nodes
  item: 11 nodes
  entry: 23 nodes

Most Connected Nodes:
  {'node': 'The Rock of Bral', 'type': 'location', 'total_connections': 127, 'node_details': {'type': 'location', 'file_path': "content\\1 Keepers' Compendium\\wiki\\location\\The Rock of Bral.md", 'title': 'The Rock of Bral', 'aliases': ['Bral', 'Rock of Bral', 'Pedra de Bral', 'A Pedra'], 'location_type': 'City', 'parent': ['[[Greyspace]]'], 'appears_in': [], 'image': ''}}: 127 connections
  {'node': 'Bragora', 'type': 'location', 'total_connections': 83, 'node_details': {'type': 'location', 'file_path': "content\\1 Keepers' Compendium\\wiki\\location\\Bragora.md", 'location_type': 'Quarter', 'parent': ['[[The Rock of Bral|Bral]]'], 'appears_in': [], 'image': ''}}: 83 connections
  {'node': 'La Citta', 'type': 'location', 'total_connections': 73, 'node_details': {'type': 'location', 'fil

In [5]:
# Print the content of nodes that are linked to "The Party"

party_node = "The Party"

# Find nodes linked to "The Party"
linked_nodes = set()
for u, v in graph_builder.graph.edges(party_node):
    if u == party_node:
        linked_nodes.add(v)
    else:
        linked_nodes.add(u)

print('\nNodes linked to "The Party" and their details:')
for node in linked_nodes:
    print(f"\nNode: {node}")
    node_data = graph_builder.graph.nodes[node]
    print("Attributes:")
    for key, value in node_data.items():
        print(f"  {key}: {value}")




Nodes linked to "The Party" and their details:

Node: Vax
Attributes:
  type: character
  file_path: content\1 Keepers' Compendium\game\party\Vax.md
  race: Black Cat (utilizando Tabaxi de referencia
  class: Warlock, Celestial Patron
  height: 1,20m
  origin: [[Greyhawk]]
  known_locations: []
  factions: ['[[kbβ42]]', '[[The Party]]']
  alignment: 
  appears_in: []
  relates_to: []
  image: [[Vax.webp]]

Node: Bad Juju
Attributes:
  type: character
  file_path: content\1 Keepers' Compendium\game\party\Bad Juju.md
  race: Gnome
  class: Bard
  height: 1,20m
  aliases: ['Juniper Cornucopia']
  origin: Baldur's Gate
  known_locations: []
  factions: ['[[kbβ42]]', '[[The Party]]']
  alignment: 
  appears_in: []
  relates_to: []
  image: [[Bad Juju.webp]]

Node: The Party_content_0
Attributes:
  type: content_chunk
  parent_entity: The Party
  chunk_index: 0
  file_path: content\1 Keepers' Compendium\wiki\faction\The Party.md
  content: # PCS

O grupo de player characters.

Node: Rhogar


## Step 3: Test Similarity Search

Let's test the vector store by searching for similar content to your queries.


In [6]:
# Test similarity search
test_queries = [
    "What is the Rock of Bral?",
    "Tell me about the party members",
    "What are the faction relationships?",
    "How does spelljamming work?"
]

for query in test_queries:
    print(f"\n{'='*50}")
    print(f"QUERY: {query}")
    print(f"{'='*50}")
    
    # Search for similar documents
    docs = vector_store.similarity_search(query, k=TOP_K)
    
    for i, doc in enumerate(docs, 1):
        print(f"\n--- Result {i} ---")
        
        # Extract filename from file_path if available
        source = doc.metadata.get('file_path', 'Unknown Source')
        source_filename = Path(source).name if source != 'Unknown Source' else source
        print(f"Source: {source_filename}")
        
        # Try to get node type from graph if possible
        node_type = graph_builder.graph.nodes.get(source_filename, {}).get('type', 'Unknown Type')
        print(f"Node Type: {node_type}")
        
        # Get node details from graph
        node_details = graph_builder.graph.nodes.get(source_filename, {})
        print("Node Details:")
        for key, value in node_details.items():
            if key not in ['type', 'file_path', 'content']:
                print(f"  {key}: {value}")
        
        print(f"Content: {doc.page_content[:300]}...")
        print(f"{'...' if len(doc.page_content) > 300 else ''}")



QUERY: What is the Rock of Bral?

--- Result 1 ---
Source: Unknown Source
Node Type: Unknown Type
Node Details:
Content: ## Soffiera
E a região mais pra baixo está de ponta-cabeça. Debaixo das docas, uma correntinha de água caindo da região de cima se mistura com fumaça vinda de baixo. O porto bloqueia a maior parte da vista, mas vocês conseguem reconhecer construções grandes de pedra e de metal. Canais que parecem o ...
...

--- Result 2 ---
Source: Unknown Source
Node Type: Unknown Type
Node Details:
Content: # Soffiera...


--- Result 3 ---
Source: Unknown Source
Node Type: Unknown Type
Node Details:
Content: # Dynamic...


--- Result 4 ---
Source: Unknown Source
Node Type: Unknown Type
Node Details:
Content: ## Pré-história
Antes do assentamento humano de Bral, outras raças habitaram o asteróide. Foram encontrados artefatos Illithids, beholders e anões de assentamentos nos últimos 800 anos - os mind flayers tendo sido exterminados por beholders, os beholders por outros beholders, 

## Step 4: Test Full RAG Chain

Now let's test the complete RAG system with the LLM.


In [7]:
# Create the Graph-Enhanced RAG chain
print("Creating Graph-Enhanced RAG chain...")
chain = make_graph_rag_chain(
    vector_store, 
    graph_path, 
    LLM_MODEL, 
    TOP_K
)
print("Graph-Enhanced RAG chain created successfully!")

# Test queries with graph context
test_questions = [
    "Who are the main characters in the party?",
    "What is the current status of Bral?",
    "Tell me about the spelljammer ships",
    "What factions are involved in the campaign?"
]

for question in test_questions:
    print(f"\n{'='*60}")
    print(f"QUESTION: {question}")
    print(f"{'='*60}")
    
    try:
        # Run the Graph-Enhanced RAG chain
        result = chain({"query": question})
        
        print(f"\nANSWER:")
        print(result["result"])
        
        # Show graph context
        print(f"\nGRAPH CONTEXT:")
        print(result.get("graph_context", "No graph context available"))
        
        # Show vector context
        print(f"\nVECTOR CONTEXT:")
        print(result.get("vector_context", "No vector context available"))
    
    except Exception as e:
        print(f"Error: {e}")
    
    print(f"\n{'-'*60}")


2025-10-16 18:37:53,333 - INFO - Creating graph-enhanced RAG chain
The default value will be changed to `edges="edges" in NetworkX 3.6.


  nx.node_link_graph(data, edges="links") to preserve current behavior, or
  nx.node_link_graph(data, edges="edges") for forward compatibility.
2025-10-16 18:37:53,408 - INFO - Graph-enhanced RAG chain created successfully
2025-10-16 18:37:53,411 - INFO - Processing query: Who are the main characters in the party?
2025-10-16 18:37:53,412 - INFO - Retrieving context for query: Who are the main characters in the party?


Creating Graph-Enhanced RAG chain...
Graph-Enhanced RAG chain created successfully!

QUESTION: Who are the main characters in the party?


2025-10-16 18:38:18,124 - INFO - Processing query: What is the current status of Bral?
2025-10-16 18:38:18,127 - INFO - Retrieving context for query: What is the current status of Bral?



ANSWER:
Based on the initial context provided, it appears that Juniper Cornucopia (Bad Juju) is one of the main characters in the party. However, I would like to explore further to determine if there are other significant characters mentioned.

Using the graph_exploration_tool, I will focus on the entities related to the party members:

1. **Baang**: The context mentions Baang as a fanboy and someone who is "perdidamente apaixonado" by his master (Aimra'at Jorp). This suggests that Baang might be an important character in the party.
2. **Gizmo**: Gizmo is mentioned as a member of the party, known for being intelligent, skilled, and resourceful. He also seems to have a complex past, which might make him an intriguing character to explore further.

Given this additional context, I can provide more comprehensive information about the main characters in the party:

The primary members of the party appear to be Juniper Cornucopia (Bad Juju), Baang, and Gizmo. Each has unique characteristic

2025-10-16 18:38:46,563 - INFO - Processing query: Tell me about the spelljammer ships
2025-10-16 18:38:46,565 - INFO - Retrieving context for query: Tell me about the spelljammer ships



ANSWER:
Based on the initial context, it appears that Bral is an asteroid with a complex history and multiple entities associated with it. The query "What is the current status of Bral?" suggests that we are looking for information about the current state or condition of Bral.

The initial context provides some relevant information:

* Bral was once a pirate haven, but the Black Brotherhood's leader, Captain Bral, died in an ambush by the Imperial Elf Fleet.
* The city continued to grow and flourish without central control for 60 years after Captain Bral's death.
* Other entities, such as the Illithids, beholders, and dwarves, have also been associated with Bral.

However, there is no explicit information about the current status of Bral. We can explore related entities to gain more context:

Potential Related Entities: ['The Rock of Bral', 'Helm of the Cetacean Queen', 'Hull of Theseus', 'Council of Captains']

To answer the query, I will request additional context for "Bral" using t

2025-10-16 18:39:21,647 - INFO - Processing query: What factions are involved in the campaign?
2025-10-16 18:39:21,648 - INFO - Retrieving context for query: What factions are involved in the campaign?



ANSWER:
Based on the initial context provided, it appears that we are discussing various spelljammer ships and their abilities in the context of a Dungeons & Dragons campaign. The initial context includes descriptions of several ship types, including the Human Sloop, Gizmo, and Smalljammers.

To answer your query about the spelljammer ships, I will rely on the provided context as my primary source of information.

Here's a summary of the key points about the spelljammer ships:

1. **Human Sloop**: Has an Armor Class of 15, Hull Points of 144, and Bulwark Points of 22. It has a Speed of 3,000 ft. (45º) and can carry a minimum of 20 crew members.
2. **Gizmo**: Is designed with eight large spider legs protruding around it. When the Helmsman takes the Collide bridge crew action, they can make a special ship weapon attack using their spellcasting ability modifier against a ship or mega creature within 1,000 feet.
3. **Smalljammers**: Are small versions of the larger spelljammer ships and h

## Step 5: Interactive Query Testing

Use this cell to test your own queries!


In [8]:
# Interactive query testing with graph context
def test_query(query):
    print(f"\n{'='*60}")
    print(f"QUERY: {query}")
    print(f"{'='*60}")
    
    try:
        # First, let's see what documents are retrieved
        docs = vector_store.similarity_search(query, k=TOP_K)
        print(f"\nRETRIEVED DOCUMENTS:")
        for i, doc in enumerate(docs, 1):
            # Extract filename from file_path if available
            source = doc.metadata.get('file_path', 'Unknown Source')
            source_filename = Path(source).name if source != 'Unknown Source' else source
            print(f"\n{i}. {source_filename}")
            
            # Try to get node type from graph if possible
            node_type = graph_builder.graph.nodes.get(source_filename, {}).get('type', 'Unknown Type')
            print(f"   Node Type: {node_type}")
            
            print(f"   {doc.page_content[:200]}...")
        
        # Explore graph connections for key entities
        print("\n--- GRAPH EXPLORATION ---")
        # Extract potential key entities from the query
        key_entities = graph_builder.find_nodes_by_type('character') + \
                       graph_builder.find_nodes_by_type('faction') + \
                       graph_builder.find_nodes_by_type('location')
        
        # Find relevant entities based on query
        relevant_entities = [
            entity for entity in key_entities 
            if any(word.lower() in entity.lower() for word in query.split())
        ]
        
        # Show graph connections for relevant entities
        for entity in relevant_entities[:3]:  # Limit to top 3
            print(f"\nGraph Connections for {entity}:")
            connections = graph_builder.get_strongest_connections(entity)
            for conn in connections:
                print(f"- {conn['node']} (Type: {conn['type']}, Weight: {conn['weight']:.2f})")
        
        # Now run the full Graph-Enhanced RAG chain
        print(f"\n{'='*40}")
        print("GRAPH-ENHANCED RAG RESPONSE:")
        print(f"{'='*40}")
        
        result = chain({"query": query})
        print(result["result"])
        
    
    except Exception as e:
        print(f"Error: {e}")

# Test your own queries here!
test_query("What is the Rock of Bral and who lives there?")
test_query("Tell me about the party's current mission")
test_query("What are the house rules for this campaign?")



QUERY: What is the Rock of Bral and who lives there?


2025-10-16 18:39:59,827 - INFO - Processing query: What is the Rock of Bral and who lives there?
2025-10-16 18:39:59,829 - INFO - Retrieving context for query: What is the Rock of Bral and who lives there?



RETRIEVED DOCUMENTS:

1. Unknown Source
   Node Type: Unknown Type
   # Dynamic...

2. Unknown Source
   Node Type: Unknown Type
   # Soffiera...

3. Unknown Source
   Node Type: Unknown Type
   ## Soffiera
E a região mais pra baixo está de ponta-cabeça. Debaixo das docas, uma correntinha de água caindo da região de cima se mistura com fumaça vinda de baixo. O porto bloqueia a maior parte da ...

4. Unknown Source
   Node Type: Unknown Type
   ## Pré-história
Antes do assentamento humano de Bral, outras raças habitaram o asteróide. Foram encontrados artefatos Illithids, beholders e anões de assentamentos nos últimos 800 anos - os mind flaye...

5. Unknown Source
   Node Type: Unknown Type
   ## Pirate Haven
O notório [[Bral of the Black Brotherhood]] encontrou na pedra um esconderijo para sua frota; plantando árvores no Underside e comida no Topside e construindo seu assentamento onde hoj...

6. Unknown Source
   Node Type: Unknown Type
   # História da Rocha
Bral é uma cidade surpree

2025-10-16 18:40:23,368 - INFO - Processing query: Tell me about the party's current mission
2025-10-16 18:40:23,370 - INFO - Retrieving context for query: Tell me about the party's current mission



RETRIEVED DOCUMENTS:

1. Unknown Source
   Node Type: Unknown Type
   ---
type: faction
parent: ""
location: ""
faction_type: ""
alignment: ""
leader: ""
appears_in: []
---
# PCS

O grupo de player characters.

<!-- DYNAMIC:related-entries -->

# Links...

2. Unknown Source
   Node Type: Unknown Type
   # Baang

![[Baang.jpg]]...

3. Unknown Source
   Node Type: Unknown Type
   ### Earning & Losing Favours (typical)
- **Patronage (low profile).** Offer donations, service, or access to **one faction**. Gain **+1 Favour** with that faction.
- **Enforcement (high profile).** Vi...

4. Unknown Source
   Node Type: Unknown Type
   ## Related Entries
```base
filters:
  and:
    - 'type == "entry"'
    - or:
        - 'list(relates_to).contains(this)'
        - 'list(relates_to).contains(this.file.asLink())'
properties:
  file.na...

5. Unknown Source
   Node Type: Unknown Type
   ## Faction Events
This section references the standing/favours model. For details on Favours, Standing, Meetings

2025-10-16 18:40:43,318 - INFO - Processing query: What are the house rules for this campaign?
2025-10-16 18:40:43,320 - INFO - Retrieving context for query: What are the house rules for this campaign?



RETRIEVED DOCUMENTS:

1. Unknown Source
   Node Type: Unknown Type
   ### Changing Standing
Standing moves by **spending or losing Favours**—no extra clocks.

- **Raising Standing (your choice).** Hold a **Meeting** with a faction and **spend 3 Favours** to increase you...

2. Unknown Source
   Node Type: Unknown Type
   ## Actions
Bastion play uses two families of actions: **Business** (money and projects) and **Political** (relationships and pull). Unless a rule says otherwise, each character takes **one of each** p...

3. Unknown Source
   Node Type: Unknown Type
   #### Meetings (Advancing Standing)
A **Meeting** is a formal audience to change long‑term relations.

- **Cost.** **3 Favours** with that faction, that are consumed.
- **Effect.** Raises your Standing...

4. Unknown Source
   Node Type: Unknown Type
   ## Faction Events
This section references the standing/favours model. For details on Favours, Standing, Meetings, and Cash‑Ins, see **[[Faction Relations]]**....

5. Unkn