# In-Memory Context Augmented Generation (CAG)
**Goal:** Keep the knowledge base in RAM for the duration of the session, with no external database.
**Method:** We use LlamaIndex's `SummaryIndex`, which stores the document nodes in a plain Python list in memory. When you ask a question, its default retriever passes all stored nodes to the LLM as context; there is no embedding or similarity-search step.

**Note:** The index lives only in the kernel's memory, so it is lost when you restart the kernel.
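
The idea behind a list-backed index can be sketched without any library: keep the passages in a Python list and paste them into the prompt ahead of the question. The snippet below is a minimal illustration only, not LlamaIndex's internal implementation; `knowledge` and `build_prompt` are hypothetical names, and the prompt wording is an assumption.

```python
# Minimal sketch of list-based context stuffing (not LlamaIndex internals).
# `knowledge` and `build_prompt` are hypothetical names used for illustration.
knowledge = [
    "DETI is located in Building 11 of University of Aveiro.",
    "The cafeteria in Building 11 serves sandwiches and coffee until 6 PM.",
]


def build_prompt(question: str, passages: list[str]) -> str:
    """Join every stored passage into one context block and append the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt("When does the cafeteria close?", knowledge)
print(prompt)  # this string is what would be sent to the LLM
```

Because every passage is included in the prompt, this approach suits small knowledge bases; the prompt grows with each document added.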

In [1]:
from llama_index.core import Document, SummaryIndex, Settings
from llama_index.llms.ollama import Ollama

# 1. Set up the LLM
# Assumes a local Ollama server is running and the qwen3:0.6b model has been pulled.
Settings.llm = Ollama(model="qwen3:0.6b", request_timeout=120.0)

# 2. Create "In-Memory" Documents
# These are just Python objects living in RAM.
docs = [
    Document(text="DETI (Dept of Electronics) is located in Building 11 of University of Aveiro."),
    Document(text="The Director of DETI is Prof. Nuno Borges Carvalho."),
    Document(text="To access the labs, students must use their student card (Cartão de Estudante)."),
    Document(text="The cafeteria in Building 11 serves sandwiches and coffee until 6 PM.")
]

# 3. Build Index (In-Memory)
# SummaryIndex keeps all nodes in a list. No Vector DB required.
index = SummaryIndex.from_documents(docs)

# 4. Query Engine
# This engine reads the nodes from the in-memory list and passes them to the LLM as context.
query_engine = index.as_query_engine()

print("✅ In-Memory Index Ready.")

✅ In-Memory Index Ready.


In [2]:
# 5. Ask Questions
response = query_engine.query("Who is the director?")
print(f"Q: Who is the director?\nA: {response}\n")

response = query_engine.query("When does the cafeteria close?")
print(f"Q: When does the cafeteria close?\nA: {response}")

Q: Who is the director?
A: The director of DETI is Prof. Nuno Borges Carvalho.

Q: When does the cafeteria close?
A: The cafeteria in Building 11 closes at 6 PM.
