# RAG (Retrieval-Augmented Generation)

## Introduction

In this notebook we will:
* Load the index created in `02_indexing.ipynb` (vectorstore + metadata).
* Retrieve the chunks relevant to a user question using semantic search.
* Apply a simple anti-hallucination rule: if there is not enough evidence, make no recommendation and ask for clarification.
* Generate an answer with an *instruct* model using only the retrieved text (grounded answer).
* Include citations (snippets + URLs) to justify the recommendations.

We will use LlamaIndex for retrieval and generation, with the persistent Chroma vectorstore as the knowledge base.

## Reference documents
* [Persisting & Loading Data | LlamaIndex Python Documentation](https://developers.llamaindex.ai/python/framework/module_guides/storing/save_load/)
* [Retriever | LlamaIndex Python Documentation](https://developers.llamaindex.ai/python/framework/module_guides/querying/retriever/)

## Imports
* We import `chromadb` and `ChromaVectorStore` to reconnect to the persistent DB.
* From `llama_index` we bring in:
    * [StorageContext](https://developers.llamaindex.ai/python/framework-api-reference/storage/storage_context/) and `load_index_from_storage`, which are needed to load the index.
    * [Settings](https://developers.llamaindex.ai/python/framework/module_guides/supporting_modules/settings/) to set the global embedding and LLM configuration we are going to use.
    * [HuggingFaceEmbedding](https://developers.llamaindex.ai/typescript/framework/modules/models/embeddings/huggingface/) to keep using the `all-MiniLM-L6-v2` embedding model.
    * [HuggingFaceLLM](https://developers.llamaindex.ai/python/framework/integrations/llm/huggingface/) to use an LLM that generates the answer.
* `re` from the standard library, used later to parse the snippet IDs cited in the answer.

In [1]:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
import re

In [2]:
# Configure embeddings
Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

## LLM
For this task we need an _instruct_ model that follows instructions to generate answers for users based on retrieved text (RAG). We can also tell it not to answer when it has no data. [`phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) is small, can run locally, and is sufficient for the task. A 4k-token context is enough for the instructions, the prompt, the retrieved chunks, and a short generated answer.
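As a rough sanity check on that budget, the sketch below estimates whether a prompt plus the 300 tokens reserved for the answer fits in a 4096-token window. The 4-characters-per-token ratio is a crude assumption, not a property of the Phi-3 tokenizer, and `estimate_tokens` and `fits_context` are hypothetical helpers. For an exact count, tokenize the prompt with the model's own tokenizer instead.

```python
CONTEXT_WINDOW = 4096   # 4k context of phi-3-mini-4k-instruct, in tokens
MAX_NEW_TOKENS = 300    # same value passed to HuggingFaceLLM in this notebook
CHARS_PER_TOKEN = 4     # ASSUMPTION: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Crude token estimate from the character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(prompt: str, max_new_tokens: int = MAX_NEW_TOKENS) -> bool:
    """True if the estimated prompt size leaves room for the generated answer."""
    return estimate_tokens(prompt) + max_new_tokens <= CONTEXT_WINDOW


short_prompt = "instructions " * 100          # 1,300 characters
long_prompt = "retrieved chunk text " * 1000  # 21,000 characters
print(fits_context(short_prompt))  # True
print(fits_context(long_prompt))   # False
```

With three retrieved chunks the prompt normally stays well inside the window, but the check becomes useful if `similarity_top_k` or the chunk size grows.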

In [3]:
# Configure LLM
Settings.llm = HuggingFaceLLM(
    model_name="microsoft/phi-3-mini-4k-instruct",
    tokenizer_name="microsoft/phi-3-mini-4k-instruct",
    device_map="auto",
    max_new_tokens=300,
)

Some parameters are on the meta device because they were offloaded to the disk.


In [4]:
# Load data
chroma_client = chromadb.PersistentClient(path="../data/chroma_db")
chroma_collection = chroma_client.get_or_create_collection("charities")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

storage_context = StorageContext.from_defaults(
    persist_dir="../data/index_store",
    vector_store=vector_store,
)

index = load_index_from_storage(storage_context)
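Because `get_or_create_collection` creates the collection when it does not exist, a wrong path can go unnoticed at the Chroma step and only surface later as an empty or failing index. A minimal guard, assuming the two directories used above, is to check that they exist before loading. `missing_paths` is a hypothetical helper, not part of Chroma or LlamaIndex.

```python
from pathlib import Path


def missing_paths(paths: list[str]) -> list[str]:
    """Return the subset of `paths` that do not exist as directories."""
    return [p for p in paths if not Path(p).is_dir()]


# Directories used by the loading cell in this notebook.
required = ["../data/chroma_db", "../data/index_store"]
missing = missing_paths(required)
if missing:
    print(f"Missing persisted data, run 02_indexing.ipynb first: {missing}")
else:
    print("Persisted index found.")
```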

## Query
LlamaIndex offers several querying modules. Here we use a `Retriever` because it lets us check whether any chunks relate to the user's question and avoid answering without data.
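One caveat: a top-k retriever returns the nearest chunks it can find even when none is really on topic, so counting results alone may not detect weak evidence. The nodes returned by a retriever carry a similarity `score`, and a cutoff on that score is one way to tighten the rule. The sketch below illustrates the idea with a stand-in `ScoredChunk` class and made-up texts and scores. The 0.35 threshold is an arbitrary assumption that would need tuning on the real index. LlamaIndex also provides `SimilarityPostprocessor` with a `similarity_cutoff` parameter for the same purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredChunk:
    """Stand-in for a retrieved node: text plus a similarity score."""
    text: str
    score: Optional[float]


def filter_by_score(chunks: list[ScoredChunk], min_score: float) -> list[ScoredChunk]:
    """Keep chunks whose score is present and at least `min_score`."""
    return [c for c in chunks if c.score is not None and c.score >= min_score]


# Made-up texts and scores, for illustration only.
retrieved = [
    ScoredChunk("Cash transfers to households in extreme poverty.", 0.62),
    ScoredChunk("Annual report cover page.", 0.18),
    ScoredChunk("Savings groups and livelihood training.", 0.47),
]
MIN_SCORE = 0.35  # ASSUMPTION: illustrative cutoff, not tuned on this index
relevant = filter_by_score(retrieved, MIN_SCORE)
print(len(relevant))  # 2
```

The "not enough evidence" check would then count `relevant` rather than the raw retrieval result.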

In [6]:
# Retriever
retriever = index.as_retriever(similarity_top_k=3)

SYSTEM_PROMPT = """You are a donation advisor.

STRICT RULES:
- Use ONLY the provided context snippets.
- Never invent numbers, ratings, or organizational details.
- If the context is insufficient, say exactly: "I can't recommend based on my sources." and ask 1–2 clarifying questions.
- Always include citations in the form: [SNIPPET X + URL].
- Cite ONLY the snippets you actually relied on in your answer.
- Do not add extra sections. Do not repeat the instructions.
"""

ANSWER_FORMAT = """Answer using EXACTLY these 5 sections and STOP after section 5.

Rules:
- Use IDs A, B, C and reuse the same IDs in all sections.
- Max 3 charities.
- Keep it short: each line in sections 2 and 3 must be <= 18 words.
- No sub-bullets.
- No numbers unless explicitly present in snippets.
- If a section has no content, write "None."
- In section 5, copy the FULL URL exactly as shown in the context snippets.
- Do NOT write the word "URL". Use the actual link.
STOP after section 5.

1) Recommended charities
A) <Charity name>
B) <Charity name>
C) <Charity name>

2) Why
A) <reason>
B) <reason>
C) <reason>

3) Transparency notes
A) <note or "None.">
B) <note or "None.">
C) <note or "None.">

4) What I'm unsure about
- <one uncertainty>
(or write "None.")

5) Citations
- A) SNIPPET X — <full link from snippet>
- B) SNIPPET Y — <full link from snippet>
- C) SNIPPET Z — <full link from snippet>

End with <<<END>>>.
"""

def get_recommendation(query: str, min_sources: int = 2, max_snippets: int = 3):
    nodes = retriever.retrieve(query) or []

    # If there is not enough information, ask clarifying questions
    if len(nodes) < min_sources:
        return {
            "answer": (
                "I can't recommend based on my sources. "
                "Could you clarify your cause preference and whether you have any region constraints?"
            ),
            "citations": []
        }

    context_parts = []
    citations = []

    for idx, node in enumerate(nodes[:max_snippets], start=1):
        md = node.node.metadata or {}
        snippet_text = node.node.get_text().strip()

        # If the snippet is empty for whatever reason, skip it
        if not snippet_text:
            continue

        source_url = md.get("source_url", "")
        source_primary = md.get("source_primary", "")

        context_parts.append(
            f"[SNIPPET {idx}] SOURCE: {source_primary}\n"
            f"{snippet_text}\n"
            f"URL: {source_url}\n"
        )

        citations.append({
            "snippet_id": idx,
            "charity_name": md.get("name", ""),
            "source_url": source_url,
            "primary_source": source_primary
        })

    # If we do not have enough context, ask clarifying questions
    if len(context_parts) < min_sources:
        return {
            "answer": (
                "I can't recommend based on my sources. "
                "Could you clarify your cause preference and any region constraints?"
            ),
            "citations": []
        }

    prompt = f"""{SYSTEM_PROMPT}

User question:
{query}

Context snippets:
{chr(10).join(context_parts)}

{ANSWER_FORMAT}

End your answer with the token <<<END>>>.

Answer:
"""
    raw = Settings.llm.complete(prompt).text
    response = raw.split("<<<END>>>")[0].strip()

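    # Keep only the citations the model actually referenced. The opening bracket
    # is optional because ANSWER_FORMAT shows "SNIPPET X" without brackets.
    used = {int(x) for x in re.findall(r"\[?SNIPPET\s*(\d+)", response)}
    filtered = [c for c in citations if c["snippet_id"] in used]
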
    return {"answer": response, "citations": filtered}

# Example
query = "I have €20/month and want to reduce extreme poverty with measurable outcomes. Recommend 1-3 charities."
result = get_recommendation(query)
print(result["answer"])

1) Recommended charities
A) Ultra-Poverty Graduation Programs Multi-Component
B) Raising The Village
C) Ultra-Poor Entrepreneurship Programs

2) Why
A) Asset transfer plus support, sustainable livelihoods, women-led.
B) Livelihood development, health, savings groups, education support.
C) Business skills, seed capital, mentorship, sustainable micro-enterprises.

3) Transparency notes
A) Collects detailed household data, multiple RCTs.
B) Publishes detailed program data, household outcome metrics.
C) Collects outcome data through surveys, multiple RCTs.

4) What I'm unsure about
- None.

5) Citations
- A) [SNIPPET 1] — https://www.thelifeyoucansave.org/best-charities/raising-the-village/
- B) [SNIPPET 2] — https://www.thelifeyoucansave.org/best-charities/raising-the-village/
- C) [SNIPPET 3] — https://www.thelifeyoucansave.org/best-charities/village-enterprise
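
Since the citations are meant to justify the recommendations, it can help to verify after generation that every link in the answer comes from the retrieved context. The sketch below is one possible check, not part of the notebook: `unsupported_urls` is a hypothetical helper, and the `example.org` link is invented to show a failing case. In this notebook the allowed set could be built from the `source_url` values collected while assembling the context.

```python
import re

# Match http(s) links up to whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def _normalize(url: str) -> str:
    """Drop trailing slashes and punctuation so equivalent links compare equal."""
    return url.rstrip("/.,;")


def unsupported_urls(answer: str, allowed_urls: set[str]) -> list[str]:
    """Return links found in `answer` that are not among the retrieved sources."""
    allowed = {_normalize(u) for u in allowed_urls}
    found = [_normalize(u) for u in URL_PATTERN.findall(answer)]
    return [u for u in found if u not in allowed]


# Source URLs taken from the output above.
allowed = {
    "https://www.thelifeyoucansave.org/best-charities/raising-the-village/",
    "https://www.thelifeyoucansave.org/best-charities/village-enterprise",
}
answer = (
    "- A) [SNIPPET 1] https://www.thelifeyoucansave.org/best-charities/raising-the-village/\n"
    "- B) [SNIPPET 2] https://example.org/made-up-link\n"  # invented link
)
print(unsupported_urls(answer, allowed))  # ['https://example.org/made-up-link']
```

A non-empty result would be a reason to discard the answer or fall back to the clarifying question. This check only confirms that a link was retrieved, not that it belongs to the charity it is attached to.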
