# 🌍 GraphRAG Core: Climate Intelligence Tutorial

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nunezmatias/grafoRag/blob/main/examples/Tutorial_GraphRAG.ipynb)

Welcome to the **GraphRAG Core** interactive research tool. This system bridges the gap between unstructured scientific literature and structured causal knowledge.

### 🧠 How it works (The "Node-Centric" Engine)
Unlike traditional RAG, which only retrieves text chunks, this engine operates in three cognitive layers:
1.  **Scope (Vector Search):** Identifies the key *Topics* (Nodes) relevant to your query.
2.  **Depth (Context Expansion):** For each identified topic, it retrieves additional abstracts as specific evidence, so the context stays dense with material on that topic.
3.  **Causality (Graph Traversal):** It follows the causal links in the Knowledge Graph to find downstream effects (e.g., *Heat* -> *Drought* -> *Food Security*).
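
To make the three layers concrete, here is a minimal, self-contained sketch. It is a toy model, not the `graphrag_core` implementation: the `TOPICS` and `CAUSES` data, the function names `scope`, `depth`, and `causality`, and the word-overlap scoring (a stand-in for real vector search) are all assumptions made for illustration.

```python
# Toy illustration only: data, names, and scoring are hypothetical,
# not graphrag_core internals.
from collections import deque

# Hypothetical topic nodes, each with a few made-up "abstracts".
TOPICS = {
    "heat": ["Heat waves raise urban mortality.", "Extreme heat dries soils."],
    "drought": ["Drought cuts crop yields."],
    "food security": ["Yield losses raise food prices."],
    "flood": ["Urban floods damage drainage networks."],
}

# Hypothetical directed causal edges between topics.
CAUSES = {"heat": ["drought"], "drought": ["food security"]}


def scope(query, top_k):
    """Layer 1: rank topics by word overlap with the query (stand-in for vector search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(name.split())), name) for name in TOPICS]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [name for score, name in ranked if score > 0][:top_k]


def depth(topics, context_k):
    """Layer 2: collect up to context_k abstracts per selected topic."""
    return {t: TOPICS[t][:context_k] for t in topics}


def causality(topics, hops):
    """Layer 3: follow causal edges up to `hops` steps from the seed topics."""
    links = []
    seen = set(topics)
    frontier = deque((t, 0) for t in topics)
    while frontier:
        node, dist = frontier.popleft()
        if dist == hops:
            continue  # hop budget exhausted for this branch
        for nxt in CAUSES.get(node, []):
            links.append((node, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return links


topics = scope("extreme heat", top_k=2)
evidence = depth(topics, context_k=1)
links = causality(topics, hops=2)
print(topics, evidence, links)
```

The point of the sketch is the order of operations: topics are selected first, evidence is gathered per topic, and only then is the graph walked outward from those topics.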


## 1. Setup & Installation
We install the `graphrag_core` library (which includes the engine and data) and the Google GenAI SDK for the final answer generation.

In [None]:
!pip install git+https://github.com/nunezmatias/grafoRag.git
!pip install -q -U google-genai

import os
from graphrag_core import GraphRAGEngine
print("✅ Libraries Installed & Loaded")

## 2. Initialize the Knowledge Engine
We initialize the engine without arguments. It will automatically detect if the Climate Database is missing and download it (~300MB) from the cloud to your Colab environment.

In [None]:
engine = GraphRAGEngine()
# Watch the output for the download progress...

## 3. Run a Deep Research Query
Here you can tune the research parameters. Think of this as adjusting the lens of a microscope.

### 🎛️ Parameters Explained:
*   **`top_k` (Breadth):** How many distinct *Topics* (Nodes) to investigate. Higher means a broader scope.
*   **`context_k` (Depth):** How many scientific abstracts to read *per topic*. Higher means more nuance and consensus checking.
*   **`hops` (Causality):** How many steps to traverse in the causal graph.
    *   `1`: Direct effects (Heat -> Health)
    *   `2`: Cascading effects (Heat -> Drought -> Agriculture)
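
The effect of `hops` can be seen on a tiny hand-built graph. The sketch below is an assumption-laden illustration, not the engine's traversal code: the `EDGES` dictionary and the `causal_chains` helper are hypothetical, and the edges simply reuse the examples from the list above.

```python
# Hypothetical causal edges, reusing the examples from the parameter list.
EDGES = {
    "Heat": ["Health", "Drought"],
    "Drought": ["Agriculture"],
}


def causal_chains(start, hops):
    """Return every chain of 1..hops edges that starts at `start`."""
    chains = []

    def walk(path):
        if len(path) - 1 == hops:  # path already uses the full hop budget
            return
        for nxt in EDGES.get(path[-1], []):
            if nxt in path:  # skip cycles
                continue
            chains.append(" -> ".join(path + [nxt]))
            walk(path + [nxt])

    walk([start])
    return chains


print(causal_chains("Heat", hops=1))
# ['Heat -> Health', 'Heat -> Drought']
print(causal_chains("Heat", hops=2))
# ['Heat -> Health', 'Heat -> Drought', 'Heat -> Drought -> Agriculture']
```

Raising `hops` never removes chains; it only adds longer ones, which is why larger values produce more graph context in the final prompt.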

Try changing the query below!

In [None]:
# Define your research question
query = "cascading risks of extreme heat and urban floods"

# Execute the Search
results = engine.search(
    query=query, 
    top_k=3,        # Look for 3 main topics
    context_k=4,    # Read 4 papers per topic
    hops=2          # Find 2nd order consequences
)

print("--- Research Stats ---")
print(f"Primary Sources: {results['stats']['primary']}")
print(f"Context Expansion: {results['stats']['context']}")
print(f"Causal Links:      {results['stats']['graph']}")

## 4. Inspect the Retrieved Intelligence
Before generating the answer, let's verify what the engine found. This "White Box" approach builds trust in the result.

In [None]:
# 1. Check the Top Paper
if results['papers']:
    p = results['papers'][0]
    print(f"📄 Top Paper: {p['title']}")
    print(f"   Snippet: {p['content'][:200]}...")

# 2. Check Discovered Causal Chains
if results['graph_links']:
    print("\n🔗 Sample Causal Chains:")
    for link in results['graph_links'][:5]:
        print(f"   {link['node1']} --[{link['relation']}]--> {link['node2']}")

## 5. Construct the Expert Prompt
The engine uses a specialized template to package this data for the LLM. It explicitly instructs the model to:
1.  **Triangulate** data (Graph vs Text).
2.  Identify **Cascades**.
3.  Cite sources with **URLs**.
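
As a rough picture of what such a template might look like, here is a minimal sketch. It is not the code behind `engine.format_prompt`: the `build_prompt` function, the section labels, and the `url` key on each paper are assumptions for illustration. Only the `title`, `content`, `node1`, `relation`, and `node2` keys appear elsewhere in this notebook.

```python
def build_prompt(results, query, system_role):
    """Hypothetical stand-in for a prompt template; not engine.format_prompt."""
    papers = "\n".join(
        f"[{i}] {p['title']} ({p.get('url', 'no URL')})\n    {p['content']}"
        for i, p in enumerate(results["papers"], start=1)
    )
    links = "\n".join(
        f"{link['node1']} --[{link['relation']}]--> {link['node2']}"
        for link in results["graph_links"]
    )
    instructions = (
        "1. Triangulate: check graph links against the paper text.\n"
        "2. Identify cascading effects.\n"
        "3. Cite sources with their URLs."
    )
    return "\n\n".join([
        system_role,
        f"QUESTION:\n{query}",
        f"PAPERS:\n{papers}",
        f"CAUSAL LINKS:\n{links}",
        f"INSTRUCTIONS:\n{instructions}",
    ])


# Made-up data in the same shape as the results inspected earlier.
toy_results = {
    "papers": [
        {"title": "Example paper A", "content": "Placeholder abstract.",
         "url": "https://example.org/a"},
    ],
    "graph_links": [
        {"node1": "Heat", "relation": "causes", "node2": "Drought"},
    ],
}
print(build_prompt(toy_results, "What follows extreme heat?", "You are a reviewer."))
```

Keeping the papers and the causal links in separate labeled sections is what lets the model compare the two sources rather than blend them.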

You can customize the `system_role` below to change the persona (e.g., "Policy Maker", "Engineer").

In [None]:
prompt = engine.format_prompt(
    results,
    query,
    system_role="You are a Senior Climate Adaptation Specialist at the UN."
)

# print(prompt) # Uncomment to see the full massive prompt

## 6. Generate Answer with Gemini Flash ⚡
Finally, we send the prompt to Google's Gemini model to synthesize the final report.

**Note:** You need a Google API Key.
1. Get it from [Google AI Studio](https://aistudio.google.com/).
2. In Colab, click the **Key icon** (Secrets) on the left sidebar.
3. Add a new secret named `GOOGLE_API_KEY` with your key value.
4. Toggle the "Notebook access" switch to ON.
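
If you run this notebook outside Colab, the Secrets panel is not available. One possible workaround, sketched below, is to fall back to an environment variable. The `load_api_key` helper is hypothetical and not part of `graphrag_core` or the Google SDK; it assumes you have exported `GOOGLE_API_KEY` in your shell.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the key from Colab Secrets if available, else from the environment.

    Hypothetical helper for illustration; returns None when no key is found.
    """
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
        return userdata.get(name)
    except Exception:
        # Not in Colab, or the secret is missing or not shared with this notebook.
        return os.environ.get(name)


key = load_api_key()
print("API key found" if key else "API key missing")
```

The resulting value can be passed to `genai.Client(api_key=...)` in place of the `userdata.get` call.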

In [None]:
from google import genai
from google.colab import userdata
from IPython.display import Markdown, display

# Configure the API key
try:
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
    client = genai.Client(api_key=GOOGLE_API_KEY)
    print("✅ Gemini Client Configured")
except Exception as e:
    print(f"⚠️ Error: could not configure Gemini ({e}). Please add 'GOOGLE_API_KEY' to Colab Secrets.")

# Generate content
print("⏳ Generating expert response with Gemini 2.0 Flash...")
try:
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt
    )
    
    # Display the formatted response
    display(Markdown("### 🤖 Response:"))
    display(Markdown(response.text))
except Exception as e:
    print(f"❌ Generation Error: {e}")