# 🌍 GraphRAG Core: Climate Intelligence Tutorial

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nunezmatias/grafoRag/blob/main/examples/Tutorial_GraphRAG.ipynb)

Welcome to the **GraphRAG Core** tutorial. This library comes **pre-loaded** with a massive Climate Adaptation Knowledge Graph, allowing you to perform deep scientific research instantly.

### 🚀 Features
1. **Plug & Play**: No data setup required. The Climate Knowledge Base is downloaded automatically when it is missing.
2. **Deep Retrieval**: Finds not just one paper, but the entire context around a topic.
3. **Causal Reasoning**: Traverses the graph to find systemic risks (e.g., Heat -> Drought -> Food Insecurity).

## 1. Installation
Install the library directly from GitHub. This sets up the engine.

In [None]:
!pip install git+https://github.com/nunezmatias/grafoRag.git
!pip install -q -U google-genai

import os
from graphrag_core import GraphRAGEngine
print("✅ Libraries Installed & Loaded")

## 2. Initialize the Engine
We initialize the engine without arguments. It will automatically detect missing data and download the Climate Database (~300MB) from the cloud.

In [None]:
engine = GraphRAGEngine()
# Output should say: "Attempting to download from Google Drive..." followed by "System Ready"

## 3. Run a Deep Research Query
We will now perform a complex search. The engine allows you to tune the depth of the investigation:

- **`top_k`** controls breadth (how many distinct topics to start with).
- **`context_k`** controls depth (how many papers to read per topic).
- **`hops`** controls causal reasoning (how far to travel in the graph).

A configuration of `hops=2` allows us to see second-order effects, essential for understanding cascading risks.
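
To make the effect of `hops` concrete, here is a minimal sketch of a hop-limited breadth-first traversal. It is not the library's implementation: the toy graph, its node names, and the `expand` helper are all invented for illustration.

```python
from collections import deque

# Toy causal graph (illustrative only; not taken from the Climate Knowledge Graph).
TOY_EDGES = {
    "Extreme Heat": ["Drought", "Power Outages"],
    "Drought": ["Food Insecurity"],
    "Power Outages": ["Pump Failure"],
    "Pump Failure": ["Urban Flooding"],
}

def expand(start, hops):
    """Return {node: distance} for every node reachable within `hops` edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # do not expand beyond the hop limit
        for neighbor in TOY_EDGES.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen

print(sorted(expand("Extreme Heat", hops=1)))
print(sorted(expand("Extreme Heat", hops=2)))
```

With `hops=1` only the direct effects of the starting node appear. With `hops=2` the traversal also reaches their consequences, which is where second-order effects show up.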

In [None]:
# Define your research question
query = "cascading risks of extreme heat and urban floods"

# Execute the Search
results = engine.search(
    query=query, 
    top_k=3,        # Breadth
    context_k=4,    # Depth
    hops=2          # Causality
)

print("--- Research Stats ---")
print(f"Primary Sources: {results['stats']['primary']}")
print(f"Context Expansion: {results['stats']['context']}")
print(f"Causal Links:      {results['stats']['graph']}")
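
To see how the stats change with depth, the same query can be run at several `hops` values. The sketch below assumes only what the cell above shows: a search callable that accepts `query`, `top_k`, `context_k`, and `hops` and returns a dict with a `stats` entry. The `sweep_hops` helper and the `fake_search` stand-in (with made-up numbers) are hypothetical.

```python
def sweep_hops(search_fn, query, hops_values, top_k=3, context_k=4):
    """Run the same query at several hop depths and collect the stats dict for each."""
    rows = {}
    for hops in hops_values:
        results = search_fn(query=query, top_k=top_k, context_k=context_k, hops=hops)
        rows[hops] = results["stats"]
    return rows

# Stand-in for engine.search so the sketch runs without the library (made-up numbers).
def fake_search(query, top_k, context_k, hops):
    return {"stats": {"primary": top_k, "context": top_k * context_k, "graph": 10 * hops}}

for hops, stats in sweep_hops(fake_search, "extreme heat", [1, 2, 3]).items():
    print(f"hops={hops}: {stats}")
```

In the notebook, passing `engine.search` in place of `fake_search` should give the real numbers, provided the engine accepts these keyword arguments as in the cell above.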

## 4. Inspect the Intelligence
It is important to verify the quality of the retrieved data. Below, we print the top-ranked paper and a sample of the causal chains discovered by the graph traversal.

In [None]:
# 1. Check the Top Paper
if results['papers']:
    p = results['papers'][0]
    print(f"📄 Top Paper: {p['title']}")
    print(f"   Snippet: {p['content'][:200]}...")

# 2. Check Discovered Causal Chains
if results['graph_links']:
    print("\n🔗 Sample Causal Chains:")
    for link in results['graph_links'][:5]:
        print(f"   {link['node1']} --[{link['relation']}]--> {link['node2']}")
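
Individual links can also be joined into longer chains for inspection. The following sketch uses mock links with the same keys the cell above reads (`node1`, `relation`, `node2`). The values and the `two_hop_chains` helper are invented for illustration and are not part of the library.

```python
from collections import defaultdict

# Mock links using the same keys as results['graph_links'] (values are invented).
sample_links = [
    {"node1": "Extreme Heat", "relation": "CAUSES", "node2": "Drought"},
    {"node1": "Drought", "relation": "CAUSES", "node2": "Food Insecurity"},
    {"node1": "Extreme Heat", "relation": "INCREASES", "node2": "Energy Demand"},
]

def two_hop_chains(links):
    """Join links that share a middle node: A -> B and B -> C become (A, B, C)."""
    outgoing = defaultdict(list)
    for link in links:
        outgoing[link["node1"]].append(link["node2"])
    chains = []
    for link in links:
        for end in outgoing.get(link["node2"], []):
            if end != link["node1"]:  # skip trivial A -> B -> A loops
                chains.append((link["node1"], link["node2"], end))
    return chains

for chain in two_hop_chains(sample_links):
    print(" -> ".join(chain))
```

Passing `results['graph_links']` instead of `sample_links` would list the two-step chains present in the retrieved subgraph.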

## 5. Construct the Expert Prompt
We use the engine's built-in expert template to package this structured data into a rigorous prompt for the LLM. This template forces the model to triangulate evidence and cite sources.

In [None]:
# Using the default expert template designed for this Climate Graph
prompt = engine.format_prompt(results, query)

print("Here is your optimized prompt (COPY THIS):")
print("--------------------------------------------------")
print(prompt)
print("--------------------------------------------------")

## 6. Generate Answer with Gemini Flash ⚡
Finally, we send the generated prompt to Google's Gemini model.

**Prerequisite:** Add your API Key to Colab Secrets (Key icon on the left) with the name `GOOGLE_API_KEY`.

In [None]:
from google import genai
from google.colab import userdata
from IPython.display import Markdown, display

# Configure the API key
try:
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
    client = genai.Client(api_key=GOOGLE_API_KEY)
    print("✅ Gemini Client Configured")
except Exception as e:
    print(f"⚠️ Error: could not configure the Gemini client ({e}). Please add 'GOOGLE_API_KEY' to Colab Secrets.")

# Generate content
print("⏳ Generating expert response with Gemini 2.0 Flash...")
try:
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt
    )
    
    # Display the formatted response
    display(Markdown("### 🤖 Response:"))
    display(Markdown(response.text))
except Exception as e:
    print(f"❌ Generation Error: {e}")
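
The cell above depends on `google.colab`, which is only importable inside Colab. As a minimal sketch for running the notebook elsewhere, the helper below tries Colab Secrets first and falls back to an environment variable of the same name. `load_api_key` is a hypothetical helper, not part of any library.

```python
import os

def load_api_key(name="GOOGLE_API_KEY"):
    """Return the API key from Colab Secrets if available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
        key = userdata.get(name)
        if key:
            return key
    except Exception:
        pass  # not running in Colab, or the secret is missing or not shared
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab Secrets or environment variables")
    return key
```

The client can then be created with `genai.Client(api_key=load_api_key())`.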

## 7. Advanced: Build Your Own Prompt Template
Do you want full control? Here is how you can access the raw variables `papers` and `graph_links` to modify the prompt structure entirely before sending it to the LLM.

You can edit the f-string below to change the persona, instructions, or layout.

In [None]:
# --- CUSTOMIZE YOUR TEMPLATE HERE ---
my_role = "You are a Data Journalist writing for a general audience."
my_instruction = "Summarize the risks in 3 bullet points. Be concise."

# 1. Flatten the Papers data into a string
papers_text = ""
for p in results['papers']:
    papers_text += f"- {p['title']}: {p['content'][:200]}...\n"

# 2. Flatten the Graph data into a string
graph_text = ""
for link in results['graph_links']:
    graph_text += f"- {link['node1']} causes {link['node2']}\n"

# 3. Build the F-String (Edit this!)
custom_prompt = (f"ROLE: {my_role}\n"
                 f"QUESTION: {query}\n\n"
                 f"SCIENTIFIC EVIDENCE:\n{papers_text}\n"
                 f"CAUSAL LINKS:\n{graph_text}\n"
                 f"INSTRUCTION: {my_instruction}")

print("--- Custom Prompt Created ---")
print(custom_prompt[:500] + "... [Truncated]")

# 4. Send Custom Prompt to Gemini
print("\n⏳ Generating Custom Response...")
try:
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=custom_prompt
    )
    display(Markdown("### 🤖 Custom Response:"))
    display(Markdown(response.text))
except Exception as e:
    print(f"❌ Generation Error: {e}")
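
To reuse a custom template across queries, the flattening steps can be wrapped in a function. This is a minimal sketch: `build_prompt`, `snippet_chars`, and the mock data are hypothetical, and only the dictionary keys (`papers`, `graph_links`, `title`, `content`, `node1`, `relation`, `node2`) come from the cells above. Unlike the cell above, it prints each link's actual `relation` instead of the fixed word "causes".

```python
def build_prompt(results, query, role, instruction, snippet_chars=200):
    """Assemble a prompt from the `papers` and `graph_links` lists."""
    papers_text = "\n".join(
        f"- {p['title']}: {p['content'][:snippet_chars]}..." for p in results["papers"]
    )
    graph_text = "\n".join(
        f"- {link['node1']} --[{link['relation']}]--> {link['node2']}"
        for link in results["graph_links"]
    )
    return (
        f"ROLE: {role}\n"
        f"QUESTION: {query}\n\n"
        f"SCIENTIFIC EVIDENCE:\n{papers_text}\n\n"
        f"CAUSAL LINKS:\n{graph_text}\n\n"
        f"INSTRUCTION: {instruction}"
    )

# Mock data with the same keys the tutorial reads (contents are invented).
mock_results = {
    "papers": [{"title": "Sample paper", "content": "Heat waves strain urban drainage."}],
    "graph_links": [{"node1": "Extreme Heat", "relation": "CAUSES", "node2": "Drought"}],
}
print(build_prompt(mock_results, "heat risks", "You are a Data Journalist.", "Be concise."))
```

In the notebook, calling `build_prompt(results, query, my_role, my_instruction)` would produce a string that can be passed as `contents` to `client.models.generate_content`.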