# 🤖 LLM Integration – End-to-End RAG Pipeline

This notebook completes the Retrieval-Augmented Generation (RAG) loop by:

- Formatting top retrieved documents into a prompt
- Sending that prompt to a language model (Mistral AI)
- Comparing generated answers with and without context

In [1]:
import os
import sys
import json
from pathlib import Path

# Add the project root (the parent directory) to the path so the `src` package can be imported
sys.path.append(os.path.abspath('..'))

from src.rag_pipeline import RAGPipeline
from src.mistral_generator import MistralGenerator

# Set paths
DATA_DIR = Path("../data")
PROCESSED_DIR = DATA_DIR / "processed"
REAL_DOCS_DIR = DATA_DIR / "real_docs"

# Set Mistral API key - replace with your own or use environment variable
# os.environ["MISTRAL_API_KEY"] = "your-api-key-here"

# Check if API key is set
if not os.environ.get("MISTRAL_API_KEY"):
    print("⚠️ Warning: MISTRAL_API_KEY environment variable not set.")
    print("Please set your API key using os.environ['MISTRAL_API_KEY'] = 'your-key-here'")
    print("or export it in your environment before running this notebook.")

## 1. Load Documents and Initialize Pipeline

In [2]:
# Load sample docs
documents = []
for filename in ["hrv_basics.txt", "sleep_optimization.txt", "training_recovery.txt"]:
    try:
        # Read as UTF-8 explicitly so the result does not depend on the platform default
        with open(REAL_DOCS_DIR / filename, 'r', encoding='utf-8') as f:
            doc_content = f.read()
        doc_id = Path(filename).stem  # file name without its extension
        documents.append({"id": doc_id, "content": doc_content, "metadata": {"source": filename}})
    except FileNotFoundError:
        print(f"Warning: File {filename} not found")

print(f"Loaded {len(documents)} documents")

# Build pipeline
pipeline = RAGPipeline()
pipeline.load_documents_from_list(documents)

# Define a query
query = "Based on my HRV, should I reduce tomorrow's workout?"

## 2. Retrieve Relevant Documents

In [3]:
# Retrieve top docs
retrieval_results = pipeline.query(query, top_k=2, rerank=True)

# Format documents for LLM
retrieved_docs = []
for doc in retrieval_results['results']:
    retrieved_docs.append({
        "id": doc["id"],
        "content": doc["content"],
        "metadata": {
            "source": doc.get("metadata", {}).get("source", "Unknown"),
            "score": doc["score"]
        }
    })

# Display retrieved documents
print(f"Retrieved {len(retrieved_docs)} documents for query: '{query}'\n")
for i, doc in enumerate(retrieved_docs):
    print(f"Document {i+1}: {doc['id']} (Score: {doc['metadata']['score']:.3f})")
    print(f"Source: {doc['metadata']['source']}")
    print(f"Content snippet: {doc['content'][:150]}...\n")
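
The internals of `pipeline.query` are not shown in this notebook. As a rough mental model only, `top_k` retrieval followed by reranking can be sketched with a toy word-overlap scorer. The functions `tokenize`, `overlap_score`, `jaccard_score`, and `retrieve_top_k`, as well as the sample documents, are hypothetical and are not part of `RAGPipeline`.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query_text, text):
    """First-stage score: number of query words that also occur in the text."""
    return len(tokenize(query_text) & tokenize(text))

def jaccard_score(query_text, text):
    """Second-stage score: shared words divided by all distinct words."""
    q, t = tokenize(query_text), tokenize(text)
    return len(q & t) / len(q | t) if (q | t) else 0.0

def retrieve_top_k(query_text, docs, top_k=2, rerank=True, candidates=10):
    """Pick candidates with the cheap score, optionally rerank, return top_k."""
    pool = sorted(
        docs, key=lambda d: overlap_score(query_text, d["content"]), reverse=True
    )[:candidates]
    scorer = jaccard_score if rerank else overlap_score
    ranked = sorted(pool, key=lambda d: scorer(query_text, d["content"]), reverse=True)
    return [{**d, "score": scorer(query_text, d["content"])} for d in ranked[:top_k]]

# Hypothetical toy corpus, for illustration only
toy_docs = [
    {"id": "a", "content": "HRV trends can guide workout intensity."},
    {"id": "b", "content": "Sleep duration affects mood."},
    {"id": "c", "content": "Reduce workout load when HRV drops."},
]
toy_results = retrieve_top_k("Should I reduce my workout based on HRV?", toy_docs, top_k=2)
for r in toy_results:
    print(r["id"], round(r["score"], 3))
```

A real pipeline would typically replace both scorers with embedding similarity and a learned reranker; the two-stage shape is the point of the sketch.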

## 3. Send to Mistral AI

In [4]:
# Initialize Mistral generator
mistral = MistralGenerator(
    model="mistral-small",  # You can also use "mistral-medium" or "mistral-large"
    temperature=0.3,
    max_tokens=500
)

# Create a custom prompt template for this specific query
recovery_coach_prompt = """
You are a recovery coach specializing in athletic performance.

Given the following context information about HRV and training recovery:
{context}

Answer the athlete's question: "{query}"
Base your answer only on the provided context. If you can't determine the answer from the context, explain what additional information would be needed.
"""

# Generate answer with context
result = mistral.generate_answer(query, retrieved_docs, recovery_coach_prompt)

print("💬 Generated Answer (With Context):")
print(result['answer'])
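
How `MistralGenerator.generate_answer` fills the template is not shown here. One plausible way, given as a minimal sketch under that assumption, is to join the retrieved documents into a single context string and substitute it into the `{context}` and `{query}` placeholders. The helpers `format_context` and `build_prompt`, and the example inputs, are hypothetical.

```python
def format_context(docs):
    """Join documents into one context string, labelling each with its source."""
    parts = []
    for i, doc in enumerate(docs, start=1):
        source = doc.get("metadata", {}).get("source", "Unknown")
        parts.append(f"[Document {i} | source: {source}]\n{doc['content'].strip()}")
    return "\n\n".join(parts)

def build_prompt(template, query_text, docs):
    """Fill the {context} and {query} placeholders of a prompt template."""
    return template.format(context=format_context(docs), query=query_text)

# Hypothetical stand-ins for the notebook's template and retrieved documents
example_template = 'Context:\n{context}\n\nQuestion: "{query}"'
example_docs = [
    {"id": "doc1", "content": "First passage.", "metadata": {"source": "doc1.txt"}},
    {"id": "doc2", "content": "Second passage."},
]
example_prompt = build_prompt(example_template, "What should I do?", example_docs)
print(example_prompt)
```

Because `str.format` treats every brace in the template as a placeholder, a template that contains literal braces would need them doubled (`{{` and `}}`).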

## 4. Comparison Without Context

In [5]:
# Generate answer without context, reusing the query defined earlier
no_context_query = query

# We'll call the same API but without providing any context documents
no_context_prompt = "You are a recovery coach. " + no_context_query

# Build a simple payload for a direct API request (os is already imported above)
import requests

headers = {
    "Authorization": f"Bearer {os.environ.get('MISTRAL_API_KEY')}",
    "Content-Type": "application/json"
}

payload = {
    "model": "mistral-small",
    "messages": [
        {"role": "user", "content": no_context_prompt}
    ],
    "temperature": 0.3,
    "max_tokens": 500
}

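response = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers=headers,
    json=payload,
    timeout=60,  # avoid hanging indefinitely on network problems
)
response.raise_for_status()  # surface HTTP errors (e.g. 401 for a missing or invalid key)
response_data = response.json()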

no_context_answer = response_data["choices"][0]["message"]["content"].strip()

print("🚫 Without RAG (No Context):")
print(no_context_answer)
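
Indexing straight into `response_data["choices"][0]` raises a `KeyError` if the API returns an error body instead of a completion. A defensive parser could look like the sketch below. `extract_answer` is a hypothetical helper, and the only response shape it assumes is the `choices[0].message.content` layout already used above; the two payloads are made-up examples.

```python
def extract_answer(response_data):
    """Return the first completion's text, or raise a descriptive error."""
    choices = response_data.get("choices") if isinstance(response_data, dict) else None
    if not choices:
        # Include the raw body so an error payload is visible to the caller
        raise ValueError(f"No completion in API response: {response_data!r}")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError(f"Completion has no message content: {choices[0]!r}")
    return content.strip()

# Hypothetical payloads, shaped like the response parsed above
ok_payload = {"choices": [{"message": {"role": "assistant", "content": "  Take it easy.  "}}]}
bad_payload = {"message": "Unauthorized"}

print(extract_answer(ok_payload))
try:
    extract_answer(bad_payload)
except ValueError as err:
    print(f"Request failed: {err}")
```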

## 5. Compare Multiple Models (Optional)

In [6]:
# If you have both Mistral and Gemini API keys, you can compare them
try:
    from src.gemini_generator import GeminiGenerator
    
    if os.environ.get("GOOGLE_API_KEY"):
        # Initialize Gemini generator
        gemini = GeminiGenerator(
            model="gemini-pro",
            temperature=0.3,
            max_tokens=500
        )
        
        # Generate answer with Gemini
        gemini_result = gemini.generate_answer(query, retrieved_docs, recovery_coach_prompt)
        
        print("\n\n💬 Gemini Generated Answer (With Context):")
        print(gemini_result['answer'])
    else:
        print("\n\nSkipping Gemini comparison (API key not set)")
except Exception as e:
    print(f"\n\nError comparing with Gemini: {e}")

## 🧠 Insights

Compared with the no-context answer above, the answer generated from retrieved context tends to show better:
- factual grounding
- consistency of tone
- alignment with health coaching voice
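
One crude way to put a number on factual grounding is to measure how many of an answer's content words also appear in the retrieved context. The sketch below is a rough lexical proxy, not an established metric; `grounding_score`, the stop-word list, and the sample strings are all hypothetical.

```python
import re

# Small illustrative stop-word list; a real evaluation would use a fuller one
STOP_WORDS = {"a", "an", "the", "and", "or", "to", "of", "in", "is", "it", "your", "you", "if"}

def content_words(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOP_WORDS]

def grounding_score(answer, context):
    """Fraction of the answer's content words that also occur in the context."""
    answer_words = content_words(answer)
    if not answer_words:
        return 0.0
    context_vocab = set(content_words(context))
    return sum(w in context_vocab for w in answer_words) / len(answer_words)

# Made-up strings, for illustration only
sample_context = "A sustained drop in HRV suggests incomplete recovery."
grounded = "A drop in HRV suggests incomplete recovery."
ungrounded = "Drink more water and stretch daily."

print(grounding_score(grounded, sample_context))
print(grounding_score(ungrounded, sample_context))
```

In this notebook the same function could be applied to `result['answer']` and `no_context_answer`, using the concatenated `content` of `retrieved_docs` as the context. High lexical overlap does not guarantee a correct answer, so the score is best read as a quick sanity check.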

Next steps:
- Test with other open-source models via Hugging Face
- Add prompt template evaluation
- Try few-shot or RAG-as-tool approaches
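
As a starting point for the few-shot idea, worked question-and-answer pairs can be prepended to the prompt before the real question. This is a minimal sketch: `build_few_shot_prompt` is a hypothetical helper, and the example pairs are placeholders to be replaced with reviewed coaching answers.

```python
def build_few_shot_prompt(instruction, examples, context, query_text):
    """Assemble instruction, worked examples, context and the final question."""
    lines = [instruction.strip(), ""]
    for example in examples:
        lines.append(f"Question: {example['question']}")
        lines.append(f"Answer: {example['answer']}")
        lines.append("")
    lines.append(f"Context:\n{context.strip()}")
    lines.append("")
    lines.append(f"Question: {query_text}")
    lines.append("Answer:")
    return "\n".join(lines)

# Placeholder examples; replace with real, reviewed question/answer pairs
placeholder_examples = [
    {"question": "<example question 1>", "answer": "<example answer 1>"},
    {"question": "<example question 2>", "answer": "<example answer 2>"},
]
few_shot_prompt = build_few_shot_prompt(
    "You are a recovery coach specializing in athletic performance.",
    placeholder_examples,
    "<retrieved context goes here>",
    "Based on my HRV, should I reduce tomorrow's workout?",
)
print(few_shot_prompt)
```

The resulting string could be sent as the single user message, in the same way as `no_context_prompt` in section 4.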