# EcoHome Energy Advisor - RAG Setup

In this notebook, you'll set up the Retrieval-Augmented Generation (RAG) pipeline for the EcoHome Energy Advisor, so that the agent can retrieve and cite relevant energy-saving tips and best practices.

## Learning Objectives
- Set up ChromaDB vector store
- Load and process energy-saving documents
- Create embeddings for document chunks
- Implement semantic search functionality
- Test the RAG pipeline

## Documents Available
- `tip_device_best_practices.txt` - Device-specific optimization tips
- `tip_energy_savings.txt` - General energy-saving strategies
- `tip_hvac_optimization.txt` - HVAC optimization strategies
- `tip_smart_home_automation.txt` - Smart home automation tips
- `tip_renewable_energy_integration.txt` - Renewable energy integration
- `tip_seasonal_energy_management.txt` - Seasonal energy management
- `tip_energy_storage_optimization.txt` - Battery storage optimization

## 1. Import Required Libraries


In [1]:
# Import the necessary libraries for RAG setup
import os
import glob
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from dotenv import load_dotenv

In [2]:
load_dotenv()

True
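
The `True` output indicates that `load_dotenv()` loaded values from a `.env` file, but it does not by itself confirm that `OPENAI_API_KEY` is among them. A minimal sketch of a fail-fast check is shown below; `require_env` is a hypothetical helper, not part of the project, and the `env` argument exists only so the check can be tried without a real key.

```python
import os


def require_env(name: str, env=os.environ) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file before creating embeddings."
        )
    return value


# Example with an explicit mapping, so no real key is needed to try the check.
print(require_env("OPENAI_API_KEY", env={"OPENAI_API_KEY": "sk-example"}))
```

In the notebook itself, calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would surface a missing key before any embedding request is made.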

## 2. Load and Process Documents


In [3]:
# Load all energy-saving tip documents from the documents directory
# Use glob to automatically discover all .txt files

documents = []
document_paths = sorted(glob.glob("data/documents/*.txt"))

for doc_path in document_paths:
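    # Read as UTF-8 explicitly (assumes the tip files are UTF-8; they contain characters such as "°")
    loader = TextLoader(doc_path, encoding="utf-8")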
    docs = loader.load()
    documents.extend(docs)
    print(f"Loaded {len(docs)} document(s) from {doc_path}")

print(f"\nTotal documents loaded: {len(documents)}")

Loaded 1 document(s) from data/documents/tip_device_best_practices.txt
Loaded 1 document(s) from data/documents/tip_energy_savings.txt
Loaded 1 document(s) from data/documents/tip_energy_storage_optimization.txt
Loaded 1 document(s) from data/documents/tip_hvac_optimization.txt
Loaded 1 document(s) from data/documents/tip_renewable_energy_integration.txt
Loaded 1 document(s) from data/documents/tip_seasonal_energy_management.txt
Loaded 1 document(s) from data/documents/tip_smart_home_automation.txt

Total documents loaded: 7
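
Because `glob` silently returns whatever it finds, a missing file would only show up as a lower count. The sketch below compares a directory against the seven file names listed under "Documents Available"; `missing_documents` is a hypothetical helper, and the demonstration uses a temporary directory rather than `data/documents`.

```python
import glob
import os
import tempfile

# File names taken from the "Documents Available" list in this notebook.
EXPECTED_DOCS = {
    "tip_device_best_practices.txt",
    "tip_energy_savings.txt",
    "tip_hvac_optimization.txt",
    "tip_smart_home_automation.txt",
    "tip_renewable_energy_integration.txt",
    "tip_seasonal_energy_management.txt",
    "tip_energy_storage_optimization.txt",
}


def missing_documents(directory: str, expected=EXPECTED_DOCS) -> list:
    """Return the expected file names that are absent from `directory`, sorted."""
    pattern = os.path.join(directory, "*.txt")
    found = {os.path.basename(path) for path in glob.glob(pattern)}
    return sorted(set(expected) - found)


# Demonstration: a temporary directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "tip_energy_savings.txt"), "w").close()
    print(missing_documents(tmp))
```

Calling `missing_documents("data/documents")` in the notebook should return an empty list when all seven files are present.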


## 3. Split Documents into Chunks


In [4]:
# Split documents into smaller chunks for better retrieval
# Use RecursiveCharacterTextSplitter with appropriate chunk_size and chunk_overlap
# Experiment with different chunk sizes (e.g., 500, 1000, 1500 characters)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    separators=["\n\n", "\n", " ", ""]
)

# Split the documents
splits = text_splitter.split_documents(documents)
print(f"Split {len(documents)} documents into {len(splits)} chunks")

# Show sample chunk
if splits:
    print("\nSample chunk (first 200 characters):")
    print(splits[0].page_content[:200] + "...")


Split 7 documents into 31 chunks

Sample chunk (first 200 characters):
Large devices like electric vehicles, washing machines and dishwashers often support delayed start or timer functions. Schedule these devices to run outside of peak electricity pricing hours or during...
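
To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-window chunker in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter tries the separators in order to keep paragraphs and lines together and treats `chunk_size` as an upper bound. The function name `sliding_window_chunks` and the sample text are illustrative only.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Small example where the overlap is easy to see by eye.
print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))

# Compare how many chunks different sizes produce for the same placeholder text.
sample = "x" * 3000
for size in (500, 1000, 1500):
    print(size, len(sliding_window_chunks(sample, chunk_size=size, chunk_overlap=200)))
```

Smaller chunks give more, narrower retrieval targets; larger chunks keep more context together but can dilute the match. The overlap repeats the end of one chunk at the start of the next so that a sentence cut at a boundary still appears whole in at least one chunk.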


## 4. Create Vector Store


In [5]:
# Create a ChromaDB vector store with OpenAI embeddings
# Persist the vector store to disk for future use

persist_directory = "data/vectorstore"
os.makedirs(persist_directory, exist_ok=True)

# Initialize embeddings using OpenAI API key
embeddings = OpenAIEmbeddings(
    api_key=os.getenv("OPENAI_API_KEY"),
)

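# Create the vector store
# Note: no explicit IDs are passed, so re-running this cell against an existing
# persist_directory may store the same chunks again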
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    persist_directory=persist_directory
)

print(f"Vector store created and persisted to {persist_directory}")
print(f"Total vectors stored: {len(splits)}")

Vector store created and persisted to data/vectorstore
Total vectors stored: 31
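
The cell above does not pass explicit IDs, so running it a second time against the same `data/vectorstore` directory may store each chunk twice. One possible remedy is to derive a stable ID from each chunk's source and text. The sketch below shows only the ID derivation; `chunk_id` is a hypothetical helper, the chunk texts are invented placeholders, and passing the result through an `ids=` argument to `Chroma.from_documents` is an assumption to verify against the installed `langchain_chroma` version.

```python
import hashlib


def chunk_id(source: str, content: str) -> str:
    """Derive a stable ID from a chunk's source path and its text."""
    digest = hashlib.sha256(f"{source}\n{content}".encode("utf-8")).hexdigest()
    return digest[:32]  # shortened for readability; still long enough to avoid clashes


# Toy stand-ins for (metadata["source"], page_content) pairs taken from `splits`.
# The texts are invented placeholders, not quotes from the tip documents.
chunks = [
    ("data/documents/tip_energy_savings.txt", "Placeholder chunk text A"),
    ("data/documents/tip_hvac_optimization.txt", "Placeholder chunk text B"),
]
ids = [chunk_id(source, content) for source, content in chunks]
print(ids)
```

Because the same input always yields the same ID, a store that upserts by ID would overwrite existing entries on a re-run instead of appending duplicates.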


## 5. Test the RAG Pipeline


In [6]:
# Test the search functionality
# Try different queries related to energy optimization
# Test queries like:
# - "electric vehicle charging tips"
# - "thermostat optimization"
# - "dishwasher energy saving"
# - "solar power maximization"

test_queries = [
    "electric vehicle charging tips",
    "thermostat optimization",
    "dishwasher energy saving",
    "solar power maximization",
    "HVAC system efficiency",
    "pool pump scheduling"
]

print("=== Testing Vector Search ===")
for query in test_queries:
    print(f"\nQuery: '{query}'")
    docs = vectorstore.similarity_search(query, k=2)
    for i, doc in enumerate(docs):
        print(f"  Result {i+1}: {doc.page_content[:100]}...")


=== Testing Vector Search ===

Query: 'electric vehicle charging tips'
  Result 1: Large devices like electric vehicles, washing machines and dishwashers often support delayed start o...
  Result 2: Integration Between Devices:
- Connect your solar inverter to your smart home system to automaticall...

Query: 'thermostat optimization'
  Result 1: Smart Thermostat Programming:
- Program setback temperatures: 68°F when home and awake, 62°F when sl...
  Result 2: HVAC Optimization Strategies for Maximum Energy Efficiency

Heating, ventilation, and air conditioni...

Query: 'dishwasher energy saving'
  Result 1: Dishwasher Best Practices:
- Only run when completely full
- Use the energy-saving or eco mode when ...
  Result 2: Smart Home Automation for Energy Optimization

Smart home automation transforms energy management fr...

Query: 'solar power maximization'
  Result 1: Self-Consumption Maximization:
- Shift high-consumption activities to solar generation hours. Run po...
  Result 2: Solar Panel Optimization:
- Panel orientation matters significantly. South-facing panels (in the Nor...

Query: 'HVAC system efficiency'
  Result 1: HVAC Optimization Strategies for Maximum Energy Efficiency

Heating, ventilation, and air conditioni...
  Result 2: Heat Pump Optimization:
- Heat pumps are 2-3 times more efficient than traditional resistance heatin...

Query: 'pool pump scheduling'
  Result 1: Dishwasher Best Practices:
- Only run when completely full
- Use the energy-saving or eco mode when ...
  Result 2: Self-Consumption Maximization:
- Shift high-consumption activities to solar generation hours. Run po...
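
The 'pool pump scheduling' query illustrates a property of top-k search: it returns `k` results even when the best match is only loosely related (here, the top hit is about dishwashers). The sketch below reproduces that behaviour with plain cosine similarity and shows how a minimum-score cutoff filters weak matches. The three-dimensional vectors are made-up stand-ins for real embeddings, and `top_k` is a hypothetical helper. LangChain vector stores also expose `similarity_search_with_score`; for Chroma the score is typically a distance where lower means closer, so check the direction before applying a cutoff.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=2, min_score=0.0):
    """Return up to k (label, score) pairs with score >= min_score, best first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(label, score) for label, score in scored[:k] if score >= min_score]


# Toy 3-dimensional vectors standing in for real embeddings (made-up values).
indexed = [
    ("dishwasher", [1.0, 0.0, 0.0]),
    ("solar", [0.0, 1.0, 0.0]),
    ("thermostat", [0.0, 0.0, 1.0]),
]
query = [0.9, 0.1, 0.0]

print(top_k(query, indexed, k=2))                 # no cutoff: always fills k slots
print(top_k(query, indexed, k=2, min_score=0.5))  # cutoff: weak matches are dropped
```

A cutoff like this lets the agent say that no relevant tip was found instead of citing an unrelated one; the right threshold depends on the embedding model and should be tuned on real queries.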


## 6. Test the Search Tool


In [7]:
# Test the search_energy_tips tool from tools.py
# Import and test the tool with various queries
# Verify that it returns relevant results

from tools import search_energy_tips

# Test the search_energy_tips function
print("=== Testing search_energy_tips Tool ===")

test_queries = [
    "electric vehicle charging",
    "thermostat settings",
    "dishwasher optimization",
    "solar power tips"
]

for query in test_queries:
    print(f"\nQuery: '{query}'")
    result = search_energy_tips.invoke({"query": query, "max_results": 3})
    
    if "error" in result:
        print(f"  Error: {result['error']}")
    else:
        print(f"  Found {result['total_results']} results")
        for i, tip in enumerate(result['tips']):
            print(f"    {i+1}. {tip['content'][:100]}...")
            print(f"       Source: {tip['source']}")
            print(f"       Relevance: {tip['relevance_score']}")


=== Testing search_energy_tips Tool ===

Query: 'electric vehicle charging'
  Found 3 results
    1. Integration Between Devices:
- Connect your solar inverter to your smart home system to automaticall...
       Source: data/documents/tip_smart_home_automation.txt
       Relevance: high
    2. Large devices like electric vehicles, washing machines and dishwashers often support delayed start o...
       Source: data/documents/tip_device_best_practices.txt
       Relevance: high
    3. Self-Consumption Maximization:
- Shift high-consumption activities to solar generation hours. Run po...
       Source: data/documents/tip_renewable_energy_integration.txt
       Relevance: medium

Query: 'thermostat settings'
  Found 3 results
    1. Smart Thermostat Programming:
- Program setback temperatures: 68°F when home and awake, 62°F when sl...
       Source: data/documents/tip_hvac_optimization.txt
       Relevance: high
    2. - Lower water heater temperature from 140°F to 120°F. This saves 6-10% on water heating costs and re...
       Source: data/documents/tip_seasonal_energy_management.txt
       Relevance: high
    3. Heat Pump Optimization:
- Heat pumps are 2-3 times more efficient than traditional resistance heatin...
       Source: data/documents/tip_hvac_optimization.txt
       Relevance: medium

Query: 'dishwasher optimization'
  Found 3 results
    1. Dishwasher Best Practices:
- Only run when completely full
- Use the energy-saving or eco mode when ...
       Source: data/documents/tip_device_best_practices.txt
       Relevance: high
    2. Large devices like electric vehicles, washing machines and dishwashers often support delayed start o...
       Source: data/documents/tip_device_best_practices.txt
       Relevance: high
    3. Smart Home Automation for Energy Optimization

Smart home automation transforms energy management fr...
       Source: data/documents/tip_smart_home_automation.txt
       Relevance: medium

Query: 'solar power tips'
  Found 3 results
    1. Solar Panel Optimization:
- Panel orientation matters significantly. South-facing panels (in the Nor...
       Source: data/documents/tip_renewable_energy_integration.txt
       Relevance: high
    2. Self-Consumption Maximization:
- Shift high-consumption activities to solar generation hours. Run po...
       Source: data/documents/tip_renewable_energy_integration.txt
       Relevance: high
    3. Seasonal Considerations:
- Summer generates 40-60% more solar energy than winter. Plan battery usage...
       Source: data/documents/tip_renewable_energy_integration.txt
       Relevance: medium
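
Since the goal is for the agent to cite its sources, the tool output can be turned into numbered, source-tagged context before it is placed in a prompt. The sketch below assumes the result shape visible in the output above (`tips`, `content`, `source`, `relevance_score`, and an optional `error` key); `format_tips_for_prompt` is a hypothetical helper, not part of `tools.py`.

```python
import os


def format_tips_for_prompt(result: dict, max_chars: int = 300) -> str:
    """Render a search_energy_tips-style result as numbered, source-tagged lines."""
    if "error" in result:
        return f"Search failed: {result['error']}"

    lines = []
    for i, tip in enumerate(result.get("tips", []), start=1):
        source = os.path.basename(tip["source"])               # file name only
        snippet = " ".join(tip["content"].split())[:max_chars]  # collapse whitespace
        lines.append(f"[{i}] ({source}, relevance: {tip['relevance_score']}) {snippet}")
    return "\n".join(lines) if lines else "No tips found."


# Hand-written example that mirrors the result shape printed above.
example = {
    "total_results": 1,
    "tips": [
        {
            "content": "Dishwasher Best Practices:\n- Only run when completely full",
            "source": "data/documents/tip_device_best_practices.txt",
            "relevance_score": "high",
        }
    ],
}
print(format_tips_for_prompt(example))
```

The bracketed numbers give the model a compact handle such as `[1]` to reference in its answer, which can then be mapped back to the source file.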
