# EcoHome Energy Advisor - Agent Run & Evaluation

In this notebook, you'll run the Energy Advisor agent with various real-world scenarios and see how it helps customers optimize their energy usage.

## Learning Objectives
- Create the agent's instructions
- Run the Energy Advisor with different types of questions
- Evaluate response quality and accuracy
- Measure tool usage effectiveness
- Identify areas for improvement
- Implement evaluation metrics

## Evaluation Criteria
- **Accuracy**: Correct information and calculations
- **Relevance**: Responses address the user's question
- **Completeness**: Comprehensive answers with actionable advice
- **Tool Usage**: Appropriate use of available tools
- **Reasoning**: Clear explanation of recommendations


## 1. Import and Initialize

In [1]:
from datetime import datetime
from agent import Agent

ImportError: cannot import name 'create_react_agent' from 'langgraph.prebuilt' (/home/marius/Documents/Udacity Submissions/Agentic_AI_with_LangChain_and_LangGraph/Energy_Advisor/.venv/lib/python3.12/site-packages/langgraph/prebuilt/__init__.py)

In [None]:
## Agent instructions

ECOHOME_SYSTEM_PROMPT = """
You are EcoHome, a proactive residential energy advisor for homeowners and renters.
Role: deliver actionable, data-backed recommendations that reduce costs and improve energy efficiency.

Steps to follow:
1) Clarify location, timeframe, and devices if missing; state any assumptions.
2) Pull relevant data using tools: weather for solar/thermal context, electricity prices for time-of-use windows, usage and solar history for trends, recent summary when timeframe is unclear, and energy tips for best practices.
3) Analyze patterns (peaks, off-peak windows, forecasted conditions) and decide the best actions.
4) Quantify impact (kWh and USD) with calculate_energy_savings when numbers are available; otherwise give conservative ranges.
5) Present 2-4 prioritized recommendations with reasoning and next steps; note gaps and ask one concise follow-up question if needed.

Key capabilities:
- get_weather_forecast: assess upcoming conditions and solar potential.
- get_electricity_prices: identify off-peak vs peak hours for load shifting.
- query_energy_usage / query_solar_generation: inspect historical consumption and production.
- get_recent_energy_summary: get a quick view when the user provides little context.
- search_energy_tips: retrieve best practices via RAG.
- calculate_energy_savings: quantify savings for proposed actions.

Recommendations guidance:
- Tie every suggestion to retrieved data (price periods, forecast, usage patterns) and make them specific and time-bound.
- Prefer scheduling and load shifting to cheaper hours; suggest thermostat, EV, appliance, and solar-usage tweaks.
- Include expected savings and assumptions; provide quick wins plus one longer-term improvement when relevant.
- If data is missing, state the assumption and request the needed detail succinctly.

Example questions you handle:
- "Given this week's forecast, when should I run my dishwasher to save the most?"
- "How can I cut my EV charging costs in San Diego tomorrow?"
- "Review my past 7 days of usage and suggest ways to reduce peak load."
- "Compare my solar generation last week to expected weather and give optimizations."

Respond concisely, show key tool findings briefly, then deliver the final plan.
"""


In [None]:
ecohome_agent = Agent(
    instructions=ECOHOME_SYSTEM_PROMPT,
)

In [None]:
response = ecohome_agent.invoke(
    question="When should I charge my electric car tomorrow to minimize cost and maximize solar power?",
    context="Location: San Francisco, CA"
)

In [None]:
print(response["messages"][-1].content)

In [None]:
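print("TOOLS:")
for msg in response["messages"]:
    # Tool results normally arrive as ToolMessage objects (type "tool");
    # "function" is kept for the legacy FunctionMessage type.
    if getattr(msg, "type", None) in ("tool", "function"):
        print("-", msg.name)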

## 2. Define Test Cases

In [None]:
# TODO: Define comprehensive test cases for the Energy Advisor
# Create at least 10 test cases covering different scenarios:
# - EV charging optimization
# - Thermostat settings
# - Appliance scheduling
# - Solar power maximization
# - Cost savings calculations

In [None]:
test_cases = [
    {
        "id": "ev_charging_1",
        "question": "When should I charge my electric car tomorrow to minimize cost and maximize solar power?",
        "expected_tools": ["get_weather_forecast", "get_electricity_prices"],
        "expected_response": "The response should contain time recommendation, cost analysis and solar consideration",
    },
]

if len(test_cases) < 10:
    raise ValueError("You MUST have at least 10 test cases")
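
The list above holds a single case, so the length check will raise until more are added. As a rough sketch of how further cases could follow the same schema, the block below defines a few hypothetical entries and a small validator. The names `extra_test_cases`, `KNOWN_TOOLS`, and `validate_test_cases` are illustrative assumptions, the questions and expected tools are examples rather than a reference answer key, and the tool names are taken from the system prompt.

```python
# Hypothetical extra cases; questions and expected tools are illustrative only.
extra_test_cases = [
    {
        "id": "thermostat_1",
        "question": "What thermostat setting should I use tomorrow afternoon to save energy?",
        "expected_tools": ["get_weather_forecast", "search_energy_tips"],
        "expected_response": "The response should contain a temperature recommendation and reasoning based on the forecast",
    },
    {
        "id": "appliance_1",
        "question": "When should I run my dishwasher this week to pay the least?",
        "expected_tools": ["get_electricity_prices"],
        "expected_response": "The response should contain a time window and the price difference between peak and off-peak hours",
    },
    {
        "id": "solar_1",
        "question": "Compare my solar generation last week with the weather and suggest optimizations.",
        "expected_tools": ["query_solar_generation", "get_weather_forecast"],
        "expected_response": "The response should contain a generation summary, weather context and optimization steps",
    },
    {
        "id": "savings_1",
        "question": "How much would I save by shifting my laundry to off-peak hours?",
        "expected_tools": ["get_electricity_prices", "calculate_energy_savings"],
        "expected_response": "The response should contain estimated savings in kWh and USD with stated assumptions",
    },
]

# Tool names listed in the system prompt.
KNOWN_TOOLS = {
    "get_weather_forecast", "get_electricity_prices", "query_energy_usage",
    "query_solar_generation", "get_recent_energy_summary",
    "search_energy_tips", "calculate_energy_savings",
}
REQUIRED_KEYS = {"id", "question", "expected_tools", "expected_response"}


def validate_test_cases(cases):
    """Return a list of human-readable problems; an empty list means the cases look consistent."""
    problems = []
    seen_ids = set()
    for case in cases:
        case_id = case.get("id", "<missing id>")
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            problems.append(f"{case_id}: missing keys {sorted(missing)}")
        if case_id in seen_ids:
            problems.append(f"{case_id}: duplicate id")
        seen_ids.add(case_id)
        unknown = set(case.get("expected_tools", [])) - KNOWN_TOOLS
        if unknown:
            problems.append(f"{case_id}: unknown tools {sorted(unknown)}")
    return problems


print(validate_test_cases(extra_test_cases))
```

Running the validator on `test_cases + extra_test_cases` before the test loop would catch misspelled tool names early, which otherwise show up later as false tool-usage failures.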

## 3. Run Agent Tests

In [None]:
CONTEXT = "Location: San Francisco, CA"

In [None]:
# Run the agent tests
# For each test case, call the agent and collect the response
# Store results for evaluation

print("=== Running Agent Tests ===")
test_results = []

for i, test_case in enumerate(test_cases):
    print(f"\nTest {i+1}: {test_case['id']}")
    print(f"Question: {test_case['question']}")
    print("-" * 50)
    
    try:
        # Call the agent
        response = ecohome_agent.invoke(
            question=test_case['question'],
            context=CONTEXT
        )
        
        # Store the result
        result = {
            'test_id': test_case['id'],
            'question': test_case['question'],
            'response': response,
            'expected_tools': test_case['expected_tools'],
            'expected_response': test_case['expected_response'],
            'timestamp': datetime.now().isoformat()
        }
        test_results.append(result)
                
    except Exception as e:
        print(f"Error: {e}")
        result = {
            'test_id': test_case['id'],
            'question': test_case['question'],
            'response': f"Error: {str(e)}",
            'expected_tools': test_case['expected_tools'],
            'expected_response': test_case['expected_response'],
            'timestamp': datetime.now().isoformat(),
            'error': str(e)
        }
        test_results.append(result)

print(f"\nCompleted {len(test_results)} tests")
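
The loop above stores an `'error'` key only when the agent call fails, and in that case `'response'` is a string rather than the agent's result. A minimal sketch of separating the two groups before evaluation follows; `split_results` and the sample records are hypothetical.

```python
def split_results(results):
    """Separate successful runs from failed ones using the 'error' key set by the test loop."""
    succeeded = [r for r in results if "error" not in r]
    failed = [r for r in results if "error" in r]
    return succeeded, failed


# Hypothetical records shaped like the entries appended to test_results.
sample_results = [
    {"test_id": "ev_charging_1", "response": {"messages": []}},
    {"test_id": "thermostat_1", "response": "Error: timeout", "error": "timeout"},
]
succeeded, failed = split_results(sample_results)
print(f"{len(succeeded)} succeeded, {len(failed)} failed")
for r in failed:
    print("-", r["test_id"], "->", r["error"])
```

Evaluating only the `succeeded` list avoids indexing `response["messages"]` on an error string.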


In [None]:
test_results

## 4. Evaluate Responses

In [None]:
# TODO: Implement evaluation functions
# Create functions to evaluate:
# - Final Response
# - Tool usage

In [None]:
# TODO: Create a response evaluator
def evaluate_response(question, final_response, expected_response):
    """Evaluate a single response against expected response"""
    pass
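
One possible starting point for the response evaluator is a crude lexical check: how many content words from `expected_response` and from the question appear in the final answer. This is a sketch under that assumption, not a definitive metric. The `STOPWORDS` set and the `_content_words` helper are hypothetical, and an LLM-as-judge approach would likely capture accuracy and reasoning far better than word overlap.

```python
import re

# Small, hand-picked stopword list (an assumption; extend as needed).
STOPWORDS = {
    "the", "and", "should", "contain", "response", "with", "that",
    "this", "for", "when", "what", "how", "can", "your", "you",
}


def _content_words(text):
    """Lowercased alphabetic tokens, minus stopwords and tokens shorter than 3 letters."""
    return {
        w for w in re.findall(r"[a-z]+", text.lower())
        if len(w) > 2 and w not in STOPWORDS
    }


def evaluate_response(question, final_response, expected_response):
    """Score a response by lexical overlap with the expected description and the question."""
    response_words = _content_words(final_response)
    expected = _content_words(expected_response)
    asked = _content_words(question)
    # Fraction of expected terms that the response mentions.
    completeness = len(expected & response_words) / len(expected) if expected else 0.0
    # Fraction of question terms that the response mentions.
    relevance = len(asked & response_words) / len(asked) if asked else 0.0
    return {
        "completeness": round(completeness, 2),
        "relevance": round(relevance, 2),
        "missing_terms": sorted(expected - response_words),
    }


scores = evaluate_response(
    question="When should I charge my car?",
    final_response="Charge between 11am and 2pm: solar output peaks and cost is lowest.",
    expected_response="The response should contain time recommendation, cost analysis and solar consideration",
)
print(scores)
```

The `missing_terms` list is mainly useful for debugging: it shows which expected topics the answer never mentioned by name, though a response can cover a topic without using the exact word.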

In [None]:
# TODO: Create a tool usage evaluator
def evaluate_tool_usage(messages, expected_tools):
    """Evaluate if the right tools were used"""
    pass
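
For tool usage, a common framing is set overlap between the tools the agent called and the tools the test case expects, reported as precision and recall. The sketch below assumes each message is either a dict or an object exposing `type` and `name`, and that tool results carry the type `"tool"` (or the legacy `"function"`). Those assumptions should be checked against what `Agent.invoke` actually returns.

```python
def evaluate_tool_usage(messages, expected_tools):
    """Compare the tools that produced messages against the expected tool names."""
    used = set()
    for msg in messages:
        # Accept plain dicts as well as message objects with attributes.
        if isinstance(msg, dict):
            msg_type, name = msg.get("type"), msg.get("name")
        else:
            msg_type, name = getattr(msg, "type", None), getattr(msg, "name", None)
        if msg_type in ("tool", "function") and name:
            used.add(name)

    expected = set(expected_tools)
    hits = used & expected
    # Recall: share of expected tools that were called.
    recall = len(hits) / len(expected) if expected else 1.0
    # Precision: share of called tools that were expected.
    if used:
        precision = len(hits) / len(used)
    else:
        precision = 1.0 if not expected else 0.0
    return {
        "precision": round(precision, 2),
        "recall": round(recall, 2),
        "missing_tools": sorted(expected - used),
        "unexpected_tools": sorted(used - expected),
    }


demo_messages = [
    {"type": "human", "name": None},
    {"type": "tool", "name": "get_weather_forecast"},
    {"type": "tool", "name": "search_energy_tips"},
    {"type": "ai", "name": None},
]
usage = evaluate_tool_usage(demo_messages, ["get_weather_forecast", "get_electricity_prices"])
print(usage)
```

Unexpected tools are not necessarily errors; the agent may reasonably call `search_energy_tips` for extra context, so low precision is better treated as a prompt to inspect the run than as a failure.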

In [None]:
# TODO: Generate a comprehensive evaluation report
# Calculate overall scores and metrics
# Identify strengths and weaknesses
# Provide recommendations for improvement
def generate_evaluation_report():
    pass
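
A report can be as simple as averaging each metric across test cases and listing the lowest-scoring cases. The sketch below departs from the stub's signature by taking an `evaluations` argument, which is an assumption: a list of dicts with a `test_id` and a numeric `scores` mapping, for example built by merging the outputs of the two evaluators. The sample data is made up for illustration.

```python
from statistics import mean


def generate_evaluation_report(evaluations, worst_n=3):
    """Aggregate per-test scores into metric averages and a list of the weakest tests."""
    if not evaluations:
        return {"num_tests": 0, "averages": {}, "weakest_tests": []}

    # Collect every metric name that appears in at least one evaluation.
    metrics = sorted({m for ev in evaluations for m in ev["scores"]})
    averages = {
        m: round(mean([ev["scores"][m] for ev in evaluations if m in ev["scores"]]), 2)
        for m in metrics
    }
    # Overall score per test is the unweighted mean of its metrics.
    overall = {ev["test_id"]: mean(list(ev["scores"].values())) for ev in evaluations}
    weakest = sorted(overall, key=overall.get)[:worst_n]
    return {
        "num_tests": len(evaluations),
        "averages": averages,
        "weakest_tests": weakest,
    }


# Made-up scores, only to show the expected input shape.
sample_evaluations = [
    {"test_id": "a", "scores": {"completeness": 1.0, "tool_recall": 0.5}},
    {"test_id": "b", "scores": {"completeness": 0.5, "tool_recall": 0.5}},
]
report = generate_evaluation_report(sample_evaluations)
print(report)
```

The unweighted mean treats every metric as equally important; weighting accuracy above tool usage, for instance, would be a reasonable variation once the criteria are prioritized.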