# EcoHome Energy Advisor - Agent Run & Evaluation

In this notebook, you'll run the Energy Advisor agent with various real-world scenarios and see how it helps customers optimize their energy usage.

## Learning Objectives
- Create the agent's instructions
- Run the Energy Advisor with different types of questions
- Evaluate response quality and accuracy
- Measure tool usage effectiveness
- Identify areas for improvement
- Implement evaluation metrics

## Evaluation Criteria
- **Accuracy**: Correct information and calculations
- **Relevance**: Responses address the user's question
- **Completeness**: Comprehensive answers with actionable advice
- **Tool Usage**: Appropriate use of available tools
- **Reasoning**: Clear explanation of recommendations


## 1. Import and Initialize

In [1]:
from datetime import datetime
from agent import Agent

In [2]:
## TODO: Create the agent's instructions

ECOHOME_SYSTEM_PROMPT = """
You are EcoHome Energy Advisor, an intelligent agent capable of optimizing energy usage across multiple smart home devices and systems.
Your goal is to answer questions regarding optimizing energy consumption and/or come up with personalized recommendations.

Guidelines:
- Rely on tool results rather than your own assumptions or guesses
- Be data-driven: cite the data returned by tools when explaining your reasoning
- Make clear recommendations by naming specific temperature ranges, hours, durations, and so on rather than giving vague answers
- If you need clarification, ask for it
- Where possible, estimate savings in energy (kWh) or money (USD/EUR/...)
- Be honest: if you cannot solve a problem, say so instead of making something up
"""

In [3]:
ecohome_agent = Agent(
    instructions=ECOHOME_SYSTEM_PROMPT,
)

In [4]:
response = ecohome_agent.invoke(
    question="When should I charge my electric car tomorrow to minimize cost and maximize solar power?",
    context=f"Location: San Francisco, CA; Time: {datetime.now()}"
)

In [5]:
print(response["messages"][-1].content)

It seems that I'm currently unable to retrieve the weather forecast for San Francisco, CA. However, I can still provide you with the electricity pricing data for tomorrow.

### Electricity Pricing for December 30, 2025:
- **Off-Peak Hours (0:00 - 5:00 AM)**: $0.15 per kWh
- **Peak Hours (6:00 AM - 7:00 PM)**: $0.18 per kWh
- **Off-Peak Hours (8:00 PM - 11:59 PM)**: $0.15 per kWh

### Recommendations for Charging Your Electric Car:
1. **Best Time to Charge**: 
   - **From 12:00 AM to 5:00 AM**: Charge during these hours to take advantage of the lowest rate of $0.15 per kWh.
   - **After 8:00 PM**: You can also charge during this time to benefit from the same off-peak rate.

2. **Avoid Charging**: 
   - **From 6:00 AM to 7:00 PM**: Avoid charging during these peak hours as the rate is higher at $0.18 per kWh.

### Estimated Savings:
If you charge your electric car during off-peak hours instead of peak hours, you could save approximately $0.03 per kWh. For example, if you charge 10 kWh:
-

In [6]:
print("TOOLS:")
for msg in response["messages"]:
    obj = msg.model_dump()
    # Only tool result messages carry a tool_call_id; their name is the tool that ran
    if obj.get("tool_call_id"):
        print("-", msg.name)

TOOLS:
- get_weather_forecast
- get_electricity_prices
- get_weather_forecast
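
The listing above shows `get_weather_forecast` twice, which may reflect a second attempt after the forecast could not be retrieved. A minimal sketch for counting calls per tool is given below. It assumes each message has already been converted to a dict (for example with `msg.model_dump()`); the helper name `count_tool_calls` and the `sample` data are hypothetical stand-ins, not part of the agent module.

```python
from collections import Counter


def count_tool_calls(dumped_messages):
    """Count tool result messages per tool name.

    A message is treated as a tool result if it carries a tool_call_id,
    mirroring the check used in the cell above.
    """
    return Counter(
        m.get("name") for m in dumped_messages if m.get("tool_call_id")
    )


# Hypothetical stand-in for [msg.model_dump() for msg in response["messages"]]
sample = [
    {"type": "human", "content": "When should I charge my car?"},
    {"type": "tool", "name": "get_weather_forecast", "tool_call_id": "call_1"},
    {"type": "tool", "name": "get_electricity_prices", "tool_call_id": "call_2"},
    {"type": "tool", "name": "get_weather_forecast", "tool_call_id": "call_3"},
]

tool_counts = count_tool_calls(sample)
for name, count in tool_counts.items():
    print(f"- {name}: {count}")
```

Repeated calls to the same tool are not necessarily a problem, but a count makes them easy to spot when reviewing a run.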


## 2. Define Test Cases

In [7]:
# Define comprehensive test cases for the Energy Advisor
# Create 10 test cases covering different scenarios:
# - EV charging optimization
# - Thermostat settings
# - Appliance scheduling
# - Solar power maximization
# - Cost savings calculations

In [8]:
test_cases = [
    {
        "id": "ev_charging_1",
        "question": "When should I charge my electric car tomorrow to minimize cost and maximize solar power?",
        "expected_tools": ["get_weather_forecast", "get_electricity_prices"],
        "expected_response": "The response should contain time recommendation, cost analysis and solar consideration",
    },
    {
        "id": "energy_tips_8",
        "question": "Can you suggest three practical ways to lower my household electricity consumption?",
        "expected_tools": ["search_energy_tips"],
        "expected_response": "Should provide three concrete, tailored recommendations to improve efficiency."
    },
    {
        "id": "laundry_4",
        "question": "What’s the cheapest time to run my washing machine over the weekend?",
        "expected_tools": ["get_electricity_prices"],
        "expected_response": "Should recognize weekend rate patterns and point out the lowest-cost hours."
    },
    {
        "id": "thermostat_2",
        "question": "To cut costs, what thermostat setting makes sense for Wednesday afternoon?",
        "expected_tools": ["get_electricity_prices", "get_weather_forecast"],
        "expected_response": "Should recommend a numeric temperature range and justify it using price and weather data."
    },
    {
        "id": "usage_history_6",
        "question": "Which device consumed the most power in the previous month?",
        "expected_tools": ["query_energy_usage"],
        "expected_response": "Should name the top-consuming appliance and include kWh usage and cost details."
    },
    {
        "id": "dishwasher_3",
        "question": "What kind of savings could I see if I run my dishwasher overnight instead of around 6 PM?",
        "expected_tools": ["get_electricity_prices", "calculate_energy_savings"],
        "expected_response": "Should approximate savings per run and per month based on TOU price differences."
    },
    {
        "id": "recent_summary_9",
        "question": "Can you give me a quick overview of my electricity use during the last 48 hours?",
        "expected_tools": ["get_recent_energy_summary"],
        "expected_response": "Should include total kWh, total cost, a device-level breakdown, and brief insights."
    },
    {
        "id": "solar_forecast_5",
        "question": "How much solar power is likely to be generated tomorrow in San Francisco?",
        "expected_tools": ["get_weather_forecast"],
        "expected_response": "Should mention expected sunshine, irradiance, or typical generation trends."
    },
    {
        "id": "optimization_multi_device_7",
        "question": "Can you plan the optimal run times for my EV, dishwasher, and dryer tomorrow to minimize costs?",
        "expected_tools": ["get_electricity_prices", "get_weather_forecast"],
        "expected_response": "Should suggest a coordinated schedule that avoids peak pricing."
    },
    {
        "id": "pool_pump_10",
        "question": "When should I operate my pool pump over the coming week for best efficiency and cost?",
        "expected_tools": ["get_weather_forecast", "get_electricity_prices"],
        "expected_response": "Should outline a day-by-day schedule that balances sunlight availability with off-peak rates."
    }
]

if len(test_cases) < 10:
    raise ValueError("You MUST have at least 10 test cases")

## 3. Run Agent Tests

In [9]:
CONTEXT = f"Location: San Francisco, CA; Time: {datetime.now()}"

In [10]:
# Run the agent tests
# For each test case, call the agent and collect the response
# Store results for evaluation

print("=== Running Agent Tests ===")
test_results = []

for i, test_case in enumerate(test_cases, start=1):
    print(f"\nTest {i}: {test_case['id']}")
    print(f"Question: {test_case['question']}")
    print("-" * 50)
    
    try:
        # Call the agent
        response = ecohome_agent.invoke(
            question=test_case['question'],
            context=CONTEXT
        )
        
        # Store the result
        result = {
            'test_id': test_case['id'],
            'question': test_case['question'],
            'response': response,
            'expected_tools': test_case['expected_tools'],
            'expected_response': test_case['expected_response'],
            'timestamp': datetime.now().isoformat()
        }
        test_results.append(result)
                
    except Exception as e:
        print(f"Error: {e}")
        result = {
            'test_id': test_case['id'],
            'question': test_case['question'],
            'response': f"Error: {str(e)}",
            'expected_tools': test_case['expected_tools'],
            'expected_response': test_case['expected_response'],
            'timestamp': datetime.now().isoformat(),
            'error': str(e)
        }
        test_results.append(result)

print(f"\nCompleted {len(test_results)} tests")


=== Running Agent Tests ===

Test 1: ev_charging_1
Question: When should I charge my electric car tomorrow to minimize cost and maximize solar power?
--------------------------------------------------

Test 2: energy_tips_8
Question: Can you suggest three practical ways to lower my household electricity consumption?
--------------------------------------------------

Test 3: laundry_4
Question: What’s the cheapest time to run my washing machine over the weekend?
--------------------------------------------------

Test 4: thermostat_2
Question: To cut costs, what thermostat setting makes sense for Wednesday afternoon?
--------------------------------------------------

Test 5: usage_history_6
Question: Which device consumed the most power in the previous month?
--------------------------------------------------

Test 6: dishwasher_3
Question: What kind of savings could I see if I run my dishwasher overnight instead of around 6 PM?
--------------------------------------------------

Test

In [11]:
test_results

[{'test_id': 'ev_charging_1',
  'question': 'When should I charge my electric car tomorrow to minimize cost and maximize solar power?',
  'response': {'messages': [SystemMessage(content='Location: San Francisco, CA; Time: 2025-12-29 05:55:42.284364', additional_kwargs={}, response_metadata={}, id='6ab2cfa5-7b24-4e58-98e1-2fd8956fad25'),
    HumanMessage(content='When should I charge my electric car tomorrow to minimize cost and maximize solar power?', additional_kwargs={}, response_metadata={}, id='8a92ebd3-faad-480d-94d2-7ad54d91497c'),
    AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_bC6Rd6Xk45L2QlQ9G9456EcU', 'function': {'arguments': '{"location": "San Francisco, CA", "days": 1}', 'name': 'get_weather_forecast'}, 'type': 'function'}, {'id': 'call_90AyYUrrVYt5rUm2uP4Qu1Iw', 'function': {'arguments': '{"date": "2025-12-30"}', 'name': 'get_electricity_prices'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 61, '

## 4. Evaluate Responses

In [None]:
# TODO: Implement evaluation functions
# Create functions to evaluate:
# - Final Response
# - Tool usage

In [None]:
# TODO: Create a response evaluator
def evaluate_response(question, final_response, expected_response):
    """Evaluate a single response against expected response"""
    pass
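
One possible starting point for the response evaluator is a set of cheap, rule-based checks derived from the system prompt guidelines (name concrete hours, estimate savings in kWh or money). The sketch below is only a heuristic and not a substitute for a human or LLM-based judgment against `expected_response`; the function name `heuristic_response_checks` and the regular expressions are assumptions chosen for illustration.

```python
import re


def heuristic_response_checks(final_response):
    """Run simple rule-based checks on the agent's final answer.

    Returns the individual check results and the fraction that passed.
    These checks only detect whether the answer is specific; they do not
    verify that the numbers are correct.
    """
    text = final_response or ""
    checks = {
        # Answer is not empty
        "is_non_empty": len(text.strip()) > 0,
        # Mentions a clock time such as "5:00 AM" or "8 PM"
        "mentions_time": bool(
            re.search(r"\b\d{1,2}(:\d{2})?\s?(AM|PM)\b", text, re.IGNORECASE)
        ),
        # Mentions a money amount or an energy quantity
        "mentions_cost_or_energy": bool(
            re.search(r"(\$\s?\d|\d\s?kWh|\bUSD\b|\bEUR\b)", text, re.IGNORECASE)
        ),
    }
    score = sum(checks.values()) / len(checks)
    return {"checks": checks, "score": score}


# Hypothetical example answers
specific = heuristic_response_checks(
    "Charge from 12:00 AM to 5:00 AM at $0.15 per kWh."
)
vague = heuristic_response_checks("Charge at night when it is cheaper.")
print(specific["score"], vague["score"])
```

A check like this can flag vague answers early, while accuracy and relevance still need a comparison against the tool data and the expected response.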

In [None]:
# TODO: Create a tool usage evaluator
def evaluate_tool_usage(messages, expected_tools):
    """Evaluate if the right tools were used"""
    pass
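
A minimal sketch of how tool usage could be scored is to compare the set of tools the agent actually called with `expected_tools`, reporting recall (expected tools that were used) and precision (used tools that were expected). The helper name `score_tool_usage` is hypothetical, and it assumes the tool names have already been extracted from the messages into a plain list.

```python
def score_tool_usage(used_tools, expected_tools):
    """Compare used tool names with expected tool names.

    Duplicate calls are collapsed, so calling a tool twice is not penalized.
    """
    used, expected = set(used_tools), set(expected_tools)
    matched = used & expected

    # Recall: share of expected tools that were actually called
    recall = len(matched) / len(expected) if expected else 1.0
    # Precision: share of called tools that were expected
    if used:
        precision = len(matched) / len(used)
    else:
        precision = 1.0 if not expected else 0.0

    return {
        "recall": recall,
        "precision": precision,
        "missing": sorted(expected - used),
        "unexpected": sorted(used - expected),
    }


# Tool names as printed for the EV charging run earlier in this notebook
ev_score = score_tool_usage(
    ["get_weather_forecast", "get_electricity_prices", "get_weather_forecast"],
    ["get_weather_forecast", "get_electricity_prices"],
)

# Hypothetical run in which the savings calculator was never called
partial_score = score_tool_usage(
    ["get_electricity_prices"],
    ["get_electricity_prices", "calculate_energy_savings"],
)
print(ev_score)
print(partial_score)
```

Using sets keeps the score independent of call order; if the order or number of calls matters for a scenario, a list-based comparison would be needed instead.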

In [None]:
# TODO: Generate a comprehensive evaluation report
# Calculate overall scores and metrics
# Identify strengths and weaknesses
# Provide recommendations for improvement
def generate_evaluation_report():
    pass
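
As a rough sketch, the report could aggregate one score record per test case into overall averages and a list of tests that fall below a threshold. The record fields `response_score` and `tool_recall`, the helper name `summarize_scores`, the default threshold of 0.7, and the sample values are all assumptions for illustration, not results from this notebook.

```python
from statistics import mean


def summarize_scores(per_test_scores, threshold=0.7):
    """Aggregate per-test scores into a small summary dict.

    Each record is expected to contain 'test_id', 'response_score'
    and 'tool_recall', with scores in the range 0.0 to 1.0.
    """
    if not per_test_scores:
        return {
            "num_tests": 0,
            "mean_response_score": None,
            "mean_tool_recall": None,
            "needs_attention": [],
        }

    return {
        "num_tests": len(per_test_scores),
        "mean_response_score": mean(s["response_score"] for s in per_test_scores),
        "mean_tool_recall": mean(s["tool_recall"] for s in per_test_scores),
        # A test needs attention if either of its scores is below the threshold
        "needs_attention": [
            s["test_id"]
            for s in per_test_scores
            if min(s["response_score"], s["tool_recall"]) < threshold
        ],
    }


# Hypothetical scores, for illustration only
sample_scores = [
    {"test_id": "ev_charging_1", "response_score": 1.0, "tool_recall": 1.0},
    {"test_id": "dishwasher_3", "response_score": 0.5, "tool_recall": 0.5},
]
report = summarize_scores(sample_scores)
print(report)
```

The `needs_attention` list gives a direct starting point for the "areas for improvement" part of the report.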