# Testing Energy Optimization Goals

This notebook demonstrates how to test each goal from our energy optimization project.

## Setup
First, let's import the required libraries and create our Spark session.

In [None]:
from pyspark.sql import SparkSession
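# Import only the names used below; a wildcard import of pyspark.sql.functions
# would shadow Python built-ins such as sum, min, max and round.
from pyspark.sql.functions import col, when
from pyspark.sql.types import (
    StructType, StructField, StringType, IntegerType, DoubleType
)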
import json
from datetime import datetime
import random

# Create Spark session
spark = SparkSession.builder \
    .appName("Energy Optimization Analysis") \
    .config("spark.jars.packages", "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0") \
    .getOrCreate()

## Goal 1: Identify energy optimization opportunities

Let's test energy optimization identification. We start by creating sample data that covers three scenarios: high consumption during peak hours, high temperature, and low occupancy.

In [None]:
# Create sample data
sample_data = [
    # High consumption during peak hours
    ("2025-01-20 14:00:00", 1, 180.0, 24.0, 60.0, 50, 1, 14),
    # High temperature scenario
    ("2025-01-20 14:05:00", 1, 140.0, 28.0, 65.0, 45, 1, 14),
    # Low occupancy scenario
    ("2025-01-20 14:10:00", 1, 120.0, 22.0, 55.0, 10, 1, 14)
]

# Create DataFrame
schema = StructType([
    StructField("timestamp", StringType(), True),
    StructField("building_id", IntegerType(), True),
    StructField("energy_consumption", DoubleType(), True),
    StructField("temperature", DoubleType(), True),
    StructField("humidity", DoubleType(), True),
    StructField("occupancy", IntegerType(), True),
    StructField("day_of_week", IntegerType(), True),
    StructField("hour_of_day", IntegerType(), True)
])

df = spark.createDataFrame(sample_data, schema)
df.show()

## Goal 2: Propose recommendations

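Now let's test recommendation generation. The rules are evaluated in order, so each row receives only the first recommendation whose condition it satisfies.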

In [None]:
def generate_recommendations(df):
    """Add a 'recommendations' column; the first matching rule wins."""
    return df.withColumn("recommendations",
        when((col("hour_of_day").between(9, 17)) & (col("energy_consumption") > 150),
            "High consumption during peak hours: 1) Adjust HVAC settings 2) Schedule high-energy tasks for off-peak hours 3) Implement automated lighting controls") \
        .when((col("temperature") > 25) & (col("energy_consumption") > 130),
            "High energy use with high temperature: 1) Optimize cooling system efficiency 2) Install solar shading 3) Use natural ventilation when possible") \
        .when((col("occupancy") < 30) & (col("energy_consumption") > 100),
            "High energy use with low occupancy: 1) Implement motion sensors 2) Reduce base load 3) Audit always-on equipment") \
        .otherwise("Energy consumption within normal parameters"))

# Generate recommendations
recommendations_df = generate_recommendations(df)
recommendations_df.select("timestamp", "energy_consumption", "temperature", "occupancy", "recommendations").show(truncate=False)
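
Because `when(...).when(...).otherwise(...)` stops at the first condition that holds, it can help to check the rule order without a Spark session. The sketch below is a plain-Python mirror of the chain above, not part of the notebook's pipeline. The function name `first_matching_rule` and the short labels are hypothetical stand-ins for the full recommendation strings, and null handling is not modeled.

```python
# Plain-Python mirror of the when/otherwise chain in generate_recommendations.
# The labels are hypothetical shorthand for the full recommendation strings.
def first_matching_rule(row):
    """Return the label of the first rule that matches, in chain order."""
    if 9 <= row["hour_of_day"] <= 17 and row["energy_consumption"] > 150:
        return "peak_hours"
    if row["temperature"] > 25 and row["energy_consumption"] > 130:
        return "high_temperature"
    if row["occupancy"] < 30 and row["energy_consumption"] > 100:
        return "low_occupancy"
    return "normal"


# The three sample rows from Goal 1, reduced to the fields the rules read.
sample_rows = [
    {"hour_of_day": 14, "energy_consumption": 180.0, "temperature": 24.0, "occupancy": 50},
    {"hour_of_day": 14, "energy_consumption": 140.0, "temperature": 28.0, "occupancy": 45},
    {"hour_of_day": 14, "energy_consumption": 120.0, "temperature": 22.0, "occupancy": 10},
]
labels = [first_matching_rule(r) for r in sample_rows]

# A row that satisfies all three conditions still gets only the first rule.
overlap = {"hour_of_day": 14, "energy_consumption": 180.0, "temperature": 30.0, "occupancy": 5}
overlap_label = first_matching_rule(overlap)

print(labels, overlap_label)
```

If a row should carry every applicable recommendation rather than just the first, the chain would need to be replaced by one column per rule; that is a design change, not something the notebook currently does.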

## Goal 3: Simulate a real-time data source

Let's test the real-time data simulation by generating five random readings stamped with the current time.

In [None]:
def create_energy_data():
    """Generate simulated energy consumption data"""
    current_time = datetime.now()
    
    data = {
        'timestamp': current_time.strftime('%Y-%m-%d %H:%M:%S'),
        'building_id': random.randint(1, 10),
        'energy_consumption': random.uniform(50, 200),
        'temperature': random.uniform(15, 35),
        'humidity': random.uniform(30, 80),
        'occupancy': random.randint(0, 100),
        'day_of_week': current_time.weekday(),  # Monday = 0; the Goal 1 sample rows use 1 for Monday
        'hour_of_day': current_time.hour
    }
    return data

# Generate sample real-time data
sample_realtime_data = [create_energy_data() for _ in range(5)]

# Create DataFrame from real-time data, reusing the explicit schema from Goal 1
# so column order and types do not depend on inference from dicts
realtime_df = spark.createDataFrame(sample_realtime_data, schema)
realtime_df.show()
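
Since `create_energy_data` draws from the global random state and the system clock, its output changes on every run. For repeatable tests, one option is to inject both. The sketch below assumes two hypothetical helpers, `create_energy_data_seeded` and `simulate`, which keep the same fields and ranges as the function above but take a `random.Random` instance and a fixed `datetime`.

```python
import random
from datetime import datetime


def create_energy_data_seeded(rng, now):
    """Same fields and ranges as create_energy_data, with injected RNG and clock."""
    return {
        "timestamp": now.strftime("%Y-%m-%d %H:%M:%S"),
        "building_id": rng.randint(1, 10),
        "energy_consumption": rng.uniform(50, 200),
        "temperature": rng.uniform(15, 35),
        "humidity": rng.uniform(30, 80),
        "occupancy": rng.randint(0, 100),
        "day_of_week": now.weekday(),  # Monday = 0
        "hour_of_day": now.hour,
    }


def simulate(seed, n, now):
    """Generate n readings from a single RNG seeded once."""
    rng = random.Random(seed)
    return [create_energy_data_seeded(rng, now) for _ in range(n)]


fixed_now = datetime(2025, 1, 20, 14, 0, 0)
run_a = simulate(42, 5, fixed_now)
run_b = simulate(42, 5, fixed_now)

# Same seed and clock give identical readings, so assertions stay stable.
print(run_a == run_b)
```

The resulting list of dicts can be passed to `spark.createDataFrame` in the same way as `sample_realtime_data`.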

## Testing the Complete System

Now let's run the simulated real-time data through the recommendation rules.

In [None]:
# Generate recommendations for real-time data
realtime_recommendations = generate_recommendations(realtime_df)

# Show results
realtime_recommendations.select(
    "timestamp", 
    "building_id",
    "energy_consumption",
    "temperature",
    "occupancy",
    "recommendations"
).show(truncate=False)

## Analyzing Results

The above tests demonstrate:
1. Energy optimization opportunity identification
2. Automated recommendation generation
3. Real-time data simulation and processing

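You can modify the test data values to see different recommendations based on:
- Peak hours (`hour_of_day` from 9 to 17 inclusive, i.e. 9:00 AM to 5:59 PM) with consumption above 150
- Temperature above 25 with consumption above 130
- Occupancy below 30 with consumption above 100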
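
To vary one value at a time, it can be convenient to build rows from a baseline. The sketch below is one possible way to do that. `COLUMNS` follows the schema from Goal 1, while `BASELINE` and `make_row` are hypothetical names, with baseline values chosen here so that no rule fires.

```python
# Column order follows the schema defined in Goal 1.
COLUMNS = (
    "timestamp", "building_id", "energy_consumption", "temperature",
    "humidity", "occupancy", "day_of_week", "hour_of_day",
)

# Hypothetical baseline: consumption is low enough that no rule fires.
BASELINE = {
    "timestamp": "2025-01-20 14:00:00", "building_id": 1,
    "energy_consumption": 90.0, "temperature": 22.0, "humidity": 55.0,
    "occupancy": 50, "day_of_week": 1, "hour_of_day": 14,
}


def make_row(**overrides):
    """Return a tuple in schema order, with selected fields overridden."""
    unknown = set(overrides) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    merged = {**BASELINE, **overrides}
    return tuple(merged[name] for name in COLUMNS)


variants = [
    make_row(energy_consumption=160.0),                    # peak-hours rule
    make_row(energy_consumption=140.0, temperature=28.0),  # high-temperature rule
    make_row(energy_consumption=120.0, occupancy=10),      # low-occupancy rule
    # Outside peak hours, so high consumption alone triggers nothing.
    make_row(energy_consumption=160.0, hour_of_day=20,
             timestamp="2025-01-20 20:00:00"),
]

# With a live session: df = spark.createDataFrame(variants, schema)
print(variants)
```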