# Energy Consumption Forecasting

## Overview

This script forecasts energy consumption with a pre-trained machine learning model. It loads a dataset of energy consumption records, generates hourly predictions for the 2 months before the first record and the 2 months after the last record, and merges those predictions with the original data.



- `pandas` handles data loading, column selection, and concatenation.
- `timedelta` from Python's `datetime` module offsets the first and last timestamps in the dataset to define the past and future forecast windows.
- The pre-trained model produces the predictions from calendar features (year, month, day, hour, minute, second) derived from each timestamp.
- The forecast period (60 days in each direction) and the input file names are set directly in the code and can be changed there.

## Dependencies

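- `pandas`: data manipulation and analysis. Reading `.xlsx` files with `pd.read_excel()` also requires the `openpyxl` package.
- `joblib`: loads the trained machine learning model.
- `datetime` (standard library): provides `datetime` and `timedelta` for handling dates and time intervals.
- `plotly`: draws the time series plot at the end of the script.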

In [1]:
import pandas as pd
import joblib
from datetime import datetime, timedelta

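1. **Load Model**: The script loads a pre-trained machine learning model from the file `model.h5` using the `joblib.load()` function. `joblib.load()` expects a file written by `joblib.dump()`; the `.h5` extension does not by itself make this an HDF5 file.

  All predictions in the later steps come from this model.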

In [None]:
# Load the saved model
model = joblib.load('model.h5')

2. **Load Dataset**: It loads the dataset containing historical energy consumption records from the Excel file `new_data.xlsx` using the `pd.read_excel()` function.

  This dataset serves as the basis for generating forecasts.

In [None]:
# Load the historical energy consumption records into a DataFrame called df
df = pd.read_excel("new_data.xlsx")
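
Before the later steps run, it can help to confirm that the loaded frame has the expected columns and that `Timestamp` is a true datetime column. The following is a minimal sketch under stated assumptions: a small in-memory frame stands in for `new_data.xlsx`, the values are illustrative only, and the helper name `prepare_dataset` is hypothetical rather than part of the original script.

```python
import pandas as pd

REQUIRED_COLUMNS = ["Timestamp", "Energy Consumption (kWh)"]

def prepare_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Validate columns, parse timestamps, and sort chronologically."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    out = df.copy()
    # Ensure .min()/.max() and date arithmetic work on real datetimes
    out["Timestamp"] = pd.to_datetime(out["Timestamp"])
    return out.sort_values("Timestamp").reset_index(drop=True)

# Stand-in for pd.read_excel("new_data.xlsx"); values are illustrative only
sample = pd.DataFrame({
    "Timestamp": ["2024-01-01 01:00:00", "2024-01-01 00:00:00"],
    "Energy Consumption (kWh)": [20.5, 19.0],
})
clean = prepare_dataset(sample)
```

If the Excel file already stores timestamps as dates, `pd.to_datetime` leaves them unchanged, so the check is cheap to keep in place.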

3. **Extract Data**: Using pandas' dataframe indexing, the script extracts the necessary data from the dataset.

  Specifically, it isolates the timestamp and energy consumption columns, then records the first and last timestamps, which anchor the forecast windows in the next step.

In [2]:
# Extract the timestamp and energy consumption columns from the original dataset
original_data = df[['Timestamp', 'Energy Consumption (kWh)']]

# Extract the first and last dates from the dataset
first_date = original_data['Timestamp'].min()
last_date = original_data['Timestamp'].max()


4. **Generate Timestamps**: Hourly timestamps are generated with the `pd.date_range()` function for the 60 days (about 2 months) before the first record and the 60 days after the last record.

  This function creates a range of datetime values from a start date, an end date, and a frequency.

In [3]:
# Generate hourly timestamps for the 60 days (about 2 months) before the dataset starts
end_past_date = first_date - timedelta(days=1)  # One day before the first record
start_past_date = end_past_date - timedelta(days=60)  # 60 days before that
past_timestamps = pd.date_range(start=start_past_date, end=end_past_date, freq='H')  # Hourly; pandas 2.2+ prefers freq='h'

# Generate future timestamps starting from the day after the last date in the dataset
start_future_date = last_date + timedelta(days=1)
end_future_date = start_future_date + timedelta(days=60)  # Forecast for the next 2 months
future_timestamps = pd.date_range(start=start_future_date, end=end_future_date, freq='H')  # Generate timestamps for each hour
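
The cell above approximates two months as 60 days and places the forecast windows one full day away from the recorded data. As a hedged alternative, the sketch below uses `pd.DateOffset(months=2)` for calendar months and a one-hour step so that each window adjoins the data without a gap. The function name `forecast_ranges` and the example dates are hypothetical, chosen only for illustration.

```python
import pandas as pd

def forecast_ranges(first_date, last_date, months=2):
    """Return hourly (past, future) DatetimeIndex objects adjoining the data."""
    one_hour = pd.Timedelta(hours=1)
    # Past window ends one hour before the first recorded timestamp
    past = pd.date_range(
        start=first_date - pd.DateOffset(months=months),
        end=first_date - one_hour,
        freq=one_hour,
    )
    # Future window starts one hour after the last recorded timestamp
    future = pd.date_range(
        start=last_date + one_hour,
        end=last_date + pd.DateOffset(months=months),
        freq=one_hour,
    )
    return past, future

# Illustrative dates, not taken from the real dataset
first = pd.Timestamp("2024-01-01 00:00:00")
last = pd.Timestamp("2024-02-20 14:00:00")
past, future = forecast_ranges(first, last)
```

Passing a `pd.Timedelta` as `freq` sidesteps the `'H'` versus `'h'` alias difference between pandas versions. Whether calendar months or a fixed 60 days is the better choice depends on how the forecasts will be reported.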

5. **Feature Extraction**: Features such as year, month, day, hour, minute, and second are extracted from the generated timestamps.
 
  These features serve as input variables for the machine learning model, enabling it to make predictions.

In [8]:
# Extract features from past timestamps
past_features = pd.DataFrame({
    'Timestamp': past_timestamps,
    'Year': past_timestamps.year,
    'Month': past_timestamps.month,
    'Day': past_timestamps.day,
    'Hour': past_timestamps.hour,
    'Minute': past_timestamps.minute,
    'Second': past_timestamps.second
})

# Extract features from future timestamps
future_features = pd.DataFrame({
    'Timestamp': future_timestamps,
    'Year': future_timestamps.year,
    'Month': future_timestamps.month,
    'Day': future_timestamps.day,
    'Hour': future_timestamps.hour,
    'Minute': future_timestamps.minute,
    'Second': future_timestamps.second
})
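
The two feature cells above repeat the same logic. One way to avoid the duplication is a small helper, sketched here under the assumption that the feature names match the lowercase attribute names of a `DatetimeIndex`. The names `timestamp_features` and `FEATURE_COLUMNS` are hypothetical and do not appear in the original script.

```python
import pandas as pd

FEATURE_COLUMNS = ["Year", "Month", "Day", "Hour", "Minute", "Second"]

def timestamp_features(timestamps: pd.DatetimeIndex) -> pd.DataFrame:
    """Build one row of calendar features per timestamp."""
    features = pd.DataFrame({"Timestamp": timestamps})
    for column in FEATURE_COLUMNS:
        # e.g. "Year" is read from timestamps.year
        features[column] = getattr(timestamps, column.lower())
    return features

# Three hourly timestamps that cross midnight
example = timestamp_features(
    pd.date_range("2024-01-01 22:00", periods=3, freq=pd.Timedelta(hours=1))
)
```

With hourly timestamps, `Minute` and `Second` are always zero, so they carry no information for the model; they still have to be supplied if the model was trained with them.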

6. **Make Predictions**: The trained machine learning model is utilized to make energy consumption predictions for both past and future timestamps. 

 This is achieved by passing the extracted features as input to the model's `predict()` function.

In [9]:

# Make predictions for past timestamps
past_predicted_consumption = model.predict(past_features[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']])


# Make predictions for future timestamps
future_predicted_consumption = model.predict(future_features[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']])
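
Models generally expect the same feature columns, in the same order, as at training time. The sketch below wraps the `predict()` call so the column selection lives in one place. It assumes the model was trained on these six columns in this order, which the source does not confirm, and it uses a toy `HourOfDayModel` class as a hypothetical stand-in for the real model loaded from `model.h5`.

```python
import pandas as pd

FEATURE_COLUMNS = ["Year", "Month", "Day", "Hour", "Minute", "Second"]

class HourOfDayModel:
    """Hypothetical stand-in for the loaded model; NOT the real forecaster."""

    def predict(self, X):
        # Toy rule: consumption rises linearly with the hour of day
        return (18.0 + 0.5 * X["Hour"]).to_numpy()

def predict_consumption(model, features: pd.DataFrame) -> pd.Series:
    """Select feature columns in a fixed order and return named predictions."""
    missing = set(FEATURE_COLUMNS) - set(features.columns)
    if missing:
        raise ValueError(f"Missing feature columns: {sorted(missing)}")
    predictions = model.predict(features[FEATURE_COLUMNS])
    return pd.Series(
        predictions, index=features.index, name="Energy Consumption (kWh)"
    )

features = pd.DataFrame({
    "Year": [2024, 2024], "Month": [1, 1], "Day": [1, 1],
    "Hour": [0, 2], "Minute": [0, 0], "Second": [0, 0],
})
predicted = predict_consumption(HourOfDayModel(), features)
```

The same helper could be called once with `past_features` and once with `future_features`, replacing the stand-in with the model returned by `joblib.load()`.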

7. **Create DataFrames**: Separate DataFrames are created to store the predicted energy consumption values for the past and future timestamps.
 
  These DataFrames consist of the timestamp and corresponding predicted energy consumption values.

In [10]:
# Create a DataFrame for past forecasted data
past_forecast_data = pd.DataFrame({
    'Timestamp': past_timestamps,
    'Energy Consumption (kWh)': past_predicted_consumption
})

In [14]:
# Create a DataFrame for future forecasted data
future_forecast_data = pd.DataFrame({
    'Timestamp': future_timestamps,
    'Energy Consumption (kWh)': future_predicted_consumption
})

8. **Concatenate DataFrames**: Using the `pd.concat()` function, the past, original, and future DataFrames are stacked row-wise (the default axis), and `ignore_index=True` gives the result a fresh 0-based index.

 This results in a single DataFrame containing the complete set of energy consumption data, including historical records and forecasts.

In [15]:

# Concatenate the past, original, and future dataframes
final_df = pd.concat([past_forecast_data, original_data, future_forecast_data], ignore_index=True)
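
After concatenation, forecast rows and recorded rows can no longer be told apart, because all three frames share the same two columns. A possible refinement, sketched below, adds a `Source` column before concatenating. The column name, the label strings, and the function `label_and_concat` are hypothetical additions, and the sample values are illustrative only.

```python
import pandas as pd

def label_and_concat(past, original, future):
    """Concatenate the three frames, recording where each row came from."""
    parts = [
        past.assign(Source="past forecast"),
        original.assign(Source="recorded"),
        future.assign(Source="future forecast"),
    ]
    combined = pd.concat(parts, ignore_index=True)
    # Sort so the result is chronological even if the inputs were not
    return combined.sort_values("Timestamp").reset_index(drop=True)

col = "Energy Consumption (kWh)"
past = pd.DataFrame({"Timestamp": pd.to_datetime(["2024-01-01"]), col: [18.2]})
original = pd.DataFrame({"Timestamp": pd.to_datetime(["2024-01-02"]), col: [19.0]})
future = pd.DataFrame({"Timestamp": pd.to_datetime(["2024-01-03"]), col: [19.7]})
labeled = label_and_concat(past, original, future)
```

A `Source` column like this also makes it straightforward to color the three segments differently in a plot.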

9. **Display Results**: Finally, the script displays the merged DataFrame, which encompasses past, present, and future energy consumption data.

In [16]:
# Display the final merged dataframe
print(final_df)

               Timestamp  Energy Consumption (kWh)
0    2023-11-01 00:00:00                 18.155503
1    2023-11-01 01:00:00                 18.597747
2    2023-11-01 02:00:00                 19.039990
3    2023-11-01 03:00:00                 19.482234
4    2023-11-01 04:00:00                 19.924478
...                  ...                       ...
4093 2024-04-21 11:00:00                 18.396724
4094 2024-04-21 12:00:00                 18.838968
4095 2024-04-21 13:00:00                 19.281212
4096 2024-04-21 14:00:00                 19.723456
4097 2024-04-21 15:00:00                 20.165699

[4098 rows x 2 columns]


## Time Series Plot Using Plotly

A line plot of `final_df` shows energy consumption over time, with the past forecasts, the recorded data, and the future forecasts on a single time axis. Viewing all three together makes it easier to spot recurring consumption patterns and to judge whether the forecasts continue the recorded trend plausibly.


In [7]:
# Plot energy consumption over time as a line chart using Plotly
import plotly.express as px
fig = px.line(final_df, x='Timestamp', y='Energy Consumption (kWh)', title='Energy Consumption Over Time')
fig.show()