# 📊 Building Energy Efficiency Prediction Project

## 🔎 Project Overview
This project predicts two measures of a building's energy demand, its 
**Heating Load** and **Cooling Load**, from its architectural features.  
It aims to show how **AI/ML can support sustainable design** by estimating 
energy use before a building is constructed.


## ❓ Problem Statement
The goal is to build a Machine Learning model that predicts the **Heating Load** 
and **Cooling Load** of buildings given parameters such as wall area, roof area, 
glazing area, and orientation.  
This helps architects design **energy-efficient buildings**.


## 📂 Dataset Source
We are using the **Energy Efficiency Dataset** available on Kaggle:  
[🔗 Kaggle Energy Efficiency Dataset](https://www.kaggle.com/datasets/elikplim/eergy-efficiency-dataset)


## 📑 Dataset Description

| Feature                   | Description                                                       |
|---------------------------|-------------------------------------------------------------------|
| X1 - Relative Compactness | Measure of shape efficiency of the building                       |
| X2 - Surface Area         | Total external surface area of the building                       |
| X3 - Wall Area            | Total wall area of the building                                   |
| X4 - Roof Area            | Total roof area of the building                                   |
| X5 - Overall Height       | Height of the building                                            |
| X6 - Orientation          | Direction the building faces (coded 2-5)                          |
| X7 - Glazing Area         | Glazed area as a fraction of floor area (0, 0.10, 0.25, 0.40)     |
| X8 - Glazing Area Dist.   | How glazing is distributed across facades (coded 0-5)             |
| Y1 - Heating Load         | Energy required for heating (target)                              |
| Y2 - Cooling Load         | Energy required for cooling (target)                              |
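
The column codes in the table are compact but hard to read in plots and reports. A minimal sketch of mapping them to descriptive names follows, assuming the CSV columns are named `X1` through `X8`, `Y1`, and `Y2`. The snake_case names and the helper `with_descriptive_names` are illustrative choices, not part of the dataset.

```python
import pandas as pd

# Descriptive names based on the feature table; the exact spellings
# are an illustrative choice, not something the dataset defines.
COLUMN_NAMES = {
    "X1": "relative_compactness",
    "X2": "surface_area",
    "X3": "wall_area",
    "X4": "roof_area",
    "X5": "overall_height",
    "X6": "orientation",
    "X7": "glazing_area",
    "X8": "glazing_area_distribution",
    "Y1": "heating_load",
    "Y2": "cooling_load",
}


def with_descriptive_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a renamed copy of df; raise if an expected column is missing."""
    missing = set(COLUMN_NAMES) - set(df.columns)
    if missing:
        raise KeyError(f"Missing expected columns: {sorted(missing)}")
    return df.rename(columns=COLUMN_NAMES)


# Made-up single-row frame, used only to demonstrate the call.
demo = pd.DataFrame([[0.0] * 10], columns=list(COLUMN_NAMES))
renamed = with_descriptive_names(demo)
print(renamed.columns.tolist())
```

`rename` returns a copy, so the original frame keeps its `X1`...`Y2` codes and any code that selects columns by those codes continues to work.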


In [None]:
# 📥 Data Collection & Loading
import pandas as pd

df = pd.read_csv("ENB2012_data.csv")  # replace with your dataset path
df.head()

In [None]:
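# 🔍 Data Understanding
# A notebook cell only displays its last expression, so print each result explicitly.
df.info()                   # column dtypes and non-null counts (prints directly)
print(df.describe())        # summary statistics per column
print(df.isnull().sum())    # number of missing values per column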
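
The checks above can be gathered into one table, which also makes it easy to spot coded categorical columns (few distinct values) and duplicate rows. This is a rough sketch: `quality_report` is a hypothetical helper, and the small `demo` frame is made-up data standing in for `df`.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """One row per column: dtype, missing-value count, distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isnull().sum(),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Made-up values for illustration only; in the notebook, pass df instead.
demo = pd.DataFrame({
    "X6": [2, 3, 4, 5, 2],
    "X7": [0.0, 0.1, 0.1, None, 0.0],
})

report = quality_report(demo)
n_duplicates = int(demo.duplicated().sum())

print(report)
print("Duplicate rows:", n_duplicates)
```

Columns with only a handful of distinct values, such as an orientation code, are candidates for categorical handling rather than being treated as continuous measurements.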

In [None]:
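# 📊 Correlation Heatmap
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
# Pairwise Pearson correlations; fmt=".2f" keeps the annotations readable,
# and vmin/vmax pin the color scale to the full [-1, 1] range.
sns.heatmap(df.corr(), annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation Heatmap of Features and Targets")
plt.show()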
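
A heatmap shows every pairwise correlation, but for modeling the feature-to-target columns matter most. One possible way to pull those out and rank them is sketched below. `target_correlations` is a hypothetical helper, and the synthetic `demo` frame (one informative feature, one noise feature) only stands in for `df`.

```python
import numpy as np
import pandas as pd


def target_correlations(df: pd.DataFrame, targets=("Y1", "Y2")) -> pd.DataFrame:
    """Pearson correlation of each feature with each target,
    sorted by absolute correlation with the first target."""
    targets = list(targets)
    corr = df.corr()[targets].drop(index=targets)  # keep feature rows only
    order = corr[targets[0]].abs().sort_values(ascending=False).index
    return corr.loc[order]


# Synthetic data for illustration: X_strong drives both targets, X_noise does not.
rng = np.random.default_rng(0)
x_strong = rng.normal(size=200)
x_noise = rng.normal(size=200)
demo = pd.DataFrame({
    "X_noise": x_noise,
    "X_strong": x_strong,
    "Y1": 3.0 * x_strong + rng.normal(scale=0.1, size=200),
    "Y2": 2.0 * x_strong + rng.normal(scale=0.1, size=200),
})

ranked = target_correlations(demo)
print(ranked)
```

Pearson correlation only captures linear relationships, so a low value here does not by itself show that a feature is uninformative.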

In [None]:
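# ⚙️ Data Preprocessing
from sklearn.preprocessing import StandardScaler

X = df.drop(columns=["Y1", "Y2"])  # features X1..X8
y = df[["Y1", "Y2"]]               # targets: heating and cooling load

# Standardize each feature to zero mean and unit variance.
# Note: fitting the scaler on all rows before the train/test split lets
# test-set statistics influence training; for an unbiased evaluation,
# fit it on the training rows only.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # returns a NumPy array, not a DataFrame

print("Before Scaling:\n", X.head())
print("\nAfter Scaling:\n", X_scaled[:5])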
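
To make the scaling step concrete, the sketch below checks that `StandardScaler` computes `(x - mean) / std` per column, using the population standard deviation (`ddof=0`, NumPy's default), and that `inverse_transform` recovers the original values. The array `X_demo` is made-up data for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up 3x2 feature matrix, for illustration only.
X_demo = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 60.0]])

scaler = StandardScaler().fit(X_demo)
scaled = scaler.transform(X_demo)

# Manual standardization: subtract the column mean, divide by the
# population standard deviation (np.std uses ddof=0 by default).
manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)
print("Matches manual formula:", np.allclose(scaled, manual))

# The fitted scaler stores the statistics, so the transform is reversible.
restored = scaler.inverse_transform(scaled)
print("Round trip recovers input:", np.allclose(restored, X_demo))
```

Only the features are scaled here; the targets stay in their original units, so predictions can be read directly as heating and cooling loads.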

In [None]:
# ✂️ Data Splitting
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

print("Training set size:", X_train.shape)
print("Test set size:", X_test.shape)
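
The cells above scale the full dataset and then split it, which lets test rows contribute to the scaler's mean and variance. One common alternative is to split first and put the scaler inside a scikit-learn `Pipeline`, so it is fitted on the training rows only. The sketch below uses synthetic data in place of the real features and a plain `LinearRegression` as a placeholder; the project does not name a model, so treat both as assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 8 features and 2 targets (illustration only).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 8))
true_weights = rng.normal(size=(8, 2))
y_demo = X_demo @ true_weights + rng.normal(scale=0.1, size=(200, 2))

# Split the raw (unscaled) features first.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The pipeline fits the scaler on X_train only, then reuses those
# statistics when transforming X_test inside predict/score.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

preds = model.predict(X_test)      # shape (n_test, 2): one column per target
r2 = model.score(X_test, y_test)   # R^2 averaged uniformly over both targets

print("Predictions shape:", preds.shape)
print("Mean R^2 on test set:", round(r2, 3))
```

`LinearRegression` handles both targets at once, so no wrapper is needed for this baseline. Swapping in another estimator only changes the last step of the pipeline.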