# Energy Consumption Dataset - Linear Regression
Description:
This dataset is designed for predicting energy consumption from building features and environmental factors. Each record describes the building type, square footage, number of occupants, appliances used, average temperature, and day of the week. The goal is to build a model that estimates energy consumption from these attributes.

The dataset can be used to train machine learning models such as linear regression to forecast energy needs from a building's characteristics. Such forecasts help in understanding energy demand patterns and in optimizing energy consumption across building types and environmental conditions.
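
To make the idea of linear regression concrete, here is a minimal sketch of a least-squares fit with a single feature, written in plain Python. The helper name `fit_simple_linear_regression` and the numbers are made up for illustration and are not taken from the dataset. The notebook itself fits all features at once with scikit-learn.

```python
def fit_simple_linear_regression(xs, ys):
    """Return the least-squares (slope, intercept) for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values, chosen only so the fit is easy to check by hand
square_footage = [1000.0, 2000.0, 3000.0, 4000.0]
energy = [1500.0, 2500.0, 3500.0, 4500.0]

slope, intercept = fit_simple_linear_regression(square_footage, energy)
predicted = slope * 2500.0 + intercept  # estimate for a 2500 sq ft building
```

With several features the same principle applies, but the coefficients are solved for jointly instead of one at a time.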

In [None]:
# Load the training and test sets
import pandas as pd

train_data = pd.read_csv('train_energy_data.csv')
test_data = pd.read_csv('test_energy_data.csv')

# Show the shape and first rows of each set
train_data.shape, test_data.shape, train_data.head(), test_data.head()

In [None]:
train_data.info()

In [None]:
train_data.describe(include='all')

In [None]:
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

ct = ColumnTransformer(
    transformers=[('encoder', OneHotEncoder(drop='first'), ['Building Type', 'Day of Week'])],
    remainder='passthrough'
)

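# Split into features (all columns but the last) and target (last column).
# The encoder is fitted on the training features only.
X = ct.fit_transform(train_data.iloc[:, :-1])
y = train_data.iloc[:, -1].values

# Apply the already-fitted encoder to the test features (no refitting)
x_test = ct.transform(test_data.iloc[:, :-1])
y_test = test_data.iloc[:, -1].values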

X.shape, y.shape, x_test.shape, y_test.shape
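
The effect of `OneHotEncoder(drop='first')` on a single categorical column can be sketched in plain Python as follows. The helper `one_hot_drop_first` and the category labels are hypothetical and serve only as an illustration. The sketch mirrors the idea (sorted categories, first one dropped), not scikit-learn's actual implementation.

```python
def one_hot_drop_first(values):
    """One-hot encode a list of labels, dropping the first sorted category."""
    categories = sorted(set(values))
    kept = categories[1:]  # the dropped category becomes the all-zeros row
    rows = [[1.0 if v == c else 0.0 for c in kept] for v in values]
    return kept, rows


# Hypothetical labels, not necessarily the ones found in the dataset
kept, encoded = one_hot_drop_first(
    ["Residential", "Commercial", "Industrial", "Commercial"]
)
# kept    -> ['Industrial', 'Residential']
# encoded -> [[0.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
```

Dropping one category per feature avoids perfectly collinear dummy columns, which matters for an unregularised linear model with an intercept.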


In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Split the training data into training and validation sets (80/20)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the validation set
y_pred = model.predict(X_val)

# Evaluate the model on the validation set
mae = mean_absolute_error(y_val, y_pred)
mse = mean_squared_error(y_val, y_pred)
r2 = r2_score(y_val, y_pred)
mae, mse, r2
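
For reference, the three metrics reported above can be computed by hand as in the following sketch. The function name `regression_metrics` and the sample values are illustrative assumptions. In the notebook these numbers come from `sklearn.metrics`.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, mse, r2


# Made-up targets and predictions, small enough to verify by hand
mae_demo, mse_demo, r2_demo = regression_metrics(
    [3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0]
)
```

MAE and MSE are in the units of the target (MSE in squared units), while R² is unitless and compares the model against always predicting the mean.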

In [None]:
# Predict on the held-out test set (kept separate from the validation predictions)
y_test_pred = model.predict(x_test)

In [None]:
# R^2 on the validation set (same value as r2 above)
model.score(X_val, y_val)

In [None]:
# R^2 on the held-out test set
model.score(x_test, y_test)