## Assignment
Energy Efficiency
Description: Multi-Linear and Polynomial Regression on the Energy Efficiency Dataset
In this assignment, you will perform multi-linear and polynomial regression on the Energy
Efficiency dataset to predict the heating load (y1) of buildings. Follow the instructions below:

1. Load the Energy Efficiency dataset using the pandas library.
● Dataset Name: Energy Dataset

2. Apply necessary preprocessing steps on the dataset, such as handling missing
values, scaling features, or encoding categorical variables if required.

3. Separate the features (X) and the target variable (y: heating load) from the dataset.

4. Split the dataset into training and testing sets using an 80:20 ratio.

5. Perform multi-linear regression:

● Fit a multi-linear regression model to the training data using the
LinearRegression class from the sklearn.linear_model module.
● Predict the heating load for the testing data using the trained model.
● Evaluate the performance of the model by calculating metrics such as mean
squared error (MSE) and coefficient of determination (R^2).
● Print the MSE and R^2 values to assess the model's accuracy.

6. Perform polynomial regression:

● Use the PolynomialFeatures class from the sklearn.preprocessing module to
transform the features into polynomial features.
● Fit a polynomial regression model to the training data using the
LinearRegression class.
● Predict the heating load for the testing data using the trained polynomial
regression model.
● Evaluate the performance of the model by calculating MSE and R^2.
● Print the MSE and R^2 values.

7. Compare the performance of the multi-linear regression and polynomial regression
models based on the MSE and R^2 values.


Energy Dataset link: https://docs.google.com/spreadsheets/d/1jXngyixNhyj7C6yj5olExZQeWWwzSUja/edit?usp=sharing&ouid=111885139572109362769&rtpof=true&sd=true

In [25]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error, r2_score

In [26]:
# Load the dataset
data = pd.read_csv("Energy Dataset.xlsx - Energy Dataset.csv")

In [27]:
# Count missing values in each column
data.isnull().sum()

X1              0
X2              0
X3              0
X4              0
X5              0
X6              0
X7              0
X8              0
Heating Load    0
dtype: int64
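
The null check above covers the missing-value part of step 2. For the scaling part, one possible approach is to place a `StandardScaler` in front of the regressor in a pipeline, so that the scaler is fitted on the training data only. The sketch below is illustrative: it uses a small synthetic DataFrame (the values and the three column names are assumptions, not the real dataset). Ordinary least squares with an intercept gives the same predictions with or without standardization, so scaling mainly helps numerical conditioning and regularized models.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (values and columns are illustrative).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "X1": rng.uniform(0.6, 1.0, 100),
    "X2": rng.uniform(500.0, 800.0, 100),
    "X3": rng.integers(2, 6, 100).astype(float),
})
demo["Heating Load"] = 30 * demo["X1"] + 0.02 * demo["X2"] + rng.normal(0, 0.5, 100)

# Step 2a: count missing values per column.
missing_counts = demo.isnull().sum()

# Step 2b: standardize the features inside a pipeline, then fit the regressor.
X_demo = demo.drop("Heating Load", axis=1)
y_demo = demo["Heating Load"]
scaled_model = make_pipeline(StandardScaler(), LinearRegression())
scaled_model.fit(X_demo, y_demo)

# The same regressor without scaling, for comparison.
plain_model = LinearRegression().fit(X_demo, y_demo)
print(np.allclose(scaled_model.predict(X_demo), plain_model.predict(X_demo)))
```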

In [28]:
# Separate the features (X) from the target (y: heating load)
X = data.drop("Heating Load", axis=1)
y = data["Heating Load"]

In [29]:
# Train-test split (80:20 ratio)
# No random_state is set, so the split and the metrics below change from run to run
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
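
The split above does not pass `random_state`, so each run draws a different 80:20 partition and reports slightly different metrics. A minimal sketch of a reproducible split follows. The seed value 42 and the synthetic data are arbitrary assumptions used only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the real dataset).
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(100, 3)), columns=["X1", "X2", "X3"])
y_demo = pd.Series(rng.normal(size=100), name="Heating Load")

# Fixing random_state makes the partition identical across runs.
X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print(X_train_a.shape, X_test_a.shape)          # 80 training rows, 20 test rows
print(X_test_a.index.equals(X_test_b.index))    # same rows selected both times
```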

In [30]:
# Fit the multi-linear regression model
multi_linear_model = LinearRegression()
multi_linear_model.fit(X_train, y_train)

In [31]:
# Predict on test data
y_pred_multi_linear = multi_linear_model.predict(X_test)

In [32]:
# Evaluate performance
mse_multi_linear = mean_squared_error(y_test, y_pred_multi_linear)
r2_multi_linear = r2_score(y_test, y_pred_multi_linear)
print("Multi-linear Regression:")
print("Mean Squared Error:", mse_multi_linear)
print("R^2 Score:", r2_multi_linear)

Multi-linear Regression:
Mean Squared Error: 9.916830750702234
R^2 Score: 0.8980913537361468


In [33]:
# Polynomial regression
# Transform features into polynomial features:
# fit the transformer on the training set only, then apply it to the test set
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)

In [34]:
# Fit the model
poly_model = LinearRegression()
poly_model.fit(X_train_poly, y_train)

In [35]:
# Predict on test data
y_pred_poly = poly_model.predict(X_test_poly)

In [36]:
# Evaluate performance
mse_poly = mean_squared_error(y_test, y_pred_poly)
r2_poly = r2_score(y_test, y_pred_poly)
print("\nPolynomial Regression (degree=2):")
print("\nMean Squared Error:", mse_poly)
print("\nR^2 Score:", r2_poly)


Polynomial Regression (degree=2):

Mean Squared Error: 0.6635593299701004

R^2 Score: 0.9931810439511419


In [37]:
# Comparison
print("Comparison:")
print("\nMulti-linear Regression MSE:", mse_multi_linear)
print("\nPolynomial Regression MSE:", mse_poly)
print("\nMulti-linear Regression R^2 Score:", r2_multi_linear)
print("\nPolynomial Regression R^2 Score:", r2_poly)

Comparison:

Multi-linear Regression MSE: 9.916830750702234

Polynomial Regression MSE: 0.6635593299701004

Multi-linear Regression R^2 Score: 0.8980913537361468

Polynomial Regression R^2 Score: 0.9931810439511419
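
For step 7, the four printed numbers can be easier to read side by side. The sketch below copies the values from the output above into a small table and computes the relative reduction in MSE. The table layout and the `mse_reduction` name are illustrative choices, and the figures reflect one unseeded 80:20 split only.

```python
import pandas as pd

# Values copied from the notebook output above (a single unseeded split).
results = pd.DataFrame(
    {
        "MSE": [9.916830750702234, 0.6635593299701004],
        "R^2": [0.8980913537361468, 0.9931810439511419],
    },
    index=["Multi-linear", "Polynomial (degree=2)"],
)

# Fraction of the multi-linear MSE removed by the degree-2 model.
mse_reduction = 1 - (
    results.loc["Polynomial (degree=2)", "MSE"] / results.loc["Multi-linear", "MSE"]
)

print(results)
print(f"Relative MSE reduction: {mse_reduction:.1%}")
```

On this split the degree-2 model has the lower MSE (by roughly 93%) and the higher R^2, which suggests that interaction and squared terms capture structure the purely linear model misses.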
