<a href="https://colab.research.google.com/github/Umesh1307/Appliances-Energy-Prediction/blob/main/Appliances_Energy_Prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Problem Statement:

---



### The data set is sampled at 10-minute intervals over about 4.5 months. The house temperature and humidity conditions were monitored with a ZigBee wireless sensor network. Each wireless node transmitted its temperature and humidity readings about every 3.3 minutes, and the wireless data was then averaged over 10-minute periods. The energy data was logged every 10 minutes with m-bus energy meters. Weather data from the nearest airport weather station (Chievres Airport, Belgium) was downloaded from a public data set from Reliable Prognosis (rp5.ru) and merged with the experimental data sets using the date and time column. Two random variables have been included in the data set for testing the regression models and for filtering out non-predictive attributes (parameters).

1. date: year-month-day hour:minute:second
2. T1: Temperature in kitchen area, in Celsius
3. RH_1: Humidity in kitchen area, in %
4. T2: Temperature in living room area, in Celsius
5. RH_2: Humidity in living room area, in %
6. T3: Temperature in laundry room area, in Celsius
7. RH_3: Humidity in laundry room area, in %
8. T4: Temperature in office room, in Celsius
9. RH_4: Humidity in office room, in %
10. T5: Temperature in bathroom, in Celsius
11. RH_5: Humidity in bathroom, in %
12. T6: Temperature outside the building (north side), in Celsius
13. RH_6: Humidity outside the building (north side), in %
14. T7: Temperature in ironing room, in Celsius
15. RH_7: Humidity in ironing room, in %
16. T8: Temperature in teenager room 2, in Celsius
17. RH_8: Humidity in teenager room 2, in %
18. T9: Temperature in parents’ room, in Celsius
19. RH_9: Humidity in parents’ room, in %
20. T_out: Temperature outside (from Chievres weather station), in Celsius
21. Pressure: (from Chievres weather station), in mm Hg
22. RH_out: Humidity outside (from Chievres weather station), in %
23. Wind speed: (from Chievres weather station), in m/s
24. Visibility: (from Chievres weather station), in km
25. T_dewpoint: (from Chievres weather station), in °C
26. rv1: Random variable 1, non-dimensional
27. rv2: Random variable 2, non-dimensional
28. Lights: energy use of light fixtures in the house in Wh
29. Appliances: energy use in Wh (Target Variable)
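
The random variables `rv1` and `rv2` listed above can serve as a noise baseline: a feature whose association with the target is no stronger than that of a purely random column is a candidate for removal. The sketch below illustrates the idea on synthetic data. The scoring criterion (absolute Pearson correlation), the column names `signal` and `noise_feature`, and all values are assumptions made for illustration, not the notebook's actual method.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for the real data (all values are invented):
# one informative feature, one irrelevant feature, one random variable.
demo = pd.DataFrame({
    "signal": rng.normal(size=n),
    "noise_feature": rng.normal(size=n),
    "rv1": rng.uniform(0, 50, size=n),
})
demo["Appliances"] = 60 + 25 * demo["signal"] + rng.normal(scale=5, size=n)

# Score every candidate by its absolute Pearson correlation with the target.
scores = demo.drop(columns="Appliances").corrwith(demo["Appliances"]).abs()

# Features that do not beat the random variable are flagged for review.
baseline = scores["rv1"]
flagged = scores[scores <= baseline].index.drop("rv1").tolist()

print(scores.round(3))
print("Flagged for review:", flagged)
```

Because `noise_feature` and `rv1` are both pure noise, which of the two scores higher depends on the sample, so a flag here is a prompt for closer inspection rather than a verdict.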

### Where indicated, hourly data (then interpolated) from the nearest airport weather station (Chievres Airport, Belgium) was downloaded from a public data set from Reliable Prognosis (rp5.ru). Permission was obtained from Reliable Prognosis for the distribution of the 4.5 months of weather data.
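
The preparation described above (averaging sensor readings over 10-minute periods, interpolating hourly weather data, and merging on the timestamp) could be sketched with pandas roughly as follows. The readings, timestamps, and frame names here are hypothetical; the original authors' actual pipeline is not part of this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical raw sensor readings arriving about every 3.3 minutes (200 s).
sensor_idx = pd.date_range("2016-01-11 17:00:00", periods=18, freq="200s")
sensor = pd.DataFrame({"T1": np.linspace(19.0, 20.7, 18)}, index=sensor_idx)

# Average the readings over 10-minute periods.
sensor_10min = sensor.resample("10min").mean()

# Hypothetical hourly weather readings, upsampled to 10 minutes
# and filled in by linear interpolation.
weather_idx = pd.date_range("2016-01-11 17:00:00", periods=2, freq="60min")
weather = pd.DataFrame({"T_out": [6.0, 6.6]}, index=weather_idx)
weather_10min = weather.resample("10min").interpolate(method="linear")

# Merge the two sources on the shared timestamp.
merged = sensor_10min.join(weather_10min, how="inner")
merged.index.name = "date"
print(merged)
```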

## 😇 Before diving into the code, let's understand the problem statement together.

---


Scientists define energy as the ability to do work. Modern civilization is possible because people have learned how to change energy from one form to another and then use it to do work. People use energy to walk and bicycle, to move cars along roads and boats through water, to cook food on stoves, to make ice in freezers, to light our homes and offices, to manufacture products, and to send astronauts into space.

There are many different forms of energy, including:

* Heat
* Light
* Motion
* Electrical
* Chemical
* Gravitational

😇 Curious to learn more about energy? Refer to [What is energy? (U.S. EIA)](https://www.eia.gov/energyexplained/what-is-energy/)


# Objective of Project:
---
### Rising energy consumption is a cause of concern for the entire world: as consumption increases year after year, so do carbon and greenhouse gas emissions. Most of the electricity generated is consumed by the industrial sector, but a considerable amount is also consumed by the residential sector. It is therefore important to study energy-consuming behaviour in the residential sector and to predict the energy consumption of home appliances, which account for the largest share of energy use in a residence. This project focuses on predicting the energy consumption of home appliances based on humidity and temperature.

---

# What can we do?

---





### Energy prediction of appliances requires identifying and predicting individual appliance energy consumption when combined in a closed chain environment. This experiment aims to provide insight into reducing energy consumption by identifying trends and appliances involved.

# Tentative Roadmap to Follow:

---

* ### Loading the dataset.

* ### Cleaning and transforming features (null value treatment, data type consistency check).
* ### Descriptive statistical analysis.
* ### Skewness and outlier (anomaly) detection analysis.
* ### Feature engineering (standardizing, normalizing, multicollinearity check, linearity check between independent and dependent variables).
* ### Exploratory data analysis (understanding the patterns and behaviour of the data. EDA involves generating summary statistics for numerical data in the dataset and creating various graphical representations to understand the data better).
* ### Understanding feature importance (PCA can be handy for feature selection; lasso regression is another option).
* ### Model Selection.
* ### Model Training.
* ### Model Evaluation.
* ### Conclusion.
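
As a preview of one roadmap step, the skewness and outlier analysis can be sketched as follows. This is a minimal illustration on synthetic, right-skewed data; the lognormal distribution, the 1.5 × IQR rule, and the log transform are assumptions chosen for the example, not necessarily the choices made later in this notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical right-skewed consumption values standing in for the target.
demo = pd.Series(rng.lognormal(mean=4.0, sigma=0.6, size=5000), name="Appliances")

# Skewness: positive values indicate a long right tail.
print("skewness:", round(demo.skew(), 2))

# IQR rule: flag points lying more than 1.5 * IQR beyond the quartiles.
q1, q3 = demo.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = demo[(demo < lower) | (demo > upper)]
print(f"{len(outliers)} of {len(demo)} points flagged as outliers")

# A log transform is one common way to reduce right skew.
log_skew = np.log(demo).skew()
print("skewness after log transform:", round(log_skew, 2))
```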






# ***STEP 1: LOADING THE DATASET***

---



In [None]:
# Let's get started with the very first step: loading the weapons (libraries)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA, LatentDirichletAllocation
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, ExtraTreesRegressor
from sklearn.neural_network import MLPRegressor
import xgboost as xgb
from sklearn import neighbors
from sklearn.svm import SVR
import time
from math import sqrt
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
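# Note: keras.wrappers.scikit_learn was removed in recent Keras/TensorFlow releases;
# on newer versions the separate scikeras package provides scikeras.wrappers.KerasRegressor.
from keras.wrappers.scikit_learn import KerasRegressor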
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from tensorflow.keras import Sequential, layers, Input
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Mounting the drive:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Setting the directory path for the data set:
dir_path="/content/drive/MyDrive/Almabetter Project/Capstone - Projects/Module 4 Supervised ML Regression/Appliances Energy Prediction"

In [None]:
# Loading the dataset:
energy_df=pd.read_csv(dir_path+"/data_application_energy.csv")
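
Since the `date` column holds timestamps, it can be parsed while loading rather than left as plain strings. The sketch below shows this with `parse_dates` on a tiny inline CSV; the rows are illustrative placeholders, not values taken from the real file.

```python
import io
import pandas as pd

# Tiny inline CSV standing in for the real file (values are illustrative only).
csv_text = """date,Appliances,lights,T1
2016-01-11 17:00:00,60,30,19.89
2016-01-11 17:10:00,60,30,19.89
2016-01-11 17:20:00,50,30,19.89
"""

# parse_dates converts the column to datetime64 while reading.
demo_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(demo_df.dtypes)

# With a datetime column, the sampling interval can be checked directly.
intervals = demo_df["date"].diff().dropna()
print(intervals.unique())
```

The same argument can be passed to the `pd.read_csv` call above if a datetime column is wanted from the start.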

In [None]:
# Checking the head of the dataset, traditional way yet useful
energy_df.head()

In [None]:
# Let's use the Colab data table feature to explore the data interactively! This feature was new to me :)
from google.colab.data_table import DataTable
DataTable(energy_df)

# A few interesting features of the data table display:😇

---



* ### Clicking the Filter button in the upper right allows you to search for terms or values in any particular column.
* ### Clicking on any column title lets you sort the results according to that column's value.
* ### The table displays only a subset of the data at a time. You can navigate through pages of data using the controls on the lower right.

In [None]:
# Checking the tail of the dataset:)
energy_df.tail(3)


In [None]:
# Let's have a look at the data type of the features.
energy_df.info()

* ### **Number of entries** : 19735
* ### **Number of features : 27 ( 2 Random Variables excluded )**

* ### **Target Variable : Appliances**

### **Apart from `date`, which is read as a string, all features are numerical. There are no categorical variables. There seem to be no null values in our data set.**

In [None]:
# Rechecking for null values: this counts the columns that contain at least one null
energy_df.isnull().any().sum()
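
The check in the cell above returns a single number. When nulls are present, a per-column breakdown is usually more informative. A minimal sketch on a small hypothetical frame (the values are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a few missing values, for illustration only.
demo_df = pd.DataFrame({
    "T1": [19.9, np.nan, 20.1, 20.0],
    "RH_1": [47.6, 46.7, np.nan, np.nan],
    "Appliances": [60, 60, 50, 50],
})

# Number of columns containing at least one null (same check as the cell above).
cols_with_nulls = demo_df.isnull().any().sum()

# Per-column null counts and percentages show where the gaps are.
null_report = pd.DataFrame({
    "null_count": demo_df.isnull().sum(),
    "null_pct": demo_df.isnull().mean() * 100,
})

print("columns with nulls:", cols_with_nulls)
print(null_report)
```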

## ***DESCRIPTIVE STATISTICAL ANALYSIS***

---
### Here we will use the pandas describe method to build an intuition for the basic behaviour of the data; furthermore, we will use pandas profiling to gain a deeper understanding of the data.


In [None]:
# Now let's use pandas describe method
energy_df.describe()