# Appliances Energy Prediction

## Data Description
The data set is sampled every 10 minutes over about 4.5 months. The temperature and humidity inside the house were monitored with a ZigBee wireless sensor network. Each wireless node transmitted temperature and humidity readings about every 3.3 minutes.
The wireless data was then averaged over 10-minute periods. The energy data was logged every 10 minutes with m-bus energy meters. Weather data from the nearest airport weather station (Chievres Airport, Belgium) was downloaded from a public data set from Reliable Prognosis (rp5.ru) and merged with the experimental data using the date and time column.
Two random variables are included in the data set for testing the regression models and for filtering out non-predictive attributes (parameters).
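
The averaging and merging steps described above can be illustrated with a small sketch. It is not the original preprocessing code: the raw readings, the weather values, and the variable names (`raw`, `weather`, `sensor_10min`, `merged`) are made up for illustration, and the weather table is assumed to be on the same 10-minute grid already.

```python
import numpy as np
import pandas as pd

# Hypothetical raw sensor readings, one every 200 s (roughly 3.3 minutes)
rng = np.random.default_rng(0)
raw_index = pd.date_range("2016-01-11 17:00:00", periods=18,
                          freq=pd.Timedelta(seconds=200))
raw = pd.DataFrame(
    {
        "T1": 19.0 + rng.normal(0, 0.1, len(raw_index)),
        "RH_1": 46.0 + rng.normal(0, 0.5, len(raw_index)),
    },
    index=raw_index,
)

# Average the readings over 10-minute periods
sensor_10min = raw.resample("10min").mean().rename_axis("date").reset_index()

# Hypothetical weather table, assumed to share the 10-minute time grid
weather = pd.DataFrame(
    {
        "date": pd.date_range("2016-01-11 17:00:00", periods=6, freq="10min"),
        "T_out": [6.6, 6.5, 6.4, 6.3, 6.2, 6.1],
    }
)

# Merge sensor and weather data on the shared date and time column
merged = sensor_10min.merge(weather, on="date", how="inner")
print(merged)
```

An inner join keeps only timestamps present in both tables, which is a convenient default when both sources are expected to cover the same period.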

## Environment Setup

Loading library packages

In [16]:
# Handling request errors
from urllib.error import URLError

# Notebook visual side effects
import warnings
warnings.filterwarnings('ignore')
import IPython.display as ipd

# Data manipulation and analysis
import pandas as pd
import numpy as np # Numerical analysis and computation

# Visualization
import matplotlib.pyplot as plt # Basic MATLAB-inspired visualization
import seaborn as sns # More aesthetic viz.

# OS operations
import os
from pathlib import Path # Platform-agnostic file path handling

# Set library options
pd.set_option('display.max_columns', None) # No cap on the number of dataframe columns displayed
pd.set_option('display.max_colwidth', None) # No cap on column width

# Reproducibility
import random
SEED = 2022
random.seed(SEED)
np.random.seed(SEED)
# PYTHONHASHSEED is read at interpreter startup, so this only affects subprocesses
os.environ['PYTHONHASHSEED'] = str(SEED)

## Loading DATA

In [8]:
HOME_PATH = Path.cwd()

LOCAL_DATA_URI = HOME_PATH / 'data' / 'energydata_complete.csv' # as pathlib.Path object
LOCAL_DATA_FILE_PATH = str(LOCAL_DATA_URI) # As string representation

REMOTE_DATA_URI = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00374/energydata_complete.csv'
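
The local path is only useful if the file is already on disk. One way to make sure it exists is to download the remote file once and reuse it afterwards. The sketch below assumes this caching behaviour is wanted; `ensure_local_copy` is a hypothetical helper, not part of the original notebook.

```python
from pathlib import Path
from urllib.request import urlretrieve


def ensure_local_copy(url: str, dest: Path) -> Path:
    """Download `url` to `dest` unless `dest` already exists; return `dest`."""
    if not dest.exists():
        # Create the target directory (e.g. ./data) if it is missing
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(dest))
    return dest


# Possible usage with the constants defined above (requires network access):
# ensure_local_copy(REMOTE_DATA_URI, LOCAL_DATA_URI)
```

Because the helper skips the download when the file is present, repeated notebook runs do not hit the remote server again.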

### Reading DATA

In [17]:
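try:
    # Prefer the remote copy of the data set
    energy_data = pd.read_csv(REMOTE_DATA_URI, encoding='latin', parse_dates=['date'])
except URLError:
    # Fall back to the local copy if the remote file is unreachable
    energy_data = pd.read_csv(LOCAL_DATA_URI, encoding='latin', parse_dates=['date'])

# Show the first rows regardless of which source was used
ipd.display(energy_data.head())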

Unnamed: 0,date,Appliances,lights,T1,RH_1,T2,RH_2,T3,RH_3,T4,RH_4,T5,RH_5,T6,RH_6,T7,RH_7,T8,RH_8,T9,RH_9,T_out,Press_mm_hg,RH_out,Windspeed,Visibility,Tdewpoint,rv1,rv2
0,2016-01-11 17:00:00,60,30,19.89,47.596667,19.2,44.79,19.79,44.73,19.0,45.566667,17.166667,55.2,7.026667,84.256667,17.2,41.626667,18.2,48.9,17.033333,45.53,6.6,733.5,92.0,7.0,63.0,5.3,13.275433,13.275433
1,2016-01-11 17:10:00,60,30,19.89,46.693333,19.2,44.7225,19.79,44.79,19.0,45.9925,17.166667,55.2,6.833333,84.063333,17.2,41.56,18.2,48.863333,17.066667,45.56,6.483333,733.6,92.0,6.666667,59.166667,5.2,18.606195,18.606195
2,2016-01-11 17:20:00,50,30,19.89,46.3,19.2,44.626667,19.79,44.933333,18.926667,45.89,17.166667,55.09,6.56,83.156667,17.2,41.433333,18.2,48.73,17.0,45.5,6.366667,733.7,92.0,6.333333,55.333333,5.1,28.642668,28.642668
3,2016-01-11 17:30:00,50,40,19.89,46.066667,19.2,44.59,19.79,45.0,18.89,45.723333,17.166667,55.09,6.433333,83.423333,17.133333,41.29,18.1,48.59,17.0,45.4,6.25,733.8,92.0,6.0,51.5,5.0,45.410389,45.410389
4,2016-01-11 17:40:00,60,40,19.89,46.333333,19.2,44.53,19.79,45.0,18.89,45.53,17.2,55.09,6.366667,84.893333,17.2,41.23,18.1,48.59,17.0,45.4,6.133333,733.9,92.0,5.666667,47.666667,4.9,10.084097,10.084097
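
A quick sanity check on the loaded data is to confirm that the timestamps really are 10 minutes apart, as the data description states. The sketch below runs on a small inline sample copied from the rows shown above (only a few columns are kept) so that it works without the full file. The names `sample`, `regular`, and `same_rv` are illustrative; the same checks could be applied to `energy_data`.

```python
import io

import pandas as pd

# A few columns from the first five rows displayed above
sample_csv = io.StringIO(
    "date,Appliances,lights,rv1,rv2\n"
    "2016-01-11 17:00:00,60,30,13.275433,13.275433\n"
    "2016-01-11 17:10:00,60,30,18.606195,18.606195\n"
    "2016-01-11 17:20:00,50,30,28.642668,28.642668\n"
    "2016-01-11 17:30:00,50,40,45.410389,45.410389\n"
    "2016-01-11 17:40:00,60,40,10.084097,10.084097\n"
)
sample = pd.read_csv(sample_csv, parse_dates=["date"])

# Time difference between consecutive rows
step = sample["date"].diff().dropna()
regular = bool((step == pd.Timedelta(minutes=10)).all())

# In the displayed rows the two random variables hold the same values
same_rv = bool((sample["rv1"] == sample["rv2"]).all())

print("10-minute spacing:", regular)
print("rv1 equals rv2 in sample:", same_rv)
```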
