Data section
====================

The data come from the smart meters and plug-level devices of three households and cover power consumption over a period of time. The primary variable of interest in the smart meter data is the sum of real power over all power phases consumed in the household, while the plug data provide appliance-level consumption.
I focus on the kitchen plugs of household 4 (measure type 01: Fridge, 02: Kitchen appliances, 08: Microwave) from 27 June 2012 to 23 January 2013. Specifically, I am interested in exploring the overall electricity consumption pattern of the household and how it changes over time, with a focus on identifying any trends or seasonality in the data.

Data-science questions
=============

How does the total electricity consumption of each household compare over time, and what are the trends?    

How does the electricity consumption of the kitchen plugs (Fridge, Kitchen appliances, and Microwave) vary over time in household 4?  

Data preparation and EDA
=====

Set up a dataframe of daily electricity consumption for the kitchen plugs (Fridge, Kitchen appliances, and Microwave) of household 4.

In [143]:
import pandas as pd
import numpy as np
import glob
import os
import plotly.graph_objects as go
import altair as alt
import plotly.io as pio

pio.renderers.default = "plotly_mimetype+notebook_connected"

In [59]:

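# glob returns paths in arbitrary order; sort them so the daily totals line up
# with the dates assigned later (assumes the file names sort chronologically)
plug01_files = sorted(glob.glob("./eco/04/01/" + "*.csv"))
plug02_files = sorted(glob.glob("./eco/04/02/" + "*.csv"))
plug08_files = sorted(glob.glob("./eco/04/08/" + "*.csv"))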

plug01 = []
# loop over the fridge CSV files (one file per day)
for f in plug01_files:
    # read the csv file
    df01 = pd.read_csv(f, names=["measurement"])
    # replace missing readings (coded as -1) with 0
    df01 = df01.replace(-1, 0)
    # sum all measurements of the day
    total01 = df01["measurement"].sum()
    # collect the daily totals
    plug01.append(total01)

df01 = pd.DataFrame(plug01)
df01.columns = ["Fridge measurement"]

# list of dates with no data
missing_dates = ['2012-09-06', '2012-09-07', '2012-09-08', '2012-09-09', '2012-09-10', '2012-10-26', '2012-10-27', '2012-10-28', '2012-10-29', '2012-10-30', '2012-10-31', '2012-11-01', '2012-11-02', '2012-11-03', '2012-11-04', '2012-11-05', '2012-11-06']

# complete range of dates, excluding the missing dates
# (the strings are converted to datetimes so the set difference matches them)
date_range = pd.date_range(start='2012-06-27', end='2013-01-23', freq='D')
valid_dates = date_range.difference(pd.to_datetime(missing_dates))

# add the dates as the first column of df01; the number of valid dates
# must equal the number of daily files
df01.insert(0, 'date', valid_dates)
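
The per-plug loops in this section repeat the same steps, so they could be folded into one function. The following is only a minimal sketch: `daily_totals` is a hypothetical helper that is not part of the original notebook, and it assumes that each CSV file holds one day of readings in a single column and that the file names sort chronologically. The two demo files are made up.

```python
import glob
import os
import tempfile

import pandas as pd


def daily_totals(folder, column):
    """Return one row per CSV file in `folder` holding that file's summed readings.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    totals = []
    # sort so that the rows follow file-name order (assumed to be chronological)
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        readings = pd.read_csv(path, names=["measurement"])["measurement"]
        # readings of -1 mark missing data; count them as 0, as the loops do
        totals.append(readings.replace(-1, 0).sum())
    return pd.DataFrame({column: totals})


# Tiny self-contained demonstration with two made-up daily files
with tempfile.TemporaryDirectory() as tmp:
    demo_files = {"2012-06-28.csv": [5, 5], "2012-06-27.csv": [10, -1, 20]}
    for name, values in demo_files.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("\n".join(str(v) for v in values))
    demo = daily_totals(tmp, "Fridge measurement")

print(demo)
```

With the real data, the call would look like `daily_totals("./eco/04/01/", "Fridge measurement")`, again under the assumptions stated above.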


In [55]:

plug02 = []
# loop over the kitchen appliances CSV files (one file per day)
for f in plug02_files:
    # read the csv file
    df02 = pd.read_csv(f, names=["measurement"])
    # replace missing readings (coded as -1) with 0
    df02 = df02.replace(-1, 0)
    # sum all measurements of the day
    total02 = df02["measurement"].sum()
    # collect the daily totals
    plug02.append(total02)

df02 = pd.DataFrame(plug02)
df02.columns = ["Kitchen appliances measurement"]
# df02

     Kitchen appliances measurement
0                      6.883157e+05
1                      1.695786e+06
2                      8.464724e+05
3                      7.781294e+05
4                      7.006198e+05
..                              ...
189                    9.080765e+05
190                    5.646038e+05
191                    2.163583e+06
192                    7.764368e+05


In [60]:
plug08 = []
# loop over the microwave CSV files (one file per day)
for f in plug08_files:
    # read the csv file
    df08 = pd.read_csv(f, names=["measurement"])
    # replace missing readings (coded as -1) with 0
    df08 = df08.replace(-1, 0)
    # sum all measurements of the day
    total08 = df08["measurement"].sum()
    # collect the daily totals
    plug08.append(total08)

df08 = pd.DataFrame(plug08)
df08.columns = ["Microwave measurement"]
# df08

     Microwave measurement
0             1.517307e+06
1             1.471807e+06
2             5.028336e+06
3             1.406613e+06
4             4.126534e+05
..                     ...
189           4.988689e+05
190           5.094321e+05
191           1.108172e+06
192           1.008480e+06


### Convert joules per day to kWh per day for all measurement columns
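
As a quick check on the conversion factor, the sketch below works through one example by hand. It assumes that each plug reading is a power value in watts covering one second, so that the sum of a day's readings is an energy in joules; that sampling rate is an assumption, not something stated in this section. The name `joules_to_kwh` and the 100 W load are made up for illustration.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s


def joules_to_kwh(joules):
    """Convert an energy in joules to kilowatt-hours."""
    return joules / JOULES_PER_KWH


# A constant 100 W load for one day, one reading per second (assumed rate)
samples_per_day = 24 * 60 * 60          # 86,400 readings
daily_joules = 100 * samples_per_day    # sum of the readings, in watt-seconds
daily_kwh = joules_to_kwh(daily_joules)

print(daily_kwh)  # 100 W for 24 h is 2.4 kWh
```

If the readings covered a different interval, the daily sum would have to be multiplied by that interval in seconds before dividing by 3,600,000.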

In [141]:
# concatenate columns from df01, df02, and df08 together
df_combined = pd.concat([df01, df02, df08], axis=1)

# convert Joules per day to kWh per day for the Fridge measurement column
df_combined['Fridge measurement'] = df_combined['Fridge measurement'] / 3600000

# convert Joules per day to kWh per day for the Kitchen appliances measurement column
df_combined['Kitchen appliances measurement'] = df_combined['Kitchen appliances measurement'] / 3600000

# convert Joules per day to kWh per day for the Microwave measurement column
df_combined['Microwave measurement'] = df_combined['Microwave measurement'] / 3600000

# round all measurements to 2 decimal places
df_combined = df_combined.round(2)

# print the resulting dataframe
print(df_combined)

# save df_combined as a CSV file
df_combined.to_csv("combined_data.csv", index=False)


          date  Fridge measurement  Kitchen appliances measurement  \
0   2012-06-27                0.73                            0.19   
1   2012-06-28                0.65                            0.47   
2   2012-06-29                0.61                            0.24   
3   2012-06-30                0.86                            0.22   
4   2012-07-01                0.77                            0.19   
..         ...                 ...                             ...   
189 2013-01-19                0.54                            0.25   
190 2013-01-20                0.52                            0.16   
191 2013-01-21                0.52                            0.60   
192 2013-01-22                0.51                            0.22   
193 2013-01-23                0.38                            0.14   

     Microwave measurement  
0                     0.42  
1                     0.41  
2                     1.40  
3                     0.39  
4             

Results section
=============

### Rationale for the design decisions


The first visualization shows the electricity consumption over time for the kitchen appliances measured with smart plugs in household 4. The x-axis represents time (in days), and the y-axis represents energy consumption in kWh per day. The different lines on the plot correspond to the different appliances, and a dropdown or interactive legend toggles the individual appliances on and off. This allows users to explore how the electricity consumption of the different kitchen appliances varies over time in household 4.

The second visualization could show the aggregate electricity consumption (powerallphases) over time for each of the three households. The x-axis would represent time (e.g. in hours or days), and the y-axis would represent the total electricity consumption in watts or kWh. The plot could have three lines, one for each household, with different colors or styles to distinguish them. Additionally, the plot could include a rolling average or trendline to highlight any patterns or trends in the data. This visualization would allow users to compare the overall electricity consumption of the three households and explore any differences or similarities over time.
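
The rolling average mentioned for the second visualization could be computed along the following lines. This is a sketch on synthetic numbers only: the household labels, the value range, and the 7-day window are assumptions chosen for illustration, not results from the data.

```python
import numpy as np
import pandas as pd

# Synthetic daily totals for three placeholder households (not real data)
dates = pd.date_range("2012-06-27", periods=14, freq="D")
rng = np.random.default_rng(0)
daily = pd.DataFrame(
    {name: rng.uniform(5, 15, size=len(dates)) for name in ("A", "B", "C")},
    index=dates,
)

# 7-day centred rolling mean; min_periods=1 keeps values at the edges
# instead of producing NaN where the window is incomplete
trend = daily.rolling(window=7, center=True, min_periods=1).mean()

print(trend.head())
```

Each column of `trend` could then be drawn as a second, smoother line next to the corresponding daily series.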

### Visualizations

In this code, we first import the necessary modules from Plotly. We then create a figure with one subplot per appliance using the make_subplots function, with one row and three columns, and set the subplot titles to 'Fridge', 'Kitchen Appliances', and 'Microwave'.

We then add the data for each appliance as a go.Scatter trace using the figure's add_trace method. We specify the x and y data, a name, and the target row and column for each trace.

Next, we set the x-axis title using the update_xaxes method, and set the y-axis title for each subplot using the update_yaxes method. We also add a title to the plot using the update_layout method.

In [138]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# create a subplot for each appliance
fig = make_subplots(rows=1, cols=3, subplot_titles=('Fridge', 'Kitchen Appliances', 'Microwave'))

# plot the data for each appliance
fig.add_trace(go.Scatter(x=df_combined['date'], y=df_combined['Fridge measurement'], name='Fridge'), row=1, col=1)
fig.add_trace(go.Scatter(x=df_combined['date'], y=df_combined['Kitchen appliances measurement'], name='Kitchen Appliances'), row=1, col=2)
fig.add_trace(go.Scatter(x=df_combined['date'], y=df_combined['Microwave measurement'], name='Microwave'), row=1, col=3)

# set the x-axis title
fig.update_xaxes(title_text='Date')

# set the y-axis title for each subplot
fig.update_yaxes(title_text='Energy Consumption (kWh/day)', row=1, col=1)
fig.update_yaxes(title_text='Energy Consumption (kWh/day)', row=1, col=2)
fig.update_yaxes(title_text='Energy Consumption (kWh/day)', row=1, col=3)

# add a title to the plot
fig.update_layout(title_text='Energy Consumption by Appliance')

# show the plot
fig.show()


This code creates a time series graph with three traces, one for each energy consumption measurement (fridge, kitchen appliances, and microwave). The x-axis represents the date, and the y-axis represents the energy consumption in kilowatt-hours per day. A dropdown menu selects which trace is shown, and only the fridge trace is visible initially. Each trace has a different line color and marker symbol to distinguish the measurements.

In [139]:
import plotly.graph_objects as go

# INITIALIZE GRAPH OBJECT
fig = go.Figure()

# TRACE-1: Fridge measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Fridge measurement"],
        mode="lines+markers",
        marker=dict(
            color=df_combined["Fridge measurement"],
            size=5,
            symbol="circle",
            line=dict(color="DarkBlue", width=1),
            colorbar=dict(title="Fridge Energy Consumption (kWh per day)")
        ),
        line=dict(color="blue", width=1.5),
        name="Fridge",
        visible=True,
    )
)

# TRACE-2: Kitchen appliances measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Kitchen appliances measurement"],
        mode="lines+markers",
        marker=dict(
            color=df_combined["Kitchen appliances measurement"],
            size=5,
            symbol="square",
            line=dict(color="DarkGreen", width=1),
            colorbar=dict(title="Kitchen Appliances Energy Consumption (kWh per day)")
        ),
        line=dict(color="green", width=1.5, dash="dot"),
        name="Kitchen appliances",
        visible=False,
    )
)

# TRACE-3: Microwave measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Microwave measurement"],
        mode="lines+markers",
        marker=dict(
            color=df_combined["Microwave measurement"],
            size=5,
            symbol="diamond",
            line=dict(color="DarkRed", width=1),
            colorbar=dict(title="Microwave Energy Consumption (kWh per day)")
        ),
        line=dict(color="red", width=1.5, dash="dash"),
        name="Microwave",
        visible=False,
    )
)

# SET THEME, AXIS LABELS
fig.update_layout(
    template="plotly_white",
    xaxis_title="Date",
    yaxis_title="Energy Consumption (kWh per day)",
    title="Daily Energy Consumption",
)

# DROPDOWN MENUS
fig.update_layout(
    updatemenus=[
        dict(
            buttons=[
                dict(
                    label="Fridge",
                    method="update",
                    args=[{"visible": [True, False, False]},
                        {"title": "Fridge measurement (days)"}]
                    ),
                dict(
                    label="Kitchen appliances",
                    method="update",
                    args=[{"visible": [False, True, False]},
                        {"title": "Kitchen appliance measurement (days)"}]
                    ),
                dict(
                    label="Microwave",
                    method="update",
                    args=[{"visible": [False, False, True]},
                        {"title": "Microwave measurement (days)"}]
                    ),
            ],
            direction="down",
            showactive=True,
            pad={"r": 10, "t": 10},
            x=0,
            y=1.15,
            xanchor="left",
            yanchor="top",
        )
    ]
)

fig.show()


In [140]:
import plotly.graph_objects as go

# INITIALIZE GRAPH OBJECT
fig = go.Figure()

# TRACE-1: Fridge measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Fridge measurement"],
        mode="lines+markers",
        marker=dict(
            size=5,
            symbol="circle",
            line=dict(color="DarkBlue", width=1),
        ),
        line=dict(color="blue", width=1.5),
        name="Fridge",
        visible=True,
    )
)

# TRACE-2: Kitchen appliances measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Kitchen appliances measurement"],
        mode="lines+markers",
        marker=dict(
            size=5,
            symbol="square",
            line=dict(color="DarkGreen", width=1),
        ),
        line=dict(color="green", width=1.5, dash="dot"),
        name="Kitchen appliances",
        visible=True,
    )
)

# TRACE-3: Microwave measurement
fig.add_trace(
    go.Scatter(
        x=df_combined["date"],
        y=df_combined["Microwave measurement"],
        mode="lines+markers",
        marker=dict(
            size=5,
            symbol="diamond",
            line=dict(color="DarkRed", width=1),
        ),
        line=dict(color="red", width=1.5, dash="dash"),
        name="Microwave",
        visible=True,
    )
)

# SET THEME, AXIS LABELS
fig.update_layout(
    template="plotly_white",
    xaxis_title="Date",
    yaxis_title="Energy Consumption (kWh per day)",
    title="Daily Energy Consumption",
)


fig.show()


Fridge: The electricity consumption of a fridge depends on its age, size, and energy efficiency. In general, older and larger fridges consume more electricity than newer and smaller ones. In the rows of the combined table shown above, the fridge of household 4 consumes between about 0.4 and 0.9 kWh per day.

Kitchen appliances: The coffee machine, bread baking machine, and toaster are connected to the same plug, so this measurement is their combined consumption and the individual contributions cannot be separated. The consumption of each appliance may also vary depending on factors such as usage time and energy efficiency. In the rows shown above, the combined daily totals range from about 0.1 to 0.6 kWh.

Microwave: The electricity consumption of the microwave varies strongly from day to day. In the daily totals shown above, converted to kWh, it ranges from about 0.1 to 1.4 kWh per day.
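
Per-appliance figures like those discussed above can be checked directly against the combined table. The sketch below uses a few made-up rows in the same column layout as `combined_data.csv`; the numbers are illustrative only and the variable names are assumptions. It computes overall summary statistics and monthly means, which is one simple way to look for seasonality.

```python
import pandas as pd

# Made-up rows in the layout of combined_data.csv (illustrative values only)
df = pd.DataFrame({
    "date": pd.to_datetime(["2012-06-27", "2012-06-28", "2012-07-01", "2012-07-02"]),
    "Fridge measurement": [0.7, 0.6, 0.8, 0.9],
    "Kitchen appliances measurement": [0.2, 0.4, 0.2, 0.2],
    "Microwave measurement": [0.4, 0.4, 0.1, 0.3],
})

# Mean, minimum and maximum daily consumption per appliance (kWh per day)
summary = df.drop(columns="date").agg(["mean", "min", "max"])

# Mean daily consumption per calendar month, indexed by month start
monthly = df.set_index("date").resample("MS").mean()

print(summary)
print(monthly)
```

With the real file, `df` would instead come from `pd.read_csv("combined_data.csv", parse_dates=["date"])`.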

In conclusion, the electricity consumption of different appliances, especially in the kitchen, varies over time in households depending on various factors such as age, size, and energy efficiency. It is important for households to consider these factors when purchasing new appliances and to use them efficiently to reduce their overall energy consumption.

References
======