Script for 'building_heating' program in Python<br>
Licensed under the Apache License, Version 2.0<br>
http://www.apache.org/licenses/LICENSE-2.0

In the first part of the analysis, we construct a schematic representation of the building.

In [127]:
# The magic command "%matplotlib notebook" to make interactive plots within the Jupyter Notebook
# Import numpy library (for array operations)
# Import matplotlib.pyplot interface (for MATLAB-like plots)

%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

In [128]:
# Coordinate index arrays x, y, z for the building plot

x, y, z = np.indices((12, 26, 6))

In [129]:
# Definition of the building volumes and colors

ground = (x < 12) & (y < 26) & (z < 1)
floors = (x < 12) & (y < 26) & (1 <= z) & (z < 6)
building = ground | floors
    
building_color = np.empty(building.shape, dtype=object)
building_color[ground] = 'grey'
building_color[floors] = 'white'

In [130]:
# Definition of the 4th floor apartments volumes and colors

apart_41 = (0 <= x) & (x < 12) & (19 <= y) & (y < 26) & (3 <= z) & (z < 4)
apart_42 = (8 <= x) & (x < 12) & (10 <= y) & (y < 19) & (3 <= z) & (z < 4)
apart_43 = (6 <= x) & (x < 12) & (0 <= y) & (y < 8) & (3 <= z) & (z < 4)
apart_44 = (0 <= x) & (x < 6) & (0 <= y) & (y < 10) & (3 <= z) & (z < 4)
apart_45 = (0 <= x) & (x < 6) & (10 <= y) & (y < 17) & (3 <= z) & (z < 4)
apartments = apart_41 | apart_42 | apart_43 | apart_44 | apart_45

apartments_color = np.empty(building.shape, dtype=object)
apartments_color[apart_41] = 'gold'
apartments_color[apart_42] = 'red'
apartments_color[apart_43] = 'blue'
apartments_color[apart_44] = 'violet'
apartments_color[apart_45] = 'green'
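
The boolean masks above can be checked before plotting. The following is a minimal sketch, not part of the original analysis, that rebuilds the five apartment masks on the same grid and verifies that no cell is claimed by two apartments. The names `level`, `apartment_masks`, `cells` and `overlap` are introduced here for illustration only.

```python
import numpy as np

# Same grid as above: 12 x 26 cells per level, 6 levels (z = 0 is the ground)
x, y, z = np.indices((12, 26, 6))
level = (3 <= z) & (z < 4)

# Same bounds as apart_41 ... apart_45 (the trivial "0 <=" conditions are dropped)
apartment_masks = {
    '41': (x < 12) & (19 <= y) & (y < 26) & level,
    '42': (8 <= x) & (x < 12) & (10 <= y) & (y < 19) & level,
    '43': (6 <= x) & (x < 12) & (y < 8) & level,
    '44': (x < 6) & (y < 10) & level,
    '45': (x < 6) & (10 <= y) & (y < 17) & level,
}

# Number of grid cells covered by each apartment
cells = {name: int(mask.sum()) for name, mask in apartment_masks.items()}

# Summing the masks as integers counts how many apartments claim each cell
overlap = sum(mask.astype(int) for mask in apartment_masks.values())

print(cells)
print('max apartments per cell:', int(overlap.max()))
print('cells of this level without apartment:', int((level & (overlap == 0)).sum()))
```

A maximum of 1 means the masks are disjoint; the cells left without an apartment are the ones that stay white in the plot.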

In [131]:
# Interactive 3D plot of the building

# Parameters, title and text of the 3D plot

ax = plt.figure(figsize=(6, 6)).add_subplot(projection='3d')
ax.voxels(building, facecolors=building_color, alpha=0.4)
ax.voxels(apartments, facecolors=apartments_color)

plt.title('Apartment distribution (4th floor)')

ax.text(8.5, -3, 3.3, '43', weight='bold')
ax.text(2.5, -3, 3.3, '44', weight='bold')
ax.text(7, 29, 3.3, '41', weight='bold')
ax.text(13.5, 13.5, 3.3, '42', weight='bold')
ax.text(-1.5, 14.5, 3.3, '45', weight='bold')

plt.tight_layout()


In the next part, we use Python libraries to set up the connection to the MySQL database containing the data of the building.

In [132]:
# Import os module (for interacting with the operating system)
# Import pandas library (for data analysis in Python)
# Import create_engine from the sqlalchemy toolkit (standard SQL toolkit in Python)
# Import load_dotenv from the dotenv module (for setting environment variables)

import os
import pandas as pd
from sqlalchemy import create_engine
from dotenv import load_dotenv

In [133]:
# Load the MySQL credentials of the building database from a (hidden) .env file

load_dotenv()
user = os.getenv('MySQL_USER')
passwd = os.getenv('MySQL_PASSWORD')
host = os.getenv('MySQL_HOST')
port = os.getenv('MySQL_PORT')
db = os.getenv('MySQL_DB')

In [134]:
# Create a connection ("engine") to the MySQL database using the credentials

engine = create_engine('mysql://%s:%s@%s:%s/%s' % (user, passwd, host, port, db))
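
A connection URL built by plain string formatting can be misread if the password contains characters such as "@" or "/". The sketch below shows one way to guard against this by percent-encoding the credentials with the standard library before building the URL. The helper `mysql_url` and the credentials are hypothetical and only illustrate the idea.

```python
from urllib.parse import quote_plus


def mysql_url(user, passwd, host, port, db):
    """Build a 'mysql://' URL, percent-encoding the user name and the password."""
    return 'mysql://%s:%s@%s:%s/%s' % (
        quote_plus(user), quote_plus(passwd), host, port, db)


# Hypothetical credentials, for illustration only
url = mysql_url('analyst', 'p@ss/word', 'localhost', '3306', 'building')
print(url)
```

The resulting string could then be passed to `create_engine` in place of the formatted string used above.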

The analysis involves 3 tables from the database, "Apartments", "Power" and "Temperatures", that we now describe.

The Apartments table contains the list of all units of the building (31 apartments + 4 weather stations). The "esmart_id" is the identification number given to each unit by the firm eSMART. The column "name" is a more intuitive identification number for the units. For instance, the name "43" corresponds to floor 4, apartment 3.

In [135]:
# Read the Apartments table into a DataFrame "df_apart"

df_apart = pd.read_sql('SELECT * FROM Apartments', engine)
df_apart

Unnamed: 0,esmart_id,name
0,1046,11
1,1047,12
2,1048,13
3,1049,14
4,1050,15
5,1041,21
6,1042,22
7,1043,23
8,1044,24
9,1045,25


The Power table contains the power consumption data of each unit, starting from 2017-10-12. Each row of the table contains the "apartment_id" (= esmart_id), the "module_id" of the module taking the measurement, namely,

- <b>17</b> for the electric power measured in watts [W],
- <b>18</b> for the energy measured in watt-hours [W⋅h],

the "date" of the measurement and the "value" measured by the module.

In [136]:
# Read the first 100 rows of Power table into a DataFrame "df_power"

df_power = pd.read_sql('SELECT * FROM Power LIMIT 100', engine)
df_power

Unnamed: 0,apartment_id,module_id,date,value
0,1026,17,2017-10-12,9242
1,1026,17,2017-10-13,10658
2,1026,17,2017-10-14,11523
3,1026,17,2017-10-15,12394
4,1026,17,2017-10-16,13260
...,...,...,...,...
95,1026,17,2018-01-15,387485
96,1026,17,2018-01-16,392215
97,1026,17,2018-01-17,396945
98,1026,17,2018-01-18,401808


The Temperatures table contains the temperatures in each unit, starting from 2021-03-15. Each row of the table contains the "apartment_id", the "module_id", which identifies the room where the measuring module is located, namely,

- <b>26</b> for the living room,
- <b>25</b> for dormitory 1,
- <b>1</b> for dormitory 2,
- <b>2</b> for dormitory 3,
- <b>37</b> for dormitory 4,
- <b>38</b> for dormitory 5,
- <b>53</b> for bathroom 1,
- <b>54</b> for bathroom 2,

the "date" of the measurement, the "action" of the module (<b>get</b> for the temperature measured in the room and <b>set</b> for the temperature set in the room) and the "value" of the temperature.

In [137]:
# Read the first 100 rows of Temperatures table into a DataFrame "df_temp"

df_temp = pd.read_sql('SELECT * FROM Temperatures LIMIT 100', engine)
df_temp

Unnamed: 0,apartment_id,module_id,date,action,value
0,1026,1,2021-03-15 11:46:46,set,22.0
1,1026,1,2021-03-15 11:56:56,set,22.0
2,1026,1,2021-03-15 12:07:06,set,22.0
3,1026,1,2021-03-15 12:17:17,set,22.0
4,1026,1,2021-03-15 12:27:27,set,22.0
...,...,...,...,...,...
95,1026,1,2021-03-16 03:37:37,set,18.0
96,1026,1,2021-03-16 03:47:47,set,18.0
97,1026,1,2021-03-16 03:57:57,set,18.0
98,1026,1,2021-03-16 04:08:07,set,18.0


In this part of the analysis, we retrieve the energy consumption data of each unit of the building and plot it in a pie chart, over any prescribed period of time defined by a "start_date" and an "end_date".

In [138]:
# Set start_date and end_date for the DataFrame of energy consumption

start_date = '"2021-03-15"'
end_date = '"2022-12-31"'

# Read, for each unit of the building, the first energy after the start date into a DataFrame "df_start"

df_start = pd.read_sql('SELECT Apartments.name, Power.date AS initial_date, Power.value AS initial_energy \
                   FROM Power INNER JOIN Apartments ON Power.apartment_id=Apartments.esmart_id \
                   WHERE Power.module_id=18 AND Power.date>=%s \
                   GROUP BY Apartments.name' % start_date, engine)

# Read, for each unit of the building, the last energy before the end date into a DataFrame "df_end"

df_end = pd.read_sql('SELECT Apartments.name, Power.date AS final_date, Power.value AS final_energy \
                   FROM Power INNER JOIN Apartments ON Power.apartment_id=Apartments.esmart_id \
                   WHERE Power.module_id=18 AND Power.date<=%s \
                   ORDER BY Power.date DESC LIMIT 31' % end_date, engine)

# Merging 'df_start' and 'df_end' into a DataFrame 'df_energy' (using 'name' as key)
# (in 2018 there are missing measures; the first measure for unit "O2" is at initial_date "2018-10-10")

df_energy = pd.merge(df_start, df_end, how='left', on='name')
df_energy

Unnamed: 0,name,initial_date,initial_energy,final_date,final_energy
0,11,2021-03-15,9385000,2022-04-12 15:39:12,10770000
1,12,2021-03-15,8045000,2022-04-12 15:31:30,10017000
2,13,2021-03-15,10177000,2022-04-12 15:33:03,12517000
3,14,2021-03-15,9686000,2022-04-12 15:33:33,13449000
4,15,2021-03-15,9526000,2022-04-12 15:38:19,12091000
5,21,2021-03-15,7672000,2022-04-12 15:33:42,8929000
6,22,2021-03-15,11263000,2022-04-12 15:40:40,12114000
7,23,2021-03-15,2509000,2022-04-12 15:40:19,2666000
8,24,2021-03-15,2779000,2022-04-12 15:34:35,3528000
9,25,2021-03-15,3092000,2022-04-12 15:33:17,4043000
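
The "first reading after the start date" and "last reading before the end date" can also be selected on the pandas side, which makes the choice of row per unit explicit instead of relying on how the database resolves `GROUP BY` and `LIMIT`. The following is a minimal sketch under that assumption; the readings are hypothetical and the names `readings`, `window` and `summary` are not part of the original notebook.

```python
import pandas as pd

# Hypothetical cumulative energy readings [W.h] for two units
readings = pd.DataFrame({
    'name': ['11', '11', '11', '12', '12', '12'],
    'date': pd.to_datetime(['2021-03-14', '2021-03-15', '2021-03-17',
                            '2021-03-15', '2021-03-16', '2021-03-18']),
    'value': [900, 1000, 1300, 2000, 2100, 2500],
})

start, end = pd.Timestamp('2021-03-15'), pd.Timestamp('2021-03-17')

# Keep the readings inside the period and sort them chronologically
window = readings[(readings['date'] >= start) & (readings['date'] <= end)]
window = window.sort_values('date')

# First and last reading of each unit inside the period
first = window.groupby('name').first().add_prefix('initial_')
last = window.groupby('name').last().add_prefix('final_')

summary = first.join(last)
summary['consumption'] = summary['final_value'] - summary['initial_value']
print(summary)
```

Because the rows are sorted by date before grouping, `first()` and `last()` return the earliest and latest reading of each unit within the period.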


In [149]:
# Pie chart plot of the energy consumption in the building

# Set in a "units" list the names of the pie chart wedges
# Set in a "data" list the differences between final_energy and initial_energy
# Set in a "scales" list the scales of the pie chart wedges
# Set in a "wedges" dictionary the properties of the pie chart wedges
# Set in a "colors" list the colors of the pie chart wedges
# See https://matplotlib.org/stable/tutorials/colors/colormaps.html for details

units = (df_energy['name']).tolist()
data = (df_energy['final_energy']-df_energy['initial_energy']).tolist()
scales = [0.15 if data[i] == max(data) or data[i] == min(data) else 0.1 for i in range(len(data))]
wedges = {'width':0.33, 'edgecolor':'black', 'linewidth':0.5}
colors = plt.colormaps['tab20b'](range(20))

# Parameters and title of the pie chart plot

_, ax = plt.subplots(figsize=(8, 8))
_, _, pcts = ax.pie(data, labels=units, explode=scales, radius=0.9, autopct='%.1f%%', colors=colors,
                    wedgeprops=wedges, pctdistance=0.82, textprops={'size': 'small'})
ax.set_title('Energy consumption of each unit between %s and %s \n \
(biggest energy consumer: unit \"%s\", smallest energy consumer: unit \"%s\")'
             % (start_date.strip('"'), end_date.strip('"'), units[data.index(max(data))], units[data.index(min(data))]))
plt.setp(pcts, color='w')

plt.tight_layout()
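
The quantities fed to the pie chart can be worked out by hand for the ten rows of "df_energy" displayed above. This sketch uses only those ten rows, so the shares are relative to that subset and not to the whole building; the variable names are introduced here for illustration.

```python
# (name, initial_energy, final_energy) in W.h, copied from the ten rows shown above
rows = [
    ('11', 9385000, 10770000), ('12', 8045000, 10017000),
    ('13', 10177000, 12517000), ('14', 9686000, 13449000),
    ('15', 9526000, 12091000), ('21', 7672000, 8929000),
    ('22', 11263000, 12114000), ('23', 2509000, 2666000),
    ('24', 2779000, 3528000), ('25', 3092000, 4043000),
]

# Consumption over the period, converted from W.h to kW.h
consumption_kwh = {name: (final - initial) / 1000 for name, initial, final in rows}
total_kwh = sum(consumption_kwh.values())

biggest = max(consumption_kwh, key=consumption_kwh.get)
smallest = min(consumption_kwh, key=consumption_kwh.get)

# Same rule as the "scales" list: pull the extreme wedges out a little further
explode = [0.15 if name in (biggest, smallest) else 0.1 for name in consumption_kwh]

for name, kwh in consumption_kwh.items():
    print('unit %s: %.0f kWh (%.1f%% of the ten units)' % (name, kwh, 100 * kwh / total_kwh))
print('biggest:', biggest, '| smallest:', smallest)
```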


In [140]:
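# Set initial_date and final_date for the DataFrames of temperatures

initial_date = '"2021-03-15"'
final_date = '"2022-12-31"'

# Read, for each unit, the daily averages of the measured ("get") and set ("set") temperatures
# into the lists of DataFrames "dflist_get" and "dflist_set"

dflist_get, dflist_set = [], []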

for unit in units:
    temp_get = pd.read_sql('SELECT DATE(Temperatures.date) AS date, ROUND(AVG(Temperatures.value),2) AS \'unit %s (get)\' \
                       FROM Temperatures INNER JOIN Apartments ON Temperatures.apartment_id=Apartments.esmart_id \
                       WHERE Apartments.name=\'%s\' AND Temperatures.action="get" \
                       AND Temperatures.date>=%s AND Temperatures.date<=%s \
                       AND TIME(Temperatures.date) BETWEEN "00:00:00" AND "23:59:59" \
                       GROUP BY DATE(Temperatures.date)' % (unit, unit, initial_date, final_date), engine)
    
    temp_set = pd.read_sql('SELECT DATE(Temperatures.date) AS date, ROUND(AVG(Temperatures.value),2) AS \'unit %s (set)\' \
                       FROM Temperatures INNER JOIN Apartments ON Temperatures.apartment_id=Apartments.esmart_id \
                       WHERE Apartments.name=\'%s\' AND Temperatures.action="set" \
                       AND Temperatures.date>=%s AND Temperatures.date<=%s \
                       AND TIME(Temperatures.date) BETWEEN "00:00:00" AND "23:59:59" \
                       GROUP BY DATE(Temperatures.date)' % (unit, unit, initial_date, final_date), engine)
    
    dflist_get.append(temp_get)
    dflist_set.append(temp_set)
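
The daily-average queries above insert the unit name and the dates into the SQL text with "%" formatting. A common alternative is to pass such values as bound parameters. The sketch below illustrates the idea with the standard-library `sqlite3` module as a stand-in for MySQL, on a tiny hypothetical Temperatures table; the rows, the apartment id filter and the exclusive upper bound on the date are assumptions made for this example.

```python
import sqlite3

# In-memory stand-in for the building database (hypothetical rows)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Temperatures '
             '(apartment_id INTEGER, module_id INTEGER, date TEXT, action TEXT, value REAL)')
conn.executemany('INSERT INTO Temperatures VALUES (?, ?, ?, ?, ?)', [
    (1026, 1, '2021-03-15 11:46:46', 'set', 22.0),
    (1026, 1, '2021-03-15 11:56:56', 'get', 21.5),
    (1026, 1, '2021-03-15 12:07:06', 'get', 21.7),
    (1026, 1, '2021-03-16 03:37:37', 'get', 20.0),
    (1026, 1, '2021-03-16 03:47:47', 'set', 18.0),
])

# The "?" placeholders are filled by the driver, not by string formatting
query = ('SELECT DATE(date) AS day, ROUND(AVG(value), 2) AS mean_value '
         'FROM Temperatures '
         'WHERE apartment_id = ? AND action = ? AND date >= ? AND date < ? '
         'GROUP BY DATE(date) ORDER BY day')

# The upper bound is the day after the last day of interest (exclusive)
daily_get = conn.execute(query, (1026, 'get', '2021-03-15', '2021-03-17')).fetchall()
print(daily_get)
```

With pandas, the same kind of placeholder values can be supplied through the `params` argument of `pd.read_sql`; the placeholder syntax depends on the database driver.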

In [165]:
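# Import reduce (for merging the list of DataFrames pairwise)
# Merge the per-unit DataFrames on 'date' (outer join) and add the row-wise mean, min and max

from functools import reduce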

merge_get = reduce(lambda x, y: pd.merge(round(x,2), round(y,2), on='date', how='outer'), dflist_get)
df_get = merge_get.copy(deep=True)

df_get['mean (get)'] = round((merge_get.drop('date', axis=1)).mean(axis=1),2)
df_get['min (get)'] = round((merge_get.drop('date', axis=1)).min(axis=1),2)
df_get['max (get)'] = round((merge_get.drop('date', axis=1)).max(axis=1),2)

merge_set = reduce(lambda x, y: pd.merge(round(x,2), round(y,2), on='date', how='outer'), dflist_set)
df_set = merge_set.copy(deep=True)

df_set['mean (set)'] = round((merge_set.drop('date', axis=1)).mean(axis=1),2)
df_set['min (set)'] = round((merge_set.drop('date', axis=1)).min(axis=1),2)
df_set['max (set)'] = round((merge_set.drop('date', axis=1)).max(axis=1),2)

df_get

Unnamed: 0,date,unit 11 (get),unit 12 (get),unit 13 (get),unit 14 (get),unit 15 (get),unit 21 (get),unit 22 (get),unit 23 (get),unit 24 (get),...,unit 55 (get),unit Communs (get),unit O1 (get),unit O2 (get),unit O3 (get),unit O4 (get),unit O5 (get),mean (get),min (get),max (get)
0,2021-03-15,21.61,22.20,21.30,22.22,22.31,21.73,23.08,22.22,21.52,...,21.60,20.82,21.03,21.23,18.37,21.37,23.08,21.70,18.37,23.50
1,2021-03-16,21.57,22.45,21.28,22.22,,21.92,23.22,22.05,21.43,...,21.93,21.16,21.01,21.13,18.61,21.53,23.21,21.78,18.61,23.53
2,2021-03-17,21.48,22.35,20.97,22.23,,21.61,23.27,22.15,21.36,...,21.99,20.96,20.68,21.20,18.33,21.55,23.23,21.68,18.33,23.64
3,2021-03-18,21.46,22.26,20.80,21.96,,21.49,23.36,22.11,21.45,...,22.10,21.06,20.69,21.37,18.35,21.42,23.13,21.68,18.35,23.71
4,2021-03-19,21.40,22.22,20.81,21.95,,21.45,23.22,22.23,21.35,...,22.00,21.17,20.70,21.78,18.72,21.56,22.81,21.67,18.72,23.72
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
316,2022-04-08,21.86,22.09,22.07,22.29,21.71,21.43,21.60,22.21,21.83,...,21.82,21.17,21.04,21.36,21.02,22.02,23.03,21.74,20.34,23.03
317,2022-04-09,21.78,21.84,22.13,22.40,22.17,21.54,21.64,22.21,21.86,...,21.95,21.68,21.00,21.29,21.18,22.05,23.21,21.78,20.28,23.21
318,2022-04-10,21.87,22.31,22.10,21.78,21.25,21.37,21.81,22.18,21.88,...,22.02,21.58,20.80,22.01,21.72,21.97,23.39,21.79,20.44,23.39
319,2022-04-11,22.08,22.36,22.06,22.17,21.63,21.52,21.84,22.34,21.92,...,22.16,21.66,20.59,21.53,21.58,22.13,23.38,21.85,20.59,23.38
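
The merge step can be illustrated on a small scale. This sketch, with hypothetical daily averages for three made-up units, shows how an outer merge on "date" keeps the days for which a unit has no value (as for unit 15 in the table above) and how the row-wise statistics skip those missing values.

```python
from functools import reduce

import pandas as pd

# Hypothetical daily averages; unit B has no value on 2021-03-16
frames = [
    pd.DataFrame({'date': ['2021-03-15', '2021-03-16'], 'unit A': [21.0, 22.0]}),
    pd.DataFrame({'date': ['2021-03-15'], 'unit B': [23.0]}),
    pd.DataFrame({'date': ['2021-03-15', '2021-03-16'], 'unit C': [19.0, 20.0]}),
]

# Pairwise outer merge on 'date': a day missing in one frame becomes NaN
merged = reduce(lambda left, right: pd.merge(left, right, on='date', how='outer'), frames)

# Row-wise statistics over the unit columns (NaN values are skipped by default)
values = merged.drop(columns='date')
stats = merged.copy()
stats['mean'] = values.mean(axis=1)
stats['min'] = values.min(axis=1)
stats['max'] = values.max(axis=1)
print(stats)
```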


In [148]:
# Plot the daily temperature averages of a chosen unit against the building mean, min and max

unit = '43'

_, (ax1, ax2) = plt.subplots(2, figsize=(8, 8))

ax1.plot(df_get['date'], df_get['max (get)'], color='k', label='max (get)')
ax1.plot(df_get['date'], df_get['unit %s (get)'% unit], color='r', label='unit %s (get)' % unit)
ax1.plot(df_get['date'], df_get['mean (get)'], color='c', label='mean (get)')
ax1.plot(df_get['date'], df_get['min (get)'], color='k', label='min (get)')

ax1.set_title('Daily temperature averages measured between %s and %s'
              % (initial_date.strip('"'), final_date.strip('"')), size='medium')

ax1.legend(fontsize='small')
ax1.set_ylabel('Temperature [°C]')

ax2.plot(df_set['date'], df_set['max (set)'], color='k', label='max (set)')
ax2.plot(df_set['date'], df_set['unit %s (set)'% unit], color='r', label='unit %s (set)' % unit)
ax2.plot(df_set['date'], df_set['mean (set)'], color='c', label='mean (set)')
ax2.plot(df_set['date'], df_set['min (set)'], color='k', label='min (set)')

ax2.set_title('Daily temperature averages set between %s and %s'
              % (initial_date.strip('"'), final_date.strip('"')), size='medium')

ax2.legend(fontsize='small')
ax2.set_xlabel('Date')
ax2.set_ylabel('Temperature [°C]')

plt.rcParams['xtick.labelsize'] = 'small'
plt.rcParams['ytick.labelsize'] = 'small'

plt.tight_layout()


In [163]:
# Daily deviation of each unit from the building mean, for measured (get) and set temperatures

deviations = pd.DataFrame()

for unit in units:
    deviations['unit %s deviation (get)'% unit] = df_get['unit %s (get)'% unit] - df_get['mean (get)']
    deviations['unit %s deviation (set)'% unit] = df_set['unit %s (set)'% unit] - df_set['mean (set)']

# Mean deviation of each unit over the period
# (the columns of "deviations" alternate: "get" at even positions, "set" at odd positions)

MD = pd.DataFrame()
MD['unit'] = units
MD['mean deviation (get)'] = deviations.loc[:, ::2].mean().tolist()
MD['mean deviation (set)'] = deviations.loc[:, 'unit 11 deviation (set)'::2].mean().tolist()
MD = round(MD, 2)
MD.sort_values(by=['mean deviation (get)'], inplace=True, ascending=False)
MD

Unnamed: 0,unit,mean deviation (get),mean deviation (set)
17,43,1.22,0.9
30,O5,1.19,2.38
6,22,0.63,0.56
3,14,0.52,1.43
0,11,0.48,-1.42
29,O4,0.43,-0.53
11,32,0.32,1.83
19,45,0.29,-0.17
18,44,0.26,-1.71
23,54,0.14,0.55
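
The mean deviation of a unit is the average, over the days of the period, of the difference between its daily temperature and the building mean of that day. The sketch below computes it for two hypothetical units; indexing by "date" is a choice made here so that the subtraction aligns values by day rather than by row position.

```python
import pandas as pd

# Hypothetical daily averages for two made-up units
df = pd.DataFrame({
    'date': ['2021-03-15', '2021-03-16', '2021-03-17'],
    'unit A': [22.0, 23.0, 24.0],
    'unit B': [20.0, 21.0, 20.0],
}).set_index('date')

# Building mean of each day, then the deviation of each unit from it
building_mean = df.mean(axis=1)
deviations = df.sub(building_mean, axis=0)

# Average deviation per unit over the period, sorted from warmest to coolest
mean_deviation = deviations.mean().round(2).sort_values(ascending=False)
print(mean_deviation)
```

A positive value means the unit is on average warmer than the building mean over the period, a negative value that it is cooler.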
