Script for 'building_heating' program in Python<br>
Licensed under the Apache License, Version 2.0<br>
http://www.apache.org/licenses/LICENSE-2.0

In the first part of the analysis, we construct a schematic representation of the building.

In [1]:
# The magic command "%matplotlib notebook" to make interactive plots within the Jupyter Notebook
# Import numpy library (for arrays operations)
# Import matplotlib.pyplot interface (for MATLAB-like plots)

%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt

In [2]:
# Range of coordinates indices x, y, z for the building plot

x, y, z = np.indices((12, 26, 6))

In [3]:
# Definition of the building volumes and colors

ground = (x < 12) & (y < 26) & (z < 1)
floors = (x < 12) & (y < 26) & (1 <= z) & (z < 6)
building = ground | floors
    
building_color = np.empty(building.shape, dtype=object)
building_color[ground] = 'grey'
building_color[floors] = 'white'
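
The boolean masks above can be checked on their own. The following is a minimal sketch that uses the same grid shape as the notebook; the conditions on x and y are omitted because they are always true on a 12 x 26 grid, and the printed counts are only there to illustrate the selection.

```python
import numpy as np

# Same grid as in the notebook: 12 x 26 cells on 6 levels.
x, y, z = np.indices((12, 26, 6))

# Each comparison yields a boolean array of shape (12, 26, 6);
# combining comparisons with & keeps the voxels that satisfy all of them.
ground = z < 1
floors = (1 <= z) & (z < 6)

print(ground.shape)       # (12, 26, 6)
print(int(ground.sum()))  # 312 voxels on the ground level (12 * 26)
print(int(floors.sum()))  # 1560 voxels on the five upper levels

# An object array holds one colour name per voxel; unassigned cells stay None.
colors = np.empty(ground.shape, dtype=object)
colors[ground] = 'grey'
colors[floors] = 'white'
```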

In [4]:
# Definition of the 4th floor apartments volumes and colors

apart_41 = (0 <= x) & (x < 12) & (19 <= y) & (y < 26) & (3 <= z) & (z < 4)
apart_42 = (8 <= x) & (x < 12) & (10 <= y) & (y < 19) & (3 <= z) & (z < 4)
apart_43 = (6 <= x) & (x < 12) & (0 <= y) & (y < 8) & (3 <= z) & (z < 4)
apart_44 = (0 <= x) & (x < 6) & (0 <= y) & (y < 10) & (3 <= z) & (z < 4)
apart_45 = (0 <= x) & (x < 6) & (10 <= y) & (y < 17) & (3 <= z) & (z < 4)
apartments = apart_41 | apart_42 | apart_43 | apart_44 | apart_45

apartments_color = np.empty(building.shape, dtype=object)
apartments_color[apart_41] = 'gold'
apartments_color[apart_42] = 'red'
apartments_color[apart_43] = 'blue'
apartments_color[apart_44] = 'violet'
apartments_color[apart_45] = 'green'

In [5]:
# Interactive 3D plot of the building

# Parameters, title and text of the 3D plot

ax = plt.figure(figsize=(6, 6)).add_subplot(projection='3d')
ax.voxels(building, facecolors=building_color, alpha=0.4)
ax.voxels(apartments, facecolors=apartments_color)

plt.title('Apartment distribution (4th floor)')

ax.text(8.5, -3, 3.3, '43', weight='bold')
ax.text(2.5, -3, 3.3, '44', weight='bold')
ax.text(7, 29, 3.3, '41', weight='bold')
ax.text(13.5, 13.5, 3.3, '42', weight='bold')
ax.text(-1.5, 14.5, 3.3, '45', weight='bold')

plt.tight_layout()


In the next part, we use Python libraries to set up a connection to the MySQL database containing the building data.

In [6]:
# Import os module (for interacting with the operating system)
# Import pandas library (for data analysis in Python)
# Import create_engine from the sqlalchemy toolkit (standard SQL toolkit in Python)
# Import load_dotenv from the dotenv module (for setting environment variables)

import os
import pandas as pd
from sqlalchemy import create_engine
from dotenv import load_dotenv

In [7]:
# Load the MySQL credentials for the building database from a (hidden) .env file

load_dotenv()
user = os.getenv('MySQL_USER')
passwd = os.getenv('MySQL_PASSWORD')
host = os.getenv('MySQL_HOST')
port = os.getenv('MySQL_PORT')
db = os.getenv('MySQL_DB')

In [8]:
# Create a connection ("engine") to the MySQL database using the credentials

engine = create_engine('mysql://%s:%s@%s:%s/%s' % (user, passwd, host, port, db))
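
A password containing characters such as "@" or "/" would be misread as part of the URL structure. The sketch below shows one possible way to percent-encode the credentials with the standard library before formatting the URL. The helper name `build_mysql_url` and the credentials are hypothetical and only serve as an illustration.

```python
from urllib.parse import quote_plus

def build_mysql_url(user, passwd, host, port, db):
    """Return a 'mysql://' URL with the user name and password percent-encoded."""
    return 'mysql://%s:%s@%s:%s/%s' % (quote_plus(user), quote_plus(passwd), host, port, db)

# Placeholder credentials, for illustration only.
url = build_mysql_url('reader', 'p@ss/word', 'localhost', '3306', 'building')
print(url)  # mysql://reader:p%40ss%2Fword@localhost:3306/building
```

The resulting string could then be passed to `create_engine` in place of the plain formatted URL.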

The analysis involves 3 tables from the database, "Apartments", "Power" and "Temperatures", that we now describe.

The Apartments table contains the list of all units of the building (31 apartments + 4 weather stations). The "esmart_id" is the identification number given to each unit by the firm eSMART. The column "name" is a more intuitive identification number for the units. For instance, the name "43" corresponds to floor 4, apartment 3.
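
The naming convention can be made explicit with a small helper. This is only a sketch: the function name `split_unit_name` is hypothetical, and names that do not follow the two-digit pattern (the outputs further down show names such as "Communs" or "O1") are simply returned as None.

```python
def split_unit_name(name):
    """Split a two-digit unit name such as '43' into (floor, apartment).

    Returns None for names that do not follow the two-digit pattern.
    """
    name = str(name)
    if len(name) == 2 and name.isdecimal():
        return int(name[0]), int(name[1])
    return None

print(split_unit_name('43'))       # (4, 3)
print(split_unit_name('Communs'))  # None
```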

In [9]:
# Read the Apartments table into a DataFrame "df_apart"

df_apart = pd.read_sql('SELECT * FROM Apartments', engine)
df_apart.head()

Unnamed: 0,esmart_id,name
0,1046,11
1,1047,12
2,1048,13
3,1049,14
4,1050,15


For future use, we store in a DataFrame "df_dim" the names, the areas and the volumes of the 31 units of the building provided by the architect.

In [10]:
units = tuple(list(df_apart['name'])[:31])

areas = (127.1, 55.8, 78.1, 90.2, 46.9, 113.6, 69.3, 78.1, 90.2, 46.9, 127.1, 55.8, 78.1, 90.2,
         46.9, 113.6, 69.3, 78.1, 90.2, 46.9, 113.6, 69.3, 78.1, 90.2, 46.9, 82.5, 70.2, 120.6,
         85.5, 90.2, 46.9)

volumes = (330.46, 145.08, 203.06, 234.52, 121.94, 295.36, 180.18, 203.06, 234.52, 121.94, 330.46,
           145.08, 203.06, 234.52, 121.94, 295.36, 180.18, 203.06, 234.52, 121.94, 295.36, 180.18,
           203.06, 234.52, 121.94, 231, 203.58, 349.74, 247.95, 261.58, 136.01)

df_dim = pd.DataFrame({'unit': units, 'area': areas, 'volume': volumes})
df_dim.head()

Unnamed: 0,unit,area,volume
0,11,127.1,330.46
1,12,55.8,145.08
2,13,78.1,203.06
3,14,90.2,234.52
4,15,46.9,121.94
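
As a quick consistency check of the two lists, the ratio volume/area can be computed for each unit. The sketch below uses the five rows shown above and assumes that areas are given in square metres and volumes in cubic metres, in which case the ratio can be read as an implied ceiling height in metres.

```python
# (area, volume) pairs copied from the first five rows of df_dim.
dimensions = {
    '11': (127.1, 330.46),
    '12': (55.8, 145.08),
    '13': (78.1, 203.06),
    '14': (90.2, 234.52),
    '15': (46.9, 121.94),
}

# Ratio volume/area for each unit, rounded to two decimals.
heights = {unit: round(volume / area, 2) for unit, (area, volume) in dimensions.items()}
print(heights)
```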


The Power table contains data about the heating power consumption of each unit, starting from 2017-10-12. Each row of the table contains the "apartment_id" (= esmart_id), the "module_id" of the module taking the measurement, namely,

- <b>17</b> for electric power measured in watts [W],
- <b>18</b> for energy measured in watt-hours [W⋅h],

the "date" of the measurement and the "value" measured by the module.

In [11]:
# Read the first 100 rows of Power table into a DataFrame "df_power"

df_power = pd.read_sql('SELECT * FROM Power LIMIT 100', engine)
df_power.head()

Unnamed: 0,apartment_id,module_id,date,value
0,1026,17,2017-10-12,9242
1,1026,17,2017-10-13,10658
2,1026,17,2017-10-14,11523
3,1026,17,2017-10-15,12394
4,1026,17,2017-10-16,13260


The Temperatures table contains data about the temperatures of each unit, starting from 2021-03-15. Each row of the table contains the "apartment_id", the "module_id" for the room location of the module taking the measurement, the "date" of the measurement, the "action" of the module (<b>get</b> for the temperature measured in the room and <b>set</b> for the temperature set in the room) and the "value" of the temperature.

In [12]:
# Read the first 100 rows of Temperatures table into a DataFrame "df_temp"

df_temp = pd.read_sql('SELECT * FROM Temperatures LIMIT 100', engine)
df_temp.head()

Unnamed: 0,apartment_id,module_id,date,action,value
0,1026,1,2021-03-15 11:46:46,set,22.0
1,1026,1,2021-03-15 11:56:56,set,22.0
2,1026,1,2021-03-15 12:07:06,set,22.0
3,1026,1,2021-03-15 12:17:17,set,22.0
4,1026,1,2021-03-15 12:27:27,set,22.0


In this part of the analysis, we retrieve the energy consumption of each unit of the building over any prescribed period between "start_date" and "end_date", and plot it as a pie chart and a bar chart.

In [147]:
# Set start_date and end_date for the DataFrame of energy consumption

start_date = '"2018-01-01"'
end_date = '"2018-12-31"'

# Read, for each unit of the building, the first energy reading after the start date into a DataFrame "df_start"
# We divide "Power.value" by 1000 to get energies in [kW⋅h]

df_start = pd.read_sql('SELECT Apartments.name AS unit, Power.date AS \'start date\', \
                   Power.value/1000 AS \'initial energy [kW⋅h]\' \
                   FROM Power INNER JOIN Apartments ON Power.apartment_id=Apartments.esmart_id \
                   WHERE Power.module_id=18 AND Power.date>=%s \
                   GROUP BY Apartments.name' % start_date, engine)

# Read, for each unit of the building, the last energy reading before the end date into a DataFrame "df_end"
# We divide "Power.value" by 1000 to get energies in [kW⋅h]

df_end = pd.read_sql('SELECT Apartments.name AS unit, Power.date AS \'end date\', \
                   Power.value/1000 AS \'final energy [kW⋅h]\' \
                   FROM Power INNER JOIN Apartments ON Power.apartment_id=Apartments.esmart_id \
                   WHERE Power.module_id=18 AND Power.date<=%s \
                   ORDER BY Power.date DESC LIMIT 31' % end_date, engine)

# Merge "df_start" and "df_end" into a DataFrame "df_energy" (using "unit" as key)
# (in 2018 there are missing measures; the first measure for unit "O2" is at initial_date "2018-10-10")

df_energy = pd.merge(df_start, df_end, how='left', on='unit')

# Store in "df_energy" the energy consumption and the energy consumption per cubic metre

df_energy['\u0394 energy [kW⋅h]'] = df_energy['final energy [kW⋅h]']-df_energy['initial energy [kW⋅h]']
df_energy['\u0394 energy per m\u00b3 [kW⋅h/m\u00b3]'] \
= [round(df_energy['\u0394 energy [kW⋅h]'][i]/volumes[i], 1) for i in range(len(volumes))]

df_energy.head()

Unnamed: 0,unit,start date,initial energy [kW⋅h],end date,final energy [kW⋅h],Δ energy [kW⋅h],Δ energy per m³ [kW⋅h/m³]
0,11,2018-01-01,2350.0,2018-12-31,5364.0,3014.0,9.1
1,12,2018-01-01,1119.0,2018-12-31,3365.0,2246.0,15.5
2,13,2018-01-01,1266.0,2018-12-31,3644.0,2378.0,11.7
3,14,2018-01-01,241.0,2018-12-31,2254.0,2013.0,8.6
4,15,2018-01-01,1586.0,2018-12-31,4378.0,2792.0,22.9
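
The same "last reading minus first reading" logic can also be expressed directly in pandas. The sketch below is an alternative formulation on invented meter readings, not the query used above; it assumes that the readings are cumulative counters in W⋅h, and all values and variable names are illustrative.

```python
import pandas as pd

# Toy cumulative meter readings in W·h (invented values, not from the database).
readings = pd.DataFrame({
    'unit':  ['11', '11', '11', '12', '12', '12'],
    'date':  pd.to_datetime(['2017-12-31', '2018-01-01', '2018-12-31',
                             '2018-01-01', '2018-06-30', '2019-01-01']),
    'value': [2300000, 2350000, 5364000, 1119000, 2000000, 3400000],
})

start, end = pd.Timestamp('2018-01-01'), pd.Timestamp('2018-12-31')

# Keep the readings inside the period, ordered by date within each unit.
window = readings[(readings['date'] >= start) & (readings['date'] <= end)]
window = window.sort_values(['unit', 'date'])

# First and last reading per unit, converted from W·h to kW·h.
energy = window.groupby('unit')['value'].agg(['first', 'last']) / 1000
energy['delta'] = energy['last'] - energy['first']
print(energy)
```

Sorting before grouping makes the choice of the first and last reading explicit for every unit, including units whose measurements start late or stop early within the period.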


We use the data stored in the DataFrame "df_energy" to produce a pie chart of the heating energy consumption of each unit of the building.

In [148]:
# Store in variable "data" the list of energy consumptions
# Store in variables "max_unit" and "min_unit" the units having maximum (resp. minimum) energy consumption

data = df_energy['\u0394 energy [kW⋅h]'].tolist()
max_unit = units[data.index(max(data))]
min_unit = units[data.index(min(data))]

# Check if there is energy consumption during the time period before doing the plot

if min(data)>0:

    # Set in a "scales" list the offsets (explode values) of the pie chart wedges
    # Set in a "wedges" dictionary the properties of the pie chart wedges
    # Set in a "colors" list the colors of the pie chart wedges
    # See https://matplotlib.org/stable/tutorials/colors/colormaps.html for details
    
    scales = [0.15 if data[i] == max(data) or data[i] == min(data) else 0.1 for i in range(len(data))]
    wedges = {'width':0.33, 'edgecolor':'black', 'linewidth':0.5}
    colors = plt.colormaps['tab20b_r'](range(20))

    # Parameters, title and legend of the pie chart plot

    _, ax = plt.subplots(figsize=(8, 8))
    _, _, pcts = ax.pie(data, labels=units, explode=scales, radius=0.9, autopct='%.1f%%', colors=colors,
                    wedgeprops=wedges, pctdistance=0.82, textprops={'size': 'small'})
    plt.setp(pcts, color='w')

    ax.set_title('Energy consumption for heating between %s and %s' % (eval(start_date), eval(end_date)))
    
    ax.text(0, 0, 'Total energy consumption: %s [kW⋅h]' % round(sum(data), 2),
            transform=ax.transAxes, size='small')
    ax.text(0, -0.02, 'Biggest energy consumer: unit "%s" with %s [kW⋅h]' \
            % (max_unit, max(data)), transform=ax.transAxes, size='small')
    ax.text(0, -0.04, 'Smallest energy consumer: unit "%s" with %s [kW⋅h]' \
            % (min_unit, min(data)), transform=ax.transAxes, size='small')

    plt.tight_layout()

else:
    print('Nothing to plot: no energy consumption between %s and %s' % (eval(start_date), eval(end_date)))


In [171]:
# Bar chart of the heating energy consumption per cubic metre of each unit
# Check if there is energy consumption during the time period before doing the plot

if min(data)>0:
    
    # Order "df_energy" by descending value of energy consumption per cubic metre
    
    df_energy = df_energy.sort_values('\u0394 energy per m\u00b3 [kW⋅h/m\u00b3]', ascending=False)

    # Parameters and title of the bar chart plot

    _, ax = plt.subplots(figsize=(8, 6))
    
    ax.set_title('Energy consumption for heating per cubic metre between %s and %s'
                 % (eval(start_date), eval(end_date)), size='medium')
    
    ax.set_xlabel('Unit')
    ax.set_ylabel('Energy consumption per cubic metre [kW⋅h/m\u00b3]')
    colors = plt.colormaps['jet_r'](range(2,64,2))
    plt.xticks(rotation=90)
    pl = ax.bar(df_energy['unit'], df_energy['\u0394 energy per m\u00b3 [kW⋅h/m\u00b3]'], color=colors, width=0.6)
    
    for bar in pl:
        plt.annotate(bar.get_height(), xy=(bar.get_x()-0.1, bar.get_height()+0.2), size='x-small')

    plt.tight_layout()

else:
    print('Nothing to plot: no energy consumption between %s and %s' % (eval(start_date), eval(end_date)))


In [16]:
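# Set initial_date and final_date for the DataFrames of temperatures
# Read, for each unit, the daily averages of the measured (get) and set temperatures
# into two lists of DataFrames "dflist_get" and "dflist_set"

initial_date = '"2018-01-01"'
final_date = '"2022-12-31"'

dflist_get, dflist_set = [], []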

for unit in units:
    temp_get = pd.read_sql('SELECT DATE(Temperatures.date) AS date, ROUND(AVG(Temperatures.value),2) AS \'unit %s (get)\' \
                       FROM Temperatures INNER JOIN Apartments ON Temperatures.apartment_id=Apartments.esmart_id \
                       WHERE Apartments.name=\'%s\' AND Temperatures.action="get" \
                       AND Temperatures.date>=%s AND Temperatures.date<=%s \
                       AND TIME(Temperatures.date) BETWEEN "00:00:00" AND "23:59:59" \
                       GROUP BY DATE(Temperatures.date)' % (unit, unit, initial_date, final_date), engine)
    
    temp_set = pd.read_sql('SELECT DATE(Temperatures.date) AS date, ROUND(AVG(Temperatures.value),2) AS \'unit %s (set)\' \
                       FROM Temperatures INNER JOIN Apartments ON Temperatures.apartment_id=Apartments.esmart_id \
                       WHERE Apartments.name=\'%s\' AND Temperatures.action="set" \
                       AND Temperatures.date>=%s AND Temperatures.date<=%s \
                       AND TIME(Temperatures.date) BETWEEN "00:00:00" AND "23:59:59" \
                       GROUP BY DATE(Temperatures.date)' % (unit, unit, initial_date, final_date), engine)
    
    dflist_get.append(temp_get)
    dflist_set.append(temp_set)
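
The daily averaging performed by these queries can be tried out without access to the MySQL database. The sketch below uses an in-memory SQLite table as a stand-in, with invented rows and a simplified schema (no "module_id" column), and passes the filter values as bound parameters instead of formatting them into the SQL string. The "?" placeholder is specific to the sqlite3 module; other drivers use a different placeholder syntax.

```python
import sqlite3

# In-memory stand-in for the Temperatures table (simplified, invented rows).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE Temperatures (apartment_id INTEGER, date TEXT, action TEXT, value REAL)')
conn.executemany(
    'INSERT INTO Temperatures VALUES (?, ?, ?, ?)',
    [
        (1026, '2021-03-15 11:46:46', 'get', 21.0),
        (1026, '2021-03-15 12:07:06', 'get', 22.0),
        (1026, '2021-03-16 08:00:00', 'get', 20.5),
        (1026, '2021-03-16 09:00:00', 'set', 22.0),
    ],
)

# One row per calendar day: the average of all matching measurements of that day.
query = '''
    SELECT DATE(date) AS day, ROUND(AVG(value), 2) AS daily_mean
    FROM Temperatures
    WHERE apartment_id = ? AND action = ? AND date >= ? AND date <= ?
    GROUP BY DATE(date)
    ORDER BY day
'''
rows = conn.execute(query, (1026, 'get', '2021-03-15', '2021-03-16 23:59:59')).fetchall()
print(rows)  # [('2021-03-15', 21.5), ('2021-03-16', 20.5)]
```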

In [17]:
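# Import reduce from the functools module (for merging the list of DataFrames pairwise)

from functools import reduce

# Merge the per-unit DataFrames of measured temperatures (using "date" as key) into "df_get"

merge_get = reduce(lambda x, y: pd.merge(round(x,2), round(y,2), on='date', how='outer'), dflist_get)
df_get = merge_get.copy(deep=True)

# Store in "df_get" the volume-weighted mean over all units (missing values are masked),
# and the minimum and maximum over all units

masked_get = np.ma.masked_array(merge_get.drop('date', axis=1), np.isnan(merge_get.drop('date', axis=1)))
df_get['mean (get)'] = np.ma.average(masked_get, axis=1, weights=volumes)
df_get['mean (get)'] = round(df_get['mean (get)'], 2)

df_get['min (get)'] = round((merge_get.drop('date', axis=1)).min(axis=1), 2)
df_get['max (get)'] = round((merge_get.drop('date', axis=1)).max(axis=1), 2)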

# Merge the per-unit DataFrames of set temperatures (using "date" as key) into "df_set"

merge_set = reduce(lambda x, y: pd.merge(round(x,2), round(y,2), on='date', how='outer'), dflist_set)
df_set = merge_set.copy(deep=True)

# Store in "df_set" the (unweighted) mean over all units (missing values are masked),
# and the minimum and maximum over all units

masked_set = np.ma.masked_array(merge_set.drop('date', axis=1), np.isnan(merge_set.drop('date', axis=1)))
df_set['mean (set)'] = np.ma.average(masked_set, axis=1)
df_set['mean (set)'] = round(df_set['mean (set)'], 2)

df_set['min (set)'] = round((merge_set.drop('date', axis=1)).min(axis=1), 2)
df_set['max (set)'] = round((merge_set.drop('date', axis=1)).max(axis=1), 2)

df_set.head()

Unnamed: 0,date,unit 11 (set),unit 12 (set),unit 13 (set),unit 14 (set),unit 15 (set),unit 21 (set),unit 22 (set),unit 23 (set),unit 24 (set),...,unit 55 (set),unit Communs (set),unit O1 (set),unit O2 (set),unit O3 (set),unit O4 (set),unit O5 (set),mean (set),min (set),max (set)
0,2021-03-15,18.87,20.0,19.5,22.0,22.7,20.83,25.0,20.75,20.34,...,21.66,20.25,20.58,21.25,20.25,20.17,24.0,20.96,18.5,25.0
1,2021-03-16,19.04,20.0,19.46,22.0,,20.83,25.0,20.74,20.33,...,21.67,20.25,20.75,21.25,20.25,20.17,24.0,20.88,18.5,25.0
2,2021-03-17,19.5,20.0,19.25,22.0,,20.83,25.0,20.75,20.33,...,21.67,20.25,20.8,21.25,20.25,20.17,24.0,20.88,18.5,25.0
3,2021-03-18,19.5,20.0,19.25,22.0,,20.83,25.0,20.75,20.33,...,21.67,20.25,20.87,21.25,20.25,20.17,24.0,20.83,18.5,25.0
4,2021-03-19,19.5,20.0,19.25,22.0,,20.83,25.0,20.75,20.34,...,21.67,20.25,20.87,21.25,20.25,20.17,24.11,20.87,18.5,25.0
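
The effect of masking missing values before averaging can be seen on a small example. The sketch below uses invented temperatures and volumes for three units; it illustrates `np.ma.average` with and without weights and is not a reproduction of the building data.

```python
import numpy as np

# Toy daily temperatures for three units (columns) over two days (rows);
# NaN marks a missing measurement. Temperatures and volumes are invented.
temps = np.array([[20.0, 22.0, 24.0],
                  [18.0, np.nan, 24.0]])
vols = np.array([100.0, 100.0, 200.0])

# Mask the missing values so that they are excluded from the averages.
masked = np.ma.masked_array(temps, np.isnan(temps))

# Volume-weighted mean per day: the weights of masked entries are dropped,
# so the remaining weights are renormalised for each row.
weighted = np.ma.filled(np.ma.average(masked, axis=1, weights=vols), np.nan)

# Unweighted mean per day over the available units.
unweighted = np.ma.filled(np.ma.average(masked, axis=1), np.nan)

print(weighted)
print(unweighted)
```

On the second day only the first and third unit contribute, so the weighted mean is (18 x 100 + 24 x 200) / 300 = 22, whereas the unweighted mean is (18 + 24) / 2 = 21.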


In [18]:
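# Plot the daily averages of the measured (get) and set temperatures of a chosen unit,
# together with the mean, minimum and maximum over all units

unit = '43'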

_, (ax1, ax2) = plt.subplots(2, figsize=(8, 8))

ax1.plot(df_get['date'], df_get['max (get)'], color='k', label='max (get)')
ax1.plot(df_get['date'], df_get['unit %s (get)'% unit], color='r', label='unit %s (get)' % unit)
ax1.plot(df_get['date'], df_get['mean (get)'], color='grey', label='mean (get)')
ax1.plot(df_get['date'], df_get['min (get)'], color='k', label='min (get)')

ax1.set_title('Daily temperatures averages measured between %s and %s'
              % (eval(initial_date), eval(final_date)), size='medium')

ax1.legend(fontsize='small',)
ax1.set_ylabel('Temperature [°C]')

ax2.plot(df_set['date'], df_set['max (set)'], color='k', label='max (set)')
ax2.plot(df_set['date'], df_set['unit %s (set)'% unit], color='r', label='unit %s (set)' % unit)
ax2.plot(df_set['date'], df_set['mean (set)'], color='grey', label='mean (set)')
ax2.plot(df_set['date'], df_set['min (set)'], color='k', label='min (set)')

ax2.set_title('Daily temperatures averages set between %s and %s'
              % (eval(initial_date), eval(final_date)), size='medium')

ax2.legend(fontsize='small')
ax2.set_xlabel('Date')
ax2.set_ylabel('Temperature [°C]')

plt.rcParams['xtick.labelsize'] = 'small'
plt.rcParams['ytick.labelsize'] = 'small'

plt.tight_layout()


In [19]:
# Store in "deviations" the daily deviation of each unit from the mean over all units,
# for the measured (get) and the set temperatures

deviations = pd.DataFrame()

for unit in units:
    deviations['unit %s deviation (get)'% unit] = df_get['unit %s (get)'% unit] - df_get['mean (get)']
    deviations['unit %s deviation (set)'% unit] = df_set['unit %s (set)'% unit] - df_set['mean (set)']
    
# Store in "MD" the mean deviation of each unit over the whole period
# (the columns of "deviations" alternate between get and set; missing values are skipped)

MD = pd.DataFrame({'unit': units})
MD['mean deviation (get)'] = deviations.iloc[:, 0::2].mean().tolist()
MD['mean deviation (set)'] = deviations.iloc[:, 1::2].mean().tolist()
MD = round(MD, 2)
blankIndex = [''] * len(units)
MD.index = blankIndex

# Coordinates and marker sizes for the scatter plot of the mean deviations

md_get = MD['mean deviation (get)'].tolist()
md_set = [round(x,2) for x in deviations.loc[:, 'unit 11 deviation (set)'::2].mean().tolist()]
radii = [100*(md_get[i]**2+md_set[i]**2) for i in range(len(md_get))]

deviations

Unnamed: 0,unit 11 deviation (get),unit 11 deviation (set),unit 12 deviation (get),unit 12 deviation (set),unit 13 deviation (get),unit 13 deviation (set),unit 14 deviation (get),unit 14 deviation (set),unit 15 deviation (get),unit 15 deviation (set),...,unit O1 deviation (get),unit O1 deviation (set),unit O2 deviation (get),unit O2 deviation (set),unit O3 deviation (get),unit O3 deviation (set),unit O4 deviation (get),unit O4 deviation (set),unit O5 deviation (get),unit O5 deviation (set)
0,-0.01,-2.09,0.58,-0.96,-0.32,-1.46,0.60,1.04,0.69,1.74,...,-0.59,-0.38,-0.39,0.29,-3.25,-0.71,-0.25,-0.79,1.46,3.04
1,-0.13,-1.84,0.75,-0.88,-0.42,-1.42,0.52,1.12,,,...,-0.69,-0.13,-0.57,0.37,-3.09,-0.63,-0.17,-0.71,1.51,3.12
2,-0.12,-1.38,0.75,-0.88,-0.63,-1.63,0.63,1.12,,,...,-0.92,-0.08,-0.40,0.37,-3.27,-0.63,-0.05,-0.71,1.63,3.12
3,-0.14,-1.33,0.66,-0.83,-0.80,-1.58,0.36,1.17,,,...,-0.91,0.04,-0.23,0.42,-3.25,-0.58,-0.18,-0.66,1.53,3.17
4,-0.20,-1.37,0.62,-0.87,-0.79,-1.62,0.35,1.13,,,...,-0.90,0.00,0.18,0.38,-2.88,-0.62,-0.04,-0.70,1.21,3.24
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
318,0.13,-1.00,0.57,1.05,0.36,1.63,0.04,1.21,-0.49,1.38,...,-0.94,-0.87,0.27,3.13,-0.02,0.38,0.23,0.38,1.65,3.38
319,0.29,-0.97,0.57,1.07,0.27,1.65,0.38,1.23,-0.16,1.40,...,-1.20,-0.85,-0.26,3.15,-0.21,0.40,0.34,0.40,1.59,3.40
320,0.32,-0.97,0.68,1.07,0.43,1.65,0.43,1.23,-0.83,1.40,...,-1.35,-0.85,-0.04,3.15,-0.22,0.40,0.34,0.40,1.45,3.40
321,0.45,-0.96,0.38,1.09,0.30,1.67,0.49,1.25,-0.10,1.42,...,-1.34,-0.83,0.16,3.17,-0.18,0.42,0.37,0.42,1.02,3.42
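
The step from the daily deviations to one point per unit can be illustrated on a small table. The sketch below uses two invented units "A" and "B" and takes the column-wise mean, which skips missing values; the marker size is 100 times the squared distance of the point from the origin, so units far from the building mean get larger markers.

```python
import pandas as pd

# Toy deviations for two units; columns alternate get/set as in the notebook.
deviations = pd.DataFrame({
    'unit A deviation (get)': [0.5, 0.7, None],
    'unit A deviation (set)': [-1.0, -1.0, -1.0],
    'unit B deviation (get)': [-0.2, -0.4, -0.3],
    'unit B deviation (set)': [0.5, None, 1.5],
})

# Column-wise means: even positions hold "get" columns, odd positions "set" columns.
md_get = deviations.iloc[:, 0::2].mean().round(2).tolist()
md_set = deviations.iloc[:, 1::2].mean().round(2).tolist()

# Marker sizes proportional to the squared distance from the origin.
radii = [100 * (g**2 + s**2) for g, s in zip(md_get, md_set)]

print(md_get, md_set, radii)
```

In `ax.scatter`, the argument `s` is the marker area in points squared, so these values scale the area of the markers rather than their diameter.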


In [20]:
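# Import table from pandas.plotting (for drawing a DataFrame as a table in a plot)
# Scatter plot of the mean deviations (get vs. set) of each unit, next to the table of values

from pandas.plotting import table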

_, (ax1, ax2) = plt.subplots(1,2, figsize=(9, 8))
ax1.scatter(md_get, md_set, s=radii, color='b', alpha=0.3)

for unit, x, y in zip(units, MD['mean deviation (get)'], md_set):
    ax1.text(x+0.02, y+0.02, unit, size='small')

ax2.axis('tight')
ax2.axis('off')
the_table = table(ax2, MD, loc='center', colWidths=[0.2, 0.4, 0.4], cellLoc='center')
the_table.auto_set_font_size(False)
the_table.set_fontsize(7)
