# Twomes virtual home interactive inverse grey-box analysis pipeline

This JupyterLab notebook can be used to interactively test the Twomes inverse grey-box analysis pipeline.
Don't forget to install the requirements listed in [requirements.txt](../requirements.txt) first!

## Setting the stage

First, several imports and variables need to be defined.


### Imports and generic settings

In [None]:
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib widget


from tqdm.notebook import tqdm

from gekko import GEKKO

import sys
sys.path.append('../data/')
sys.path.append('../view/')
sys.path.append('../analysis/')

# usually, two decimals suffice for displaying DataFrames (NB internally, precision may be higher)
pd.options.display.precision = 2

%load_ext autoreload

from preprocessor import Preprocessor
from inversegreyboxmodel import Learner


from plotter import Plot



### Load Data from Real Homes

In [None]:
%%time
# Prerequisite: for this example to work, you need the file twomes_realhomes_raw_properties.parquet in the ../data/ folder.
# One way to get this is to run B4BExtractionBackup.ipynb first
df_prop = pd.read_parquet('../data/twomes_realhomes_raw_properties.parquet', engine='pyarrow')

# sorting the DataFrame index is needed to get good performance on certain filters
# this guarding code checks whether the DataFrame is properly sorted, and sorts it if not
if not df_prop.index.is_monotonic_increasing:
    print('df_prop needed index sorting')
    df_prop = df_prop.sort_index()  
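
The guard above can be tried out on a small stand-in table. This is only a sketch: apart from `id`, the index level names and the `temp_in__degC` column are assumptions made for illustration, not the actual layout of `df_prop`.

```python
import pandas as pd

# Hypothetical miniature of a property table; only the 'id' level name
# is taken from the notebook, the other names are assumptions.
index = pd.MultiIndex.from_tuples(
    [
        (2, 'device', pd.Timestamp('2022-01-01 00:05')),
        (1, 'device', pd.Timestamp('2022-01-01 00:00')),
        (1, 'device', pd.Timestamp('2022-01-01 00:05')),
    ],
    names=['id', 'source', 'timestamp'],
)
df_demo = pd.DataFrame({'temp_in__degC': [19.5, 20.1, 20.3]}, index=index)

# Same guard as in the notebook: sort only when the index is not yet sorted
was_sorted = df_demo.index.is_monotonic_increasing
if not was_sorted:
    df_demo = df_demo.sort_index()
is_sorted = df_demo.index.is_monotonic_increasing

# Label-based selection on the first index level (here: home with id 1)
df_home_1 = df_demo.loc[1]
```

Checking `is_monotonic_increasing` first avoids a needless sort when the stored file is already ordered.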

In [None]:
df_prop.index.unique(level='id')

In [None]:
df_prop

### Convert real home property data to preprocessed data

In [None]:
# TODO: for real home data (with noise and measurement errors), preprocessing is NOT trivial
df_prep = Preprocessor.unstack_prop(df_prop)

# TODO: more preprocessing 
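
As a rough illustration of what a long-to-wide step of this kind could look like, here is a minimal pandas sketch. It is not the implementation of `Preprocessor.unstack_prop()`; the index levels, the `value` column, and the property names are all assumptions.

```python
import pandas as pd

# Hypothetical long-format table: one row per (id, timestamp, property)
long_index = pd.MultiIndex.from_product(
    [
        [1],
        pd.to_datetime(['2022-01-01 00:00', '2022-01-01 00:05']),
        ['temp_in__degC', 'co2__ppm'],
    ],
    names=['id', 'timestamp', 'property'],
)
df_long = pd.DataFrame({'value': [20.1, 450.0, 20.3, 460.0]}, index=long_index)

# Move the 'property' index level into the columns: one column per property,
# one row per (id, timestamp)
df_wide = df_long['value'].unstack('property')
```

The wide layout has one column per property, which matches the way the later cells select columns by name.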

In [None]:
df_prep.info()

In [None]:
df_prep

In [None]:
%autoreload 2
units_to_mathtext = property_types = {
    'degC' : r'$°C$',
    'ppm' : r'$ppm$',
    '0' : r'$[-]$',
    'bool': r'$0 = False; 1 = True$',
    'p' : r'$persons$',
    'W' : r'$W$',
    'W_m_2' : r'$W/m^{2}$',
    'm_s_1' : r'$m/s$'    
}

In [None]:
# visualize all input data
df_plot = df_prep

In [None]:
df_plot.info()

In [None]:
#Plot all properties from all sources for all ids
Plot.dataframe_preprocessed_plot(df_plot, units_to_mathtext)

## Learn parameters using inverse grey-box analysis

Most of the heavy lifting is done by the `learn_home_parameters()` function, which in turn uses the [GEKKO Python](https://machinelearning.byu.edu/) dynamic optimization toolkit.

In [None]:
%%time 
%autoreload 2
# hints: initial estimates and average values for the model parameters

hints = {
    'A__m2': 12.0,                                     # initial estimate for apparent solar aperture
    'eta_sup_CH__0' : 0.97,                            # average home heating efficiency of a gas boiler (superior value)
    'eta_sup_noCH__0' : 0.34,                          # average home heating efficiency, indirectly via DHW & cooking (superior value)
    'g_noCH__m3_a_1' : 339,                            # average gas use in m^3 per year for other purposes than home heating 
    'occupancy__p' : (2.2 * 7.7/24),                   # average house occupancy (2.2 persons, present 7.7 of 24 h)
    'Q_gain_int__W_p_1' : (77 * 8.6/24 + 105 * 7.7/24) # average heat gain per occupant (77W for 8.6 hours, 105W for 7.7 hours)
}

learn = ['A__m2']

#select column names
property_sources = {
    'temp_out__degC' : 'model_temp_out__degC',
    'wind__m_s_1' : 'model_wind__m_s_1',
    'ghi__W_m_2' : 'model_ghi__W_m_2', 
    'g_use__W' : 'model_g_use__W',
    'e_use__W' : 'model_e_use__W',
    'e_ret__W' : 'model_e_ret__W'
}

# learn the model parameters and write results to a dataframe
df_results_per_period, df_results = Learner.learn_home_parameters(df_prep, 
                                                                  property_sources = property_sources, 
                                                                  learn = learn, 
                                                                  hints = hints,
                                                                  ev_type = 2
                                                                 )
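
The two computed entries in `hints` are time-weighted daily averages. The sketch below only reproduces that arithmetic from the inline comments; it makes no assumption about how `learn_home_parameters()` uses the values.

```python
HOURS_PER_DAY = 24

# Average number of occupants present: 2.2 persons, present 7.7 of 24 hours
occupancy__p = 2.2 * 7.7 / HOURS_PER_DAY

# Average heat gain per occupant over a full day:
# 77 W during 8.6 hours plus 105 W during 7.7 hours, spread over 24 hours
Q_gain_int__W_p_1 = (77 * 8.6 + 105 * 7.7) / HOURS_PER_DAY

print(f'occupancy__p      = {occupancy__p:.2f} p')
print(f'Q_gain_int__W_p_1 = {Q_gain_int__W_p_1:.2f} W/p')
```

Rounded to two decimals, this gives roughly 0.71 persons and 61.28 W per person.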

### Result Visualization

In [None]:
df_results_per_period

In [None]:
df_results

In [None]:
df_plot = df_prep[[prop for prop in df_prep.columns.values if prop.split('__')[-1] == 'degC']]
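
The column filter above relies on the naming convention in which the unit follows the last double underscore. A small standalone sketch of that convention follows; the name `device_temp_in__degC` is hypothetical, the others are taken from `property_sources`.

```python
# Column names following the <source>_<property>__<unit> convention
columns = [
    'model_temp_out__degC',
    'model_wind__m_s_1',
    'device_temp_in__degC',  # hypothetical column, for illustration only
    'model_ghi__W_m_2',
]

# The unit is whatever follows the last double underscore
units = {col: col.split('__')[-1] for col in columns}

# Keep only the columns whose unit is degrees Celsius
temperature_columns = [col for col in columns if units[col] == 'degC']
```

Because single underscores also occur inside units such as `m_s_1`, splitting on the double underscore is what keeps the unit intact.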

In [None]:
#Plot only temperatures from all sources for all ids
Plot.dataframe_preprocessed_plot(df_plot, units_to_mathtext)

In [None]:
df_plot = df_prep

In [None]:
#Plot all properties from all sources for all ids
Plot.dataframe_preprocessed_plot(df_plot, units_to_mathtext)