We start by importing the Python libraries that we are going to work with.

In [None]:
# import python libraries
from __future__ import division, print_function
import numpy as np
import pandas as pd
import os
import shutil
from matplotlib import pyplot as plt

# import cea-specific libraries
import cea.api # this is the API to call scripts in CEA, such as the data-helper, radiation engine, demand,...
import cea.inputlocator # this module provides paths to input and output files according to the CEA folder structure
import cea.config # this module lets us interact with the CEA configuration

As a next step, we set up our configuration for the desired case study. I am going to use the "reference-case-cooling" that ships with the CEA source code. Because CEA provides a script to extract the different reference cases, we can call this script via the API directly in a Jupyter notebook.

In [None]:
# create the path to the destination folder
path_to_folder_for_blog = os.path.expandvars(r'${userprofile}/Documents/GitHub/blog')
# extract the reference-case-cooling to the destination folder
cea.api.extract_reference_case(destination=path_to_folder_for_blog, case='cooling')

After extracting my case study, I have to make sure that the configuration for the other scripts points to the right folder. Therefore, I set the project path and scenario name to the correct values. The configuration can be changed either by editing the cea.config file in a text editor or directly in Jupyter via the `config.Configuration` object.


In [None]:
path_to_project = os.path.join(path_to_folder_for_blog, 'reference-case-cooling') # construct the path to the project

config = cea.config.Configuration() # load the configuration
config.project = path_to_project # set the path to the project
config.scenario_name = 'baseline' # set the scenario name
print(config.scenario) # print the scenario to check the changes
config.save() # save the changes to the cea.config file

Now, we are ready for a run-through of the usual simulation steps for the district energy demand. They are:
- Running the data-helper to set up our building model parameters
- Running the radiation engine to quantify the solar heat gains of the buildings
- Running the energy demand simulation

The input into all scripts called via the CEA API is the configuration object. On my laptop, it took around 10 minutes to run the scripts.

In [None]:
cea.api.data_helper(region='SG') # run the data-helper script
cea.api.radiation_daysim(weather='Singapore') # run the radiation engine
cea.api.demand(weather='Singapore') # run the building energy demand simulation

Now, we will use some Python functionality to create a new scenario by copying the files of the baseline scenario, modifying some building occupancy schedules, and comparing the simulated energy demand results.

In [None]:
new_scenario_name = 'modified-occupancy-schedules' # the new scenario name
path_to_new_scenario = os.path.join(path_to_project, new_scenario_name) # create the destination path for copying the baseline scenario

path_to_baseline = os.path.join(path_to_project, 'baseline')

shutil.copytree(path_to_baseline, path_to_new_scenario) # copy all files from the baseline to the new scenario
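
One practical note on the copy step: `shutil.copytree` raises an error if the destination folder already exists, so re-running the cell above fails on the second attempt. The following is a minimal sketch of a re-runnable copy. It works on a temporary folder with a made-up layout (the `inputs` sub-folder is only a placeholder, not the real CEA structure).

```python
import os
import shutil
import tempfile

# hypothetical stand-in for the project folder, created in a temporary location
project = tempfile.mkdtemp()
baseline = os.path.join(project, 'baseline')
os.makedirs(os.path.join(baseline, 'inputs'))  # placeholder content, not the real CEA layout

new_scenario = os.path.join(project, 'modified-occupancy-schedules')

for attempt in range(2):  # run the copy twice to mimic re-running the notebook cell
    if os.path.exists(new_scenario):
        shutil.rmtree(new_scenario)  # remove the stale copy first
    shutil.copytree(baseline, new_scenario)

print(os.listdir(new_scenario))
```

Be aware that removing the existing folder also discards any results already simulated in the new scenario.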

The CEA building-properties inputs include various schedules that determine occupant presence, ventilation rate, electrical appliance use, etc.
If the user does not provide them, these schedules are created from the region-specific archetypes database during a demand simulation and saved to a CSV file.

With the help of the CEA `inputlocator` we are going to read these building schedule files (they have been created during the baseline simulation and copied with the above code) and visualize them.

The various columns contain information about:

- `Ea` : Electrical appliance use
- `Ed` : Electrical energy demand of data centers
- `El` : Electrical lighting use
- `Epro` : Electrical process energy use
- `Qhpro` : Thermal process energy use
- `Qs` : Sensible heat gains from occupants
- `Vw` : Fresh water use
- `Vww` : Hot water use
- `X` : Latent heat gains (humidity gains) from occupants
- `people` : Occupant presence
- `ve` : Required ventilation rate due to occupant presence

In [None]:
inputlocator = cea.inputlocator.InputLocator(scenario=path_to_new_scenario) # the input for the inputlocator is the path to the scenario
buildings = inputlocator.get_zone_building_names() # get all building names in scenario

building_schedule_path = inputlocator.get_building_schedules(buildings[0]) # the path to the first building schedules file
df_schedules = pd.read_csv(building_schedule_path) # use pandas to read the CSV file into a DataFrame

df_schedules # print the DataFrame

We can quickly visualize the data by plotting the first week of some schedules.

In [None]:
df_schedules['Ea'][0:168].plot() # visualize 1 week of electrical appliance schedule
plt.legend()
plt.show()
df_schedules['people'][0:168].plot() # visualize 1 week of occupant presence schedule
plt.legend()
plt.show()

Next, I am going to introduce some randomness (or stochasticity) to the schedules by multiplying each value by a factor sampled from a normal distribution centered around 1 (`mu = 1`) with a standard deviation of 5% (`sigma = 0.05`).

Such random factors can be obtained with `sigma * np.random.randn(...) + mu`

https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.randn.html
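
As a quick sanity check of this expression, here is a small standalone sketch that draws one factor per hour of the year and looks at their statistics. The fixed seed is my own addition for reproducibility and is not used in the rest of this post.

```python
import numpy as np

mu = 1.0
sigma = 0.05

rng = np.random.RandomState(42)  # seeded generator (an assumption, only for reproducibility)
factors = sigma * rng.randn(8760) + mu  # one multiplicative factor per hour of the year

print('mean: {:.4f}'.format(factors.mean()))  # should be close to mu
print('std:  {:.4f}'.format(factors.std()))   # should be close to sigma
print('min / max: {:.3f} / {:.3f}'.format(factors.min(), factors.max()))
```

With a standard deviation of 5%, individual hours can still deviate by considerably more than 5%, which is what matters for peak values later on.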

We can test and visualize what this looks like for our schedules.

In [None]:
mu = 1.0
sigma = 0.05

df_schedules['Ea'][0:168].plot(label='default') # visualize 1 week of electrical appliance schedule
df_schedules['Ea'] = df_schedules['Ea'] * (sigma * np.random.randn(8760) + mu) # randomize the schedule
df_schedules['Ea'][0:168].plot(label='randomized')
plt.legend()
plt.show()

df_schedules['people'][0:168].plot(label='default') # visualize 1 week of occupant presence schedule
df_schedules['people'] = df_schedules['people'] * (sigma * np.random.randn(8760) + mu) # randomize the schedule
df_schedules['people'][0:168].plot(label='randomized')
plt.legend()
plt.show()

Now, I am simply going to randomize all schedules of all buildings in a nested `for` loop by reading, modifying, and saving the schedules again.
When I save the schedules from DataFrames back to CSV, I set `index=False` to comply with the default file format. However, the files should also work fine with CEA if they have some additional columns, such as an index. These will just be ignored.

In [None]:
buildings = inputlocator.get_zone_building_names() # get all building names in scenario

for building in buildings:
    
    building_schedule_path = inputlocator.get_building_schedules(building) # the path to the building schedules file
    df_schedules = pd.read_csv(building_schedule_path) # use pandas to read the CSV file into a DataFrame
    
    for schedule in df_schedules.keys(): # iterate over all columns
        
        df_schedules[schedule] = df_schedules[schedule] * (sigma * np.random.randn(8760) + mu) # randomize the schedule

    df_schedules.to_csv(building_schedule_path, index=False) # save the schedules after modification
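
To see what `index=False` changes, here is a small standalone sketch that writes a tiny, made-up schedule table to an in-memory CSV with and without the index and reads it back. The two rows of values are placeholders, not real CEA schedules.

```python
import io
import pandas as pd

# hypothetical two-row schedule, only for illustration
df = pd.DataFrame({'Ea': [0.1, 0.5], 'people': [0.0, 1.0]})

buf = io.StringIO()
df.to_csv(buf, index=False)  # same call as in the loop above, but to an in-memory buffer
buf.seek(0)
cols_without_index = list(pd.read_csv(buf).columns)

buf = io.StringIO()
df.to_csv(buf)  # default: the index is written as an extra, unnamed first column
buf.seek(0)
cols_with_index = list(pd.read_csv(buf).columns)

print(cols_without_index)
print(cols_with_index)
```

Without `index=False`, the file gains an extra unnamed column each time it is written and read back with the default settings.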

Because we do not want to change other building properties and because the schedules do not have an impact on the solar heat gains, I can run the energy demand simulation for my modified case study right away.

In [None]:
cea.api.demand(scenario=path_to_new_scenario, weather='Singapore') # run the demand simulation of the new scenario

Now let's compare the district energy demand results of our two case studies.
For this simple example, I'm just going to look at the `Total_demand.csv` file that compiles a bunch of data for the district from the individual building demand output files.

An interesting analysis would be, for example, to look at the differences in cumulative annual energy demands and peak power demands. We do not expect to see differences in annual energy demand, because the stochasticity factor introduced is normally distributed around 1 and deviations towards higher and lower energy consumption should balance out. However, we do expect to see some deviations in peak loads, but how much?
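
This expectation can be illustrated with a small standalone sketch on a synthetic hourly profile. The profile below is an invented daily pattern, not CEA output, and it only shows the direct effect of the random factors on a time series, not the response of the building simulation.

```python
import numpy as np

rng = np.random.RandomState(0)  # seeded generator, only for reproducibility
hours = np.arange(8760)

# hypothetical hourly load profile in kW with a repeating daily pattern
profile = 50.0 + 30.0 * np.sin(2 * np.pi * (hours % 24) / 24.0) ** 2

factors = 0.05 * rng.randn(8760) + 1.0  # same kind of factors as used for the schedules
randomized = profile * factors

annual_change = randomized.sum() / profile.sum() - 1.0  # relative change of the annual sum
peak_change = randomized.max() / profile.max() - 1.0    # relative change of the peak value

print('annual sum: {:+.2%}'.format(annual_change))
print('peak value: {:+.2%}'.format(peak_change))
```

The annual sum averages over thousands of factors and barely moves, while the peak is determined by the largest factor that happens to coincide with a high-load hour.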

The columns I'm going to look at are:

- `QC_sys_MWhyr` the total annual cooling demand
- `QC_sys0_kW` the peak cooling demand

In [None]:
# get the new total demand file
inputlocator_new = cea.inputlocator.InputLocator(scenario=path_to_new_scenario)
total_demand_path_random = inputlocator_new.get_total_demand()
df_total_demand_random = pd.read_csv(total_demand_path_random, index_col='Name')

# get the baseline total demand file
inputlocator_baseline = cea.inputlocator.InputLocator(scenario=path_to_baseline)
total_demand_path_baseline = inputlocator_baseline.get_total_demand()
df_total_demand_baseline = pd.read_csv(total_demand_path_baseline, index_col='Name')

# plot the yearly cooling energy demand
fig, ax = plt.subplots()
x_data = np.arange(len(df_total_demand_baseline.index))
ax.bar(x_data-0.2, df_total_demand_baseline['QC_sys_MWhyr'], width=0.4, label='baseline') # left bar of each pair
ax.bar(x_data+0.2, df_total_demand_random['QC_sys_MWhyr'], width=0.4, label='randomized schedule') # right bar, no overlap
ax.set_xticks(x_data)
ax.set_xticklabels(df_total_demand_baseline.index)
plt.ylabel('yearly cooling energy use (MWh/yr)')
plt.legend()
plt.show()

# plot the peak cooling demand
fig, ax = plt.subplots()
ax.bar(x_data-0.2, df_total_demand_baseline['QC_sys0_kW'], width=0.4, label='baseline') # left bar of each pair
ax.bar(x_data+0.2, df_total_demand_random['QC_sys0_kW'], width=0.4, label='randomized schedule') # right bar, no overlap
ax.set_xticks(x_data)
ax.set_xticklabels(df_total_demand_baseline.index)
plt.ylabel('peak cooling demand (kW)')
plt.legend()
plt.show()
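
Besides the bar charts, the differences can also be expressed as relative changes per building. The sketch below shows the idea with two small DataFrames that have the same layout as the ones loaded above. The building names and all numbers are made-up placeholders, not simulation results.

```python
import pandas as pd

columns = ['QC_sys_MWhyr', 'QC_sys0_kW']
names = pd.Index(['B01', 'B02'], name='Name')  # hypothetical building names

# placeholder values, NOT results from CEA
baseline = pd.DataFrame([[100.0, 40.0], [250.0, 90.0]], index=names, columns=columns)
randomized = pd.DataFrame([[101.0, 42.0], [249.0, 99.0]], index=names, columns=columns)

# relative change in percent, aligned on building name and column
relative_change = (randomized - baseline) / baseline * 100.0
print(relative_change.round(1))
```

With the real data, `df_total_demand_baseline[columns]` and `df_total_demand_random[columns]` would take the place of the two placeholder tables.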

Interestingly, the buildings' peak cooling loads are affected in different ways...
There seem to be lots of research opportunities.