<a href="https://colab.research.google.com/github/intelligent-environments-lab/CityLearn/blob/master/examples/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# QuickStart

Install the latest CityLearn version from PyPI with the `pip` command:

In [None]:
!pip install CityLearn

## No Control (Baseline)

Run the following to simulate an environment where the storage systems and heat pumps are not controlled (baseline). The prescribed storage actions are 0.0 and the heat pump receives no action, i.e. `None`, so it delivers the ideal load given in the building time series files:

In [1]:
from citylearn.agents.base import BaselineAgent as Agent
from citylearn.citylearn import CityLearnEnv

dataset_name = 'citylearn_challenge_2023_phase_2_local_evaluation'
env = CityLearnEnv(dataset_name, central_agent=True)
model = Agent(env)
model.learn(episodes=1)

# print cost functions at the end of episode
kpis = model.env.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value').round(3)
kpis = kpis.dropna(how='all')
display(kpis)

| cost_function | Building_1 | Building_2 | Building_3 | District |
|---|---|---|---|---|
| all_time_peak_average | | | | 1.0 |
| annual_normalized_unserved_energy_total | 0.019 | 0.018 | 0.018 | 0.018 |
| carbon_emissions_total | 1.0 | 1.0 | 1.0 | 1.0 |
| cost_total | 1.0 | 1.0 | 1.0 | 1.0 |
| daily_one_minus_load_factor_average | | | | 1.0 |
| daily_peak_average | | | | 1.0 |
| discomfort_cold_delta_average | 1.611 | 0.043 | 0.643 | 0.766 |
| discomfort_cold_delta_maximum | 4.741 | 1.772 | 3.466 | 3.326 |
| discomfort_cold_delta_minimum | 0.0 | 0.0 | 0.0 | 0.0 |
| discomfort_cold_proportion | 0.36 | 0.0 | 0.082 | 0.147 |


## Centralized RBC

Run the following to simulate an environment controlled by a centralized RBC agent for a single episode:

In [2]:
from citylearn.agents.rbc import BasicRBC as Agent
from citylearn.citylearn import CityLearnEnv

dataset_name = 'citylearn_challenge_2023_phase_2_local_evaluation'
env = CityLearnEnv(dataset_name, central_agent=True)
model = Agent(env)
model.learn(episodes=1)

# print cost functions at the end of episode
kpis = model.env.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value').round(3)
kpis = kpis.dropna(how='all')
display(kpis)

| cost_function | Building_1 | Building_2 | Building_3 | District |
|---|---|---|---|---|
| all_time_peak_average | | | | 1.179 |
| annual_normalized_unserved_energy_total | 0.017 | 0.016 | 0.016 | 0.016 |
| carbon_emissions_total | 1.997 | 1.936 | 1.737 | 1.89 |
| cost_total | 1.93 | 1.877 | 1.709 | 1.839 |
| daily_one_minus_load_factor_average | | | | 0.721 |
| daily_peak_average | | | | 1.352 |
| discomfort_cold_delta_average | 9.731 | 3.446 | 3.163 | 5.446 |
| discomfort_cold_delta_maximum | 13.562 | 9.93 | 5.399 | 9.63 |
| discomfort_cold_delta_minimum | 0.0 | 0.0 | 0.0 | 0.0 |
| discomfort_cold_proportion | 0.975 | 0.892 | 0.953 | 0.94 |


## Decentralized-Independent SAC

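Run the following to simulate an environment controlled by decentralized-independent SAC agents for 1 training episode, followed by 1 final episode in which the agents act deterministically (`episodes=2` with `deterministic_finish=True`):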
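
The `learn` call in the next cell passes `episodes=2` together with `deterministic_finish=True`. The sketch below illustrates how such a schedule might be read, assuming that `deterministic_finish=True` makes only the final episode deterministic. The helper `episode_modes` is hypothetical and is not part of CityLearn:

```python
def episode_modes(episodes, deterministic_finish):
    """Label each episode under the assumed schedule.

    Assumption: only the last episode is deterministic, and only
    when deterministic_finish is True. This is a hypothetical helper,
    not a CityLearn function.
    """
    modes = []

    for episode in range(episodes):
        is_last = episode == episodes - 1
        deterministic = deterministic_finish and is_last
        modes.append('deterministic' if deterministic else 'exploratory')

    return modes


# Two episodes with a deterministic finish: one to learn, one to evaluate.
print(episode_modes(2, True))
```

Under this reading, the KPIs reported after `learn` returns would describe the deterministic final episode rather than the exploratory training episode.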

In [3]:
from citylearn.agents.sac import SAC as Agent
from citylearn.citylearn import CityLearnEnv

dataset_name = 'citylearn_challenge_2023_phase_2_local_evaluation'
env = CityLearnEnv(dataset_name, central_agent=False)
model = Agent(env)
model.learn(episodes=2, deterministic_finish=True)

# print cost functions at the end of episode
kpis = model.env.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value').round(3)
kpis = kpis.dropna(how='all')
display(kpis)

| cost_function | Building_1 | Building_2 | Building_3 | District |
|---|---|---|---|---|
| all_time_peak_average | | | | 0.947 |
| annual_normalized_unserved_energy_total | 0.013 | 0.013 | 0.012 | 0.013 |
| carbon_emissions_total | 0.943 | 0.96 | 0.939 | 0.947 |
| cost_total | 0.906 | 0.921 | 0.907 | 0.911 |
| daily_one_minus_load_factor_average | | | | 0.945 |
| daily_peak_average | | | | 0.928 |
| discomfort_cold_delta_average | 1.899 | 0.878 | 0.823 | 1.2 |
| discomfort_cold_delta_maximum | 6.387 | 4.487 | 2.908 | 4.594 |
| discomfort_cold_delta_minimum | 0.0 | 0.0 | 0.0 | 0.0 |
| discomfort_cold_proportion | 0.435 | 0.313 | 0.208 | 0.319 |


## Decentralized-Cooperative MARLISA

Run the following to simulate an environment controlled by decentralized-cooperative MARLISA agents for 1 training episode, followed by 1 final episode in which the agents act deterministically:

In [4]:
from citylearn.agents.marlisa import MARLISA as Agent
from citylearn.citylearn import CityLearnEnv

dataset_name = 'citylearn_challenge_2023_phase_2_local_evaluation'
env = CityLearnEnv(dataset_name, central_agent=False)
model = Agent(env)
model.learn(episodes=2, deterministic_finish=True)

kpis = model.env.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value').round(3)
kpis = kpis.dropna(how='all')
display(kpis)

| cost_function | Building_1 | Building_2 | Building_3 | District |
|---|---|---|---|---|
| all_time_peak_average | | | | 0.949 |
| annual_normalized_unserved_energy_total | 0.013 | 0.013 | 0.012 | 0.013 |
| carbon_emissions_total | 0.951 | 0.974 | 0.957 | 0.961 |
| cost_total | 0.914 | 0.935 | 0.925 | 0.925 |
| daily_one_minus_load_factor_average | | | | 0.939 |
| daily_peak_average | | | | 0.932 |
| discomfort_cold_delta_average | 1.918 | 0.908 | 0.865 | 1.23 |
| discomfort_cold_delta_maximum | 6.424 | 4.54 | 2.976 | 4.647 |
| discomfort_cold_delta_minimum | 0.0 | 0.0 | 0.0 | 0.0 |
| discomfort_cold_proportion | 0.445 | 0.315 | 0.235 | 0.332 |


## Stable Baselines3 Reinforcement Learning Algorithms

Install the latest version of Stable Baselines3:

In [None]:
!pip install stable-baselines3

Before the environment is ready for use in Stable Baselines3, it needs to be wrapped. First, wrap the environment with the `NormalizedObservationWrapper` (see [docs](https://www.citylearn.net/api/citylearn.wrappers.html#citylearn.wrappers.NormalizedObservationWrapper)) so that observations served to the agent are min-max normalized to [0, 1] and cyclical observations, e.g. hour, are encoded using sine and cosine transformations.
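
To make the two normalization ideas concrete, here is a minimal standalone sketch of min-max scaling and of a sine/cosine encoding for a cyclical value such as the hour. It illustrates the general technique only and is not the wrapper's actual implementation; the function names, the temperature bounds, and the rescaling of the sine/cosine pair into [0, 1] are assumptions made for this example:

```python
import math


def min_max_normalize(value, low, high):
    """Scale a value from the range [low, high] to [0, 1]."""
    if high <= low:
        raise ValueError('high must be greater than low')

    return (value - low) / (high - low)


def encode_cyclical(value, period):
    """Map a periodic value to a (sine, cosine) pair rescaled to [0, 1].

    Assumption: the pair is shifted from [-1, 1] to [0, 1] so that it
    shares the same range as the min-max normalized observations.
    """
    angle = 2.0 * math.pi * value / period
    return (math.sin(angle) + 1.0) / 2.0, (math.cos(angle) + 1.0) / 2.0


# Hypothetical outdoor temperature of 20 with assumed bounds of 0 and 40.
temperature = min_max_normalize(20.0, 0.0, 40.0)

# Hours 23 and 1 are two hours apart on the clock, so their encodings
# should be much closer to each other than hour 1 is to hour 12.
hour_23 = encode_cyclical(23, 24)
hour_1 = encode_cyclical(1, 24)
hour_12 = encode_cyclical(12, 24)

print(temperature)
print(math.dist(hour_23, hour_1), math.dist(hour_1, hour_12))
```

The cyclical encoding matters because a plain min-max scaling of the hour would place 23 and 1 at opposite ends of the range even though they are adjacent in time.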

Next, wrap the environment with the `StableBaselines3Wrapper` (see [docs](https://www.citylearn.net/api/citylearn.wrappers.html#citylearn.wrappers.StableBaselines3Wrapper)), which ensures that observations, actions and rewards are served in a manner that is compatible with the Stable Baselines3 interface.

The following Stable Baselines3 example uses the same `citylearn_challenge_2023_phase_2_local_evaluation` dataset as the previous examples, which supports building temperature dynamics.

> ⚠️ **NOTE**: `central_agent` in the `env` must be `True` when using Stable Baselines3, as it does not support multiple agents.

In [5]:
from stable_baselines3.sac import SAC as Agent
from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper

dataset_name = 'citylearn_challenge_2023_phase_2_local_evaluation'
env = CityLearnEnv(dataset_name, central_agent=True)
env = NormalizedObservationWrapper(env)
env = StableBaselines3Wrapper(env)
model = Agent('MlpPolicy', env)
episodes = 2
model.learn(total_timesteps=env.unwrapped.time_steps*episodes)

# evaluate
observations, _ = env.reset()

while not env.unwrapped.terminated:
    actions, _ = model.predict(observations, deterministic=True)
    observations, _, _, _, _ = env.step(actions)

kpis = env.unwrapped.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value').round(3)
kpis = kpis.dropna(how='all')
display(kpis)

| cost_function | Building_1 | Building_2 | Building_3 | District |
|---|---|---|---|---|
| all_time_peak_average | | | | 0.839 |
| annual_normalized_unserved_energy_total | 0.015 | 0.012 | 0.013 | 0.013 |
| carbon_emissions_total | 0.386 | 0.363 | 0.554 | 0.435 |
| cost_total | 0.366 | 0.339 | 0.523 | 0.409 |
| daily_one_minus_load_factor_average | | | | 1.291 |
| daily_peak_average | | | | 0.7 |
| discomfort_cold_delta_average | 0.0 | 0.004 | 0.001 | 0.002 |
| discomfort_cold_delta_maximum | 0.124 | 0.581 | 0.372 | 0.359 |
| discomfort_cold_delta_minimum | 0.0 | 0.0 | 0.0 | 0.0 |
| discomfort_cold_proportion | 0.0 | 0.0 | 0.0 | 0.0 |
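
In the baseline output, `cost_total` and `carbon_emissions_total` equal 1.0, which suggests these KPIs are expressed relative to the no-control case. Under that assumption, the sketch below compares the District `cost_total` values copied from the five outputs above and reports each as a percent change from the baseline. The dictionary labels and the helper `percent_change` are made up for this example:

```python
# District-level cost_total values copied from the outputs in this notebook.
district_cost_total = {
    'No control': 1.0,
    'BasicRBC': 1.839,
    'SAC': 0.911,
    'MARLISA': 0.925,
    'Stable Baselines3 SAC': 0.409,
}


def percent_change(value, baseline=1.0):
    """Percent change of a KPI relative to the baseline value."""
    return (value - baseline) / baseline * 100.0


# Lower cost_total is better, so sort in ascending order.
ranked = sorted(district_cost_total.items(), key=lambda item: item[1])

for name, value in ranked:
    print(f'{name:<22} {value:>6.3f} {percent_change(value):+7.1f}%')
```

These numbers come from very short runs, so they show how to read the KPI tables rather than how the controllers compare in general.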
