Copyright (c) 2022, salesforce.com, inc and MILA.  
All rights reserved.  
SPDX-License-Identifier: BSD-3-Clause  
For full license text, see the LICENSE file in the repo root  
or https://opensource.org/licenses/BSD-3-Clause  

# How to use this notebook


- This notebook walks through the full workflow: installing the prerequisites, training agents with or without negotiation, plotting and evaluating the results, and the parts of the code that can be modified
- The expected environment for running this notebook is [Colab](https://colab.research.google.com/)
- Change the runtime type to [GPU](https://research.google.com/colaboratory/faq.html#gpu-availability) for GPU training; CPU training is slower
- If using a GPU: follow "Train agents with GPU"
- If using a CPU: follow "Train agents with CPU"
- If the runtime restarts: rerun the "Install prerequisite packages" and "Load dependency" sections
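
To decide which training section applies, one can check for a GPU from within the notebook. The snippet below is a minimal sketch and only a heuristic: it assumes that a GPU runtime exposes the `nvidia-smi` command on the `PATH`. The helper names `gpu_runtime_available` and `section_to_run` are hypothetical and not part of the repository.

```python
import shutil


def gpu_runtime_available():
    """Heuristic: treat the runtime as GPU-enabled if `nvidia-smi` is on PATH."""
    return shutil.which("nvidia-smi") is not None


def section_to_run(has_gpu):
    """Map the runtime type to the matching training section of this notebook."""
    return "Train agents with GPU" if has_gpu else "Train agents with CPU"


print(section_to_run(gpu_runtime_available()))
```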


# [IMPORTANT, DO NOT SKIP] Switch the Python version to 3.7.16

Please run the block below and restart the runtime after it finishes.

In [None]:
# Reference: https://stackoverflow.com/a/74538231
# Import the Python sys module
import sys

# Get the Python version
python_version = sys.version_info

# Print the Python version
print(f"Current Python version: {python_version.major}.{python_version.minor}")

# If the Python version is not 3.7, then install Python 3.7 and change the alternatives
if python_version.major != 3 or python_version.minor != 7:
    print("\n Switching to Python version 3.7. Please restart runtime after finishing running this cell.")
    # Install Python 3.7
    !sudo apt-get update -y
    !sudo apt-get install -y python3.7

    # Change alternatives
    !sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python{python_version.major}.{python_version.minor} 1
    !sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 2

    # Install pip for new python 
    !sudo apt-get install -y python3.7-distutils
    !wget https://bootstrap.pypa.io/get-pip.py
    !python get-pip.py

    # Install colab's dependencies
    !python -m pip install ipython ipython_genutils ipykernel jupyter_console prompt_toolkit httplib2 astor

    # Link to the old google package
    !ln -s /usr/local/lib/python{python_version.major}.{python_version.minor}/dist-packages/google \
           /usr/local/lib/python3.7/dist-packages/google

    # Modify import statements in Google Colab Python files
    !sed -i "s/from IPython.utils import traitlets as _traitlets/import traitlets as _traitlets/" /usr/local/lib/python3.7/dist-packages/google/colab/*.py
    !sed -i "s/from IPython.utils import traitlets/import traitlets/" /usr/local/lib/python3.7/dist-packages/google/colab/*.py

    print("\n\nRestarting the Python runtime! Please (re-)run the cells below.")
    # Restart Runtime
    import os
    os.kill(os.getpid(), 9)
else:
    print("Python version is already 3.7")

Please make sure that both blocks below report `3.7.16`. If the switch stops working in the future, please contact `ai4climatecoop@gmail.com` so that it can be updated.
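
In addition to reading the two version strings by eye, the kernel's version can be checked programmatically. This is a minimal sketch: the helper `python_matches` is hypothetical, it compares only the major and minor version, and it inspects the running kernel rather than the `python` found by the shell.

```python
import sys


def python_matches(expected=(3, 7), version_info=None):
    """Return True if the major.minor version equals `expected`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(expected)


if not python_matches():
    print("Python 3.7 is not active yet; rerun the switch cell and restart the runtime.")
```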

In [None]:
#check python version
!python --version

In [None]:
import sys
sys.version

# Install prerequisite packages
Running this block takes about 4 minutes.

In [None]:
!git clone https://github.com/mila-iqia/climate-cooperation-competition.git

In [None]:
import os
os.chdir(os.getcwd()+"/climate-cooperation-competition")
_ROOT = os.getcwd()
!pip install -r requirements.txt
!pip install rl_warp_drive==1.7.0 # For troubleshooting, please refer to https://github.com/salesforce/warp-drive
!pip install ray[rllib]==1.0.0
# !pip install codecarbon

# Load dependency

In [None]:
import os
import sys

import warnings
warnings.filterwarnings('ignore')
_ROOT = os.getcwd()
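# Put the repo's scripts folder first on the import path.
# Note: os.path.join(_ROOT, "/scripts") would discard _ROOT because "/scripts" is absolute.
sys.path.insert(0, os.path.join(_ROOT, "scripts"))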

from desired_outputs import desired_outputs
from importlib import reload
# from codecarbon import EmissionsTracker

# Train agents with GPU

<!-- To train with a GPU, make sure that you have an **NVIDIA graphics card** and can install critical packages such as ``warp-drive`` and ``pytorch``. If you don't have an NVIDIA graphics card, please refer to the section **Train agents with CPU** below. -->

In this section, two examples of GPU-based training with [WarpDrive](https://github.com/salesforce/warp-drive) are presented. 


1.   The first example does not include negotiation between regions. Since there is no direct interaction between the different regions without negotiation, total runtime is ~2 minutes.
2.   The second example includes negotiations between regions. These negotiations take place according to the negotiation protocol outlined in ``rice.py``. Total runtime is 15~20 minutes.



In [None]:
import train_with_warp_drive as gpu_trainer

Here are some suggested baseline parameter values. Training runs on a single GPU.

```python
num_envs = 100 # ensemble results over 100 randomly initialized environments
train_batch_size = 1024 # train with a batch size of 1024
num_episodes = 30000 # number of episodes
lr = 0.0005 # learning rate (the value passed in the training cells of this notebook)
model_params_save_freq = 5000 # save the model every 5000 steps
```
Additionally, we specify 
```python 
negotiation_on = 0 # no negotiation
```


The following code tracks carbon emissions using [codecarbon](https://github.com/mlco2/codecarbon). In this notebook, the tracker lines and the `codecarbon` install and import lines are commented out; uncomment them to enable tracking, or leave them as they are if you do not want codecarbon to track your carbon footprint. See the [codecarbon website](https://codecarbon.io/) for more details.
```python 
tracker = EmissionsTracker()
tracker.start()
pass # GPU Intensive code goes here
tracker.stop()
```
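
Since tracking is optional, one way to avoid commenting lines in and out is to wrap the tracker in a context manager. This is a minimal sketch under stated assumptions: `optional_emissions_tracker` is a hypothetical helper, not part of the repository, and it relies only on the `EmissionsTracker.start()` and `EmissionsTracker.stop()` calls shown above. If `codecarbon` is not installed, the wrapped code runs without tracking.

```python
from contextlib import contextmanager


@contextmanager
def optional_emissions_tracker(enabled=True):
    """Track emissions around a block of code when codecarbon is available."""
    tracker = None
    if enabled:
        try:
            # Imported lazily so that the notebook still runs without codecarbon.
            from codecarbon import EmissionsTracker

            tracker = EmissionsTracker()
            tracker.start()
        except ImportError:
            print("codecarbon is not installed; skipping emission tracking.")
    try:
        yield tracker
    finally:
        # Stop the tracker even if the wrapped code raises.
        if tracker is not None:
            tracker.stop()


with optional_emissions_tracker(enabled=False) as tracker:
    pass  # GPU intensive code goes here
```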

Running this next cell will take approximately 2 minutes.

In [None]:
# tracker = EmissionsTracker()
# tracker.start()

gpu_trainer_off, gpu_nego_off_ts = gpu_trainer.trainer(negotiation_on=0, # no negotiation
  num_envs=100, 
  train_batch_size=1024, 
  num_episodes=3000, 
  lr=0.0005,
  model_params_save_freq=5000, 
  desired_outputs=desired_outputs, # a list of values that the simulator will output
  output_all_envs=False # output the mean over all "num_envs" environments; set to True to output every environment's results
  )

# tracker.stop()

To train the agents with negotiation, we modify ``negotiation_on``:

```python
negotiation_on = 1 # with naive negotiation
```
A naive negotiation protocol is already implemented, but **participants are expected to modify, improve and/or replace this protocol to maximize climate and economic outcomes**.

Running this next cell will take 15~20 minutes.

In [None]:
# tracker = EmissionsTracker()
# tracker.start()

gpu_trainer_on, gpu_nego_on_ts = gpu_trainer.trainer(negotiation_on=1, # with naive negotiation
  num_envs=100,
  train_batch_size=1024,
  num_episodes=30000,
  lr=0.0005,
  model_params_save_freq=5000,
  desired_outputs=desired_outputs, # a list of values that the simulator will output
  output_all_envs=False # output the mean over all "num_envs" environments; set to True to output every environment's results
  )

# tracker.stop()

The trainer `gpu_trainer_on` closes gracefully, so `gpu_nego_on_ts` contains the timeseries data from the trainer.


If you encounter the following error:

```
RuntimeError: CUDA out of memory.
```
reducing ``num_envs`` and ``train_batch_size`` can help to some extent.
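
The reduction can also be automated. The sketch below retries training with both values halved whenever an out-of-memory error is raised. It is an illustration only: `train_with_backoff` is a hypothetical helper, it assumes the training function accepts `num_envs` and `train_batch_size` as keyword arguments (as `gpu_trainer.trainer` does in the cells above), and it does not guarantee that GPU memory from the failed attempt is released. If the error persists, restarting the runtime is the safer option.

```python
def train_with_backoff(train_fn, num_envs, train_batch_size, min_envs=1, **kwargs):
    """Call train_fn, halving num_envs and train_batch_size on out-of-memory errors."""
    while True:
        try:
            return train_fn(
                num_envs=num_envs, train_batch_size=train_batch_size, **kwargs
            )
        except RuntimeError as err:
            # Re-raise anything that is not an OOM error, or if we cannot shrink further.
            if "out of memory" not in str(err) or num_envs <= min_envs:
                raise
            num_envs = max(min_envs, num_envs // 2)
            train_batch_size = max(1, train_batch_size // 2)
            print(f"Out of memory; retrying with num_envs={num_envs}, "
                  f"train_batch_size={train_batch_size}")


# Example usage (uncomment in the notebook):
# gpu_trainer_off, gpu_nego_off_ts = train_with_backoff(
#     gpu_trainer.trainer, num_envs=100, train_batch_size=1024,
#     negotiation_on=0, num_episodes=3000, lr=0.0005,
#     model_params_save_freq=5000, desired_outputs=desired_outputs,
#     output_all_envs=False)
```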

If you encounter unexpected errors such as 

```
RuntimeError: CUDA error: invalid resource handle
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
```

please try restarting the runtime before opening an issue.

To customize the training script, please check ``train_with_warp_drive.py`` (imported above as `gpu_trainer`) for more details.

# Train agents with CPU

CPU-based training can also be done with `rllib`, although it can take much longer depending on the complexity of the negotiation protocol (~3 times longer for the naive negotiation protocol).

In [None]:
# This is necessary for rllib to get the correct path!
os.chdir(_ROOT+"/scripts")
import train_with_rllib as cpu_trainer

Here are some suggested baseline parameter values. Training runs on a single CPU.

```python
num_envs = 1 # a single randomly initialized environment
train_batch_size = 1024 # train with a batch size of 1024
num_episodes = 30000 # number of episodes
lr = 0.0005 # learning rate (the value passed in the training cells of this notebook)
model_params_save_freq = 5000 # save the model every 5000 steps
num_workers=1 # a single CPU
```
Additionally, we specify 
```python 
yaml_path="rice_rllib.yaml" # no negotiation, the original RICE
```


Running this next cell will take ~6 minutes.

In [None]:
cpu_trainer = reload(cpu_trainer)

In [None]:
cpu_trainer_off, cpu_nego_off_ts = cpu_trainer.trainer(
  num_envs=1, 
  train_batch_size=1024, 
  num_episodes=300, 
  lr=0.0005, 
  model_params_save_freq=5000, 
  desired_outputs=desired_outputs, # a list of values that the simulator will output
  num_workers=1,
  yaml_path="rice_rllib.yaml"
  )

To train the agents with negotiation, we modify ``yaml_path``:

```python
yaml_path="rice_rllib_binary_negotiation.yaml"
```
A binary negotiation protocol is already implemented, but **participants are expected to modify, improve and/or replace this protocol to maximize climate and economic outcomes**.

Running this next cell will take ~33 minutes.

In [None]:
cpu_trainer = reload(cpu_trainer)

In [None]:
cpu_trainer_on, cpu_nego_on_ts = cpu_trainer.trainer(
  num_envs=1, 
  train_batch_size=1024, 
  num_episodes=300, 
  lr=0.0005, 
  model_params_save_freq=5000, 
  desired_outputs=desired_outputs, # a list of values that the simulator will output
  num_workers=1,
  yaml_path="rice_rllib_binary_negotiation.yaml"
  )

The trainer `cpu_trainer_on` closes gracefully, so `cpu_nego_on_ts` contains the timeseries data from the trainer.

If the process is killed during training, reducing ``num_envs`` and ``train_batch_size`` can help to some extent.

# Save or load from previous training results

This section is for saving and loading the results of training (not the trainer itself).
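
For readers who want to see the idea in isolation, the round trip can be sketched with the standard `pickle` module. This is an assumption-laden illustration: the `.pkl` filename used in this section suggests pickling, but how `opt_helper` implements `save` and `load` is not shown here, and `save_ts` and `load_ts` are hypothetical stand-ins.

```python
import os
import pickle
import tempfile


def save_ts(obj, path):
    """Serialize a dictionary of logged time series to `path`."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_ts(path):
    """Load a dictionary previously written by save_ts."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy data standing in for the time series returned by the trainers.
results = {"nego_off": {"global_temperature": [0.8, 0.9]}, "nego_on": None}
path = os.path.join(tempfile.mkdtemp(), "filename.pkl")
save_ts(results, path)
restored = load_ts(path)
```

Only load pickle files that you created yourself, since unpickling runs arbitrary code.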

In [None]:
from opt_helper import save, load

To save the output timeseries: 

In [None]:
# [uncomment below to save]
# save({"nego_off":gpu_nego_off_ts, "nego_on":gpu_nego_on_ts}, "filename.pkl")

To load the output timeseries:

In [None]:
# [uncomment below to load]
# dict_ts = load("filename.pkl")
# nego_off_ts, nego_on_ts = dict_ts["nego_off"], dict_ts["nego_on"]

# Plot training procedures

One may want to plot some of the metrics that are logged during training, such as `Mean rewards`. The logged metrics are:

```python
metrics = ['Iterations Completed',
 'VF loss coefficient',
 'Entropy coefficient',
 'Total loss',
 'Policy loss',
 'Value function loss',
 'Mean rewards',
 'Max. rewards',
 'Min. rewards',
 'Mean value function',
 'Mean advantages',
 'Mean (norm.) advantages',
 'Mean (discounted) returns',
 'Mean normalized returns',
 'Mean entropy',
 'Variance explained by the value function',
 'Gradient norm',
 'Learning rate',
 'Mean episodic reward',
 'Mean policy eval time per iter (ms)',
 'Mean action sample time per iter (ms)',
 'Mean env. step time per iter (ms)',
 'Mean training time per iter (ms)',
 'Mean total time per iter (ms)',
 'Mean steps per sec (policy eval)',
 'Mean steps per sec (action sample)',
 'Mean steps per sec (env. step)',
 'Mean steps per sec (training time)',
 'Mean steps per sec (total)'
 ]
```

To list the logged submissions, please run the following block.

In [None]:
from glob import glob
glob(os.path.join(_ROOT,"Submissions/*.zip"))

If previous trainings are finished and logged properly, this should give a list of `*.zip` files where the logs are included. 
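
`glob` does not return files in a guaranteed order, so when several archives exist it may help to pick the most recent one explicitly. The helper below is a minimal sketch; `newest_submission` is a hypothetical name, and it assumes that the file modification time reflects when the training run finished.

```python
import os
from glob import glob


def newest_submission(submissions_dir):
    """Return the most recently modified *.zip in `submissions_dir`, or None."""
    zips = glob(os.path.join(submissions_dir, "*.zip"))
    return max(zips, key=os.path.getmtime) if zips else None


# Example usage (uncomment in the notebook):
# log_zip = newest_submission(os.path.join(_ROOT, "Submissions"))
```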

As an example, the code below picks one of the submissions and plots the metric `Mean episodic reward`.

In [None]:
from opt_helper import get_training_curve, plot_training_curve

log_zip = glob(os.path.join(_ROOT,"Submissions/*.zip"))[0]
plot_training_curve(None, 'Mean episodic reward', log_zip)

# to check the raw logging dictionary, uncomment below
# logs = get_training_curve(log_zip)
# logs

# Plot results

In [None]:
from desired_outputs import desired_outputs

One may want to check the performance of the agents by plotting graphs. Below, we list all the logged variables. One may change the ``desired_outputs.py`` to add more variables of interest.

```python
desired_outputs = ['global_temperature', 
  'global_carbon_mass', 
  'capital_all_regions', 
  'labor_all_regions', 
  'production_factor_all_regions', 
  'intensity_all_regions', 
  'global_exogenous_emissions', 
  'global_land_emissions', 
  'timestep', 
  'activity_timestep', 
  'capital_depreciation_all_regions', 
  'savings_all_regions', 
  'mitigation_rate_all_regions', 
  'max_export_limit_all_regions', 
  'mitigation_cost_all_regions', 
  'damages_all_regions', 
  'abatement_cost_all_regions', 
  'utility_all_regions', 
  'social_welfare_all_regions', 
  'reward_all_regions', 
  'consumption_all_regions', 
  'current_balance_all_regions', 
  'gross_output_all_regions', 
  'investment_all_regions', 
  'production_all_regions', 
  'tariffs', 
  'future_tariffs', 
  'scaled_imports', 
  'desired_imports', 
  'tariffed_imports', 
  'stage', 
  'minimum_mitigation_rate_all_regions', 
  'promised_mitigation_rate', 
  'requested_mitigation_rate', 
  'proposal_decisions',
  'global_consumption',
  'global_production']
```

In [None]:
from opt_helper import plot_result

`plot_result()` plots the time series of logged variables.

```python
plot_result(variables, nego_off, nego_on, k)
```
* ``variables`` can be either a single variable of interest or a list of variable names from the above list. 
* ``nego_off`` and ``nego_on`` are the logged time series for these variables, without and with negotiation, respectively. 
* ``k`` represents the dimension of the variable of interest (it should be ``0`` by default for most situations).
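
To make the role of ``k`` concrete, here is a toy sketch. It assumes a hypothetical layout in which a logged variable has one row per timestep and one column per dimension; the arrays produced by the trainers may be organized differently, and `select_dimension` is not a function of `opt_helper`.

```python
# Hypothetical logged variable: 3 timesteps, 2 dimensions per timestep.
logged_variable = [
    [0.80, 0.10],
    [0.85, 0.20],
    [0.91, 0.30],
]


def select_dimension(series, k=0):
    """Return the k-th dimension of a multi-dimensional time series."""
    return [row[k] for row in series]


first_dimension = select_dimension(logged_variable, k=0)
second_dimension = select_dimension(logged_variable, k=1)
```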

Here's an example of plotting a single variable of interest.

In [None]:
plot_result("global_temperature", 
  nego_off=gpu_nego_off_ts, # change it to cpu_nego_off_ts if using CPU
  nego_on=gpu_nego_on_ts, 
  k=0)

Here's an example of plotting a list of variables.

In [None]:
plot_result(desired_outputs[0:3], # truncated for demonstration purposes
  nego_off=gpu_nego_off_ts, 
  nego_on=gpu_nego_on_ts, 
  k=0)

To plot only the negotiation-off results, set `nego_on=None`. 

In [None]:
plot_result(desired_outputs[0:3], # truncated for demonstration purposes
  nego_off=gpu_nego_off_ts, 
  nego_on=None, 
  k=0)

# How to quickly evaluate the results

This section is for evaluating the trained agents. One can edit the evaluation function ``val_metrics`` in ``evaluate_submission.py`` to include more metrics of interest.

The evaluation script requires as input:
1. The trainer
2. The logged_variables
3. The framework of the trainer. If using GPU-based training, it should be ``warpdrive``. If using CPU-based training, it should be ``rllib``.

We give one example below.

In [None]:
os.chdir(os.path.join(_ROOT,"scripts"))
from evaluate_submission import val_metrics
val_metrics(logged_ts=gpu_nego_off_ts, framework="warpdrive") # for GPU
# val_metrics(logged_ts=cpu_nego_off_ts, framework="rllib") # for CPU

To evaluate a specific zip submission, run the following block, replacing `FILENAME.zip` with the name of your zip file.

In [None]:
from evaluate_submission import perform_evaluation, get_results_dir
unzip_path, _ = get_results_dir("/content/climate-cooperation-competition/Submissions/FILENAME.zip")
from run_unittests import _BASE_CODE_PATH, _BASE_RICE_PATH, _BASE_RICE_HELPERS_PATH, _BASE_RICE_BUILD_PATH, _BASE_CONSISTENCY_CHECKER_PATH
perform_evaluation(unzip_path)

# Code pieces that can be modified

As a running example, we use the bilateral negotiation protocol. For more examples, please see section 5.3 in [the white paper](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/White_Paper.pdf).

## Introduction of environment codes

[``rice.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py), [``rice_cuda.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_cuda.py), [``rice_step.cu``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_step.cu) and [``rice_helpers.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_helpers.py) make up the environment code, in Python and in CUDA.

* [``rice.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py) includes interactions between the agents and the environment. **[``rice.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py) is the main script to be modified.**

* [``rice_helpers.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_helpers.py) includes all the socioeconomic and climate dynamics. [``rice_helpers.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_helpers.py) should not be changed.

* [GPU needed] [``rice_cuda.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_cuda.py) connects the data between the python script and CUDA code.

* [GPU needed] [``rice_step.cu``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_step.cu) is the CUDA version of the code which contains the socioeconomic and climate dynamics, as well as the interactions between the agents and the environment. **To use GPU-based training, the CUDA code in ``rice_step.cu`` must have the same logic as the python code in [``rice.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py) and [``rice_helpers.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice_helpers.py).** The CUDA code mostly follows C++ syntax. Please refer to the [CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) for more details.



## How to add extra observations

To add extra observations or make changes to the observation space, at least two functions must be modified.
1.   [`generate_observation()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L379)
2.   [`reset()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/)

As an example, [here](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L428) are the features added when the naive bilateral negotiation protocol is enabled in the simulator: 

``` python
        if self.negotiation_on:
            global_features += ["stage"]

            public_features += []

            private_features += [
                "minimum_mitigation_rate_all_regions",
            ]

            bilateral_features += [
                "promised_mitigation_rate",
                "requested_mitigation_rate",
                "proposal_decisions",
            ]

        shared_features = np.array([])
        for feature in global_features + public_features:
            shared_features = np.append(
                shared_features,
                self.flatten_array(
                    self.global_state[feature]["value"][self.timestep]
                    / self.global_state[feature]["norm"]
                ),
            )


```
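
The loop above can be tried in isolation with a toy state store. This is a minimal sketch under stated assumptions: the feature values and norms are made up, `build_shared_features` is a hypothetical helper, and `np.ravel` stands in for `self.flatten_array`, whose implementation is not shown here.

```python
import numpy as np

# Toy state store mimicking the {"value": ..., "norm": ...} layout used above.
# "value" is indexed by timestep first.
timestep = 1
global_state = {
    "global_temperature": {"value": np.array([[0.8, 0.1], [1.0, 0.2]]), "norm": 10.0},
    "stage": {"value": np.array([0, 1]), "norm": 2.0},
}


def build_shared_features(state, features, t):
    """Normalize each feature at timestep t and concatenate into one flat vector."""
    shared = np.array([])
    for feature in features:
        normalized = state[feature]["value"][t] / state[feature]["norm"]
        shared = np.append(shared, np.ravel(normalized))
    return shared


shared_features = build_shared_features(
    global_state, ["global_temperature", "stage"], timestep
)
print(shared_features)
```

A new observation therefore needs an entry in the state store (with a value array and a norm) and its name in one of the feature lists.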


## How to add actions

By default, agents' actions are contained in [`self.actions_nvec`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L136) during [`__init__()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L64):

```python
        self.actions_nvec = (
            self.savings_action_nvec
            + self.mitigation_rate_action_nvec
            + self.export_action_nvec
            + self.import_actions_nvec
            + self.tariff_actions_nvec
        )

```

Extra actions related to the negotiation protocol can be appended to `self.actions_nvec`.
It is important that extra actions be appended at the **end** of `self.actions_nvec`.
``` python 
            # Each region proposes to each other region
            # self mitigation and their mitigation values
            self.proposal_actions_nvec = (
                [self.num_discrete_action_levels] * 2 * self.num_regions
            )

            # Each region evaluates a proposal from every other region,
            # either accept or reject.
            self.evaluation_actions_nvec = [2] * self.num_regions

            # extra actions are appended to the end of self.actions_nvec
            self.actions_nvec += (
                self.proposal_actions_nvec + self.evaluation_actions_nvec
            )

```
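
The effect of appending at the end can be checked with plain lists: the indices of the existing actions stay unchanged, and the negotiation actions occupy the tail. In this sketch the sizes of the base action vectors are illustrative assumptions rather than values read from ``rice.py``.

```python
num_regions = 3
num_discrete_action_levels = 10

# Illustrative sizes for the base actions (assumptions for this sketch).
savings_action_nvec = [num_discrete_action_levels]
mitigation_rate_action_nvec = [num_discrete_action_levels]
export_action_nvec = [num_discrete_action_levels]
import_actions_nvec = [num_discrete_action_levels] * num_regions
tariff_actions_nvec = [num_discrete_action_levels] * num_regions

actions_nvec = (
    savings_action_nvec
    + mitigation_rate_action_nvec
    + export_action_nvec
    + import_actions_nvec
    + tariff_actions_nvec
)
base_len = len(actions_nvec)

# Negotiation actions, built as in the snippet above.
proposal_actions_nvec = [num_discrete_action_levels] * 2 * num_regions
evaluation_actions_nvec = [2] * num_regions
actions_nvec += proposal_actions_nvec + evaluation_actions_nvec

# Index ranges (start, end) of the appended actions within actions_nvec.
proposal_slice = (base_len, base_len + len(proposal_actions_nvec))
evaluation_slice = (proposal_slice[1], len(actions_nvec))
print(proposal_slice, evaluation_slice)
```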

## How to implement the logic for negotiation protocols

The baseline logic for bilateral negotiation actions is a naive bargain process with two steps:
1. A [``proposal_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L536) for each agent to propose certain actions to other agents, for example a minimum mitigation rate.
2. An [``evaluation_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L585) for each agent to evaluate other agents' proposals. 

These functions describe how the negotiation actions affect the observation space and the action masking (for more, see the next section).
Both steps are done sequentially in the [``step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L346) function in [``rice.py``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py): 

```python
        if self.negotiation_on:
            # Note: the '+1' below is for the climate_and_economy_simulation_step
            self.stage = self.timestep % (self.num_negotiation_stages + 1)
            self.set_global_state(
                "stage", self.stage, self.timestep, dtype=self.int_dtype
            )
            if self.stage == 1:
                return self.proposal_step(actions)

            if self.stage == 2:
                return self.evaluation_step(actions)

        return self.climate_and_economy_simulation_step(actions)

```
Once the stages of the negotiation protocol are concluded, then the [`climate_and_economy_simulation_step()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L651) implements the socioeconomic and climate dynamics associated with the updated observation space and masked actions.
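
The resulting schedule can be printed with a few lines of plain Python. This sketch reuses the modulo rule from the snippet above with two negotiation stages; `step_name` is a hypothetical helper that returns the name of the function that would be called, rather than calling it.

```python
num_negotiation_stages = 2

STEP_FOR_STAGE = {
    0: "climate_and_economy_simulation_step",
    1: "proposal_step",
    2: "evaluation_step",
}


def step_name(timestep, negotiation_on=True):
    """Return the name of the step function used at a given timestep."""
    if negotiation_on:
        # The '+1' accounts for the climate and economy simulation step.
        stage = timestep % (num_negotiation_stages + 1)
        return STEP_FOR_STAGE[stage]
    return STEP_FOR_STAGE[0]


schedule = [step_name(t) for t in range(1, 7)]
for t, name in enumerate(schedule, start=1):
    print(t, name)
```

With this rule, a protocol with more stages needs a larger `num_negotiation_stages` and one extra branch per stage.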

We expect competitors to propose different mechanisms to encourage global cooperation along climate and economic objectives.
Participants should therefore modify this code to match the logic of their proposed negotiation protocol, even proposing new functions to replace [``proposal_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L536), [``evaluation_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L585) and the code above.

For example, competitors could propose a mechanism to form [dynamic climate clubs](https://williamnordhaus.com/publications/climate-clubs-overcoming-free-riding-international-climate-policy), where admittance is based on a minimum mitigation rate. Club members enjoy lower tariffs when trading with other club members, while non-members, who do not have to contribute to mitigation, suffer heavy tariffs when trading with club members.



## What is masking?

Action masking determines the feasible subspace of the action space according to the negotiation protocol. Action masks are set before agents choose their actions, so the agent explicitly chooses from the feasible action subspace.
To implement this logic, action masks are updated in [``evaluation_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L585), that is, once the proposals made in [``proposal_step()``](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L536) have been evaluated, but before the [`climate_and_economy_simulation_step()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L651). This way, the regions are prohibited from taking actions outside of the feasible action subspace.

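For example, during the bilateral negotiation process, regions that agree to implement minimum mitigation rates are required to do so. The code below sets each region's minimum mitigation rate to the largest rate among all accepted proposals, both those it made and those it received.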

```python
        for region_id in range(self.num_regions):
            outgoing_accepted_mitigation_rates = [
                self.global_state["promised_mitigation_rate"]["value"][
                    self.timestep, region_id, j
                ]
                * self.global_state["proposal_decisions"]["value"][
                    self.timestep, j, region_id
                ]
                for j in range(self.num_regions)
            ]
            incoming_accepted_mitigation_rates = [
                self.global_state["requested_mitigation_rate"]["value"][
                    self.timestep, j, region_id
                ]
                * self.global_state["proposal_decisions"]["value"][
                    self.timestep, region_id, j
                ]
                for j in range(self.num_regions)
            ]

            self.global_state["minimum_mitigation_rate_all_regions"]["value"][
                self.timestep, region_id
            ] = max(
                outgoing_accepted_mitigation_rates + incoming_accepted_mitigation_rates
            )

```
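
A two-region toy example may help in reading the index order. The sketch below applies the same rule to plain nested lists for a single timestep. The numbers are made up, and the interpretation of the indices (`promised[i][j]` is what region `i` promises to `j`, `requested[j][i]` is what `j` requests from `i`, and `decisions[a][b] == 1` means that `a` accepted the proposal of `b`) is inferred from the snippet above.

```python
num_regions = 2
promised = [[0.0, 0.3], [0.5, 0.0]]
requested = [[0.0, 0.4], [0.2, 0.0]]
# Region 0 accepted region 1's proposal; region 1 rejected region 0's.
decisions = [[0, 1], [0, 0]]


def minimum_mitigation_rates(promised, requested, decisions, num_regions):
    """Largest accepted mitigation rate per region (0 if nothing was accepted)."""
    rates = []
    for region_id in range(num_regions):
        # Promises made by region_id that the other region accepted.
        outgoing = [
            promised[region_id][j] * decisions[j][region_id]
            for j in range(num_regions)
        ]
        # Requests addressed to region_id that region_id accepted.
        incoming = [
            requested[j][region_id] * decisions[region_id][j]
            for j in range(num_regions)
        ]
        rates.append(max(outgoing + incoming))
    return rates


rates = minimum_mitigation_rates(promised, requested, decisions, num_regions)
print(rates)
```

Here region 0 is bound by the rate that region 1 requested from it, and region 1 is bound by the rate that it promised to region 0.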



## How to implement and/or modify the logic of action masking?


The logic behind action masks is implemented in [`generate_action_mask()`](https://github.com/mila-iqia/climate-cooperation-competition/blob/main/rice.py#L506).
`mask_dict` gives the mapping for each region to its corresponding action `mask`. In the current implementation, `mask` is a binary vector where `0` indicates an action that is not allowed, and `1` indicates an action that is allowed.

For example, in the bilateral negotiation protocol, the action mask is based on the minimum mitigation rate for each region (see code below).
```python
    def generate_action_mask(self):
        """
        Generate action masks.
        """
        mask_dict = {region_id: None for region_id in range(self.num_regions)}
        for region_id in range(self.num_regions):
            mask = self.default_agent_action_mask.copy()
            if self.negotiation_on:
                minimum_mitigation_rate = int(round(
                    self.global_state["minimum_mitigation_rate_all_regions"]["value"][
                        self.timestep, region_id
                    ]
                    * self.num_discrete_action_levels
                ))
                mitigation_mask = np.array(
                    [0 for _ in range(minimum_mitigation_rate)]
                    + [
                        1
                        for _ in range(
                            self.num_discrete_action_levels - minimum_mitigation_rate
                        )
                    ]
                )
                mask_start = sum(self.savings_action_nvec)
                mask_end = mask_start + sum(self.mitigation_rate_action_nvec)
                mask[mask_start:mask_end] = mitigation_mask
            mask_dict[region_id] = mask

        return mask_dict

```
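
The core of the masking rule can be reproduced with plain lists. In this minimal sketch, `mitigation_mask` is a hypothetical helper that mirrors the rounding and the 0/1 layout used above, and the toy full mask assumes that a single savings action precedes the mitigation action.

```python
def mitigation_mask(minimum_mitigation_rate, num_levels):
    """Disallow (0) every level below the minimum rate and allow (1) the rest."""
    first_allowed = int(round(minimum_mitigation_rate * num_levels))
    return [0] * first_allowed + [1] * (num_levels - first_allowed)


num_levels = 10

# Toy full mask: savings levels followed by mitigation levels, all allowed.
mask = [1] * (2 * num_levels)
mask_start = num_levels  # the mitigation levels start after the savings levels
mask_end = mask_start + num_levels
mask[mask_start:mask_end] = mitigation_mask(0.3, num_levels)
print(mask)
```

With a minimum mitigation rate of 0.3 and 10 levels, the three lowest mitigation levels are masked out and the savings levels are left untouched.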