Copyright (c) 2020, salesforce.com, inc.  
All rights reserved.  
SPDX-License-Identifier: BSD-3-Clause  
For full license text, see the LICENSE file in the repo root or https://opensource.org/licenses/BSD-3-Clause

### Colab

Try this notebook on [Colab](http://colab.research.google.com/github/salesforce/ai-economist/blob/master/tutorials/economic_simulation_advanced.ipynb).

# The Structure of Foundation + How to Extend It 

In this tutorial, we will explain the low-level compositional structure of Foundation, the economic simulation. Its architecture stems from three main design goals:

1. Flexibility: e.g., it should be easy to create worlds with or without income taxes.
2. Extensibility: adding new entities and components should follow an easy and transparent process.
3. Simplicity: avoid deep class hierarchies.

To support these goals, Foundation modularizes the pieces of the simulation as much as possible. Below, we explain what these pieces actually are and how simulation environments are built from them. Understanding this structure will undoubtedly be useful when extending Foundation.

Foundation builds economic simulations using Scenario classes. Scenarios compose the simulation's constituent classes into an actual simulation environment. Most of this tutorial introduces the semantics of each of these class types:

1. Scenario
2. World
    - Maps
3. Entities
    - Resources
    - Landmarks
    - Endogenous
4. Agents
5. Components


To conclude the tutorial, we will focus on how to extend the economic simulation:

6. How the simulation pieces interact
7. Exercise: creating a new Component
8. Helpful tips

### Before we jump into the details...
... let's revisit some basics (covered in detail [here](https://github.com/salesforce/ai-economist/blob/master/tutorials/economic_simulation_basic.ipynb)).

Simulation environments exist as Python objects and are interacted with through a gym-style API:
```python
env = Scenario(...)
obs = env.reset()
obs, rew, done, info = env.step(actions) # w/ actions <-- policy(obs)
```
An environment is responsible for providing some *observations* based on the world & agent states and updating these states based on the *actions* taken by the agents and the dynamics of the environment.
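To make the call pattern concrete, here is a minimal, self-contained sketch of a gym-style interaction loop. `ToyEnv` and `random_policy` are hypothetical stand-ins (they are not Foundation classes) and only mimic the `reset`/`step` pattern shown above; the toy reward rule is an assumption made purely for illustration.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Scenario; it only mimics the reset/step API."""

    def __init__(self, n_agents=2, episode_length=5):
        self.n_agents = n_agents
        self.episode_length = episode_length
        self.timestep = 0

    def _observe(self):
        # One observation dictionary per agent, keyed by agent index.
        return {str(i): {"timestep": self.timestep} for i in range(self.n_agents)}

    def reset(self):
        self.timestep = 0
        return self._observe()

    def step(self, actions):
        self.timestep += 1
        # Toy reward (an assumption): 1.0 for any non-zero action, 0 is a NO-OP.
        rew = {idx: float(action != 0) for idx, action in actions.items()}
        done = self.timestep >= self.episode_length
        return self._observe(), rew, done, {}


def random_policy(obs, n_actions=3, rng=random):
    """Picks a random discrete action for every agent that has an observation."""
    return {idx: rng.randrange(n_actions) for idx in obs}


toy_env = ToyEnv()
toy_obs = toy_env.reset()
toy_done = False
n_steps = 0
while not toy_done:
    toy_actions = random_policy(toy_obs)
    toy_obs, toy_rew, toy_done, toy_info = toy_env.step(toy_actions)
    n_steps += 1
```

The loop ends once the toy episode length is reached, which is the same role `episode_length` plays in a real environment configuration.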

These dynamics are encapsulated in **Scenario** and **Component** classes, and most extensions of the simulation framework will likely focus on those classes. As a general description...
- A **Scenario** provides the backbone of the environment: 
    - it sets up the world and the agents,
    - adds some passive dynamics, 
    - supplies some observations, 
    - and generates rewards.
- **Components** are how agents interact with the environment: 
    - they add actions,
    - mediate the effect of whatever actions they add,
    - and provide relevant observations.

(now for the details!)

## Dependencies

You can install the ai-economist package using the pip package manager:

In [None]:
import os, signal, sys, time
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    !pip install ai-economist
    
    # Restart the Python runtime to automatically use the installed packages
    print("\n\nRestarting the Python runtime! Please (re-)run the cells below.")
    time.sleep(1)
    os.kill(os.getpid(), signal.SIGKILL)
else:
    !pip install ai-economist

In [None]:
from ai_economist import foundation

## Registry

The Registry class makes it convenient to look up classes, such as Scenarios and Resources, by name (a string). A class can be added to a Registry using the *add()* method and later retrieved with the *get()* method, using the value of its *name* property.
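As a rough mental model, a name-based registry can be sketched in a few lines. This is a simplified illustration, not Foundation's actual Registry implementation; `ToyRegistry`, `ToyBase`, and `ToyScenario` are hypothetical names.

```python
class ToyRegistry:
    """Illustrative name -> class lookup table (not Foundation's Registry)."""

    def __init__(self, base_class=object):
        self.base_class = base_class
        self._lookup = {}

    def add(self, cls):
        """Registers cls under cls.name; returns cls so it works as a decorator."""
        if not issubclass(cls, self.base_class):
            raise TypeError(f"{cls.__name__} must subclass {self.base_class.__name__}")
        self._lookup[cls.name] = cls
        return cls

    def get(self, name):
        """Returns the class registered under name."""
        if name not in self._lookup:
            raise KeyError(f"'{name}' is not registered")
        return self._lookup[name]

    @property
    def entries(self):
        """Sorted list of all registered names."""
        return sorted(self._lookup)


class ToyBase:
    name = ""


toy_registry = ToyRegistry(base_class=ToyBase)


@toy_registry.add
class ToyScenario(ToyBase):
    name = "toy/scenario"


retrieved_cls = toy_registry.get("toy/scenario")
```

Returning the class from `add` is what lets the method double as a class decorator.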

For example, the *make_env_instance* convenience method used below can create a basic ```"layout_from_file/simple_wood_and_stone"``` Scenario by first retrieving the associated class from the registry:

In [None]:
from ai_economist.foundation.base.base_env import BaseEnvironment, scenario_registry

In [None]:
test_env_cls = scenario_registry.get("layout_from_file/simple_wood_and_stone")

In [None]:
test_env_cls.name

To add a new class to a Registry, you can use a decorator as follows:

In [None]:
@scenario_registry.add
class NewEnvironment(BaseEnvironment):
    name = "NewEnvironment"

In [None]:
new_env_cls = scenario_registry.get("NewEnvironment")

In [None]:
new_env_cls.name

In [None]:
# These are the Scenario classes registered in scenario_registry
print(scenario_registry.entries)

There is a separate registry for each type of environment component. The ```foundation``` package exposes them as follows:

In [None]:
# Scenarios:
print(foundation.scenarios.entries)

In [None]:
# Entities (landmarks, resources, endogenous):
print(foundation.landmarks.entries)
print(foundation.resources.entries)
print(foundation.endogenous.entries)

In [None]:
# Agents:
print(foundation.agents.entries)

In [None]:
# Components:
print(foundation.components.entries)

# 1. Scenarios

As discussed in [the basics tutorial](https://github.com/salesforce/ai-economist/blob/master/tutorials/economic_simulation_basic.ipynb), the Scenario class implements an economic simulation with multiple agents and (optionally) a social planner. 

We will create the same environment instance used in that tutorial, using the configuration below. This configuration defines a simulation with four agents in a world of 15 by 15 cells. Each Agent can: 

- gather collectible Resources (through the Gather Component), 
- build Houses (through the Build Component), and 
- trade collectible Resources (through the ContinuousDoubleAuction Component).

In [None]:
# Define the configuration of the environment that will be built

env_config = {
    # ===== STANDARD ARGUMENTS ======
    'n_agents': 4,          # Number of non-planner agents
    'world_size': [15, 15], # [Height, Width] of the env world
    'episode_length': 1000, # Number of timesteps per episode
    
    # In multi-action-mode, the policy selects an action for each action subspace (defined in component code)
    # Otherwise, the policy selects only 1 action
    'multi_action_mode_agents': False,
    'multi_action_mode_planner': True,
    
    # When flattening observations, concatenate scalar & vector observations before output
    # Otherwise, return observations with minimal processing
    'flatten_observations': False,
    # When Flattening masks, concatenate each action subspace mask into a single array
    # Note: flatten_masks = True is recommended for masking action logits
    'flatten_masks': True,
    
    
    # ===== COMPONENTS =====
    # Which components to use (specified as list of {"component_name": {component_kwargs}} dictionaries)
    #   "component_name" refers to the component class's name in the Component Registry
    #   {component_kwargs} is a dictionary of kwargs passed to the component class
    # The order in which components reset, step, and generate obs follows their listed order below
    'components': [
        # (1) Building houses
        {'Build': {}},
        # (2) Trading collectible resources
        {'ContinuousDoubleAuction': {'max_num_orders': 5}},
        # (3) Movement and resource collection
        {'Gather': {}},
    ],
    
    # ===== SCENARIO =====
    # Which scenario class to use (specified by the class's name in the Scenario Registry)
    'scenario_name': 'uniform/simple_wood_and_stone',
    
    # (optional) kwargs of the chosen scenario class
    'starting_agent_coin': 10,
    'starting_stone_coverage': 0.10,
    'starting_wood_coverage':  0.10,
}

This configuration dictionary lists the Components to use; each one can be configured through a dictionary of Component-specific settings.
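To illustrate the `{"component_name": {component_kwargs}}` format, the sketch below turns such a list into ordered `(name, kwargs)` pairs. `parse_component_specs` is a hypothetical helper written for this tutorial, not a Foundation function; the real environment does its own parsing internally.

```python
def parse_component_specs(component_specs):
    """Turns [{"Name": {kwargs}}, ...] into an ordered list of (name, kwargs) pairs."""
    parsed = []
    for spec in component_specs:
        if len(spec) != 1:
            raise ValueError("Each component spec must have exactly one key.")
        # Unpack the single (name, kwargs) item of this one-key dictionary.
        (name, kwargs), = spec.items()
        parsed.append((name, dict(kwargs)))
    return parsed


example_specs = [
    {"Build": {}},
    {"ContinuousDoubleAuction": {"max_num_orders": 5}},
    {"Gather": {}},
]
parsed_specs = parse_component_specs(example_specs)
```

Using a list of one-key dictionaries (rather than a single dictionary) keeps the component order explicit, which matters because components reset, step, and generate observations in their listed order.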

Creating a Scenario can be done using a convenience method:

In [None]:
env = foundation.make_env_instance(**env_config)

In [None]:
obs = env.reset()

In the above code, ```env``` is an instance of the Scenario class stored in ```scenario_registry``` as ```"uniform/simple_wood_and_stone"```:

In [None]:
uniform_cls = scenario_registry.get(env_config['scenario_name'])
isinstance(env, uniform_cls)

This Scenario class (like all Scenario classes) is a subclass of ```BaseEnvironment``` (meaning ```env``` is also an instance of ```BaseEnvironment```).

In [None]:
isinstance(env, BaseEnvironment)

**Why this structure?** The ```env``` object is responsible for a lot! It organizes all the pieces (the world, agents, and components) into a coherent environment with a simple and consistent API. It also implements some of the behavior of the environment itself: the passive (not action-dependent) dynamics of the world, baseline observations, and rewards.

That first domain of functionality is implemented in the ```BaseEnvironment``` code and the second domain of functionality (the "behavior") is implemented separately by each Scenario class via the following methods:
```python
from ai_economist.foundation.base.base_env import BaseEnvironment, scenario_registry

@scenario_registry.add
class EmptyScenario(BaseEnvironment):
    name = "Empty"
    required_entities = []
    
    def reset_starting_layout(self):
        """Resets the state of the world object (self.world)."""
        pass
    
    def reset_agent_states(self):
        """Resets the state of the agent objects (self.world.agents & self.world.planner)."""
        pass
    
    def scenario_step(self):
        """Implements the passive dynamics of the environment."""
        pass
    
    def generate_observations(self):
        """Yields some basic observations about the world/agent states."""
        pass
    
    def compute_reward(self):
        """Determines the reward each agent receives at the end of each timestep."""
        pass
```

The expected behaviors of these methods are described extensively in the internal documentation of ```BaseEnvironment```, where they are defined as abstract methods (see [foundation/base/base_env.py](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/base/base_env.py)).

```env```, which is an instance of the ```"uniform/simple_wood_and_stone"``` Scenario class, has the following behavior:
- **reset_starting_layout**: Samples a new spatial layout of Stone and Wood source tiles in the world.
- **reset_agent_states**: Resets agent inventories and their starting locations in the world.
- **scenario_step**: Stochastically re-spawns Stone and Wood at empty source tiles.
- **generate_observations**: Generates observations related to inventory and the spatial state of the world.
- **compute_reward**: Marginal utility for each agent in ```env.world.agents```; marginal social welfare for ```env.world.planner```.

Check out [the code for this Scenario class](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/scenarios/simple_wood_and_stone/dynamic_layout.py) to see how this behavior is implemented.

**Note**: This example refers to some concepts we haven't introduced yet (source tiles, inventories, etc.). We'll cover those in the sections below!
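The "marginal utility" reward mentioned above can be sketched as the change in an agent's utility between consecutive timesteps. The utility function here is a hypothetical placeholder (Coin minus weighted Labor), chosen only to keep the example short; it is not the utility function used by the Scenario class.

```python
def toy_utility(coin, labor, labor_weight=1.0):
    """Hypothetical utility: more Coin is good, more Labor is costly."""
    return coin - labor_weight * labor


def marginal_rewards(prev_utility, curr_utility):
    """Reward per agent = change in utility since the previous timestep."""
    return {idx: curr_utility[idx] - prev_utility[idx] for idx in curr_utility}


prev_u = {"0": toy_utility(coin=10, labor=0.0), "1": toy_utility(coin=10, labor=0.0)}
curr_u = {"0": toy_utility(coin=15, labor=2.0), "1": toy_utility(coin=10, labor=0.5)}
rewards = marginal_rewards(prev_u, curr_u)
```

With this rule, an agent that earns Coin faster than it accumulates Labor cost receives a positive reward, while an agent that only expends Labor receives a negative one.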

# 2. World and Maps

Above, we saw how a Scenario class resets the spatial state of the world, but **where is this spatial state represented?**

Each Scenario will include an instance of the **World** class ```env.world``` to wrap agent instances (more on that below) and an instance of the **Maps** class ```env.world.maps```. **The Maps class stores and manipulates the spatial state of the environment**, such as the locations of Agents and other Entities.

Both classes (World and Maps) are implemented in [foundation/base/world.py](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/base/world.py).

The **maps** object ```env.world.maps``` holds one 2-dimensional NumPy array per Entity, each recording where that Entity is located in the world.

In [None]:
# For each key, the maps object has a [Height, Width] array for the spatial layout of that Entity in the world.
env.world.maps.keys()

For instance, we can visualize where the Stone is in the world:

In [None]:
# Note: this map has the same size as our world (15 by 15)
env.world.maps.get("Stone")

The **world** object ```env.world``` provides some tools for interfacing with **maps**.
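As a rough mental model of this layout (one grid per Entity, plus a per-cell lookup), consider the sketch below. It uses plain nested lists in place of NumPy arrays, and `toy_maps` and `entities_at` are hypothetical names rather than part of Foundation.

```python
# One [Height, Width] grid per entity (nested lists standing in for NumPy arrays).
height, width = 3, 4
toy_maps = {
    "Wood": [[0] * width for _ in range(height)],
    "Stone": [[0] * width for _ in range(height)],
}
toy_maps["Wood"][0][1] = 1   # one unit of Wood at row 0, column 1
toy_maps["Stone"][2][3] = 1  # one unit of Stone at row 2, column 3


def entities_at(maps, row, col):
    """Returns {entity_name: amount} for every entity present at (row, col)."""
    return {name: grid[row][col] for name, grid in maps.items() if grid[row][col] > 0}


here = entities_at(toy_maps, 0, 1)
empty = entities_at(toy_maps, 1, 1)
```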

To see which Resources are in a certain cell, you can use the convenience method *location_resources*:

In [None]:
env.world.location_resources(0, 0)

To see which Landmarks are in a certain cell, you can use the convenience method *location_landmarks*:

In [None]:
env.world.location_landmarks(0, 0)

# 3. Entities

Agents in the economic simulation can interact with Entities. There are 3 groups of Entities, each with their own semantics:

- **Landmarks** show up in the spatial world 
- **Resources** show up in agent inventories and (optionally) also the spatial world
- **Endogenous** entities represent abstract quantities (like effort) that agents can only observe about themselves

You can find their definitions in [foundation/entities](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/entities). Again, we will use convenient Registries to retrieve the various classes.

## Landmarks

Landmarks represent entities that exist exclusively in the spatial world, for example a block of Water that agents can't move over.

In the current implementation, there are three types of Landmarks: House, Water and SourceBlock. SourceBlocks are special Landmarks from which Resources can spawn.

In [None]:
house = foundation.landmarks.get("House")
water = foundation.landmarks.get("Water")
source_block_wood = foundation.landmarks.get("WoodSourceBlock")

For each Landmark, the class defines:

- its name, 
- its color, 
- whether the Landmark is ownable by an Agent (e.g., Houses), and 
- whether it's solid (e.g., Water).

An agent cannot occupy the same location as a solid landmark (e.g. Water) unless it owns that landmark (e.g. a House).
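The occupancy rule above can be written as a small predicate. This is an illustrative sketch; `can_occupy` and its arguments are hypothetical and do not correspond to a Foundation function.

```python
def can_occupy(landmark_is_solid, landmark_owner, agent_idx):
    """True if the agent may stand on a cell holding this landmark.

    landmark_owner is None for unowned landmarks (e.g. Water).
    """
    if not landmark_is_solid:
        return True
    # Solid landmarks only admit their owner.
    return landmark_owner is not None and landmark_owner == agent_idx
```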

In [None]:
[k for k in dir(house) if k[0] != "_"]

**Note**: The simulation does *not* instantiate a separate Python instance of a Landmark for each occurrence of a Landmark in the world! Rather, each Landmark class defines the abstract properties of any instance of that Landmark.

```env.world.maps``` keeps track of where all the units of a Landmark are using a 2-dimensional NumPy array, as illustrated above.

## Resources

Resources are another important type of Entity in the world. Resources are semantically different from Landmarks, because Resources can be traded, collected, and converted into other Entities (e.g., Wood and Stone are used to build a House).

In particular, Resources are the entities in the world that an agent can own as part of its **inventory**:

In [None]:
env.get_agent(agent_idx=0).inventory

A Resource has three main attributes: 
- its name, 
- its color (convenient for visualization),
- and whether it's collectible.

For example, we can see that Wood is collectible.

In [None]:
wood = foundation.resources.get("Wood")

wood.name, wood.collectible, wood.color

On the other hand, Coin is *not* collectible (but can be owned).

In [None]:
coin = foundation.resources.get("Coin")

coin.collectible

Note that each collectible Resource (Wood & Stone) has an associated Landmark type (its source block), and both the Resource and its source block show up in the maps, whereas non-collectible Resources (Coin) only exist as part of the inventory.

In other words, **collectible Resources start as part of the spatial world but can be moved into an agent's inventory.**

```env.world.maps``` keeps track of where all the units of a **collectible** Resource are using a 2-dimensional NumPy array, as illustrated above.
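The idea of a collectible Resource moving from the spatial world into an inventory can be sketched as follows. `collect`, `wood_grid`, and `toy_inventory` are hypothetical names, and the grid is a nested list standing in for the NumPy array kept by the maps object.

```python
def collect(resource_grid, inventory, resource_name, row, col):
    """Moves all units of a resource at (row, col) from the map into an inventory."""
    amount = resource_grid[row][col]
    if amount > 0:
        resource_grid[row][col] = 0
        inventory[resource_name] = inventory.get(resource_name, 0) + amount
    return amount


wood_grid = [[0, 1], [0, 0]]
toy_inventory = {"Wood": 0, "Stone": 0, "Coin": 10}
collected = collect(wood_grid, toy_inventory, "Wood", 0, 1)
```

After the call, the unit of Wood no longer appears on the grid and is counted in the inventory instead; Coin is untouched because it never lives on the map.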

## Endogenous Entities

Certain semantic concepts do not have a physical realization, but are important because they determine, e.g., an Agent's utility. The main example is Labor.

The definition of Labor is rather simple: it only defines the name.

In [None]:
labor = foundation.endogenous.get("Labor")

[k for k in dir(labor) if k[0] != "_"]

Endogenous entities, like Resources, can be accumulated, **but their quantities are stored outside of the inventory**. (This is done to make it easier to separate Resources and Endogenous entities when generating observations.)

In [None]:
agent0 = env.get_agent(agent_idx=0)
print(agent0.inventory)
print(agent0.endogenous)

# 4. Agents

```env.world``` also wraps **agent** instances. There will be ```env.n_agents``` "mobile" agents + 1 "planner" agent. Each such agent is represented as a separate Python object:

The ```env.n_agents``` "mobile" agents (representing individual workers in the economy):

In [None]:
env.world.agents

The "planner" agent (representing a Social Planner that sets, for example, tax policy):

In [None]:
env.world.planner

Agents can be easily accessed:

In [None]:
agent0  = env.get_agent(agent_idx=0)   # Mobile agents are numerically indexed
agent1  = env.get_agent(agent_idx=1)
planner = env.get_agent(agent_idx='p') # The planner agent always uses index 'p'

Each agent instance maintains the state of the agent:

In [None]:
agent0.state

In [None]:
agent1.state

In [None]:
planner.state

# 5. Components

Up to this point, we have learned about how the state of the world is represented in the **world** object: with spatial state represented by ```env.world.maps```, and agent states represented in ```env.world.agents``` and ```env.world.planner```.

We have also learned how custom **Scenario** classes define methods for resetting these states and rules for passive dynamics (in our working example, resource regeneration).

**How then do agents actually _interact_ with the environment?**

**Components** are used to flexibly extend the behavior of a Scenario by encapsulating specific interactions/dynamics. They enable a plug-and-play approach to building economic simulations.

This structure also vastly simplifies the process of extending the simulation framework through the addition of new Component classes.

### Let's revisit our working example to better understand how Components work...

... recall the ```'components'``` argument set in the environment configuration we used to build ```env```:

In [None]:
env_config['components']

This argument tells the Scenario class which Component classes to make use of. Notice that ```env``` has created an instance of each such class:

In [None]:
env._components

In [None]:
# which are better accessed via...
build = env.get_component("Build")

```build``` is an instance of the Component class ```Build```

In [None]:
isinstance(build, foundation.components.get("Build"))

All Component classes are subclasses of ```BaseComponent``` (so ```build``` is also an instance of ```BaseComponent```)

In [None]:
from ai_economist.foundation.base.base_component import BaseComponent
isinstance(build, BaseComponent)

**Why this structure?** Building Component classes (such as ```Build```) on top of ```BaseComponent``` enforces consistent semantics for defining the behavior of the Component and allowing a Scenario to make use of it. Each Component class must implement the following abstract methods:

```python
from ai_economist.foundation.base.base_component import BaseComponent, component_registry

@component_registry.add
class EmptyComponent(BaseComponent):
    name = "Empty"
    required_entities = []
    
    def get_n_actions(self, agent_cls_name):
        """Returns the number of actions that agents with type agent_cls_name can take through this component."""
        pass
    
    def get_additional_state_fields(self, agent_cls_name):
        """Returns a dictionary to be added to the state dictionary of agents with type agent_cls_name."""
        pass
    
    def component_step(self):
        """Implements the (passive and active) dynamics that this Component adds to the environment."""
        pass
    
    def generate_observations(self):
        """Yields observations."""
        pass
    
    def generate_masks(self):
        """Specifies which of the Component actions are valid given the current state."""
        pass
```

The expected behaviors of these methods are described extensively in the internal documentation of ```BaseComponent```, where they are defined as abstract methods (see [foundation/base/base_component.py](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/base/base_component.py)).

As an example, the ```Build``` Component class implements the following behavior:
- **get_n_actions**: returns 1 action that mobile agents can take (to build a house).
- **get_additional_state_fields**: returns mobile agents' state dictionary, which includes payment-per-house info.
- **component_step**: For each agent that takes the build action, place an agent-owned house landmark at the agent's location and update its state (remove Stone & Wood used for building, add Coin income, add Labor cost).
- **generate_observations**: Generates observations related to the payment-per-house state info.
- **generate_masks**: Mask the build action for agents that are on non-empty map cells or do not have the resources to build.

Check out [the code for the Build Component class](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components/build.py) to see how this behavior is implemented.

As an additional example, the ```Gather``` Component class implements the following behavior:
- **get_n_actions**: returns 4 actions that mobile agents can take (move up, down, left, and right).
- **get_additional_state_fields**: returns mobile agents' state dictionary, which includes the probability of collecting bonus resources.
- **component_step**: For each agent that takes a move action: update its location; if the new location has a Resource, move it from the spatial world to the agent's inventory; and add the Labor cost(s) associated with moving and collecting.
- **generate_observations**: Generates observations related to the bonus probability state info.
- **generate_masks**: For each agent, mask whichever move actions would move it to a location it is not allowed to occupy.

Check out [the code for the Gather Component class](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components/move.py) to see how this behavior is implemented.
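The movement masking described above can be sketched for a single agent on a small grid. This is a simplified illustration: `MOVES`, `move_mask`, and the `blocked` grid are hypothetical, the action order is an assumption, and a mask value of 1 means the action is allowed.

```python
# Hypothetical action order; Foundation's own ordering may differ.
MOVES = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}


def move_mask(blocked, row, col):
    """Returns {move_name: 1 if allowed else 0} for an agent at (row, col).

    blocked[r][c] is True where the agent is not allowed to stand.
    """
    height, width = len(blocked), len(blocked[0])
    mask = {}
    for name, (d_row, d_col) in MOVES.items():
        new_row, new_col = row + d_row, col + d_col
        in_bounds = 0 <= new_row < height and 0 <= new_col < width
        # Short-circuiting avoids indexing the grid with out-of-bounds coordinates.
        mask[name] = int(in_bounds and not blocked[new_row][new_col])
    return mask


blocked_cells = [
    [False, False, False],
    [False, False, True],
    [False, False, False],
]
corner_mask = move_mask(blocked_cells, 0, 0)
center_mask = move_mask(blocked_cells, 1, 1)
```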

When the agent objects are created, each one registers the actions afforded to it by the ```env``` components. As described: ```Build``` adds 1 action and ```Gather``` adds 4. ```ContinuousDoubleAuction``` (which implements trading) is more complex: it adds several action sets for buying and selling Wood and Stone (each action in a set corresponds to a different price level).

More concretely, the ```ContinuousDoubleAuction``` Component class implements the following behavior:
- **get_n_actions**: (For mobile agents) returns a *pair* of action sets (for buying and selling) for each *collectible* resource; each action set adds M+1 actions, where M is the maximum trading price.
- **get_additional_state_fields**: Doesn't add any state fields.
- **component_step**: For each agent that takes a buy/sell action: create an order in the associated resource market and add a small Labor cost; match orders and execute trades, which moves Coin and the resource between inventories; update the order books, removing expired orders.
- **generate_observations**: For each agent, generates observations related to price levels of past successful trades, current available orders, and the agent's own outstanding orders.
- **generate_masks**: For each agent, mask any buying actions that it does not have enough Coin to fulfill and mask any selling actions that it does not have the resources to fulfill.

Check out [the code for the ContinuousDoubleAuction Component class](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components/continuous_double_auction.py) to see how this behavior is implemented.
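To make the "pair of action sets per collectible resource" idea concrete, here is a minimal sketch of the action-set sizes and the affordability masks. The action-set names (`Buy_Wood`, `Sell_Wood`, ...) and all three helper functions are assumptions made for illustration, and the real Component applies additional checks (for example, on outstanding orders) that are not modeled here.

```python
def auction_action_sets(collectible_resources, max_price):
    """One Buy and one Sell action set per resource, each with max_price + 1 price levels."""
    sets = {}
    for resource in collectible_resources:
        sets[f"Buy_{resource}"] = max_price + 1
        sets[f"Sell_{resource}"] = max_price + 1
    return sets


def buy_mask(coin, max_price):
    """Allows (1) only the bid prices 0..max_price that the agent can afford."""
    return [int(price <= coin) for price in range(max_price + 1)]


def sell_mask(units_owned, max_price):
    """Allows (1) every ask price only if the agent owns at least one unit."""
    return [int(units_owned > 0)] * (max_price + 1)


toy_action_sets = auction_action_sets(["Wood", "Stone"], max_price=10)
toy_buy_mask = buy_mask(coin=3, max_price=10)
```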

After setting up the environment, the agents have registered the following actions:

In [None]:
agent0.action_dim

Right now, the planner has not registered any actions because none of the 3 Components add any planner actions.

In [None]:
planner.action_dim

In particular, we did not include a Taxation Component in ```env_config['components']```, which would introduce actions for the planner.

In [None]:
from copy import deepcopy
tax_config = deepcopy(env_config)
tax_config['components'].append({"PeriodicBracketTax": {}})

tax_env = foundation.make_env_instance(**tax_config)

tax_env._components

In [None]:
# Same as with the other env (PeriodicBracketTax doesn't add actions for this agent type)
tax_env.get_agent(agent_idx=0).action_dim

In [None]:
# Now the planner has actions (PeriodicBracketTax creates an action set for the planner for each tax bracket)
tax_env.get_agent(agent_idx='p').action_dim

# Extending the Economic Simulation

Having introduced the constituent parts of the simulation, we now focus on how to extend it.

# 6. How the Simulation Pieces Interact

Before we can dive into actually writing new code, we still need to understand how the simulation pieces interact to build a coherent environment with a consistent and simple API.

In particular, let's look at how the environment sets itself up when an environment instance is created and what happens under the hood when stepping through the simulation.

The underlying design follows the principle of *encapsulation* by making it easy to create new simulation classes, such as Entities, Scenarios, and Components, without having to re-write existing code.

### Set up

Earlier, when we looked at ```env.world.maps```, we saw that it had already populated a handful of maps: ones for Wood, Stone, their associated SourceBlocks, and Houses.

Also, when we looked at some of the ```env.world.agents```, we saw that their states were already populated, for example with inventories referencing Resources such as Wood, Stone, and Coin.

**Where did that come from?**

Each Scenario and Component must declare the entities that it will interact with in the attribute ```required_entities```. For example:

```python
@scenario_registry.add
class Uniform(BaseEnvironment):
    name = "uniform/simple_wood_and_stone"
    required_entities = ["Stone", "Wood"]
    ...
```

or

```python
@component_registry.add
class Build(BaseComponent):
    name = "Build"
    required_entities = ["Stone", "Wood", "House"]
    ...
```

When ```env``` gets created (as part of ```BaseEnvironment.__init__```), the following happens:

1. It looks at the ```required_entities``` of the Scenario class and the included Component classes and it determines which Resources, Landmarks, and Endogenous entities need to be part of the game.
    - By default, Coin and Labor are always included, even if they are not mentioned in the Scenario's or Components' ```required_entities```.


2. It constructs a world object ```env.world```, which involves creating partially-initialized agent objects for each agent in the environment and creating the maps object ```env.world.maps```. 
    - The world object is told which Resources and Landmarks to include and this gets passed to the maps object. The maps object creates a map for these and uses their class properties to preserve semantics: for example, the maps object maintains a separate map indicating ownership for ownable Landmarks, like Houses.
    - Agent objects are also told which Resources and Endogenous entities are in use. The ```inventory```, ```escrow```, and ```endogenous``` portions of their state dictionaries are configured accordingly.


3. It creates an instance of each of the included Component classes (using, for each, any paired keyword arguments).


4. It finishes initializing the agent objects by allowing each one to register the actions defined by the different component objects.
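Step 1 of the set-up above (collecting the entities that need to exist) can be sketched as a simple set union. `collect_required_entities` and the two toy classes are hypothetical; only the `required_entities` attribute and the always-included Coin and Labor come from the description above.

```python
def collect_required_entities(scenario_cls, component_classes):
    """Union of all declared entities, plus the always-included Coin and Labor."""
    required = {"Coin", "Labor"}
    required.update(scenario_cls.required_entities)
    for component_cls in component_classes:
        required.update(component_cls.required_entities)
    return sorted(required)


class ToyScenarioSpec:
    required_entities = ["Wood", "Stone"]


class ToyBuildSpec:
    required_entities = ["Wood", "Stone", "House"]


toy_entities = collect_required_entities(ToyScenarioSpec, [ToyBuildSpec])
```

Because the result is a union, an entity declared by several classes is still created only once.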

### Stepping

Outside of initialization, the logic for integrating Scenarios with Components is fairly straightforward.

When calling ```env.step(actions)``` (the main method for interacting with the environment), the following happens:
1. The environment interprets ```actions```, which updates each agent object's ```agent.action``` to represent which action the agent is taking for each of the action sets (0 denotes NO-OP, meaning no action).


2. The environment performs the ```component_step``` method for each of the component objects. Inside ```component_step```, agent method ```agent.get_component_action(self.name, [action_set_name])``` can be used to query the action(s) ```agent``` chose.


3. The environment performs the ```scenario_step``` of its own Scenario class.


4. For each agent, ```agent.action``` is reset to all NO-OPs.


5. The environment collects observations from its own ```generate_observations``` method and those of each of the component objects, and it combines and formats them into a single observations dictionary.


6. The environment collects action masks using the ```generate_masks``` method of each of the component objects, and it combines and formats them before packaging them as ```'action_masks'``` in each agent's observations.  
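The ordering of these six steps can be sketched with a toy loop that records which hook runs when. Everything here (`ToyComponent`, `toy_step`, the log format) is hypothetical and greatly simplified; it only mirrors the call order described above, not the actual `BaseEnvironment.step` code.

```python
class ToyComponent:
    """Hypothetical component that just records when its hooks are called."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def component_step(self):
        self.log.append(f"{self.name}.component_step")

    def generate_observations(self):
        self.log.append(f"{self.name}.generate_observations")
        return {}

    def generate_masks(self):
        self.log.append(f"{self.name}.generate_masks")
        return {}


def toy_step(agent_actions, new_actions, components, log):
    """Mirrors the step order described above, using plain dictionaries."""
    agent_actions.update(new_actions)              # 1. interpret actions
    for component in components:                   # 2. component dynamics
        component.component_step()
    log.append("scenario_step")                    # 3. passive scenario dynamics
    for idx in agent_actions:                      # 4. reset actions to NO-OP (0)
        agent_actions[idx] = 0
    log.append("scenario.generate_observations")   # 5. scenario, then component obs
    for component in components:
        component.generate_observations()
    for component in components:                   # 6. action masks
        component.generate_masks()


call_log = []
toy_agent_actions = {"0": 0, "1": 0}
toy_components = [ToyComponent("Build", call_log), ToyComponent("Gather", call_log)]
toy_step(toy_agent_actions, {"0": 1}, toy_components, call_log)
```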
  
  


A similar logic is applied for ```env.reset()```, in which:
1. ```env.reset_starting_layout``` and ```env.reset_agent_states``` (which are defined by the Scenario class) are first called.


2. Each component object's ```reset``` method is called.


3. Finally, ```env.additional_reset_steps``` is called (which is also defined by the Scenario class).
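The reset order can be sketched in the same spirit. `toy_reset` and the dictionaries of callables are hypothetical; only the hook names and their order are taken from the list above.

```python
def toy_reset(scenario_hooks, component_resets):
    """Calls the reset hooks in the order described above and returns that order."""
    order = []
    for hook_name in ("reset_starting_layout", "reset_agent_states"):
        scenario_hooks[hook_name]()
        order.append(hook_name)
    for component_name, reset_fn in component_resets.items():
        reset_fn()
        order.append(f"{component_name}.reset")
    scenario_hooks["additional_reset_steps"]()
    order.append("additional_reset_steps")
    return order


toy_state = {"houses_built": 3}
toy_hooks = {
    "reset_starting_layout": lambda: None,
    "reset_agent_states": lambda: None,
    "additional_reset_steps": lambda: None,
}
toy_component_resets = {"Build": lambda: toy_state.update(houses_built=0)}
reset_order = toy_reset(toy_hooks, toy_component_resets)
```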

# 7. Exercise: Creating a New BuyWidgetFromVirtualStore Component

Let's put all these concepts together and introduce a new Resource entity, a Widget, and implement a simple Component in which agents can buy Widgets from an external source, like an online store, for a fixed price of 5 Coin. At each step, the store adds a single Widget to its inventory with some probability.

### Adding "Widget" as a new Resource entity

In order to add a new Resource entity that other Scenario and Component classes can reference, we simply need to define a new Resource class and put it in the appropriate registry. Let's do this directly in code:

In [None]:
from ai_economist.foundation.entities.resources import Resource, resource_registry

@resource_registry.add
class Widget(Resource):
    name = "Widget"
    color = [1, 1, 1]
    collectible = False # <--- Goes in agent inventory, but not in the world

That's it. It's that easy.

### Component Initialization

Let's start with the initialization of the Component. We'll define a customizable ```widget_refresh_rate```, which determines how likely the store is to add a new Widget unit to its inventory each step. Additionally, we'll use a fixed price of 5 Coin per Widget, and initialize the store's inventory to 0.

```python
@component_registry.add
class BuyWidgetFromVirtualStore(BaseComponent):
    name = "BuyWidgetFromVirtualStore"
    required_entities = ["Coin", "Widget"]  # <--- We can now look up "Widget" in the resource registry
    agent_subclasses = ["BasicMobileAgent"]

    def __init__(
        self,
        *base_component_args,
        widget_refresh_rate=0.1,
        **base_component_kwargs
    ):
        super().__init__(*base_component_args, **base_component_kwargs)
        self.widget_refresh_rate = widget_refresh_rate
        self.available_widget_units = 0
        self.widget_price = 5
```        

Note that we define the Component's name as a string ```BuyWidgetFromVirtualStore```, and decorate the Component with the ```add``` method from the component registry. This allows us to create the Component by using the ```get``` method on ```component_registry```.

We also declare the ```required_entities``` as ```Coin``` and ```Widget```. This instructs ```BaseEnvironment``` to include these entity types in the environment when ```BuyWidgetFromVirtualStore``` is used as a component.

### Reset

Sometimes, a Component wants to expose part of the state it manages as a part of the agents' state. ```get_additional_state_fields``` is used to set that up and reset the associated state when a new episode starts. Here, we won't use that functionality so we return an empty dictionary, which is interpreted as *no additional state fields*.

```python
def get_additional_state_fields(self, agent_cls_name):
    return {}
```

When a new episode starts (whenever the ```BaseEnvironment``` resets), the store should have 0 Widgets. We can use ```additional_reset_steps``` to implement this behavior.

```python    
def additional_reset_steps(self):
    self.available_widget_units = 0
```        

### Actions

Each agent can choose to *buy* a Widget or not each step. Hence, we add an extra action to the action space of a ```BasicMobileAgent```. Other agent types (like planners) do not get an extra action: if ```get_n_actions``` returns ```None```, it is interpreted as *no action added*.  **Note**: the simulation framework only supports discrete action types for now.

```python
def get_n_actions(self, agent_cls_name):
    # This component adds 1 binary action that mobile agents can take: buy widget (or not).
    if agent_cls_name == "BasicMobileAgent":
        return 1  # Buy or not.

    return None
```

### Action Masks

Whether or not an agent can buy depends on:

- Does the agent have at least ```self.widget_price``` Coin? We check this by looking at ```agent.state["inventory"]["Coin"]```.
- Does the store have at least 1 Widget in store?

Because a BasicMobileAgent has 1 extra discrete action, the mask is simply a single bit, stored in a NumPy array.

**Note: ```world.agents``` loops over ```BasicMobileAgent```s only! It does not include the planner agent!**

```python
def generate_masks(self, completions=0):
    masks = {}
    # Mobile agents' buy action is masked if they cannot afford a Widget
    # with their current coin or if no widgets are available.
    for agent in self.world.agents:
        masks[agent.idx] = np.array([
            agent.state["inventory"]["Coin"] >= self.widget_price and self.available_widget_units > 0
        ])

    return masks
```
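
The mask condition can be checked in isolation. The sketch below uses a hypothetical helper, ```can_buy```, that mirrors the boolean expression in ```generate_masks``` without needing an environment. A result of 1 means the buy action is allowed and 0 means it is masked out.

```python
def can_buy(coin, widget_price, available_widget_units):
    """Return True when the buy action should be allowed (unmasked)."""
    return coin >= widget_price and available_widget_units > 0


widget_price = 5
cases = [
    (10, 2),  # enough Coin, store has stock
    (10, 0),  # enough Coin, store is empty
    (3, 2),   # too little Coin, store has stock
    (5, 1),   # exactly the price, one unit left
]
mask_bits = [int(can_buy(coin, widget_price, stock)) for coin, stock in cases]
print(mask_bits)  # [1, 0, 0, 1]
```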

### Step

The main logic of this Component is defined in ```component_step```. Two pieces of logic are defined: 

1. The store randomly adds a unit of Widget to its inventory. 
2. Agents' buy orders are executed in random order (to break ties if, say, there's only 1 Widget but 2 agents try to buy it).

```python
def component_step(self):
    # Maybe add a Widget to store's inventory.
    if random.random() < self.widget_refresh_rate: 
        self.available_widget_units += 1

    # Agents can each buy 1 Widget, in random order.
    for agent in self.world.get_random_order_agents():

        action = agent.get_component_action(self.name)
        
        if action == 0: # NO-OP. Agent is not interacting with this component.
            continue

        if action == 1: # Agent wants to buy. Execute a purchase if possible.
            if self.available_widget_units > 0 and agent.state["inventory"]["Coin"] >= self.widget_price: 
                # Remove the purchase price from the agent's inventory
                agent.state["inventory"]["Coin"] -= self.widget_price
                # Add a Widget to the agent's inventory
                agent.state["inventory"]["Widget"] += 1
                # Remove the Widget from the market
                self.available_widget_units -= 1
                
        else: # We only declared 1 action for this agent type, so action > 1 is an error.
            raise ValueError
```

**Note how the step logic supports action=0 and action=1.** action=0 denotes ``NO-OP`` (no operation). **All Components are expected to obey this semantic.** The action added by this Component starts (and in this case ends) with action=1.
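
The tie-breaking behavior of the step logic can be sketched without an environment. In the toy version below, agents are plain dictionaries and ```run_store_step``` is a hypothetical helper. Both are stand-ins for Foundation's agent objects and ```component_step```, not the real API. Two agents try to buy the only Widget in stock, and the shuffled order decides who gets it.

```python
import random


def run_store_step(agents, stock, price, rng):
    """Process buy requests in a shuffled order; return the remaining stock.

    `agents` is a list of dicts with "coin", "widget", and "action" keys.
    These dicts are toy stand-ins for Foundation agent objects.
    """
    order = list(agents)
    rng.shuffle(order)  # random order decides who is served first
    for agent in order:
        if agent["action"] == 0:  # NO-OP: agent is not using the store
            continue
        if agent["action"] != 1:  # only one action was declared
            raise ValueError("unexpected action {}".format(agent["action"]))
        if stock > 0 and agent["coin"] >= price:
            agent["coin"] -= price
            agent["widget"] += 1
            stock -= 1
    return stock


rng = random.Random(0)
agents = [
    {"coin": 10, "widget": 0, "action": 1},
    {"coin": 10, "widget": 0, "action": 1},
]
remaining = run_store_step(agents, stock=1, price=5, rng=rng)
```

Whichever agent is served first, exactly one purchase goes through: the store ends with 0 units, one agent holds 1 Widget, and the agents' combined Coin drops by the price of one Widget.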

### Observations

The store is quite transparent: each ```BasicMobileAgent``` can observe how likely it is that the store will add a new Widget unit, what the store's current inventory looks like, and what the price is.

The observation that a Component generates should be structured as a dictionary, keyed by each agent's ```idx```, with each value being a dictionary of that agent's observations.

```python
def generate_observations(self):
    obs_dict = dict()
    for agent in self.world.agents:
        obs_dict[agent.idx] = {
            "widget_refresh_rate": self.widget_refresh_rate,
            "available_widget_units": self.available_widget_units,
            "widget_price": self.widget_price
        }

    return obs_dict
```

### Final Component

Let's combine this into actual code so we can create the new Component class and have it available in the component registry:

In [None]:
import random  # used by component_step

import numpy as np
from ai_economist.foundation.base.base_component import BaseComponent, component_registry

@component_registry.add
class BuyWidgetFromVirtualStore(BaseComponent):
    name = "BuyWidgetFromVirtualStore"
    required_entities = ["Coin", "Widget"]  # <--- We can now look up "Widget" in the resource registry
    agent_subclasses = ["BasicMobileAgent"]

    def __init__(
        self,
        *base_component_args,
        widget_refresh_rate=0.1,
        **base_component_kwargs
    ):
        super().__init__(*base_component_args, **base_component_kwargs)
        self.widget_refresh_rate = widget_refresh_rate
        self.available_widget_units = 0
        self.widget_price = 5

    def get_additional_state_fields(self, agent_cls_name):
        return {}

    def additional_reset_steps(self):
        self.available_widget_units = 0

    def get_n_actions(self, agent_cls_name):
        if agent_cls_name == "BasicMobileAgent":
            return 1
        return None

    def generate_masks(self, completions=0):
        masks = {}
        for agent in self.world.agents:
            masks[agent.idx] = np.array([
                agent.state["inventory"]["Coin"] >= self.widget_price and self.available_widget_units > 0
            ])

        return masks

    def component_step(self):
        if random.random() < self.widget_refresh_rate: 
            self.available_widget_units += 1

        for agent in self.world.get_random_order_agents():

            action = agent.get_component_action(self.name)

            if action == 0: # NO-OP. Agent is not interacting with this component.
                continue

            if action == 1: # Agent wants to buy. Execute a purchase if possible.
                if self.available_widget_units > 0 and agent.state["inventory"]["Coin"] >= self.widget_price: 
                    agent.state["inventory"]["Coin"] -= self.widget_price
                    agent.state["inventory"]["Widget"] += 1
                    self.available_widget_units -= 1

            else: # We only declared 1 action for this agent type, so action > 1 is an error.
                raise ValueError

    def generate_observations(self):
        obs_dict = dict()
        for agent in self.world.agents:
            obs_dict[agent.idx] = {
                "widget_refresh_rate": self.widget_refresh_rate,
                "available_widget_units": self.available_widget_units,
                "widget_price": self.widget_price
            }

        return obs_dict

### Create a new environment instance that uses the new Component 

To add the ```BuyWidgetFromVirtualStore``` to the Scenario, modify the ```env_config``` as follows:

In [None]:
new_env_config = deepcopy(env_config)

# Compared to env_config, new_env_config simply adds our new Component
new_env_config['components'] = [
    # (1) Building houses
    {'Build': {}},
    # (2) Trading collectible resources
    {'ContinuousDoubleAuction': {'max_num_orders': 5}},
    # (3) Movement and resource collection
    {'Gather': {}},
    # (4) Let each mobile agent buy widgets from a virtual store.
    {'BuyWidgetFromVirtualStore': {'widget_refresh_rate': 0.1}},  # <--- This.
]

In [None]:
# And there you have it!
new_env = foundation.make_env_instance(**new_env_config)
obs = new_env.reset()

Let's compare ```env``` and ```new_env```!

In [None]:
env.resources

In [None]:
new_env.resources

Notice how ```new_env``` now includes ```'Widget'``` as a Resource in the environment. This is because ```BuyWidgetFromVirtualStore.required_entities``` includes ```'Widget'```!

This difference also shows up in the agent states -- specifically, the inventory:

In [None]:
old_agent0 = env.get_agent(agent_idx=0)
new_agent0 = new_env.get_agent(agent_idx=0)

In [None]:
# Inventory includes Coin, Stone, and Wood...
old_agent0.state

In [None]:
# Inventory includes Coin, Stone, Wood and Widget!
new_agent0.state

Mobile agents in ```new_env``` should also have an extra action set:

In [None]:
old_agent0.action_dim

In [None]:
new_agent0.action_dim

In [None]:
new_agent0.get_component_action('BuyWidgetFromVirtualStore')

And, with that, ```BuyWidgetFromVirtualStore``` is a brand new Component, ready to go. Pretty cool, huh?!

If you're interested in learning more about how to extend the simulation by adding new classes, we encourage you to check out the existing implementations provided in the code and to refer back to the documentation in the base classes on which everything is built!

### One last thing (because it's cool)...
Once we included our new Component, Widgets automatically became part of the agents' inventory space. That's because we defined the Widget entity as a Resource class.

However, Widgets are not part of the spatial map. That's because we defined ```Widget.collectible = False```.

In [None]:
# No Widget map:
new_env.world.maps.keys()

Let's re-define the Widget Resource class, but with ```Widget.collectible = True```, which will give it the same semantics as Wood and Stone.

In [None]:
from ai_economist.foundation.entities.landmarks import Landmark, landmark_registry

@resource_registry.add
class Widget(Resource):
    name = "Widget"
    color = [1, 1, 1]
    collectible = True # <--- Goes in agent inventory, AND in the world
    
# Since we're doing this in a notebook, we need to manually add a Source Block Landmark for Widgets.
#     If we defined the Widget class in /foundation/entities/resources.py,
#     this class construction would happen automatically.
@landmark_registry.add
class SourceBlock(Landmark):
    """Special Landmark for generating collectible resources. Not ownable. Not solid."""

    name = "{}SourceBlock".format(Widget.name)
    color = np.array(Widget.color)
    ownable = False
    solid = False

Now that we've given Widget new semantics, let's make another environment object and look at the map keys:

In [None]:
new_env_with_collectible_widgets = foundation.make_env_instance(**new_env_config)

new_env_with_collectible_widgets.world.maps.keys()

Cool! Spatial maps for ```'Widget'``` and ```'WidgetSourceBlock'``` are automatically created because we've defined Widget as something that should be collectible from the spatial world. 

The Component ```BuyWidgetFromVirtualStore``` will still work just the same -- no need to re-write that. However, the two new maps will always be empty because our Scenario class only handles populating/regenerating Wood and Stone.

That's fine. After all, this was just to demonstrate the plug-and-play design of the simulation framework!

# 8. Helpful Tips

### Components can be passive.

If you wish to introduce a dynamic to the environment that doesn't depend on agent actions, you can do so through a Component class. Components don't need to add actions and the ```component_step``` doesn't need to depend on actions. An example is found in the [WealthRedistribution](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components/redistribution.py) class.

### Components can be stateful.

In the actual environment, components are Python objects, so you might as well use them that way. A simple demonstration of this is actually found in the example Component above (```BuyWidgetFromVirtualStore```). Notice how the class has an ```available_widget_units``` attribute, which it creates during ```__init__```, updates during ```component_step```, and resets in ```additional_reset_steps```.

Conceptually, ```BuyWidgetFromVirtualStore.available_widget_units``` is just as much a part of the environment state as, say, ```new_agent0.state```. You should feel free to take advantage of the object-oriented design. Just make sure to properly reset internally managed states in ```additional_reset_steps```!
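
The following sketch shows why the reset hook matters. ```ToyWidgetStore``` is a hypothetical class that borrows the method names used above but does not subclass ```BaseComponent```. The restock is made deterministic so the numbers are easy to follow.

```python
class ToyWidgetStore:
    """Toy stateful component (hypothetical; does not subclass BaseComponent)."""

    def __init__(self):
        self.available_widget_units = 0

    def component_step(self):
        # Deterministic restock so the example is easy to follow.
        self.available_widget_units += 1

    def additional_reset_steps(self):
        # Restore the internally managed state for a fresh episode.
        self.available_widget_units = 0


store = ToyWidgetStore()

# Episode 1: three steps leave three units in stock.
for _ in range(3):
    store.component_step()
stock_after_episode_1 = store.available_widget_units

# Without a reset, episode 2 would start with this leftover stock.
# Calling the reset hook restores the initial state.
store.additional_reset_steps()
stock_at_start_of_episode_2 = store.available_widget_units
```

If ```additional_reset_steps``` set a differently named attribute by mistake, the second episode would silently start with 3 units, which is the kind of bug that is easy to miss.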

### Components can add multiple action sets per agent.

Components can create many sets of actions. For example (for mobile agents):
- ```BuyWidgetFromVirtualStore``` adds 1 action set with only 1 action. 
- ```Gather``` adds 1 action set with 4 actions.
- ```ContinuousDoubleAuction``` adds 2\*N action sets each with M+1 actions, where N is the number of collectible Resources and M is the maximum buying/selling price.

In that last example, ```ContinuousDoubleAuction``` (which implements trading), the structure of the action sets it creates depends on the choice of maximum buying/selling price (an argument to the class) as well as the collectible Resources in the environment.

That might seem complicated but it's simpler than it sounds. Check out [the actual code](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components/continuous_double_auction.py) for a useful example. In particular, look at ```__init__``` and ```get_n_actions``` to see how the action sets are set up and look at ```component_step``` to see how the step method makes use of them.
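
As a quick check of the 2\*N and M+1 arithmetic, here is a sketch that builds a list of (name, number of actions) pairs. The helper ```toy_auction_action_sets```, the ```Buy_```/```Sell_``` naming, and the example maximum price of 10 are assumptions made for illustration. Refer to the linked source for what the real class does.

```python
def toy_auction_action_sets(collectible_resources, max_price):
    """Return (name, n_actions) pairs: one buy and one sell set per resource.

    Each set gets max_price + 1 actions, matching the M+1 count in the text.
    The "Buy_"/"Sell_" names are illustrative, not taken from Foundation.
    """
    action_sets = []
    for resource in collectible_resources:
        action_sets.append(("Buy_{}".format(resource), max_price + 1))
        action_sets.append(("Sell_{}".format(resource), max_price + 1))
    return action_sets


sets = toy_auction_action_sets(["Wood", "Stone"], max_price=10)
print(len(sets))  # 4, i.e. 2 * N with N = 2
print(sets[0])    # ('Buy_Wood', 11)
```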

### Only *you* can ensure NO-OP semantics.

Smokey the Bear famously said, "Only *you* can prevent forest fires." While that hazard doesn't apply here, the sense of responsibility is just the same:  
**If ```agent.get_component_action(...)``` returns 0, that means NO-OP!**

If you look through [the implemented Components](https://github.com/salesforce/ai-economist/blob/master/ai_economist/foundation/components), you'll notice throughout the ```component_step``` methods something along the lines of:
```python
for agent in self.world.agents:
    action = agent.get_component_action(self.name)

    # NO-OP! Agent is NOT interacting with this component.
    if action == 0: 
        continue # Move on to the next agent

    # Agent is interacting with this component
    else:
        ... # Do something
```

Referring back to our ```BuyWidgetFromVirtualStore``` example, notice how it adds 1 action for mobile agents. When an agent actually takes that action we would see ```agent.get_component_action('BuyWidgetFromVirtualStore')``` returns 1. Not 0! If the agent chose an action belonging to another component (say, took a movement action), then we would see ```agent.get_component_action('BuyWidgetFromVirtualStore')``` returns 0, and it would be up to ```BuyWidgetFromVirtualStore``` to treat that like the NO-OP that it is.

**When you implement a new Component class, it is up to you to ensure that NO-OP semantics are preserved!**

### Environments come with a couple tools for logging.

There are 2 main types of logs that ```BaseEnvironment``` supports: metrics and dense logs.

**Metrics** are used to summarize an episode. Scenarios and Components each have a method for producing a metrics dictionary, which gives you a readout of what happened. At the end of the episode, any such metrics are combined into a single metrics dictionary which is accessible through ```env.metrics```.

**Dense logs** offer a timestep-by-timestep breakdown of how an episode played out. By default, ```BaseEnvironment``` includes world state, agent state, and action info for each timestep. Components can contribute their own dense logs, which get added to the final log at the end of the episode.

Because they can be time consuming to create, dense logs are not (by default) generated during every episode. You can use the ```BaseEnvironment``` argument ```dense_log_frequency``` to set how often they are created. If, for example, you use ```dense_log_frequency=20```, then the environment will create dense logs during episodes where the number of total episode completions is a multiple of 20 (that is, every 20 episodes). If you don't want to wait, you can use ```env.reset(force_dense_logging=True)``` to tell the environment to create a dense log for the upcoming episode.
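
The scheduling rule described above can be sketched as a small predicate. ```creates_dense_log``` is a hypothetical helper, not part of ```BaseEnvironment```, and the handling of an unset frequency is an assumption. It only illustrates the "multiple of ```dense_log_frequency```" rule and the ```force_dense_logging``` override.

```python
def creates_dense_log(completions, dense_log_frequency, force_dense_logging=False):
    """Toy rule: log when the completed-episode count is a multiple of the frequency."""
    if force_dense_logging:
        return True
    if not dense_log_frequency:
        return False  # assumption: no frequency set means no periodic dense logs
    return completions % dense_log_frequency == 0


logged = [c for c in range(61) if creates_dense_log(c, 20)]
print(logged)  # [0, 20, 40, 60]
```

Under this toy rule, 0 completions counts as a multiple of 20, so the very first episode would be logged as well.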

### Have fun!
And congratulate yourself on making it to the end of the advanced tutorial :)

If you really want to go for the extra credit, check out [optimal_taxation_theory_and_simulation.ipynb](https://github.com/salesforce/ai-economist/blob/master/tutorials/optimal_taxation_theory_and_simulation.ipynb), our final tutorial which walks through how Foundation is used to study the problem of optimal taxation!