# Quick Start

Below is a simple demo of interaction with the environment.

In [1]:
from maro.simulator import Env
from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent

env = Env(scenario="citi_bike", topology="ny201912", start_tick=0, durations=1440, snapshot_resolution=30)

is_done: bool = False
reward: int = None
decision_event: DecisionEvent = None

while not is_done:
    action: Action = None
    reward, decision_event, is_done = env.step(action)

# Environment of the bike repositioning

To initialize an environment, you need to specify the values of several parameters:
- **scenario**: The target scenario of this Env. "citi_bike" denotes the bike repositioning scenario.
- **topology**: The target topology of this Env.
   + MARO ships with some predefined topologies that you can use directly, as in the demo.
   + You can also define your own topologies by following the guidance in the [doc](docs/customization/new_topology.rst).
- **start_tick**: The start tick of this Env. 1 tick corresponds to 1 minute in citi_bike.
   + In the demo above, *start_tick=0* indicates that the simulation starts from the beginning of the given topology.
- **durations**: The duration of this Env, in ticks (minutes).
   + In the demo above, *durations=1440* indicates a simulation length of 1 day (24h * 60min/h).
- **snapshot_resolution**: The time granularity at which snapshots of the environment are taken, in ticks (minutes).
   + In the demo above, *snapshot_resolution=30* indicates that a snapshot will be created and saved every 30 minutes during the simulation.
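
The arithmetic behind these parameters can be sanity-checked with a small sketch. Both helpers are hypothetical and not part of MARO. The snapshot count assumes one snapshot per started `snapshot_resolution` window, and the clock conversion assumes that tick 0 falls on midnight.

```python
import math


def expected_snapshot_count(durations: int, snapshot_resolution: int) -> int:
    """Hypothetical helper: snapshots taken, assuming one per started resolution window."""
    return math.ceil(durations / snapshot_resolution)


def tick_to_clock(tick: int) -> str:
    """Hypothetical helper: format a tick as HH:MM, given 1 tick = 1 minute in citi_bike."""
    hours, minutes = divmod(tick % 1440, 60)
    return f"{hours:02d}:{minutes:02d}"


print(expected_snapshot_count(1440, 30))     # 48
print(tick_to_clock(0), tick_to_clock(420))  # 00:00 07:00
```

With the demo settings (*durations=1440*, *snapshot_resolution=30*) this gives 48 snapshots for the simulated day.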

You can get all available scenarios and topologies by calling:

In [2]:
from maro.simulator.utils import get_available_envs

get_available_envs()    # TODO: specify the scenario

[{'scenario': 'citi_bike', 'topology': 'ny201912'},
 {'scenario': 'citi_bike', 'topology': 'train'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.0'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.3'},
 {'scenario': 'ecr', 'topology': '5p_ssddd_l0.6'},
 {'scenario': 'ecr', 'topology': '5p_ssddd_l0.5'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.4'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.5'},
 {'scenario': 'ecr', 'topology': '5p_ssddd_l0.2'},
 {'scenario': 'ecr', 'topology': '22p_global_trade_l0.1'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.6'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.8'},
 {'scenario': 'ecr', 'topology': '22p_global_trade_l0.3'},
 {'scenario': 'ecr', 'topology': '22p_global_trade_l0.4'},
 {'scenario': 'ecr', 'topology': '5p_ssddd_l0.8'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.2'},
 {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.7'},
 {'scenario': 'ecr', 'topology': '22p_global_trade_l0.2'},
 {'scenario': 'ecr', 'topology': '5p_ssddd_
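
Because the result is a plain list of dictionaries with `scenario` and `topology` keys, it can be filtered with ordinary Python. A minimal sketch, using a few entries copied from the output above; `topologies_for` is a hypothetical helper, not a MARO function:

```python
from typing import Dict, List

# A few entries copied from the output above. In practice this would be the
# list returned by get_available_envs().
available_envs: List[Dict[str, str]] = [
    {"scenario": "citi_bike", "topology": "ny201912"},
    {"scenario": "citi_bike", "topology": "train"},
    {"scenario": "ecr", "topology": "6p_sssbdd_l0.0"},
]


def topologies_for(envs: List[Dict[str, str]], scenario: str) -> List[str]:
    """Hypothetical helper: keep only the topology names of one scenario."""
    return [item["topology"] for item in envs if item["scenario"] == scenario]


print(topologies_for(available_envs, "citi_bike"))  # ['ny201912', 'train']
```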

Once you have created an instance of the environment, you can easily access its real-time information, for example:

In [3]:
from maro.backends.frame import SnapshotList
from maro.simulator import Env
from pprint import pprint
from typing import List


# Initialize an Env for citi_bike scenario
env = Env(scenario="citi_bike", topology="ny201912", start_tick=0, durations=1440, snapshot_resolution=30)

# The current tick
tick: int = env.tick
print(f"The current tick: {tick}.")

# The current frame index, which indicates the index of current frame in the snapshot-list
frame_index: int = env.frame_index
print(f"The current frame index: {frame_index}.")

# The agent index list in the environment
agent_idx_list: List[int] = env.agent_idx_list
print(f"There are {len(agent_idx_list)} agents in this Env.")

# The whole snapshot-list of the environment, snapshots are taken in the granularity of the given snapshot_resolution
# The example of how to use the snapshot will be shown later
snapshot_list: SnapshotList = env.snapshot_list
print(f"There will be {len(snapshot_list)} snapshots in total.")

# The summary info of the environment
summary: dict = env.summary
print(f"\nEnv Summary:")
pprint(summary, depth=3)

print(f"\nEnv Summary - matrices:")
pprint(summary['node_detail']['matrices'])

print(f"\nEnv Summary - stations:")
pprint(summary['node_detail']['stations'])

The current tick: 0.
The current frame index: 0.
There are 528 agents in this Env.
There will be 48 snapshots in total.

Env Summary:
{'node_detail': {'matrices': {'attributes': {...}, 'number': 1},
                 'stations': {'attributes': {...}, 'number': 528}},
 'node_mapping': {}}

Env Summary - matrices:
{'attributes': {'distance_adj': {'slots': 278784, 'type': 'f'},
                'trips_adj': {'slots': 278784, 'type': 'i'}},
 'number': 1}

Env Summary - stations:
{'attributes': {'bikes': {'slots': 1, 'type': 'i'},
                'capacity': {'slots': 1, 'type': 'i'},
                'extra_cost': {'slots': 1, 'type': 'i'},
                'failed_return': {'slots': 1, 'type': 'i'},
                'fulfillment': {'slots': 1, 'type': 'i'},
                'holiday': {'slots': 1, 'type': 'i2'},
                'shortage': {'slots': 1, 'type': 'i'},
                'temperature': {'slots': 1, 'type': 'i2'},
                'transfer_cost': {'slots': 1, 'type': 'i'},
           

# Interaction with the environment

Before starting interaction with the environment, we need to know **DecisionEvent** and **Action** first.

## DecisionEvent

Whenever the environment needs the agent's response to advance the simulation, it throws a **DecisionEvent**. In the citi_bike scenario, each DecisionEvent carries the following information:
- **station_idx**: the id of the station/agent that needs to respond to the environment
- **tick**: the corresponding tick
- **frame_index**: the corresponding frame index, that is the index of the corresponding snapshot in the snapshot list
- **type**: the decision type of this decision event. In citi_bike scenario, there are 2 types:
   + **Supply**: There are too many bikes in the corresponding station, so it is better to reposition some of them to other stations.
   + **Demand**: There are not enough bikes in the corresponding station, so it is better to reposition bikes from other stations.
- **action_scope**: a dictionary of valid action items.
   + The key of the item indicates the station/agent id;
   + The meaning of the value differs for different decision type:
      * If the decision type is Supply, the value of the station itself means its bike inventory at that moment, while the value of other target stations means the number of their empty docks;
      * If the decision type is Demand, the value of the station itself means the number of its empty docks, while the value of other target stations means their bike inventory.
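
To make the meaning of **action_scope** concrete, here is a minimal sketch that uses a plain dictionary in place of a real `decision_event.action_scope`. The station ids and numbers are made up for illustration, and `split_action_scope` is a hypothetical helper, not part of MARO.

```python
from typing import Dict, List, Tuple


def split_action_scope(
    action_scope: Dict[int, int], station_idx: int
) -> Tuple[int, List[Tuple[int, int]]]:
    """Hypothetical helper: separate the deciding station's value from the other stations."""
    own_value = action_scope[station_idx]
    others = [(idx, value) for idx, value in action_scope.items() if idx != station_idx]
    return own_value, others


# Made-up Supply event: station 3 holds 12 bikes, while stations 7 and 9
# report 4 and 20 empty docks respectively.
supply_scope = {3: 12, 7: 4, 9: 20}
inventory, targets = split_action_scope(supply_scope, station_idx=3)

# The largest transfer to each target is bounded by both sides of the move.
max_transfer = {idx: min(inventory, docks) for idx, docks in targets}
print(inventory, max_transfer)  # 12 {7: 4, 9: 12}
```

For a Demand event the same split applies, with the roles swapped: the station's own value is its number of empty docks and the other values are bike inventories.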

## Action

Once we get a **DecisionEvent** from the environment, we should respond with an **Action**. A valid response is one of:
- None, which means do nothing.
- A valid Action instance, including:
   + **from_station_idx**: int, the id of the source station of the bike transportation
   + **to_station_idx**: int, the id of the destination station of the bike transportation
   + **number**: int, the quantity of the bike transportation
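
The sketch below shows one way to keep the requested quantity within what both stations can handle before building a response. `TransferRequest` is a hypothetical stand-in that only mirrors the three Action fields listed above, so the example runs without MARO. Returning None for an empty transfer is an assumption of this sketch, based on None meaning "do nothing".

```python
from typing import NamedTuple, Optional


class TransferRequest(NamedTuple):
    """Hypothetical stand-in mirroring the three Action fields listed above."""
    from_station_idx: int
    to_station_idx: int
    number: int


def make_transfer(
    from_station_idx: int,
    to_station_idx: int,
    requested: int,
    source_bikes: int,
    target_docks: int,
) -> Optional[TransferRequest]:
    """Clamp the requested quantity to the source inventory and the target's empty docks."""
    number = max(0, min(requested, source_bikes, target_docks))
    if number == 0:
        return None  # nothing to move, so respond with "do nothing"
    return TransferRequest(from_station_idx, to_station_idx, number)


print(make_transfer(3, 7, requested=10, source_bikes=12, target_docks=4))
print(make_transfer(3, 7, requested=5, source_bikes=0, target_docks=4))
```

In real code the same clamped values would be passed to `Action(from_station_idx=..., to_station_idx=..., number=...)`.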

## Generate random actions based on the DecisionEvent

The demo code in the Quick Start part showed an interaction mode that does nothing (responding with a None action). Here we read the detailed information in the DecisionEvent and generate random actions based on it.

In [None]:
from maro.simulator import Env
from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent, DecisionType

import random

# Initialize an Env for citi_bike scenario
env = Env(scenario="citi_bike", topology="ny201912", start_tick=0, durations=1440, snapshot_resolution=30)

is_done: bool = False
reward: int = None
decision_event: DecisionEvent = None
action: Action = None

# Start the env with a None Action
reward, decision_event, is_done = env.step(action)

while not is_done:
    if decision_event.type == DecisionType.Supply:
        # the value of the station itself means the bike inventory if Supply
        self_bike_inventory = decision_event.action_scope[decision_event.station_idx]
        # the value of other stations means the quantity of empty docks if Supply
        target_idx_dock_tuple_list = [
            (k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
        ]
        # random choose a target station weighted by the quantity of empty docks
        target_idx, target_dock = random.choices(
            target_idx_dock_tuple_list,
            weights=[item[1] for item in target_idx_dock_tuple_list]
        )[0]
        # generate the corresponding random Action
        action = Action(
            from_station_idx=decision_event.station_idx,
            to_station_idx=target_idx,
            number=random.randint(0, min(self_bike_inventory, target_dock))
        )

    elif decision_event.type == DecisionType.Demand:
        # the value of the station itself means the quantity of empty docks if Demand
        self_available_dock = decision_event.action_scope[decision_event.station_idx]
        # the value of other stations means their bike inventory if Demand
        target_idx_inventory_tuple_list = [
            (k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
        ]
        # random choose a target station weighted by the bike inventory
        target_idx, target_inventory = random.choices(
            target_idx_inventory_tuple_list,
            weights=[item[1] for item in target_idx_inventory_tuple_list]
        )[0]
        # generate the corresponding random Action
        action = Action(
            from_station_idx=target_idx,
            to_station_idx=decision_event.station_idx,
            number=random.randint(0, min(self_available_dock, target_inventory))
        )

    else:
        action = None
    
    # TODO: randomly sample some records to show in the output
    # if random.random() > 0.95:
    #     print(f"*************\n{decision_event}\n{action}")
    
    # Respond the environment with the generated Action
    reward, decision_event, is_done = env.step(action)
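
The weighted sampling in the loop above relies on `random.choices`, which raises a `ValueError` when all weights are zero (Python 3.9 and later). Whether the environment can emit an action scope in which every other station reports zero is not stated here, so the following is only a defensive sketch. `pick_weighted` is a hypothetical helper, not part of MARO.

```python
import random
from typing import List, Optional, Tuple


def pick_weighted(candidates: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Pick one (station_idx, value) pair weighted by value, or None if no weight is positive."""
    weights = [max(value, 0) for _, value in candidates]
    if sum(weights) == 0:
        return None  # no usable target, so the caller can respond with a None action
    return random.choices(candidates, weights=weights)[0]


print(pick_weighted([(7, 0), (9, 0)]))  # None
print(pick_weighted([(7, 0), (9, 5)]))  # (9, 5)
```

A caller could fall back to `action = None` whenever the helper returns None.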

## Get the environment observation

You can also implement other strategies or build models to take actions. For that, real-time information and historical records of the environment are essential for making good decisions, and the environment snapshot list is exactly what you need.

The information in the snapshot list is indexed by 3 dimensions:
- A frame index or a list of frame indices (int or list of int). Leaving it empty selects all time slices up to now.
- A station id or a list of station ids (int or list of int). Leaving it empty selects all stations/agents.
- An attribute name or a list of attribute names (str or list of str). You can get all available attributes from env.summary as shown before.

The return value from the snapshot list is a flattened, one-dimensional numpy.ndarray with shape **(frame * attribute * station, )**.
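
The flat layout can be illustrated with plain index arithmetic. This sketch assumes the values are stored in row-major order over (frame, attribute, station), which is the order the reshape call later in this notebook relies on. `flat_index` is a hypothetical helper.

```python
def flat_index(
    frame: int, attribute: int, station: int, num_attributes: int, num_stations: int
) -> int:
    """Position of one value in the flat array, assuming (frame, attribute, station) row-major order."""
    return (frame * num_attributes + attribute) * num_stations + station


# Sizes taken from the last example in this notebook: 10 frames, 2 attributes, 528 stations.
num_frames, num_attributes, num_stations = 10, 2, 528
total = num_frames * num_attributes * num_stations
print(total)  # 10560

# Second attribute (index 1) of station 5 in the first frame.
print(flat_index(0, 1, 5, num_attributes, num_stations))  # 533
```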

More detailed introduction to the snapshot list is [here](). # TODO: add hyper-link

In [12]:
from maro.simulator import Env
from pprint import pprint


# Initialize an Env for citi_bike scenario
env = Env(scenario="citi_bike", topology="ny201912", start_tick=0, durations=1440, snapshot_resolution=30)

# The summary info of the environment
print(f"\nEnv Summary - stations:")
pprint(env.summary['node_detail']['stations'])


Env Summary - stations:
{'attributes': {'bikes': {'slots': 1, 'type': 'i'},
                'capacity': {'slots': 1, 'type': 'i'},
                'extra_cost': {'slots': 1, 'type': 'i'},
                'failed_return': {'slots': 1, 'type': 'i'},
                'fulfillment': {'slots': 1, 'type': 'i'},
                'holiday': {'slots': 1, 'type': 'i2'},
                'shortage': {'slots': 1, 'type': 'i'},
                'temperature': {'slots': 1, 'type': 'i2'},
                'transfer_cost': {'slots': 1, 'type': 'i'},
                'trip_requirement': {'slots': 1, 'type': 'i'},
                'weather': {'slots': 1, 'type': 'i2'},
                'weekday': {'slots': 1, 'type': 'i2'}},
 'number': 528}


In [11]:
from maro.backends.frame import SnapshotList
from maro.simulator import Env
from pprint import pprint
from typing import List


# Initialize an Env for citi_bike scenario, from 07:00 to 12:00
env = Env(scenario="citi_bike", topology="ny201912", start_tick=420, durations=300, snapshot_resolution=30)

# Run the environment to the end
_, _, is_done = env.step(None)
while not is_done:
    _, _, is_done = env.step(None)

# Get trip requirement from snapshot list by directly using station id and attribute name
station_id = 5
trip_info = env.snapshot_list["stations"][:station_id:"trip_requirement"]
print(type(trip_info), trip_info.shape)
print(f"Trip requirements for station {station_id} with time going by: {trip_info}\n")

# Get capacity and bikes from snapshot list simultaneously by using attribute list
attribute_list = ["capacity", "bikes"]
info = env.snapshot_list["stations"][::attribute_list]
print(type(info), info.shape)

# Reshape the info of capacity and bikes into a user-friendly shape
num_attributes = len(attribute_list)
num_frame = env.frame_index + 1
num_stations = len(env.agent_idx_list)
info = info.reshape(num_frame, num_attributes, num_stations)
print(type(info), info.shape)

# Print and show the change of bikes in some stations:
bikes_idx = 1
for station_id in [1, 3, 5, 7, 9, 11, 13, 17, 19]:
    print(f"Station {station_id}: {info[:, bikes_idx, station_id]}")

<class 'numpy.ndarray'> (10,)
Trip requirements for station 5 with time going by: [1. 0. 0. 1. 4. 2. 1. 4. 2. 0.]

<class 'numpy.ndarray'> (10560,)
<class 'numpy.ndarray'> (10, 2, 528)
Station 1: [12. 13. 13. 13. 13. 13. 14. 14. 12. 12.]
Station 3: [19. 18. 21. 20. 18. 17. 16. 16. 15. 16.]
Station 5: [11. 12. 12. 12.  9.  9.  8.  4.  7.  7.]
Station 7: [8. 8. 8. 7. 8. 7. 7. 7. 7. 7.]
Station 9: [15. 15. 15. 16. 17. 18. 22. 21. 23. 23.]
Station 11: [13. 13. 14. 13. 16. 17. 20. 17. 17. 16.]
Station 13: [13. 13. 13. 14. 14. 14. 12.  9. 11. 12.]
Station 17: [27. 28. 28. 27. 29. 28. 31. 32. 32. 33.]
Station 19: [18. 19. 19. 19. 20. 20. 20. 20. 20. 21.]
