
## MangoEnv Class

### `MangoEnv` Class

The `MangoEnv` class wraps an underlying Gymnasium environment together with a concept. It uses the concept to keep an abstract view of the environment state, stored in the `abs_state` field.
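
To make the idea of tracking an abstract state concrete, here is a minimal, self-contained sketch. It does not use the `mango` package: `ToyEnv`, `AbstractingEnv`, and the `abstract` callable are hypothetical stand-ins invented for illustration, and the real `Concept` interface may look different.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ToyEnv:
    """Hypothetical stand-in for a Gymnasium environment: a walker on a line."""

    def __init__(self) -> None:
        self.position = 0

    def reset(self, *, seed=None, options=None):
        self.position = 0
        return self.position, {}

    def step(self, action: int):
        # Action 1 moves right, any other action moves left.
        self.position += 1 if action == 1 else -1
        terminated = abs(self.position) >= 4
        return self.position, 1.0, terminated, False, {}


@dataclass
class AbstractingEnv:
    """Assumed design: refresh the abstract state after every reset and step."""

    abstract: Callable[[Any], Any]  # hypothetical concept: observation -> abstract state
    environment: ToyEnv
    abs_state: Any = field(init=False, default=None)

    def reset(self, *, seed=None, options=None):
        obs, info = self.environment.reset(seed=seed, options=options)
        self.abs_state = self.abstract(obs)
        return obs, info

    def step(self, action: int):
        obs, reward, terminated, truncated, info = self.environment.step(action)
        self.abs_state = self.abstract(obs)
        return obs, reward, terminated, truncated, info


# Group positions into cells of width 2, so positions 2 and 3 share abstract state 1.
env = AbstractingEnv(abstract=lambda obs: obs // 2, environment=ToyEnv())
env.reset()
for _ in range(3):
    env.step(1)
print(env.abs_state)
```

The wrapper returns the raw observation unchanged and only updates `abs_state` on the side, which mirrors the `abs_state` field declared in the class below.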


In [1]:
import sys

# Make the local `mango` package importable when running from a subdirectory.
sys.path.append('../')

from mango.actions import ActionCompatibility
from mango.concepts import Concept, IdentityConcept
from mango.dynamicpolicies import DQnetPolicyMapper
from mango.utils import ReplayMemory, Transition, torch_style_repr

In [2]:
from dataclasses import dataclass, field
from typing import Generic, Optional, Sequence, SupportsFloat, TypeVar
import gymnasium as gym
import numpy as np
import numpy.typing as npt


ObsType = TypeVar("ObsType")

@dataclass(eq=False, slots=True, repr=False)
class MangoEnv(Generic[ObsType]):
    concept: Concept[ObsType]
    environment: gym.Env[ObsType, int]
    abs_state: npt.NDArray = field(init=False)

    def step(self, action: int) -> tuple[ObsType, SupportsFloat, bool, bool, dict]:
        ...
    
    def reset(
        self, *, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> tuple[ObsType, dict]:
        ...


### `reset` Method

The `reset` method resets the environment to an initial state.

- `seed` (optional): An optional seed for reproducibility.
- `options` (optional): Additional options as a dictionary.

Returns:
- `env_state`: The initial environment state after resetting.
- `info`: Additional information as a dictionary.


**Usage:**

You can create an instance of `MangoEnv` by providing a concept and an underlying Gymnasium environment. This allows you to work with abstract states in the environment.


In [9]:

from mango.actions import ActionCompatibility
from mango.concepts import Concept, IdentityConcept
from mango.mango import MangoEnv
from mango.utils import ReplayMemory, Transition, torch_style_repr

In [10]:
# Create an instance of MangoEnv
concept = IdentityConcept()
env = MangoEnv(concept, gym.make("CartPole-v1"))

# Reset the environment
initial_state, info = env.reset()



### `step` Method

The `step` method performs a step in the environment by taking an action and returning information about the transition.

- `action`: An integer representing the action taken in the environment.

Returns:
- `env_state`: The environment state after the step.
- `reward`: The reward obtained from the environment.
- `done`: A boolean indicating whether the episode has terminated (the flag Gymnasium calls `terminated`).
- `truncated`: A boolean indicating whether the episode was cut short before terminating, for example by a time limit.
- `info`: Additional information as a dictionary.
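
The five return values are usually consumed in a loop that runs until either `done` or `truncated` is set. The sketch below shows that pattern on a hypothetical toy environment (`ToyChainEnv` and `run_episode` are illustrative names, not part of `mango`), so it runs without any external dependency.

```python
class ToyChainEnv:
    """Hypothetical environment with the same reset/step return shape."""

    def __init__(self, goal: int = 3, max_steps: int = 10) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.state = 0
        self.elapsed = 0

    def reset(self, *, seed=None, options=None):
        self.state, self.elapsed = 0, 0
        return self.state, {}

    def step(self, action: int):
        self.state += action
        self.elapsed += 1
        done = self.state >= self.goal
        # Truncate only if the step budget runs out before the goal is reached.
        truncated = not done and self.elapsed >= self.max_steps
        reward = 1.0 if done else 0.0
        return self.state, reward, done, truncated, {"elapsed": self.elapsed}


def run_episode(env, policy):
    """Step until the episode terminates or is truncated; return the last transition."""
    state, info = env.reset()
    total_reward, done, truncated = 0.0, False, False
    while not (done or truncated):
        state, reward, done, truncated, info = env.step(policy(state))
        total_reward += float(reward)
    return state, total_reward, done, truncated, info


env = ToyChainEnv()
reached = run_episode(env, policy=lambda state: 1)  # always advance
stalled = run_episode(env, policy=lambda state: 0)  # never advance
print(reached)
print(stalled)
```

Checking both flags matters: the stalled episode never sets `done`, so a loop that tested only `done` would not stop.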


**Usage:**

In [11]:
# Perform a step in the environment
action = 0
env_state, reward, done, truncated, info = env.step(action)


## MangoLayer Class

### `MangoLayer` Class

The `MangoLayer` class represents a layer in the Mango framework. It combines a concept, action compatibility, and a lower-layer environment or layer.


In [13]:

from mango.actions import ActionCompatibility
from mango.concepts import Concept, IdentityConcept
from mango.mango import MangoEnv, MangoLayer
from mango.utils import ReplayMemory, Transition, torch_style_repr

In [14]:
@dataclass(eq=False, slots=True, repr=False)
class MangoLayer(Generic[ObsType]):
    concept: Concept[ObsType]
    action_compatibility: ActionCompatibility
    lower_layer: MangoLayer[ObsType] | MangoEnv[ObsType]
    max_steps: int = 10

    def step(self, action: int) -> tuple[ObsType, SupportsFloat, bool, bool, dict]:
        ...
    
    def reset(
        self, *, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> tuple[ObsType, dict]:
        ...

**Usage:**

You can create an instance of `MangoLayer` by providing a concept, action compatibility, and a lower-layer environment or layer. This allows you to build a hierarchical structure of layers in the Mango framework.




### `reset` Method

The `reset` method resets the layer to an initial state by resetting the lower-layer environment or layer.

- `seed` (optional): An optional seed for reproducibility.
- `options` (optional): Additional options as a dictionary.

Returns:
- `env_state`: The initial environment state after resetting.
- `info`: Additional information as a dictionary.


**Usage:**


In [None]:
# Create an instance of MangoLayer
concept = IdentityConcept()
action_compatibility = ActionCompatibility(...)
lower_layer = MangoEnv(concept, gym.make("CartPole-v1"))
layer = MangoLayer(concept, action_compatibility, lower_layer)

# Reset the layer
initial_state, info = layer.reset()

### `step` Method

The `step` method performs a step in the layer by taking an action and returning information about the transition. It handles interactions with the lower-layer environment or layer.

- `action`: An integer representing the action taken in the layer.

Returns:
- `env_state`: The environment state after the step.
- `accumulated_reward`: The accumulated reward obtained from the environment.
- `done`: A boolean indicating whether the episode is done.
- `truncated`: A boolean indicating whether the episode was truncated.
- `info`: Additional information as a dictionary.
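
One plausible reading of `accumulated_reward` and `max_steps` is that a single layer step runs several lower-level steps and sums their rewards. The sketch below illustrates that reading only. It assumes the option ends when the abstract state changes, the episode ends, or `max_steps` is reached, and it repeats one fixed lower-level action instead of consulting a learned policy. `LineEnv`, `option_step`, and `room` are hypothetical names, not `mango` APIs.

```python
class LineEnv:
    """Hypothetical lower level: a walker on a line that terminates at `goal`."""

    def __init__(self, goal: int = 6) -> None:
        self.goal = goal
        self.position = 0

    def reset(self, *, seed=None, options=None):
        self.position = 0
        return self.position, {}

    def step(self, action: int):
        self.position += 1 if action == 1 else -1
        done = self.position >= self.goal
        return self.position, 1.0, done, False, {}


def option_step(lower, obs, lower_action, abstract, max_steps=10):
    """Repeat `lower_action` and sum rewards until the abstract state changes,
    the episode ends, or `max_steps` lower-level steps have been taken."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    start = abstract(obs)
    accumulated_reward = 0.0
    for _ in range(max_steps):
        obs, reward, done, truncated, info = lower.step(lower_action)
        accumulated_reward += float(reward)
        if done or truncated or abstract(obs) != start:
            break
    return obs, accumulated_reward, done, truncated, info


def room(obs: int) -> int:
    """Assumed abstraction: positions are grouped into rooms of width 3."""
    return obs // 3


env = LineEnv()
obs, _ = env.reset()
first = option_step(env, obs, 1, room)        # walks from room 0 into room 1
second = option_step(env, first[0], 1, room)  # walks from room 1 to the goal
print(first)
print(second)
```

In this sketch each call takes three lower-level steps, so the caller sees one transition with a reward of `3.0` rather than three transitions with a reward of `1.0` each.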

**Usage:**

In [None]:

# Perform a step in the layer
action = 0
env_state, accumulated_reward, done, truncated, info = layer.step(action)

## Mango Class

### `Mango` Class

The `Mango` class represents the top-level control for the Mango framework. It orchestrates the interactions between different layers and provides methods for execution and training.


In [None]:
class Mango(Generic[ObsType]):
    def __init__(
        self,
        environment: gym.Env[ObsType, int],
        concepts: Sequence[Concept[ObsType]],
        action_compatibilities: Sequence[ActionCompatibility],
        base_concept: Concept[ObsType] = IdentityConcept(),
    ) -> None:
        ...
    
    def execute_option(
        self, action: int, layer: int = 0
    ) -> tuple[ObsType, SupportsFloat, bool, bool, dict]:
        ...
    
    def train(self, steps: int, layer_idx: int = -1, epochs: int = 1) -> None:
        ...
    
    def reset(
        self, *, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> tuple[ObsType, dict]:
        ...

**Usage:**

You can create an instance of `Mango` by providing an environment, a sequence of concepts, and a sequence of action compatibilities. This allows you to build a hierarchical agent using the Mango framework.
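
The constructor signature suggests that each concept is paired with one action compatibility and stacked on top of the previous level. The sketch below shows one way such a stack could be assembled. It is an assumption about the internals, not the actual `Mango` implementation: `BaseLevel`, `UpperLevel`, and `build_stack` are hypothetical names, and plain strings stand in for real environments, concepts, and action compatibilities.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class BaseLevel:
    """Hypothetical stand-in for the base environment wrapper."""

    environment: Any
    concept: Any


@dataclass
class UpperLevel:
    """Hypothetical stand-in for a layer stacked on a lower level."""

    concept: Any
    action_compatibility: Any
    lower_layer: Union["UpperLevel", BaseLevel]


def build_stack(environment, concepts, action_compatibilities, base_concept="identity"):
    """Fold paired concepts and compatibilities into a bottom-up list of levels."""
    if len(concepts) != len(action_compatibilities):
        raise ValueError("expected one action compatibility per concept")
    layers = [BaseLevel(environment, base_concept)]
    for concept, compatibility in zip(concepts, action_compatibilities):
        # Each new level wraps the level that was added just before it.
        layers.append(UpperLevel(concept, compatibility, layers[-1]))
    return layers


layers = build_stack(
    "CartPole-v1",
    concepts=["rooms", "floors"],
    action_compatibilities=["compat-rooms", "compat-floors"],
)
print(len(layers))
print(layers[-1].concept)
```

With this layout, index `0` refers to the base level and index `-1` to the topmost one, which matches the defaults described for `execute_option` and `train`.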


### `execute_option` Method

The `execute_option` method executes an option (action) in a specific layer of the Mango framework.

- `action`: An integer representing the action to execute.
- `layer`: The index of the layer in which to execute the action (default is `0`, the base layer).

Returns:
- `env_state`: The environment state after executing the option.
- `accumulated_reward`: The accumulated reward obtained from the environment.
- `done`: A boolean indicating whether the episode is done.
- `truncated`: A boolean indicating whether the episode was truncated.
- `info`: Additional information as a dictionary.


**Usage:**

In [None]:
# Create an instance of Mango
concept = IdentityConcept()
environment = gym.make("CartPole-v1")
concepts = [concept]
action_compatibilities = [ActionCompatibility(environment.action_space)]
mango_agent = Mango(environment, concepts, action_compatibilities)

# Execute an option in the agent
action = 0
env_state, accumulated_reward, done, truncated, info = mango_agent.execute_option(action)


### `train` Method

The `train` method trains the Mango framework by executing options and updating policies for a specified number of steps.

- `steps`: The number of steps to train the agent.
- `layer_idx`: The index of the layer to train (default is `-1`, the topmost layer).
- `epochs`: The number of training epochs (default is 1).


**Usage:**

In [None]:
# Train the Mango agent
num_steps = 1000
layer_index = 0
num_epochs = 5
mango_agent.train(num_steps, layer_idx=layer_index, epochs=num_epochs)

### `reset` Method

The `reset` method resets the Mango agent to an initial state.

- `seed` (optional): An optional seed for reproducibility.
- `options` (optional): Additional options as a dictionary.

Returns:
- `env_state`: The initial environment state after resetting.
- `info`: Additional information as a dictionary.


**Usage:**

In [None]:
# Reset the Mango agent
initial_state, info = mango_agent.reset(seed=42, options={"key": "value"})

## Example Usage

Here's an example of how to create and use a Mango agent:

In [None]:
# Create an instance of Mango
concept = IdentityConcept()
environment = gym.make("CartPole-v1")
concepts = [concept]
action_compatibilities = [ActionCompatibility(environment.action_space)]
mango_agent = Mango(environment, concepts, action_compatibilities)

# Execute an option in the agent
action = 0
env_state, accumulated_reward, done, truncated, info = mango_agent.execute_option(action)

# Train the Mango agent
num_steps = 1000
layer_index = 0
num_epochs = 5
mango_agent.train(num_steps, layer_idx=layer_index, epochs=num_epochs)

# Reset the Mango agent
initial_state, info = mango_agent.reset(seed=42, options={"key": "value"})


In this example, we create a Mango agent, execute an option, train the agent, and reset it to an initial state. The agent consists of a base layer with a CartPole environment and a simple `IdentityConcept`.
