# Environment Demo

In [418]:
import numpy as np
import pygame

import gymnasium as gym
from gymnasium import spaces
from math import sqrt, atan2, degrees

## Robot features

- 25 cm diameter
- compass
- 360° vision sensor with object recognition within a 50 cm range
- communication with other robots
- ability to pick up objects (when the robot is at the same position as the object)
- holonomic motion (can move in any direction)
- maximum velocity: 200 cm/s
- maximum acceleration: 400 cm/s²

In [419]:
TIME_PER_STEP = 1 # environment step in seconds
ROBOT_SIZE = 25 # in cm (diameter)
SENSOR_RANGE = 50 # in cm
MAX_VELOCITY = 200 # in cm/s
MAX_ACC = 400 # in cm/s^2
MAX_DISTANCE = MAX_VELOCITY * TIME_PER_STEP # in cm

The environment is a continuous 2D arena without physics. In each step a robot can move in any direction, covering any distance up to a defined maximum. It can also pick up an object (when it is at the same position as the object) and put it down again.
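
The movement rule described above can be sketched in a few lines. This is a minimal illustration, not the environment's own code: the helper name `apply_move` and its parameters are hypothetical, and clamping the result to `[0, arena_size]` is an assumption of this sketch.

```python
from math import cos, sin, radians

def apply_move(position, distance, direction_deg, max_distance, arena_size):
    """Return the new (x, y) after moving `distance` along `direction_deg`.

    Hypothetical helper: the distance is capped at `max_distance` and the
    result is clamped to the square [0, arena_size] x [0, arena_size].
    """
    distance = min(max(distance, 0.0), max_distance)  # cap the step length
    theta = radians(direction_deg % 360.0)            # degrees -> radians
    x = position[0] + distance * cos(theta)
    y = position[1] + distance * sin(theta)
    clamp = lambda v: min(max(v, 0.0), arena_size)    # stay inside the arena
    return (clamp(x), clamp(y))

# A 5-unit step at 90 degrees moves the robot along +y only.
new_pos = apply_move((10.0, 10.0), distance=5.0, direction_deg=90.0,
                     max_distance=5.0, arena_size=20.0)
print(new_pos)
```

A request for a longer step than `max_distance` is shortened rather than rejected, which keeps every sampled action valid.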

In [420]:
STAY = 0
MOVE = 1
PICK_UP = 2
PUT_DOWN = 3

Each robot carries sensors that identify nearby entities. A "neighbor" is a tuple of (object type, distance, relative direction), and each robot keeps a fixed-length list of such tuples, one slot per neighboring entity it can track.
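
As a rough illustration of this neighbor list, the sketch below builds a fixed-length list of (type, distance, direction) tuples for one robot. The function `sense_neighbors`, its arguments, and the zero-padding of unused slots are assumptions made for this example; only the tuple layout comes from the text above.

```python
from math import atan2, degrees, hypot

def sense_neighbors(origin, entities, sensor_range, k):
    """Return exactly k (type, distance, direction_deg) tuples, closest first.

    `entities` is an iterable of (type_id, x, y). Entities out of range are
    ignored, and unused slots are filled with (0, 0.0, 0.0). Hypothetical
    helper written for illustration only.
    """
    ox, oy = origin
    seen = []
    for type_id, x, y in entities:
        dist = hypot(x - ox, y - oy)
        # Distance 0 is reserved for "empty slot", so skip the robot itself.
        if 0 < dist <= sensor_range:
            direction = degrees(atan2(y - oy, x - ox)) % 360.0
            seen.append((type_id, dist, direction))
    seen.sort(key=lambda n: n[1])   # closest neighbors first
    seen = seen[:k]                 # keep at most k readings
    return seen + [(0, 0.0, 0.0)] * (k - len(seen))

# Type 1 marks a robot, type 7 a colored block; the third entity is out of range.
readings = sense_neighbors((0.0, 0.0), [(1, 3.0, 0.0), (7, 0.0, 4.0), (1, 10.0, 10.0)],
                           sensor_range=5.0, k=3)
for reading in readings:
    print(reading)
```

Keeping the list at a fixed length means the observation has the same shape regardless of how many entities are in view.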

## Arena

A 5 m x 5 m arena containing robots and colored objects.

In [421]:
ARENA_SIZE = 500 # in cm
SIMULATION_ARENA_SIZE = ARENA_SIZE / ROBOT_SIZE # robot size is 1 in the simulation
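
Since the robot diameter is the unit length in the simulation, the other physical quantities can be converted the same way. The sketch below repeats the constants defined earlier so that it runs on its own; the helper `cm_to_sim` is a hypothetical name, and whether these converted values should be passed to the environment constructor is left open here.

```python
# Constants repeated from the cells above so the sketch is self-contained.
ROBOT_SIZE = 25      # cm (diameter)
ARENA_SIZE = 500     # cm
SENSOR_RANGE = 50    # cm
MAX_VELOCITY = 200   # cm/s
TIME_PER_STEP = 1    # s

def cm_to_sim(length_cm, robot_size_cm=ROBOT_SIZE):
    """Convert centimetres to simulation units (1 unit = one robot diameter)."""
    return length_cm / robot_size_cm

sim_arena = cm_to_sim(ARENA_SIZE)                    # arena side length
sim_sensor = cm_to_sim(SENSOR_RANGE)                 # sensor range
sim_step = cm_to_sim(MAX_VELOCITY * TIME_PER_STEP)   # maximum travel per step
print(sim_arena, sim_sensor, sim_step)
```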

## Environment construction

In [422]:
class GridWorldEnv(gym.Env):
    metadata = {"render_modes": ["human", "rgb_array"], "render_fps": 4}
    def __init__(
            self, 
            seed=None,
            render_mode=None, 
            size=100, 
            n_agents=3, 
            n_blocks=3, 
            n_sensors = 6,
            sensor_range = 5,
            sensor_degree = 360,
            max_agent_movment_per_step = 5,
            sensitivity = 0.5 # How close the agent can get to the block to pick it up
            ):
        self.seed = seed
        self.size = size  # The size of the square grid
        self.window_size = 512  # The size of the PyGame window
        self._sensitivity = sensitivity

        self._n_agents = n_agents
        self.n_blocks = n_blocks

        self._n_neighbors = n_sensors
        self._sensors_range = sensor_range
        self._sensors_degree = sensor_degree
        self._sensors_angle = self._sensors_degree / self._n_neighbors
        self._neighbors = np.zeros((self._n_agents, n_sensors, 3), dtype=float) # init sensors

        self._agents_locations = np.zeros((self._n_agents, 2), dtype=float)
        self._agents_picked_up = np.full(self._n_agents, -1, dtype=int)

        self._blocks_location = np.zeros((self.n_blocks, 2), dtype=float)
        self._blocks_colors = np.zeros(self.n_blocks, dtype=int)
        self._blocks_picked_up = np.full(self.n_blocks, -1, dtype=int)

        self.color_map = {
            2: "\033[91m",  # Red
            3: "\033[94m",  # Blue
            4: "\033[92m",  # Green
            5: "\033[93m",  # Yellow
            6: "\033[95m",  # Purple
            7: "\033[33m",   # Orange
            8: "\033[90m",   # Dark Gray
        }
        self.reset_color = "\033[0m"  # Resets color to default

        # Define the action space for a single robot
        single_robot_action_space = spaces.Dict({
            "move": spaces.Box(low=np.array([0, 0]), high=np.array([max_agent_movment_per_step, 360]), dtype=float),
            "action": spaces.Discrete(4)  # 0: STAY, 1: MOVE, 2: PICK_UP, 3: PUT_DOWN
        })

        # Create a tuple of action spaces, one for each robot
        self.action_space = spaces.Tuple([single_robot_action_space for _ in range(self._n_agents)])

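        # Keys and shapes match the dictionary returned by `_get_obs`
        self.observation_space = spaces.Dict(
            {
                # Each neighbor row is (type, distance, direction in degrees), so values can reach 360
                "neighbors": spaces.Box(0, max(360, sensor_range), shape=(self._n_agents, n_sensors, 3), dtype=float),
                # Color of the block each agent carries, or -1 when it carries nothing
                "agents_block_picked": spaces.Box(-1, 1 + len(self.color_map), shape=(self._n_agents,), dtype=int),
            }
        )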

        assert render_mode is None or render_mode in self.metadata["render_modes"]
        self.render_mode = render_mode
        self.window = None
        self.clock = None

    def _calculate_distance_direction(self, pointA, pointB, distance_type='euclidean'):
        """
        Calculate the distance and direction in degrees from pointA to pointB in a grid world.

        Parameters:
        - pointA: Tuple[float, float] representing the coordinates of the first point (x1, y1).
        - pointB: Tuple[float, float] representing the coordinates of the second point (x2, y2).
        - distance_type: String indicating the type of distance to calculate ('manhattan' or 'euclidean').

        Returns:
        - distance: The calculated distance between the two points.
        - direction_degrees: The direction from pointA to pointB in degrees from 0 to 360.
        """
        x1, y1 = pointA
        x2, y2 = pointB

        # Calculate distance
        if distance_type == 'manhattan':
            distance = abs(x1 - x2) + abs(y1 - y2)
        elif distance_type == 'euclidean':
            distance = sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
        else:
            raise ValueError("Invalid distance type. Use 'manhattan' or 'euclidean'.")

        # Calculate direction in degrees
        angle_radians = atan2(y2 - y1, x2 - x1)
        direction_degrees = degrees(angle_radians)

        # Normalize the direction to be between 0 and 360 degrees
        if direction_degrees < 0:
            direction_degrees += 360
        # As rendered by print_env (x indexes rows, y indexes columns):
        # down (+x) is 0/360 degrees, right (+y) is 90 degrees, up is 180 degrees, left is 270 degrees

        return distance, direction_degrees
    
    def _get_obs(self):
        # Reset sensors
        self._neighbors = np.zeros((self._n_agents, self._n_neighbors, 3), dtype=float)
        
        # Mimic sensors reading
        for i in range(self._n_agents):
            neighbor_counter = -1
            # TODO: covered_directions = [] # To ensure that the sensors only detect one agent per direction (the closest one)
            
            # Check if the sensors detect other agents
            for j in range(self._n_agents):
                if i != j:
                    distance, direction = self._calculate_distance_direction(self._agents_locations[i], 
                                                                            self._agents_locations[j])
                    if distance <= self._sensors_range: # If the other agent is within the sensor range
                        if (neighbor_counter >= self._n_neighbors - 1):
                            # Substitute with the furthest
                            max_distance = -1
                            max_distance_index = -1
                            for k in range(self._n_neighbors):
                                if self._neighbors[i, k, 1] > max_distance:
                                    max_distance = self._neighbors[i, k, 1]
                                    max_distance_index = k
                            if distance < max_distance:
                                self._neighbors[i, max_distance_index] = [1, distance, direction]
                        else:
                            neighbor_counter += 1
                            self._neighbors[i, neighbor_counter] = [1, distance, direction] # 1 to indicate an agent
            
            # Check if the sensors detect blocks
            for j in range(self.n_blocks):
                distance, direction = self._calculate_distance_direction(self._agents_locations[i], 
                                                                        self._blocks_location[j])
                if distance <= self._sensors_range: # If the block is within the sensor range
                    if distance <= self._sensors_range: # If the other agent is within the sensor range
                        if (neighbor_counter >= self._n_neighbors - 1):
                            # Substitute with the furthest
                            max_distance = -1
                            max_distance_index = -1
                            for k in range(self._n_neighbors):
                                if self._neighbors[i, k, 1] > max_distance:
                                    max_distance = self._neighbors[i, k, 1]
                                    max_distance_index = k
                            if distance < max_distance:
                                self._neighbors[i, max_distance_index] = [1, distance, direction]
                        else:
                            neighbor_counter += 1
                            self._neighbors[i, neighbor_counter] = [self._blocks_colors[j], distance, direction]
        
        # Sort the neighbors by distance
        # Define a custom sort key that ignores rows with 0 in the second column
        def sort_key(row):
            return row[1] if row[1] != 0 else np.inf
        
        self._neighbors = np.array([sorted(subarr, key=sort_key) for subarr in self._neighbors])

        return {"neighbors": self._neighbors, "agents_block_picked": self._agents_picked_up}
                
    def reset(self, seed=None, options=None):
        # We need the following line to seed self.np_random
        super().reset(seed=seed)
        self._agents_picked_up = np.full(self._n_agents, -1, dtype=int)
        self._blocks_picked_up = np.full(self.n_blocks, -1, dtype=int)
       
        # Choose the agent's location uniformly at random
        for i in range(self._n_agents):
            # Check if the agents are not spawning in the same location
            while True:
                self._agents_locations[i] = self.np_random.uniform(0, self.size, size=2)
                if i == 0 or not np.any(np.linalg.norm(self._agents_locations[i] - self._agents_locations[:i], axis=1) < self._sensitivity):
                    break

        for i in range(self.n_blocks):
            # Check if the blocks are not spawning in the same location
            while True:
                self._blocks_location[i] = self.np_random.uniform(0, self.size, size=2)
                if i == 0 or not np.any(np.linalg.norm(self._blocks_location[i] - self._blocks_location[:i], axis=1) < self._sensitivity):
                    break
            self._blocks_colors[i] = self.np_random.integers(2, 2 + len(self.color_map), dtype=int)
        
        observation = self._get_obs()
        info = {}

        return observation, info
    
    def step(self, action):

        for i in range(self._n_agents):
            
            if action[i]['action'] == PICK_UP:
                if self._agents_picked_up[i] == -1: # If the agent is not carrying a block
                    for j in range(self.n_blocks):
                        # If the agent is in the same location as the block
                        if np.linalg.norm(self._agents_locations[i] - self._blocks_location[j]) < self._sensitivity:
                            self._blocks_location[j] = (-1,-1) # Set as picked up
                            self._agents_picked_up[i] = self._blocks_colors[j] # The agent knows the color of the block it picked up
                            self._blocks_picked_up[j] = i # The block is picked up by the agent
                            break # An agent carries at most one block, so stop at the first match
            
            if action[i]['action'] == PUT_DOWN:
                if self._agents_picked_up[i] != -1: # If the agent is carrying a block
                    for j in range(self.n_blocks):
                        if (self._blocks_picked_up[j] == i): # If the block is picked up by the agent
                            self._blocks_location[j] = self._agents_locations[i] # Set the block location to the agent location
                            self._agents_picked_up[i] = -1 # The agent is not carrying a block anymore
                            self._blocks_picked_up[j] = -1 # The block is not picked up by any agent anymore

            if action[i]['action'] == STAY:
                continue

            if action[i]['action'] == MOVE:
                # Map the action to the direction we walk in
                distance, direction = action[i]['move']
                if direction > 180:
                    direction -= 360
                direction = np.radians(direction)

                dx = distance * np.cos(direction)
                dy = distance * np.sin(direction)
                # We use `np.clip` to make sure we don't leave the grid
                new_x = np.clip(
                    self._agents_locations[i][0] + dx, 0, self.size - 1
                )
                new_y = np.clip(
                    self._agents_locations[i][1] + dy, 0, self.size - 1
                )
                new_position = np.array([new_x, new_y])
                
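                # Check if the new position is not too close to another agent (agent i itself is excluded,
                # otherwise any move shorter than the sensitivity would be blocked by its own old position)
                other_agents = np.delete(self._agents_locations, i, axis=0)
                occupied_by_agent = np.any(np.linalg.norm(new_position - other_agents, axis=1) < self._sensitivity)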

                # Check if the new position is not too close to a block while carrying one (can't pick up two blocks)
                occupied_by_block_while_carrying = np.any(np.linalg.norm(new_position - self._blocks_location, axis=1) < self._sensitivity) and self._agents_picked_up[i] != -1
                if not occupied_by_agent and not occupied_by_block_while_carrying:
                    self._agents_locations[i] = new_position

        observation = self._get_obs()
        done = False
        reward = 0
        info = {}
        
        return observation, reward, done, info
    
    def print_env(self):
        # Define the size of the visualization grid
        vis_grid_size = 25  # Adjust based on desired resolution
        
        # Create an empty visual representation of the environment
        visual_grid = [["." for _ in range(vis_grid_size)] for _ in range(vis_grid_size)]
        
        # Populate the visual grid with blocks
        for i, block in enumerate(self._blocks_location):
            # Convert continuous coordinates to discrete grid positions
            x, y = int(block[0] * vis_grid_size / self.size), int(block[1] * vis_grid_size / self.size)
            if 0 <= x < vis_grid_size and 0 <= y < vis_grid_size:
                color_id = self._blocks_colors[i]
                color_code = self.color_map.get(color_id, self.reset_color)
                visual_grid[x][y] = f"{color_code}O{self.reset_color}"
        
        # Populate the visual grid with agents
        for i, agent in enumerate(self._agents_locations):
            # Convert continuous coordinates to discrete grid positions
            x, y = int(agent[0] * vis_grid_size / self.size), int(agent[1] * vis_grid_size / self.size)
            if 0 <= x < vis_grid_size and 0 <= y < vis_grid_size:
                if self._agents_picked_up[i] != -1:
                    color_id = self._agents_picked_up[i]
                    color_code = self.color_map.get(color_id, self.reset_color)
                    visual_grid[x][y] = f"{color_code}{i}{self.reset_color}"
                else:
                    visual_grid[x][y] = str(i)
        
        # Print the visual representation
        for row in visual_grid:
            print(" ".join(row))

        
    def print_neighbors(self):
        for i in range(self._n_agents):
            flag = False
            for j in range(self._n_neighbors):
                if self._neighbors[i,j,0] != 0:
                    entity = "agent"
                    if self._neighbors[i,j,0] != 1:
                        entity = f"block (color: {self._neighbors[i,j,0]})"
                    distance = self._neighbors[i,j,1]
                    direction = self._neighbors[i,j,2]
                    print(f"Agent {i} sees {entity}: {distance} distance and {direction} degrees direction")
                    flag = True
            if not flag:
                print(f"Agent {i} doesn't see anything")
            print()
        
    def close(self):
        if self.window is not None:
            pygame.display.quit()
            pygame.quit()

In [423]:
env = GridWorldEnv(render_mode='rgb_array', size=20, n_agents=5, n_blocks=5)
env.reset() # Initial state

({'neighbors': array([[[  1.        ,   4.52520084, 239.46966207],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ]],
  
         [[  7.        ,   4.39179284,  48.26391421],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ]],
  
         [[  1.        ,   2.6219923 ,  18.92698574],
          [  7.        ,   4.46386116, 335.96086709],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ],
          [  0.        ,   0.        ,   0.        ]],
  
   

In [424]:
env.print_env()

. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . 2 . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . O . . . . . . . 1 . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . 4 . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . O . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 0
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . O . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . 3 . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . .

In [425]:
env.print_neighbors()

Agent 0 sees agent: 4.525200835343188 distance and 239.46966206506647 degrees direction

Agent 1 sees block (color: 7.0): 4.391792840556168 distance and 48.26391420528498 degrees direction

Agent 2 sees agent: 2.621992299914558 distance and 18.926985735920006 degrees direction
Agent 2 sees block (color: 7.0): 4.463861156791819 distance and 335.96086708795235 degrees direction

Agent 3 sees block (color: 6.0): 1.5221255045664275 distance and 337.7251117132772 degrees direction

Agent 4 sees agent: 2.621992299914558 distance and 198.92698573592 degrees direction
Agent 4 sees block (color: 7.0): 3.109924206593207 distance and 300.8870423494042 degrees direction
Agent 4 sees agent: 4.525200835343188 distance and 59.469662065066494 degrees direction
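
The readings above can be sanity-checked: when two robots see each other, they should report the same distance and bearings that differ by 180 degrees. The sketch below applies that check to the two readings that appear to form such a pair (agent 2 and agent 4); the helper `opposite_bearing` is a hypothetical name introduced for this example.

```python
def opposite_bearing(direction_deg):
    """Bearing of A as seen from B, given the bearing of B as seen from A."""
    return (direction_deg + 180.0) % 360.0

# (distance, direction) readings copied from the output above.
seen_by_agent_2 = (2.621992299914558, 18.926985735920006)
seen_by_agent_4 = (2.621992299914558, 198.92698573592)

same_distance = abs(seen_by_agent_2[0] - seen_by_agent_4[0]) < 1e-9
opposite = abs(opposite_bearing(seen_by_agent_2[1]) - seen_by_agent_4[1]) < 1e-9
print(same_distance, opposite)
```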



## Short demo of actions

In [426]:
action = env.action_space.sample()
action[0]['action'] = MOVE
action[0]['move'] = [5, 90]
next_state, _, _, _ = env.step(action)
env.print_env()

. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . 2 . . . . . .
. . . . . . . . . . . . . . . 1 . . . . . . . . .
. . . . O . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . 4 . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . O . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 0 .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . O . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . 3 . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . .

In [427]:
env.print_neighbors()

Agent 0 sees agent: 4.718847822195695 distance and 241.31727502896243 degrees direction

Agent 1 sees agent: 2.1744239063866173 distance and 113.16094688803285 degrees direction
Agent 1 sees agent: 2.6651862872044827 distance and 51.505488032058594 degrees direction
Agent 1 sees block (color: 7.0): 3.2265328531003656 distance and 3.2117851843439653 degrees direction

Agent 2 sees agent: 2.1744239063866173 distance and 293.16094688803287 degrees direction
Agent 2 sees agent: 2.515649241786771 distance and 1.9769209187120476 degrees direction
Agent 2 sees block (color: 7.0): 4.463861156791819 distance and 335.96086708795235 degrees direction

Agent 3 sees block (color: 6.0): 1.5221255045664275 distance and 337.7251117132772 degrees direction

Agent 4 sees block (color: 7.0): 2.4639958159410074 distance and 309.3570959376473 degrees direction
Agent 4 sees agent: 2.515649241786771 distance and 181.97692091871204 degrees direction
Agent 4 sees agent: 2.6651862872044827 distance and 231.5054

In [428]:
next_state

{'neighbors': array([[[  1.        ,   4.71884782, 241.31727503],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ]],
 
        [[  1.        ,   2.17442391, 113.16094689],
         [  1.        ,   2.66518629,  51.50548803],
         [  7.        ,   3.22653285,   3.21178518],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ]],
 
        [[  1.        ,   2.17442391, 293.16094689],
         [  1.        ,   2.51564924,   1.97692092],
         [  7.        ,   4.46386116, 335.96086709],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ],
         [  0.        ,   0.        ,   0.        ]],
 
        [[  6.        , 