00000001
11110000



## MinAtar Breakout-v1 Environment: Observation Space Overview

### Observation Space
- In the MinAtar Breakout-v1 environment, we're dealing with a `10x10x4` observation space.
- This represents a grid where each cell can contain parts of the game (like the ball, paddle, or bricks).
- The `4` refers to different layers or "channels" for various game elements.
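
To make the layout concrete, here is a minimal sketch that builds a synthetic `10x10x4` boolean array and slices it one channel at a time. It does not use the real environment, and the channel labels (`paddle`, `ball`, `trail`, `brick`) and their order are my assumption, so check them against the environment before relying on them.

```python
import numpy as np

# Synthetic stand-in for one observation: a 10x10 grid with 4 boolean channels.
obs = np.zeros((10, 10, 4), dtype=bool)
obs[1, :, 3] = True   # fill row 1 of channel 3 (a full row, like a row of bricks)
obs[9, 4, 0] = True   # a single active cell in channel 0

# Assumed channel labels, for illustration only.
channel_names = ["paddle", "ball", "trail", "brick"]

for idx, name in enumerate(channel_names):
    layer = obs[:, :, idx]  # one 10x10 layer of the grid
    print(f"{name}: {int(layer.sum())} active cells")
```

Each slice `obs[:, :, idx]` is a plain `10x10` grid, which is what makes the "layers" reading of the last axis convenient.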

### Naive Representation
- Initially, I flattened this `10x10x4` space into a single bitstring (extending from the simple integer action/state case).
- This resulted in a single 400-bit number, i.e. `2^400` possible states. That space is far too large to navigate, and I don't think an agent could handle it no matter how hard I tried, so I moved on.
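
A rough sketch of the flattening idea, using a random synthetic grid rather than a real observation. The variable names are made up for illustration, and this is only one way such a flattening could be written.

```python
import numpy as np

# Random synthetic 10x10x4 binary grid (not a real game frame).
rng = np.random.default_rng(0)
obs = rng.integers(0, 2, size=(10, 10, 4)).astype(bool)

# Flatten to 400 bits and join them into one bitstring.
bits = obs.reshape(-1).astype(int).tolist()
bitstring = "".join(str(b) for b in bits)
print(len(bitstring))  # 400

# Interpret the bitstring as a single (very large) integer state id.
state_id = int(bitstring, 2)

# Size of the flattened state space: 2^400, a 121-digit number.
num_states = 2 ** len(bits)
print(len(str(num_states)))  # 121
```

Python integers are arbitrary precision, so the state id itself is representable; the problem is the number of distinct ids, not storing one of them.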

### My Implementation: Grid Structure
- To keep things simple, I retain the grid structure of the observation space.
- Each `10x10x4` grid is transformed into a continuous space by converting 0's to -1.0 and 1's to 1.0. This turns the data into analog bits.
- This approach preserves spatial locality and relationships within the game, hopefully helping in pattern recognition and decision-making for the model.
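
A small sketch of the 0/1 to -1.0/1.0 mapping and its inverse, on a synthetic grid. The helper names `to_analog` and `from_analog` are hypothetical and are not part of the notebook's code. Thresholding at zero is just one plausible way to get back to bits.

```python
import numpy as np

def to_analog(binary_grid):
    """Map 0 -> -1.0 and 1 -> 1.0, keeping the grid shape."""
    return binary_grid.astype(np.float32) * 2.0 - 1.0

def from_analog(analog_grid):
    """Recover bits by thresholding at zero (assumed decoding rule)."""
    return analog_grid > 0.0

# Synthetic 10x10x4 grid with one full row active in the last channel.
obs = np.zeros((10, 10, 4), dtype=bool)
obs[1, :, 3] = True

analog = to_analog(obs)

# Bounded noise in [-0.5, 0.5) cannot push a value across zero,
# so the round trip recovers the original bits exactly.
rng = np.random.default_rng(0)
noisy = analog + rng.uniform(-0.5, 0.5, size=analog.shape).astype(np.float32)

recovered = from_analog(noisy)
print(np.array_equal(recovered, obs))  # True
```

The grid shape is untouched throughout, so neighbouring cells stay neighbours in the continuous representation.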

### Transformation code

In [1]:
import numpy as np

def convert_box_to_grid(box_observation):
    """
    Converts a box observation from a MinAtar environment
    into a binary (0/1) integer grid with multiple channels.
    """
    binary_grid = box_observation.astype(int)
    return binary_grid

def binbox2analog_bit(box_observation):
    """
    Converts the box observation into analog bits.
    """
    box_grid = convert_box_to_grid(box_observation)
    analog_bits_grid = (box_grid * 2) - 1
    # Convert to float32 so the values live in a continuous space
    analog_bits_grid = analog_bits_grid.astype(np.float32)
    return analog_bits_grid.reshape(10, 10, 4)

### Example Transformation (First 2 rows of the grid)

In [11]:
import gymnasium as gym
env = gym.make("MinAtar/Breakout-v1")


# Sample observation from the environment 
sample_observation = env.reset()[0]
#sample_observation = np.random.randint(2, size=(10, 10, 4))

# Original box observation (before any transformation)
print("Original Box Observation:")
print(sample_observation[:2])

Original Box Observation:
[[[False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]
  [False False False False]]

 [[False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]
  [False False False  True]]]


In [12]:
# Convert to binary grid
binary_grid = convert_box_to_grid(sample_observation)
print("\nBinary Grid:")
print(binary_grid[:2])


Binary Grid:
[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]
  [0 0 0 1]]]


In [13]:
# Convert to analog bits
analog_bits = binbox2analog_bit(sample_observation)
print("\nAnalog Bits Grid:")
print(analog_bits[:2])


Analog Bits Grid:
[[[-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]
  [-1. -1. -1. -1.]]

 [[-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]
  [-1. -1. -1.  1.]]]
