# Ariane

In [1]:
from a2perf.domains import circuit_training
import gymnasium as gym

env = gym.make('CircuitTraining-Ariane-v0')

<table>
    <tr>
        <th style="text-align:right">Action Space</th>
        <td style="text-align:left">Discrete(16384)</td>
    </tr>
    <tr>
        <th style="text-align:right">Observation Space</th>
        <td style="text-align:left">
            Dict('current_node': Box(0, 3499, (1,), int32), 'fake_net_heatmap': Box(0.0, 1.0, (16384,), float32), 'is_node_placed': Box(0, 1, (3500,), int32), 'locations_x': Box(0.0, 1.0, (3500,), float32), 'locations_y': Box(0.0, 1.0, (3500,), float32), 'mask': Box(0, 1, (16384,), int32), 'netlist_index': Box(0, 0, (1,), int32))
        </td>
    </tr>
    <tr>
        <th style="text-align:right">Reward Range</th>
        <td style="text-align:left">(0, 1)</td>
    </tr>
    <tr>
        <th style="text-align:right">Creation</th>
        <td style="text-align:left">gym.make("CircuitTraining-Ariane-v0")</td>
    </tr>
</table>

## Description

Circuit Training is an open-source framework for generating chip floor plans with distributed deep reinforcement learning. This framework reproduces the methodology published in the Nature 2021 paper:

A graph placement methodology for fast chip design. Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Wenjie Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter & Jeff Dean, 2021. Nature, 594(7862), pp. 207-212.

At each timestep, the agent must place a single macro onto the chip canvas. 


## Action Space


In [2]:
env.action_space

Discrete(16384)

Circuit Training represents the chip canvas as a grid. Each action selects the grid cell in which the next macro is placed. For the Ariane netlist, the grid is $128 \times 128$, giving $128 \times 128 = 16384$ possible actions.
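As a rough illustration of how a flat action index could relate to a grid cell, the sketch below converts between the two. The row-major ordering and the helper names `action_to_cell` and `cell_to_action` are assumptions made for this example; the environment's actual index layout is not documented here and may differ.

```python
GRID_ROWS = 128  # Ariane canvas rows, as stated above
GRID_COLS = 128  # Ariane canvas columns, as stated above


def action_to_cell(action: int) -> tuple[int, int]:
    """Map a flat action index to (row, col), ASSUMING row-major order."""
    if not 0 <= action < GRID_ROWS * GRID_COLS:
        raise ValueError(f"action {action} is outside the action space")
    # divmod returns (action // GRID_COLS, action % GRID_COLS)
    return divmod(action, GRID_COLS)


def cell_to_action(row: int, col: int) -> int:
    """Inverse of action_to_cell under the same row-major assumption."""
    return row * GRID_COLS + col


first_cell = action_to_cell(0)
last_cell = action_to_cell(GRID_ROWS * GRID_COLS - 1)
```

Under this assumption, the first action maps to the cell at one corner of the grid and the last action to the opposite corner.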

## Observation Encoding


In [3]:
env.observation_space

Dict('current_node': Box(0, 3499, (1,), int32), 'fake_net_heatmap': Box(0.0, 1.0, (16384,), float32), 'is_node_placed': Box(0, 1, (3500,), int32), 'locations_x': Box(0.0, 1.0, (3500,), float32), 'locations_y': Box(0.0, 1.0, (3500,), float32), 'mask': Box(0, 1, (16384,), int32), 'netlist_index': Box(0, 0, (1,), int32))

| Key | Description |
|-----|-------------|
| current_node | The node currently being considered for placement |
| fake_net_heatmap | A representation of estimated connections between nodes |
| is_node_placed | Indicates which nodes have already been placed on the chip |
| locations_x | The x-coordinates of placed nodes |
| locations_y | The y-coordinates of placed nodes |
| mask | Indicates which actions are valid in the current state |
| netlist_index | Identifier for the current netlist being processed |
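The `mask` entry can be used to restrict a policy to valid placements. The following is a minimal sketch of uniform sampling over valid actions, assuming a mask value of 1 marks a valid action. The helper name `sample_valid_action` and the toy mask are hypothetical; with the real environment the mask would come from the observation dictionary.

```python
import random


def sample_valid_action(mask, rng: random.Random) -> int:
    """Pick uniformly among indices whose mask entry is 1 (assumed to mean valid)."""
    valid = [i for i, allowed in enumerate(mask) if allowed == 1]
    if not valid:
        raise ValueError("mask contains no valid actions")
    return rng.choice(valid)


# Toy mask standing in for obs['mask']; only three cells are valid.
toy_mask = [0] * 16384
for index in (10, 200, 4000):
    toy_mask[index] = 1

action = sample_valid_action(toy_mask, random.Random(0))
```

The same function also accepts a one-dimensional NumPy array, since it only iterates over the mask and compares each entry to 1.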

## Rewards

The reward is evaluated at the end of each episode. The placement cost binary is used to calculate the reward based on proxy wirelength, congestion, and density. An infeasible placement results in a reward of -1.0.

The reward function is defined as:

$$R(p, g) = -\text{Wirelength}(p, g) - \lambda \cdot \text{Congestion}(p, g) - \gamma \cdot \text{Density}(p, g)$$

Where:
- $p$ represents the placement
- $g$ represents the netlist graph
- $\lambda$ is the congestion weight
- $\gamma$ is the density weight

Default values in A2Perf:
- The congestion weight $\lambda$ is set to 0.01
- The density weight $\gamma$ is set to 0.01 
- The maximum density threshold is set to 0.6

These default values are based on the methodology described in [Mirhoseini et al. (2021)][1].
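To make the formula concrete, here is a small sketch that evaluates it with the default weights listed above. The function name `proxy_reward`, the `feasible` flag, and the input values are illustrative assumptions; in the environment itself the three cost terms come from the placement cost binary. The maximum density threshold is not modeled here, because how it enters the computation is not described in this section.

```python
def proxy_reward(
    wirelength: float,
    congestion: float,
    density: float,
    congestion_weight: float = 0.01,  # lambda, default listed above
    density_weight: float = 0.01,     # gamma, default listed above
    feasible: bool = True,            # hypothetical flag for this sketch
) -> float:
    """Evaluate R = -wirelength - lambda * congestion - gamma * density."""
    if not feasible:
        # An infeasible placement results in a reward of -1.0.
        return -1.0
    return -wirelength - congestion_weight * congestion - density_weight * density


# Made-up cost values, chosen only to show the arithmetic:
# -(0.1) - 0.01 * 1.0 - 0.01 * 0.5 = -0.115
example_reward = proxy_reward(wirelength=0.1, congestion=1.0, density=0.5)
```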

[1]: https://www.nature.com/articles/s41586-021-03544-w "A graph placement methodology for fast chip design"

## Termination

The episode terminates once all macros have been placed on the canvas, at which point the final reward is calculated.
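The sketch below shows one way to roll out a full episode with a random masked policy until termination. It assumes the environment follows the standard Gymnasium interface, where `reset` returns `(obs, info)` and `step` returns `(obs, reward, terminated, truncated, info)`, and that a mask value of 1 marks a valid action. The function name `run_episode` is hypothetical.

```python
import random


def run_episode(env, seed: int = 0) -> tuple[float, int]:
    """Place macros with uniformly random valid actions until the episode ends.

    Returns the summed reward and the number of steps taken.
    """
    rng = random.Random(seed)
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        # Restrict the choice to actions the mask marks as valid.
        valid = [i for i, allowed in enumerate(obs["mask"]) if allowed == 1]
        if not valid:
            raise RuntimeError("no valid action available before termination")
        obs, reward, terminated, truncated, info = env.step(rng.choice(valid))
        total_reward += float(reward)
        steps += 1
    return total_reward, steps
```

With the environment from this page, this could be called as `run_episode(gym.make('CircuitTraining-Ariane-v0'))`. Because the reward is evaluated at the end of the episode, the summed reward is expected to be dominated by the final step.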

## Registered Configurations
* `CircuitTraining-Ariane-v0`