docs: update readme
joennlae committed Dec 10, 2023
1 parent f4eee66 commit efaf69e
Showing 1 changed file with 36 additions and 64 deletions.
100 changes: 36 additions & 64 deletions README.md
@@ -1,6 +1,6 @@
<div align="center">

# Stella Nera: A halutmatmul based accelerator
# Stella Nera: A halutmatmul based Accelerator
</div>

<div align="center">
@@ -34,9 +34,6 @@

![Maddness Animation](/docs/images/maddness_animation.webp)

### Differentiable Maddness

![Differentiable Maddness](docs/images/code_preview.png)

### ResNet-9 LUTs, Thresholds, Dims

@@ -64,7 +61,7 @@ mse = np.square(C_halut - C).mean()
print(mse)
```

## Install
## Installation

```bash
# install conda environment & activate
@@ -76,28 +73,37 @@ conda activate halutmatmul
conda env create -f environment_gpu.yml --prefix /scratch/janniss/conda/halutmatmul_gpu
```

# Citation

```bibtex
@article{schonleber2023stella,
title={Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication},
author={Sch{\"o}nleber, Jannis and Cavigelli, Lukas and Andri, Renzo and Perotti, Matteo and Benini, Luca},
journal={arXiv preprint arXiv:2311.10207},
year={2023}
}
```
### Differentiable Maddness
<div align="center">
<img src="docs/images/code_preview.png" alt="Differentiable Maddness" width="600">
</div>

## Hardware - OpenROAD flow results from CI - NOT OPTIMIZED
### Hardware - OpenROAD flow results from CI - NOT OPTIMIZED

All completely open hardware results are NOT OPTIMIZED! The results are only for reference and to show the flow works.
All results from the fully open hardware flow are NOT OPTIMIZED! They are provided only for reference and to show that the flow works. The paper reports results obtained with commercial tools. Consider this a community service to make the hardware results more accessible.

| All Designs | NanGate45 |
| ------------- | ------------- |
| All Report | [All](https://github.com/joennlae/halutmatmul-openroad-reports/tree/main/latest/nangate45) |
| History | [History](https://github.com/joennlae/halutmatmul-openroad-reports/tree/main/history/nangate45) |


### Full design (halutmatmul)
#### Open Hardware Results Table
| NanGate45 | halut_matmul | halut_encoder_4 | halut_decoder |
| ------------- | ------------- | ------------- | ------------- |
| Area [μm^2] | 128816 | 46782 | 24667.5 |
| Freq [MHz] | 166.7 | 166.7 | 166.7 |
| GE | 161.423 kGE | 58.624 kGE | 30.911 kGE |
| Std Cell [#] | 65496 | 23130 | 12256 |
| Voltage [V] | 1.1 | 1.1 | 1.1 |
| Util [%] | 50.4 | 48.7 | 52.1 |
| TNS | 0 | 0 | 0 |
| Clock Net | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/reports/final_clocks.webp" alt="Clock Net" width="150"> | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/reports/final_clocks.webp" alt="Clock Net" width="150"> | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/reports/final_clocks.webp" alt="Clock Net" width="150"> |
| Routing | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/reports/final_routing.webp" alt="Routing" width="150"> | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/reports/final_routing.webp" alt="Routing" width="150"> | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/reports/final_routing.webp" alt="Routing" width="150"> |
| GDS | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/results/6_final.gds) | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/results/6_final.gds) | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/results/6_final.gds) |
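
The `GE` and `Freq` rows above can be related to the `Area` row with a little arithmetic. The sketch below is not part of the flow. It assumes that one gate equivalent corresponds to the area of the NanGate45 `NAND2_X1` cell, taken here as 0.798 μm². This value is an assumption and is not stated in this README, but it reproduces the table entries to within one gate. The helper names `gate_equivalents` and `clock_period_ns` are hypothetical.

```python
# Assumption: 1 GE = area of one NanGate45 NAND2_X1 cell (0.798 um^2).
NAND2_X1_AREA_UM2 = 0.798


def gate_equivalents(area_um2, nand2_area_um2=NAND2_X1_AREA_UM2):
    """Convert a cell area in um^2 into gate equivalents (GE)."""
    return area_um2 / nand2_area_um2


def clock_period_ns(freq_mhz):
    """Convert a clock frequency in MHz into a period in ns."""
    return 1e3 / freq_mhz


# Areas copied from the table above (um^2).
areas_um2 = {
    "halut_matmul": 128816.0,
    "halut_encoder_4": 46782.0,
    "halut_decoder": 24667.5,
}

for name, area in areas_um2.items():
    print(f"{name}: {gate_equivalents(area) / 1e3:.3f} kGE")

# 166.7 MHz corresponds to a clock period of roughly 6 ns.
print(f"clock period: {clock_period_ns(166.7):.2f} ns")
```

The last digit of a computed kGE value may differ from the table, depending on how the reported values were rounded.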


#### Full design (halutmatmul)

Run locally with:
```bash
@@ -106,53 +112,19 @@ cd hardware
ACC_TYPE=INT DATA_WIDTH=8 NUM_M=8 NUM_DECODER_UNITS=4 NUM_C=16 make halut-open-synth-and-pnr-halut_matmul
```


### Full Design
| halut_matmul | NanGate45 |
| ------------- | ------------- |
| Area [μm^2] | 128816 |
| Freq [Mhz] | 166.7 |
| GE | 161.423 kGE |
| Std Cell [#] | 65496 |
| Voltage [V] | 1.1 |
| Util [%] | 50.4 |
| TNS | 0 |
| Clock Net | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/reports/final_clocks.webp" alt="Clock Net" width="150"> |
| Routing | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/reports/final_routing.webp" alt="Routing" width="150"> |
| GDS | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_matmul/results/6_final.gds) |


### Encoder
| halut_encoder_4 | NanGate45 |
| ------------- | ------------- |
| Area [μm^2] | 46782 |
| Freq [Mhz] | 166.7 |
| GE | 58.624 kGE |
| Std Cell [#] | 23130 |
| Voltage [V] | 1.1 |
| Util [%] | 48.7 |
| TNS | 0 |
| Clock Net | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/reports/final_clocks.webp" alt="Clock Net" width="150"> |
| Routing | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/reports/final_routing.webp" alt="Routing" width="150"> |
| GDS | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_encoder_4/results/6_final.gds) |


### Decoder
| halut_decoder | NanGate45 |
| ------------- | ------------- |
| Area [μm^2] | 24667.5 |
| Freq [Mhz] | 166.7 |
| GE | 30.911 kGE |
| Std Cell [#] | 12256 |
| Voltage [V] | 1.1 |
| Util [%] | 52.1 |
| TNS | 0 |
| Clock Net | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/reports/final_clocks.webp" alt="Clock Net" width="150"> |
| Routing | <img src="https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/reports/final_routing.webp" alt="Routing" width="150"> |
| GDS | [GDS Download](https://raw.githubusercontent.com/joennlae/halutmatmul-openroad-reports/main/latest/nangate45/halut_decoder/results/6_final.gds) |


### References

* [arXiv](https://arxiv.org/abs/2106.10860) Maddness paper
* Based on [MADDness/Bolt](https://github.com/dblalock/bolt).


## Citation

```bibtex
@article{schonleber2023stella,
title={Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication},
author={Sch{\"o}nleber, Jannis and Cavigelli, Lukas and Andri, Renzo and Perotti, Matteo and Benini, Luca},
journal={arXiv preprint arXiv:2311.10207},
year={2023}
}
```
