Updated hardware components and actions for lab4 #32

Merged: 12 commits merged on Jan 30, 2024

Conversation

@jianyicheng (Collaborator) commented Jan 30, 2024

This PR manually uploads the necessary hardware components and files for lab 4.

@jianyicheng merged commit 72d2e97 into main on Jan 30, 2024
5 checks passed
@pgimenes deleted the lab4-update branch on January 31, 2024 23:35
jianyicheng pushed a commit that referenced this pull request Feb 12, 2024
* Testing by disabling apt upgrade

* Format
jianyicheng pushed a commit that referenced this pull request Feb 12, 2024
* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
jianyicheng pushed a commit that referenced this pull request Feb 13, 2024
* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue by feeding data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
jianyicheng pushed a commit that referenced this pull request Feb 17, 2024
* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: feed data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md -> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
jianyicheng pushed a commit that referenced this pull request Feb 17, 2024
* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: feed data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md -> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since it is a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
rotask pushed a commit to rotask/mase_rotask that referenced this pull request Mar 2, 2024
* Testing by disabling apt upgrade

* Format
jianyicheng pushed a commit that referenced this pull request Mar 27, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
JasonShen-SH pushed a commit to JasonShen-SH/mase_real that referenced this pull request Mar 27, 2024
* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
JasonShen-SH pushed a commit to Ruiqi-Shen/mase that referenced this pull request Mar 29, 2024
* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
JasonShen-SH pushed a commit to Ruiqi-Shen/mase that referenced this pull request Mar 29, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (DeepWok#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (DeepWok#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (DeepWok#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
JoachimSand pushed a commit to ADLS-Group7/mase_group7 that referenced this pull request Mar 29, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (DeepWok#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue by feeding data with a batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (DeepWok#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (DeepWok#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (DeepWok#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (DeepWok#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (DeepWok#11)

* Tidy up readme

* Resize

* Updated repo names (DeepWok#14)

* Fix transform (DeepWok#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close DeepWok#12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (DeepWok#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (DeepWok#26)

* fixing search action, errors caused because of recent version bumps, relates to issue DeepWok#28

* quantization pass relink fixed (DeepWok#30)

* force to be on the same device for now (DeepWok#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (DeepWok#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (DeepWok#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (DeepWok#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (DeepWok#22)

* Recovered missing changes for the search action (DeepWok#41)

* basically replicate 5a426ed (DeepWok#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (DeepWok#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (DeepWok#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (DeepWok#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md -> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md -> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest remove verible dependency in conda env, since this should be hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: feed data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since it is a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
pgimenes added a commit that referenced this pull request Apr 3, 2024
* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as the model is not implemented

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: feed data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
@pgimenes pgimenes mentioned this pull request Apr 3, 2024
pgimenes added a commit that referenced this pull request Apr 4, 2024
* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue by feeding data with batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error, and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
@pgimenes pgimenes mentioned this pull request Apr 4, 2024
jianyicheng pushed a commit that referenced this pull request Apr 23, 2024
* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since it is a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with Derek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
Co-authored-by: Zixian-Jin <84839323+Zixian-Jin@users.noreply.github.com>
Co-authored-by: Moteng Ma <852964048@qq.com>
jianyicheng pushed a commit that referenced this pull request Apr 27, 2024
* Updated CI rules to apply on more branches (#122)

* Group 12 - Hardware Normalisation (#126)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems on formatting (need to work with values smaller than 1 and not doing currently - also need to work with larger fractional part in iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formating

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as they shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* revert some changes

* remove layernorm 1d

* moving files to correct places

* formatting fixes

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
jianyicheng pushed a commit that referenced this pull request May 15, 2024
* Group 0 - MaseRT: TensorRT and ONNXRuntime Integration for MASE (#83)

* added tensorrt backbone, tensorrt quantize.py tests

* Added requirements and enabled tensorrt as action in cmd line

* onnx transform working

* minor changes

* added calibrate and calib-quant-test

* minor changes

* Setup non-terminal client

* Added training for quantization

* improved train calib

* calib changes

* Added utils

* adding fake quant

* Fixed toml issue

* minor change

* Fixed toml issues and now calibrate working in py

* Improved file system for quantize

* Calibrate and Quantize working

* Added jsc-trt as test model

* Added analysis pass

* Added lots of metrics and trt support. need to test

* data_loader change

* Squashing bugs

* Quant inference working, need to clean

* unexpected regression model

* Almost finished analysis

* TensorRT improves latency!!

* need to fix int8calib

* added training. need to test

* added int8calibrator and onnx summary

* added onnxruntime setup & placeholders

* devised onnxruntime & transform pass structure

* modify requirement from ort to ort-gpu

* ort-gpu in setup.py

* devised onnx_runtime_transform_pass, test_performances to define

* added ort inference session

* added execution provider; todo: add input type to toml and corresponding onnx processing

* Minor changes

* minor change

* minor changes

* FP16 working

* INT8 calibrator implemented but takes ages

* Minor change

* Revert "Minor change"

This reverts commit 745723bfca8e3d3659641485a705b9419f3526d2.

* Minor tidy up

* fixed INT8 and FP16

* Added fine_tune transform pass but getting circular import for chop

* Added scheduler to Chop CLI and train for Cosine Annealing LR capability

* Fine Tuning QAT working

* Improved fine tune

* improvements and bug fixes

* inference consumption measurement

* Improved table

* minor changes

* Minor change

* Minor changes

* JSC behaving itself!

* reformatted tensorrt tutorial

* Adding documentation

* more documentation

* added defaults to fine tune params

* improved directory

* changed folders to lowercase according to conventions

* onnxruntime for jsc-tiny behaves and improves latency :D

* onnx ambiguously slower on vgg7 cpu

* small fixes

* Added summarize quant

* Improved calibration and docs

* Bug fix

* fixed fine tune bug

* vision models functioning on mnist (added input channel pre-processing)

* Minor changes

* minor changes

* transform int8 changes

* INT8 Float16 comparison complete

* added checkpoints for tensorrt demos

* VGG7 test not behaving

* opt125 config

* added opt125 to notebook

* onnxruntime works and improves latency on vgg7

* adding opt125

* adjusted onnxruntime batching mismatch

* changed opt toml

* fixes

* tidy up

* Opt 125 toml

* Improved mixed precision

* Section1 notebook working

* minor changes

* section 1 of tutorial complete

* added pooling and other conv support

* modified module support

* test commit

* test commit

* test commit

* transfer to another gpu

* Added lstm support

* added lightning logs to gitignore

* MaseRT docs

* Documentation improvements

* MaseRT documentation

* Improvements to docs

* Runtime analysis refactor

* updated requirements for onnx

* ONNXRT implementation

* Bug fixes and OnnxRT dynamic quant

* dynamic quantization working

* static quantization working

* added cpu / gpu inference options

* Little improvements on jsc_toy onnx performance

* onnxruntime fixed and toml file structure change

* minor changes

* minor changes

* created trt mixed precision search space

* minor changes

* minors

* pre-tensorrt runner for search

* fixed mixed precision int8

* search space changes

* mixed precision onnx

* Onnxrt tutorial improvements

* minor changes

* still fixing mixed precision

* Mixed precision fixed

* Fixed minor static quant bug. Still large VGG latency

* fixed formatting to keep up with expected coding style

* opt not working for any transform action, investigating cause

* coding spacing adjustments

* Documentation only change

* Fixed onnxruntime large latency -  onnxruntime package issue

* quantization onnxruntime debugging for vgg model

* standardized batch sizes for experiments

* mobilenet experiments

* Starting sphinx documentation for RT

* Transform interface refactor

* Sphinx transforms

* fixed import errors for transform interface passes

* Sphinx documentation 5/6 passes

* deleted old tensorrtdev playground folder

* transform analysis pass fix

* masert readme improvement

* tutorials added to sphinx

* added docstring comments

* adjust mnist dummy_inputs for vision models, tutorial fixes, toml fixes

* minor style changes

* readme docs improvement

* Docstring and readme improvements

* Added open source contribution section

* Section 1 and 2 TensorRT tutorial complete

* added open source contribution to masert readme

* minor changes

* updated masert onnxrt overview

* minor changes

* updated jsc toy checkpoint load dir

* minor toml changes

* minor toml changes

* Reformatted using Black

* Fixed sphinx formatting issue

* Updated tomls to support new transform style config

* minor changes

* minor pr readme fix

* tensorRT_tutorial ready

* onnx quantization tutorial ready

* Added MASERT tutorials to sphinx docs

* Final formatting

* Onnxrt tutorial finalized

---------

Co-authored-by: mau-mar <mauro.marino23@ic.ac.uk>
Co-authored-by: William F Powell <wfp23@ee-mill1.ee.ic.ac.uk>
Co-authored-by: Will Powell <me@willpowell.uk>
Co-authored-by: Mauro Marino <mauro.y@gmail.com>
Co-authored-by: mau-mar <mauro.y00@gmail.com>

* Team 16: RL Search: (#87)

* removed cost for colab

* /mase/machop/configs/examples/toy_rl_search.toml

* added hw metrics

* trained model for 7k timesteps

* added HiLo env

* added vgg_tpe toml

* added 100,000 timesteps

* added vgg_rl

* fixed toml

* added checkpoint

added checkpoint

rmvd checkpoint

Try and change episode length

* added toml for vgg rl

* changed path checkpoint

* fixed reward

* corrected vgg rl toml

* fixed direction scaler

* changed n_steps and eval

* added acc and bitwidth to obs space

* final before oral

* added stuff

* added tinyVGG and cifar10_subset

* added parallelization and seed

* corrected parallelization

* added n_steps and n_envs to toml

* changed toml

* added colab toml

* changed toml

* toml

* toml and removed prints

* added wandb and paper environment

* Updated Paper Env

* reverted some changes

* added info for debug

* fixed scaled_metrics

* added toml, env, algo

* increased episode len

* changed structure

* added some stuff

* removed comments

* removed Dict, was causing a bug

* modified load

* added self in metrics

* metrics

* added best rewards

* removed logs

* Changed for Pull Request

* changes for PR

* Final commit

* refactored one file

* Added wandb comments Just to test Merge

* Reformatted core_algo

* Re-added wandb

* Changed gpu to cpu in test

* test is now only on cpu

* Removed again wandb to see if it merges

* reformatted

* removed import wandb

* Reformatted test and added wandb

* Removed wandb again

* Added tensorboard to core_algo

* removed tensorboard to pass tests

* Removed progress bar, need an import

* Re-added dependencies, as the previous commit passed the tests

Reformatted env.py

* Updated CI rules to apply on more branches

* group 8 (#123)

add sphinx docs, place files in correct directories, remove prof

remove unnecessary files

format tanh test

temporarily disable failing activation emit tests

relaunching actions

Co-authored-by: pgimenes <pgimenes@outlook.com>

* Fix software train runner RunnerBasicTrain (#90)

* ADLS Group 7 (#106)

* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again...

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md -> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with Derek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1…
jianyicheng pushed a commit that referenced this pull request Jul 7, 2024
* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matrix tools util

* built basic test

* wip group norm hardware

* added full throughput fifo to design

* update docker to HEAD

* added full throughput fifo_v2

* docker

* group norm is wip

* wip: added difference fifo to group_norm_2d

* pipeline working with temporary identity func as inv sqrt

* cleaned up unused state

* C++ NR method finished but needs cleaning

* Cleaned up nr file but need to test different Q formats

* wip: group norm

* added output rounding stage & software model

* migrated new repeat circ buffer

* added first draft of rmsnorm

* group norm has single bit error

* refactored fixed_signed_cast testbench

* fixed broken assertion

* group norm 2d tests passing on constant inv sqrt

* docstrings

* added rms norm testbench and all tests passing with constant inv sqrt

* Adapted isqrt code to collect data

* unified norm file

* wip: layernorm mase integration

* removed INTERNAL_RTL_DEPENDENCIES, new runner for simulation, emit_tb fixed

* Finished building blocks for isqrt

* simulation is running, currently failing due to small error because of repeated integer rounding

* refactored out all models

* MVP of isqrt finished

* Cleaned up a bit and adapted interface of isqrt module

* added layernorm integer quantized

* added abs to invsqrt testbench, integrated inv sqrt unit into group norm

* added group norm 2d randomised dimensions

* Fixed lut index module

* updated interfaces for inv sqrt, moved INV SQRT widths to internal params

* fixed arithmetic deps for norm.sv

* Major fix to structure of group norm2d

* Updated software model to match C++ model

* Fixed range augmentation due to register width not being wide enough

* Fixed register length issues with lut index module

* Made LUT parameterisable by width but not by size

* Fixed testbench and reduced register sizes of nr stage

* Updated the testbenches to import software module parts from a single source

* All isqrt test passing

* Fixed assert stmt

* factored out lut parameter dict

* invsqrt working but group norm sometimes has 1 bit error, fix is WIP

* wip: groupnorm2d still failing test, have no clue how it is expecting unknown data??

* remove nettype none for vivado bug

* added tcl script for vivado

* changed all software quantized modules

* added an error threshold monitor, groupnorm2d still has very weird high error bug in streaming test

* updated error threshold monitor to support no check

* major fix to fifo, groupnorm2d now passing

* add instance norm and fixed software emit verilog

* added batchnorm2d, groupnorm and instance norm quantizable modules in mase

* removed fixed signed cast

* added lut common element

* removed local path

* integrated new lut into fixed isqrt

* fixed streaming mon, added new lut isqrt into groupnorm 2d, added more tests for groupnorm 2d

* added unified norm testbench and removed old temporary inv sqrt

* updated emit verilog tb, top level norm still failing

* fixed isqrt linting error

* removed unused params from tb

* pipelined isqrt module and integrated into group norm 2d

* added new rms norm extension layers, hardware not passing tb

* fixed rms norm implementation and added tests for hw

* added rms norm to software stack

* refactored rms_norm and test_emit_verilog_norm test

* commented out rms_norm circular dependency issue not fixed

* normtb works for rms

* removed online layer norm

* moved inverse sqrt C++ fixed model

* fixed groupnorm net in test_emit_verilog_norm

* refactored test_emit_verilog_norm and readjusted test times

* docstring

* add memoryfile and top for vivado synth

* vivado script

* fixed top level params & ports

* Batch norm 2d working

* added comments

* added u250 constraints file and updated tcl

* stripped xdc file

* docstrings

* Batch norm hook up?

* integrated batchnorm into norm.sv all tests passing

* pipelined batchnorm module

* added docs for normalization layers

* fixed bugged channel selector

* added per-element scale to rms norm

* fixed rms norm function, norm_tb rms not passing

* error analysis code

* added fifo

* switched to non-project vivado flow and 333mhz clk

* removed variance clamp from groupnorm

* fix comma

* added isqrt fix and inverse NUM_VALUES fix to rms norm

* moved around wires for vivado

* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

---------

Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>

* Updated CI rules to apply on more branches

* group 8 (#123)

add sphinx docs, place files in correct directories, remove prof

remove unnecessary files

format tanh test

temporarily disable failing activation emit tests

relaunching actions

Co-authored-by: pgimenes <pgimenes@outlook.com>

* Fix software train runner RunnerBasicTrain (#90)

* WIP: pow2 rtl for softmax norm

* first working draft of pow2

* added comparator tree

* more tests

* add common metadata for BERT masegraph generated from ONNX backend

* added comparator accumulator

* added first section of softermax

* formalize onnx ir and export pass

* added splitn (untested) and reworked range reduction + lpw_pow2

* enable creating a masegraph directly from fx graphmodule

* x

* bert model from codegen working, but generating different output

* temporarily disable extended model testing

* support for more ops, refactor attr mapping as dict, BERT works, need to fix gather

* small fixes

* only test on bert

* fix tests and docs

* tidy up documentation

* matmul version for fixed linear

* added range of tests for lpw recip

* ADLS Group 7 (#106)

* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (needs to work with values smaller than 1, which it currently does not; also needs to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: more the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with Derek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
Co-authored-by: Zixian-Jin <84839323+Zixian-Jin@users.noreply.github.com>
Co-authored-by: Moteng Ma <852964048@qq.com>

* ADLS Group 17 (#118)

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matri…
Aaron-Zhao123 added a commit that referenced this pull request Jul 7, 2024
* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matrix tools util

* built basic test

* wip group norm hardware

* added full throughput fifo to design

* update docker to HEAD

* added full throughput fifo_v2

* docker

* group norm is wip

* wip: added difference fifo to group_norm_2d

* pipeline working with temporary identity func as inv sqrt

* cleaned up unused state

* C++ NR method finished but needs cleaning

* Cleaned up nr file but need to test different Q formats

* wip: group norm

* added output rounding stage & software model

* migrated new repeat circ buffer

* added first draft of rmsnorm

* group norm has single bit error

* refactored fixed_signed_cast testbench

* fixed broken assertion

* group norm 2d tests passing on constant inv sqrt

* docstrings

* added rms norm testbench and all tests passing with constant inv sqrt

* Adapted isqrt code to collect data

* unified norm file

* wip: layernorm mase integration

* removed INTERNAL_RTL_DEPENDENCIES, new runner for simulation, emit_tb fixed

* Finished building blocks for isqrt

* simulation is running, currently failing due to small error because of repeated integer rounding

* refactored out all models

* MVP of isqrt finished

* Cleaned up a bit and adapted interface of isqrt module

* added layernorm integer quantized

* added abs to invsqrt testbench, integrated inv sqrt unit into group norm

* added group norm 2d randomised dimensions

* Fixed lut index module

* updated interfaces for inv sqrt, moved INV SQRT widths to internal params

* fixed arithmetic deps for norm.sv

* Major fix to structure of group norm2d

* Updated software model to match C++ model

* Fixed range augmentation due to register width not being wide enough

* Fixed register length issues with lut index module

* Made LUT parameterisable by width but not by size

* Fixed testbench and reduced register sizes of nr stage

* Updated the testbenches to import software module parts from a single source

* All isqrt test passing

* Fixed assert stmt

* factored out lut parameter dict

* invsqrt working but group norm sometimes has 1 bit error, fix is WIP

* wip: groupnorm2d still failing test, have no clue how it is expecting unknown data??

* remove nettype none for vivado bug

* added tcl script for vivado

* changed all software quantized modules

* added an error threshold monitor, groupnorm2d still has very weird high error bug in streaming test

* updated error threshold monitor to support no check

* major fix to fifo, groupnorm2d now passing

* add instance norm and fixed software emit verilog

* added batchnorm2d, groupnorm and instance norm quantizable modules in mase

* removed fixed signed cast

* added lut common element

* removed local path

* integrated new lut into fixed isqrt

* fixed streaming mon, added new lut isqrt into groupnorm 2d, added more tests for groupnorm 2d

* added unified norm testbench and removed old temporary inv sqrt

* updated emit verilog tb, top level norm still failing

* fixed isqrt linting error

* removed unused params from tb

* pipelined isqrt module and integrated into group norm 2d

* added new rms norm extension layers, hardware not passing tb

* fixed rms norm implementation and added tests for hw

* added rms norm to software stack

* refactored rms_norm and test_emit_verilog_norm test

* commented out rms_norm circular dependency issue not fixed

* normtb works for rms

* removed online layer norm

* moved inverse sqrt C++ fixed model

* fixed groupnorm net in test_emit_verilog_norm

* refactored test_emit_verilog_norm and readjusted test times

* docstring

* add memoryfile and top for vivado synth

* vivado script

* fixed top level params & ports

* Batch norm 2d working

* added comments

* added u250 constraints file and updated tcl

* stripped xdc file

* docstrings

* Batch norm hook up?

* integrated batchnorm into norm.sv all tests passing

* pipelined batchnorm module

* added docs for normalization layers

* fixed bugged channel selector

* added per-element scale to rms norm

* fixed rms norm function, norm_tb rms not passing

* error analysis code

* added fifo

* switched to non-project vivado flow and 333mhz clk

* removed variance clamp from groupnorm

* fix comma

* added isqrt fix and inverse NUM_VALUES fix to rms norm

* moved around wires for vivado

* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

---------

Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>

* Updated CI rules to apply on more branches

* group 8 (#123)

add sphinx docs, place files in correct directories, remove prof

remove unnecessary files

format tanh test

temporarily disable failing activation emit tests

relaunching actions

Co-authored-by: pgimenes <pgimenes@outlook.com>

* Fix software train runner RunnerBasicTrain (#90)

* WIP: pow2 rtl for softmax norm

* first working draft of pow2

* added comparator tree

* more tests

* add common metadata for BERT masegraph generated from ONNX backend

* added comparator accumulator

* added first section of softermax

* formalize onnx ir and export pass

* added splitn (untested) and reworked range reduction + lpw_pow2

* enable creating a masegraph directly from fx graphmodule

* x

* bert model from codegen working, but generating different output

* temporarily disable extended model testing

* support for more ops, refactor attr mapping as dict, BERT works, need to fix gather

* small fixes

* only test on bert

* fix tests and docs

* tidy up documentation

* matmul version for fixed linear

* added range of tests for lpw recip

* ADLS Group 7 (#106)

* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as not implemented model

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems on formatting (need to work with values smaller than 1 and not doing currently - also need to work with larger fractional part in iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding to have batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with Derek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
Co-authored-by: Zixian-Jin <84839323+Zixian-Jin@users.noreply.github.com>
Co-authored-by: Moteng Ma <852964048@qq.com>

* ADLS Group 17 (#118)

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matrix tools util

* built basic test

* wip group norm hardware

*…
jianyicheng pushed a commit that referenced this pull request Jul 10, 2024
* group 8 (#123)

add sphinx docs, place files in correct directories, remove prof

remove unnecessary files

format tanh test

temporarily disable failing activation emit tests

relaunching actions

Co-authored-by: pgimenes <pgimenes@outlook.com>

* Fix software train runner RunnerBasicTrain (#90)

* WIP: pow2 rtl for softmax norm

* first working draft of pow2

* added comparator tree

* more tests

* add common metadata for BERT masegraph generated from ONNX backend

* added comparator accumulator

* added first section of softermax

* formalize onnx ir and export pass

* added splitn (untested) and reworked range reduction + lpw_pow2

* enable creating a masegraph directly from fx graphmodule

* x

* bert model from codegen working, but generating different output

* temporarily disable extended model testing

* support for more ops, refactor attr mapping as dict, BERT works, need to fix gather

* small fixes

* only test on bert

* fix tests and docs

* tidy up documentation

* matmul version for fixed linear

* added range of tests for lpw recip

* ADLS Group 7 (#106)

* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as the model is not implemented

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edition to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: data feeding now uses a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with Derek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>
Co-authored-by: Zixian-Jin <84839323+Zixian-Jin@users.noreply.github.com>
Co-authored-by: Moteng Ma <852964048@qq.com>

* ADLS Group 17 (#118)

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matrix tools util

* built basic test

* wip group norm hardware

* added full throughput fifo to design

* update docker to HEAD

* added full throughput fifo_v2

* docker

* group norm is wip

* wip: added difference fifo to group_norm_2d

* pipeline working with temporary identity func as inv sqrt

* cleaned up unused state

* C++ NR method finished but needs cleaning

* Cleaned up nr file but need to test different Q formats

* wip: group norm

* added output rounding stage & software model

* migrated new repeat circ buffer

* added first draft of rmsnorm

* group norm has single bit error

* refactored fixed_signed_cast testbench

* fixed broken assertion

* group norm 2d tests passing on constant inv sqrt

* docstrings

* added rms norm testbench and all tests passing with constant inv sqrt

* Adapted isqrt code to collect data

* unified norm file

* wip: layernorm mase integration

* removed INTERNAL_RTL_DEPENDENCIES, new runner for simulation, emit_tb fixed

* Finished building blocks for isqrt

* simulation is running, currently failing due to small error because of repeated integer rounding

* refactored out all models

* MVP of isqrt finished

* Cleaned up a bit and adapted interface of isqrt module

* added layernorm integer quantized

* added abs to invsqrt testbench, integrated inv sqrt unit into group norm

* added group norm 2d randomised dimensions

* Fixed lut index module

* updated interfaces for inv sqrt, moved INV SQRT widths to internal params

* fixed arithmetic deps for norm.sv

* Major fix to structure of group norm2d

* Updated software model to match C++ model

* Fixed range augmentation due to register width not being wide enough

* Fixed register length issues with lut index module

* Made LUT parameterisable by width but not by size

* Fixed testbench and reduced register sizes of nr stage

* Updated the testbenches to import software module parts from a single source

* All isqrt test passing

* Fixed assert stmt

* factored out lut parameter dict

* invsqrt working but group norm sometimes has 1 bit error, fix is WIP

* wip: groupnorm2d still failing test, have no clue how it is expecting unknown data??

* remove nettype none for vivado bug

* added tcl script for vivado

* changed all software quantized modules

* added an error threshold monitor, groupnorm2d still has very weird high error bug in streaming test

* updated error threshold monitor to support no check

* major fix to fifo, groupnorm2d now passing

* add instance norm and fixed software emit verilog

* added batchnorm2d, groupnorm and instance norm quantizable modules in mase

* removed fixed signed cast

* added lut common element

* removed local path

* integrated new lut into fixed isqrt

* fixed streaming mon, added new lut isqrt into groupnorm 2d, added more tests for groupnorm 2d

* added unified norm testbench and removed old temporary inv sqrt

* updated emit verilog tb, top level norm still failing

* fixed isqrt linting error

* removed unused params from tb

* pipelined isqrt module and integrated into group norm 2d

* added new rms norm extension layers, hardware not passing tb

* fixed rms norm implementation and added tests for hw

* added rms norm to software stack

* refactored rms_norm and test_emit_verilog_norm test

* commented out rms_norm circular dependency issue not fixed

* normtb works for rms

* removed online layer norm

* moved inverse sqrt C++ fixed model

* fixed groupnorm net in test_emit_verilog_norm

* refactored test_emit_verilog_norm and readjusted test times

* docstring

* add memoryfile and top for vivado synth

* vivado script

* fixed top level params & ports

* Batch norm 2d working

* added comments

* added u250 constraints file and updated tcl

* stripped xdc file

* docstrings

* Batch norm hook up?

* integrated batchnorm into norm.sv all tests passing

* pipelined batchnorm module

* added docs for normalization layers

* fixed bugged channel selector

* added per-element scale to rms norm

* fixed rms norm function, norm_tb rms not passing

* error analysis code

* added fifo

* switched to non-project vivado flow and 333mhz clk

* removed variance clamp from groupnorm

* fix comma

* added isqrt fix and inverse NUM_VALUES fix to rms norm

* moved around wires for vivado

* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

---------

Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>

* Updated CI rules to apply on more branches (#122)

* Update Slack link (#130)

* Removed instructions in README.md (users should refer to the official doc) and change the order of setup instructions (#131)

* tidy up documentation

* matmul version for fixed linear

* Fixed a bug that the "local" flag is not working (#170)

* Docker update (#171)

* Updated Docker to main for local build

* Sync scripts for updated docker setups (separating CPU and GPU containers)

* removed docker as submodule

* Pull from Makefile to avoid repeated sync on submodule

* Added instructions to log for better readability on CI log

* remove breakpoint

* refactor mase_components to use pytest for hardware regression

* fix formatting

* unified linear 2d tb

* update action and makefile

* fixes

* update utils import

* mis…
jianyicheng pushed a commit that referenced this pull request Aug 7, 2024
* pipelined batchnorm module

* added docs for normalization layers

* fixed bugged channel selector

* added per-element scale to rms norm

* fixed rms norm function, norm_tb rms not passing

* error analysis code

* added fifo

* switched to non-project vivado flow and 333mhz clk

* removed variance clamp from groupnorm

* fix comma

* added isqrt fix and inverse NUM_VALUES fix to rms norm

* moved around wires for vivado

* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

* ADLS: Hardware Normalization (Group 17) (#65)

* migrated matmul verilog

* started on group norm 2d

* wip: group norm

* migrated matrix tools util

* built basic test

* wip group norm hardware

* added full throughput fifo to design

* update docker to HEAD

* added full throughput fifo_v2

* docker

* group norm is wip

* wip: added difference fifo to group_norm_2d

* pipeline working with temporary identity func as inv sqrt

* cleaned up unused state

* C++ NR method finished but needs cleaning

* Cleaned up nr file but need to test different Q formats

* wip: group norm

* added output rounding stage & software model

* migrated new repeat circ buffer

* added first draft of rmsnorm

* group norm has single bit error

* refactored fixed_signed_cast testbench

* fixed broken assertion

* group norm 2d tests passing on constant inv sqrt

* docstrings

* added rms norm testbench and all tests passing with constant inv sqrt

* Adapted isqrt code to collect data

* unified norm file

* wip: layernorm mase integration

* removed INTERNAL_RTL_DEPENDENCIES, new runner for simulation, emit_tb fixed

* Finished building blocks for isqrt

* simulation is running, currently failing due to small error because of repeated integer rounding

* refactored out all models

* MVP of isqrt finished

* Cleaned up a bit and adapted interface of isqrt module

* added layernorm integer quantized

* added abs to invsqrt testbench, integrated inv sqrt unit into group norm

* added group norm 2d randomised dimensions

* Fixed lut index module

* updated interfaces for inv sqrt, moved INV SQRT widths to internal params

* fixed arithmetic deps for norm.sv

* Major fix to structure of group norm2d

* Updated software model to match C++ model

* Fixed range augmentation due to register width not being wide enough

* Fixed register length issues with lut index module

* Made LUT parameterisable by width but not by size

* Fixed testbench and reduced register sizes of nr stage

* Updated the testbenches to import software module parts from a single source

* All isqrt test passing

* Fixed assert stmt

* factored out lut parameter dict

* invsqrt working but group norm sometimes has 1 bit error, fix is WIP

* wip: groupnorm2d still failing test, have no clue how it is expecting unknown data??

* remove nettype none for vivado bug

* added tcl script for vivado

* changed all software quantized modules

* added an error threshold monitor, groupnorm2d still has very weird high error bug in streaming test

* updated error threshold monitor to support no check

* major fix to fifo, groupnorm2d now passing

* add instance norm and fixed software emit verilog

* added batchnorm2d, groupnorm and instance norm quantizable modules in mase

* removed fixed signed cast

* added lut common element

* removed local path

* integrated new lut into fixed isqrt

* fixed streaming mon, added new lut isqrt into groupnorm 2d, added more tests for groupnorm 2d

* added unified norm testbench and removed old temporary inv sqrt

* updated emit verilog tb, top level norm still failing

* fixed isqrt linting error

* removed unused params from tb

* pipelined isqrt module and integrated into group norm 2d

* added new rms norm extension layers, hardware not passing tb

* fixed rms norm implementation and added tests for hw

* added rms norm to software stack

* refactored rms_norm and test_emit_verilog_norm test

* commented out rms_norm circular dependency issue not fixed

* normtb works for rms

* removed online layer norm

* moved inverse sqrt C++ fixed model

* fixed groupnorm net in test_emit_verilog_norm

* refactored test_emit_verilog_norm and readjusted test times

* docstring

* add memoryfile and top for vivado synth

* vivado script

* fixed top level params & ports

* Batch norm 2d working

* added comments

* added u250 constraints file and updated tcl

* stripped xdc file

* docstrings

* Batch norm hook up?

* integrated batchnorm into norm.sv all tests passing

* pipelined batchnorm module

* added docs for normalization layers

* fixed bugged channel selector

* added per-element scale to rms norm

* fixed rms norm function, norm_tb rms not passing

* error analysis code

* added fifo

* switched to non-project vivado flow and 333mhz clk

* removed variance clamp from groupnorm

* fix comma

* added isqrt fix and inverse NUM_VALUES fix to rms norm

* moved around wires for vivado

* isqrt pipelining

* cleaned up nr stage

* vivado scripts

* fix batchnorm

* comment out batchnorm

* pipelined NR stage more

* fix nr stage bug

* more aggressive clk frequency

* added two constraint files to find fmax

* fixed timing constraints

* fixed up tests for submission

* sv linting

* python linting

* more linting

* isqrt CI

* fixed tests

* fix test_emit_verilog_norm

* removed cpp model duplicate

* fixed software tests

* reformat for pytest

* linting

* sv linting

* python linting

* linting

* docstr

* removed matmul

* Trigger Build

* group norm CI

---------

Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>

* Updated CI rules to apply on more branches

* group 8 (#123)

add sphinx docs, place files in correct directories, remove prof

remove unnecessary files

format tanh test

temporarily disable failing activation emit tests

relaunching actions

Co-authored-by: pgimenes <pgimenes@outlook.com>

* Fix software train runner RunnerBasicTrain (#90)

* WIP: pow2 rtl for softmax norm

* first working draft of pow2

* added comparator tree

* more tests

* add common metadata for BERT masegraph generated from ONNX backend

* added comparator accumulator

* added first section of softermax

* formalize onnx ir and export pass

* added splitn (untested) and reworked range reduction + lpw_pow2

* enable creating a masegraph directly from fx graphmodule

* x

* bert model from codegen working, but generating different output

* temporarily disable extended model testing

* support for more ops, refactor attr mapping as dict, BERT works, need to fix gather

* small fixes

* only test on bert

* fix tests and docs

* tidy up documentation

* matmul version for fixed linear

* added range of tests for lpw recip

* ADLS Group 7 (#106)

* Group 7 - Hardware Normalisation (#85)

* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.

* Started registering batch_norm1d as a valid quantisation op. for testing purposes.

* Added separate NotImpl for quantized batch_norm1d.

* Temporary stop-gap measure for unnamed variable access when emitting tb.

* Added script for testing quantised batch norm integration.

* Added Linear version of BatchNorm1D and registered it as a quantized module.

* Updated testing script to test quantisation and quantised graph performance.

* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.

* Fixed quantised batch norm 1d not using bias quantiser.

* implemented simple testbench - still failing as the model is not implemented

* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.

* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.

* More work on batch norm 1d tb.

* Working FE model for batch norm, but precision errors still observed.

* WIP fixed_layer_norm and CORDIC sqrt - none tested

* WIP - testbench for sqrt CORDIC

* Working FE model for batch norm 1d.

* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.

* progress on WIP sqrt implementation - still some problems with formatting (need to work with values smaller than 1, which is not done currently; also need to work with a larger fractional part in the iterative algo)

* Almost working sqrt - values deviate from matlab in STATE_4

* Added PARTS_PER_NORM parameter and explanation to layer norm.

* iterative sqrt working on a single testcase - TODO: broaden test coverage

* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.

* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)

* Added temporary measures to view post-processed outputs from TB.

* Work on layer norm implementation precision.

* Deleted old fixed layer norm file.

* Started work on cleaning up layer norm design.

* fixed sign extension when calculating sum

* Added STDV and mean as inputs to BatchNorm1d during quantisation.

* variance working - integration with sqrt in progress

* working first draft of layernorm

* Fixed parts of layer block to get Vivado to synthesize.

* fixed double assignment

* parametrizing constant in sqrt cordic

* made the design multi-cycle

* added support for group and instance norm in hardware

* Added quantized layernorm module.

* Added necessary dependencies for layernorm.

* Updated fixed batch norm to support multiple different widths for its inputs.

* Registered mean as a named parameter for the quantized batch norm.

* Added layer norm to jet substructure model. Remove later.

* Further work on LayerNormInteger integration.

* Reformatted layer norm to have right parameters. Small changes to TBs.

* Pipelined batch norm 1d.

* fixed layer pipelined EXCEPT the sqrt HW

* Pipelined sqrt almost completed - state machine yet to be removed

* working pipeline of sqrt hardware

* Added ability for batch_norm to convert between parallelism levels using new module.

* Slight reworks on parallelism conversions for batch norm.

* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.

* 1 cycle timing fix

* fixing consequence of previous 1 cycle change on sqrt

* fixed driving signal for valid in of sqrt

* removed docker credentials (#68)

* removed docker credentials

* switch docker container from ghio to docker hub

* disable page deployment from forked repos

* Skip for forked repo & print message

* Removed missed echo

* Add module passes (#57)

* updated license

* Os sync (#539)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue: feed data with a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format

* Updated dockerfile (#56)

* refactor

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Add module transform (#541)

* fix: remove import nni (#526)

* Software/emit-verilog-refactoring (#516)

* fix emit verilog test according to new naming standard following analysis pass refactoring

* linear/relu changes for new naming standard

* improved pass import

* random partitioning pass for toy model

* hardware pass refactor

* formatting

* enable new pass import flow on the CI

formatting

* enable new pass import flow on the CI

formatting

formatting

formatting relu

* Added verible path

* emit top verilog refactoring for new naming rules

* fixed errors

emit top working

* fixing bram emit

formatting

* Device Partitioning (#518)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* Partition new (#520)

* Added md syntax (#515)

* Added md syntax

* polished code in md

* test md syntax

* Added proper code blocks in doc

* Added device id as metadata for partitioning

* moved dir

* refactored partitioning pass

* updated the pass name in the init

* format

* fixed doc error and verilog format error

* fixed hardware regression test

* fixed most of the tests

* Refactored verilog param collect and add repetition check

* Added pythonpath for machop

* Refactored the interface emit

* refactored the signal and component emit

* fixed term

* refactored wiring

* enable emit verilog in the test

* Sync docker

---------

Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Os mirror (#529)

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* Update README.md with badges and a link to doc (#4)

* Update README.md

Fixed broken link and made a minor edit to add bibtex

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* fix package name

* Revert lic

---------

Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>

* MASE Hardware Refactor (#528)

* Ignores folders cloned by "make sync"

* Increased docker ram and reduced jobs for verilator

* Basic interface and bringup test

* WIP: grouped attention

* First draft of group_matmul, not tested, passed linting

* WIP: Group matmul testbench

* WIP: simple matrix multiplication with tests

* simple matrix mult tests passing locally

* added repeated random testing

* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul

* Improved runner

* fix linting issues on generate blocks

* Improved mase_cocotb runner and refactored for single source of truth

* Refactored a bunch of testbenches with new mase runner

* added background white

* Created interface for matmul module

* first draft of circular buffer

* factored out streaming interface

* added circ buffer tests, not passing

* Basic no-backpressure working for circ buffer, wip backpressure tests

* Standardised more interface names, WIP need to change tests, circular buffer working

* cleaned up & linting

* improved circ buffer tests to be generic & more coverage

* WIP on matmul.sv

* fixed ports

* improved mase_runner, added valid bit toggling to drivers

* bringup test working for matmul

* added matrix accumulator, not tested

* basic matrix mult test passing

* added signed casting, tests are not passing for edge cases

* temporary change back to fixed_cast so matmul works

* restored docker submodule

* fix verilator flags for version & fix simple matmul multidriven

* casting working for floor rounding

* basic 2 matmul tests working with rounding

* added full window matmul test

* Improved testbench param setting

* WIP: test_chain_matmul test

* fixed signed cast and chain multiply works

* added random backpressure valid tests

* added more variations to chain matmul

* added combinatorial transpose module

* WIP: matrix stream transpose

* minor comment fix

* submodule fix

* minor submodule fix

* Separate all new group_att work from hardware refactor

* minor cleanup

* linting

* fixes for HW refactor PR

format other components

components as package

* mase_components package

* enable higher python versions for pip and fix mase_cocotb imports

deepspeed dependencies

---------

Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* pass verilator linting for linear layer

linting issues fixed

* Adding software test case for lab4 (#530)

* Sync docker

* Added init test case for lab 4

* Added a pass template for cocotb test

* Added hardware models for LLM.int, AWQ, and BigLittle (#531)

* Added llm int hardware model

* Added awq hardware model in hls

* Added big little integer hardware model in hls

* Added big little bfp hardware model in HLS

* Added bfp mm

* Added p&r

* emit and simulate actions

* define parallelism per dimension in hardware metadata

* emit cocotb testbench for emitted verilog

* enable pre-emit in simulate action

* simulate action changes

* syntax shortening for graph and node level metadata handling

* enable emit tb on arbitrary mase graph

* enable emit tb on arbitrary mase graph

editable pip install in sw action

* fix pythonpath for ci

fix

fix

* update lab instructions

* Check versions

* remove verilog analysis

* removed hls part

* revert mistakes

* Os mirror (#536)

* Remove debug code (#139)

* [Draft] Add Lutnet linear and convolution (#358)

* feat: add lut linear

* style: add comment

* feat: add lutnet prune flow testing script

* feat: add lutnet convolution

* style: reformat code

* feat: init LUTNet linear and convolution weight

* feat: add linear layer-wise scaling factor

* fix: add binary_training argument

* feat: add lutnet linear full workflow

* style: run black

* fix: add necessary params in lutnet testing script

* fix: remove transform pass in testing script

* fix: same for lutnet_quantize.py

* fix: use 1 and 0 to represent true, false in toml

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* Add lutnet conv2d workflow (#394)

* feat: add lutnet conv2d workflow

* style: run black

---------

Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>

* LogicNets (#395)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Feat]: Variable fusion for LogicNets (#450)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* LUTNet software (#440)

* fix(LUTNet): add unittest and small bug fixes

* feat: add binary residual

* fix: reformat lutnet script

* fix: update related config for binary residual

* fix: add support for functions in residual to mase

* feat: add residualSign to lutnet

* fix: add torch.stack and size1 tensor result handl

* feat: add linear lutnet pass

* feat: add lutnet cli pass

* feat: add conv2d binary_residual

* add: lut_conv2d with residual sign

* style: run black

* fix: minor bug fixes

* fix: train residual layers

* add: fine-tuning with pruning masks on

* add: training with pruning mask on

* style: add comment

* add: lutnet pipeline completed

* fix: remove softmax

* fix: remove assertion

* fix: update toml file

* fix: remove assertion

* fix: add pruning_masks to conv1d

* fix: add options to disable residual for layer1

* fix: use level-pruner, copy new params in transfom

* fix: update bash script

* chore: rebase to main

* style: run black

* fix: correct quant config dictionary

---------

Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>

* fix: Jsc Models now training (#458)

* fix: convert jsc_dataset output labels to index encoding

* style: run black

* [Draft] LogicNets Hardware Pass (#451)

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* feat: logicnets linear - not yet working

* fix: logicnets linear

* style: run black

* feat: merge linear pruning and half done conv

* feat: add neuron pruning

* feat: add jetsubstructure model and dataset

* feat: logicnets init and remove activation functio

* style: run black

* fix: correct JSC-S architecture

* run black

* feat: add weight decay param

* fix: query activation functions from bl_graph

* fix: rebase to main, add jsc to the new interface.

* fix: rm redundant file

* style: run black

* chore: add dependency to build script

* style: rename model source

* style: run black

* fix: add unittest support for logicnets

* fix: move the dataset to cache directory

* fix: update toml files

* style: add comment to logicnets script

* fix: jsc dataset path

* style: run black

* fix: add jsc dataset info

* chore: update toml file

* fix: put logicN tensor to the same device as input

* fix: update jsc model

* feat: customizable logicnets fusion (not fully verified)

* fix: all logicnets linear bugs fixed, fusion pass verified

* style: run black

* copy logicnets files

* initialise emit_logicnets test file

* refactor logicnets hw code to new class

* fix: remove unneeded print

* feat: logicnets linear hw generating

* style: run black

* trigger ci

* comment failing test

---------

Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* [Draft] Coursework prep (#469)

* fix pruning bugs

* fix jsc bug

* lab1 cont

* minor

* Update lab1.md

Example in in-project cross-reference

* continue on lab 1

* new size

* lab1 done

* lab1

* minor

* remove yaml in jsc

* add jsc to get input, finished drafting lab 2

* [software] Cheng's ADLS Lab1 fix (#472)

* fix git address and format md

* fix test command and add load-type warning/exception to load_model

* fix typo and update lightning introduction

* prevent wandb logger from saving config toml

* new loggers (#473)

* beautify jsc dataset (#471)

* Adls fix logger (#475)

* fix getLogger

* Adls fix logger: format codes (#476)

* format

* Update names

* update link in lab1

* Update lab1.md

aesthetics

* Update lab1.md

* minor

* add docker setup tutorial (#480)

* Update Setup-docker-env.md

Add x11 forward comment for MacOS

* fix typos

* better naming and change the grammar a bit

* lab3 done

* minor

* Coursework Lab2 Fix - CZ (#482)

Add an explanation of MASE types
Support loading checkpoint into the model in notebook
Update statistic profiler example

* add lab1 colab notebook

* feat: add lab2 colab notebook

* fix: recover profile statistics

* feat: remove token

* lab4

* minor

* lab4

* Course prep cz lab3 (#489)

* remove legacy codes

* add comments; fix search bugs

* format codes

* nerf model and dataset skeleton

* [Draft] NeRF Port (#491)

* dataset downloading

* ported model and dataset, not passing sanity check

* training and testing flow working

* fix: requirements

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* format

* Added missing packages

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* updated license

* update docs and conda environment

docs restructuring

mase env

* lab 4 hardware stream

temporarily disable test opt

* polish labs

* Lab4 md minor tweak, doc editing (#3)

* Update lab4-hardware.md

* standardize docstr

* formatting

* add mase to pip

update to use python flow with setuptools

lutnet quantizer init.py

logicnets verilog init.py

fix license file

* migrate static docs to sphinx

* disable software CI for doc changes

* static doc images

fix code in lab 4

machop image

disable doc build on pull request, only push trigger

* Added txt to gitignore

* doc for doc

* add doc write

* Updated top-level readme (#11)

* Tidy up readme

* Resize

* Updated repo names (#14)

* Fix transform (#15)

* fix lab bugs

* fixed batchnorm issue, make data feeding use a batch size greater than 1. close #12

* formatting

---------

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* Added adding pass doc steps

* fixed deepcopy issue

* fix param

* fixed save_load mase

* fix formatting

* fix formatting

* fix numpy corner case

* test file changed

* formatting again..

* separate conda env .yml and pip requirements.txt

* fix lab issues (#23)

Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk>

* fix to the lab-1 question to point to jsc-tiny (#26)

* fixing search action, errors caused because of recent version bumps, relates to issue #28

* quantization pass relink fixed (#30)

* force to be on the same device for now (#34)

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Updated hardware components and actions for lab4 (#32)

* Updated hardware components and actions for lab4

* manual merge for lab 4 hardware update (#36)

ci paths

gitignore

* verilog format

* verilog format

* Updated the test script for hardware regression test

* Updated hardware testing CI

* Removed HLS folders and remove verilog analysis header

* Updated setup

* update watch path for hardware ci

fix

* fix hardware tests

fix

* Removed metadata value type cast test

---------

Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>

* formatting plus enable accelerator choice on search (#38)

* formatting plus enable accelerator choice on search

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* Fix directory in Train tutorial (#22)

* Recovered missing changes for the search action (#41)

* basically replicate 5a426ed (#43)

* basically replicate 5a426ed

* formatting

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>

* minor directory restructure to enable editable pip install

* gtkwave instructions for lab 4

remove prints

make pip install in hw ci editable

update test script paths

* integrate agile hardware library components (#44)

* integrate agile hardware library components

* hardware documentation on sphinx

enable hw cw

formatting

verilog formatting

fixed deps

fixed arith renaming

python3 for test hw script

add images

images from links

* lab3 doc (#47)

* linear testbench passing without data coherency check

* systolic mapping search space

* hw documentation for linear layer

formatting

* update getting started instructions and docker environment

md-> rst for docker getting started and stop triggering CIs on pull request

* bug fix

* Added link to the slack group

* Updated docker container setup (#55)

* Updated docker container setup

* Reenable software test for env test

* Revert Docker

* Updated Docker

* Reverted lic

* Updated conv_bn_fusion pass

* verilog format

* Fixed missing conflict

* python-format

* Updated dep

* Fixed hw regression test

* Synced doc

* Removed redundant files

* Updated config - dangerous!

* Removed redundant passes before changing directories

* Removed old-tests

* Removed old test folder

* python format

---------

Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Fixed doc format (#537)

* Feature/module transform (#538)

* module based swapping for quantization

* cli fix

* transform on module level

* add to script

* formatting and flow

* fix formatting

* sphinx

* I would suggest removing the verible dependency from the conda env, since this should be a hardware-related install (maybe we can open a separate file for this)

* minor

* format

* minor

* remove redundant readme

* seems like same file name clashes with pytest

* +x for .sh

* ch point to python3 for github action

* Updated file location

* Updated docker

* Fixed typo

* Changed gpu to cpu

---------

Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Pointed ch to python3

* support more type option in parse_accelerator func

---------

Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Various bug fixes related to parallelism to pass CI.

* Reformatted files with black.

* Attempt at fixing Black format diff.

* Reformatted internal comp.

* Reformatted hardware to pass CI.

* Temporary disable of Verilator warnings for further CI tests.

* Disabled sqrt TB for now.

* fixed verilator linting for sqrt HW (1 genuine error and 1 where an ignore-lint was added for unused bits)

* fixed linting issues on layer norm - some ignored as shouldn't have adverse effects

* Fixes to bugs regarding precision tests in LayerNorm.

* Fixed Verilog format in layernorm.

* Reverted accidental constant change.

* Attempt at fixing Black format diff.

* (Hopefully) final reformat.

* Removed few small accidental print-outs throughout codebase.

* Removed sys.path inserts for easy debugging in TBs.

---------

Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com>
Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com>
Co-authored-by: cano <cx922@ic.ac.uk>

* Revert "Group 7 - Hardware Normalisation (#85)" (#125)

This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a.

* ADLS Group 7 LLM int (#84)

* test

* manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py

* updated tb-related files

* removed test file llm_int8.sv

* added README for pull request

* added flow diagrams

* replaced old fifo with derrek's fifo

* modified format for CI check

* changed the output value of fixed_comparator_tree to be an absolute value. tb passed.

* formatted .py for PR sw test

* formatted .sv files for PR hw test

* removed fixed_point_divide.sv

* fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check

* I'm tired.

* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment

* added sys library

* changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches

* changed mase_runner argument to module_param_list

* formatted .py again

* IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1

* dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load

* writing docs

* finished docs

* finished README

* modified title of README

* updated figure paths

* test: latex math

* test again

* test again

* last try

* tired

* fixed markdown format bugs

---------

Co-authored-by: Moteng Ma <852964048@qq.com>

---------

Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com>
Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE>
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com>
Co-authored-by: Derek Lai <ddl20@ic.ac.uk>
Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk>
Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com>
Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk>
Co-authored-by: Cheng Zhang <chengzhang98@outlook.com>
Co-authored-by: Aa…