Fix/tests #203

Aaron-Zhao123 · 2024-07-07T20:41:09Z

Work in progress: fixes are ongoing, do not merge.

Locally tested and passed by issuing just tw

mase_components/activations/test/fixed_relu_tb.py
mase_components/activations/test/fixed_gelu_tb.py

add sphinx docs, place files in correct directories, remove prof remove unnecessary files format tanh test temporarily disable failing activation emit tests relaunching actions
Co-authored-by: pgimenes <pgimenes@outlook.com>

… to fix gather

… adls-group-17

* Group 7 - Hardware Normalisation (#85)
* Registered batch_norm1d as valid quantisation and INTERNAL_RTL op.
* Started registering batch_norm1d as a valid quantisation op. for testing purposes.
* Added separate NotImpl for quantized batch_norm1d.
* Temporary stop-gap measure for unnamed variable access when emitting tb.
* Added script for testing quantised batch norm integration.
* Added Linear version of BatchNorm1D and registered it as a quantized module.
* Updated testing script to test quantisation and quantised graph performance.
* Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE.
* Fixed quantised batch norm 1d not using bias quantiser.
* implemented simple testbench - still failing as not implemented model
* Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match.
* Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb.
* More work on batch norm 1d tb.
* Working FE model for batch norm, but precision errors still observed.
* WIP fixed_layer_norm and CORDIC sqrt - none tested
* WIP - testbench for sqrt CORDIC
* Working FE model for batch norm 1d.
* Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer.
* progress on WIP sqrt implementation - still some problems on formatting (need to work with values smaller than 1 and not doing currently - also need to work with larger fractional part in iterative algo)
* Almost working sqrt - values deviate from matlab in STATE_4
* Added PARTS_PER_NORM parameter and explanation to layer norm.
* iterative sqrt working on a single testcase - TODO: broaden test coverage
* Added (semi-functioning) layer norm SV module. Started work on corresponding TB.
* fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented)
* Added temporary measures to view post-processed outputs from TB.
* Work on layer norm implementation precision.
* Deleted old fixed layer norm file.
* Started work on cleaning up layer norm design.
* fixed sign extension when calculating sum
* Added STDV and mean as inputs to BatchNorm1d during quantisation.
* variance working - integration with sqrt in progress
* working first draft of layernorm
* Fixed parts of layer block to get Vivado to synthesize.
* fixed double assignment
* parametrizing constant in sqrt cordic
* made the design multi-cycle
* added support for group and instance norm in hardware
* Added quantized layernorm module.
* Added necessary dependencies for layernorm.
* Updated fixed batch norm to support multiple different widths for its inputs.
* Registered mean as a named parameter for the quantized batch norm.
* Added layer norm to jet substructure model. Remove later.
* Further work on LayerNormInteger integration.
* Reformatted layer norm to have right parameters. Small changes to TBs.
* Pipelined batch norm 1d.
* fixed layer pipelined EXCEPT the sqrt HW
* Pipelined sqrt almost completed - state machine yet to be removed
* working pipeline of sqrt hardware
* Added ability for batch_norm to convert between parallelism levels using new module.
* Slight reworks on parallelism conversions for batch norm.
* Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules.
* 1 cycle timing fix
* fixing consequence of previous 1 cycle change on sqrt
* fixed driving signal for valid in of sqrt
* removed docker credentials (#68)
* removed docker credentials
* switch docker container from ghio to docker hub
* disable page deployment from forked repos
* Skip for forked repo & print message
* Removed missed echo
* Add module passes (#57)
* updated license
* Os sync (#539)
* fix: remove import nni (#526)
* Software/emit-verilog-refactoring (#516)
* fix emit verilog test according to new naming standard following analysis pass refactoring
* linear/relu changes for new naming standard
* improved pass import
* random partitioning pass for toy model
* hardware pass refactor
* formatting
* enable new pass import flow on the CI formatting
* enable new pass import flow on the CI formatting formatting formatting relu
* Added verible path
* emit top verilog refactoring for new naming rules
* fixed errors emit top working
* fixing bram emit formatting
* Device Partitioning (#518)
* Added md syntax (#515)
* Added md syntax
* polished code in md
* test md syntax
* Added proper code blocks in doc
* Added device id as metadata for partitioning
* Partition new (#520)
* Added md syntax (#515)
* Added md syntax
* polished code in md
* test md syntax
* Added proper code blocks in doc
* Added device id as metadata for partitioning
* moved dir
* refactored partitioning pass
* updated the pass name in the init
* format
* fixed doc error and verilog format error
* fixed hardware regression test
* fixed most of the tests
* Refactored verilog param collect and add repetition check
* Added pythonpath for machop
* Refactored the interface emit
* refactored the signal and component emit
* fixed term
* refactored wiring
* enable emit verilog in the test
* Sync docker
---------
Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>
* Os mirror (#529)
* updated license
* update docs and conda environment docs restructuring mase env
* lab 4 hardware stream temporarily disable test opt
* polish labs
* Lab4 md minor tweak, doc editing (#3)
* Update lab4-hardware.md
* standardize docstr
* formatting
* Update README.md with badges and a link to doc (#4)
* Update README.md Fixed broken link and minor edition to add bibtex
* add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file
* fix package name
* Revert lic
---------
Co-authored-by: pgimenes <pgimenes@outlook.com>
Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com>
Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com>
Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com>
* MASE Hardware Refactor (#528)
* Ignores folders cloned by "make sync"
* Increased docker ram and reduced jobs for verilator
* Basic interface and bringup test
* WIP: grouped attention
* First draft of group_matmul, not tested, passed linting
* WIP: Group matmul testbench
* WIP: simple matrix multiplication with tests
* simple matrix mult tests passing locally
* added repeated random testing
* Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul
* Improved runner
* fix linting issues on generate blocks
* Improved mase_cocotb runner and refactored for single source of truth
* Refactored a bunch of testbenches with new mase runner
* added background white
* Created interface for matmul module
* first draft of circular buffer
* factored out streaming interface
* added circ buffer tests, not passing
* Basic no-backpressure working for circ buffer, wip backpressure tests
* Standardised more interface names, WIP need to change tests, circular buffer working
* cleaned up & linting
* improved circ buffer tests to be generic & more coverage
* WIP on matmul.sv
* fixed ports
* improved mase_runner, added valid bit toggling to drivers
* bringup test working for matmul
* added matrix accumulator, not tested
* basic matrix mult test passing
* added signed casting, tests are not passing for edge cases
* temporary change back to fixed_cast so matmul works * restored docker submodule * fix verilator flags for version & fix simple matmul multidriven * casting working for floor rounding * basic 2 matmul tests working with rounding * added full window matmul test * Improved testbench param setting * WIP: test_chain_matmul test * fixed signed cast and chain multiply works * added random backpressure valid tests * added more variations to chain matmul * added combinatorial transpose module * WIP: matrix stream transpose * minor comment fix * submodule fix * minor submodule fix * Separate all new group_att work from hardware refactor * minor cleanup * linting * fixes for HW refactor PR format other components components as package * mase_components package * enable higher python versions for pip and fix mase_cocotb imports deepspeed dependencies --------- Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: pgimenes <pgimenes@outlook.com> * pass verilator linting for linear layer linting issues fixed * Adding software test case for lab4 (#530) * Sync docker * Added init test case for lab 4 * Added a pass template for cocotb test * Added hardware models for LLM.int, AWQ, and BigLittle (#531) * Added llm int hardware model * Added awq hardware model in hls * Added big little integer hardware model in hls * Added big little bfp hardware model in HLS * Added bfp mm * Added p&r * emit and simulate actions * define parallelism per dimension in hardware metadata * emit cocotb testbench for emitted verilog * enable pre-emit in simulate action * simulate action changes * syntax shortening for graph and node level metadata handling * enable emit tb on arbitrary mase graph * enable emit tb on arbitrary mase graph editable pip install in sw action * fix pythonpath for ci fix fix * update lab instructions * Check versions * remove verilog analysis * removed hls part * revert mistakes * Os mirror (#536) * Remove debug code (#139) * [Draft] Add Lutnet linear and convolution 
(#358) * feat: add lut linear * style: add comment * feat: add lutnet prune flow testing script * feat: add lutnet convolution * style: reformat code * feat: init LUTNet linear and convolution weight * feat: add linear layer-wise scaling factor * fix: add binary_training argument * feat: add lutnet linear full workflow * style: run black * fix: add necessary params in lutnet testing script * fix: remove transform pass in testing script * fix: same for lutnet_quantize.py * fix: use 1 and 0 to represent true, false in toml --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * Add lutnet conv2d workflow (#394) * feat: add lutnet conv2d workflow * style: run black --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * LogicNets (#395) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Feat]: Variable fusion for LogicNets (#450) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * LUTNet software (#440) * fix(LUTNet): add unittest and small bug fixes * feat: add binary residual * fix: reformat lutnet script * fix: update related config for binary residual * fix: add support for functions in residual to mase * feat: add residualSign to lutnet * fix: add torch.stack and size1 tensor result handl * feat: add linear lutnet 
pass * feat: add lutnet cli pass * feat: add conv2d binary_residual * add: lut_conv2d with residual sign * style: run black * fix: minor bug fixs * fix: train residual layers * add: fine-tuning with pruning masks on * add: training with pruning mask on * style: add comment * add: lutnet pipeline completed * fix: remove softmax * fix: remove assertion * fix: update toml file * fix: remove assertion * fix: add pruning_masks to conv1d * fix: add options to disable residual for layer1 * fix: use level-pruner, copy new params in transfom * fix: update bash script * chore: rebase to main * style: run black * fix: correct quant config dictionary --------- Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> * fix: Jsc Models now training (#458) * fix: convert jsc_dataset output labels to index encoding * style: run black * [Draft] LogicNets Hardware Pass (#451) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black * copy logicnets files * initialise emit_logicnets test file * refactor logicnets hw code to new class * fix: remove unneeded print * feat: logicnets linear hw generating * style: run black * trigger ci * comment failing test --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Draft] Coursework prep (#469) * fix pruning bugs * fix jsc bug * lab1 cont * minor * Update lab1.md Example in 
in-project cross-reference * continue on lab 1 * new size * lab1 done * lab1 * minor * remove yaml in jsc * add jsc to get input, finished drafting lab 2 * [software] Cheng's ADLS Lab1 fix (#472) * fix git address and format md * fix test command and add load-type warning/exception to load_model * fix typo and update lightning introduction * prevent wandb logger from saving config toml * new loggers (#473) * beautify jsc dataset (#471) * Adls fix logger (#475) * fix getLogger * Adls fix logger: format codes (#476) * format * Update names * update link in lab1 * Update lab1.md aesthetics * Update lab1.md * minor * add docker setup tutorial (#480) * Update Setup-docker-env.md Add x11 forward comment for MacOS * fix typos * better naming and change the grammar a bit * lab3 done * minor * Coursework Lab2 Fix - CZ (#482) Add an explanation of MASE types Support loading checkpoint into the model in notebook Update statistic profiler example * add lab1 colab notebook * feat: add lab2 colab notebook * fix: recover profile statistics * feat: remove token * lab4 * minor * lab4 * Course prep cz lab3 (#489) * remove legacy codes * add comments; fix search bugs * format codes * nerf model and dataset skeleton * [Draft] NeRF Port (#491) * dataset downloading * ported model and dataset, not passing sanity check * training and testingg flow working * fix: requirements --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * format * Added missing packages --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * updated license * update 
docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * migrate static docs to sphinx * disable software CI for doc changes * static doc images fix code in lab 4 machop image disable doc build on pull request, only push trigger * Added txt to gitignore * doc for doc * add doc write * Updated top-level readme (#11) * Tidy up readme * Resize * Updated repo names (#14) * Fix transform (#15) * fix lab bugs * fixed batchnom issue, make data feeding to have batch size greater than 1. close #12 * formatting --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * Added adding pass doc steps * fixed deepcopy issue * fix param * fixed save_load mase * fix formatting * fix formatting * fix numpy corner case * test file chagned * formatting again.. 
* separate conda env .yml and pip requirements.txt * fix lab issues (#23) Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * fix to the lab-1 quesiton to point to jsc-tiny (#26) * fixing search action, errors caused because of recent version bumps, relates to issue #28 * quantization pass relink fixed (#30) * force to be on the same device for now (#34) Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Updated hardware components and actions for lab4 (#32) * Updated hardware components and actions for lab4 * manual merge for lab 4 hardware update (#36) ci paths gitignore * verilog format * verilog format * Updated the test script for hardware regression test * Updated hardware testing CI * Removed HLS folders and remove verilog analysis header * Updated setup * update watch path for hardware ci fix * fix hardware tests fix * Removed metadata value type cast test --------- Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> * formatting plus enable accelerator choice on search (#38) * formatting plus enable accelerator choice on search * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Fix directory in Train tutorial (#22) * Recovered missing changes for the search action (#41) * basically replicate 5a426ed (#43) * basically replicate 5a426ed * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * minor directory restructure to enable editable pip install * gtkwave instructions for lab 4 remove prints make pip install in hw ci editable update test script paths * integrate agile hardware library components (#44) * integrate agile hardware library components * hardware documentation on sphinx enable hw cw formatting verilog formatting fixed deps fixed arith renaming python3 for test hw script add images images from links * lab3 doc (#47) * linear testbench passing without data coherency check * systolic mapping search space * hw 
documentation for linear layer formatting * update getting started instructions and docker environment md-> rst for docker getting started and stop triggering CIs on pull request * bug fix * Added link to the slack group * Updated docker container setup (#55) * Updated docker container setup * Reenable software test for env test * Revert Docker * Updated Docker * Reverted lic * Updated conv_bn_fusion pass * verilog format * Fixed missing conflict * python-format * Updated dep * Fixed hw regression test * Synced doc * Removed redundant files * Updated config - dangerous! * Removed redundant passes before changing directories * Removed old-tests * Removed old test folder * python format --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Fixed doc format * Updated dockerfile (#56) * refactor --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai 
<53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Add module transform (#541) * fix: remove import nni (#526) * Software/emit-verilog-refactoring (#516) * fix emit verilog test according to new naming standard following analysis pass refactoring * linear/relu changes for new naming standard * improved pass import * random partitioning pass for toy model * hardware pass refactor * formatting * enable new pass import flow on the CI formatting * enable new pass import flow on the CI formatting formatting formatting relu * Added verible path * emit top verilog refactoring for new naming rules * fixed errors emit top working * fixing bram emit formating * Device Partitioning (#518) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * Partition new (#520) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * moved dir * refactored partitioning pass * updated the pass name in the init * format * fixed doc error and verilog format error * fixed hardware regression test * fixed most of the tests * Refactored verilog param collect and add repetition check * Added pythonpath for machop * Refactored the interface emit * refactored the signal and component emit * 
fixed term * refactored wiring * enable emit verilog in the test * Sync docker --------- Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * Os mirror (#529) * updated license * update docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * Update README.md with badges and a link to doc (#4) * Update README.md Fixed broken link and minor edition to add bibtex * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * fix package name * Revert lic --------- Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> * MASE Hardware Refactor (#528) * Ignores folders cloned by "make sync" * Increased docker ram and reduced jobs for verilator * Basic interface and bringup test * WIP: grouped attention * First draft of group_matmul, not tested, passed linting * WIP: Group matmul testbench * WIP: simple matrix multiplication with tests * simple matrix mult tests passing locally * added repeated random testing * Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul * Improved runner * fix linting issues on generate blocks * Improved mase_cocotb runner and refactored for single source of truth * Refactored a bunch of testbenches with new mase runner * added background white * Created interface for matmul module * first draft of circular buffer * factored out streaming interface * added circ buffer tests, not passing * Basic no-backpressure working for circ buffer, wip backpressure tests * Standardised more interface names, WIP need to change tests, circular buffer working * cleaned up & linting * improved circ buffer tests to be 
generic & more converage * WIP on matmul.sv * fixed ports * improved mase_runner, added valid bit toggling to drivers * bringup test working for matmul * added matrix accumulator, not tested * basic matrix mult test passing * added signed casting, tests are not passing for edge cases * temporary change back to fixed_cast so matmul works * restored docker submodule * fix verilator flags for version & fix simple matmul multidriven * casting working for floor rounding * basic 2 matmul tests working with rounding * added full window matmul test * Improved testbench param setting * WIP: test_chain_matmul test * fixed signed cast and chain multiply works * added random backpressure valid tests * added more variations to chain matmul * added combinatorial transpose module * WIP: matrix stream transpose * minor comment fix * submodule fix * minor submodule fix * Separate all new group_att work from hardware refactor * minor cleanup * linting * fixes for HW refactor PR format other components components as package * mase_components package * enable higher python versions for pip and fix mase_cocotb imports deepspeed dependencies --------- Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: pgimenes <pgimenes@outlook.com> * pass verilator linting for linear layer linting issues fixed * Adding software test case for lab4 (#530) * Sync docker * Added init test case for lab 4 * Added a pass template for cocotb test * Added hardware models for LLM.int, AWQ, and BigLittle (#531) * Added llm int hardware model * Added awq hardware model in hls * Added big little integer hardware model in hls * Added big little bfp hardware model in HLS * Added bfp mm * Added p&r * emit and simulate actions * define parallelism per dimension in hardware metadata * emit cocotb testbench for emitted verilog * enable pre-emit in simulate action * simulate action changes * syntax shortening for graph and node level metadata handling * enable emit tb on arbitrary mase graph * enable emit tb on 
arbitrary mase graph editable pip install in sw action * fix pythonpath for ci fix fix * update lab instructions * Check versions * remove verilog analysis * removed hls part * revert mistakes * Os mirror (#536) * Remove debug code (#139) * [Draft] Add Lutnet linear and convolution (#358) * feat: add lut linear * style: add comment * feat: add lutnet prune flow testing script * feat: add lutnet convolution * style: reformat code * feat: init LUTNet linear and convolution weight * feat: add linear layer-wise scaling factor * fix: add binary_training argument * feat: add lutnet linear full workflow * style: run black * fix: add necessary params in lutnet testing script * fix: remove transform pass in testing script * fix: same for lutnet_quantize.py * fix: use 1 and 0 to represent true, false in toml --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * Add lutnet conv2d workflow (#394) * feat: add lutnet conv2d workflow * style: run black --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * LogicNets (#395) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Feat]: Variable fusion for LogicNets (#450) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * LUTNet software (#440) * fix(LUTNet): add unittest and small bug fixes * feat: add binary residual * fix: reformat lutnet script * fix: update related config for binary residual * fix: add support for functions in residual to mase * feat: add residualSign to lutnet * fix: add torch.stack and size1 tensor result handl * feat: add linear lutnet 
pass * feat: add lutnet cli pass * feat: add conv2d binary_residual * add: lut_conv2d with residual sign * style: run black * fix: minor bug fixs * fix: train residual layers * add: fine-tuning with pruning masks on * add: training with pruning mask on * style: add comment * add: lutnet pipeline completed * fix: remove softmax * fix: remove assertion * fix: update toml file * fix: remove assertion * fix: add pruning_masks to conv1d * fix: add options to disable residual for layer1 * fix: use level-pruner, copy new params in transfom * fix: update bash script * chore: rebase to main * style: run black * fix: correct quant config dictionary --------- Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> * fix: Jsc Models now training (#458) * fix: convert jsc_dataset output labels to index encoding * style: run black * [Draft] LogicNets Hardware Pass (#451) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black * copy logicnets files * initialise emit_logicnets test file * refactor logicnets hw code to new class * fix: remove unneeded print * feat: logicnets linear hw generating * style: run black * trigger ci * comment failing test --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Draft] Coursework prep (#469) * fix pruning bugs * fix jsc bug * lab1 cont * minor * Update lab1.md Example in 
in-project cross-reference * continue on lab 1 * new size * lab1 done * lab1 * minor * remove yaml in jsc * add jsc to get input, finished drafting lab 2 * [software] Cheng's ADLS Lab1 fix (#472) * fix git address and format md * fix test command and add load-type warning/exception to load_model * fix typo and update lightning introduction * prevent wandb logger from saving config toml * new loggers (#473) * beautify jsc dataset (#471) * Adls fix logger (#475) * fix getLogger * Adls fix logger: format codes (#476) * format * Update names * update link in lab1 * Update lab1.md aesthetics * Update lab1.md * minor * add docker setup tutorial (#480) * Update Setup-docker-env.md Add x11 forward comment for MacOS * fix typos * better naming and change the grammar a bit * lab3 done * minor * Coursework Lab2 Fix - CZ (#482) Add an explanation of MASE types Support loading checkpoint into the model in notebook Update statistic profiler example * add lab1 colab notebook * feat: add lab2 colab notebook * fix: recover profile statistics * feat: remove token * lab4 * minor * lab4 * Course prep cz lab3 (#489) * remove legacy codes * add comments; fix search bugs * format codes * nerf model and dataset skeleton * [Draft] NeRF Port (#491) * dataset downloading * ported model and dataset, not passing sanity check * training and testingg flow working * fix: requirements --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * format * Added missing packages --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * updated license * update 
docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * migrate static docs to sphinx * disable software CI for doc changes * static doc images fix code in lab 4 machop image disable doc build on pull request, only push trigger * Added txt to gitignore * doc for doc * add doc write * Updated top-level readme (#11) * Tidy up readme * Resize * Updated repo names (#14) * Fix transform (#15) * fix lab bugs * fixed batchnom issue, make data feeding to have batch size greater than 1. close #12 * formatting --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * Added adding pass doc steps * fixed deepcopy issue * fix param * fixed save_load mase * fix formatting * fix formatting * fix numpy corner case * test file chagned * formatting again.. 
* separate conda env .yml and pip requirements.txt * fix lab issues (#23) Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * fix to the lab-1 quesiton to point to jsc-tiny (#26) * fixing search action, errors caused because of recent version bumps, relates to issue #28 * quantization pass relink fixed (#30) * force to be on the same device for now (#34) Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Updated hardware components and actions for lab4 (#32) * Updated hardware components and actions for lab4 * manual merge for lab 4 hardware update (#36) ci paths gitignore * verilog format * verilog format * Updated the test script for hardware regression test * Updated hardware testing CI * Removed HLS folders and remove verilog analysis header * Updated setup * update watch path for hardware ci fix * fix hardware tests fix * Removed metadata value type cast test --------- Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> * formatting plus enable accelerator choice on search (#38) * formatting plus enable accelerator choice on search * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Fix directory in Train tutorial (#22) * Recovered missing changes for the search action (#41) * basically replicate 5a426ed (#43) * basically replicate 5a426ed * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * minor directory restructure to enable editable pip install * gtkwave instructions for lab 4 remove prints make pip install in hw ci editable update test script paths * integrate agile hardware library components (#44) * integrate agile hardware library components * hardware documentation on sphinx enable hw cw formatting verilog formatting fixed deps fixed arith renaming python3 for test hw script add images images from links * lab3 doc (#47) * linear testbench passing without data coherency check * systolic mapping search space * hw 
documentation for linear layer formatting * update getting started instructions and docker environment md-> rst for docker getting started and stop triggering CIs on pull request * bug fix * Added link to the slack group * Updated docker container setup (#55) * Updated docker container setup * Reenable software test for env test * Revert Docker * Updated Docker * Reverted lic * Updated conv_bn_fusion pass * verilog format * Fixed missing conflict * python-format * Updated dep * Fixed hw regression test * Synced doc * Removed redundant files * Updated config - dangerous! * Removed redundant passes before changing directories * Removed old-tests * Removed old test folder * python format --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Fixed doc format (#537) * Feature/module transform (#538) * module based swapping for quantization * cli fix * transform on module level * add to script * formating and flow * fix formating * sphinx * I would suggest remove verible dependency in conda env, since this should be hardware-related install (maybe we can open a separate file for this) * minor * format * minor * remove redundant readme * seems like 
same file name clashes with pytest * +x for .sh * ch point to python3 for github action * Updated file location * Updated docker * Fixed typo * Changed gpu to cpu --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Pointed ch to python3 * support more type option in parse_accelerator func --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong 
<70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Various bug fixes related to parallelism to pass CI. * Reformatted files with black. * Attempt at fixing Black format diff. * Reformatted internal comp. * Reformatted hardware to pass CI. * Temporary disable of Verilator warnings for further CI tests. * Disabled sqrt TB for now. * fixed verilator linting for sqrt HW(1 genuine error and 1 where added ignore lint for unused bits) * fixed linting issues on layer norm - some ignored as shouldn't have adverse effects * Fixes to bugs regarding precision tests in LayerNorm. * Fixed Verilog format in layernorm. * Reverted accidental constant change. * Attempt at fixing Black format diff. * (Hopefully) final reformat. * Removed few small accidental print-outs throughout codebase. * Removed sys.path inserts for easy debugging in TBs. 
--------- Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Revert "Group 7 - Hardware Normalisation (#85)" (#125) This reverts commit 05bad8077e9fb3a869ab614e07ae8e17b314788a. * ADLS Group 7 LLM int (#84) * test * manually merged branch mazi to group7_llm for pull request. Remaining issues: 1. fixed_cmp_tree_tb failed 2. -Wno arguments in mase_cocotb/runner.py * updated tb-related files * removed test file llm_int8.sv * added README for pull request * added flow diagrams * replaced old fifo with derrek's fifo * modified format for CI check * changed the output value of fixed_comparator_tree to be an absolute value. tb passed. * formatted .py for PR sw test * formatted .sv files for PR hw test * removed fixed_point_divide.sv * fixed Verilator lint errors and passed scripts/test-hardware.py test. Ready for PR hw regression check * I'm tired. 
* removed user-specific mase_cocotb path, which is not needed in the standard mase_docker environment * added sys library * changed the normal generated random num to be (0,30) to fit the check_results in many common module testbenches * changed mase_runner argument to module_param_list * formatted .py again * IMPORTANT: fixed bias-related signal declarations (especially DATA_IN_PARALLELISM_DIM_0 -> DATA_OUT_DIM_0). fixed_matmul_core_tb passed for HAS_BIAS=1. fixed_linear passed for HAS_BIAS=0 but still failed for HAS_BIAS=1 * dummy modification: changed the parameter 'self.in_rows' from 2000 to 20 in order to reduce compilation load * writing docs * finished docs * finished README * modified title of README * updated figure paths * test: latex math * test again * test again * last try * tired * fixed markdown format bugs --------- Co-authored-by: Moteng Ma <852964048@qq.com> --------- Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com> Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> Co-authored-by: Zixian-Jin 
<84839323+Zixian-Jin@users.noreply.github.com> Co-authored-by: Moteng Ma <852964048@qq.com>

* ADLS: Hardware Normalization (Group 17) (#65) * migrated matmul verilog * started on group norm 2d * wip: group norm * migrated matrix tools util * built basic test * wip group norm hardware * added full throughput fifo to design * update docker to HEAD * added full throughput fifo_v2 * docker * group norm is wip * wip: added difference fifo to group_norm_2d * pipeline working with temporary identity func as inv sqrt * cleaned up unused state * C++ NR method finished but needs cleaning * Cleaned up nr file but need to test different Q formats * wip: group norm * added output rounding stage & software model * migrated new repeat circ buffer * added first draft of rmsnorm * group norm has single bit error * refactored fixed_signed_cast testbench * fixed broken assertion * group norm 2d tests passing on constant inv sqrt * docstrings * added rms norm testbench and all tests passing with constant inv sqrt * Adapted isqrt code to collect data * unified norm file * wip: layernorm mase integration * removed INTERNAL_RTL_DEPENDENCIES, new runner for simulation, emit_tb fixed * Finished building blocks for isqrt * simulation is running, currently failing due to small error because of repeated integer rounding * refactored out all models * MVP of isqrt finished * Cleaned up a bit and adapted interface of isqrt module * added layernorm integer quantized * added abs to invsqrt testbench, integrated inv sqrt unit into group norm * added group norm 2d randomised dimensions * Fixed lut index module * updated interfaces for inv sqrt, moved INV SQRT widths to internal params * fixed arithmetic deps for norm.sv * Major fix to structure of group norm2d * Updated software model to match C++ model * Fixed range augmentation due to register width not being wide enough * Fixed register length issues with lut index module * Made LUT parameterisable by width but not by size * Fixed testbench and reduced register sizes of nr stage * Updated the testbenches to import software module parts 
from a single source * All isqrt test passing * Fixed assert stmt * factored out lut parameter dict * invsqrt working but group norm sometimes has 1 bit error, fix is WIP * wip: groupnorm2d still failing test, have no clue how it is expecting unknown data?? * remove nettype none for vivado bug * added tcl script for vivado * changed all software quantized modules * added an error threshold monitor, groupnorm2d still has very weird high error bug in streaming test * updated error threshold monitor to support no check * major fix to fifo, groupnorm2d now passing * add instance norm and fixed software emit verilog * added batchnorm2d, groupnorm and instance norm quantizable modules in mase * removed fixed signed cast * added lut common element * removed local path * integrated new lut into fixed isqrt * fixed streaming mon, added new lut isqrt into groupnorm 2d, added more tests for groupnorm 2d * added unified norm testbench and removed old temporary inv sqrt * updated emit verilog tb, top level norm still failing * fixed isqrt linting error * removed unused params from tb * pipelined isqrt module and integrated into group norm 2d * added new rms norm extension layers, hardware not passing tb * fixed rms norm implementation and added tests for hw * added rms norm to software stack * refactored rms_norm and test_emit_verilog_norm test * commented out rms_norm circular dependency issue not fixed * normtb works for rms * removed online layer norm * moved inverse sqrt C++ fixed model * fixed groupnorm net in test_emit_verilog_norm * refactored test_emit_verilog_norm and readjusted test times * docstring * add memoryfile and top for vivado synth * vivado script * fixed top level params & ports * Batch norm 2d working * added comments * added u250 constraints file and updated tcl * stripped xdc file * docstrings * Batch norm hook up? 
* integrated batchnorm into norm.sv all tests passing * pipelined batchnorm module * added docs for normalization layers * fixed bugged channel selector * added per-element scale to rms norm * fixed rms norm function, norm_tb rms not passing * error analysis code * added fifo * switched to non-project vivado flow and 333mhz clk * removed variance clamp from groupnorm * fix comma * added isqrt fix and inverse NUM_VALUES fix to rms norm * moved around wires for vivado * isqrt pipelineing * cleaned up nr stage * vivado scripts * fix batchnorm * comment out batchnorm * pipelined NR stage more * fix nr stage bug * more aggressive clk frequency * added two contraint files to find fmax * fixed timing constraints * fixed up tests for submission * sv linting * python linting * more linting * isqrt CI * fixed tests * fix test_emit_verilog_norm * removed cpp model duplicate * fixed software tests * reformat for pytest * linting * sv linting * python linting * linting * docstr * removed matmul * Trigger Build * group norm CI --------- Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> * Updated CI rules to apply on more branches (#122) * Update Slack link (#130) * Removed instructions in README.md (users should refer to the official doc) and change the order of setup instructions (#131) * tidy up documentation * matmul version for fixed linear * Fixed a bug that the "local" flag is not working (#170) * Docker update (#171) * Updated Docker to main for local build * Sync scripts for updated docker setups (seperating CPU and GPU containers) * removed docker as submodule * Pull from Makefile to avoid repeated sync on submodule * Added instructions to log for better readability on CI log * remove breakpoint * refactor mase_components to use pytest for hardware regression * fix formatting * unified linear 2d tb * update action and makefile * fixes * update utils import * missing filepath updates for emit norm * fix to locate 
results.xml --------- Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: J-u1i0 <Julio.Castillejo.Motta@gmail.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk>

* Group 13 ADLS Coursework submission (#69) * imported the activations, rtl and tb, and changed deps.py * stream monitor + roller edge case + fifo wire bug + unpacked_reg bug * emit verilog integration * refactor - Verify Required Activations + Softmax * gen sv updates, emit verilog dependence update * docs except softmax * silu doc had a small mistake * alpha parameter for elu * brought hardshrink from other branch * file cleanup and bringing hardshrink from other branch * emit verilog generates lut * made 1 comment be on two lines * more comments * changed comments * adding lut modules * lut in tb * code cleanup #1 * tested with luts (bardia) * emit verilog tested and working * added pipelining * added bitstring to pip reqs * softmax tb * softmax doc * changed python files reformatted using python3 -m black FILE_NAME * formatted verilog files by hand * removed a script * verilog linter * debug, adding linting to lut * adding luts in efforts to pass ci bakhtiar * bakhtiar: removing dir to fix ci --------- Co-authored-by: ANDY WANNA <andy.wanna.02@gmail.com> Co-authored-by: bardia01 <bardia.mzad@gmail.com> Co-authored-by: bardia01 <93927927+bardia01@users.noreply.github.com> * fixes to merge ADLS group 13 PR * fix generate memory script path * generate activation LUTs before test-hardware linting * fix * build mase components script * editable pip install in hw action * remove editable install * include in pytest test list include untested tbs as skipped tests gitignore *_lut.sv, html, *.csv import pytest * verilog formatting * hardswish rename --------- Co-authored-by: bakhtiarZ <93926720+bakhtiarZ@users.noreply.github.com> Co-authored-by: ANDY WANNA <andy.wanna.02@gmail.com> Co-authored-by: bardia01 <bardia.mzad@gmail.com> Co-authored-by: bardia01 <93927927+bardia01@users.noreply.github.com>

* Updated CI rules to apply on more branches (#122) * Group 12 - Hardware Normalisation (#126) * Registered batch_norm1d as valid quantisation and INTERNAL_RTL op. * Started registering batch_norm1d as a valid quantisation op. for testing purposes. * Added seperate NotImpl for quantized batch_norm1d. * Temporary stop-gap measure for unnamed variable access when emitting tb. * Added script for testing quantised batch norm integration. * Added Linear version of BatchNorm1D and registered it as a quantized module. * Updated testing script to test quantisation and quantised graph performance. * Added initial batch norm system verilog component. AUTHOR: SCOTT VANDENBERGHE. * Fixed quantised batch norm 1d not using bias quantiser. * implemented simple testbench - still failing as not implemented model * Reworked BatchNorm1D SV module to retrieve gamma/std/mean etc from external BRAM modules. Rewrote TB to match. * Attempts at getting a FE model of BatchNorm1D to integrate with Cocotb. * More work on batch norm 1d tb. * Working FE model for batch norm, but precision errors still observed. * WIP fixed_layer_norm and CORDIC sqrt - none tested * WIP - testbench for sqrt CORDIC * Working FE model for batch norm 1d. * Added extra TODO comment for FE BatchNorm TB model to use BatchNorm software layer. * progress on WIP sqrt implementation - still some problems on formating (need to work with values smaller than 1 and not doing currently - also need to work with larger fractional part in iterative algo * Almost working sqrt - values deviate from matlab in STATE_4 * Added PARTS_PER_NORM parameter and explanation to layer norm. * iterative sqrt working on a single testcase - TODO: broaden test coverage * Added (semi-functioning) layer norm SV module. Started work on corresponding TB. * fix to sqrt hardware - removed rescaling for smaller numbers (wasn't fully implemented * Added temporary measures to view post-processed outputs from TB. 
* Work on layer norm implementation precision. * Deleted old fixed layer norm file. * Started work on cleaning up layer norm design. * fixed sign extension when calculating sum * Added STDV and mean as inputs to BatchNorm1d during quantisation. * variance working - integration with sqrt in progress * working first draft of layernorm * Fixed parts of layer block to get Vivado to synthesize. * fixed double assignment * parametrizing constant in sqrt cordic * made the design multi-cycle * added support for group and instance norm in hardware * Added quantized layernorm module. * Added necessary dependencies for layernorm. * Updated fixed batch norm to support multiple different widths for its inputs. * Registered mean as a named parameter for the quantized batch norm. * Added layer norm to jet substructure model. Remove later. * Further work on LayerNormInteger integration. * Reformatted layer norm to have right parameters. Small changes to TBs. * Pipelined batch norm 1d. * fixed layer pipelined EXCEPT the sqrt HW * Pipelined sqrt almost completed - state machine yet to be removed * working pipeline of sqrt hardware * Added ability for batch_norm to convert between parallelism levels using new module. * Slight reworks on parallelism conversions for batch norm. * Unused, but potentially useful: Created a join_n module for joining ready/valid signals of an arbitrary number of modules. 
* 1 cycle timing fix * fixing consequence of previous 1 cycle change on sqrt * fixed driving signal for valid in of sqrt * removed docker credentials (#68) * removed docker credentials * switch docker container from ghio to docker hub * disable page deployment from forked repos * Skip for forked repo & print message * Removed missed echo * Add module passes (#57) * updated license * Os sync (#539) * fix: remove import nni (#526) * Software/emit-verilog-refactoring (#516) * fix emit verilog test according to new naming standard following analysis pass refactoring * linear/relu changes for new naming standard * improved pass import * random partitioning pass for toy model * hardware pass refactor * formatting * enable new pass import flow on the CI formatting * enable new pass import flow on the CI formatting formatting formatting relu * Added verible path * emit top verilog refactoring for new naming rules * fixed errors emit top working * fixing bram emit formating * Device Partitioning (#518) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * Partition new (#520) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * moved dir * refactored partitioning pass * updated the pass name in the init * format * fixed doc error and verilog format error * fixed hardware regression test * fixed most of the tests * Refactored verilog param collect and add repetition check * Added pythonpath for machop * Refactored the interface emit * refactored the signal and component emit * fixed term * refactored wiring * enable emit verilog in the test * Sync docker --------- Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * Os mirror (#529) * updated license * update docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily 
disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * Update README.md with badges and a link to doc (#4) * Update README.md Fixed broken link and minor edition to add bibtex * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * fix package name * Revert lic --------- Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> * MASE Hardware Refactor (#528) * Ignores folders cloned by "make sync" * Increased docker ram and reduced jobs for verilator * Basic interface and bringup test * WIP: grouped attention * First draft of group_matmul, not tested, passed linting * WIP: Group matmul testbench * WIP: simple matrix multiplication with tests * simple matrix mult tests passing locally * added repeated random testing * Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul * Improved runner * fix linting issues on generate blocks * Improved mase_cocotb runner and refactored for single source of truth * Refactored a bunch of testbenches with new mase runner * added background white * Created interface for matmul module * first draft of circular buffer * factored out streaming interface * added circ buffer tests, not passing * Basic no-backpressure working for circ buffer, wip backpressure tests * Standardised more interface names, WIP need to change tests, circular buffer working * cleaned up & linting * improved circ buffer tests to be generic & more converage * WIP on matmul.sv * fixed ports * improved mase_runner, added valid bit toggling to drivers * bringup test working for matmul * added matrix accumulator, not tested * basic matrix mult test passing * added signed casting, tests are not passing for edge cases 
* temporary change back to fixed_cast so matmul works * restored docker submodule * fix verilator flags for version & fix simple matmul multidriven * casting working for floor rounding * basic 2 matmul tests working with rounding * added full window matmul test * Improved testbench param setting * WIP: test_chain_matmul test * fixed signed cast and chain multiply works * added random backpressure valid tests * added more variations to chain matmul * added combinatorial transpose module * WIP: matrix stream transpose * minor comment fix * submodule fix * minor submodule fix * Separate all new group_att work from hardware refactor * minor cleanup * linting * fixes for HW refactor PR format other components components as package * mase_components package * enable higher python versions for pip and fix mase_cocotb imports deepspeed dependencies --------- Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: pgimenes <pgimenes@outlook.com> * pass verilator linting for linear layer linting issues fixed * Adding software test case for lab4 (#530) * Sync docker * Added init test case for lab 4 * Added a pass template for cocotb test * Added hardware models for LLM.int, AWQ, and BigLittle (#531) * Added llm int hardware model * Added awq hardware model in hls * Added big little integer hardware model in hls * Added big little bfp hardware model in HLS * Added bfp mm * Added p&r * emit and simulate actions * define parallelism per dimension in hardware metadata * emit cocotb testbench for emitted verilog * enable pre-emit in simulate action * simulate action changes * syntax shortening for graph and node level metadata handling * enable emit tb on arbitrary mase graph * enable emit tb on arbitrary mase graph editable pip install in sw action * fix pythonpath for ci fix fix * update lab instructions * Check versions * remove verilog analysis * removed hls part * revert mistakes * Os mirror (#536) * Remove debug code (#139) * [Draft] Add Lutnet linear and convolution 
(#358) * feat: add lut linear * style: add comment * feat: add lutnet prune flow testing script * feat: add lutnet convolution * style: reformat code * feat: init LUTNet linear and convolution weight * feat: add linear layer-wise scaling factor * fix: add binary_training argument * feat: add lutnet linear full workflow * style: run black * fix: add necessary params in lutnet testing script * fix: remove transform pass in testing script * fix: same for lutnet_quantize.py * fix: use 1 and 0 to represent true, false in toml --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * Add lutnet conv2d workflow (#394) * feat: add lutnet conv2d workflow * style: run black --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * LogicNets (#395) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Feat]: Variable fusion for LogicNets (#450) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * LUTNet software (#440) * fix(LUTNet): add unittest and small bug fixes * feat: add binary residual * fix: reformat lutnet script * fix: update related config for binary residual * fix: add support for functions in residual to mase * feat: add residualSign to lutnet * fix: add torch.stack and size1 tensor result handl * feat: add linear lutnet 
pass * feat: add lutnet cli pass * feat: add conv2d binary_residual * add: lut_conv2d with residual sign * style: run black * fix: minor bug fixs * fix: train residual layers * add: fine-tuning with pruning masks on * add: training with pruning mask on * style: add comment * add: lutnet pipeline completed * fix: remove softmax * fix: remove assertion * fix: update toml file * fix: remove assertion * fix: add pruning_masks to conv1d * fix: add options to disable residual for layer1 * fix: use level-pruner, copy new params in transfom * fix: update bash script * chore: rebase to main * style: run black * fix: correct quant config dictionary --------- Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> * fix: Jsc Models now training (#458) * fix: convert jsc_dataset output labels to index encoding * style: run black * [Draft] LogicNets Hardware Pass (#451) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black * copy logicnets files * initialise emit_logicnets test file * refactor logicnets hw code to new class * fix: remove unneeded print * feat: logicnets linear hw generating * style: run black * trigger ci * comment failing test --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Draft] Coursework prep (#469) * fix pruning bugs * fix jsc bug * lab1 cont * minor * Update lab1.md Example in 
in-project cross-reference * continue on lab 1 * new size * lab1 done * lab1 * minor * remove yaml in jsc * add jsc to get input, finished drafting lab 2 * [software] Cheng's ADLS Lab1 fix (#472) * fix git address and format md * fix test command and add load-type warning/exception to load_model * fix typo and update lightning introduction * prevent wandb logger from saving config toml * new loggers (#473) * beautify jsc dataset (#471) * Adls fix logger (#475) * fix getLogger * Adls fix logger: format codes (#476) * format * Update names * update link in lab1 * Update lab1.md aesthetics * Update lab1.md * minor * add docker setup tutorial (#480) * Update Setup-docker-env.md Add x11 forward comment for MacOS * fix typos * better naming and change the grammar a bit * lab3 done * minor * Coursework Lab2 Fix - CZ (#482) Add an explanation of MASE types Support loading checkpoint into the model in notebook Update statistic profiler example * add lab1 colab notebook * feat: add lab2 colab notebook * fix: recover profile statistics * feat: remove token * lab4 * minor * lab4 * Course prep cz lab3 (#489) * remove legacy codes * add comments; fix search bugs * format codes * nerf model and dataset skeleton * [Draft] NeRF Port (#491) * dataset downloading * ported model and dataset, not passing sanity check * training and testingg flow working * fix: requirements --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * format * Added missing packages --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * updated license * update 
docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * migrate static docs to sphinx * disable software CI for doc changes * static doc images fix code in lab 4 machop image disable doc build on pull request, only push trigger * Added txt to gitignore * doc for doc * add doc write * Updated top-level readme (#11) * Tidy up readme * Resize * Updated repo names (#14) * Fix transform (#15) * fix lab bugs * fixed batchnom issue, make data feeding to have batch size greater than 1. close #12 * formatting --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * Added adding pass doc steps * fixed deepcopy issue * fix param * fixed save_load mase * fix formatting * fix formatting * fix numpy corner case * test file chagned * formatting again.. 
* separate conda env .yml and pip requirements.txt * fix lab issues (#23) Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * fix to the lab-1 quesiton to point to jsc-tiny (#26) * fixing search action, errors caused because of recent version bumps, relates to issue #28 * quantization pass relink fixed (#30) * force to be on the same device for now (#34) Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Updated hardware components and actions for lab4 (#32) * Updated hardware components and actions for lab4 * manual merge for lab 4 hardware update (#36) ci paths gitignore * verilog format * verilog format * Updated the test script for hardware regression test * Updated hardware testing CI * Removed HLS folders and remove verilog analysis header * Updated setup * update watch path for hardware ci fix * fix hardware tests fix * Removed metadata value type cast test --------- Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> * formatting plus enable accelerator choice on search (#38) * formatting plus enable accelerator choice on search * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Fix directory in Train tutorial (#22) * Recovered missing changes for the search action (#41) * basically replicate 5a426ed (#43) * basically replicate 5a426ed * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * minor directory restructure to enable editable pip install * gtkwave instructions for lab 4 remove prints make pip install in hw ci editable update test script paths * integrate agile hardware library components (#44) * integrate agile hardware library components * hardware documentation on sphinx enable hw cw formatting verilog formatting fixed deps fixed arith renaming python3 for test hw script add images images from links * lab3 doc (#47) * linear testbench passing without data coherency check * systolic mapping search space * hw 
documentation for linear layer formatting * update getting started instructions and docker environment md-> rst for docker getting started and stop triggering CIs on pull request * bug fix * Added link to the slack group * Updated docker container setup (#55) * Updated docker container setup * Reenable software test for env test * Revert Docker * Updated Docker * Reverted lic * Updated conv_bn_fusion pass * verilog format * Fixed missing conflict * python-format * Updated dep * Fixed hw regression test * Synced doc * Removed redundant files * Updated config - dangerous! * Removed redundant passes before changing directories * Removed old-tests * Removed old test folder * python format --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Fixed doc format * Updated dockerfile (#56) * refactor --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai 
<53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Add module transform (#541) * fix: remove import nni (#526) * Software/emit-verilog-refactoring (#516) * fix emit verilog test according to new naming standard following analysis pass refactoring * linear/relu changes for new naming standard * improved pass import * random partitioning pass for toy model * hardware pass refactor * formatting * enable new pass import flow on the CI formatting * enable new pass import flow on the CI formatting formatting formatting relu * Added verible path * emit top verilog refactoring for new naming rules * fixed errors emit top working * fixing bram emit formating * Device Partitioning (#518) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * Partition new (#520) * Added md syntax (#515) * Added md syntax * polished code in md * test md syntax * Added proper code blocks in doc * Added device id as metadata for partitioning * moved dir * refactored partitioning pass * updated the pass name in the init * format * fixed doc error and verilog format error * fixed hardware regression test * fixed most of the tests * Refactored verilog param collect and add repetition check * Added pythonpath for machop * Refactored the interface emit * refactored the signal and component emit * 
fixed term * refactored wiring * enable emit verilog in the test * Sync docker --------- Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * Os mirror (#529) * updated license * update docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * Update README.md with badges and a link to doc (#4) * Update README.md Fixed broken link and minor edition to add bibtex * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * fix package name * Revert lic --------- Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> * MASE Hardware Refactor (#528) * Ignores folders cloned by "make sync" * Increased docker ram and reduced jobs for verilator * Basic interface and bringup test * WIP: grouped attention * First draft of group_matmul, not tested, passed linting * WIP: Group matmul testbench * WIP: simple matrix multiplication with tests * simple matrix mult tests passing locally * added repeated random testing * Moved a bunch of hardware files, ALL TESTS BROKEN except for simple_matmul * Improved runner * fix linting issues on generate blocks * Improved mase_cocotb runner and refactored for single source of truth * Refactored a bunch of testbenches with new mase runner * added background white * Created interface for matmul module * first draft of circular buffer * factored out streaming interface * added circ buffer tests, not passing * Basic no-backpressure working for circ buffer, wip backpressure tests * Standardised more interface names, WIP need to change tests, circular buffer working * cleaned up & linting * improved circ buffer tests to be 
generic & more converage * WIP on matmul.sv * fixed ports * improved mase_runner, added valid bit toggling to drivers * bringup test working for matmul * added matrix accumulator, not tested * basic matrix mult test passing * added signed casting, tests are not passing for edge cases * temporary change back to fixed_cast so matmul works * restored docker submodule * fix verilator flags for version & fix simple matmul multidriven * casting working for floor rounding * basic 2 matmul tests working with rounding * added full window matmul test * Improved testbench param setting * WIP: test_chain_matmul test * fixed signed cast and chain multiply works * added random backpressure valid tests * added more variations to chain matmul * added combinatorial transpose module * WIP: matrix stream transpose * minor comment fix * submodule fix * minor submodule fix * Separate all new group_att work from hardware refactor * minor cleanup * linting * fixes for HW refactor PR format other components components as package * mase_components package * enable higher python versions for pip and fix mase_cocotb imports deepspeed dependencies --------- Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: pgimenes <pgimenes@outlook.com> * pass verilator linting for linear layer linting issues fixed * Adding software test case for lab4 (#530) * Sync docker * Added init test case for lab 4 * Added a pass template for cocotb test * Added hardware models for LLM.int, AWQ, and BigLittle (#531) * Added llm int hardware model * Added awq hardware model in hls * Added big little integer hardware model in hls * Added big little bfp hardware model in HLS * Added bfp mm * Added p&r * emit and simulate actions * define parallelism per dimension in hardware metadata * emit cocotb testbench for emitted verilog * enable pre-emit in simulate action * simulate action changes * syntax shortening for graph and node level metadata handling * enable emit tb on arbitrary mase graph * enable emit tb on 
arbitrary mase graph editable pip install in sw action * fix pythonpath for ci fix fix * update lab instructions * Check versions * remove verilog analysis * removed hls part * revert mistakes * Os mirror (#536) * Remove debug code (#139) * [Draft] Add Lutnet linear and convolution (#358) * feat: add lut linear * style: add comment * feat: add lutnet prune flow testing script * feat: add lutnet convolution * style: reformat code * feat: init LUTNet linear and convolution weight * feat: add linear layer-wise scaling factor * fix: add binary_training argument * feat: add lutnet linear full workflow * style: run black * fix: add necessary params in lutnet testing script * fix: remove transform pass in testing script * fix: same for lutnet_quantize.py * fix: use 1 and 0 to represent true, false in toml --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * Add lutnet conv2d workflow (#394) * feat: add lutnet conv2d workflow * style: run black --------- Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> * LogicNets (#395) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Feat]: Variable fusion for LogicNets (#450) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * LUTNet software (#440) * fix(LUTNet): add unittest and small bug fixes * feat: add binary residual * fix: reformat lutnet script * fix: update related config for binary residual * fix: add support for functions in residual to mase * feat: add residualSign to lutnet * fix: add torch.stack and size1 tensor result handl * feat: add linear lutnet 
pass * feat: add lutnet cli pass * feat: add conv2d binary_residual * add: lut_conv2d with residual sign * style: run black * fix: minor bug fixs * fix: train residual layers * add: fine-tuning with pruning masks on * add: training with pruning mask on * style: add comment * add: lutnet pipeline completed * fix: remove softmax * fix: remove assertion * fix: update toml file * fix: remove assertion * fix: add pruning_masks to conv1d * fix: add options to disable residual for layer1 * fix: use level-pruner, copy new params in transfom * fix: update bash script * chore: rebase to main * style: run black * fix: correct quant config dictionary --------- Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> * fix: Jsc Models now training (#458) * fix: convert jsc_dataset output labels to index encoding * style: run black * [Draft] LogicNets Hardware Pass (#451) * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. 
* fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * feat: logicnets linear - not yet working * fix: logicnets linear * style: run black * feat: merge linear pruning and half done conv * feat: add neuron pruning * feat: add jetsubstructure model and dataset * feat: logicnets init and remove activation functio * style: run black * fix: correct JSC-S architecture * run black * feat: add weight decay param * fix: query activation functions from bl_graph * fix: rebase to main, add jsc to the new interface. * fix: rm redundant file * style: run black * chore: add dependency to build script * style: rename model source * style: run black * fix: add unittest support for logicnets * fix: more the dataset to cache directory * fix: update toml files * style: add comment to logicnets script * fix: jsc dataset path * style: run black * fix: add jsc dataset info * chore: update toml file * fix: put logicN tensor to the same device as input * fix: update jsc model * feat: customizable logicnets fusion (not fully verified) * fix: all logicnets linear bugs fixed, fusion pass verified * style: run black * copy logicnets files * initialise emit_logicnets test file * refactor logicnets hw code to new class * fix: remove unneeded print * feat: logicnets linear hw generating * style: run black * trigger ci * comment failing test --------- Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * [Draft] Coursework prep (#469) * fix pruning bugs * fix jsc bug * lab1 cont * minor * Update lab1.md Example in 
in-project cross-reference * continue on lab 1 * new size * lab1 done * lab1 * minor * remove yaml in jsc * add jsc to get input, finished drafting lab 2 * [software] Cheng's ADLS Lab1 fix (#472) * fix git address and format md * fix test command and add load-type warning/exception to load_model * fix typo and update lightning introduction * prevent wandb logger from saving config toml * new loggers (#473) * beautify jsc dataset (#471) * Adls fix logger (#475) * fix getLogger * Adls fix logger: format codes (#476) * format * Update names * update link in lab1 * Update lab1.md aesthetics * Update lab1.md * minor * add docker setup tutorial (#480) * Update Setup-docker-env.md Add x11 forward comment for MacOS * fix typos * better naming and change the grammar a bit * lab3 done * minor * Coursework Lab2 Fix - CZ (#482) Add an explanation of MASE types Support loading checkpoint into the model in notebook Update statistic profiler example * add lab1 colab notebook * feat: add lab2 colab notebook * fix: recover profile statistics * feat: remove token * lab4 * minor * lab4 * Course prep cz lab3 (#489) * remove legacy codes * add comments; fix search bugs * format codes * nerf model and dataset skeleton * [Draft] NeRF Port (#491) * dataset downloading * ported model and dataset, not passing sanity check * training and testingg flow working * fix: requirements --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * format * Added missing packages --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> * updated license * update 
docs and conda environment docs restructuring mase env * lab 4 hardware stream temporarily disable test opt * polish labs * Lab4 md minor tweak, doc editing (#3) * Update lab4-hardware.md * standardize docstr * formatting * add mase to pip update to use python flow with setuptools lutnet quantizer init.py logicnets verilog init.py fix license file * migrate static docs to sphinx * disable software CI for doc changes * static doc images fix code in lab 4 machop image disable doc build on pull request, only push trigger * Added txt to gitignore * doc for doc * add doc write * Updated top-level readme (#11) * Tidy up readme * Resize * Updated repo names (#14) * Fix transform (#15) * fix lab bugs * fixed batchnom issue, make data feeding to have batch size greater than 1. close #12 * formatting --------- Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * Added adding pass doc steps * fixed deepcopy issue * fix param * fixed save_load mase * fix formatting * fix formatting * fix numpy corner case * test file chagned * formatting again.. 
* separate conda env .yml and pip requirements.txt * fix lab issues (#23) Co-authored-by: Bryan E Tan <bet20@ee-tarrasque.ee.ic.ac.uk> * fix to the lab-1 quesiton to point to jsc-tiny (#26) * fixing search action, errors caused because of recent version bumps, relates to issue #28 * quantization pass relink fixed (#30) * force to be on the same device for now (#34) Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Updated hardware components and actions for lab4 (#32) * Updated hardware components and actions for lab4 * manual merge for lab 4 hardware update (#36) ci paths gitignore * verilog format * verilog format * Updated the test script for hardware regression test * Updated hardware testing CI * Removed HLS folders and remove verilog analysis header * Updated setup * update watch path for hardware ci fix * fix hardware tests fix * Removed metadata value type cast test --------- Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> * formatting plus enable accelerator choice on search (#38) * formatting plus enable accelerator choice on search * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * Fix directory in Train tutorial (#22) * Recovered missing changes for the search action (#41) * basically replicate 5a426ed (#43) * basically replicate 5a426ed * formating --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> * minor directory restructure to enable editable pip install * gtkwave instructions for lab 4 remove prints make pip install in hw ci editable update test script paths * integrate agile hardware library components (#44) * integrate agile hardware library components * hardware documentation on sphinx enable hw cw formatting verilog formatting fixed deps fixed arith renaming python3 for test hw script add images images from links * lab3 doc (#47) * linear testbench passing without data coherency check * systolic mapping search space * hw 
documentation for linear layer formatting * update getting started instructions and docker environment md-> rst for docker getting started and stop triggering CIs on pull request * bug fix * Added link to the slack group * Updated docker container setup (#55) * Updated docker container setup * Reenable software test for env test * Revert Docker * Updated Docker * Reverted lic * Updated conv_bn_fusion pass * verilog format * Fixed missing conflict * python-format * Updated dep * Fixed hw regression test * Synced doc * Removed redundant files * Updated config - dangerous! * Removed redundant passes before changing directories * Removed old-tests * Removed old test folder * python format --------- Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Fixed doc format (#537) * Feature/module transform (#538) * module based swapping for quantization * cli fix * transform on module level * add to script * formating and flow * fix formating * sphinx * I would suggest remove verible dependency in conda env, since this should be hardware-related install (maybe we can open a separate file for this) * minor * format * minor * remove redundant readme * seems like 
same file name clashes with pytest * +x for .sh * ch point to python3 for github action * Updated file location * Updated docker * Fixed typo * Changed gpu to cpu --------- Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Pointed ch to python3 * support more type option in parse_accelerator func --------- Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong 
<70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * Various bug fixes related to parallelism to pass CI. * Reformatted files with black. * Attempt at fixing Black format diff. * Reformatted internal comp. * Reformatted hardware to pass CI. * Temporary disable of Verilator warnings for further CI tests. * Disabled sqrt TB for now. * fixed verilator linting for sqrt HW(1 genuine error and 1 where added ignore lint for unused bits) * fixed linting issues on layer norm - some ignored as shouldn't have adverse effects * Fixes to bugs regarding precision tests in LayerNorm. * Fixed Verilog format in layernorm. * Reverted accidental constant change. * Attempt at fixing Black format diff. * (Hopefully) final reformat. * Removed few small accidental print-outs throughout codebase. * Removed sys.path inserts for easy debugging in TBs. 
--------- Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE> Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Pedro Gimenes <55806722+pgimenes@users.noreply.github.com> Co-authored-by: pgimenes <pgimenes@outlook.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk> * revert some changes * remove layernorm 1d * moving files to correct places * formatting fixes --------- Co-authored-by: Jianyi Cheng <jianyi.cheng@cl.cam.ac.uk> Co-authored-by: JoachimSand <37040245+JoachimSand@users.noreply.github.com> Co-authored-by: sv720 <sv720@PC-mo22-113.OASIS.UCLOUVAIN.BE> Co-authored-by: bet20ICL <73127883+bet20ICL@users.noreply.github.com> Co-authored-by: Aaron Zhao <Aaron-Zhao123@users.noreply.github.com> Co-authored-by: Aaron Zhao <aaronzhao0731@gmail.com> Co-authored-by: Derek Lai <53407744+dereklai1@users.noreply.github.com> Co-authored-by: Derek Lai <ddl20@ic.ac.uk> Co-authored-by: ChengZhang-98 <102538889+ChengZhang-98@users.noreply.github.com> Co-authored-by: TszHang Wong <thw20@ee-beholder0.ee.ic.ac.uk> Co-authored-by: Tsz-hang Wong <70986970+JeffreyWong20@users.noreply.github.com> Co-authored-by: TszHang Wong 
<thw20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Bryan Tan <bet20@ee-tarrasque.ee.ic.ac.uk> Co-authored-by: Cheng Zhang <chengzhang98@outlook.com> Co-authored-by: Aaron-Zhao123 <aaronzhao@gmail.com> Co-authored-by: Basheq Tarifi <42390965+btarifi10@users.noreply.github.com> Co-authored-by: cano <cx922@ic.ac.uk>

jianyicheng and others added 30 commits April 3, 2024 17:07

group 8 (#123)

7bdd573

add sphinx docs, place files in correct directories, remove prof remove unnecessary files format tanh test temporarily disable failing activation emit tests relaunching actions Co-authored-by: pgimenes <pgimenes@outlook.com>

Fix software train runner RunnerBasicTrain (#90)

bfd3a04

Merge branch 'group_17_norm' into gqa

1744136

WIP: pow2 rtl for softmax norm

7f6500e

first working draft of pow2

2595e31

added comparator tree

28b91bd

more tests

467197f

add common metadata for BERT masegraph generated from ONNX backend

2f371ba

added comparator accumulator

00176d1

added first section of softermax

c67b8d8

formalize onnx ir and export pass

3b9878b

added splitn (untested) and reworked range reduction + lpw_pow2

8199a49

enable creating a masegraph directly from fx graphmodule

2e0f393

x

facb123

bert model from codegen working, but generating different output

f58d9ba

temporarily disable extended model testing

64b64c4

support for more ops, refactor attr mapping as dict, BERT works, need to fix gather

e47922d

small fixes

d3d1c46

only test on bert

7db2b89

fix tests and docs

a181dbe

Merge branch 'main' of https://github.com/DeepWok/mase into onnx

50ed11e

Merge branch 'main-adls-2324' of https://github.com/DeepWok/mase into adls-group-17

28b9d40

tidy up documentation

b6d8163

matmul version for fixed linear

683274a

added range of tests for lpw recip

d40410b

started raise granularity transform pass

c57fc0f

Aaron-Zhao123 added 6 commits July 7, 2024 21:33

reformat

c022a8d

Merge branch 'main' into fix/tests

1ae6186

put hw test to justfiles

b040bbd

missed reformat

afb48a8

reformatting

50ea03c

docker python3 quirk

a479901

Aaron-Zhao123 mentioned this pull request Jul 9, 2024

Hardware Regression Test #201

Open

Aaron-Zhao123 and others added 21 commits July 9, 2024 18:06

fixed various activation modules

8dcb317

moved helper funcs

b8d4214

fixing imports

546ba4f

Temporarily disable memory generation

aea587b

Removed temporary path (since docker is fixed)

ecc8e17

Added back the lut generation and fix the path for LUT components

0ef1c79

Added path

19118b0

refined the message

b23f8ba

Added back the path

5d51e3b

Inlined the memory generation for LUT into the test bench

bb8b5c2

Added line break

26698d5

Test jobs on software test

51c1e50

Reorganized CI report directories and add to jobs

224253e

fixed typo in yml

764af1d

Refactored CIs into a single script

2d3d7de

Added output check

5fd151f

Clean files

f944e22

fixed extension typo

626a607

fixed typo

8c1faf6

reformat

1b04b7c

Fixed wrong file name for the artifacts

f9f710b

jianyicheng merged commit 8d746ee into main Jul 10, 2024
4 checks passed

jianyicheng deleted the fix/tests branch July 10, 2024 16:49
