* Initial commit
* finn flow: pass absolute path names to finn
* Added scripts for roofline analysis
* Making the output save in the current directory
* release v0.2.0 Enable 4 bits
* Bringing up a branch that is just the plugin framework for the BERT ops that have been added
* Initial cleanup script. Performs some simplification and does some surgery to remove the Dropout layer. For some reason the IdentityOps are not being removed
* Added a simple input arg
* Moving to bert_build
* Added a transformation to reorder the inputs so that the remove IdentityOP transformation is effective.
* Initial cut and laying the groundwork for plugin-based shuffle convert_to_hw operator
* Getting stubs up for shuffle op and starting to populate some
* Cleanup and some more asserts to check permutation list and shapes match up
* Initial helper functions for shuffle work
* Adding the input_generator for the cases where the inner dimension is not migrating.
* Adding latest version of the onnx model and combining cleanup and bringup scripts into a single build script with multiple steps.
* Added the infer QuantSoftMax to the pipecleaner build script, renamed the brevitas script
* First cut at shuffle specialise layer
* Registering Shuffle_hls
* Added convert step that is currently skipped
* Added a step that attempts to specialise layers on the pipecleaner model
* Using fpgapart from the config instead
* fixed model
* adding some streamlining steps to the build flow which are passing through on the modified input model
* Initial commit
* finnbrainsmith integration
* Added a simple README for now
* fixing typo thanks @auphelia
* Initial build shuffle tests up
* populating member functions for getting the dtype and instream/outstream width for HLS generation
* Adding the loop_coeffs to the attribute types dict
* Needed to give nodes unique names to start generating hardware
* Adding a custom HLSBackend where the tcl generation is overridden so that we can include the hlsextension directory
* Fixing some portname issues in the generated HLS code
* IP successfully building
* Added cppsim support, passed suspiciously easily
* Added some temporary stop-gaps with a brainsmith_templates so that we can support vector inputs before they appear in finn/dev
* Fixing loop bound/coefficient zipping ordering
* Reshaping now happening properly and avoiding cppsim segfault
* removing IPgen step... for now...
* Adding testing from pytorch for the shuffles
* cppsim from pytorch to hw is passing
* Ramping up testing for all the shuffle types
* Removing redundant reshape in testing
* First cut at rtlsim support for shuffles
* First shuffle RTLSim tests passing
* cleaning up the test a little
* Cleaning up the InferShuffle transformation
* shuffle cppsim codegen cleanup
* fixing bug with shape of output when a reshape was present
* Needed to increase liveness threshold to get all the rtlsims to pass
* Bigger bump needed?
* [BugFix] Fixed issue with using old Brevitas API for quant_act_scale.
* Was including the file from the location
* Using the plugin's template now
* Removing test that doesn't make sense anymore
* Removing INT16 for now focusing testing on INT8 for EoY goal
* Adding the latest Brevitas bert build script and starting work on the cleanup scripts
* Datatype name fix
* cppsim integration
* Fixing issues with the decapitation step
* Added model tail removal custom step
* Cleaning up the cleanup script
* Removing redundant cleanup step
* Adding an endtoend script and updating the README
* Ensuring hashes and branches are consistent in the README
* Added a minimal initial endtoend test
* test fixed
* Added a switch to end2end test to attempt IP generation (this is currently failing)
* Extended the test to track how many ops have been successfully specialised and what percentage
* Have the end2end test export a json dashboard file instead for tracking progress.
* refactoring the endtoend test a bit to use fixtures and track progress through the build process
* Updated testing to track various bits
* RTLSim for QuantSoftMax
* Removing prepare_rtlsim stub
* QuantSoftMax RTLSim bugfixes (working now)
* fix issue of passing datatypes instead of datatype strings
* Adding template types to the treereduction operation
* cppsim compiling, for the half it required some casting that I was not quite sure about.
* ensure that the context array is np.float32
* Getting stuff working with the latest changes
* Clean up remove head and add streamlining steps
* Add streamlining steps for softmax
* add gather to crop
* Fixing linker library paths and include directories for 2024.2 compatibility
* Cleanup
* tracking individual steps now with fixtures dependencies, also added the ability to dump data to the dashboard json file
* Refactored testing so that each step in the build flow is a separate pytest fixture. If we want to add a test at any point in the build flow we can just pass the step fixture in as an argument and then the cached build at that specific point will be picked up
* Starting to bring in the default steps
* Generate a test for each step added automatically
* Trying as much of the default flow as possible
* removing tests that don't make sense right now
* fixing the custom steps
* Remove call to default convert_to_hw
* Reverting back to old specialise layers
* need dataflow partition, comment out for now
* Removing duplication of the custom steps for BERT and duplicated scripts
* updating endtoend script to include some of the default steps
* commenting out the last few steps for now
* Add a check at the end to see if hls synth went okay
* dashboard json data update
* Cleaning up the custom steps
* Docstring explanations of the custom_steps required for BERT also cleaned up the flow a bit
* bringing up validation testing of some of the steps
* Adding python execution model for the shuffle
* Added a small function for validation that when a test fails will examine the contexts and show what is the same and what differs
* Silly mistake with the shuffle execute, it was not writing the result back into the context but was returning it
* Elemwise integration
* Adding UINT8 testcase which is the same as the BERT model
* Increasing the timeout on softmax tests
* Changing paths to match new 2024.2 directory structure
* keep things float32 for now
* Fixing case issue on SIMD attribute allowed the compilation to go further
* boilerplate prepare_rtl sim is okay now, removing overridden version
* Input int8, 2024.2 update
* FuncLayerNorm bugfix and FLOAT32 testcase
* "exec_mode" fix and code cleanup
* Merge feature/plugin/layernorm_stf
* support multiple lines
* Added template parameter to enable/disable the quant stage at the end of the softmax
* Adjusting the nodeattr for shuffle so that it is compatible with the set_target_fps transformation
* QuantSoftMax nodeattr compatibility with set_fps_target transformation
* Adding nodeattr so that layernorm is compatible with set_target_fps transformations
* simd to SIMD
* Non Quant softmax passing cppsim
* Validation is having a lot more success with HWSoftMax rather than QuantSoftMax
* reintroducing some essential streamlining steps, validation looking a lot better
* Endtoend up without fps_target yet
* integer cycles to stop issue in set_fifo_depths
* Using the v80 part number for the softmax tests
* Fix for the issue causing the stitched rtl sim stall
* Setting reasonable fps target for initial pipecleaning
* Fix for inferring the datatypes in the shuffle node thanks @auphelia
* Adding some configuration files for the bert end2end flow
* Added some expected input and output npy files
* Removing start step
* Adding correct expected output
* Adding an RTLSim node-by-node test to the pytests. Adjusting the configuration for a default build flow.
* Adding more rtlsim based testing to the end2end pytests
* Saving the context of the node-by-node runs under a different dir name
* generate a reference IO each time due to randomly generated weights in brevitas script
* Adding a custom step that generates the reference IO for each run for validation
* SIMD parameter for shuffles in testing is now properly being set, some tests are now failing cppsim and need fixing
* Not every loop coeff should be divided by simd
* Fixed the shuffle SIMD issue
* Making more command line arguments available for the parameter sweeping for the bert_build demo scripts
* Whoops, left in a note
* Removing the custom debugging steps from the build flow
* Adding an example bash script to sweep over some parameters.
* Added a simple script to print the results of param sweep
* Cleaning up to remove c++17 warning
* Tidying up comments / warnings for demos
* Using board instead of fpga_part
* Making the output look a bit neater
* Removing unused validation steps
* fix param sweep
* Slight tweak to example param sweep script
* Adding a makefile and configs for some single layer and three layer configurations.
* We have some large fifos in these builds that need to be split.
* Updating the Brevitas model as per @nfraser suggestion
* Fix circular make dependency
* Works using later qonnx changes
* New FIFO depth configurations for the three layers, folding configuration might not match the main plugin version though.
* Added new preconfigured designs for latest brevitas changes.
* Adding license file headers
* updating to correct link in setup instructions
* Tidying up QuantSoftMax/SoftMax
* Cleaning up utils and testing
* Cleaning up endtoend pytesting
* Adding back in the bitwidth option for the parameter sweep with the new model generation
* Added a parameter for changing the sequence length
* Skipping LN test for now
* Changed the artifact naming convention a little
* Remove extraneous implementation of QuantizeLayerNormalization
* Added a script to generate a config (pre FIFO depth sizing) for a particular folding configuration as we explore the DSE side of the Bert build
* Added a makefile recipe for a maximum folding three layer design for passing to RW team
* Adjusting number of layers on the design
* Manually control the fifo depth stage instead of setting it if a param file is present
* Need to come up with better arg naming for parameters, maybe just enforce longargs?
* Makefile recipes use the generation script for various SIMD/PE configurations rather than prebaking them
---------
Co-authored-by: aziz bahri <azizb@amd.com>
Co-authored-by: azizb-xlnx <48930381+azizb-xlnx@users.noreply.github.com>
Co-authored-by: root <root@TAFK>
Co-authored-by: Thomas Keller <thomaskeller@microsoft.com>
Co-authored-by: auphelia <jakobapk@web.de>
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
Co-authored-by: jsmonson <jsmonson@gmail.com>
* Added extra arguments to reflect latest change in finn/custom/transformer that enables you to override the number of inferences that the fifo depth sizing stage performs.
* Fixing the recipes and simplifying
* Improvements to SoftMax hardware efficiency and also adding support for ap_float<W,I> datatypes.
* Fixes and compiler integration for new SoftMax
* fixing license header
…es on three layer designs (#9)
* Adding check to make sure that we don't accidentally set SIMD for shuffleB yet, also updated the config generation so that we do not accidentally set the wrong shuffle in later layers
* Cleaning up the build scripts a little thanks @auphelia
* Moving the constraining of shuffle parameters and pumpedCompute to temporary custom transformations so that they are more reliable
* Removing the temporary check and relying on the custom pass for now until the parallel transpose op comes online
* Fixed the return type of the custom transformations
* Added cycle testing to softmax test script
  Implemented cycle testing code, which compares the layer's rtlsim cycles with its expected cycles (found using QONNX's ModelWrapper.analysis). Copied from https://github.com/Xilinx/finn/blob/00bf8279f2ed20500f3046b395b24c08c8c82325/tests/fpgadataflow/test_fpgadataflow_fmpadding.py
* Updated cycles test op type, imported exp_cycles_per_layer
  - The rtlsim cycles test for the softmax custom op was failing due to the incorrect op type string being used ("FMPadding" instead of "HWSoftmax").
  - The FINN method, exp_cycles_per_layer, was not imported, causing the test to fail.
* Implemented cycles test for Shuffle custom op
  - Implemented test to test_fpgadataflow_shuffle.py which compares the Shuffle node's expected cycles with the rtlsim's outputted cycles.
  - Ran this test, it currently fails. The expected cycles (12288) do not fall within a tolerance of 10 of the rtlsim cycles (23475).
* Implemented alternate LayerNorm test script
  - The existing LayerNorm test is incomplete, and doesn't execute. To bridge the gap in testing, a new test was written based on other custom operations tests.
  - The new test, test_fpga_dataflow_layernorm_hw_custom_op(), is in the same file as the old test.
  - The cppsim version of the test currently passes. The rtlsim version fails due to the expected cycles (456) not matching the simulated cycles (63516). Testing was done using the [ifm_dim0-rtlsim-INT9-simd4-hls] configuration.
* Removed rtlsim_trace from LayerNorm, updated comments
  Implemented reviewer suggested changes:
  - Removed rtlsim_trace attribute from the test's LayerNorm node.
  - Updated comments:
    - In construct_onnx_model()'s header comment, changed "Finn" -> "FINN", added info about the LayerNorm's Scale and Bias tensors.
    - In test_fpga_dataflow_layernorm_hw_custom_op()'s header comment, explained that this test is missing the inferred eltwise operations.
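The cycle checks described in these commits reduce to comparing an estimated cycle count against the rtlsim count with an absolute tolerance. The following is a minimal sketch of that comparison in plain Python, not the test code itself: the helper name `cycles_within_tolerance` is hypothetical, and applying the same tolerance of 10 to the LayerNorm figures is an assumption (the commit only states it for Shuffle).

```python
def cycles_within_tolerance(expected: int, simulated: int, atol: int = 10) -> bool:
    """Return True when the simulated cycle count is within `atol` cycles of the estimate.

    Hypothetical helper; the real tests may phrase the check differently.
    """
    return abs(simulated - expected) <= atol


# Figures quoted in the commit messages above.
shuffle_ok = cycles_within_tolerance(expected=12288, simulated=23475)
# Assumption: the LayerNorm test uses the same tolerance as the Shuffle test.
layernorm_ok = cycles_within_tolerance(expected=456, simulated=63516)

print(shuffle_ok, layernorm_ok)  # False False
```

Both comparisons fail by a wide margin, which matches the reported test failures and suggests the expected-cycle estimates, rather than the tolerance, are what need revisiting.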
…flow (#15)
* Removing the accidentally included startstep in the endtoend flow
* Restoring the default to 8 for bitwidth
Co-authored-by: Thomas Keller <thomaskeller@microsoft.com>
* Include the reference IO as part of the metadata handover
* typo fix
* Created OpTest class for abstracting CustomOp tests
  - This class helps reduce shared boilerplate code between tests for custom FINN ops.
  - The OpTest class is designed to be inherited by custom test classes. These custom test classes will inherit pre-written commonly used tests, and helper functions to make writing tests easier.
  - An example of a test designed using OpTest can be found at the end of `./test/fpgadataflow/test_fpgadataflow_layernorm.py`.
  - While functional, the class is still a work in progress, and more functionality will be added in alignment with the needs of the engineers who use it.
* Applied linting
  - Applied linting using black's default settings.
* Created target_fpga fixture, removed prints, added SIMD ids
  - Target FPGA, as used by the model_specialise fixture, is now a fixture, which can be overridden by a test class.
  - Removed print statements in op_test.py that were used for debugging
  - Added IDs to TestLayerNorms SIMD parameters. Pytest now displays SIMD1, SIMD2, SIMD4, instead of 1, 2, 4. More human-readable!
* Implemented reviewer suggestions, new 'target_node' fixture, improved typing
  - Implemented @STFleming's suggestions:
    - The `exec_mode` comparisons at lines 65 and 68 now use `==` instead of `is`.
    - The reference to `LayerNorm` in the comment at line 173 has been removed.
    - `apply_transforms()` no longer uses an `assert`, instead it raises a `RuntimeError`.
  - Implemented a new fixture, `target_node()`. This fixture returns an integer, specifying the index in the model of the node we're testing. This means a model can contain nodes/layers other than the one we want to test.
  - Improved typing consistency throughout 'op_test.py': `input_tensors()` and `apply_transforms()` were missing parameter type hints.
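Two of the reviewer suggestions above are general Python points: compare strings with `==` rather than `is`, and raise an exception instead of using `assert` for runtime checks. The sketch below illustrates both under stated assumptions. The function bodies are hypothetical stand-ins (the real `apply_transforms()` in `op_test.py` operates on a model object and its actual failure condition is not given here), and a plain `dict` replaces the model.

```python
from typing import Callable, List


def is_rtlsim(exec_mode: str) -> bool:
    # `==` compares string contents. `is` compares object identity and only
    # appears to work when the interpreter happens to reuse the same string object.
    return exec_mode == "rtlsim"


def apply_transforms(model: dict, transforms: List[Callable[[dict], dict]]) -> dict:
    """Apply each transform in order (hypothetical stand-in for the OpTest helper)."""
    for transform in transforms:
        result = transform(model)
        if result is None:
            # An `assert` here would be stripped when Python runs with -O,
            # so an explicit exception keeps the check active in all modes.
            name = getattr(transform, "__name__", repr(transform))
            raise RuntimeError(f"transform {name} did not return a model")
        model = result
    return model


# A string built at runtime is equal to, but not necessarily the same object as, the literal.
mode = "".join(["rtl", "sim"])
print(is_rtlsim(mode))  # True
```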
* Formatting bert_build as a job
* Further iteration/brainstorming
* Initial FINN docker transplant
* Adding deps to git ignore
* [Deps] Restructure python github repo installs (#8)
  Co-authored-by: auphelia <jakobapk@web.de>
* Initial docker structuring for BrainSmith
* entrypoint path bugfix
* [Docker] Enable interactive mode for docker container (#10)
* Added model profiling scripts
* Hotpatch to remove pyverilator
* Normalize line endings in SUPPORT.md
* finnbrainsmith --> brainsmith/finnlib paths
* Tools folder restructure
* Fix gen_bert paths & name in expand_norms
* Custom QONNX branch to fix is_finn
* Removed old QuantLayerNorm func
* Initial job runner structuring
* Job structure v0, structure for profiling improvements
* Updated readme
* Template path fix
* Unused import and formatting cleanup
* FP IP import fix
* Docker updates for pyxsi
* Pyxsi path fix
* Onnx path + linting fixes
* Removed finnlib, moving up sub folders
* Moved run_job to core for consistency
* Linting cleanup
* Updated README
* Added RTL placeholder
* Typo & gitignore fixes
* Updated finnlib to brainsmith in tests
* bert_steps path fix in tests
* Fix punctuation in README instructions.
* Update LICENSE: Brainsmith name fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Update LICENSE: Brainsmith name fix 2
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Update README.md - typo fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Brainsmith name fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Update brainsmith/tools/README.md: Brainsmith name fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Update docker/entrypoint.sh: Brainsmith name fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Update docker/entrypoint.sh: Brainsmith name fix
  Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* Removed exec from fetch_repos
* Copyright typo fix
---------
Co-authored-by: Thomas Keller <thomaskeller@microsoft.com>
Co-authored-by: auphelia <jakobapk@web.de>
Co-authored-by: auphelia <56755897+auphelia@users.noreply.github.com>
* add custom onnxscript branch
* Add TODO for reconciling onnxscript dependencies
---------
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
Co-authored-by: Thomas Keller <tkeller787@gmail.com>
* Initial attempt at docker build action
* Added branch name to action
* PR & weekly tests for dev/ci-actions
* Added self-hosted runner
* Adjusted runs-on label
* path fix
* Added debug to orient pwd
* Added pytest keyword through run-docker.sh
* Fixed license path
* Updated upload-artifacts to v4
* Reorganize bert demo for github action
* Updated run-docker CLI args
* Added e2e test to actions
* Removed build artifacts
* Fix ci.yml run-docker statement
* Removed "push" trigger
* Merge with develop changes and add num workers env variable
* Re-added push trigger for testing
* Fix merge
* Temporarily disabled docker and pytest for e2e validation
* Fix BSMITH_BUILD_DIR env variable
* Remove push trigger, since PR trigger is sufficient
* Remove testing branches and triggers for PR
* Remove auto-gen docs
* Delete demos/bert/configs/l1_simd12_pe8.json
  Removed extraneous config from test
---------
Co-authored-by: Ubuntu <azureuser@brainsmith-dev2.woh15gx5mv0exiu0m5xe0hjytg.dx.internal.cloudapp.net>
* add custom onnxscript branch
* fix torch error
* readd todo
---------
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
* fix formatting with copilot
* fix dynamic matmul config when sizing is not divisible by 3
---------
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
…me (#31)
* fix argparse arg that could never be false
* update fifosizing arg in hw compiler to match new argument name
---------
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
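The commit does not say why the argparse argument could never be false. One common cause of that symptom is `type=bool`, which converts any non-empty string (including "False") to `True`. The sketch below reproduces that pitfall and shows one possible fix, assuming Python 3.9+ for `argparse.BooleanOptionalAction`; the flag names are hypothetical and not taken from the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Pitfall: bool("False") is True, so this option can never be parsed as False.
    parser.add_argument("--broken-flag", type=bool, default=True)
    # One possible fix: paired --fixed-flag / --no-fixed-flag switches (Python 3.9+).
    parser.add_argument(
        "--fixed-flag", action=argparse.BooleanOptionalAction, default=True
    )
    return parser


args = build_parser().parse_args(["--broken-flag", "False", "--no-fixed-flag"])
print(args.broken_flag, args.fixed_flag)  # True False
```

On older Python versions, `action="store_true"` or `action="store_false"` with a suitable default gives the same effect for a single switch.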
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
* Added cleanup steps and job
* Made num_default_worker env variable
Co-authored-by: Joshua Monson <joshmonson@microsoft.com>
…r guides
Add complete documentation suite including:
- API reference for all core modules (dataflow, DSE, registry, kernel_op, settings)
- Developer guides covering foundations (dataflow accelerators, kernel concepts), core systems (component registry, DSE, kernel modeling), and reference materials (blueprints, CLI, kernels)
- Getting started guides for installation, configuration, and quickstart
- Technical diagrams and images for visual explanations
- Custom CSS styling for documentation site
- Remove commented-out PR preview deployment code from docs workflow
- Remove VectorVectorActivation (VVAU) kernel implementation and tests
- Delete obsolete ptranspose.sv SystemVerilog file
- Enhance ElementwiseBinary kernel with float type support
- Fix integer division to use truncating semantics matching hardware
- Remove overly restrictive integer-only datatype constraint
- Add comprehensive kernel documentation structure with 11 markdown files
- Enhance test framework with ONNX utilities and improved base classes
- Update MkDocs configuration for new documentation sections
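The "truncating semantics" item matters because Python's `//` rounds toward negative infinity, whereas C/C++ integer division (and therefore HLS C++ code) rounds toward zero. A minimal sketch of the difference follows, assuming that "hardware" here means C-style division; `trunc_div` is a hypothetical helper, not the function used in the kernel.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero, as C and C++ do."""
    q = abs(a) // abs(b)  # magnitude of the quotient; raises ZeroDivisionError if b == 0
    return q if (a < 0) == (b < 0) else -q


# Floor division and truncating division differ only when the signs differ
# and the division is inexact.
print(-7 // 2, trunc_div(-7, 2))  # -4 -3
print(7 // 2, trunc_div(7, 2))    # 3 3
```

Working on magnitudes with integers avoids the precision loss that `int(a / b)` would introduce for large operands.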
Core dataflow improvements:
- Add lift_scalar_to_rank1() for ONNX scalar normalization
- Add constant_datatype() for fixed output types
- Optimize design_point caching (regenerate from nodeattrs)
- Add execution initialization guard for QONNX compatibility
- Handle rank-0 tensors in template resolution
- Auto-configure FIFO depths from schema interface counts
Kernel enhancements:
- Add ElementwiseBinary kernel with scalar broadcast support
- Refactor Crop to use model shapes directly
- Update Softmax and LayerNorm for new schema patterns
- Improve InferCropFromGather documentation
Test framework v2:
- Add composition-based architecture (v2 base classes)
- Add support modules: data_generation, golden_reference, quant_insertion
- Add datatype annotation system for parameterized tests
- Refactor dual/single kernel tests for cleaner inheritance
- Add ElementwiseBinary v2 test suite with comprehensive coverage
Documentation:
- Add AddStreams HLS backend documentation
- Update test framework README with v2 patterns
Optimize LayerNorm HLS implementation:
- Remove redundant var static variable from var_stage
- Move variance computation outside loop (division by constant N)
- Replace division by sqrt with multiplication by reciprocal in inv_sqrt_stage
- Fix datatype specification to FLOAT32 constant
Add test framework improvements:
- Add dual_kernel_test_v2.py for FINN vs Brainsmith parity testing
- Enhance kernel_test_base_v2.py and single_kernel_test_v2.py
- Add test_addstreams_parity_poc.py for AddStreams validation
- Remove deprecated golden_reference.py and tensor_mapping.py
Add build_hw_graph step:
- Combine partitioning and specialization phases
- Unify dataflow partition creation and backend specialization
- Update blueprint to use unified build_hw_graph step
Update .gitignore to exclude .ignore files
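Two of the LayerNorm optimizations above are arithmetic restructurings that can be illustrated outside HLS: dividing by the constant N once after the accumulation loop, and computing a single reciprocal square root that is then multiplied per element. The following Python sketch shows the idea only. It is not the HLS implementation; the `eps` value is an assumption, and the scale and bias terms are omitted.

```python
import math
from typing import List


def layernorm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize x to zero mean and (approximately) unit variance."""
    n = len(x)
    mean = sum(x) / n

    # Accumulate squared deviations inside the loop, then divide by the
    # constant N once afterwards instead of on every iteration.
    sq_sum = 0.0
    for v in x:
        sq_sum += (v - mean) ** 2
    var = sq_sum / n

    # Compute the reciprocal once; each element then needs a multiply
    # rather than a division by sqrt(var + eps).
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]


out = layernorm([1.0, 2.0, 3.0, 4.0])
print(out)
```

In hardware a divider is typically far more expensive than a multiplier, which is the usual motivation for hoisting the reciprocal out of the per-element path.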
Reorganize test support utilities into tests/fixtures/ with improved structure:
- Move kernel_test_helpers.py → model_builders.py
- Create model_annotation.py (consolidates datatype_annotation.py and quant_insertion.py)
- Create test_data.py (consolidates data_generation.py)
- Update imports in conftest.py and test files
- Clean up tests/support/ (remove consolidated files)
- Update test framework base classes for new import paths
Update bert_quicktest.yaml target_fps from 1 to 30000.
Update tests/frameworks and tests/kernels for new fixture organization.
- Move blueprints.py and design_spaces.py to tests/fixtures/dse/
- Create tests/fixtures/dse/__init__.py with comprehensive exports
- Update imports in tests/conftest.py and integration tests
- Remove outdated fixture unit tests (test_kernel_test_helpers.py, test_model_annotation_quant.py, test_test_data.py)
- Re-enable apply_parallelization_config step in BERT blueprint
- Add KernelTestConfig unified configuration system with stage-aware parameter handling
- Update all test methods to accept kernel_test_config parameter for tolerances and model creation
- Restructure installation documentation with improved prerequisite formatting
- Add elementwise binary test infrastructure and parity tests
- Deprecate legacy configure_kernel_node/configure_backend_node in favor of stage-aware configure_parameters
- Remove stdout redirect override in FINN adapter to align with upstream logging
- Update FINN dependency to feature/logging-integration-transformer branch
- Migrate test framework files: remove v1 variants, keep v2 as canonical
- Add pytest-cases dependency for improved test parameterization
- Expand pytest markers for better test categorization (single_kernel, dual_kernel, basic_validation)
- Improve installation documentation formatting
- Add git fetch to resolve_ref_to_commit for up-to-date remote refs
…twise_binary tests
Logging system:
- Replace simple logging with orchestrated system supporting quiet/normal/verbose/debug levels
- Add file logging with rotation to output directories
- Integrate FINN logging without handler conflicts (verbose=False means "don't add handlers")
- Support per-tool FINN log levels via configuration
CLI changes:
- Rename --logs to --log-level with new verbosity levels
- Rename 'project show' to 'project info'
Test framework:
- Move elementwise_binary tests from brainsmith/kernels/ to tests/kernels/
- Add shared test base class in kernel package for reuse
- Introduce certification/validation test structure
- Add model caching and certification sweep fixtures
Cleanup:
- Remove 70+ legacy test artifacts and planning docs
- Remove unused prerelease-docs images
- Update documentation and examples to reflect CLI changes
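The logging changes above combine named verbosity levels with rotating log files in the output directory. A rough sketch of that combination using only the Python standard library follows. Everything project-specific is an assumption made for illustration: the mapping from quiet/normal/verbose/debug to `logging` levels, the `setup_logging` name, the `build.log` filename, and the rotation limits are not taken from the Brainsmith code.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path

# Assumed mapping from the CLI verbosity names to standard logging levels.
LEVELS = {
    "quiet": logging.ERROR,
    "normal": logging.WARNING,
    "verbose": logging.INFO,
    "debug": logging.DEBUG,
}


def setup_logging(level_name: str, output_dir: Path, name: str = "demo") -> logging.Logger:
    """Configure a logger that writes to a rotating file inside output_dir."""
    logger = logging.getLogger(name)
    logger.setLevel(LEVELS[level_name])
    logger.handlers.clear()  # avoid stacking duplicate handlers on repeated setup

    output_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        output_dir / "build.log", maxBytes=1_000_000, backupCount=3
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


log_dir = Path(tempfile.mkdtemp())
log = setup_logging("verbose", log_dir)
log.info("build started")
```

Clearing existing handlers before adding a new one is one simple way to avoid the handler conflicts mentioned in the commit, though the actual integration with FINN's loggers may work differently.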
…ernel ops documentation
Test Infrastructure Changes:
- Split test base class into separate file (test_elementwise_binary.py)
- Move shared test cases from tests/ to kernel package (test_cases.py)
- Update __init__.py with comprehensive exports for test utilities
- Remove redundant test case variants (mixed dtype/broadcasting duplicates)
- Expand validation cases with DSE dimension coverage (RAM styles, memory modes, PE+RAM combinations)
- Add narrow/binary datatype tests and 3D shape coverage
Documentation:
- Add comprehensive kernel ops guide (7 chapters covering introduction through best practices)
- Simplify settings.md by removing redundant content sections
- Update installation and configuration guides
- Add tutorials index
Build Configuration:
- Update pyproject.toml with mkdocs plugins and navigation structure
- Refine bert quicktest configuration
Test Organization:
- Consolidate Add tests (remove certification module, keep validation only)
- Update validation test to use new test case imports
Phase 1 of KernelParityTest v6.0 migration
- Move _prepare_model_with_annotations to base class
- Move _generate_test_inputs to base class
- Eliminates ~80 lines of duplication
- No functional changes (all tests pass with identical results)
Impact:
- kernel_test_base_v2.py: +85 lines (new helper methods)
- kernel_parity_test.py: -79 lines (removed duplicates)
- single_kernel_test_v2.py: -85 lines (removed duplicates)
- Net: -79 lines removed
Test results: 10 passed, 4 failed (expected), 4 errors (expected)
Identical to baseline - no regressions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 2 of KernelParityTest v6.0 migration - Non-breaking additive changes
New asymmetric design:
- Primary (unqualified): Uses inherited base class methods
- Reference (qualified): Explicit comparison target with "_reference" suffix
Added methods:
- infer_kernel_reference() - Delegates to infer_kernel_a() for compatibility
- get_backend_variants_reference() - Delegates to get_backend_variants_a()
- configure_kernel_reference() - Delegates to configure_kernel_a()
- specialize_to_backend_reference() - NO METHOD SWAPPING! Uses explicit backends
Added fixtures:
- stage2_model (primary, unqualified) - Delegates to stage2_model_b
- stage2_model_reference (reference, qualified) - Uses infer_kernel_reference()
- stage3_model (primary backend) - Uses inherited specialize_to_backend()
- stage3_model_reference (reference backend) - Uses specialize_to_backend_reference()
Benefits:
- Eliminates method swapping anti-pattern (explicit backends)
- Clearer semantics (primary vs reference, not a vs b)
- Non-breaking (delegates to old API for compatibility)
- Prepares for eventual deprecation of kernel_a/kernel_b methods
Impact:
- kernel_parity_test.py: +261 lines (4 methods + 4 fixtures)
- No functional changes (all tests pass identically)
- Old kernel_a/b API still works
Test results: 10 passed, 4 failed (expected), 4 errors (expected)
Identical to baseline - no regressions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove brainsmith/kernels/shuffle/ module (legacy and modern implementations)
- Move project config from .brainsmith/config.yaml to brainsmith.yaml at project root
- Consolidate getting-started docs into single docs/getting-started.md
- Relocate elementwise_binary tests from tests/kernel-migration/ to tests/kernels/
- Remove dual_kernel_test_v2.py (superseded by kernel_parity_test.py)
- Update CLI commands and documentation to reflect new config path
- Allow kernels without backends in DSE parser (log warning instead of error)
- Add "regsitry" typo alias in CLI for convenience
Rename OrderedDimension → OrderedParameter and DSEDimension → ParameterSpec throughout dataflow system for clarity. Maintain backward compatibility aliases for external code.
Changes:
- Rename ordered_dimension.py → ordered_parameter.py
- Update KernelDesignSpace.dimensions → parameters
- Add deprecation aliases (DimensionSpec, OrderedDimension)
- Update all method names (dim_min → param_min, etc.)
- Consolidate test framework (v2 → base, remove deprecated files)
- Add documentation images (BERT DFC, simple MHA)
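A deprecation alias of the kind mentioned above can be kept as a thin subclass that warns on construction, so external code using the old name keeps working while being nudged toward the new one. The sketch below shows one way to do this. It is an illustration only: the constructor fields (`name`, `values`) are hypothetical, and the actual aliases in the codebase may be plain name bindings rather than subclasses.

```python
import warnings


class ParameterSpec:
    """New name. The fields here are hypothetical placeholders."""

    def __init__(self, name: str, values: tuple):
        self.name = name
        self.values = values


class DimensionSpec(ParameterSpec):
    """Deprecated alias kept so that external code using the old name still works."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "DimensionSpec is deprecated; use ParameterSpec instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not this class
        )
        super().__init__(*args, **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    spec = DimensionSpec("SIMD", (1, 2, 4))

print(isinstance(spec, ParameterSpec), len(caught))  # True 1
```

Because the alias is a subclass, `isinstance` checks against the new name continue to pass for objects created through the old one.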
- Rename DimensionSpec to ParameterSpec in schema definitions
- Rename OrderedDimension to OrderedParameter for DSE navigation
- Rename dse_dimensions to dse_parameters in schema fields
- Update all dimension-related methods to parameter-based naming
- Consolidate _resolve_input_datatype and _resolve_output_datatype into unified _resolve_datatype method
- Remove backward compatibility aliases (Custom, DimensionSpec, OrderedDimension)
- Update test framework to use parameter-based terminology
- Streamline test inheritance hierarchy and remove deprecated test utilities
- Update finn-hlslib dependency to latest commit
- Remove obsolete documentation files from docs/developer-guide and docs/kernels
- Simplify getting-started and index documentation
- Update API reference documentation for dataflow, dse, registry, and settings
- Rename test_ordered_dimension.py to test_ordered_parameter.py
- Rename single_kernel_test.py to kernel_test.py
- Add comprehensive parity test between LegacyCrop and modern Crop
- Test both height and width axis cropping with 8 configurations
- Fix design space initialization in Crop_hls.execute_node()
- Ensure QONNX-created instances properly initialize design space

Test coverage:
- 4 height axis configs (int8/int16/int4, symmetric/asymmetric/large)
- 4 width axis configs (int8/int16/int4, symmetric/asymmetric/large)
- Golden validation, parity checks, HW estimation tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Core changes:
- Add stream parameter helpers to KernelDesignPoint (reduce duplication)
- Add type inference for ParameterSpec with validation
- Optimize duplicate detection and validation in schemas
- Extract binary operation datatype helper
- Improve KernelOp execution compatibility docs

Crop kernel:
- Remove legacy Crop implementations (LegacyCrop, LegacyCrop_hls, InferCropFromGather)
- Fix Crop_hls width calculation and execution
- Simplify crop kernel exports

Documentation:
- Update README with new project description
- Reorganize docs from prerelease-docs to docs/
- Add experimental kernel ops tutorial series
- Add API docs for CLI, dataflow, DSE, registry
- Add test framework guides (QUICKSTART, KERNEL_PARITY_TEST_GUIDE)

Tests:
- Reorganize test structure (migration folder for legacy tests)
- Move elementwise tests to top level
- Update crop tests to use new implementation
…king

Replace configuration-based source detection with runtime discovery tracking. Introduces _discovered_sources set populated during component discovery and new domain utilities for bidirectional ONNX domain resolution.

Key changes:
- Add _discovered_sources set to track sources from entrypoints and component_sources
- Replace module prefix matching with discovered sources matching in _detect_source()
- Add _domain_utils.py module with derive_domain_from_module() and match_domain_to_source()
- Update get_domain_for_backend() to derive domain from __module__ attribute
- Deprecate source_module_prefixes configuration field with warning
- Add tests for domain derivation and matching utilities
- Remove cli-architecture.md documentation (outdated)
- Update component registry and CLI documentation references
- Delete brainsmith/codegen module (HLSCodeBuilder and tests)
- Replace HLSCodeBuilder with direct string construction in elementwise_binary_hls.py
- Change logging from info to debug level in FINN adapter, DSE, and dataflow builder
- Disable FINN progress display (show_progress: False)
Summary
This PR restructures four core systems: component registry, dataflow modeling, design space exploration, and composition-based testing. The work replaces hardcoded BERT-specific assumptions with extensible abstractions while maintaining backward compatibility with existing examples.
Major Components
1. Component Registry (brainsmith/registry/)
Decorator-based plugin architecture with automatic discovery:
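As a rough illustration of what a decorator-based registry can look like, here is a minimal sketch. It is not the code in brainsmith/registry/: the names `kernel`, `get_kernel`, and `_REGISTRY` are hypothetical, and entry-point discovery is left out. Only the source names (brainsmith, custom) come from this PR.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: maps (source, name) -> registered class.
_REGISTRY: Dict[Tuple[str, str], type] = {}


def kernel(name: str, source: str = "custom") -> Callable[[type], type]:
    """Class decorator that records a component under (source, name)."""
    def decorator(cls: type) -> type:
        key = (source, name)
        if key in _REGISTRY:
            raise ValueError(f"duplicate component: {source}:{name}")
        _REGISTRY[key] = cls
        return cls  # the decorated class is returned unchanged
    return decorator


def get_kernel(qualified_name: str) -> type:
    """Look up 'source:name'; a bare name is searched under 'custom'."""
    source, _, name = qualified_name.rpartition(":")
    return _REGISTRY[(source or "custom", name)]


@kernel("LayerNorm", source="brainsmith")
class LayerNorm:
    pass


@kernel("MyOp")  # defaults to the assumed 'custom' source
class MyOp:
    pass
```

With this sketch, `get_kernel("brainsmith:LayerNorm")` returns the `LayerNorm` class and `get_kernel("MyOp")` resolves against the `custom` source. Registration happens at import time, so a component only needs to be imported to become discoverable.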
Features:
brainsmith, finn, project, custom)
2. Dataflow Modeling (brainsmith/dataflow/)
Schema-driven kernel design with two-phase construction:
Design Principles:
Two-Phase Construction (enables 2-50x DSE speedup):
Navigation API:
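The two-phase construction and navigation ideas can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the KernelDesignSpace or KernelDesignPoint implementation. The class names `DesignSpace` and `DesignPoint`, the method `with_step_up`, and the rule that PE must divide the channel count are all hypothetical. The valid value sets are computed once, so each design point is a cheap immutable lookup against them.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def divisors(n: int) -> Tuple[int, ...]:
    """All positive divisors of n, ascending."""
    return tuple(d for d in range(1, n + 1) if n % d == 0)


@dataclass(frozen=True)
class DesignSpace:
    """Phase 1: valid values per parameter, computed once per kernel."""
    valid: Dict[str, Tuple[int, ...]]

    @classmethod
    def build(cls, channels: int) -> "DesignSpace":
        # Assumed constraint: the parallelism factor PE must divide channels.
        return cls(valid={"PE": divisors(channels)})

    def point(self, **params: int) -> "DesignPoint":
        """Phase 2: cheap configuration, validated against the cached sets."""
        for name, value in params.items():
            if value not in self.valid[name]:
                raise ValueError(f"{name}={value} not in {self.valid[name]}")
        return DesignPoint(space=self, params=tuple(sorted(params.items())))


@dataclass(frozen=True)
class DesignPoint:
    """Immutable snapshot; navigation returns a new point."""
    space: DesignSpace
    params: Tuple[Tuple[str, int], ...]

    def get(self, name: str) -> int:
        return dict(self.params)[name]

    def with_step_up(self, name: str) -> "DesignPoint":
        """Move one step up the ordered valid values, clamping at the top."""
        values = self.space.valid[name]
        idx = min(values.index(self.get(name)) + 1, len(values) - 1)
        return self.space.point(**{**dict(self.params), name: values[idx]})


space = DesignSpace.build(channels=12)  # valid PE values: 1, 2, 3, 4, 6, 12
p = space.point(PE=4)
q = p.with_step_up("PE")                # new point; p is left untouched
```

Here `q.get("PE")` is 6 while `p.get("PE")` stays 4, and `space.point(PE=5)` raises `ValueError` because 5 does not divide 12.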
Components:
- KernelSchema: Declarative inputs, outputs, parameters, constraints
- KernelDesignSpace: Valid parameter ranges from schema + ONNX context
- KernelDesignPoint: Immutable configuration snapshot with navigation API
- ConstraintSystem: Unified validation (divisibility, bounds, relationships)
- BroadcastSemantics: NumPy-style broadcasting for elementwise operations
3. Design Space Exploration (brainsmith/dse/)
Extracted from monolithic core/ into modular architecture:
Segment-Based Execution (reuses shared computation):
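One way to picture the reuse of shared computation is a prefix cache: when several build pipelines start with the same steps, those steps run once and their result is shared. The sketch below is a generic illustration, not the brainsmith/dse/ code. The function `run_with_shared_prefixes` and the step names are hypothetical, and it assumes a step's name uniquely identifies what it does.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Step = Tuple[str, Callable[[int], int]]  # (step name, transformation)


def run_with_shared_prefixes(
    start: int, pipelines: Sequence[Sequence[Step]]
) -> Tuple[List[int], int]:
    """Run every pipeline, reusing results of any shared leading steps.

    Returns the final value per pipeline and the number of steps executed.
    """
    # Cache keyed by the tuple of step names applied so far.
    cache: Dict[Tuple[str, ...], int] = {(): start}
    executed = 0
    results: List[int] = []
    for pipeline in pipelines:
        prefix: Tuple[str, ...] = ()
        value = start
        for name, fn in pipeline:
            prefix += (name,)
            if prefix not in cache:  # only run steps not seen on this path
                cache[prefix] = fn(value)
                executed += 1
            value = cache[prefix]
        results.append(value)
    return results, executed


cleanup = ("cleanup", lambda x: x + 1)
streamline = ("streamline", lambda x: x * 2)
fold_a = ("fold_a", lambda x: x + 4)
fold_b = ("fold_b", lambda x: x + 8)

results, executed = run_with_shared_prefixes(
    1, [[cleanup, streamline, fold_a], [cleanup, streamline, fold_b]]
)
```

The two pipelines share `cleanup` and `streamline`, so four steps execute instead of six. In a real flow the integer stand-in would be a model artifact and the cache would likely live on disk.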
4. Test Framework (tests/frameworks/)
Composition-based architecture: implement 3-5 methods, inherit 18+ test cases.
Example (18 tests from ~20 lines):
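The general "implement a few hooks, inherit the test cases" pattern can be illustrated with the standard library alone. This is a hypothetical sketch, not the classes in tests/frameworks/: `KernelTestBase` and its hook names are invented for illustration, it uses a plain `unittest` mixin, and it defines 3 inherited tests rather than 18.

```python
import unittest


class KernelTestBase:
    """Hypothetical base: subclasses supply hooks, tests are inherited."""

    # --- hooks a subclass must implement ---
    def make_inputs(self):
        raise NotImplementedError

    def run_kernel(self, inputs):
        raise NotImplementedError

    def run_reference(self, inputs):
        raise NotImplementedError

    # --- inherited test cases ---
    def test_matches_reference(self):
        inputs = self.make_inputs()
        self.assertEqual(self.run_kernel(inputs), self.run_reference(inputs))

    def test_output_length(self):
        inputs = self.make_inputs()
        self.assertEqual(len(self.run_kernel(inputs)), len(inputs[0]))

    def test_deterministic(self):
        inputs = self.make_inputs()
        self.assertEqual(self.run_kernel(inputs), self.run_kernel(inputs))


class TestAddStreams(KernelTestBase, unittest.TestCase):
    """Only the three hooks are written; the tests come from the base."""

    def make_inputs(self):
        return ([1, 2, 3], [10, 20, 30])

    def run_kernel(self, inputs):
        a, b = inputs
        return [x + y for x, y in zip(a, b)]

    def run_reference(self, inputs):
        return [sum(pair) for pair in zip(*inputs)]
```

Because `KernelTestBase` does not derive from `unittest.TestCase`, the test runner does not collect it on its own. Only concrete subclasses such as `TestAddStreams` run the inherited tests.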
Progressive Disclosure Logging:
"Executing cppsim for AddStreams...")
-v): Detailed execution logs, shape info, parameter values
5. CLI (brainsmith/cli/)
Dual-command architecture with project management:
Features:
6. Settings System (brainsmith/settings/)
Pydantic-based hierarchical configuration:
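To give a feel for layered settings with `${VAR}` expansion, here is a standard-library sketch. It deliberately swaps Pydantic for plain dictionaries, so it is not the brainsmith/settings/ implementation. The precedence order (overrides, then project file, then defaults), the function `load_settings`, and the keys `build_dir`, `jobs`, and `deps_dir` are all assumptions.

```python
import string
from collections import ChainMap
from typing import Any, Dict, Mapping


def expand(value: Any, env: Mapping[str, str]) -> Any:
    """Expand ${VAR} references in string values; leave other types alone."""
    if isinstance(value, str):
        # safe_substitute leaves unknown variables in place instead of raising.
        return string.Template(value).safe_substitute(env)
    return value


def load_settings(
    defaults: Mapping[str, Any],
    project: Mapping[str, Any],
    overrides: Mapping[str, Any],
    env: Mapping[str, str],
) -> Dict[str, Any]:
    """Merge layers (earlier ChainMap layers win), then expand variables."""
    merged = ChainMap(dict(overrides), dict(project), dict(defaults))
    return {key: expand(merged[key], env) for key in merged}


settings = load_settings(
    defaults={"build_dir": "${HOME}/build", "jobs": 4},
    project={"deps_dir": "${FINN_ROOT}/deps"},
    overrides={"jobs": 8},
    env={"HOME": "/home/user", "FINN_ROOT": "/opt/finn"},
)
```

The result has `jobs` set to 8 from the override layer, with `build_dir` and `deps_dir` expanded to `/home/user/build` and `/opt/finn/deps`. Passing `os.environ` as `env` would resolve against the real environment.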
Features:
${HOME}, ${FINN_ROOT})
Kernel Implementations
Added
ElementwiseBinary (brainsmith/kernels/elementwise_binary/)
Thresholding (brainsmith/kernels/thresholding/)
Channelwise (brainsmith/kernels/channelwise/)
AddStreams (brainsmith/kernels/addstreams/)
DuplicateStreams (brainsmith/kernels/duplicate_streams/)
RotaryEmbedding (brainsmith/kernels/rotaryembedding/)
Migrated
LayerNorm, Softmax, Crop
Infrastructure
Internal Utilities (brainsmith/_internal/)
- lazy_imports.py: PEP 562 lazy module loading
- logging.py: Progressive disclosure logging
- io/yaml.py: YAML parsing with !include directives
- io/dependencies.py: Package dependency resolution
- finn/adapter.py: FINN integration layer
Documentation (docs/, 60+ files)
MkDocs site structure:
Breaking Changes
Module Reorganization
Configuration Format
CLI Workflow
Kernel Interface
Performance Impact
CLI Startup
DSE Execution
Test Suite
Testing
Coverage
Test Categories
Validation
BERT End-to-End:
Status: All stages passing (verified in CI)
Parity Tests: All migrated kernels (LayerNorm, Softmax, Crop, Thresholding) passed 18 test cases each
Migration Guide
Kernel Developers
Pipeline Developers
brainsmith.core.dse → brainsmith.dse
explore_design_space(model, blueprint)
End Users
brainsmith project init ~/my-fpga-project
brainsmith.yaml)
examples/bert/blueprint.yaml)
smith model.onnx blueprint.yaml --output-dir ./results
Compatibility
Backward Compatibility
Maintained:
Broken (intentional):