BitSys v1.0.0: Multi-Precision Multiplier and MAC RTL/IP Release

liuyh-Horizon released this on 22 Jun at 17:20

This is the first stable open-source release of BitSys.

This release provides the core RTL artifacts of the BitSys multi-precision arithmetic architecture: the multi-precision multiplier and the multi-precision multiply-accumulator (MAC).

Included in this release

  • BitSys multi-precision multiplier

    • LUT6_2-optimized implementation
    • Pure RTL reference implementation
  • BitSys multi-precision multiply-accumulator

    • LUT6_2-optimized implementation
    • Pure RTL reference implementation
  • Vivado packaged IPs for LUT6_2-optimized implementations

    • BitSys_MPMUL
    • BitSys_MPMAC
  • Verilog source code

  • Testbenches

  • Complete Vivado projects

Supported modes

The released designs use a fixed 8-bit input container and support runtime-selectable precision modes:

  • 1-bit, 8 channels
  • 2-bit, 4 channels
  • 4-bit, 2 channels
  • 8-bit, 1 channel

The 2/4/8-bit modes support signed and unsigned operation through the is_signed input. The 1-bit mode follows bipolar BNN-style encoding.
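To make the container layout concrete, here is a minimal Python sketch of how one 8-bit container could be decoded into lanes for each mode. It is a software illustration, not the released RTL. Several details are assumptions that the release notes do not specify: lane 0 is taken from the least-significant bits, signed lanes use two's complement, and the 1-bit bipolar encoding maps 0 to -1 and 1 to +1. The function names `unpack_lanes` and `lane_products` are hypothetical.

```python
def unpack_lanes(container, bits, is_signed=False):
    """Split an 8-bit container into 8 // bits lane values.

    Assumptions (not confirmed by the release notes):
    - lane 0 occupies the least-significant bits,
    - signed lanes are two's complement,
    - 1-bit lanes are bipolar: 0 -> -1, 1 -> +1 (is_signed is ignored).
    """
    if bits not in (1, 2, 4, 8):
        raise ValueError("bits must be 1, 2, 4, or 8")
    if not 0 <= container <= 0xFF:
        raise ValueError("container must fit in 8 bits")

    mask = (1 << bits) - 1
    lanes = []
    for i in range(8 // bits):
        raw = (container >> (i * bits)) & mask
        if bits == 1:
            value = 1 if raw else -1          # assumed bipolar mapping
        elif is_signed and raw >= 1 << (bits - 1):
            value = raw - (1 << bits)         # two's-complement sign extension
        else:
            value = raw
        lanes.append(value)
    return lanes


def lane_products(a, b, bits, is_signed=False):
    """Multiply two containers lane by lane and return one product per lane."""
    lanes_a = unpack_lanes(a, bits, is_signed)
    lanes_b = unpack_lanes(b, bits, is_signed)
    return [x * y for x, y in zip(lanes_a, lanes_b)]
```

For example, under these assumptions `unpack_lanes(0xE4, 2)` yields the four unsigned lanes 0, 1, 2, 3, and the same container read with `is_signed=True` yields 0, 1, -2, -1.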

Verification

The BitSys MAC was validated by randomized simulation with 100,000 tests per mode. Each test accumulates 1,024 input pairs.

The tested configurations include:

  • 1-bit mode
  • 2-bit unsigned mode
  • 2-bit signed mode
  • 4-bit unsigned mode
  • 4-bit signed mode
  • 8-bit unsigned mode
  • 8-bit signed mode

No mismatches were observed in the tested configurations.
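The procedure described above can be sketched in Python as a golden model plus a randomized comparison loop. This is an illustration of the method, not the released testbench. It assumes that the MAC adds all lane products of each input pair into a single accumulator, that lane 0 sits in the least-significant bits, and that 1-bit lanes map 0 to -1 and 1 to +1; none of these details are stated in the release notes. The names `decode_lanes`, `mac_reference`, and `check_mode` are hypothetical, and `dut` stands for any callable that returns the accumulated result of a simulated design.

```python
import random


def decode_lanes(container, bits, is_signed):
    """Decode an 8-bit container into lane values (assumed LSB-first packing)."""
    mask = (1 << bits) - 1
    lanes = []
    for i in range(8 // bits):
        raw = (container >> (i * bits)) & mask
        if bits == 1:
            lanes.append(1 if raw else -1)    # assumed bipolar mapping
        elif is_signed and raw >= 1 << (bits - 1):
            lanes.append(raw - (1 << bits))   # two's complement
        else:
            lanes.append(raw)
    return lanes


def mac_reference(pairs, bits, is_signed=False):
    """Golden model: sum of all lane products over all input pairs (assumed)."""
    acc = 0
    for a, b in pairs:
        lanes_a = decode_lanes(a, bits, is_signed)
        lanes_b = decode_lanes(b, bits, is_signed)
        acc += sum(x * y for x, y in zip(lanes_a, lanes_b))
    return acc


def check_mode(dut, bits, is_signed, tests=100, pairs_per_test=1024, seed=0):
    """Run randomized tests against the golden model and count mismatches."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(tests):
        pairs = [(rng.randrange(256), rng.randrange(256))
                 for _ in range(pairs_per_test)]
        if dut(pairs, bits, is_signed) != mac_reference(pairs, bits, is_signed):
            mismatches += 1
    return mismatches
```

Calling `check_mode(dut, 8, True, tests=100_000)` would mirror the per-mode test count reported above. Under the single-accumulator assumption, the largest 8-bit unsigned result over 1,024 pairs is 255 × 255 × 1,024 = 66,585,600, which fits in 26 bits.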

Toolchain

The designs were developed and tested with Vivado 2024.1.

The LUT6_2-optimized versions are specific to Xilinx FPGAs because they rely on the LUT6_2 primitive. The pure RTL versions are provided as reference implementations.

Notes

The Vivado packaged IPs are provided for the LUT6_2-optimized implementations.

This release focuses on the multiplier-level and MAC-level BitSys arithmetic artifacts. It does not include AXI wrappers, host software, DMA integration, or complete system-level accelerator integration.