Add: BGEMM example for tensormap_and_ringbuffer runtime#89

Merged: ChaoWao merged 1 commit into hw-native-sys:main from chenshengxin2026:bgemm/tensormap on Feb 14, 2026

@chenshengxin2026
Contributor

Add BGEMM (Batched General Matrix Multiplication) example demonstrating
Cube (AIC) and Vector (AIV) core cooperation with tile-based computation
using the tensormap_and_ringbuffer runtime.

Add examples/tensormap_and_ringbuffer/bgemm/:

  • Task dependencies are resolved automatically via TensorMap overlap detection
  • Supports a 4x4x4 tile grid with 64x64 tiles (256x256 matrices, batch=2)
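For readers unfamiliar with overlap-based dependency resolution, the idea can be sketched roughly as follows. This is an illustrative model only, not the runtime's actual TensorMap API: `Region`, `Task`, `submit`, and the byte-range bookkeeping are all hypothetical names, and the buffer addresses are made up. A task is assumed to depend on any earlier task whose written memory overlaps memory it reads or writes (or whose read memory it overwrites).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """Hypothetical half-open byte range [start, end) in device memory."""
    start: int
    end: int

    def overlaps(self, other):
        return self.start < other.end and other.start < self.end


@dataclass
class Task:
    name: str
    reads: list
    writes: list
    deps: list = field(default_factory=list)


def submit(tasks, task):
    """Record dependencies on earlier tasks by checking region overlap."""
    for earlier in tasks:
        raw = any(r.overlaps(w) for r in task.reads for w in earlier.writes)
        waw = any(a.overlaps(b) for a in task.writes for b in earlier.writes)
        war = any(w.overlaps(r) for w in task.writes for r in earlier.reads)
        if raw or waw or war:
            task.deps.append(earlier.name)
    tasks.append(task)
    return task


TILE_BYTES = 64 * 64 * 4  # one 64x64 tile, assuming 4-byte elements


def tile(i):
    """i-th tile-sized buffer in an imaginary flat address space."""
    return Region(i * TILE_BYTES, (i + 1) * TILE_BYTES)


A0, B0, A1, B1, P0, P1, C = (tile(i) for i in range(7))

tasks = []
# Two K-steps for one output tile: matmul into a partial buffer, then accumulate.
gemm0 = submit(tasks, Task("gemm0", reads=[A0, B0], writes=[P0]))
add0 = submit(tasks, Task("add0", reads=[C, P0], writes=[C]))
gemm1 = submit(tasks, Task("gemm1", reads=[A1, B1], writes=[P1]))
add1 = submit(tasks, Task("add1", reads=[C, P1], writes=[C]))

for t in tasks:
    print(t.name, "depends on", t.deps)
```

In this model the two matmul tasks are independent of each other and could run concurrently, while each accumulation waits for its matmul and for the previous accumulation into the same output tile.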

Example includes:

  • golden.py: Test specification with tile-first 5D memory layout
  • kernel_config.py: Runtime and kernel configuration
  • kernels/aic/kernel_gemm_tile.cpp: Cube core matmul kernel (TMATMUL)
  • kernels/aiv/kernel_tile_add.cpp: Vector core accumulation kernel (TADD)
  • kernels/orchestration/bgemm_orch.cpp: Task graph builder
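The computation these files implement can be modelled in plain Python as a sketch. Several things here are assumptions rather than facts from the PR: the 5D layout is taken to be [batch, tile_row, tile_col, row_in_tile, col_in_tile], and the helper names (`tile_offset`, `to_tile_first`, `gemm_tile`, `tile_add`, `bgemm_tiled`) are hypothetical. `gemm_tile` stands in for the Cube-core matmul step and `tile_add` for the Vector-core accumulation step; neither reflects the real kernel code.

```python
def tile_offset(b, ti, tj, grid_rows, grid_cols, tile):
    """Flat offset of tile (ti, tj) of batch b in an assumed
    [batch, grid_rows, grid_cols, tile, tile] tile-first layout."""
    return ((b * grid_rows + ti) * grid_cols + tj) * tile * tile


def to_tile_first(mats, grid, tile):
    """Flatten a list of row-major square matrices into the tile-first layout."""
    flat = []
    for mat in mats:
        for ti in range(grid):
            for tj in range(grid):
                for r in range(tile):
                    row = mat[ti * tile + r]
                    flat.extend(row[tj * tile:(tj + 1) * tile])
    return flat


def gemm_tile(a, b, tile):
    """Multiply two flattened tile x tile blocks (models the matmul kernel)."""
    out = [0.0] * (tile * tile)
    for i in range(tile):
        for k in range(tile):
            aik = a[i * tile + k]
            for j in range(tile):
                out[i * tile + j] += aik * b[k * tile + j]
    return out


def tile_add(c, p):
    """Elementwise accumulate a partial product (models the add kernel)."""
    return [x + y for x, y in zip(c, p)]


def bgemm_tiled(A, B, batch, grid, tile):
    """Batched GEMM over a grid x grid x grid tile space, tile-first layout."""
    n = tile * tile
    C = [0.0] * (batch * grid * grid * n)
    for b in range(batch):
        for m in range(grid):
            for col in range(grid):
                c_off = tile_offset(b, m, col, grid, grid, tile)
                for k in range(grid):
                    a_off = tile_offset(b, m, k, grid, grid, tile)
                    b_off = tile_offset(b, k, col, grid, grid, tile)
                    partial = gemm_tile(A[a_off:a_off + n], B[b_off:b_off + n], tile)
                    C[c_off:c_off + n] = tile_add(C[c_off:c_off + n], partial)
    return C


# Small demonstration: batch=1, 2x2x2 grid of 2x2 tiles (4x4 matrices).
A = [[r * 4 + c for c in range(4)] for r in range(4)]
I = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
A_tiled = to_tile_first([A], grid=2, tile=2)
I_tiled = to_tile_first([I], grid=2, tile=2)
C_tiled = bgemm_tiled(A_tiled, I_tiled, batch=1, grid=2, tile=2)
print(C_tiled == A_tiled)
```

The PR's stated configuration would correspond to `batch=2, grid=4, tile=64` in this model; the pure-Python loops are far too slow to be practical at that size and are meant only to show the tile traversal and accumulation order.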

Closes #84

@ChaoWao ChaoWao merged commit fbca018 into hw-native-sys:main Feb 14, 2026
7 of 9 checks passed
PKUZHOU pushed a commit to PKUZHOU/simpler that referenced this pull request Mar 31, 2026

Development

Successfully merging this pull request may close these issues.

rtStreamSynchronize (AICPU) fails with error 507018 in TensorMap and RingBuffer runtime

2 participants