Introduce Quadtrix benchmark suite with Python and C++ support#44

Merged
Eamon2009 merged 15 commits into exp from master
May 21, 2026

@Eamon2009
Owner

Summary

  • Project Versioning: Sets the starting project version to 0.1.0.

  • Code Shortcuts (Macros): Defines shorthand macros for CUDA keywords (for example, wrapping `__device__` as QX_DEVICE) to make GPU kernels easier to write.

  • Math & Memory Utilities: Adds math helpers (CEIL_DIV, ROUND_UP, NEXT_POW2) for ceiling division, rounding up to a multiple, and finding power-of-two boundaries, plus utilities for aligning memory.

  • Memory Optimization: Aligns memory to 128-byte boundaries, which helps the GPU coalesce memory accesses.

  • Automatic Error Checking: Introduces error-checking macros (CUDA_CHECK, CUBLAS_CHECK, NCCL_CHECK) that check the result of CUDA, cuBLAS, and NCCL calls, so failures are reported early and are easier to debug.
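
The macro and alignment bullets above could look roughly like the following sketch. Only the names QX_INLINE, QX_DEVICE, CEIL_DIV, ROUND_UP, and NEXT_POW2 come from this PR; their definitions here, along with `QX_HOST_DEVICE`, `QX_ALIGN_BYTES`, `next_pow2`, and `align_bytes`, are assumptions for illustration and may differ from the merged header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Compiler hints: expand to CUDA qualifiers under nvcc, to nothing elsewhere.
// QX_HOST_DEVICE is a hypothetical addition, not named in the PR.
#if defined(__CUDACC__)
#define QX_DEVICE __device__
#define QX_HOST_DEVICE __host__ __device__
#else
#define QX_DEVICE
#define QX_HOST_DEVICE
#endif
#define QX_INLINE inline

// Integer helpers. Arguments are evaluated more than once, so avoid side effects.
#define CEIL_DIV(a, b) (((a) + (b) - 1) / (b))
#define ROUND_UP(a, b) (CEIL_DIV((a), (b)) * (b))

// Hypothetical constant for the 128-byte alignment mentioned in the summary.
constexpr std::size_t QX_ALIGN_BYTES = 128;

// Smallest power of two >= x (1 for x == 0), computed by bit smearing.
QX_HOST_DEVICE QX_INLINE std::uint32_t next_pow2(std::uint32_t x) {
    assert(x <= 0x80000000u);  // the result must fit in 32 bits
    if (x <= 1) return 1;
    --x;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x + 1;
}
#define NEXT_POW2(x) next_pow2(x)

// Pad a byte count up to the next multiple of QX_ALIGN_BYTES.
QX_HOST_DEVICE QX_INLINE std::size_t align_bytes(std::size_t n) {
    return ROUND_UP(n, QX_ALIGN_BYTES);
}
```

A typical use of `CEIL_DIV` is sizing a kernel launch: `CEIL_DIV(1000, 256)` yields 4 blocks, enough to cover 1000 elements with 256 threads per block.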

Eamon2009 added 15 commits May 16, 2026 18:16
## Summary
Introduces a CLI tool to load, index, and align benchmark JSON results
from both backends. It displays a side-by-side comparison table showing
latency (ms), throughput (tokens/s), and the percentage
speedup/slowdown.
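
As an illustration of the arithmetic behind such a comparison table, here is a minimal sketch. The `Comparison` struct, both function names, and the exact speedup definition are assumptions; the PR does not state which formula the CLI uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical row: one benchmark measured on both backends.
struct Comparison {
    std::string name;
    double python_ms;  // latency reported by the Python runner
    double cpp_ms;     // latency reported by the C++ runner
};

// Percent change of C++ relative to Python; positive means C++ is faster.
// Assumed definition: (python_ms / cpp_ms - 1) * 100.
inline double speedup_percent(const Comparison& row) {
    assert(row.cpp_ms > 0.0);
    return (row.python_ms / row.cpp_ms - 1.0) * 100.0;
}

// Throughput in tokens per second from a token count and a latency in ms.
inline double tokens_per_second(double tokens, double latency_ms) {
    assert(latency_ms > 0.0);
    return tokens * 1000.0 / latency_ms;
}
```

Under this definition, a benchmark that takes 20 ms in Python and 10 ms in C++ is reported as a 100% speedup, and the reverse case as a 50% slowdown.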
## Summary
Execution wrapper for Python runner

Adds a boilerplate compatibility script that handles safe system exits and
execution routing for the Python benchmark.
## Summary
Introduces the primary Python benchmark runner, measuring model
metadata, data throughput, forward latency, training-step latency, and
autoregressive generation. Includes utility functions for dynamic module
loading, timing, and percentile calculation.

## Model Benchmarking
- Latency Profiling: Tracks forward pass, training step, and autoregressive generation latencies.
- Throughput Tracking: Measures tokenizer processing speeds and data throughput.
- Resource Monitoring: Captures model metadata and system memory footprints during runs.

## Math Utilities
- Dynamic Loading: Implements safe runtime module loading via importlib to dynamically interact with engine/inference.py.
- Statistical Metrics: Adds custom mathematical utility functions, including a precise percentile calculator ($P_{50}$, $P_{90}$, $P_{99}$) for latency distribution reporting.
- Standardized Exports: Lays the groundwork for structured JSON and CSV output formatting.
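
For reference, a percentile calculator of the kind described above can be sketched as follows. This version uses linear interpolation between the two closest ranks, which is an assumption; the PR does not say which interpolation rule the runner uses, and the function name `percentile` is hypothetical. It is written in C++ to match the C++ runner that mirrors this suite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Percentile with linear interpolation between closest ranks.
// `p` is in [0, 100]. `samples` is taken by value so the caller's order is kept.
inline double percentile(std::vector<double> samples, double p) {
    assert(!samples.empty());
    assert(p >= 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    // Fractional position of the requested percentile in the sorted samples.
    const double rank = (p / 100.0) * static_cast<double>(samples.size() - 1);
    const std::size_t lo = static_cast<std::size_t>(rank);
    const std::size_t hi = std::min(lo + 1, samples.size() - 1);
    const double frac = rank - static_cast<double>(lo);
    return samples[lo] + frac * (samples[hi] - samples[lo]);
}
```

With latencies `{1, 2, 3, 4}` ms, `percentile(latencies, 50)` falls halfway between the second and third samples and returns 2.5.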
Introduces the primary C++ benchmark runner (cpp_benchmark.cpp). It defines the configuration parsing, the metric-tracking structures (Stats and BenchRow), and the basic timing and utility abstractions needed to mirror the Python benchmark suite.
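
A timing abstraction of this kind might look like the sketch below. The struct name `Stats` comes from the PR, but its fields, the `time_runs` and `summarize` helpers, and the warmup convention are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical fields; the PR names `Stats` but does not list its members.
struct Stats {
    double mean_ms;
    double min_ms;
    double max_ms;
};

// Run `fn` for `warmup` untimed iterations, then time `iters` iterations
// and return the per-iteration latencies in milliseconds.
template <typename Fn>
std::vector<double> time_runs(Fn&& fn, int warmup, int iters) {
    assert(warmup >= 0 && iters > 0);
    for (int i = 0; i < warmup; ++i) fn();
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(iters));
    for (int i = 0; i < iters; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return ms;
}

// Reduce a list of latencies to mean, minimum, and maximum.
inline Stats summarize(const std::vector<double>& ms) {
    assert(!ms.empty());
    const double sum = std::accumulate(ms.begin(), ms.end(), 0.0);
    const auto mm = std::minmax_element(ms.begin(), ms.end());
    return Stats{sum / static_cast<double>(ms.size()), *mm.first, *mm.second};
}
```

`std::chrono::steady_clock` is used because it is monotonic, so the measured intervals are not affected by wall-clock adjustments during a run.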
Added an image to the README for better visualization.
- Define core architecture, compiler hints (`QX_INLINE`, `QX_DEVICE`, etc.).
- Implement generic math macros (`CEIL_DIV`, `ROUND_UP`, `NEXT_POW2`).
- Add memory alignment utilities targeting 128-byte boundaries.
- Implement explicit error-checking macros for CUDA, cuBLAS, and NCCL.
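
The error-checking pattern in the last item can be sketched without a GPU toolchain by substituting a stand-in status type. Everything below (`FakeStatus`, `qx_check`, `QX_FAKE_CHECK`, and the `fake_alloc_*` functions) is hypothetical. A real CUDA_CHECK would typically compare a `cudaError_t` against `cudaSuccess` and format the message with `cudaGetErrorString`, and whether the PR's macros throw, abort, or only log is not stated.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in for a library status type such as cudaError_t (hypothetical).
enum class FakeStatus { Success = 0, OutOfMemory = 2 };

inline const char* fake_status_string(FakeStatus s) {
    return s == FakeStatus::Success ? "success" : "out of memory";
}

// Throw with file and line context if `status` is not Success.
inline void qx_check(FakeStatus status, const char* expr, const char* file, int line) {
    assert(expr != nullptr && file != nullptr);
    if (status == FakeStatus::Success) return;
    std::ostringstream msg;
    msg << file << ":" << line << ": `" << expr << "` failed: "
        << fake_status_string(status);
    throw std::runtime_error(msg.str());
}

// Wrap a call so the failing expression and its call site are captured.
#define QX_FAKE_CHECK(call) qx_check((call), #call, __FILE__, __LINE__)

// Hypothetical API calls used only to exercise the macro.
inline FakeStatus fake_alloc_ok() { return FakeStatus::Success; }
inline FakeStatus fake_alloc_fail() { return FakeStatus::OutOfMemory; }
```

Writing `QX_FAKE_CHECK(fake_alloc_fail());` raises a `std::runtime_error` whose message names the source file, the line, the wrapped expression, and the status text, which is the main debugging benefit of this style of macro.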
@Eamon2009 Eamon2009 self-assigned this May 21, 2026
@Eamon2009 Eamon2009 added the cuda and python labels May 21, 2026
@Eamon2009 Eamon2009 merged commit 0517a06 into exp May 21, 2026
9 checks passed
Eamon2009 added a commit that referenced this pull request May 21, 2026
…45)

## Summary
- Project Versioning: Sets the starting project version to 0.1.0.

- Code Shortcuts (Macros): Creates clean shorthand terms for CUDA
keywords (like wrapping __device__ into QX_DEVICE) to make writing GPU
kernels cleaner.

- Math & Memory Utilities: Adds fast math helpers for aligning memory,
rounding numbers, and calculating power-of-two boundaries quickly.

- Memory Optimization: Forces a 128-byte memory alignment to ensure the
GPU can read data as fast as possible (coalesced memory access).

- Automatic Error Checking: Introduces safety wrappers (CUDA_CHECK,
CUBLAS_CHECK, NCCL_CHECK) that instantly watch for crashes or failures
in Nvidia's core hardware and math libraries, making debugging much
easier.