v0.1.7

@yanfeiwong yanfeiwong released this 09 Jun 12:53
· 13 commits to main since this release

Release Notes (v0.1.7)

CUDA Edge-Case Defense & State Robustness Hardening

  • Global Norm Overflow Protection: Added a safety-threshold clamp in the log domain inside the CUDA kernels. Under extremely sparse gradients, a single-element update could grow excessively large, overflow the FP32 accumulator to INF, and invalidate the global update; the clamp prevents this.
  • EMA State Anti-Collapse Hardening: Added hard clamps on the upper and lower log-domain bounds in the fused quantization kernel, blocking the paths by which loss spikes or low-level floating-point anomalies could make the scale explode and trigger undefined behavior (UB).
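
The release notes do not show the kernel code or the actual thresholds. As a rough illustration of the two clamps described above, the following Python sketch models the logic on the host. It is not the project's implementation: `LOG_UPDATE_MAX`, `LOG_SCALE_MIN`, `LOG_SCALE_MAX`, the `decay` value, the NaN policy, and all function names are hypothetical, and Python doubles stand in for FP32 by comparing against the largest finite FP32 value.

```python
import math

FP32_MAX = 3.4028234663852886e38  # largest finite float32 value

# Hypothetical thresholds, chosen only for this sketch.
LOG_UPDATE_MAX = 20.0                      # caps |update| at exp(20)
LOG_SCALE_MIN, LOG_SCALE_MAX = -30.0, 30.0


def clamp(x, lo, hi):
    """Clamp x to [lo, hi]. NaN is mapped to lo (an assumption of this sketch)."""
    if math.isnan(x):
        return lo
    return max(lo, min(x, hi))


def clamped_update(log_magnitude, sign=1.0):
    """Cap the log-magnitude before exponentiating, so the update stays bounded."""
    return sign * math.exp(clamp(log_magnitude, -math.inf, LOG_UPDATE_MAX))


def squared_norm(updates):
    """Sum of squares: the quantity a global-norm accumulator would hold."""
    return sum(u * u for u in updates)


def ema_log_scale(prev_log_scale, observed_log_scale, decay=0.99):
    """EMA of the scale in the log domain, clamped on input and on output."""
    obs = clamp(observed_log_scale, LOG_SCALE_MIN, LOG_SCALE_MAX)
    new = decay * prev_log_scale + (1.0 - decay) * obs
    return clamp(new, LOG_SCALE_MIN, LOG_SCALE_MAX)


# One extreme element: exp(50)**2 = exp(100) exceeds the FP32 range.
raw_logs = [0.5, 1.0, 50.0]
unclamped = squared_norm(math.exp(v) for v in raw_logs)
clamped = squared_norm(clamped_update(v) for v in raw_logs)
print(unclamped > FP32_MAX, clamped < FP32_MAX)

# A non-finite observation cannot push the EMA state out of bounds.
print(math.isfinite(ema_log_scale(0.0, math.inf)))
print(math.isfinite(ema_log_scale(0.0, math.nan)))
```

Clamping in the log domain bounds the magnitude before `exp` is ever called, so neither the per-element update nor the scale can reach a value whose square would leave the FP32 range. The output clamp in `ema_log_scale` also recovers a state that was already out of bounds.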