v0.1.5

@yanfeiwong yanfeiwong released this 09 Jun 05:24
· 16 commits to main since this release

Release Notes (v0.1.5)

Long Training Support, Compilation Compatibility & Documentation Improvements

  • Support for long-term continual training: added the beta2 parameter. It removes the hard binding between the optimizer's EMA and the training step count, which prevents the optimizer from becoming sluggish as the EMA window inflates during long-sequence training.
  • JIT compilation bypass switch: added the use_cuda_kernel parameter. It lets you explicitly disable JIT compilation in environments without a CUDA compiler, so the optimizer falls back directly to the pure PyTorch implementation. This release also fixes a hang caused by retries after a compilation failure.
  • Documentation improvements.
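
To make the beta2 item concrete, here is a minimal sketch of why an EMA coefficient tied to the step count can dull an optimizer. An EMA with coefficient beta2 averages over roughly 1 / (1 - beta2) recent steps. The step-bound schedule below (beta2 = 1 - 1/t) is an illustrative assumption, not necessarily the schedule this library used before v0.1.5, and `run_ema`, `step_bound_beta2` and `fixed_beta2` are hypothetical names.

```python
def effective_window(beta2: float) -> float:
    """Approximate number of recent steps an EMA with this coefficient averages over."""
    return 1.0 / (1.0 - beta2)


def step_bound_beta2(step: int) -> float:
    # Hypothetical step-bound schedule (an assumption for illustration):
    # the window grows linearly with the step count.
    return 1.0 - 1.0 / step


def fixed_beta2(step: int) -> float:
    # A user-chosen constant: the window stays near 100 steps forever.
    return 0.99


def run_ema(signal, beta2_for_step):
    """Run v = beta2 * v + (1 - beta2) * x over the signal and return the final v."""
    v = 0.0
    for t, x in enumerate(signal, start=1):
        b = beta2_for_step(t)
        v = b * v + (1.0 - b) * x
    return v


# 10,000 steps where the tracked statistic is 1.0, then 100 steps where it jumps to 4.0.
signal = [1.0] * 10_000 + [4.0] * 100

tied = run_ema(signal, step_bound_beta2)   # behaves like a running mean over all steps
fixed = run_ema(signal, fixed_beta2)       # tracks the recent value much more closely

print(f"step-bound beta2: {tied:.3f}")
print(f"fixed beta2=0.99: {fixed:.3f}")
```

Under these assumptions the step-bound estimate barely moves after the shift, because 100 new steps are diluted by 10,000 old ones, while the fixed-beta2 estimate covers most of the distance to the new value.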
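
The bypass switch and the retry fix can be pictured with the following sketch. It is not this library's code: `KernelLoader`, `compile_fn` and `scale` are hypothetical names, and a plain-Python list operation stands in for the pure PyTorch path. The idea shown is only the general pattern of skipping compilation when the flag is off and remembering a failed attempt so it is never retried.

```python
from typing import Callable, List, Optional

Kernel = Callable[[List[float], float], List[float]]


class KernelLoader:
    """Hypothetical loader: compiles at most once and honours an opt-out flag."""

    def __init__(self, compile_fn: Callable[[], Kernel], use_cuda_kernel: bool = True):
        self._compile_fn = compile_fn
        self._use_cuda_kernel = use_cuda_kernel
        self._attempted = False
        self._kernel: Optional[Kernel] = None

    def get(self) -> Optional[Kernel]:
        if not self._use_cuda_kernel:
            return None  # explicit bypass: never touch the compiler
        if not self._attempted:
            self._attempted = True  # record the attempt even if it fails
            try:
                self._kernel = self._compile_fn()
            except Exception:
                self._kernel = None  # fall back from now on, no retry
        return self._kernel


def scale(values: List[float], factor: float, loader: KernelLoader) -> List[float]:
    kernel = loader.get()
    if kernel is not None:
        return kernel(values, factor)
    # Fallback path (stand-in for a pure PyTorch implementation).
    return [v * factor for v in values]
```

With a loader built as `KernelLoader(compile_fn, use_cuda_kernel=False)`, `compile_fn` is never called. With the flag on and a failing compiler, it is called exactly once and every later call goes straight to the fallback.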