v0.1.5
Release Notes (v0.1.5)
Long-Term Training Support, Compilation Compatibility & Documentation Improvements
- Support for long-term continual training: added the `beta2` parameter, which removes the hard binding to the training step count and prevents the optimizer from being blunted by EMA window inflation during long-sequence training.
- JIT compilation bypass switch: added the `use_cuda_kernel` parameter, which allows JIT compilation to be explicitly disabled in environments without a CUDA compiler, falling back directly to the pure PyTorch implementation. This release also fixes the hang caused by retries after a compilation failure.
- Documentation improvements.
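
To make the EMA window inflation concrete, here is a minimal sketch in plain Python. It relies only on the general fact that an exponential moving average with decay `beta2` averages over roughly `1 / (1 - beta2)` recent steps. The step-dependent schedule `1 - 1/t` and the fixed value `0.999` are assumptions chosen for illustration; they are not taken from this project's implementation or its defaults.

```python
def effective_window(beta2: float) -> float:
    """Approximate number of recent steps an EMA with decay beta2 averages over."""
    return 1.0 / (1.0 - beta2)


def step_bound_beta2(step: int) -> float:
    """Hypothetical step-dependent schedule beta2_t = 1 - 1/t (illustration only)."""
    return 1.0 - 1.0 / step


# Hypothetical fixed decay, not a documented default of this release.
FIXED_BETA2 = 0.999

for step in (10, 1_000, 100_000):
    bound = effective_window(step_bound_beta2(step))
    fixed = effective_window(FIXED_BETA2)
    print(f"step={step:>7}  step-bound window ~ {bound:,.0f}  fixed window ~ {fixed:,.0f}")
```

Under these assumptions the step-bound window keeps growing with the step count, so late in a long run each new gradient barely moves the average. A fixed `beta2` keeps the window constant regardless of how long training continues.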