Releases
v26.02
Added
B300 support.
Pretrain recipes: Llama 3.1, DeepSeek V3, Nemotron-H, Qwen3.
NCCL benchmark.
CPU overhead microbenchmark.
GPT-OSS pretrain recipe.
DeepSeek V3 Torchtitan FP8 support for GB300 and GB200.
DeepSeek V3 proxy models for 64 GB300/GB200 GPUs.
System info script for IB, container, and enroot diagnostics.
llmb-run archive command to package experiment logs into a tarball.
Exemplar program documentation and tooling.
Changed
Updated recipes to NeMo 26.02.00 where applicable.
Llama3 LoRA finetuning ported to Megatron Bridge.
Torchtitan optimizations for DeepSeek V3.
Centralized peak throughput (TFLOP/GPU) as primary performance metric in READMEs.
Removed FP8 support for Qwen3 235B on GB200.
Removed
Known Issues
Recipes using the NeMo 26.02.00 container will not work with EFA; see the Known Issues section of the README for a workaround.
DeepSeek V3 on EFA clusters may encounter connectivity issues.