Transformation stage 0: NeMo docs revision by pzelasko · Pull Request #15363 · NVIDIA-NeMo/NeMo

pzelasko · 2026-02-05T23:16:48Z

What does this PR do?

Overhauled NeMo developer docs to reflect the project's refocus on Speech AI.

Collection: all

Changelog

- Removed all documentation for deleted collections (NLP, multimodal, vision, LLM, text processing) and Megatron-specific content (MoE, distributed checkpoints, Megatron-LM conversion, LLM optimizations, performance benchmarks)
- Rewrote landing page with three-tier overview (Models, Inference & Deployment, Voice Agent) and HuggingFace model links
- Rewrote feature docs: parallelisms (DDP + ModelParallelStrategy), mixed precision (mixed vs true half, HalfPrecisionForAudio), checkpoints (.nemo, .safetensors, distributed)
- Rewrote getting started: Quick Start with 4 Speech AI examples (Parakeet, FastPitch+HiFi-GAN, Sortformer, Canary-Qwen), actual install instructions, updated prerequisites
- Rewrote best-practices as a "Why NeMo?" overview of Speech AI capabilities
- Simplified collections index, checkpoints page, and API references
- Fixed all errors and warnings in docs build

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

Related to # (issue)

…text_processing) Remove all doc references to collections that are no longer part of NeMo: nlp, multimodal, vision, vlm, avlm, diffusion, llm, multimodal_autoregressive, speechlm, and text_processing. Update landing page, collections index, tutorials, conf.py bibtex entries, and cross-reference links. Delete the remaining docs/source/multimodal/ directory. Clean up false_positives.json and links_needing_review.json to remove entries for deleted doc pages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… AI focus Remove obsolete docs (performance benchmarks, MoE, Megatron optimizations, distributed checkpoints, Megatron-LM conversion). Rewrite landing page, parallelisms, mixed precision, Quick Start, and best-practices pages to reflect NeMo's Speech AI focus. Update collection references throughout to include ASR, TTS, Audio, and SpeechLM2. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mixed precision: document mixed vs true half precision modes, explain HalfPrecisionForAudio plugin that preserves audio tensor precision. Parallelisms: document DDP (all collections) and ModelParallelStrategy (SpeechLM2) with FSDP2/TP/SP concepts, configuration examples, and requirement for configure_model() implementation. Checkpoints: document .nemo as tar archive (unpack/repack), .safetensors for SpeechLM2 via HuggingFace Hub, and distributed checkpoints with ModelParallelStrategy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
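The unpack/repack workflow mentioned for .nemo files can be sketched with Python's standard tarfile module. This is a minimal illustration, not the procedure from the rewritten docs: the helper names and the member names (model_config.yaml, model_weights.ckpt) are assumptions, and the sketch builds a small stand-in archive instead of touching a real checkpoint.

```python
import tarfile
import tempfile
from pathlib import Path


def unpack_archive(archive_path: Path, dest_dir: Path) -> list:
    """Extract the regular files of a tar archive into dest_dir; return their names."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    names = []
    # "r:*" lets tarfile detect whether the archive is compressed.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Keep only the base name: the stand-in archive is flat, and this
            # also prevents members from escaping dest_dir.
            target = dest_dir / Path(member.name).name
            target.write_bytes(tar.extractfile(member).read())
            names.append(target.name)
    return sorted(names)


def repack_archive(src_dir: Path, archive_path: Path) -> None:
    """Write every file in src_dir into a new, uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(src_dir.iterdir()):
            tar.add(path, arcname=path.name)


def roundtrip_demo():
    """Build a stand-in archive, unpack it, edit the config, and repack it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)

        # Hypothetical contents; a real checkpoint may hold different files.
        src = tmp / "src"
        src.mkdir()
        (src / "model_config.yaml").write_text("precision: 32\n")
        (src / "model_weights.ckpt").write_bytes(b"\x00" * 16)
        original = tmp / "model.nemo"
        repack_archive(src, original)

        # Unpack, change one config value, and repack under a new name.
        unpacked = tmp / "unpacked"
        names = unpack_archive(original, unpacked)
        config = unpacked / "model_config.yaml"
        config.write_text(config.read_text().replace("32", "bf16"))
        edited = tmp / "model_edited.nemo"
        repack_archive(unpacked, edited)

        # Read the edited archive back to confirm the change survived.
        final = tmp / "final"
        unpack_archive(edited, final)
        return names, (final / "model_config.yaml").read_text()
```

Calling roundtrip_demo() returns the member names of the stand-in archive and the edited config text. For a real checkpoint, inspect the archive's member list first, since its layout may differ from the flat one assumed here.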

Rename "NeMo Framework" to "NeMo Toolkit" across docs. Rewrite landing page with three-tier capability overview (Models, Inference & Deployment, Voice Agent) and HuggingFace model links. Flatten collections toctree into a single index. Update intro: Python 3.12 / PyTorch 2.7+ prerequisites, add PyPI and source install instructions, remove broken User Guide link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace broken NVIDIA docs URLs: use cross-reference to local ASR datasets doc in speechlm2, inline pip install command in g2p. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nithinraok

LGTM!

I'm getting a few issues while building the docs; please check with make -C docs clean html

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

pzelasko · 2026-02-10T18:48:10Z

@nithinraok I fixed all errors and warnings in the docs build. This triggered some linter errors, which I also fixed.

nithinraok

LGTM. Thanks Piotr!

github-actions · 2026-02-11T05:55:43Z

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

//cc @chtruong814 @ko3n1g @pablo-garay @thomasdhc

* Remove deprecated collection documentation (nlp, multimodal, vision, text_processing)
  Remove all doc references to collections that are no longer part of NeMo: nlp, multimodal, vision, vlm, avlm, diffusion, llm, multimodal_autoregressive, speechlm, and text_processing. Update landing page, collections index, tutorials, conf.py bibtex entries, and cross-reference links. Delete the remaining docs/source/multimodal/ directory. Clean up false_positives.json and links_needing_review.json to remove entries for deleted doc pages.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Deep cleanup of docs: remove Megatron/LLM content, rewrite for Speech AI focus
  Remove obsolete docs (performance benchmarks, MoE, Megatron optimizations, distributed checkpoints, Megatron-LM conversion). Rewrite landing page, parallelisms, mixed precision, Quick Start, and best-practices pages to reflect NeMo's Speech AI focus. Update collection references throughout to include ASR, TTS, Audio, and SpeechLM2.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Expand docs for mixed precision, parallelisms, and checkpoints
  Mixed precision: document mixed vs true half precision modes, explain HalfPrecisionForAudio plugin that preserves audio tensor precision. Parallelisms: document DDP (all collections) and ModelParallelStrategy (SpeechLM2) with FSDP2/TP/SP concepts, configuration examples, and requirement for configure_model() implementation. Checkpoints: document .nemo as tar archive (unpack/repack), .safetensors for SpeechLM2 via HuggingFace Hub, and distributed checkpoints with ModelParallelStrategy.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Rebrand to NeMo Toolkit, rewrite landing page, update intro
  Rename "NeMo Framework" to "NeMo Toolkit" across docs. Rewrite landing page with three-tier capability overview (Models, Inference & Deployment, Voice Agent) and HuggingFace model links. Flatten collections toctree into a single index. Update intro: Python 3.12 / PyTorch 2.7+ prerequisites, add PyPI and source install instructions, remove broken User Guide link.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix broken external links in speechlm2/datasets and tts/g2p docs
  Replace broken NVIDIA docs URLs: use cross-reference to local ASR datasets doc in speechlm2, inline pip install command in g2p.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix some warnings
  Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* All warnings and errors fixed
  Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* Fix linter
  Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

pzelasko and others added 4 commits February 5, 2026 16:47

github-actions bot added the TTS and Multi Modal labels Feb 5, 2026

pzelasko requested a review from blisc February 5, 2026 23:16

Fix broken external links in speechlm2/datasets and tts/g2p docs

16fcc62

Replace broken NVIDIA docs URLs: use cross-reference to local ASR datasets doc in speechlm2, inline pip install command in g2p. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pzelasko added the documentation label (Improvements or additions to documentation) and removed the TTS and Multi Modal labels Feb 5, 2026

pzelasko requested a review from nithinraok February 6, 2026 19:43

pzelasko marked this pull request as ready for review February 9, 2026 15:05

nithinraok previously approved these changes Feb 9, 2026

pzelasko added 2 commits February 10, 2026 11:41

Fix some warnings

528c4ed

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

All warnings and errors fixed

5154ad4

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

pzelasko dismissed nithinraok’s stale review via 5154ad4 February 10, 2026 18:02

github-actions bot added the core (Changes to NeMo Core), TTS, ASR, common, Multi Modal, and audio labels Feb 10, 2026

Fix linter

8adc00d

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

pzelasko added the Run CICD label Feb 10, 2026

chtruong814 added Run CICD and removed Run CICD labels Feb 10, 2026

chtruong814 temporarily deployed to test February 10, 2026 18:44 — with GitHub Actions Inactive

pzelasko requested a review from nithinraok February 10, 2026 18:47

nithinraok approved these changes Feb 10, 2026

github-actions bot removed the Run CICD label Feb 10, 2026

chtruong814 temporarily deployed to test February 11, 2026 02:47 — with GitHub Actions Inactive

pzelasko merged commit bf365d0 into main Feb 11, 2026
258 of 270 checks passed

pzelasko deleted the stage0-docs-revision branch February 11, 2026 18:49

pzelasko merged 8 commits into main from stage0-docs-revision

3 participants
