
Transformation stage 0: NeMo docs revision #15363

Merged
pzelasko merged 8 commits into main from stage0-docs-revision
Feb 11, 2026
Conversation

@pzelasko
Collaborator

@pzelasko pzelasko commented Feb 5, 2026

What does this PR do?

Overhauled NeMo developer docs to reflect the project's refocus on Speech AI.

Collection: all

Changelog

  • Removed all documentation for deleted collections (NLP, multimodal, vision, LLM, text
    processing) and Megatron-specific content (MoE, distributed checkpoints, Megatron-LM conversion,
    LLM optimizations, performance benchmarks)
  • Rewrote landing page with three-tier overview (Models, Inference & Deployment, Voice Agent)
    and HuggingFace model links
  • Rewrote feature docs: parallelisms (DDP + ModelParallelStrategy), mixed precision (mixed vs
    true half, HalfPrecisionForAudio), checkpoints (.nemo, .safetensors, distributed)
  • Rewrote getting started: Quick Start with 4 Speech AI examples (Parakeet, FastPitch+HiFi-GAN,
    Sortformer, Canary-Qwen), actual install instructions, updated prerequisites
  • Rewrote best-practices as a "Why NeMo?" overview of Speech AI capabilities
  • Simplified collections index, checkpoints page, and API references
  • Fixed all errors and warnings in docs build

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

pzelasko and others added 4 commits February 5, 2026 16:47
…text_processing)

Remove all doc references to collections that are no longer part of NeMo:
nlp, multimodal, vision, vlm, avlm, diffusion, llm, multimodal_autoregressive,
speechlm, and text_processing. Update landing page, collections index, tutorials,
conf.py bibtex entries, and cross-reference links. Delete the remaining
docs/source/multimodal/ directory. Clean up false_positives.json and
links_needing_review.json to remove entries for deleted doc pages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… AI focus

Remove obsolete docs (performance benchmarks, MoE, Megatron optimizations,
distributed checkpoints, Megatron-LM conversion). Rewrite landing page,
parallelisms, mixed precision, Quick Start, and best-practices pages to
reflect NeMo's Speech AI focus. Update collection references throughout
to include ASR, TTS, Audio, and SpeechLM2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mixed precision: document mixed vs true half precision modes, explain
HalfPrecisionForAudio plugin that preserves audio tensor precision.
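
To make the "mixed vs true half" distinction concrete, here is a toy sketch in plain Python. It is not NeMo's or Lightning's implementation and does not model the HalfPrecisionForAudio plugin; it uses IEEE float16 from the standard library rather than bfloat16, and the helper names (`to_fp16`, `to_fp32`, `train`) are hypothetical. It only illustrates why storing weights in float32 (the mixed mode) can preserve small updates that are rounded away when the weight itself lives in half precision (the true half mode).

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def train(steps: int, update: float, true_half: bool) -> float:
    """Apply `steps` identical additive updates to a weight starting at 1.0.

    true_half=True  -> the weight itself is stored in float16.
    true_half=False -> the weight is stored in float32 (as in mixed precision)
                       and only rounded to float16 when read out at the end.
    """
    store = to_fp16 if true_half else to_fp32
    weight = store(1.0)
    for _ in range(steps):
        weight = store(weight + update)
    return to_fp16(weight)


# Hypothetical numbers chosen so the update is below float16 resolution near 1.0.
true_half_weight = train(steps=100, update=1e-4, true_half=True)
mixed_weight = train(steps=100, update=1e-4, true_half=False)
print("true half:", true_half_weight)
print("mixed:    ", mixed_weight)
```

In this sketch the true-half weight never moves, because each update is smaller than half the float16 spacing around 1.0, while the float32-stored weight accumulates all 100 updates. Real training differs in many ways (bfloat16 ranges, loss scaling, optimizer state), so treat this only as intuition for the trade-off.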

Parallelisms: document DDP (all collections) and ModelParallelStrategy
(SpeechLM2) with FSDP2/TP/SP concepts, configuration examples, and
requirement for configure_model() implementation.
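
As a rough illustration of how a data-parallel (FSDP2) dimension and a tensor-parallel dimension can share one set of GPUs, the sketch below maps global ranks onto a 2-D grid. The helper `mesh_coordinates` is hypothetical and is not part of NeMo or Lightning; it assumes a row-major layout in which the tensor-parallel index varies fastest, which may not match how a given strategy actually builds its device mesh.

```python
def mesh_coordinates(world_size: int, tensor_parallel_size: int) -> dict[int, tuple[int, int]]:
    """Map each global rank to (data_parallel_index, tensor_parallel_index).

    Assumes world_size = data_parallel_size * tensor_parallel_size and a
    row-major layout (tensor-parallel index changes fastest).
    """
    if world_size % tensor_parallel_size != 0:
        raise ValueError("world_size must be divisible by tensor_parallel_size")
    return {rank: divmod(rank, tensor_parallel_size) for rank in range(world_size)}


# Hypothetical example: 8 GPUs split into 4 data-parallel groups of 2 tensor-parallel ranks.
layout = mesh_coordinates(world_size=8, tensor_parallel_size=2)
for rank, (dp, tp) in layout.items():
    print(f"rank {rank}: data-parallel group {dp}, tensor-parallel slot {tp}")
```

Under these assumptions, ranks that share a data-parallel index would split one model replica's tensors between them, while ranks that share a tensor-parallel slot would hold corresponding shards across replicas.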

Checkpoints: document .nemo as tar archive (unpack/repack), .safetensors
for SpeechLM2 via HuggingFace Hub, and distributed checkpoints with
ModelParallelStrategy.
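
Because a .nemo file is described here as a tar archive, unpacking and repacking can be sketched with Python's standard `tarfile` module alone. The example below builds a tiny stand-in archive instead of using a real checkpoint; the file names inside it and the helpers `unpack` and `repack` are assumptions for illustration, not NeMo APIs.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def unpack(archive: Path, out_dir: Path) -> list[str]:
    """Extract a tar archive into out_dir and return the member names.

    Mode "r:*" lets tarfile detect compression automatically.
    Only unpack archives from sources you trust.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


def repack(src_dir: Path, archive: Path) -> None:
    """Write every file under src_dir into a new uncompressed tar archive."""
    with tarfile.open(archive, "w") as tar:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=str(path.relative_to(src_dir)))


with tempfile.TemporaryDirectory() as tmp_name:
    tmp = Path(tmp_name)

    # Build a stand-in archive; member names and contents are made up.
    original = tmp / "toy_model.nemo"
    stand_in_files = {
        "model_config.yaml": b"sample_rate: 16000\n",
        "model_weights.ckpt": b"\x00" * 16,
    }
    with tarfile.open(original, "w") as tar:
        for name, payload in stand_in_files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

    # Unpack, edit the config, and repack under a new name.
    members = unpack(original, tmp / "unpacked")
    config_path = tmp / "unpacked" / "model_config.yaml"
    config_path.write_text(config_path.read_text() + "edited: true\n")
    repacked = tmp / "toy_model_edited.nemo"
    repack(tmp / "unpacked", repacked)

    # Verify the round trip by unpacking the new archive.
    repacked_members = unpack(repacked, tmp / "check")
    edited_config = (tmp / "check" / "model_config.yaml").read_text()
    print(sorted(repacked_members))
    print(edited_config)
```

Writing to a new file name rather than overwriting the original keeps the untouched archive available if the edited one turns out not to load.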

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename "NeMo Framework" to "NeMo Toolkit" across docs. Rewrite landing
page with three-tier capability overview (Models, Inference & Deployment,
Voice Agent) and HuggingFace model links. Flatten collections toctree
into a single index. Update intro: Python 3.12 / PyTorch 2.7+
prerequisites, add PyPI and source install instructions, remove broken
User Guide link.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace broken NVIDIA docs URLs: use cross-reference to local ASR
datasets doc in speechlm2, inline pip install command in g2p.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pzelasko pzelasko added the documentation label (Improvements or additions to documentation) and removed the TTS and Multi Modal labels Feb 5, 2026
@pzelasko pzelasko requested a review from nithinraok February 6, 2026 19:43
@pzelasko pzelasko marked this pull request as ready for review February 9, 2026 15:05
nithinraok
nithinraok previously approved these changes Feb 9, 2026
Member

@nithinraok nithinraok left a comment

LGTM!

Getting a few issues while compiling; please check with make -C docs clean html

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
@pzelasko
Collaborator Author

@nithinraok I fixed all errors and warnings in the docs build. This triggered some linter errors, which I also fixed.

Member

@nithinraok nithinraok left a comment

LGTM. Thanks Piotr!

@github-actions
Contributor

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

//cc @chtruong814 @ko3n1g @pablo-garay @thomasdhc

@pzelasko pzelasko merged commit bf365d0 into main Feb 11, 2026
258 of 270 checks passed
@pzelasko pzelasko deleted the stage0-docs-revision branch February 11, 2026 18:49
nemoramo pushed a commit to nemoramo/MoNeMo that referenced this pull request Feb 13, 2026
* Remove deprecated collection documentation (nlp, multimodal, vision, text_processing)

Remove all doc references to collections that are no longer part of NeMo:
nlp, multimodal, vision, vlm, avlm, diffusion, llm, multimodal_autoregressive,
speechlm, and text_processing. Update landing page, collections index, tutorials,
conf.py bibtex entries, and cross-reference links. Delete the remaining
docs/source/multimodal/ directory. Clean up false_positives.json and
links_needing_review.json to remove entries for deleted doc pages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Deep cleanup of docs: remove Megatron/LLM content, rewrite for Speech AI focus

Remove obsolete docs (performance benchmarks, MoE, Megatron optimizations,
distributed checkpoints, Megatron-LM conversion). Rewrite landing page,
parallelisms, mixed precision, Quick Start, and best-practices pages to
reflect NeMo's Speech AI focus. Update collection references throughout
to include ASR, TTS, Audio, and SpeechLM2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Expand docs for mixed precision, parallelisms, and checkpoints

Mixed precision: document mixed vs true half precision modes, explain
HalfPrecisionForAudio plugin that preserves audio tensor precision.

Parallelisms: document DDP (all collections) and ModelParallelStrategy
(SpeechLM2) with FSDP2/TP/SP concepts, configuration examples, and
requirement for configure_model() implementation.

Checkpoints: document .nemo as tar archive (unpack/repack), .safetensors
for SpeechLM2 via HuggingFace Hub, and distributed checkpoints with
ModelParallelStrategy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Rebrand to NeMo Toolkit, rewrite landing page, update intro

Rename "NeMo Framework" to "NeMo Toolkit" across docs. Rewrite landing
page with three-tier capability overview (Models, Inference & Deployment,
Voice Agent) and HuggingFace model links. Flatten collections toctree
into a single index. Update intro: Python 3.12 / PyTorch 2.7+
prerequisites, add PyPI and source install instructions, remove broken
User Guide link.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix broken external links in speechlm2/datasets and tts/g2p docs

Replace broken NVIDIA docs URLs: use cross-reference to local ASR
datasets doc in speechlm2, inline pip install command in g2p.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix some warnings

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* All warnings and errors fixed

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix linter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Labels

ASR, audio, common, core (Changes to NeMo Core), documentation (Improvements or additions to documentation), Multi Modal, TTS

3 participants