forked from NVIDIA/NeMo
Commit
Use ModelOpt build_tensorrt_llm for building engines for qnemo checkpoints (NVIDIA#9452)

* Enable specifying alpha for SQ
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Enable specifying use_custom_all_reduce for export
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Use native TRT-LLM param names in export (partial)
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Detect TRT-LLM checkpoint programmatically
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Pass use_custom_all_reduce in test_nemo_export.py
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Parameter parsing bugfix
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Revert "Parameter parsing bugfix"
  This reverts commit b0a4dd3.
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Revert "Enable specifying use_custom_all_reduce for export"
  This reverts commit 9e419e3.
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Revert "Pass use_custom_all_reduce in test_nemo_export.py"
  This reverts commit be70812.
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Rename checkpoint detection function
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Use ModelOpt build_tensorrt_llm utility for qnemo for performance alignment
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Import fix
  Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
* Apply isort and black reformatting
  Signed-off-by: janekl <janekl@users.noreply.github.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: janekl <janekl@users.noreply.github.com>
Co-authored-by: janekl <janekl@users.noreply.github.com>
Showing 3 changed files with 76 additions and 47 deletions.
@@ -0,0 +1,18 @@
import os
from pathlib import Path

from nemo.export.tarutils import TarPath

CONFIG_NAME = "config.json"
WEIGHTS_NAME = "rank{}.safetensors"


def is_qnemo_checkpoint(path: str) -> bool:
    """Detect if a given path is a TensorRT-LLM a.k.a. "qnemo" checkpoint based on config & tensor data presence."""
    if os.path.isdir(path):
        path = Path(path)
    else:
        path = TarPath(path)
    config_path = path / CONFIG_NAME
    tensor_path = path / WEIGHTS_NAME.format(0)
    return config_path.exists() and tensor_path.exists()
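
To illustrate the detection rule in the new file, here is a minimal, standard-library-only sketch of the directory branch. It is not part of the commit: `looks_like_qnemo_dir` and `demo` are hypothetical names, and the `TarPath` branch is left out because it depends on NeMo being installed. The sketch builds a temporary directory step by step and shows that the check passes only once both `config.json` and `rank0.safetensors` are present.

```python
import tempfile
from pathlib import Path

# Same file names as in the diff above.
CONFIG_NAME = "config.json"
WEIGHTS_NAME = "rank{}.safetensors"


def looks_like_qnemo_dir(path: str) -> bool:
    """Hypothetical directory-only variant of the check (no tar archive support)."""
    root = Path(path)
    config_path = root / CONFIG_NAME
    tensor_path = root / WEIGHTS_NAME.format(0)  # only rank 0 is inspected
    return config_path.exists() and tensor_path.exists()


def demo() -> tuple:
    """Return the check result for an empty, config-only, and complete directory."""
    with tempfile.TemporaryDirectory() as tmp:
        empty = looks_like_qnemo_dir(tmp)
        (Path(tmp) / CONFIG_NAME).write_text("{}")
        config_only = looks_like_qnemo_dir(tmp)
        (Path(tmp) / WEIGHTS_NAME.format(0)).write_bytes(b"")
        complete = looks_like_qnemo_dir(tmp)
    return empty, config_only, complete


if __name__ == "__main__":
    print(demo())
```

Note that the check only tests for file presence and only for rank 0, so it does not validate the config contents or confirm that weight files for other ranks exist.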