2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ __pycache__
*.so
*.pyd
docs/_build
docs/sphinx/*/generated
docs/sphinx/**/generated
docs/sphinx/generated
dist
build
Expand Down
4 changes: 4 additions & 0 deletions .markdownlint.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# SPDX-License-Identifier: Apache-2.0

MD013:
line_length: 92
code_block_line_length: 88
Expand Down
4 changes: 3 additions & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
# Copyright (c) 2024, NVIDIA CORPORATION.
# Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# SPDX-License-Identifier: Apache-2.0

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
Expand Down
69 changes: 68 additions & 1 deletion .talismanrc
Original file line number Diff line number Diff line change
Expand Up @@ -18,5 +18,72 @@ fileignoreconfig:
checksum: 01022d56aafb7c98d5af05a3e9e87ce4d267781def6f1844470fd4cd59d6b26b
- filename: nvmath/device/random.py
checksum: c534d9a475521cfcbfa6b048904f8495ff70e2a9ccdf3f2710e050cf75fafa35

- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-sysctk12-Pipfile.lock
checksum: ac3e74b0d9d8e36c9400aaccda328a23eae6abb09b39813d3767fcca4f7314c9
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-sysctk11-Pipfile.lock
checksum: 7d7fe899d77a9b3cddd67b7ad6cedd3b0fd508e403dc750ad8d4b186f3e0e470
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu12-cpu-Pipfile.lock
checksum: dc0f70918f75d9a336d748eade983ff4b46ad03149565e0740efe8a4aadfdc10
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu12-dx-Pipfile.lock
checksum: 3bc632416be184605b6dcb3c1ec28af1e26e68df6f3232ad40a02a4091153a0a
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cpu-Pipfile.lock
checksum: f11933df76dcc98ae3b25e5f356cc9060afdd9a838cf705a4ceca6bdd7161d01
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-sysctk12-dx-Pipfile.lock
checksum: 357c92fecf447f5640da599d25def0ac04490b03adc859798b2eba937a546f09
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-build-Pipfile.lock
checksum: 224516e0451196831d93512a6ff9e26d8dee14b83064270ce6cbe9567dcb5753
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cu12-Pipfile.lock
checksum: 61d43b09f08b7f6e965194ade06c41896827cd5868c3d64e95b4524e3a1d98b9
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cpu-Pipfile.lock
checksum: 73cfd9f66cfa1c7252cc16904b90d79faf7e06cd27b0fe653fd290e1c20a819c
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cu12-cpu-Pipfile.lock
checksum: 88c76a2fd790a1f20f0bd68fd47c30cd99df07603ff8e82353855e18ffc16e75
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cu12-dx-Pipfile.lock
checksum: 8a0d52aff956ac7f241e09b79af3ca5bddcf0aa040476a861f7a499a5681c410
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu11-Pipfile.lock
checksum: 7333ea9adfb7d931a5a4f4056e26deab90ded9caa054ff7baadc03722577b2d6
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-build-Pipfile.lock
checksum: 7d9895b83fe7051b9a0fb146a8f47cb25f87a8deccf4862400941f7c61196ed4
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cu12-dx-torch-Pipfile.lock
checksum: 1e3fce2ab1065d2feef0e714a06dfe172033c10cade32de030749ee466b45423
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-cu11-Pipfile.lock
checksum: 555f9e4ba8b76f3ea912e1df46fee0ed5a0a809b01cceb9e79abe5138ad6e1c0
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-docstrings-Pipfile.lock
checksum: 2dc4d248779a72d1aad7b7119f0eae9cf5430f58b60cd84a1eb5b28dc6602bd9
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cpu-Pipfile.lock
checksum: d5a24df09c39349a868443a7da14840eb0d503132402cc11d916c0928eb41286
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu11-torch-Pipfile.lock
checksum: 14172b0e1f856fbaeb3075613bbdc8d08fed7db84c8017e914656d0f41d1c2bf
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-sysctk12-dx-Pipfile.lock
checksum: f4ee267a48ed091c1daf92ab57a6ee65a62182e531ac3f6d351ff379a94e03ef
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu12-dx-torch-Pipfile.lock
checksum: fe0b0af438bc668bb1ecfeee126cb5d5e1aaa3ab4997bdb88191c65e8e70fcb2
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu12-cpu-Pipfile.lock
checksum: 9a26295b25c524d38ee32dc15225a60a1f78a42c0fbe831881603c4528c3b79d
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-sysctk11-Pipfile.lock
checksum: 15710d3872d0ff5ad83750e0c15f5d05128a9b2e131f3e91a645f78e6651f450
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-sysctk11-Pipfile.lock
checksum: 27b5fed2e2beac1530ca3c7a2e14ce058ba0fbbc58585775c12c9f1f7b479dbc
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu12-Pipfile.lock
checksum: eeeb2c572ffe4cf2f5c84f9e3ff6a66f6dfbe54ce0523fd13a309fb8151e061e
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-sysctk12-Pipfile.lock
checksum: a341c44c5bcfabbf8faab106dfe4d331eef55e9b4895b34d6d26b4ea975579b2
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-sysctk12-dx-Pipfile.lock
checksum: e0fb537be5bde7c0a52c551198fe21085ffac9bed1b30700afd5e05659506fb3
- filename: .ci/pipenv/manylinux_2_28_x86_64-py310-test-cu12-Pipfile.lock
checksum: a42cbc8a03f5a82494b44226903cee90dfc4208320385b6eccfbc72cc8e508f7
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu11-torch-Pipfile.lock
checksum: 64ca8c7987f08a2de952a99833010f14a7180948ba23b46c17fb04c35cfc9ba1
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu11-Pipfile.lock
checksum: d88bb28ca54a8f79d3ce4ae0575658de62c4fbd49048f8844a740372b2483e82
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-build-Pipfile.lock
checksum: 68a465110297077c4e07616dc055b73a5e37ad412d9c5ca4bb5800024d3fa273
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu12-dx-Pipfile.lock
checksum: 669ce9cf12b07e98b1955008ea90fbeb333a087269ed05b5c5319f4ea9c5988b
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-test-cu12-dx-torch-Pipfile.lock
checksum: b89939ae0ac554d89f5a333f37da8d9e7fb37c0ad832fb3e6f9b73b00b21bf99
- filename: .ci/pipenv/manylinux_2_28_x86_64-py312-test-sysctk12-Pipfile.lock
checksum: 62280af6c8aa520138f636a09d0330f3cda9efa2ac4e9337d7e995c93bd10c06
- filename: .ci/pipenv/manylinux_2_28_x86_64-py311-docs-Pipfile.lock
checksum: 50417e87baee9d7aa17765525d21c48ea99f9d3c0b2a2b25d401102b5c5bb32a
checksum: acfaab5ffb3098a96645323d7879d8d1df69a549b40a09a590bf8fe1315dc839
5 changes: 5 additions & 0 deletions MANIFEST.in
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
graft nvmath
global-include *.pyd
global-include *.pyi
global-exclude *.cpp
global-exclude *.pyx
2 changes: 1 addition & 1 deletion builder/__init__.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
# Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# SPDX-License-Identifier: Apache-2.0
4 changes: 1 addition & 3 deletions builder/pep517.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
# Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# SPDX-License-Identifier: Apache-2.0

Expand All @@ -15,8 +15,6 @@

from setuptools import build_meta as _build_meta

import utils # this is builder.utils (the build system has sys.path set up)


prepare_metadata_for_build_wheel = _build_meta.prepare_metadata_for_build_wheel
build_wheel = _build_meta.build_wheel
Expand Down
2 changes: 1 addition & 1 deletion builder/utils.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
# Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. ALL RIGHTS RESERVED.
#
# SPDX-License-Identifier: Apache-2.0

Expand Down
2 changes: 1 addition & 1 deletion docs/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
SHELL=/bin/bash

# You can set these variables from the command line or environment
SPHINX_NVMATH_PYTHON_VER ?= $(shell [[ $$(< ../nvmath/_version.py) =~ __version__[^0-9.]*([0-9.]*) ]] && echo $${BASH_REMATCH[1]})
SPHINX_NVMATH_PYTHON_VER ?= $(shell [[ $$(< ../pyproject.toml) =~ [^a-zA-Z_]version\ =\ [^0-9.]*([0-9.]*) ]] && echo $${BASH_REMATCH[1]})
SPHINXOPTS ?= -W
SPHINXBUILD ?= sphinx-build
SOURCEDIR = sphinx
Expand Down
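Review note: the new `SPHINX_NVMATH_PYTHON_VER` expression scrapes the version out of `pyproject.toml` with a Bash regex. The sketch below reproduces the same pattern in Python so its behavior is easier to check. The sample TOML text and the `extract_version` helper are hypothetical, for illustration only; they are not part of this PR.

```python
import re

# Hypothetical pyproject.toml fragment, for illustration only.
SAMPLE = '''[project]
name = "example-package"
version = "0.3.0"
requires-python = ">=3.10"
'''

# Python rendering of the Bash pattern used in docs/Makefile.
# The leading [^a-zA-Z_] stops keys such as "target_version" from matching.
VERSION_RE = re.compile(r'[^a-zA-Z_]version = [^0-9.]*([0-9.]*)')


def extract_version(text):
    """Return the first dotted version number that follows 'version = ', or None."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


print(extract_version(SAMPLE))
```

Like the Bash `=~` operator, `re.search` is unanchored, so the first `version = ` key in the file wins regardless of which table it belongs to.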
4 changes: 4 additions & 0 deletions docs/sphinx/_static/switcher.json
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,10 @@
"version": "latest",
"url": "https://docs.nvidia.com/cuda/nvmath-python/latest"
},
{
"version": "0.3.0",
"url": "https://docs.nvidia.com/cuda/nvmath-python/0.3.0"
},
{
"version": "0.2.1",
"url": "https://docs.nvidia.com/cuda/nvmath-python/0.2.1"
Expand Down
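Review note: each release adds one entry to `switcher.json` directly after `latest`. A minimal sketch of that update, assuming the URL layout shown in the hunk above; the `add_version` helper is hypothetical and not part of this PR.

```python
import json

# URL prefix taken from the existing switcher.json entries.
BASE_URL = "https://docs.nvidia.com/cuda/nvmath-python"


def add_version(entries, version):
    """Return a new switcher list with `version` placed right after "latest"."""
    if any(entry["version"] == version for entry in entries):
        return list(entries)  # already present, nothing to add
    new_entry = {"version": version, "url": f"{BASE_URL}/{version}"}
    head = [entry for entry in entries if entry["version"] == "latest"]
    tail = [entry for entry in entries if entry["version"] != "latest"]
    return head + [new_entry] + tail


entries = [
    {"version": "latest", "url": f"{BASE_URL}/latest"},
    {"version": "0.2.1", "url": f"{BASE_URL}/0.2.1"},
]
updated = add_version(entries, "0.3.0")
print(json.dumps(updated, indent=2))
```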
4 changes: 2 additions & 2 deletions docs/sphinx/bindings/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -157,13 +157,13 @@ require a sequence or a nested sequence, the following operations are equivalent
my_func(..., buf, ...) # the underlying data type is determined by the C API

which is particularly useful when users need to pass multiple sequences or nested sequences
to C (ex: :func:`nvmath.bindings.cufft.plan_many`).
to C (for example, :func:`nvmath.bindings.cufft.plan_many`).

.. note::

Some functions require their arguments to be in the device memory. You need to pass
device memory (for example, :class:`cupy.ndarray`) to such arguments. nvmath-python
does not validate the memory pointers passed and does not implicitly transfer the data.
neither validates the memory pointers nor implicitly transfers the data.
Passing host memory where device memory is expected (and vice versa) results in
undefined behavior.

Expand Down
55 changes: 48 additions & 7 deletions docs/sphinx/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,12 @@
import os
import re
import sys
import tomllib
import tempfile
import json

sys.path.insert(0, os.path.abspath("."))
import pkg_resources
import warnings
import json

from sphinx.writers.html import HTMLTranslator
from docutils.transforms import Transform
Expand Down Expand Up @@ -56,10 +57,8 @@
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
with open("../../nvmath/_version.py") as f:
exec(f.read())
nvmath_py_ver = __version__ # noqa: F821
del __version__ # noqa: F821
with open("../../pyproject.toml", "rb") as f:
nvmath_py_ver = tomllib.load(f)["project"]["version"]

# The short X.Y version.
version = nvmath_py_ver
Expand Down Expand Up @@ -90,6 +89,8 @@
#'sphinxcontrib.autoprogram',
"sphinxcontrib.programoutput",
"sphinx_favicon",
"nbsphinx",
"nbsphinx_link",
]

imgmath_latex_preamble = r"\usepackage{braket}"
Expand All @@ -101,6 +102,9 @@
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# Silence a warning about an unpicklable value
nbsphinx_custom_formats = {}

# The suffix(es) of source filenames.
# You can specify multiple suffixes as a list of strings:
#
Expand Down Expand Up @@ -181,7 +185,7 @@ def autodoc_process_docstring(app, what, name, obj, options, lines):
struct = snake_to_camel([mod] + struct.split("_")[:-1])
line = f"NumPy dtype object that represents the `{struct}` struct.\n"
else:
# handle dtype in high-level pythonic APIs
# handle dtype in high-level Pythonic APIs
struct = " ".join(struct.split("_")[:-1])
line = f"NumPy dtype object that encapsulates the {struct} in {mod}.\n"
lines.clear()
Expand Down Expand Up @@ -250,9 +254,42 @@ def default_departure(self, node):
default_priority = 800


class NotebookHandler:
    def __init__(self):
        # TemporaryDirectory removes the whole directory tree on cleanup;
        # os.unlink() cannot remove a directory.
        self._tmpdir = tempfile.TemporaryDirectory()
        self.tmpdir = self._tmpdir.name

    def __del__(self):
        self._tmpdir.cleanup()

    def remove_notebook_copyright(self, app, docname, content):
        if os.path.exists(os.path.join("sphinx", docname + ".nblink")):
            link = json.loads(content[0])
            notebook_path = os.path.join("sphinx", os.path.dirname(docname), link["path"])

            with open(notebook_path) as original_notebook_file:
                notebook_content = json.load(original_notebook_file)
            copyright_regex = (
                r"\s*Copyright \(c\) [0-9-]+, NVIDIA CORPORATION & AFFILIATES\s*SPDX-License-Identifier: BSD-3-Clause\s*"
            )
            if re.match(copyright_regex, "".join(notebook_content["cells"][0]["source"])):
                # Remove first cell if it's a copyright notice
                notebook_content["cells"] = notebook_content["cells"][1:]

            new_notebook_path = os.path.join(self.tmpdir, docname.replace(".nblink", ".ipynb").replace("/", "__"))
            with open(new_notebook_path, "w") as new_notebook_file:
                json.dump(notebook_content, new_notebook_file)

            link["path"] = os.path.relpath(new_notebook_path, os.path.join("sphinx", os.path.dirname(docname)))
            content[0] = json.dumps(link)


notebook_handler = NotebookHandler()
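
Review note: the core of `remove_notebook_copyright` is dropping the first cell when it matches the copyright pattern. The sketch below isolates that step on an in-memory notebook dictionary so it can be exercised without Sphinx or the file system. The `strip_copyright_cell` helper and the sample notebook are hypothetical; only the regular expression is taken from the hunk above.

```python
import re

# Same pattern as in conf.py, split across two literals for line length.
COPYRIGHT_RE = re.compile(
    r"\s*Copyright \(c\) [0-9-]+, NVIDIA CORPORATION & AFFILIATES\s*"
    r"SPDX-License-Identifier: BSD-3-Clause\s*"
)


def strip_copyright_cell(notebook):
    """Return a copy of the notebook dict without a leading copyright cell."""
    cells = notebook.get("cells", [])
    # Guard against empty notebooks before looking at the first cell.
    if cells and COPYRIGHT_RE.match("".join(cells[0]["source"])):
        return {**notebook, "cells": cells[1:]}
    return notebook


# Hypothetical two-cell notebook, for illustration only.
sample = {
    "cells": [
        {
            "cell_type": "markdown",
            "source": [
                "Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES\n",
                "\n",
                "SPDX-License-Identifier: BSD-3-Clause",
            ],
        },
        {"cell_type": "code", "source": ["import nvmath\n"]},
    ],
    "nbformat": 4,
}

stripped = strip_copyright_cell(sample)
print(len(stripped["cells"]))
```

Unlike the handler in `conf.py`, this sketch checks for an empty `cells` list first, which avoids an `IndexError` on a notebook with no cells.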


def setup(app):
app.add_css_file("nvmath_override.css")
app.connect("autodoc-process-docstring", autodoc_process_docstring)
app.connect("source-read", lambda *args, **kwargs: notebook_handler.remove_notebook_copyright(*args, **kwargs))
app.set_translator("html", DotBreakHtmlTranslator)
app.add_autodocumenter(PatchedEnumDocumenter, override=True)
app.add_post_transform(UnqualifiedTitlesTransform)
Expand Down Expand Up @@ -283,6 +320,10 @@ def setup(app):
# sweetspot value determined by trial & error to suppress all warnings
autosectionlabel_maxdepth = 2

show_warning_types = True
suppress_warnings = [
"config.cache", # nbsphinx_link makes nbsphinx_custom_formats unpicklable
]

doctest_global_setup = """
import numpy as np
Expand Down
2 changes: 1 addition & 1 deletion docs/sphinx/device-apis/cufft.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ Overview
========

These APIs offer integration with the NVIDIA cuFFTDx library.
Detailed documentation of cuBLASDx can be found in the
Detailed documentation of cuFFTDx can be found in the
`cuFFTDx documentation <https://docs.nvidia.com/cuda/cufftdx/1.2.0>`_.

.. note::
Expand Down
1 change: 1 addition & 0 deletions docs/sphinx/device-apis/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ Detailed documentation for these libraries can be found at `cuFFTDx
<https://docs.nvidia.com/cuda/cufftdx/1.2.0>`_, `cuBLASDx
<https://docs.nvidia.com/cuda/cublasdx/0.1.1>`_, and `cuRAND device APIs
<https://docs.nvidia.com/cuda/curand/group__DEVICE.html#group__DEVICE>`_ respectively.
Device APIs can only be called from CUDA device or kernel code, and execute on the GPU.

Users may take advantage of the device module via the two approaches below:

Expand Down
File renamed without changes.
20 changes: 20 additions & 0 deletions docs/sphinx/host-apis/index.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
*********
Host APIs
*********

The following modules of nvmath-python offer integration with NVIDIA's
high-performance computing libraries through host APIs for cuBLAS and cuFFT.
Host APIs are called from host code but can execute in any supported execution
space (CPU or GPU).

========
Contents
========

.. toctree::
:caption: API Reference
:maxdepth: 2

Linear Algebra <linalg/index.rst>
Fast Fourier Transform <fft/index.rst>
Host API Utilities <utils.rst>
Original file line number Diff line number Diff line change
Expand Up @@ -43,5 +43,27 @@ Specialized Linear Algebra APIs (:mod:`nvmath.linalg.advanced`)

:template: dataclass.rst

MatmulEpilogPreferences
MatmulOptions
MatmulPlanPreferences
MatmulQuantizationScales

Helpers
^^^^^^^

The Specialized Linear Algebra helpers module :mod:`nvmath.linalg.advanced.helpers`
provides helper functions to facilitate working with some of the complex features of
the :mod:`nvmath.linalg.advanced` module.

Matmul helpers (:mod:`nvmath.linalg.advanced.helpers.matmul`)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

.. module:: nvmath.linalg.advanced.helpers.matmul

.. autosummary::
:toctree: generated/

create_mxfp8_scale
invert_mxfp8_scale
apply_mxfp8_scale
get_mxfp8_scale_offset
23 changes: 16 additions & 7 deletions docs/sphinx/host-utils.rst → docs/sphinx/host-apis/utils.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,22 +2,31 @@
Host API Utilities
**********************

.. _host-api-util-overview:

Overview
========

nvmath-python provides host-side APIs for managing device-side memory.

.. _host-api-util-reference:

API Reference
=============

.. module:: nvmath

Memory utilities
----------------

nvmath-python provides host-side APIs for managing device-side memory.

.. autosummary::
:toctree: generated/

BaseCUDAMemoryManager
MemoryPointer

Data types
----------

nvmath-python provides the following data types.


.. autosummary::
:toctree: generated/

CudaDataType