build(ascend): support custom kernel builds #602

Merged
voltjia merged 2 commits into InfiniTensor:master from zhangyue207:build/ascend-custom-kernels on Jul 2, 2026

Conversation

@zhangyue207 commented May 12, 2026

Collaborator

Summary

  • Replace the old BUILD_CUSTOM_KERNEL path with BUILD_ASCEND_CUSTOM and drive custom AscendC kernels through the standalone src/native/ascend/custom/build.sh sub-build.
  • Import the produced libno_workspace_kernel.a and link it into generated Python bindings with --whole-archive.
  • Add shared SOC_VERSION detection in src/native/ascend/custom/cmake/detect_soc.cmake.
  • Route the custom sub-build through a non-hidden source symlink and pass MAIN_SRC_DIR explicitly so CANN can find host objects even when the repo is checked out under .worktrees.
  • Keep the branch rebased onto current upstream master (d67759e).

Motivation

Custom AscendC kernels are needed by upcoming Ascend operator implementations. Building them through a standalone sub-build avoids the CANN extract_host_stub.py path handling issue seen in scikit-build-core temporary builds while keeping the artifacts under build/build_ascend_custom/.

N/A. No issue is linked.

Type of Change

  • feat - new feature / new operator / new platform
  • fix - bug fix
  • perf - performance improvement (no behavioral change)
  • refactor - code restructuring without behavior change
  • test - adding or fixing tests only
  • docs - documentation only
  • build / ci - build system or CI configuration
  • chore - tooling, formatting, or other non-code changes
  • Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

  • CPU (WITH_CPU)
  • NVIDIA (WITH_NVIDIA)
  • Iluvatar (WITH_ILUVATAR)
  • MetaX (WITH_METAX)
  • Cambricon (WITH_CAMBRICON)
  • Moore (WITH_MOORE)
  • Ascend (WITH_ASCEND)
  • PyTorch C++ bindings (WITH_TORCH)
  • Build system / CMake / CI
  • Python bindings / user-facing API

Test Results on Supported Platforms

| Platform | Built | pytest Result | Notes / Hardware |
|----------|-------|---------------|------------------|
| NVIDIA | Yes | Passed | GitHub CI and CI v2 shadow passed. |
| Iluvatar | Yes | Passed | GitHub CI passed after rerunning a runner-device-detection failure; CI v2 shadow passed. |
| MetaX | Yes | Passed | GitHub CI and CI v2 shadow passed. |
| Cambricon | Yes | Passed | GitHub CI and CI v2 shadow passed. |
| Moore | Yes | Passed | GitHub CI and CI v2 shadow passed. |
| Ascend | Yes | Passed | GitHub CI and CI v2 shadow passed on head 91f73a0. |

Local checks
git diff --check upstream/master...HEAD
bash -n src/native/ascend/custom/build.sh

Both commands passed locally after rebasing onto upstream/master d67759e.

GitHub CI re-run
Head SHA: 91f73a05533a15651b46fb300d2260e680217778
Base SHA: d67759e0c59de09087a4e8774c8a5f4db7bd5134
Triggered at: 2026-06-29 07:26 UTC

Passed checks:
- ruff
- clang-format
- ci / Generate matrix from config
- ci-v2-shadow / Generate CI v2 shadow matrix
- ci / unit / nvidia
- ci-v2-shadow / ci-v2-shadow / nvidia
- ci / unit / iluvatar
- ci-v2-shadow / ci-v2-shadow / iluvatar
- ci / unit / metax
- ci-v2-shadow / ci-v2-shadow / metax
- ci / unit / moore
- ci-v2-shadow / ci-v2-shadow / moore
- ci / unit / cambricon
- ci-v2-shadow / ci-v2-shadow / cambricon
- ci / unit / ascend
- ci-v2-shadow / ci-v2-shadow / ascend

Benchmark / Performance Impact

N/A. This PR changes build plumbing only.

Notes for Reviewers

  • API alignment note: no production src/base/ operator API is changed in this PR.
  • The source symlink is only for the custom sub-build. It avoids a CANN helper script limitation where recursive Python glob skips host object paths containing hidden directory components.
  • The custom kernel archive is linked only for Ascend Python bindings when BUILD_ASCEND_CUSTOM is enabled.
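
The hidden-directory limitation mentioned above can be reproduced with Python's standard `glob` module. The sketch below is not the CANN helper script. It only illustrates the general behavior under assumed conditions: the directory layout, the file name `kernel_host.o`, the link name `custom_src`, and the function `find_host_objects` are all hypothetical. By default, a recursive `**` pattern does not descend into directories whose names start with a dot, so an object file below `.worktrees` is missed, while the same tree reached through a non-hidden symlink is found.

```python
import glob
import os
import tempfile


def find_host_objects(root):
    """Recursively collect `*.o` files under `root` with a `**` glob pattern."""
    pattern = os.path.join(root, "**", "*.o")
    return sorted(glob.glob(pattern, recursive=True))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: the checkout sits below a hidden `.worktrees` directory.
        repo = os.path.join(tmp, "project", ".worktrees", "feature", "custom")
        obj_dir = os.path.join(repo, "build", "host")
        os.makedirs(obj_dir)
        open(os.path.join(obj_dir, "kernel_host.o"), "w").close()

        # Searching from above the hidden component: `**` skips `.worktrees`.
        from_top = find_host_objects(os.path.join(tmp, "project"))

        # A non-hidden symlink exposes the same tree without any dot component.
        link = os.path.join(tmp, "custom_src")
        os.symlink(repo, link)
        via_link = find_host_objects(link)

        # Return paths relative to the link so the result is stable across runs.
        return from_top, [os.path.relpath(p, link) for p in via_link]


if __name__ == "__main__":
    missed, found = demo()
    print("searched above .worktrees:", missed)
    print("searched via symlink:", found)
```

This matches the reasoning in the note: routing the sub-build through a path with no hidden components sidesteps the skipped directories without modifying the helper script.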

Checklist

Title, Branch, and Commits

  • PR title follows Conventional Commits.
  • Branch name follows <type>/xxx-yyyy-zzzz.
  • Each commit message follows Conventional Commits.
  • Small PR is a single squashable commit.
  • No stray merge commits from master.
  • No fixup! / squash! / wip commits remain.

Scope and Design

  • Changes are minimal: only custom AscendC build plumbing and SOC detection are included.
  • No dead code, commented-out blocks, debug prints, or ownerless TODO entries.
  • No unrelated formatting churn.
  • Public API changes are intentional. No production src/base/ public API is changed.

General Code Hygiene

  • The code is self-explanatory; comments were added only where the reason is non-obvious.
  • Every modified or added file ends with a single trailing newline.
  • No trailing whitespace, tab/space mixing, or stray BOMs.
  • Identifiers in comments and error messages are wrapped in backticks.
  • All comments and error messages are in English.
  • Comments and error messages are complete sentences unless the language/framework convention says otherwise.

C++ Specific

N/A. No C++ source files are changed.

Python Specific

N/A. No Python source files are changed.

Testing

  • GitHub CI re-run passed on all configured platforms.
  • Build-system behavior is covered by the Ascend CI build path.
  • Platforms are validated by GitHub CI and CI v2 shadow.

Build, CI, and Tooling

  • git diff --check passes.
  • bash -n src/native/ascend/custom/build.sh passes.
  • No new backend/device auto-detection changes are required.
  • The existing CUDA-like mutual-exclusion check is not changed.
  • No new runtime dependency was added.

Documentation

  • Inline build comments document the custom sub-build path and SOC_VERSION detection.

Security and Safety

  • No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
  • No third-party code was added.
  • No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were introduced.

@zhangyue207 force-pushed the build/ascend-custom-kernels branch 2 times, most recently from 8aabd09 to 251faac, on May 25, 2026 01:42
@zhangyue207 force-pushed the build/ascend-custom-kernels branch 5 times, most recently from 9ac252e to 4df58e5, on June 2, 2026 02:40
@zhangyue207 force-pushed the build/ascend-custom-kernels branch from 4df58e5 to 91f73a0 on June 29, 2026 07:26
@zhangyue207 requested review from Ziminli and voltjia on June 30, 2026 06:06
@zhangyue207 marked this pull request as ready for review on June 30, 2026 06:09
@zhangyue207 requested a review from a team on June 30, 2026 06:09
Comment thread src/CMakeLists.txt
@voltjia merged commit 815c5a0 into InfiniTensor:master on Jul 2, 2026
16 checks passed

3 participants