Add: hardware docs and CANN query tools#883
Code Review
This pull request reorganizes and expands the documentation regarding the Ascend NPU architecture (covering both a2a3 and a5 generations), cache coherency, and project layout. It also introduces a standalone host-side device-info CLI tool (query) under tools/cann-examples/query to query device counts, SoC names, core counts, and HBM memory info using CANN ACL APIs, integrating its build and execution into the CI workflow. The review feedback highlights two important improvements: ensuring that aclrtResetDevice is reliably called in cmd_mem even if aclrtGetMemInfo fails to prevent thread-state pollution, and wrapping ASCEND_HOME_PATH references in double quotes within the CMake configuration to safely handle paths containing spaces.
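The first review point (reset the device even when the memory query fails) is a scope-exit cleanup pattern. The sketch below shows only the control flow, in Python for brevity. `set_device`, `get_mem_info`, `reset_device`, and `bound_device` are hypothetical stand-ins for the ACL calls named in the review, not real bindings, and the actual fix in the C++ tool may be shaped differently.

```python
import sys
from contextlib import contextmanager

calls = []  # records the order of stand-in runtime calls so the flow can be inspected


def set_device(dev_id):
    # Hypothetical stand-in for aclrtSetDevice.
    calls.append(("set", dev_id))


def reset_device(dev_id):
    # Hypothetical stand-in for aclrtResetDevice.
    calls.append(("reset", dev_id))


def get_mem_info(dev_id, fail=False):
    # Hypothetical stand-in for aclrtGetMemInfo; `fail` simulates an error return.
    if fail:
        raise RuntimeError("memory query failed")
    return {"free": 1, "total": 2}


@contextmanager
def bound_device(dev_id):
    """Bind a device for the duration of the block and always release it."""
    set_device(dev_id)
    try:
        yield dev_id
    finally:
        # Runs on success and on failure, so no device binding leaks.
        reset_device(dev_id)


def cmd_mem(dev_id, fail=False):
    """Return a process-style exit code: 0 on success, 1 on query failure."""
    try:
        with bound_device(dev_id):
            info = get_mem_info(dev_id, fail=fail)
            print(f"device {dev_id}: {info}")
    except RuntimeError as err:
        print(f"error: {err}", file=sys.stderr)
        return 1
    return 0
```

In C++ the same guarantee would typically come from a small RAII guard whose destructor performs the reset, so early returns on error paths cannot skip it.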
Bundles seven related threads that together answer "how do I reason
about Ascend hardware in this repo and not get burned by a wrong
--platform invocation."
1. New docs/hardware/ cross-chip tree:
- chip-architecture.md: Host CPU + DDR attached via PCIe (x86) or
UB / HCCS (Kunpeng), on-chip AICPU + AICore clusters, GM,
end-to-end task flow, off-chip vs on-chip cost model.
- SoC family <-> arch mapping table cites vllm-ascend FAQ hw-native-sys#21
(Atlas A2 = ascend910b1, Atlas A3 = ascend910_9391) plus the
toolchain.py dav-c220 / dav-c310 bridge as authoritative
sources.
- cache-coherency.md moved from src/a2a3/docs/ and generalized:
dcci + cache_invalidate_range live on both a2a3 and a5. Inbound
refs in src/a2a3/docs/platform.md and the AICPU L2 perf
collector comment updated.
2. Per-chip facts in src/{a2a3,a5}/docs/hardware.md:
- a2a3: a2 vs a3 packaging (single die vs dual die / 2 device IDs
sharing AICPU OS), per-die 24 AIC + 48 AIV, 64 GiB HBM,
UB 1.0 / HCCS on Kunpeng.
- a5: 2 dies as 1 device id, per-die layout, UB 2.0 on Kunpeng.
- Three views of "how many cores" section: spec view (delivered
to user code) vs HAL view vs CANN ini view, with the observed
discrepancy resolved by the device-side probe in thread 5.
a3 closure: cpu_id 0 = AICPU OS scheduler, cpu_id 1 = PG
fab-disabled, cpu_id 2..7 = 6 user-schedulable. The same pattern
on a5 is a calibrated inference pending its own probe run.
3. Rules reorg under .claude/rules/:
- architecture.md + ascend-device.md merged then split by
audience: ascend.md (HW + SW arch quick reference + AIC / AIV /
AICPU terminology) and project-layout.md (Python wheel split,
build system lookup, test layout).
- Inbound refs in docs/python-packaging.md and
review-pr/SKILL.md updated.
4. tools/cann-examples/query/ — host-side CLI:
- Subcommands: devices, device <id> (full per-device dump:
identification + cores + memory hierarchy with per-field
comments), mem <id>, version (compiler/version.info — toolkit
version, not aclrtGetVersion's runtime lib version).
- Compile-time link to ascendcl + runtime + ascend_hal +
drvdsmi_host (no dlopen). ASCEND_DRIVER_PATH defaults to the
sibling of ASCEND_HOME_PATH, override via cmake -D.
- Buffer sizes (UB / L1 / L0A/B/C) read from CANN platform_config
ini because the matching ACL device-attribute queries return 0
on CANN 9.0 / a3.
5. tools/cann-examples/aicpu-device-query/ — device-side HAL probe:
- halGetDeviceInfo has queries flagged "used in device" in the
header (notably AICPU + OS_SCHED, AICPU + PF_OCCUPY) that only
succeed when called from inside an AICPU OS process. This tool
uploads a small inner SO via the dispatcher bootstrap path
(rtAicpuKernelLaunchExWithArgs with libaicpu_extend_kernels.so
in Mode A, no sudo / no pre-deployment), runs the queries
device-side, and reads results back through GM.
- Closes the long-standing a3 question of whether the 8 -> 6
AICPU gap is OS-reservation or PG: device-side OS_SCHED = 0x1
proves cpu_id 0 is OS-owned (single bit), and the absence of
cpu_id 1 from every other CPU module's OCCUPY mask plus
not-in-vNPU-mode rules out virtualization remapping. The gap
is therefore 1 OS + 1 PG, not 2 OS.
- Tool README documents how to run it on a5 to close the
analogous question there.
6. .claude/skills/onboard-arch-precheck/ — wrong-arch gate:
- Refuses pytest / task-submit invocations with
--platform a2a3|a5 when the host's actual silicon is the other
family, before any device lock is acquired. CI is fine because
each onboard runner is labeled with its arch; local hardware
work bypasses that protection. Wrong-arch runs produce
507018 / 507899 cascades that LOOK LIKE genuine bugs and
routinely waste hours on phantom investigations.
- Detection reads the same source as the query tool: npu-smi for
Chip Name + NPU Name, then
$ASCEND_HOME_PATH/{arch}-linux/data/platform_config/<SoC>.ini
for Short_SoC_version, then maps to repo arch. No ACL init,
no device binding, ~600 ms cold and ~5 ms cached
(/tmp/onboard-arch-precheck.cache, 1 hr TTL). Sim variants
(a2a3sim, a5sim) pass through unconditionally.
- .claude/rules/task-submit-isolation.md links to the skill from
its pre-flight section and adds bypass-the-precheck to the
anti-patterns list.
7. CI integration in .github/workflows/ci.yml:
- ut-a2a3 and ut-a5 jobs build tools/cann-examples/query and run
`query version` (no device locked, no resource-spec conflict).
- Same jobs build tools/cann-examples/aicpu-device-query
(cross-compiled device SO + native host) as a link smoke test.
- docs/ci.md job table updated; tools/README.md updated.
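The wrong-arch gate in item 6 can be sketched as a small decision function with a TTL cache. This is a minimal illustration, not the skill's actual implementation. The JSON cache format, the function names, and the injected `detect_arch` callable (standing in for the npu-smi + platform_config lookup) are all assumptions made for the example.

```python
import json
import time

SIM_PLATFORMS = {"a2a3sim", "a5sim"}  # sim variants pass through unconditionally
CACHE_TTL_S = 3600                    # 1 hr TTL, as stated in the description


def load_cached_arch(cache_path, now=None):
    """Return the cached arch, or None if the cache is missing, unreadable, or stale."""
    now = time.time() if now is None else now
    try:
        with open(cache_path) as f:
            entry = json.load(f)
    except (OSError, ValueError):
        return None
    if not isinstance(entry, dict):
        return None
    if now - entry.get("ts", 0) > CACHE_TTL_S:
        return None
    return entry.get("arch")


def store_cached_arch(cache_path, arch, now=None):
    """Write the detected arch with a timestamp (assumed JSON layout)."""
    now = time.time() if now is None else now
    with open(cache_path, "w") as f:
        json.dump({"arch": arch, "ts": now}, f)


def precheck(requested, detect_arch, cache_path, now=None):
    """Return (ok, message) for a --platform value before any device lock is taken."""
    if requested in SIM_PLATFORMS:
        return True, "sim variant, no hardware check"
    arch = load_cached_arch(cache_path, now)
    if arch is None:
        arch = detect_arch()  # slow path: the real tool reads npu-smi and the SoC ini
        store_cached_arch(cache_path, arch, now)
    if arch != requested:
        return False, f"host silicon is {arch}, refusing --platform {requested}"
    return True, "arch matches"
```

Passing the detector in as a callable keeps the slow hardware probe out of the cached path and makes the gate testable on a machine with no NPU.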
Smallest possible end-to-end demonstration of the AICPU kernel launch
pipeline used by this repo's runtime: no scene-test plumbing, no
ringbuffer / tensormap, no ChipWorker fork. Strips PR
hw-native-sys#883's aicpu-device-query down to the bootstrap path
itself with a trivial inner kernel, so a reader who wants to add new
AICPU work can use it as a copy-paste template.
Pipeline (Method 2 / "Path A"; see
docs/aicpu-kernel-launch-mechanisms.md):
1. host  : rtAicpuKernelLaunchExWithArgs(KERNEL_TYPE_AICPU_KFC,
   libaicpu_extend_kernels.so) hands dispatcher SO + inner SO bytes
   to the AICPU OS process.
2. device: dispatcher (DynTileFwkBackendKernelServerInit) writes
   inner SO to /usr/lib64/aicpu_kernels/0/aicpu_kernels_device/
   simpler_inner_<fp>_<dev>.so.
3. host  : fingerprint inner SO by ELF Build-ID, emit a JSON
   descriptor pointing at the preinstall basename, register via
   rtsBinaryLoadFromFile.
4. host  : rtsFuncGetByName for init + run handles.
5. host  : rewrite DeviceArgs (offsets 96 / 104) to point at the
   HelloResult + input_token, rtsLaunchCpuKernel(run).
6. device: kernel logs via DlogRecord, calls halGetDeviceInfo,
   writes HelloResult{ magic, echoed_token, hal_rc, hal_value }.
7. host  : D2H + verify magic == 0xDEADBEEFC0FFEE01 + echoed token.
Files:
- docs/aicpu-kernel-launch-mechanisms.md: NEW canonical doc covering
  all three known methods of getting a custom AICPU SO onto the
  device (tar.gz pre-deployment, Path A dispatcher bootstrap, broken
  Path B KERNEL_TYPE_AICPU_CUSTOM). Records issue
  hw-native-sys#822's full failure forensics: cust-subprocess L1
  stale on AICore HBM writes, the four user-space workarounds that
  all fail (volatile / ldar / dc civac / dc ivac) with the
  architectural reason each fails, and the CANN-side fix options
  (A/B/C/D). Distills the PR hw-native-sys#537 debugging session so
  future readers don't re-derive any of it.
- tools/cann-examples/aicpu-kernel-launch/device/hello_aicpu.cpp:
  simpler_aicpu_init no-op + simpler_aicpu_run reads DeviceArgs,
  calls halGetDeviceInfo(AICPU, CORE_NUM), writes HelloResult to GM.
- tools/cann-examples/aicpu-kernel-launch/device/CMakeLists.txt:
  aarch64 cross, links ascend_hal, emits --build-id for fingerprint
  stability.
- tools/cann-examples/aicpu-kernel-launch/host/launch_hello.cpp:
  full Mode A bootstrap + JSON descriptor + binary load + launch +
  verify. Inlined ELF Build-ID reader so the example is standalone
  (no headers from src/).
- tools/cann-examples/aicpu-kernel-launch/host/CMakeLists.txt:
  links ascendcl + runtime.
- tools/cann-examples/aicpu-kernel-launch/README.md: pipeline
  diagram, I/O contract, build/run, scope+limits; references the
  mechanisms doc for the full comparison.
Cross-links updated:
- src/common/aicpu_dispatcher/README.md and
  tools/cann-examples/aicpu-device-query/README.md both now point at
  the new mechanisms doc, so anyone reading the dispatcher or the
  device-query tool can find the full Path A vs Path B vs tar.gz
  comparison.
Integration:
- ut-a2a3 + ut-a5 CI jobs build the device SO (cross-compile) + host
  launcher as a link smoke test, mirroring the aicpu-device-query
  precedent. No device locked, no resource-spec conflict.
- tools/README.md adds an "aicpu-kernel-launch" subsection.
- docs/ci.md job table updated.
Why bundle as one PR: the reference tool and the mechanisms doc exist
to answer the same question ("how do I launch a custom AICPU
kernel"). Splitting them would mean the doc points at a tool that
doesn't exist yet, or the tool points at a doc that doesn't exist
yet. Reviewable as one logical unit.
#923)
Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
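The host-side check in step 7 of the pipeline (copy the result back, then verify the magic and the echoed token) can be sketched as follows. The field names and the magic value come from the description above. The byte layout (four little-endian 64-bit fields, with `hal_rc` and `hal_value` signed) is an assumption for illustration only, and the real struct in hello_aicpu.cpp may use different widths or padding.

```python
import struct

HELLO_MAGIC = 0xDEADBEEFC0FFEE01  # value quoted in step 7 of the pipeline

# Assumed layout, NOT taken from the source:
# magic (u64), echoed_token (u64), hal_rc (i64), hal_value (i64), little-endian.
HELLO_RESULT = struct.Struct("<QQqq")


def verify_hello_result(raw, input_token):
    """Decode a HelloResult buffer copied back from the device and validate it."""
    if len(raw) < HELLO_RESULT.size:
        raise ValueError(f"short buffer: {len(raw)} < {HELLO_RESULT.size} bytes")
    magic, echoed_token, hal_rc, hal_value = HELLO_RESULT.unpack_from(raw)
    if magic != HELLO_MAGIC:
        # A wrong magic means the kernel never wrote the result (or wrote elsewhere).
        raise ValueError(f"bad magic {magic:#x}")
    if echoed_token != input_token:
        # A stale token means the buffer holds a result from an earlier launch.
        raise ValueError(f"token mismatch: got {echoed_token}, sent {input_token}")
    return {"hal_rc": hal_rc, "hal_value": hal_value}
```

Checking both fields separates "the kernel did not run" from "the kernel ran against old arguments", which is why the pipeline echoes a per-launch token rather than relying on the magic alone.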