Feat/issue 356 aicpu launch new interface#537
Feat/issue 356 aicpu launch new interface#537puddingfjz wants to merge 6 commits intohw-native-sys:mainfrom
Conversation
Implements Issue hw-native-sys#356: Migrate from legacy rtAicpuKernelLaunchExWithArgs to the new rtsLaunchCpuKernel / rtsBinaryLoadFromFile / rtsFuncGetByName interface available in CANN 7.0+. Changes: - Add AicpuLoader abstraction class supporting both legacy and new interfaces - New interface: Use rtsBinaryLoadFromFile with JSON descriptor (pypto approach) - JSON contains only filename (libaicpu_kernel.so), runtime finds via library path - No temporary .so file creation needed - cpuKernelMode=0 (JSON only mode) - Legacy interface: Unchanged behavior when BUILD_WITH_NEW_CANN=OFF - BUILD_WITH_NEW_CANN compile flag controls which interface is used The new interface provides forward compatibility and aligns with pypto's approach. The legacy interface remains as fallback for older CANN versions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align a5 implementation with a2a3 by using rtsBinaryLoadFromFile with cpuKernelMode=0 instead of rtsBinaryLoadFromData. This matches the pypto approach and ensures consistency across platforms. Changes: - Use rtsBinaryLoadFromFile + JSON descriptor (cpuKernelMode=0) - Generate temporary JSON file with filename-only .so reference - Add json_file_path_ member for cleanup Fixes hw-native-sys#356 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces an AicpuLoader abstraction to support both legacy and new CANN 7.0+ interfaces for launching AICPU kernels across the a2a3 and a5 platforms. The implementation includes build system updates, runtime JSON descriptor generation, and integration into the DeviceRunner. Feedback focuses on improving build portability by avoiding hardcoded architecture paths and enhancing the robustness of manual JSON construction. Additionally, the removal of a default parameter in the a2a3 platform's header is identified as a breaking change that violates cross-platform consistency. Suggestions were also made to reduce coupling in the kernel name mapping.
|
|
||
| target_link_directories(host_runtime | ||
| PRIVATE | ||
| ${ASCEND_HOME_PATH}/aarch64-linux/lib64 |
There was a problem hiding this comment.
Hardcoding the architecture-specific path aarch64-linux/lib64 reduces the portability of the build system. The previous implementation used more generic paths. Consider using ${CMAKE_SYSTEM_PROCESSOR}-linux or a similar variable to maintain flexibility across different Ascend platforms or simulation environments.
| static bool GenerateAicpuOpJson(const std::string& json_path, const std::vector<AicpuOpConfig>& op_configs) { | ||
| std::ofstream json_file(json_path); | ||
| if (!json_file.is_open()) { | ||
| LOG_ERROR("Failed to open JSON file for writing: %s", json_path.c_str()); | ||
| return false; | ||
| } | ||
|
|
||
| json_file << "{\n"; | ||
| for (size_t i = 0; i < op_configs.size(); ++i) { | ||
| const auto& config = op_configs[i]; | ||
| json_file << " \"" << config.opType << "\": {\n"; | ||
| json_file << " \"opInfo\": {\n"; | ||
| json_file << " \"functionName\": \"" << config.functionName << "\",\n"; | ||
| json_file << " \"kernelSo\": \"" << config.kernelSo << "\",\n"; | ||
| json_file << " \"opKernelLib\": \"" << config.opKernelLib << "\",\n"; | ||
| json_file << " \"computeCost\": \"" << config.computeCost << "\",\n"; | ||
| json_file << " \"engine\": \"" << config.engine << "\",\n"; | ||
| json_file << " \"flagAsync\": \"" << config.flagAsync << "\",\n"; | ||
| json_file << " \"flagPartial\": \"" << config.flagPartial << "\",\n"; | ||
| json_file << " \"userDefined\": \"" << config.userDefined << "\"\n"; | ||
| json_file << " }\n"; | ||
| json_file << " }" << (i < op_configs.size() - 1 ? "," : "") << "\n"; | ||
| } | ||
| json_file << "}\n"; | ||
| json_file.close(); | ||
|
|
||
| LOG_INFO("Generated AICPU op info JSON: %s", json_path.c_str()); | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Manual JSON construction using string concatenation is fragile and does not handle character escaping. If any of the configuration strings (such as functionName or opType) contain special characters like double quotes, the resulting JSON will be invalid. It is safer to use a dedicated JSON library or at least implement basic string escaping for these fields.
| std::unordered_map<std::string, std::string> name_mapping = { | ||
| {"DynTileFwkKernelServerInit", "DynTileFwkBackendKernelServerInit"}, | ||
| {"DynTileFwkKernelServer", "DynTileFwkBackendKernelServer"} | ||
| }; |
There was a problem hiding this comment.
The name_mapping between opType and functionName is hardcoded within the AicpuLoader implementation. This creates a tight coupling and makes the code harder to maintain if new kernels are added in the future. Consider passing this mapping as an argument to init_with_binary or defining it in a more central configuration location.
| int | ||
| run(Runtime &runtime, int block_dim, int device_id, const std::vector<uint8_t> &aicpu_so_binary, | ||
| const std::vector<uint8_t> &aicore_kernel_binary, int launch_aicpu_num = 1); | ||
| const std::vector<uint8_t> &aicore_kernel_binary, int launch_aicpu_num); |
There was a problem hiding this comment.
The default value for launch_aicpu_num has been removed from the run method declaration. This is a breaking change for any existing callers that rely on the default value. Additionally, this change is inconsistent with the a5 platform's device_runner.h, which still retains the default value. Per repository rules, files shared across different runtimes should maintain identical code for consistency. Consider restoring the default value to maintain backward compatibility.
References
- For files shared across different runtimes, maintain identical code for consistency, even if a more robust implementation is possible.
|
|
||
| target_link_directories(host_runtime | ||
| PRIVATE | ||
| ${ASCEND_HOME_PATH}/aarch64-linux/lib64 |
| static bool GenerateAicpuOpJson(const std::string& json_path, const std::vector<AicpuOpConfig>& op_configs) { | ||
| std::ofstream json_file(json_path); | ||
| if (!json_file.is_open()) { | ||
| LOG_ERROR("Failed to open JSON file for writing: %s", json_path.c_str()); | ||
| return false; | ||
| } | ||
|
|
||
| json_file << "{\n"; | ||
| for (size_t i = 0; i < op_configs.size(); ++i) { | ||
| const auto& config = op_configs[i]; | ||
| json_file << " \"" << config.opType << "\": {\n"; | ||
| json_file << " \"opInfo\": {\n"; | ||
| json_file << " \"functionName\": \"" << config.functionName << "\",\n"; | ||
| json_file << " \"kernelSo\": \"" << config.kernelSo << "\",\n"; | ||
| json_file << " \"opKernelLib\": \"" << config.opKernelLib << "\",\n"; | ||
| json_file << " \"computeCost\": \"" << config.computeCost << "\",\n"; | ||
| json_file << " \"engine\": \"" << config.engine << "\",\n"; | ||
| json_file << " \"flagAsync\": \"" << config.flagAsync << "\",\n"; | ||
| json_file << " \"flagPartial\": \"" << config.flagPartial << "\",\n"; | ||
| json_file << " \"userDefined\": \"" << config.userDefined << "\"\n"; | ||
| json_file << " }\n"; | ||
| json_file << " }" << (i < op_configs.size() - 1 ? "," : "") << "\n"; | ||
| } | ||
| json_file << "}\n"; | ||
| json_file.close(); | ||
|
|
||
| LOG_INFO("Generated AICPU op info JSON: %s", json_path.c_str()); | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Remove profiling_logs_* and *.log entries that were added during development and testing. These are not needed for the production codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Apply clang-format pointer/reference spacing - Fix copyright header: "WITHOUT WARRANTIES OF ANY KIND" (not "OR") Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Revert hardcoded aarch64-linux path in CMakeLists.txt, use portable paths - Restore default parameter for launch_aicpu_num in device_runner.h - Add documentation explaining JSON construction and name_mapping design The JSON construction uses manual string concatenation without a library. This is safe because kernel names are controlled strings without special characters, matching pypto's approach for similar AICPU op descriptors. The name_mapping from opType to functionName is specific to the Ascend tile framework kernels and is unlikely to change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace rtAicpuKernelLaunchExWithArgs with rtsBinaryLoadFromFile/ rtsFuncGetByName/rtsLaunchCpuKernel (BUILD_WITH_NEW_CANN=ON path) - Add two-layer architecture: outer libaicpu_dispatcher.so + inner runtime SO. Dispatcher saves inner SO to /tmp/aicpu_kernels/ and forwards kernel calls via dlopen/dlsym - Add LoadAicpuOp class to manage JSON generation and function handles - Generate JSON in SO directory with matching basename (cpuKernelMode=1) - Remove static linking (-static-libstdc++ -static-libgcc), add -Wl,-Bsymbolic,--build-id, strip debug info to match CANN built-ins - Update a2a3 and a5 platforms identically Known issue: scheduler fails with 507018 (symbol lookup failure for DynTileFwkKernelServerNull in libaicpu_dispatcher.so). Host-side API calls succeed, scheduler-side dlsym fails despite symbol being present in SO binary. Investigation ongoing.
Summary
rtAicpuKernelLaunchExWithArgsAPI to newrtsLaunchCpuKernel/rtsBinaryLoadFromFile/rtsFuncGetByNameinterfaceChanges
a2a3 platform
AicpuLoaderabstraction class with conditional compilationrtsBinaryLoadFromFilewithcpuKernelMode=0device_runnerto useAicpuLoadera5 platform
rtsBinaryLoadFromFileinstead ofrtsBinaryLoadFromDataBuild
BUILD_WITH_NEW_CANNcompile flag (default: ON)Test plan
Fixes #356