Add TopK layer support for pnnx ONNX export path#6558
Add TopK layer support for pnnx ONNX export path#6558vlordier wants to merge 32 commits intoTencent:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request adds TopK layer support for the pnnx ONNX export path in ncnn, addressing issue #6377 which reports failures in ONNX export pipelines for models like YOLOv10 through ultralytics.
Changes:
- Implements native TopK layer in ncnn core with support for different axes, largest/smallest selection, and sorted/unsorted output
- Adds pnnx pass to convert ONNX TopK nodes to ncnn TopK layer with proper batch axis handling
- Adds comprehensive test coverage for 1D, 2D, 3D, and 4D tensors with various axis configurations
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/layer/topk.h | Header file defining TopK layer class with parameters for axis, largest, sorted, and k |
| src/layer/topk.cpp | Implementation of TopK layer supporting 1-4D tensors with efficient sorting using std::partial_sort and std::nth_element |
| src/CMakeLists.txt | Registers TopK layer in build system (correctly ordered alphabetically) |
| tools/pnnx/src/pass_ncnn/TopK.cpp | PNNX pass to convert ONNX TopK operations to ncnn format, handling batch axis removal and parameter mapping |
| tools/pnnx/src/CMakeLists.txt | Adds TopK pass to pnnx build (correctly ordered alphabetically) |
| tests/test_topk.cpp | Test cases covering 1D through 4D tensors with various axis, k, and largest parameters |
| tests/CMakeLists.txt | Registers TopK test (minor ordering issue - should come after Tile) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Please enable github action in YOUR FORKED REPO to make code-format workflow work |
|
any updates on this? |
7cc5ae0 to
830149c
Compare
- Generate TopK class definition in pnnx.py output with forward() method - Instantiate TopK modules in Model.__init__() with proper parameters - Update forward() method to call self.topk_name() instead of direct TopK() calls - Fixes pnnx inference to properly execute TopK operations using torch.topk() - Test confirms TopK ONNX→pnnx conversion and inference working correctly
- Fix IR pattern syntax to use explicit parameter names (axis=%, largest=%, sorted=%) - Replace incorrect parameter lookup from 'op_0.axis' to 'axis' to match captured names - TopK pass now properly fires during ONNX→pnnx→ncnn conversion - All TopK parameters (axis, largest, sorted) correctly captured and set in ncnn layers - End-to-end test confirms ONNX→pnnx→ncnn conversion with TopK working correctly
use c++03-style topk comparator and keep deterministic nan/inf ordering remove redundant constructor param initialization fix tests cmakelists alphabetical order (Tile before TopK) expand torch_topk onnx tests (k=0/k=1, negative dim, sorted=false cases) drop generated topk onnx/pnnx/ncnn sidecar artifacts from repo
- Guard <algorithm>/<vector> behind #if NCNN_SIMPLESTL, include simplestl.h - Use std::partial_sort in simplestl mode (no std::nth_element available) - Guard <math.h> in tests behind #if !NCNN_SIMPLESTL to avoid simplemath.h conflict; define INFINITY/NAN as float expressions in simplestl mode - Fix cstep-unaware indexing for 3D/4D output tensors: use actual cstep for channel offset instead of assuming contiguous w*h layout - Convert #pragma omp parallel + inner #pragma omp for to #pragma omp parallel for to avoid __kmpc_barrier in simpleomp mode - Fix copyright year 2026->2025 - Apply code-format whitespace cleanup
830149c to
84e083b
Compare
|
@mlknz seems it's passing the tests and CI, give it a try and let me know |
- pass_level2/torch_topk.cpp: capture k/dim/largest/sorted as parameters
(prim::Constant) instead of tensor inputs, enabling ncnn pass matching
- pass_level2/torch_gather.cpp: restore original pattern (dim as tensor)
- pass_ncnn/TopK.cpp: match torch.topk with captured parameters and
convert to ncnn TopK layer (axis, largest, sorted)
- pass_ncnn/torch_gather.cpp (NEW): match torch.gather with 2 inputs
(input, index) and captured dim parameter, convert to ncnn Gather layer
- src/layer/gather.{h,cpp} (NEW): implement Gather ncnn operator
supporting 1D/2D/3D tensors with arbitrary axis
- PNNX CMakeLists fixes:
- per-target Torch include dirs to avoid protobuf header conflicts
- Abseil linking for Homebrew protobuf 34.x
- disable onnxruntime auto-detection (protobuf conflict)
- directory-level INCLUDE_DIRECTORIES_BEFORE for protobuf headers
Verified: YOLOv10n converts with 2 TopK + 2 Gather layers, only
cosmetic ops (Tensor.to, pnnx.Expression) ignored.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- src/layer/cast.{h,cpp}: extend Cast layer with int64 (type 5) and
int32 (type 6) support, adding conversions int64↔float32 and
int32↔float32
- pass_ncnn/tensor_to.cpp (NEW): convert Tensor.to (dtype cast) to
ncnn Cast layer, mapping torch dtype strings to ncnn type codes
- CMakeLists.txt: register tensor_to.cpp in pass_ncnn sources
Verified: YOLOv10n Tensor.to (i64→f32) now converts to Cast layer
instead of being ignored. Only cosmetic ops (pnnx.Expression) remain.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…PR triggers to workflow - load_onnx.cpp does not use any onnxruntime API (no Ort* usage), so the include guards and #error are unnecessary and break builds for users who don't have onnxruntime installed - Add pull_request trigger and fix-pnnx-onnx-topk-support push trigger to topk-linux-test.yml so CI runs on this PR Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…le ONNX pass This commit adds critical missing operators needed for YOLO26 model conversion to NCNN. New Operators: - GatherElements: ONNX GatherElements operator for tensor element gathering - Expand: ONNX Expand operator for tensor broadcasting with numpy semantics - Tile ONNX pass: Conversion pass for ONNX Tile operator (layer already exists) Changes: - Add src/layer/gatherelements.h and .cpp - Add src/layer/expand.h and .cpp - Add tools/pnnx/src/pass_ncnn/gatherelements.cpp - Add tools/pnnx/src/pass_ncnn/expand.cpp - Add tools/pnnx/src/pass_ncnn/tile.cpp - Update src/CMakeLists.txt to register GatherElements layer - Update tools/pnnx/src/CMakeLists.txt to register PNNX passes Implementation follows the pattern from PR Tencent#6558 (TopK/Gather/Cast). YOLO26 Operator Analysis (453 nodes, 28 unique ops): - 25 operators: Already supported in NCNN - 3 operators: Newly implemented (this commit) - 1 operator (Mod): Low priority, only 1 usage, can workaround Testing: - All files compile successfully - No compilation errors - Follows NCNN coding style and patterns Enables YOLO26 end2end NMS-free conversion with output shape [1, 300, 6]. References: - PR Tencent#6558: TopK/Gather/Cast implementation - YOLO26: https://arxiv.org/abs/2602.14582 - Issues: Tencent#6518, Tencent#6610 Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Tile: Added support for ONNX Tile operator with repeats as second input blob - Expand: Fixed implementation for proper shape broadcasting - Both operators now follow PR Tencent#6558 pattern - Maintains backward compatibility with parameter-based mode Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.
Comments suppressed due to low confidence (1)
src/layer/cast.cpp:108
Castadvertisestype_from=0as “auto” (see cast.h), butCast::forward()never resolvestype_from==0to an actual source type. With the new pnnxTensor.to -> Castlowering emittingtype_from=0, this will produce uninitialized outputs. Addtype_from==0inference (e.g., frombottom_blob.elemsize/ known flags) or explicitly treat0as float32 when no better signal exists, and return an error for unsupported conversions.
int Cast::forward(const Mat& bottom_blob, Mat& top_blob, const Option& opt) const
{
if (type_from == type_to)
{
top_blob = bottom_blob;
return 0;
}
int w = bottom_blob.w;
int h = bottom_blob.h;
int d = bottom_blob.d;
int channels = bottom_blob.c;
int dims = bottom_blob.dims;
size_t elemsize = bottom_blob.elemsize;
int elempack = bottom_blob.elempack;
size_t out_elemsize = elemsize;
if (type_to == 1)
{
// float32
out_elemsize = 4 * elempack;
}
else if (type_to == 2)
{
// float16
out_elemsize = 2 * elempack;
}
else if (type_to == 3)
{
// int8
out_elemsize = elempack;
}
else if (type_to == 4)
{
// bfloat16
out_elemsize = 2 * elempack;
}
else if (type_to == 5)
{
// int64
out_elemsize = 8 * elempack;
}
else if (type_to == 6)
{
// int32
out_elemsize = 4 * elempack;
}
if (dims == 1)
{
top_blob.create(w, out_elemsize, elempack, opt.blob_allocator);
}
else if (dims == 2)
{
top_blob.create(w, h, out_elemsize, elempack, opt.blob_allocator);
}
else if (dims == 3)
{
top_blob.create(w, h, channels, out_elemsize, elempack, opt.blob_allocator);
}
else if (dims == 4)
{
top_blob.create(w, h, d, channels, out_elemsize, elempack, opt.blob_allocator);
}
if (top_blob.empty())
return -100;
int size = w * h * d * elempack;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return 0 | ||
| || test_topk(a, 0, 1, 1, 1) | ||
| || test_topk(a, 0, 5, 1, 1) | ||
| || test_topk(a, 0, 1, 0, 0) | ||
| || test_topk(a, -1, 7, 0, 1) | ||
| || test_topk(a, 0, 4, 1, 0) | ||
| || test_topk(a, 0, 9, 1, 1); | ||
| } | ||
|
|
||
| static int test_topk_1() | ||
| { | ||
| ncnn::Mat a = RandomMat(12, 17); | ||
|
|
||
| return 0 | ||
| || test_topk(a, 0, 1, 1, 1) | ||
| || test_topk(a, 0, 5, 1, 1) | ||
| || test_topk(a, 1, 3, 1, 1) | ||
| || test_topk(a, -1, 8, 0, 1) | ||
| || test_topk(a, 1, 6, 0, 0) | ||
| || test_topk(a, -2, 7, 1, 1); | ||
| } | ||
|
|
||
| static int test_topk_2() | ||
| { | ||
| ncnn::Mat a = RandomMat(8, 9, 11); | ||
|
|
||
| return 0 | ||
| || test_topk(a, 0, 3, 1, 1) | ||
| || test_topk(a, 1, 4, 1, 1) | ||
| || test_topk(a, 2, 2, 0, 1) | ||
| || test_topk(a, 2, 5, 1, 0) | ||
| || test_topk(a, -1, 6, 1, 1) | ||
| || test_topk(a, -2, 5, 0, 1) | ||
| || test_topk(a, -3, 7, 1, 1); |
There was a problem hiding this comment.
Missing semicolon after return 0 causes a compile error, blocking the new test_topk target from building. Add the semicolon so the chained || test_topk(...) expressions compile.
| return 0 | |
| || test_topk(a, 0, 1, 1, 1) | |
| || test_topk(a, 0, 5, 1, 1) | |
| || test_topk(a, 0, 1, 0, 0) | |
| || test_topk(a, -1, 7, 0, 1) | |
| || test_topk(a, 0, 4, 1, 0) | |
| || test_topk(a, 0, 9, 1, 1); | |
| } | |
| static int test_topk_1() | |
| { | |
| ncnn::Mat a = RandomMat(12, 17); | |
| return 0 | |
| || test_topk(a, 0, 1, 1, 1) | |
| || test_topk(a, 0, 5, 1, 1) | |
| || test_topk(a, 1, 3, 1, 1) | |
| || test_topk(a, -1, 8, 0, 1) | |
| || test_topk(a, 1, 6, 0, 0) | |
| || test_topk(a, -2, 7, 1, 1); | |
| } | |
| static int test_topk_2() | |
| { | |
| ncnn::Mat a = RandomMat(8, 9, 11); | |
| return 0 | |
| || test_topk(a, 0, 3, 1, 1) | |
| || test_topk(a, 1, 4, 1, 1) | |
| || test_topk(a, 2, 2, 0, 1) | |
| || test_topk(a, 2, 5, 1, 0) | |
| || test_topk(a, -1, 6, 1, 1) | |
| || test_topk(a, -2, 5, 0, 1) | |
| || test_topk(a, -3, 7, 1, 1); | |
| return 0 || | |
| test_topk(a, 0, 1, 1, 1) || | |
| test_topk(a, 0, 5, 1, 1) || | |
| test_topk(a, 0, 1, 0, 0) || | |
| test_topk(a, -1, 7, 0, 1) || | |
| test_topk(a, 0, 4, 1, 0) || | |
| test_topk(a, 0, 9, 1, 1); | |
| } | |
| static int test_topk_1() | |
| { | |
| ncnn::Mat a = RandomMat(12, 17); | |
| return 0 || | |
| test_topk(a, 0, 1, 1, 1) || | |
| test_topk(a, 0, 5, 1, 1) || | |
| test_topk(a, 1, 3, 1, 1) || | |
| test_topk(a, -1, 8, 0, 1) || | |
| test_topk(a, 1, 6, 0, 0) || | |
| test_topk(a, -2, 7, 1, 1); | |
| } | |
| static int test_topk_2() | |
| { | |
| ncnn::Mat a = RandomMat(8, 9, 11); | |
| return 0 || | |
| test_topk(a, 0, 3, 1, 1) || | |
| test_topk(a, 1, 4, 1, 1) || | |
| test_topk(a, 2, 2, 0, 1) || | |
| test_topk(a, 2, 5, 1, 0) || | |
| test_topk(a, -1, 6, 1, 1) || | |
| test_topk(a, -2, 5, 0, 1) || | |
| test_topk(a, -3, 7, 1, 1); |
| return R"PNNXIR(7767517 | ||
| 7 7 | ||
| 12 7 | ||
| pnnx.Input input_0 0 1 input | ||
| pnnx.Input input_1 0 1 k | ||
| pnnx.Input input_2 0 1 dim | ||
| pnnx.Input input_3 0 1 largest | ||
| pnnx.Input input_4 0 1 sorted | ||
| aten::topk op_0 5 2 input k dim largest sorted values indices | ||
| prim::Constant op_0 0 1 k value=%k | ||
| prim::Constant op_1 0 1 dim value=%dim | ||
| prim::Constant op_2 0 1 largest value=%largest | ||
| prim::Constant op_3 0 1 sorted value=%sorted | ||
| aten::topk op_4 5 2 input k dim largest sorted values indices |
There was a problem hiding this comment.
The PNNXIR header 12 7 does not match the actual number of operator lines in this pattern (7). This mismatch will prevent the graph rewriter from matching aten::topk nodes. Update the header counts to reflect the pattern contents (likely 7 7).
| fprintf(pyfp, " self.%s = TopK(", sanitize_identifier(op->name).c_str()); | ||
|
|
||
| int i = 0; | ||
| for (const auto& it : op->params) | ||
| { | ||
| fprintf(pyfp, "%s=", it.first.c_str()); | ||
|
|
||
| const Parameter& param = it.second; | ||
| if (param.type == 2) | ||
| { | ||
| fprintf(pyfp, "%d", param.i); | ||
| } | ||
| else if (param.type == 1) | ||
| { | ||
| fprintf(pyfp, "%d", param.b ? 1 : 0); | ||
| } | ||
|
|
||
| if (i + 1 != op->params.size()) | ||
| fprintf(pyfp, ", "); | ||
| i++; | ||
| } | ||
|
|
||
| fprintf(pyfp, ")\n"); |
There was a problem hiding this comment.
When instantiating TopK modules, this loop prints it.first as a Python keyword (e.g., 0=..., 1=...). ncnn-style param keys are numeric strings, which are not valid Python identifiers and will generate invalid Python code. Map these param ids to named arguments (axis, largest, sorted, k) or emit positional arguments in a fixed order.
| return -100; | ||
|
|
||
| const float* inp = input_blob; | ||
| const int* idx = (const int*)index_blob; |
There was a problem hiding this comment.
Gather::forward() reinterprets the index tensor as const int* (idx = (const int*)index_blob), but ncnn (and this PR’s TopK/ArgMax) commonly represent indices as float values stored in a float buffer. Reinterpreting float bits as int will produce huge indices and out-of-bounds reads. Read indices as float and cast to int (or branch based on index_blob.elemsize / the Cast output type).
| const int* idx = (const int*)index_blob; | |
| const float* idx = index_blob; |
| int rem = i; | ||
| int coord_out[4] = {0, 0, 0, 0}; | ||
| if (top_blob.dims == 1) | ||
| { | ||
| coord_out[0] = rem; | ||
| } | ||
| else if (top_blob.dims == 2) | ||
| { | ||
| coord_out[0] = rem % top_blob.w; | ||
| coord_out[1] = rem / top_blob.w; | ||
| } | ||
| else if (top_blob.dims == 3) | ||
| { | ||
| int hw = top_blob.w * top_blob.h; | ||
| coord_out[0] = (rem % hw) % top_blob.w; | ||
| coord_out[1] = (rem % hw) / top_blob.w; | ||
| coord_out[2] = rem / hw; | ||
| } | ||
|
|
||
| // Get index value at this output position | ||
| int gather_idx = idx[i]; | ||
| // Handle negative indices | ||
| if (gather_idx < 0) gather_idx += axis_dim_size; | ||
|
|
||
| // Build input coordinate (same as output, but axis coord replaced) | ||
| int coord_in[4] = {coord_out[0], coord_out[1], coord_out[2], coord_out[3]}; | ||
| coord_in[positive_axis] = gather_idx; | ||
|
|
||
| // Clamp to input bounds | ||
| if (coord_in[positive_axis] >= axis_dim_size) coord_in[positive_axis] = axis_dim_size - 1; | ||
| if (coord_in[positive_axis] < 0) coord_in[positive_axis] = 0; | ||
|
|
||
| // Compute flat input index | ||
| int flat_in = 0; | ||
| if (dims == 1) | ||
| { | ||
| flat_in = coord_in[0]; | ||
| } | ||
| else if (dims == 2) | ||
| { | ||
| flat_in = coord_in[0] + coord_in[1] * input_blob.w; | ||
| } | ||
| else if (dims == 3) | ||
| { | ||
| // ncnn 3D layout: w * h * c, with cstride padding | ||
| size_t cstep = input_blob.cstep; | ||
| flat_in = coord_in[0] + coord_in[1] * input_blob.w + coord_in[2] * (int)cstep; | ||
| } | ||
|
|
There was a problem hiding this comment.
This implementation only decomposes output coordinates / computes flat_in for dims up to 3. For 4D inputs (and/or 4D index tensors), coord_out/flat_in are left partially unset and flat_in stays 0, so most outputs will incorrectly read the first element. Add explicit handling for dims==4 (both coordinate decomposition and flat index computation, including cstep/d strides).
| type_to = 5; | ||
| else if (dtype == "torch.int32" || dtype == "torch.int") | ||
| type_to = 6; | ||
|
|
||
| op->params["0"] = 0; // auto-detect input type | ||
| op->params["1"] = type_to; | ||
| } |
There was a problem hiding this comment.
Tensor.to is lowered to Cast with type_from=0 (“auto”), but Cast::forward() currently has no logic to infer the source type when type_from==0, so none of the conversion branches will run and the output buffer will be left uninitialized. Either implement type_from==0 auto-detection in Cast or set an explicit type_from here based on the actual input dtype tracked by pnnx.
| int new_axis = axis; | ||
| if (axis >= 0) | ||
| new_axis = axis > batch_index ? axis - 1 : axis; | ||
|
|
||
| op->params["0"] = new_axis; | ||
| op->params["1"] = largest; | ||
| op->params["2"] = sorted; | ||
| } |
There was a problem hiding this comment.
The ncnn TopK layer expects k via param id 3 (or via a second input blob), but this rewrite pass never writes k into op->params. As a result, converted graphs will use the default k=1 regardless of the original torch.topk(..., k=...). Capture and write k (and consider supporting the ONNX-style second input for dynamic K if needed).
| fprintf(pyfp, " def __init__(self, axis=1, largest=1, sorted=1):\n"); | ||
| fprintf(pyfp, " super(TopK, self).__init__()\n"); | ||
| fprintf(pyfp, " self.axis = axis\n"); | ||
| fprintf(pyfp, " self.largest = largest\n"); | ||
| fprintf(pyfp, " self.sorted = sorted\n"); | ||
| fprintf(pyfp, " def forward(self, x, k):\n"); |
There was a problem hiding this comment.
The generated Python TopK module takes forward(self, x, k), but the pnnx->ncnn lowering for torch.topk in this PR models k as a captured parameter (not an input tensor). If the resulting TopK operator only has one input, the emitted self.<name>(v_input) call will raise at runtime due to the missing k argument. Align the Python stub with how k is represented (store k as a module attribute and accept only x, or ensure the operator keeps a second k input).
| fprintf(pyfp, " def __init__(self, axis=1, largest=1, sorted=1):\n"); | |
| fprintf(pyfp, " super(TopK, self).__init__()\n"); | |
| fprintf(pyfp, " self.axis = axis\n"); | |
| fprintf(pyfp, " self.largest = largest\n"); | |
| fprintf(pyfp, " self.sorted = sorted\n"); | |
| fprintf(pyfp, " def forward(self, x, k):\n"); | |
| fprintf(pyfp, " def __init__(self, axis=1, largest=1, sorted=1, k=None):\n"); | |
| fprintf(pyfp, " super(TopK, self).__init__()\n"); | |
| fprintf(pyfp, " self.axis = axis\n"); | |
| fprintf(pyfp, " self.largest = largest\n"); | |
| fprintf(pyfp, " self.sorted = sorted\n"); | |
| fprintf(pyfp, " self.k = k\n"); | |
| fprintf(pyfp, " def forward(self, x, k=None):\n"); | |
| fprintf(pyfp, " k = self.k if k is None else k\n"); | |
| fprintf(pyfp, " if k is None:\n"); | |
| fprintf(pyfp, " raise TypeError('TopK.forward() missing required argument: k')\n"); |
|
|
||
| // Allocate output (same dtype as input, shape matches index) | ||
| Mat& top_blob = top_blobs[0]; | ||
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); |
There was a problem hiding this comment.
top_blob.create(out_shape.w, out_shape.h, out_shape.c, ...) always creates a 3D output, even when index_blob is 1D/2D/4D. That changes the output rank and breaks the coordinate decomposition logic below (which depends on top_blob.dims). Allocate top_blob using the correct create() overload based on out_shape.dims so the output shape truly matches the index tensor.
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| if (out_shape.dims == 1) | |
| top_blob.create(out_shape.w, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 2) | |
| top_blob.create(out_shape.w, out_shape.h, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 3) | |
| top_blob.create(out_shape.w, out_shape.h, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else if (out_shape.dims == 4) | |
| top_blob.create(out_shape.w, out_shape.h, out_shape.d, out_shape.c, input_blob.elemsize, input_blob.elempack, opt.blob_allocator); | |
| else | |
| return -1; |
| else if (dim_p.type == 5 && !dim_p.ai.empty()) | ||
| axis = dim_p.ai[0]; | ||
| } | ||
|
|
There was a problem hiding this comment.
This lowering writes axis directly from dim, but other ncnn rewrite passes drop/renumber the batch axis using __batch_index (see e.g. torch_flip/TopK). Without the same adjustment here, Gather will target the wrong dimension after batch removal. Apply the same batch-axis handling (reject gather along batch, and decrement axes > batch_index).
| if (op->params.find("__batch_index") != op->params.end()) | |
| { | |
| const Parameter& batch_index_p = op->params.at("__batch_index"); | |
| if (batch_index_p.type == 2) | |
| { | |
| const int batch_index = batch_index_p.i; | |
| // ncnn lowers tensor ranks with the batch axis removed. | |
| // gather along the removed batch axis cannot be represented here. | |
| if (axis == batch_index) | |
| return; | |
| if (axis > batch_index) | |
| axis -= 1; | |
| } | |
| } |
- ir.cpp: store k in TopK.__init__, use forward(self, x) — k was a ctor param but forward() expected it as an input arg, causing runtime error - ir.cpp: pass k= in TopK instantiation (k_val from params["3"]) - gather.cpp: reject non-float32 data (elemsize != 4) and dims > 3 explicitly - pnnx/src/CMakeLists: replace invalid set_property(INCLUDE_DIRECTORIES_BEFORE) with include_directories(BEFORE ...) to correctly force protobuf header order - pnnx/tests/onnx: add test_torch_gather.py roundtrip test (1D/2D/3D, multiple axes, negative axis) and register it in CMakeLists
|
Superseded by #6669 which consolidates all operators (TopK, Gather, GatherElements, Expand, Tile, Mod) with all review issues addressed. |
Summary
TopKlayer implementation in ncnn coreTopKnodes in pnnx ncnn pass to the newTopKlayertest_topkcoverageWhy
Issue #6377 reports ONNX export pipelines (for example YOLOv10 via ultralytics) failing due to unsupported
TopKin ncnn conversion/runtime path.Validation
test_topklocally./build/tests/test_topksuccessfullyFixes #6377