feat: Add TGEMV_MX family ops by FangRui0 · Pull Request #418 · hw-native-sys/PTOAS

FangRui0 · 2026-04-02T02:01:56Z

TGEMV_MX is an A5-only op.

FangRui0 · 2026-04-02T02:02:26Z

/run a5

gemini-code-assist

Code Review

This pull request introduces the pto.tgemv.mx operation family for mixed-precision matrix-vector multiplication on A5 targets, including IR definitions, documentation, and lowering patterns. Review feedback highlights the need for unique intrinsic names for the accumulation and bias variants to prevent backend collisions, the importance of updating conversion patterns and test expectations accordingly, and the necessity of preserving result types during view-to-memref transformations to maintain SSA consistency.

gemini-code-assist · 2026-04-02T02:04:17Z

include/PTO/IR/PTOOps.td

+  }];
+
+  let extraClassDeclaration = [{
+    static StringRef getIntrinsicName() { return "TGEMV_MX"; }


The intrinsic name for the accumulation variant should be specific to avoid collisions and ensure correct mapping in the backend. Following the pattern of other operations (like TMatmulMXAccOp), this should be TGEMV_MX_ACC.

static StringRef getIntrinsicName() { return "TGEMV_MX_ACC"; }

The backend API is named "TGEMV_MX".

gemini-code-assist · 2026-04-02T02:04:17Z

include/PTO/IR/PTOOps.td

+  }];
+
+  let extraClassDeclaration = [{
+    static StringRef getIntrinsicName() { return "TGEMV_MX"; }


The intrinsic name for the bias variant should be specific. Using the same name as the base operation (TGEMV_MX) will cause issues during code generation as the argument count and positions differ. It should be TGEMV_MX_BIAS.

static StringRef getIntrinsicName() { return "TGEMV_MX_BIAS"; }

The backend API is named "TGEMV_MX".
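The disagreement above turns on whether one backend name can serve call sites with different operand lists. The following is a minimal sketch, assuming the backend distinguishes the variants by operand tile types through ordinary C++ overloading. All types and the function name `tgemvMxSketch` are hypothetical stand-ins, not the pto-isa API.

```cpp
// Hypothetical stand-ins for backend tile types (not the pto-isa definitions).
struct AccTile {};
struct LeftTile {};
struct RightTile {};
struct ScaleTile {};
struct BiasTile {};

enum class Variant { Base, Acc, Bias };

// Base form: (dst, a, aScale, b, bScale), five operands.
inline Variant tgemvMxSketch(const AccTile&, const LeftTile&, const ScaleTile&,
                             const RightTile&, const ScaleTile&) {
  return Variant::Base;
}

// Accumulating form: (dst, cIn, a, aScale, b, bScale).
// The second operand is an accumulator tile, which selects this overload.
inline Variant tgemvMxSketch(const AccTile&, const AccTile&, const LeftTile&,
                             const ScaleTile&, const RightTile&,
                             const ScaleTile&) {
  return Variant::Acc;
}

// Bias form: (dst, a, aScale, b, bScale, bias).
// Same operand count as the accumulating form, but different types per position.
inline Variant tgemvMxSketch(const AccTile&, const LeftTile&, const ScaleTile&,
                             const RightTile&, const ScaleTile&,
                             const BiasTile&) {
  return Variant::Bias;
}
```

Under that assumption, a single emitted name resolves to the right overload as long as the operand types differ at some position. If two variants ever shared both operand count and operand types, distinct names would be required.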

gemini-code-assist · 2026-04-02T02:04:17Z

lib/PTO/Transforms/PTOToEmitC.cpp

+    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
+                                {dst, cIn, a, aScale, b, bScale}, rewriter);


This variant should map to the TGEMV_MX_ACC intrinsic. Mapping both acc and bias variants to the same TGEMV_MX name with different argument structures will lead to incorrect C code generation or compilation errors in the backend.

Suggested change

-    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
-                                {dst, cIn, a, aScale, b, bScale}, rewriter);
+    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX_ACC",
+                                {dst, cIn, a, aScale, b, bScale}, rewriter);

The backend API is named "TGEMV_MX".

gemini-code-assist · 2026-04-02T02:04:18Z

lib/PTO/Transforms/PTOToEmitC.cpp

+    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
+                                {dst, a, aScale, b, bScale, bias}, rewriter);


This variant should map to the TGEMV_MX_BIAS intrinsic. Using a generic name for operations with different argument signatures is fragile and likely incorrect for the EmitC backend.

Suggested change

-    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
-                                {dst, a, aScale, b, bScale, bias}, rewriter);
+    replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX_BIAS",
+                                {dst, a, aScale, b, bScale, bias}, rewriter);

The backend API is named "TGEMV_MX".

gemini-code-assist · 2026-04-02T02:04:18Z

lib/PTO/Transforms/PTOToEmitC.cpp

+  }
+};
+
+struct PTOTGemvMXAccToTGEMV_MX


The pattern name should reflect the specific intrinsic it targets. It is recommended to rename this to PTOTGemvMXAccToTGEMV_MX_ACC for consistency with other mixed-precision accumulation patterns.

Suggested change

-struct PTOTGemvMXAccToTGEMV_MX
+struct PTOTGemvMXAccToTGEMV_MX_ACC

gemini-code-assist · 2026-04-02T02:04:18Z

lib/PTO/Transforms/PTOToEmitC.cpp

+  }
+};
+
+struct PTOTGemvMXBiasToTGEMV_MX


The pattern name should be updated to PTOTGemvMXBiasToTGEMV_MX_BIAS to match the intended intrinsic target.

Suggested change

-struct PTOTGemvMXBiasToTGEMV_MX
+struct PTOTGemvMXBiasToTGEMV_MX_BIAS

gemini-code-assist · 2026-04-02T02:04:18Z

lib/PTO/Transforms/PTOViewToMemref.cpp

+        IRRewriter rewriter(ctx);
+        rewriter.setInsertionPoint(op);
+        rewriter.replaceOpWithNewOp<pto::TGemvMxOp>(
+          op, TypeRange{},


Using TypeRange{} will drop any results the original operation might have had. If these operations are in the tensor domain (where they return a result aliasing the destination buffer), this will break the SSA chain. It is safer to use op->getResultTypes() to preserve the original result signature.

Suggested change

-          op, TypeRange{},
+          op, op->getResultTypes(),

gemini-code-assist · 2026-04-02T02:04:18Z

test/basic/tgemv_mx_variants_emitc.pto

+// CHECK: TGEMV_MX(
+// CHECK: TGEMV_MX(


The test expectations should be updated to reflect the specific intrinsic names for the accumulation and bias variants.

// CHECK: TGEMV_MX_ACC(
// CHECK: TGEMV_MX_BIAS(
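For context on why the test expectations are tied to the emitted names: plain `CHECK` lines are matched in order as literal text against the tool output. The sketch below is a deliberately simplified model of that behaviour (no regex blocks, no whitespace canonicalization), and `checksMatchInOrder` is a hypothetical helper, not part of FileCheck.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified model of sequential CHECK matching: each pattern must occur as a
// literal substring somewhere after the end of the previous match.
bool checksMatchInOrder(const std::string& output,
                        const std::vector<std::string>& patterns) {
  std::size_t pos = 0;
  for (const std::string& pattern : patterns) {
    const std::size_t found = output.find(pattern, pos);
    if (found == std::string::npos) {
      return false;  // pattern missing, or only present before the previous match
    }
    pos = found + pattern.size();
  }
  return true;
}
```

Because the pattern `TGEMV_MX(` includes the opening parenthesis, it does not match a call spelled `TGEMV_MX_ACC(`. The CHECK lines and the lowering therefore have to agree on whichever name is chosen.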

reedhecre · 2026-04-02T02:06:13Z

A5 board test failed

Trigger: manual
Source commit: 6b32a2640480
Result summary: OK 0 / FAIL 0 / SKIP 0
Log: /root/ptoas-board-monitor-a5/logs/20260402_100305_manual_pr418.log
Manual command: /run a5
Triggered by: FangRui0
Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
Failed stage: sample-build-and-test / exit=1

Log tail

_inject_sync_loop-pto.cpp
Sync(test_inject_sync_two_event_id.py) OK   generated: test_inject_sync_two_event_id-pto.cpp
Sync(test_intercore_sync_a3_dyn.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_missing_setffts.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_modes.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a5_dyn.py) OK   generated: test_intercore_sync_a5_dyn-pto.cpp
Sync(test_intercore_sync_a5_functional.py) OK   generated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
Tcvt(tcvt.py) OK   generated: tcvt-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=172  FAIL=1  SKIP=4
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-04-02 10:06:11 =====

FangRui0 · 2026-04-02T02:54:00Z

/run a5

reedhecre · 2026-04-02T02:57:14Z

A5 board test failed

Trigger: manual
Source commit: 1ae7e200ce24
Result summary: OK 0 / FAIL 0 / SKIP 0
Log: /root/ptoas-board-monitor-a5/logs/20260402_105406_manual_pr418.log
Manual command: /run a5
Triggered by: FangRui0
Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
Failed stage: sample-build-and-test / exit=1

Log tail

_inject_sync_loop-pto.cpp
Sync(test_inject_sync_two_event_id.py) OK   generated: test_inject_sync_two_event_id-pto.cpp
Sync(test_intercore_sync_a3_dyn.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_missing_setffts.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_modes.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a5_dyn.py) OK   generated: test_intercore_sync_a5_dyn-pto.cpp
Sync(test_intercore_sync_a5_functional.py) OK   generated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
Tcvt(tcvt.py) OK   generated: tcvt-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=172  FAIL=1  SKIP=4
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-04-02 10:57:11 =====

FangRui0 · 2026-04-02T06:29:46Z

/run a3

FangRui0 · 2026-04-02T06:32:43Z

/run a5

FangRui0 · 2026-04-03T07:40:51Z

/run a5 test/basic/tgemv_mx_emitc.pto

reedhecre · 2026-04-03T07:45:53Z

A5 board test failed

Trigger: manual
Source commit: 62b896849e46
Result summary: OK 0 / FAIL 1 / SKIP 0
Log: /root/ptoas-board-monitor-a5/logs/20260403_154406_manual_pr418.log
Manual command: /run a5 test/basic/tgemv_mx_emitc.pto
Triggered by: FangRui0
Specified cases: test/basic/tgemv_mx_emitc.pto
Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
Failed stage: board-validation / exit=1

Failed cases

tgemv_mx_emitc (run, exit=2)

reedhecre · 2026-04-03T07:45:55Z

A5 board test failure details: PR #418

tgemv_mx_emitc

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:107:5: error: static assertion failed due to requirement 'std::is_same_v<half, float>': TMatmulMX:No supported data type combination.
    static_assert((isFp4 || isFp8) && std::is_same_v<CType, float>, "TMatmulMX:No supported data type combination.");
    ^                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:312:5: note: in instantiation of function template specialization 'pto::CheckMadMxValid<pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    CheckMadMxValid<TileRes, TileLeft, TileLeftScale, TileRight, TileRightScale>();
    ^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/common/pto_instr.hpp:368:5: note: in instantiation of function template specialization 'pto::TGEMV_MX_IMPL<pto::AccPhase::Unspecified, pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MAP_INSTR_IMPL(TGEMV_MX, cMatrix, aMatrix, aScaleMatrix, bMatrix, bScaleMatrix);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/common/pto_instr.hpp:22:34: note: expanded from macro 'MAP_INSTR_IMPL'
#define MAP_INSTR_IMPL(API, ...) API##_IMPL(__VA_ARGS__)
                                 ^
<scratch space>:218:1: note: expanded from here
TGEMV_MX_IMPL
^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/npu_validation/Basic/tgemv_mx_emitc/tgemv_mx_emitc_kernel.cpp:91:3: note: in instantiation of function template specialization 'pto::TGEMV_MX<pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  TGEMV_MX(v9, v5, v6, v7, v8);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/tgemv_mx_emitc_kernel.dir/build.make:76: CMakeFiles/tgemv_mx_emitc_kernel.dir/tgemv_mx_emitc_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/tgemv_mx_emitc_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-03 15:45:51] ERROR: testcase failed (exit 2): tgemv_mx_emitc
[2026-04-03 15:45:51] === SUMMARY ===
[2026-04-03 15:45:51] OK=0 FAIL=1 SKIP=0
[2026-04-03 15:45:51] RESULTS_TSV=/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/remote_npu_validation_results.tsv
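The static assertion in the log rejects the instantiation because every tile, including the accumulator, uses `half`, while the check requires an fp4 or fp8 input type together with a `float` accumulator. The snippet below is a minimal sketch of that kind of compile-time check. The element types and the way `isFp4` and `isFp8` are derived are assumptions for illustration, since the log only shows the final condition.

```cpp
#include <type_traits>

// Hypothetical stand-ins for device element types (not the pto-isa definitions).
struct half_t {};
struct fp8_e4m3_t {};
struct fp4_t {};

// Mirrors the shape of the asserted condition from the log:
//   (isFp4 || isFp8) && std::is_same_v<CType, float>
// Deriving isFp4/isFp8 from a single input type is an assumption of this sketch.
template <typename CType, typename InputType>
constexpr bool isMxCombinationSupported() {
  constexpr bool isFp4 = std::is_same_v<InputType, fp4_t>;
  constexpr bool isFp8 = std::is_same_v<InputType, fp8_e4m3_t>;
  return (isFp4 || isFp8) && std::is_same_v<CType, float>;
}
```

With these stand-ins, `isMxCombinationSupported<half_t, half_t>()` is false, which corresponds to the failing kernel, and `isMxCombinationSupported<float, fp8_e4m3_t>()` is true.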

FangRui0 · 2026-04-03T08:27:23Z

/run a5 test/basic/tgemv_mx_emitc.pto test/basic/tgemv_mx_variants_emitc.pto

reedhecre · 2026-04-03T08:29:47Z

A5 board test failed

Trigger: manual
Source commit: 0a830082228e
Result summary: OK 0 / FAIL 0 / SKIP 0
Log: /root/ptoas-board-monitor-a5/logs/20260403_162806_manual_pr418.log
Manual command: /run a5 test/basic/tgemv_mx_emitc.pto test/basic/tgemv_mx_variants_emitc.pto
Triggered by: FangRui0
Specified cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto
Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
Failed stage: internal / RUN_ONLY_CASES matched zero buildable cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto

Log tail

, strided<[16, 1], offset: ?>, #pto.address_space<acc>>
  [Success] -> __cc__ float*
[Debug] Converting MemRef: memref<1x128xf8E4M3, strided<[128, 1], offset: ?>, #pto.address_space<left>>
  [Success] -> __ca__ float8_e4m3_t*
[Debug] Converting MemRef: memref<1x128xf16, strided<[128, 16], offset: ?>, #pto.address_space<scaling>>
  [Success] -> __fbuf__ half*
[Debug] Converting MemRef: memref<128x16xf8E4M3, strided<[16, 1], offset: ?>, #pto.address_space<right>>
  [Success] -> __cb__ float8_e4m3_t*
[Debug] Converting MemRef: memref<128x16xf16, strided<[16, 16], offset: ?>, #pto.address_space<scaling>>
  [Success] -> __fbuf__ half*
[Debug] Converting MemRef: memref<1x16xf32, strided<[16, 1], offset: ?>, #pto.address_space<bias>>
  [Success] -> __gm__ float*
===== END STAGE emit-basic-pto-cases rc=0 @ 2026-04-03 16:29:46 =====
basic direct pto emitted: test/basic/tgemv_mx_emitc.pto -> test/samples/Basic/tgemv_mx_emitc-pto.cpp, test/basic/tgemv_mx_variants_emitc.pto -> test/samples/Basic/tgemv_mx_variants_emitc-pto.cpp

===== INTERNAL ERROR =====
Traceback (most recent call last):
  File "/root/ptoas-board-monitor-a5/monitor.py", line 2071, in run_once
    summary = runner.run()
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1499, in run
    self.generate_payload()
    ~~~~~~~~~~~~~~~~~~~~~^^
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1441, in generate_payload
    self.resolve_payload_run_only_cases()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1122, in resolve_payload_run_only_cases
    raise RuntimeError(f"RUN_ONLY_CASES matched zero buildable cases: {self.run_only_cases}")
RuntimeError: RUN_ONLY_CASES matched zero buildable cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto

FangRui0 · 2026-04-03T08:46:23Z

/run a5 tgemv_mx_emitc tgemv_mx_variants_emitc

reedhecre · 2026-04-03T08:52:42Z

A5 board test succeeded

Trigger: manual
Source commit: 0a830082228e
Result summary: OK 2 / FAIL 0 / SKIP 0
Log: /root/ptoas-board-monitor-a5/logs/20260403_164706_manual_pr418.log
Results TSV: /root/ptoas-board-monitor-a5/logs/20260403_164706_manual_pr418.tsv
Manual command: /run a5 tgemv_mx_emitc tgemv_mx_variants_emitc
Triggered by: FangRui0
Specified cases: tgemv_mx_emitc,tgemv_mx_variants_emitc
Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
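
In this thread, the run given `.pto` paths ended with "RUN_ONLY_CASES matched zero buildable cases", while the run given bare case names passed on the same commit. How the monitor matches cases internally is not shown, so the following is only a sketch of how a path could be reduced to such a bare name. `caseNameFromPath` is a hypothetical helper.

```cpp
#include <filesystem>
#include <string>

// Reduce a test path such as "test/basic/tgemv_mx_emitc.pto" to the bare case
// name "tgemv_mx_emitc" by dropping the directory part and the extension.
// An argument that is already a bare name is returned unchanged.
std::string caseNameFromPath(const std::string& arg) {
  return std::filesystem::path(arg).stem().string();
}
```

Applied to the two paths from the failed command, this yields `tgemv_mx_emitc` and `tgemv_mx_variants_emitc`, the names used in the successful run.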

gemini-code-assist bot reviewed Apr 2, 2026

FangRui0 force-pushed the add_gemv branch from cc91afe to 4b165ca on April 2, 2026 02:53

jiashu added this to pto project Apr 2, 2026

github-project-automation bot moved this to Todo in pto project Apr 2, 2026

FangRui0 force-pushed the add_gemv branch from 4b165ca to 64fcaab on April 2, 2026 06:29

FangRui0 force-pushed the add_gemv branch from 64fcaab to a9a12ee on April 3, 2026 07:39

FangRui0 force-pushed the add_gemv branch from a9a12ee to 49741a0 on April 3, 2026 08:26

FangRui0 force-pushed the add_gemv branch from 49741a0 to 8f46cd5 on April 3, 2026 09:05

feat: Add TGEMV_MX family ops

1f5be87

FangRui0 force-pushed the add_gemv branch from 8f46cd5 to 1f5be87 on April 3, 2026 09:12

test: skip gemvmx on non-a5 sample runs

a6152e0
