This repository was archived by the owner on Aug 30, 2024. It is now read-only.

Commit aa4a8ab

[BesTLA] AVX2: Use loaded registers of B. (#151)
1 parent 750b356 commit aa4a8ab

1 file changed: +1 -1 lines changed

bestla/bestla/bestla_gemm.h

Lines changed: 1 addition & 1 deletion
@@ -2716,7 +2716,7 @@ class AvxvnniN8P4 : protected bestla::xbyak::JitAvxvnni {
       vpbroadcastd(vreg_t(AReg), ptr[reg_tmp1]);
       add(reg_tmp1, reg_astride);
       for (int i = 0; i < NRegs; i++) {
-        vpdpbusds_(vreg_t(CReg + mm * NRegs + i), vreg_t(AReg), ptr[reg_matBptr + kk * BKStepSize + i * VecBytes]);
+        vpdpbusds_(vreg_t(CReg + mm * NRegs + i), vreg_t(AReg), vreg_t(BReg + i));
       }
     }
   }
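
To illustrate the intent of the change, here is a minimal scalar sketch in plain C++ (not the BesTLA JIT code). It models one VPDPBUSDS per register and compares reading B through a memory operand on every call against reusing B values loaded once. The names `dpbusds`, `kernel_mem_operand`, `kernel_reg_operand`, `Result`, and `b_reads` are hypothetical. That the `BReg` registers are filled once before the row loop is an assumption drawn from the commit title, since the load itself is not visible in this hunk.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr int kLanes = 8;      // 32-bit lanes in a 256-bit register
constexpr int kVecBytes = 32;  // bytes in a 256-bit register

using Acc = std::array<int32_t, kLanes>;     // stands in for one C register
using BVec = std::array<int8_t, kVecBytes>;  // stands in for one B register
using ABcast = std::array<uint8_t, 4>;       // the 4 bytes broadcast by vpbroadcastd

// Scalar model of VPDPBUSDS on one register: per lane, four u8 * s8 products
// are summed and added to the accumulator with signed 32-bit saturation.
inline void dpbusds(Acc& c, const ABcast& a, const int8_t* b) {
  for (int lane = 0; lane < kLanes; ++lane) {
    int64_t sum = c[lane];
    for (int k = 0; k < 4; ++k) sum += int32_t(a[k]) * int32_t(b[4 * lane + k]);
    sum = std::min<int64_t>(std::max<int64_t>(sum, INT32_MIN), INT32_MAX);
    c[lane] = static_cast<int32_t>(sum);
  }
}

struct Result {
  std::vector<Acc> c;  // rows * n_regs accumulators
  int b_reads = 0;     // hypothetical counter: 32-byte reads of B from memory
};

// Before the commit: every dot-product step takes B as a memory operand.
Result kernel_mem_operand(const std::vector<ABcast>& a, const int8_t* b_mem, int n_regs) {
  Result r;
  r.c.assign(a.size() * n_regs, Acc{});
  for (std::size_t mm = 0; mm < a.size(); ++mm) {
    for (int i = 0; i < n_regs; ++i) {
      dpbusds(r.c[mm * n_regs + i], a[mm], b_mem + i * kVecBytes);
      ++r.b_reads;
    }
  }
  return r;
}

// After the commit: B is read into registers once (assumed to happen before
// the row loop) and each step uses the register copy.
Result kernel_reg_operand(const std::vector<ABcast>& a, const int8_t* b_mem, int n_regs) {
  Result r;
  r.c.assign(a.size() * n_regs, Acc{});
  std::vector<BVec> b_regs(n_regs);
  for (int i = 0; i < n_regs; ++i) {
    std::memcpy(b_regs[i].data(), b_mem + i * kVecBytes, kVecBytes);
    ++r.b_reads;
  }
  for (std::size_t mm = 0; mm < a.size(); ++mm) {
    for (int i = 0; i < n_regs; ++i) {
      dpbusds(r.c[mm * n_regs + i], a[mm], b_regs[i].data());
    }
  }
  return r;
}
```

Both functions produce identical accumulators. In this model the register variant reads B from memory `n_regs` times instead of `rows * n_regs` times. How much that helps on real hardware depends on the surrounding kernel and is not measured here.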
