-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ggml : fix MUL_MAT_ID repack with Q8_K #12544
Conversation
@Djip007 Could you take a look at these fixes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a very quick review...
for (int64_t i11 = ith * 4; i11 < ne11 - ne11 % 4; i11 += nth * 4) { | ||
quantize_mat_q8_K((float *) ((char *) src1->data + i11 * nb11), (void *) (wdata + i11 * nbw1), 4, ne10, | ||
INTER_SIZE); | ||
} | ||
} else { | ||
GGML_ASSERT(PARAM_TYPE == GGML_TYPE_Q8_0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For C++ I'll use in this case
if constexpr (PARAM_TYPE == GGML_TYPE_Q8_0) { ... }
if constexpr (PARAM_TYPE == GGML_TYPE_Q8_K) { ... }
and if this is the only to possible may may be some
static_assert( (PARAM_TYPE == GGML_TYPE_Q8_K) || (PARAM_TYPE == GGML_TYPE_Q8_0), "comment");
but it may be trap by adding gem[v/m] PARAM_TYPE as template param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be remove the if and change (transforme) the quantize_mat_q8_K/quantize_mat_q8_0 to template quantize_mat<PARAM_TYPE>
template <ggml_type PARAM_TYPE>
void quantize_mat(...);
tempate<> quantize_mat<GGML_TYPE_Q8_0>(...) {
quantize_mat_q8_0(...); // or "inline" it.
}
tempate<> quantize_mat<GGML_TYPE_Q8_0>(...) {
quantize_mat_q8_K(...);
}
@@ -5578,7 +5579,7 @@ static const tensor_traits<block_q4_0, 8, 8, GGML_TYPE_Q8_0> q4_0_8x8_q8_0; | |||
static const tensor_traits<block_q4_K, 8, 8, GGML_TYPE_Q8_K> q4_K_8x8_q8_K; | |||
|
|||
// instance for IQ4 | |||
static const tensor_traits<block_iq4_nl, 4, 4, GGML_TYPE_IQ4_NL> iq4_nl_4x4_q8_0; | |||
static const tensor_traits<block_iq4_nl, 4, 4, GGML_TYPE_Q8_0> iq4_nl_4x4_q8_0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the static_assert it will be catch a build time.
@@ -5295,8 +5295,7 @@ template <> void gemv<block_q4_K, 8, 8>(int n, float * s, size_t bs, const void | |||
ggml_gemv_q4_K_8x8_q8_K(n, s, bs, vx, vy, nr, nc); | |||
} | |||
|
|||
template <> | |||
void gemv<block_iq4_nl, 4, 4>(int n, float * s, size_t bs, const void * vx, const void * vy, int nr, int nc) { | |||
template <> void gemv<block_iq4_nl, 4, 4>(int n, float * s, size_t bs, const void * vx, const void * vy, int nr, int nc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be add the PARAM_TYPE on this template to
template <typename BLOC_TYPE, int64_t INTER_SIZE, int64_t NB_COLS, ggml_type PARAM_TYPE>
void gem[m,v]....
@Djip007 Thanks. Let me know if you have any other suggestions. |
Merging this for now and happy to hear if you have any additional suggestions. |
fix #12528
mul_mat_id
assumedQ8_0
type for thesrc1
IQ4_NL
param type to byQ8_0
instead ofIQ4_NL