
ggml : fix MUL_MAT_ID repack with Q8_K #12544

Merged 2 commits into master on Mar 26, 2025
Conversation

ggerganov (Member):

fix #12528

  • The mul_mat_id path assumed Q8_0 type for src1
  • Fix the IQ4_NL param type to be Q8_0 instead of IQ4_NL
  • Fix code indentation
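To make the first two bullets concrete, the fix pairs each repacked weight type with the activation quantization format its kernels expect, instead of always assuming Q8_0. The following is a minimal sketch of that mapping with stand-in enum values and a hypothetical `param_type_for` helper; it mirrors the pairings named in this PR but is not the actual ggml code:

```cpp
#include <cassert>

// Stand-ins for the ggml type enum; names mirror the PR, values are illustrative.
enum ggml_type { GGML_TYPE_Q8_0, GGML_TYPE_Q8_K, GGML_TYPE_Q4_K, GGML_TYPE_IQ4_NL };

// The bug: mul_mat_id always quantized src1 as Q8_0. After the fix, the
// activation format follows the repacked weight type's PARAM_TYPE:
// q4_K_8x8 kernels consume Q8_K activations, iq4_nl_4x4 kernels consume Q8_0.
constexpr ggml_type param_type_for(ggml_type weight_type) {
    switch (weight_type) {
        case GGML_TYPE_Q4_K:   return GGML_TYPE_Q8_K;
        case GGML_TYPE_IQ4_NL: return GGML_TYPE_Q8_0;
        default:               return GGML_TYPE_Q8_0;
    }
}
```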

@ggerganov (Member Author):

@Djip007 Could you take a look at these fixes?

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Mar 24, 2025
@Djip007 (Contributor) left a comment:
a very quick review...

    for (int64_t i11 = ith * 4; i11 < ne11 - ne11 % 4; i11 += nth * 4) {
        quantize_mat_q8_K((float *) ((char *) src1->data + i11 * nb11), (void *) (wdata + i11 * nbw1), 4, ne10, INTER_SIZE);
    }
} else {
    GGML_ASSERT(PARAM_TYPE == GGML_TYPE_Q8_0);
@Djip007 (Contributor):

For C++, in this case I would use

        if constexpr (PARAM_TYPE == GGML_TYPE_Q8_0) { ... }
        if constexpr (PARAM_TYPE == GGML_TYPE_Q8_K) { ... }

and, if these are the only two possibilities, maybe also a

  static_assert((PARAM_TYPE == GGML_TYPE_Q8_K) || (PARAM_TYPE == GGML_TYPE_Q8_0), "comment");

but it could also be trapped by adding PARAM_TYPE as a template parameter to gem[v/m].
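The `if constexpr` plus `static_assert` pattern suggested above can be sketched in a self-contained form. The quantizers here are hypothetical stand-ins for `quantize_mat_q8_0`/`quantize_mat_q8_K` (reduced to call counters so the dispatch is observable), not the real ggml functions:

```cpp
#include <cassert>

enum ggml_type { GGML_TYPE_Q8_0, GGML_TYPE_Q8_K };

// Stand-ins for the per-type quantizers; counters record which one ran.
static int quantize_mat_q8_0_calls = 0;
static int quantize_mat_q8_K_calls = 0;
static void quantize_mat_q8_0() { ++quantize_mat_q8_0_calls; }
static void quantize_mat_q8_K() { ++quantize_mat_q8_K_calls; }

template <ggml_type PARAM_TYPE>
void quantize_mat() {
    // Any PARAM_TYPE other than the two supported ones fails to compile.
    static_assert(PARAM_TYPE == GGML_TYPE_Q8_0 || PARAM_TYPE == GGML_TYPE_Q8_K,
                  "unsupported PARAM_TYPE");
    // Only the matching branch is instantiated; the other is discarded.
    if constexpr (PARAM_TYPE == GGML_TYPE_Q8_0) {
        quantize_mat_q8_0();
    }
    if constexpr (PARAM_TYPE == GGML_TYPE_Q8_K) {
        quantize_mat_q8_K();
    }
}
```

Because the mismatched branch is discarded at compile time, a wrong pairing like the one this PR fixes becomes a build error rather than a runtime bug.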

@Djip007 (Contributor):

Maybe remove the if and transform quantize_mat_q8_K/quantize_mat_q8_0 into a template quantize_mat<PARAM_TYPE>:

template <ggml_type PARAM_TYPE>
void quantize_mat(...);

template <> void quantize_mat<GGML_TYPE_Q8_0>(...) {
    quantize_mat_q8_0(...);  // or "inline" it.
}
template <> void quantize_mat<GGML_TYPE_Q8_K>(...) {
    quantize_mat_q8_K(...);
}
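The explicit-specialization alternative can be sketched as a compilable example. The quantizers below are hypothetical stand-ins (returning a tag string so the dispatch is visible), not the real ggml functions:

```cpp
#include <cassert>
#include <string>

enum ggml_type { GGML_TYPE_Q8_0, GGML_TYPE_Q8_K };

// Stand-ins for the existing non-template quantizers.
static std::string quantize_mat_q8_0() { return "q8_0"; }
static std::string quantize_mat_q8_K() { return "q8_K"; }

// Primary template is declared but never defined: using an unsupported
// PARAM_TYPE fails at link time instead of silently running the wrong path.
template <ggml_type PARAM_TYPE>
std::string quantize_mat();

template <> std::string quantize_mat<GGML_TYPE_Q8_0>() { return quantize_mat_q8_0(); }
template <> std::string quantize_mat<GGML_TYPE_Q8_K>() { return quantize_mat_q8_K(); }
```

Compared with the `if constexpr` version, this keeps each quantizer in its own specialization and removes the runtime branch entirely.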

@@ -5578,7 +5579,7 @@
 static const tensor_traits<block_q4_0, 8, 8, GGML_TYPE_Q8_0> q4_0_8x8_q8_0;
 static const tensor_traits<block_q4_K, 8, 8, GGML_TYPE_Q8_K> q4_K_8x8_q8_K;

 // instance for IQ4
-static const tensor_traits<block_iq4_nl, 4, 4, GGML_TYPE_IQ4_NL> iq4_nl_4x4_q8_0;
+static const tensor_traits<block_iq4_nl, 4, 4, GGML_TYPE_Q8_0> iq4_nl_4x4_q8_0;
@Djip007 (Contributor):

With the static_assert this would be caught at build time.

@@ -5295,8 +5295,7 @@ template <> void gemv<block_q4_K, 8, 8>(int n, float * s, size_t bs, const void
     ggml_gemv_q4_K_8x8_q8_K(n, s, bs, vx, vy, nr, nc);
 }

-template <>
-void gemv<block_iq4_nl, 4, 4>(int n, float * s, size_t bs, const void * vx, const void * vy, int nr, int nc) {
+template <> void gemv<block_iq4_nl, 4, 4>(int n, float * s, size_t bs, const void * vx, const void * vy, int nr, int nc) {
@Djip007 (Contributor, Mar 24, 2025):

Maybe also add PARAM_TYPE to this template:

template <typename BLOC_TYPE, int64_t INTER_SIZE, int64_t NB_COLS, ggml_type PARAM_TYPE>
void gem[m,v]....
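The suggestion above can be sketched in compilable form. The block types and the integer return tags are hypothetical placeholders (the real gemv/gemm take the full runtime argument list shown in the diff); the point is only the shape of the template parameter list:

```cpp
#include <cassert>
#include <cstdint>

enum ggml_type { GGML_TYPE_Q8_0, GGML_TYPE_Q8_K };

// Placeholder weight-block types standing in for the ggml block structs.
struct block_iq4_nl {};
struct block_q4_K {};

// With PARAM_TYPE alongside BLOC_TYPE/INTER_SIZE/NB_COLS, each
// (weight, activation) pairing is a distinct specialization, and a
// mismatched pairing simply has no definition, so it cannot link.
template <typename BLOC_TYPE, int64_t INTER_SIZE, int64_t NB_COLS, ggml_type PARAM_TYPE>
int gemv();

template <> int gemv<block_iq4_nl, 4, 4, GGML_TYPE_Q8_0>() { return 0; } // iq4_nl_4x4 pairs with Q8_0
template <> int gemv<block_q4_K, 8, 8, GGML_TYPE_Q8_K>()   { return 1; } // q4_K_8x8 pairs with Q8_K
```

Under this scheme a typo such as `tensor_traits<block_iq4_nl, 4, 4, GGML_TYPE_IQ4_NL>` (the one this PR fixes) would select a nonexistent specialization and fail the build.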

@ggerganov (Member Author):

@Djip007 Thanks. Let me know if you have any other suggestions.

@ggerganov ggerganov merged commit 5ed38b6 into master Mar 26, 2025
55 of 56 checks passed
@ggerganov (Member Author):

Merging this for now and happy to hear if you have any additional suggestions.

Successfully merging this pull request may close these issues:

Eval bug: Program not working properly due to new features of "repack Q4_K tensor"