[blockwise] GOI packing routine for qb4w #6373
cc @alankelly - here is the first PR to start building blockwise support, i.e. qb4w. I closed the draft PR #6030.
src/xnnpack/pack.h
Outdated
@@ -202,6 +202,38 @@ XNN_INTERNAL void xnn_pack_qs8_qc4w_gemm_goi_w(
size_t extra_bytes,
const struct xnn_qs8_qc4w_packing_params* params);

typedef void (*xnn_pack_qs8_qb4w_gemm_fn)(
size_t g,
Traditionally these parameters were undocumented and cryptic. Could you please break this tradition and give them better names?!
Updated. I feel nc/kc/nr/kr are quite well "baked" into the code, and we should document them somewhere, such as in the docs dir. That might be more meaningful for readability. I can put up a doc change tomorrow.
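For readers unfamiliar with the abbreviations in this thread, the sketch below shows one common reading of them: nc as the number of output channels, kc as the number of input channels (the reduction dimension), and nr/kr as the tile sizes along those two dimensions that a GEMM microkernel consumes. The struct and helper names are hypothetical and are not part of XNNPACK. The rounding shown is a simplification that ignores any other packing parameters.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical descriptive names for the abbreviations used in the packing
// code. These are illustrative only and do not exist in XNNPACK.
struct gemm_pack_shape {
  size_t output_channels;      // "nc": rows of the weight matrix
  size_t input_channels;       // "kc": reduction dimension
  size_t output_channel_tile;  // "nr": output channels handled per tile
  size_t input_channel_tile;   // "kr": input channels handled per step
};

// Round n up to the next multiple of q (q must be non-zero).
static size_t round_up(size_t n, size_t q) {
  assert(q != 0);
  return (n + q - 1) / q * q;
}

// Output channels after padding to a whole number of nr-sized tiles.
static size_t packed_output_channels(const struct gemm_pack_shape* s) {
  return round_up(s->output_channels, s->output_channel_tile);
}

// Input channels after padding to a whole number of kr-sized steps.
// Assumption: no additional shuffle factor is applied.
static size_t packed_input_channels(const struct gemm_pack_shape* s) {
  return round_up(s->input_channels, s->input_channel_tile);
}
```

With nc = 10, kc = 30, nr = 8 and kr = 4, this pads the packed buffer to 16 output channels and 32 input channels.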
@@ -930,6 +930,45 @@ void xnn_init_qs8_to_qs8_qc8w_scale_fp32_params(
}
}

void xnn_init_qs8_qb8w_scale_fp32_params(
Should this be named qb4w?
You may be coming at this from the point of view of consistency within qb*w, but I would argue it shouldn't actually matter, because the routine is independent of the weight bit width. If you look at qc4w, it uses the qc8w routine. I am not sure we will ever write qb8w; this is just an artifact from qc.
This function is independent of the quantization bit width, so qs8_qb8w is not needed. Maybe xnn_init_blockwise_scale_fp32_params?
Sounds good. Let me update.
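To illustrate why a blockwise scale routine need not depend on the weight bit width, here is a minimal sketch of blockwise dequantization. It assumes one float scale per block of `block_size` input channels for each output channel, stored row-major. All function names are hypothetical and do not come from XNNPACK. The 4-bit unpacking (low nibble first, zero point of 8) is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: index of the scale that covers input channel k of
// output channel n. Nothing here depends on how many bits a weight uses.
static size_t blockwise_scale_index(size_t n, size_t k, size_t kc,
                                    size_t block_size) {
  assert(block_size != 0);
  assert(kc % block_size == 0);  // assumption: kc is a multiple of block_size
  assert(k < kc);
  const size_t blocks_per_channel = kc / block_size;
  return n * blocks_per_channel + k / block_size;
}

// Dequantize an already-unpacked integer weight with its block's scale.
// The same code serves 4-bit and 8-bit weights once they are widened.
static float dequantize_blockwise(int32_t q, const float* scales, size_t n,
                                  size_t k, size_t kc, size_t block_size) {
  return (float) q * scales[blockwise_scale_index(n, k, kc, block_size)];
}

// Assumed 4-bit layout: two weights per byte, stored with a zero point of 8.
// Only this unpacking step is specific to the bit width.
static int32_t unpack_int4(uint8_t byte, int high) {
  const uint8_t nibble = high ? (uint8_t) (byte >> 4) : (uint8_t) (byte & 0x0F);
  return (int32_t) nibble - 8;
}
```

In this sketch only `unpack_int4` knows about the 4-bit layout, while the scale lookup and multiplication are shared. That matches the argument in the thread for a bit-width-neutral name.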
Thanks @alankelly - let me push another PR to fix these name changes and other small things.
#6434 is the follow-up PR.
Test
packing-test --gtest_filter="PACK_QD8_F32_QB4W_GEMM_GOI_W.*"