[backport][rls-v3.8] cpu: x64: matmul: fix blocking heuristics for l2 set issues #3436

Merged (1 commit) on Jun 22, 2025

Conversation

yair-obodovsky
Contributor

Fixes a performance degradation related to this issue:
https://jira.devtools.intel.com/browse/MFDNN-13324
Backport of the original pull request:
#3403

@yair-obodovsky yair-obodovsky requested a review from a team as a code owner June 17, 2025 12:30
@github-actions github-actions bot added platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 backport labels Jun 17, 2025
@yair-obodovsky
Contributor Author

make test
set test_scope=NIGHTLY

// The following consts are correct for all platforms supporting AMX
constexpr size_t l2_ways = 16;
constexpr size_t l2_ways_threshold = size_t(l2_ways * 0.75);
constexpr size_t l2_sets = 2048;
Contributor

Isn't number_of_sets = cache_size / number_ways?
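
For reference, the usual relation for a set-associative cache also divides by the cache line size: sets = cache_size / (ways * line_size). The sketch below checks that relation at compile time. The 2 MiB L2 size and the 64-byte line size are assumptions chosen for illustration and do not come from this PR, and `num_sets` is a hypothetical helper, not a oneDNN function. Under those assumptions the result matches the hard-coded `l2_sets = 2048`.

```cpp
#include <cstddef>

// Hypothetical helper (not part of oneDNN): number of sets in a
// set-associative cache with the given geometry.
constexpr std::size_t num_sets(std::size_t cache_size_bytes, std::size_t ways,
        std::size_t line_size_bytes) {
    return cache_size_bytes / (ways * line_size_bytes);
}

// Assumed geometry for illustration: 2 MiB L2, 16 ways, 64-byte lines.
constexpr std::size_t assumed_l2_size = 2u * 1024u * 1024u;
constexpr std::size_t assumed_l2_ways = 16;
constexpr std::size_t assumed_line_size = 64;

// 2097152 / (16 * 64) == 2048 sets.
static_assert(num_sets(assumed_l2_size, assumed_l2_ways, assumed_line_size)
                == 2048,
        "assumed geometry should yield 2048 sets");

// Same truncating conversion as the constant under review: 16 * 0.75 == 12.
static_assert(std::size_t(assumed_l2_ways * 0.75) == 12,
        "threshold should be 12 ways");
```

Dividing by the number of ways alone would give the size of one way in bytes (128 KiB here), not the number of sets.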

Contributor Author

Fixed. This is implemented in two new functions in platform.cpp:
uint32_t get_num_ways_in_cache(int level)
uint32_t get_num_sets_in_cache(int level)
and the heuristics now use them.
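
The bodies of these functions are not shown in this thread. As a rough illustration of where such values can come from on x64, CPUID leaf 0x4 (deterministic cache parameters) reports the associativity, partition count, line size, and set count for each cache level, each stored as the value minus one. The sketch below only decodes raw EBX/ECX values and does not execute CPUID; `cache_geometry_t`, `decode_cpuid_leaf4`, and `cache_size_bytes` are hypothetical names, not the functions added in this PR.

```cpp
#include <cstdint>

// Hypothetical illustration only; not the platform.cpp implementation.
// Cache geometry as reported by CPUID leaf 0x4 for one cache level.
struct cache_geometry_t {
    std::uint32_t ways;
    std::uint32_t partitions;
    std::uint32_t line_size;
    std::uint32_t sets;
};

// Decode the raw EBX/ECX values returned by CPUID.(EAX=0x4, ECX=subleaf).
// Every field is encoded as (value - 1), hence the "+ 1" on each line.
constexpr cache_geometry_t decode_cpuid_leaf4(
        std::uint32_t ebx, std::uint32_t ecx) {
    return cache_geometry_t {
            ((ebx >> 22) & 0x3ffu) + 1, // EBX[31:22]: ways of associativity
            ((ebx >> 12) & 0x3ffu) + 1, // EBX[21:12]: physical line partitions
            (ebx & 0xfffu) + 1, // EBX[11:0]: coherency line size in bytes
            ecx + 1}; // ECX[31:0]: number of sets
}

// Total cache size is the product of the four decoded fields.
constexpr std::uint64_t cache_size_bytes(const cache_geometry_t &g) {
    return std::uint64_t(g.ways) * g.partitions * g.line_size * g.sets;
}
```

For example, raw values encoding 16 ways, 1 partition, 64-byte lines, and 2048 sets decode to a 2 MiB cache, which is consistent with the constants discussed above.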

Contributor

My bad, I missed that this was a backport. I guess you could just backport the original changes from main in this PR, and put these extra changes (querying cache ways in xbyak_utils) into main in a separate PR.

@yair-obodovsky yair-obodovsky requested a review from a team as a code owner June 18, 2025 12:38
@dzarukin dzarukin changed the title from "cpu: x64: matmul: fix blocking heuristics for l2 set issues" to "[backport][rls-v3.8] cpu: x64: matmul: fix blocking heuristics for l2 set issues" Jun 18, 2025
@yair-obodovsky
Contributor Author

make test
set test_scope=NIGHTLY

@yair-obodovsky yair-obodovsky merged commit 0887aec into rls-v3.8 Jun 22, 2025
34 of 40 checks passed
@yair-obodovsky yair-obodovsky deleted the yobodovs/l2-fix-port branch June 22, 2025 08:40
Labels
backport platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64

4 participants