Improve cache localization in default_blasmul #130

Merged: 4 commits merged into master from cachelocdefblasmul on May 2, 2023

@jishnub (Member) commented Apr 29, 2023

The loop order was cache-unfriendly, but it was written that way because the range of the row variable may depend on the column variable. However, a quick check can determine whether the row range is actually independent of the column variable, in which case the two loops can be swapped so that the iteration proceeds in a cache-friendly order. This branching improves performance in banded * dense matrix multiplication. The original behavior is retained in the fallback branch.

julia> B = brand(4000, 4000, 5, 5);

julia> X = rand(4000, 4000);

julia> @btime $B * $X;
  977.570 ms (2 allocations: 122.07 MiB) # master
  518.942 ms (2 allocations: 122.07 MiB) # PR
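
For readers who want to see the idea in isolation, here is a minimal sketch of the branching described above. It is not the code in this PR: `bandrows`, `mul_k_outer!`, `mul_j_outer!`, `bandedmul!` and the `jrange` callback are hypothetical names, and plain dense arrays stand in for the banded type. The assumption is that `jrange(k)` returns the columns of `B` to visit for a given `k`; when that range is the same for every `k`, the `j` loop can safely move outermost so that each column of `C` is filled in one pass.

```julia
# Hypothetical helper: the rows of column k that lie inside the band of a
# matrix with l sub-diagonals and u super-diagonals.
bandrows(m, k, l, u) = max(1, k - u):min(m, k + l)

# Fallback order: k is outermost because the j-range may depend on k.
# For a fixed k, consecutive j values jump between columns of B and C.
function mul_k_outer!(C, A, B, l, u, jrange)
    fill!(C, zero(eltype(C)))
    m = size(A, 1)
    for k in axes(A, 2), j in jrange(k)
        Bkj = B[k, j]
        for i in bandrows(m, k, l, u)
            C[i, j] += A[i, k] * Bkj
        end
    end
    return C
end

# Swapped order: j is outermost, so one column of B and one column of C
# are traversed at a time. Only valid when the j-range is the same for all k.
function mul_j_outer!(C, A, B, l, u, js)
    fill!(C, zero(eltype(C)))
    m = size(A, 1)
    for j in js, k in axes(A, 2)
        Bkj = B[k, j]
        for i in bandrows(m, k, l, u)
            C[i, j] += A[i, k] * Bkj
        end
    end
    return C
end

# Quick check: use the swapped order only if jrange(k) does not depend on k.
function bandedmul!(C, A, B, l, u, jrange = k -> axes(B, 2))
    ks = axes(A, 2)
    js = jrange(first(ks))
    if all(k -> jrange(k) == js, ks)
        mul_j_outer!(C, A, B, l, u, js)
    else
        mul_k_outer!(C, A, B, l, u, jrange)
    end
end

# Small usage example: a 6x6 matrix with bandwidths (2, 2) times a dense 6x4 matrix.
A = [(-2 <= i - k <= 2) ? Float64(i + k) : 0.0 for i in 1:6, k in 1:6]
B = [Float64(k - j) for k in 1:6, j in 1:4]
C = bandedmul!(zeros(6, 4), A, B, 2, 2)
```

Both branches compute the same product; the check costs one pass over the `k` indices, which is small compared with the multiplication itself.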

codecov bot commented Apr 29, 2023

Codecov Report

Patch coverage: 93.33% and project coverage change: +2.10 🎉

Comparison is base (0a9a57c) 86.24% compared to head (b0e8569) 88.34%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #130      +/-   ##
==========================================
+ Coverage   86.24%   88.34%   +2.10%     
==========================================
  Files          10       10              
  Lines        1708     1716       +8     
==========================================
+ Hits         1473     1516      +43     
+ Misses        235      200      -35     
Impacted Files Coverage Δ
src/muladd.jl 75.29% <93.33%> (+0.81%) ⬆️

... and 6 files with indirect coverage changes


@dlfivefifty dlfivefifty merged commit 536d21b into master May 2, 2023
@jishnub jishnub deleted the cachelocdefblasmul branch May 3, 2023 02:06