-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve performance of apply!
be rewriting the part that zero out rows
#489
Conversation
22a73b6
to
1c883e7
Compare
Could someone just verify the performance difference. |
julia> @btime apply_zero!(K, f, ch) #master
340.741 ms (16 allocations: 423.98 MiB)
julia> @btime apply_zero!(K, f, ch) #kc/row_zero
408.249 ms (4 allocations: 544 bytes) |
498.754 ms (16 allocations: 423.98 MiB) # master
665.802 ms (4 allocations: 544 bytes) # kc/row_zero
145.603 ms (4 allocations: 544 bytes) # kam/both_zero Perhaps you already discussed why not storing the indices of The downside is that creating the constraint handler is slower, 138.396 ms (120174 allocations: 77.98 MiB) # master
143.173 ms (120163 allocations: 80.44 MiB) # kc/row_zero
713.404 ms (160598 allocations: 107.49 MiB) # kam/both_zero (with `close!(ch, K)`)
1.824 s (160622 allocations: 2.96 GiB) # kam/both_zero (with `close!(ch)`) (For the close!(ch) about 1 GB of allocations and 0.3 s can be saved by filling with a Singelton type instead of Float64) |
Then you are storing approx the equivalent memory of half a stiffness matrix there which maybe is too much. |
It is interesting we get quite different benchmarks. On my desktop with a pretty beefy CPU (i9-12900K) I get:
If I slap a
|
Perhaps I misunderstand here, but wouldn't it be (on average) |
Yeah, I'm wrong, you only need the mapping for the constrained dofs (of course). Caching this mapping in the constraint handler makes sense to me I think, it would just move a part of the logic in the new function here to there. I guess the only drawback is that we typically do not send in the stiffness matrix to the constraint handler when it is created... |
I used my laptop with 12th Gen i5-1240p but I can retry with threads If you want |
Probably stupid question, but if the performance of applying (affine) constraints is of concern, why don't we apply the constraints on element-level as in deal.ii? (see e.g. https://www.dealii.org/current/doxygen/deal.II/classAffineConstraints.html#a373fbdacd8c486e675b8d2bff8943192 and https://www.dealii.org/current/doxygen/deal.II/step_27.html#Creatingthesparsitypattern) |
Why not both :D I think this PR can be merged, even though it seems to be a bit slower on @koehlerson benchmarks. |
|
I should probably be get rid of the identity hashing thing. It is a bit unclear if that is optimizing for the specific benchmark. |
This patch improves the performance of apply! and apply_zero! by rewriting the part that zero out rows of the matrix. As a result, the `strategy` keyword argument is obsolete and thus ignored. Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com> Co-authored-by: Fredrik Ekre <ekrefredrik@gmail.com>
1c883e7
to
2c261c0
Compare
Codecov ReportBase: 92.20% // Head: 92.28% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #489 +/- ##
==========================================
+ Coverage 92.20% 92.28% +0.07%
==========================================
Files 22 22
Lines 3783 3783
==========================================
+ Hits 3488 3491 +3
+ Misses 295 292 -3
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
ApplyStrategy was deprecated in #489, but kept for backwards compat (having no effect). This PR removes them completely.
As a consequence the
strategy
kwarg is no longer useful.Benchmark code:
Time is approx the same but we no longer need to keep two copies of the stiffness matrix in memory at the same time.