speed up periodic boundary coupling #34
@chmerdon what do you think? Seems to cause no problems so far, so I would remove the |
I put the wrap of the point evaluator outside the loop which improved the situation further: |
Nice. We should still tackle the allocation problem.
Yes. So far, interpolate was expected to be used for all elements at once; here it is called in a loop to interpolate on each single boundary face. Maybe we can reduce the number of interpolate calls by treating several boundary faces at once that have non-overlapping supports on the other side? Or we need to find a way to rewrite the interpolate code so that variables like edgemoments are reused...
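A minimal sketch of the reuse idea, not the actual interpolate code: the function names and the squared-values "moment" are hypothetical stand-ins for whatever per-face scratch data (like edgemoments) the real routine builds. The first version allocates a work vector for every face, the second allocates one vector up front and works on a view of it.

```julia
# Hypothetical stand-in for per-face work that needs scratch storage.
# Version 1: a fresh scratch vector is allocated for every face.
function sum_moments_alloc(faces)
    total = 0.0
    for vals in faces
        work = zeros(length(vals))   # new allocation in each iteration
        work .= vals .^ 2
        total += sum(work)
    end
    return total
end

# Version 2: one scratch vector, sized for the largest face, is reused.
function sum_moments_reuse(faces)
    work = zeros(maximum(length, faces))   # allocated once
    total = 0.0
    for vals in faces
        w = view(work, 1:length(vals))     # view into the shared buffer
        w .= vals .^ 2                     # in-place broadcast
        total += sum(w)
    end
    return total
end

# Toy data: four "faces" with 2 to 5 values each.
faces = [collect(1.0:k) for k in 2:5]
r_alloc = sum_moments_alloc(faces)
r_reuse = sum_moments_reuse(faces)
```

Both versions return the same value; wrapping each call in `@allocated` (after a first warm-up call) is one way to compare their memory behaviour.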
Got rid of some more minor allocations that brought also a little speedup... I am now at |
Great. I removed the |
7190faa to 5cab4f3
5cab4f3 to d357811
I accidentally ran another formatter before I remembered to run the pre-commit; maybe that caused it. At least I don't see a need for these commas and whitespaces myself ^^
There shall be no other formatter than Runic 😆
4de4064 to 01d8d29
By precomputing suitable search areas on the target grid boundary region.
Compute for each face the bounding box, transfer it to the opposite side and check intersections with the bounding boxes of the faces there.
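The search described above could look roughly like the following sketch. It is not the code of this PR: `face_bbox`, `boxes_intersect`, `candidate_faces` and the out-of-place `transfer` map are hypothetical names, the faces are plain lists of node indices, and the tolerance is an assumed value.

```julia
# coords: d x nnodes matrix of node coordinates (assumed layout).
# A face is a vector of node indices.

# Axis-aligned bounding box of a face as (lower corner, upper corner).
function face_bbox(coords::AbstractMatrix, nodes)
    lo = vec(minimum(view(coords, :, nodes); dims = 2))
    hi = vec(maximum(view(coords, :, nodes); dims = 2))
    return lo, hi
end

# Two boxes intersect if their intervals overlap in every dimension.
boxes_intersect(a, b; tol = 1.0e-12) =
    all(a[1] .<= b[2] .+ tol) && all(b[1] .<= a[2] .+ tol)

# For each source face, collect the target faces whose bounding box
# intersects the bounding box of the transferred source face.
function candidate_faces(coords, faces_from, faces_to, transfer)
    boxes_to = [face_bbox(coords, f) for f in faces_to]
    candidates = Vector{Vector{Int}}(undef, length(faces_from))
    for (i, f) in enumerate(faces_from)
        # map every node of the face to the opposite side, then box the images
        moved = reduce(hcat, [transfer(coords[:, n]) for n in f])
        box = (vec(minimum(moved; dims = 2)), vec(maximum(moved; dims = 2)))
        candidates[i] = [j for (j, b) in enumerate(boxes_to) if boxes_intersect(box, b)]
    end
    return candidates
end

# Toy example: unit square, left boundary (x = 0) coupled to right boundary (x = 1).
ys = [0.0, 1 / 3, 2 / 3, 1.0]
coords = hcat([[0.0, y] for y in ys]..., [[1.0, y] for y in ys]...)
faces_left = [[1, 2], [2, 3], [3, 4]]
faces_right = [[5, 6], [6, 7], [7, 8]]
cands = candidate_faces(coords, faces_left, faces_right, x -> x .+ [1.0, 0.0])
```

In this toy setup each left face keeps only its matching right face and the direct neighbours that touch it, so the expensive per-face work only has to look at those candidates instead of the whole opposite boundary.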
A not so trivial benchmark:
Then,
@time A = get_periodic_coupling_matrix(FES, xgrid, 4, 2, give_opposite!, sparsity_tol = 1.0e-8, heuristic_search = true)
takes about 31 seconds,
while the current method
@time A = get_periodic_coupling_matrix(FES, xgrid, 4, 2, give_opposite!, sparsity_tol = 1.0e-8, heuristic_search = false)
takes about 5 minutes!
The switch heuristic_search should probably be removed when this improvement is accepted.
Remark: the allocations are crazy (independent of the switch):
31.423613 seconds (36.58 M allocations: 31.542 GiB, 5.39% gc time)
... we should look into this.