-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert some particle performance regressions #4249
Revert some particle performance regressions #4249
Conversation
What if you only used the new code path if (i) deal.II 9.3 is used, and (ii) some |
Are you seeing that |
Yes, it takes more time. |
Ok, can you please make a minimal test that shows the problem (just add 20 or so fields and time the difference) for FEEvaluation? |
Sure, I will do that. |
4f030be
to
b661b68
Compare
Ok, I found the culprit (it was me not remembering something about FEPointEvaluation). But first: @tjhei and @zjiaqi2018: If you use FEPointEvaluation with more than one base element (e.g. a FESystem like in ASPECT) and you ask it to evaluate more than one base element it will default back to using FEValues internally (and generate a new FEValues for each cell). Is that what you observed? In #4060 I worked around this by using one FEPointEvaluation per base element and only use the components of that base element, but that of course duplicates the mapping information (still for the particles it is worth it). But I remembered the problem with #4060: FEPointEvaluation is currently only optimized for MappingQGeneric, otherwise it also defaults to creating individual FEValues per cell. I changed the PR to check the mapping and only use the new function if MappingQGeneric is used. For ASPECT that means in practice all models with curved cells (spherical shell, chunk, ellipsoidal chunk, sphere) that do not use mesh deformation (because then we use MappingQ1Eulerian). I profiled the PR with both my spherical benchmark and @erinheilman's box model, and it should now be always faster or the same speed as before #4060 was merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
It turns out the changes in #4060 only speed up things under very specific conditions like very many particles per cell >>100 and few compositional fields. Unfortunately I did not benchmark more realistic models, where they seem to perform much worse than before and those I did benchmark I combined with #4062, which more or less offsets the changes to the advection part.
I think the performance can be fixed with improvements to the FEPointEvaluation class in deal.II, in particular to teach it how to handle FESystems efficiently, currently we recalculate mapping related data for each base element separately. However, these changes would still need to be made and benchmarked so the changes only make sense starting with deal.II 10 at the earliest.
I would like to disable #4060 for now to avoid slowing everyone's particle models, but I understand that we do not want to keep disabled code around. If you prefer I can also revert #4060 and open a new PR in the future that introduces it when it is faster. Opinions?