New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature p4est find partition - Find MPI ranks of point owners in distributed meshes #12341
Conversation
In case you need it: If you just need to find the MPI rank a cell belongs to, you can use |
Thanks for the hint @marcfehling. This feature is slightly different. It finds the rank of the cell that owns an arbitrary point (if there is such a cell) on each rank, meaning even when the rank querying does not have (all) info about other parts of the triangulation. I will try to be quick to come up with tests. |
Hi @tjhei, as I promised, here the new feature. I guess it works, at least it does so on meshes that have the default manifold attached. (Where can I check that?) I will be happy about comments and hints. I have removed draft flag and turned it into a PR. The project is also ready to test now. @bangerth maybe this PR is of relevance to you, too. |
I have a feeling like this could be done in a simpler way using the functionality that sits inside |
Hi @bangerth I was not aware of this function but as far as I understand this PR here introduces a slightly different tool. It seems that I guess you refer to #PR5414 and the also related #PR5458. FYI, @cburstedde Maybe a question where you could help: The tests fail in the release config but not in debug. I have the clue that this might be me doing something wrong with the sc_array. |
Hi @konsim83, your understanding of the functionality is pretty much spot-on. The What exactly is the fail you're getting? Please feel free to isolate / discuss in the p4est repository. |
It would be great if a Deal.II maintainer would add the "ready to test" label, so I can be confident that the bug I am chasing does not come from my installation of p4est. |
There you go :-) /rebuild |
Btw, am I as a contributor able to set labes like "documentation" or "bug" myself or upon request? |
Nope ;) |
This should be defined in the header |
Looks like at least version 2.2 is required (https://github.com/cburstedde/p4est/blob/v2.1/src/p4est_search.h vs. https://github.com/cburstedde/p4est/blob/v2.2/src/p4est_search.h). |
I see, seems like my local version is newer. But that means even if I remove the bug the CI has no chance to pass. |
+ // p4est for some reason always needs double vxyz[3] as last argument to
+ // quadrant_coord_to_vertex
Yes, the in-out argument is double vxyz[3] on purpose, even in 2D.
The memory leak I am chasing occurs in these lines most likely in the function `internal::p4est::functions<dim>::quadrant_coord_to_vertex(...)` which is just a `typedef` for `p4est_qcoord_to_vertex(...)` of the `p8est_...`-version in 3D. I found this with some caveman debugging using `std::cout` .
Am I doing something obviously wrong here? I do not find it and feel a bit blind, to be honest. Any help would be appreciated.
Not all connectivities have vertices, but it may be that the ones used by
deal.II do. Without vertices, the function will run into an assertion.
Just make sure the in-out argument vxyz has 3 entries.
|
`/jenkins/workspace/dealii_PR-12341/include/deal.II/distributed/p4est_wrappers.h:99:41: error: ‘p4est_search_partition_t’ does not name a type
using search_partition_callback = p4est_search_partition_t;`
This should be defined in the header `p4est_search.h`.
I you do not want to check for versions, you can query the P4EST_SEARCH_LOCAL
#define in p4est_base.h. When defined, search_partition should be available.
|
If you are running into issues in release mode, it could be because we don't set vertex positions in p4est in release mode, only in debug mode. |
Any chance to compute the vertices if they are not set? |
Thanks for the hint @tjhei , should have read the code more thoroughly. I guess you mean these lines https://github.com/dealii/dealii/blob/master/source/distributed/tria.cc#L1802 which orccur in a similar way at several places in this file. Is it critical not to set the vertices in release mode? |
I have to admit that we did this a long time ago without checking memory and performance implications. As this information is not used anywhere, we decided not to provide it. |
Would it make sense if I compared the performance in terms of timing and memory in some tests, maybe on a small cluster? Another option would be to introduce another |
> I have to admit that we did this a long time ago without checking memory and performance implications. As this information is not used anywhere, we decided not to provide it.
This policy makes a lot of sense imo, since it forces both deal.II and
application developers to handle geometry the deal.II way only.
Would it make sense if I compared the performance in terms of timing and memory in some tests, maybe on a small cluster? Another option would be to introduce another `#define P4EST_USE_VERTICES_RELEASE_MODE` or so that could be set in the config phase. What is your opinion?
I don't know much about deal.II geometries, but since most applications would
use deal.II specific geometry code, it appears least redundant to me to use
such code in the new functionality and not to rely on p4est-internal vertices.
The deal.II way will allow us to stop maintaining vertex logic in the future.
|
@tjhei , I checked it. The cause for the failure in release mode is exactly this. As far as I understand the coarsest mesh is available on each processor, right? Is there any simple way to compute the vertex information from this efficiently? (maybe for the beginning for attached default manifolds for now) - sorry if this question is dumb but I am not familiar with all internals here. What would be a good way to proceed with this then. In particular keeping @cburstedde 's last comment in mind. |
You need the vertices to answer if a point is within a quadrant, right? Would it be possible to use the deal.II functionality to answer the question instead? |
Sorry, @tjhei, for the late reply. @cburstedde Please correct me if I am wrong with the following. Yes, to locate a point you need the vertices. Please note that the p4est algorithm is free of any communication. But you can not hold all points of the mesh on one processor of course. According to my understanding of p4est (and I am for sure not an expert on it) locating a point is possible since each MPI process knows about the physical location of its first quadrant. This gives you the partition boundaries on the space filling curve and you can exclude entire trees from the search when going top-down from coarse to fine. Once you refine the mesh you may need to rebalance and this invokes communicating a new location of the first quadrant of each process. I did not find any deal.ii function that does that, right? My feeling here is that having the vertices in p4est as in the debug config may be redundant but it is not a big overhead and can enable users to find arbitrary points (or more general objects through modifying the callback wrappers) in the partition without communication during the search. I also have the feeling doing it directly in deal.ii would only reimplement what is already available in p4est. I, personally find these searches really useful since you can minimize communication (in particular no bad all-to-all communication is necessary) in semi-Lagrangian simulations which are a big deal in climate and wheather related applications. This could even lift it on another level since you do not need ghost layers and hence the CFL can be larger than in state-of-the arts algorithms. My suggestion would be to take into account vertices in release mode. The change is minor and the additional memory overhead caused by this little redundancy is not too big. It will still hide the p4est backend from the deal.ii user. Also keep in mind that the bounding box based solution currently implemented does not scale very well with the number of processes. Opinions on this? Last remark: A case that I do not fully understand is the case when a manifold other than the default manifold is attached, e.g., CAD based geometry or so. Are vertices after applying the manifold projection (after refinement) communicated to p4est? As far as I understand yes which would be good. Best regards and thanks for your patience, |
Right, I think this p4est internal geometry handling should be hidden from the user. But I do think that the way
In deal.ii's debug mode you still have the vertex info. Having it also in release is redundant, yes, but without it one would need to reimplement the search in deal.ii terms and that would make a lot more work I guess. |
Whatever works for you! If you feel like using the vertices and p4est_qcoord_to_vertex, that is certainly fine by me. |
We only ever communicate the coarse mesh vertices to p4est. If you have a manifold attached, new vertices when refining will be moved to conform to the manifold. p4est does not know about any of that. If you need to go this way, I would suggest to add a |
Wrt. the earlier question of @peterrum, for local leaves and entirely local branches, the partition search returns exactly one owner. For higher branches in the tree, we return a range of processes from pfirst to plast. This may happen to be a single process or more than one. For example, one process is returned for a branch that is on exactly one remote process. |
To add to this, users are free to return true for a point in multiple quadrants. Accordingly, they may get multiple owner processes for any given point, local or not. |
@peterrum The function that I implemented is only the first step toward what you refer to in your comments (which name important applications of this new interface). Currently the new function is implemented through an interface that returns exactly one process per point. The implementation of the C-callback responsible for the decision on what you return is flexible as @cburstedde mentioned and can be extended to, e.g., a DG usecase. IMHO, incorporating this function or a modified version of its internals into, e.g., the @marcfehling, @tjhei and @peterrum: Many thanks for the reviews. I will take care of incorporating your comments :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I incorporated the comments. What will we do with the version of p4est? The CI does not run the tests since the availability of search functions is not given.
@tjhei and @marcfehling I added an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the empty output
file manually using the GitHub interface.
Would you still incorporate what you reverted in 7952529, but fix it with what I suggested in #12341 (review)? In a follow-up PR (not this one), we can then change the behavior of deal.II and remove the distinction between debug and release mode.
Then this PR is good to be merged in my opinion. We should address @peterrum's important points #12341 (review) in a follow-up PR as you suggested in #12341 (comment).
In terms of CI: we either need to kindly ask @tjhei to update p4est, or rely on the word of someone else that the testsuite passes on a manual run. Since I already have your branch on my workstation (while I tried to fix that static_cast
-enum
problem), I can do that with not much of an overhead.
(EDIT: I think I got you wrong in the first time. Not the whole testsuite is run by the CI worker. However, we need to skip these tests, when P4EST_SEARCH_LOCAL
is not set. Hmm, we need to tell cmake
about it like in #9266. Hmm, I don't know straight away how to do it.)
(EDIT EDIT: Or we add an ifdef P4EST_SEARCH_LOCAL
in these tests and provide an empty alternative output file. This would be the cheap solution.)
All the 3D tests fail on my workstation with
@konsim83 -- Can you investigate? |
@marcfehling thank you for the hint, I need to test it with another version anyway. I am on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorporated suggestions.
The Now we need to still figure out a way how to hide your tests for incompatible versions of p4est, i.e. if we run the testsuite with We discussed ideas for this in #12341 (review). I think the cheapest solution with alternative output files are the way to go here. I would not rewrite the |
@konsim83 -- Could you review and merge the two open PRs against this feature branch? |
Sorry, was on my list, then forgot. It is merged now. Many thanks for the work! :-) Le me know when everything is ready so I can squash my commits. |
@marcfehling The windows 2019 check fails and I do not see why. Any idea? |
Just a hickup of |
582b1e2
to
9791aa1
Compare
OK, squashed everything. Actual use cases of this function are about to follow. We already implemented then and now would like to share them with the community :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
There are just two things that we should address in future pull requests as this one is already large enough:
- Do no longer communicate vertices to
p4est
in debug mode in general, but use the newSettings
parameter.
I can take care of this as a follow-up.
As discussed in Feature p4est find partition - Find MPI ranks of point owners in distributed meshes #12341 (review). - Reconsider interface to match
GridTools::distributed_compute_point_locations()
.
Maybe even come up with a function that calls thep4est
specialization forp:d:Tria
, and theGridTools
variant for other Triangulation types (similar toGridTools::get_subdomain_association()
).
Any volunteers? :-)
As discussed in Feature p4est find partition - Find MPI ranks of point owners in distributed meshes #12341 (review).
I looked over this PR for quite some time, and feel like I'm less likely to find anything else. I would prefer someone else looking over this PR one more time before merging.
many thanks @marcfehling for your valuable comments and addons. Thumbs up! @bangerth @tjhei @peterrum @cburstedde or oanyone else that I forgot to mention have additional comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks to me ready to be merged! Very nice; thanks for the efforts! I will merge tomorrow in the morning CEST, to give others the chance to express their opinions. The changes a very much localized in p:d:T and changes do not affect the rest of the code. A better integration of the introduced in the rest of the library would be very nice but need the join involvement of more people!
@konsim83 @cburstedde Thanks for the explanations and sorry for the silence - vacations and busy days since than! I remember reading Parallel tree algorithms for AMR and non-standard data access
and Parallel level-set methods on adaptive tree-based grids
quite a while; it is nice to see algorithms from there now also in deal.II!
To open the gate for massively parallel semi-Lagrangian simulations (with high Courant numbers) one may need to trace back points across partition boundaries of distributed meshes. These points may be located in cells outside the ghost layer of a process or even outside the mesh.
The purpose of this feature is to introduce a first function that can find the rank of the MPI process that owns the cell in which the point is located. This is done through interfacing p4est's
p4est_search_partition(...)
function with appropriate callbacks. Mathematically this amounts to a search across all trees and in each tree through the Morton curve utilizing meta information communicated throughout the forest.