Maximum matrix sizes #4268
Are you using the "performance" examples, or is this the regular `ex1p`?
I have tried with both, and I'm having issues in either case. Edit: but just to clarify, the particular example I gave with the 17 million unknowns was the regular `ex1p`.
Can you step through in a debugger to see what is going on? My suspicion, if I understand correctly that the 17 million dof case is run on a single MPI rank, is that there is a quadrature data array being allocated (for example, for the element Jacobians) which might be larger than 2B entries, and you are overflowing on the index. This happens for 3D Jacobians when …
Can you provide some more information, e.g. which initial mesh you are using, and how many MPI ranks? If I understand your configuration correctly, the serial mesh is refined until it has 10,000 elements, then it is partitioned, and 7 parallel refinements are performed. For a 2D quad mesh, this will result in 10,000 × 4^7 ≈ 164 million elements.

Also, for better parallel load balancing, it is probably better to refine the mesh as much as possible in serial (before running out of memory on a single rank), and only then partition and switch to parallel refinements. The parallel refinements do not do any repartitioning or load balancing.
@Heinrich-BR, can you post the exact command line you use (and any modifications you made to `ex1p.cpp`)?
Hi everyone! Thank you for your support! Let me try to answer everything.
That's very interesting, @sebastiangrimberg, and it does sound very likely. I've stepped through with a debugger before, which is how I found where the issue was happening to start with, but I'll try again and keep an eye out for the quadrature data array.
I am using the …
Not exactly: I set the number of serial refinements to -1 (i.e. no serial refinement at all), so all of the refinement is parallel. I don't think it makes any difference whether the refinement is serial or parallel in the 1-MPI-rank case anyway. The important part is that, starting from the original mesh, there were 7 refinements in total.
Of course! I'm using the latest version of MFEM (as of this week), so commit …
With this, I ran the example with the command …
Hopefully this helps you reproduce the error! Thank you everyone for your support, and have a great weekend!
Just as an update regarding this, I've looked into it with a debugger again and retrieved some numbers:
Given this is almost twice as much as `INT_MAX`, the 32-bit index overflow suspected above seems plausible.

But of course, I'm just speculating here in case this is indeed the issue. It's quite possible I've missed something. Let me know what you find!
Hello MFEM developers,
I'm testing out MFEM's scaling on large clusters, and to that end I'm pushing some of the examples to see how big they can be made before they break, simply by refining the mesh in parallel further and further. However, I'm noticing that, in general, they stop working one or two orders of magnitude before the problem sizes at which I would expect memory exhaustion or integer overflow to cause trouble.
For instance, take the example `ex1p`. If you set the serial refinement level to -1 and the parallel refinement level to 7, the example ends up with about 17 million unknowns and segfaults in the `mfem::internal::quadrature_interpolator::TensorDerivatives` function. With a parallel refinement level of 6, the example runs normally to the end. I know that it is not a matter of running out of memory, since the cluster I am using has more than enough for problems much larger than this. I have tried building MFEM and its dependencies with 64-bit integers and with mixed integers, but nothing seems to allow me to go further than this. Splitting the problem into many MPI ranks does help me go one parallel refinement level further before breaking again, but it would require a very large number of ranks to reach the problem sizes I am interested in, and in principle it should be doable with only a few.

I would like to ask, then: is this a known limit for MFEM, or is there perhaps some build configuration that I'm missing?