Memory consumption in large scale hp-adaptive applications. #9598

Closed · marcfehling opened this issue Mar 2, 2020 · 5 comments

While running scaling tests on hp-adaptive problems for my PhD project on our local supercomputer, I ran out of memory in large-scale scenarios. For example, this was the case for a problem with about 1 billion degrees of freedom, for which I occupied 32 nodes with 128 GB of RAM each. Each node has two Intel Xeon E5 processors installed (2x12 cores, SMT enabled, 48 MPI processes per node).

I would like to investigate how the memory demand changes when we switch from DoFHandler to hp::DoFHandler, and find out whether there are any data structures that blow up over time.
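
Roughly what I have in mind for the measurement (a sketch only; `triangulation`, `dof_handler`, and `system_matrix` are placeholder names for the objects of the actual application):

```cpp
// Sketch: report deal.II's own bookkeeping next to the OS-level resident set
// size, accumulated over all MPI ranks.
#include <deal.II/base/mpi.h>
#include <deal.II/base/utilities.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/hp/dof_handler.h>
#include <deal.II/lac/trilinos_sparse_matrix.h>

#include <iostream>

using namespace dealii;

void report_memory(const MPI_Comm                                 mpi_communicator,
                   const parallel::distributed::Triangulation<3> &triangulation,
                   const hp::DoFHandler<3> &                      dof_handler,
                   const TrilinosWrappers::SparseMatrix &         system_matrix)
{
  // Library-internal memory consumption of the individual objects, in MB.
  const double tria_mb =
    Utilities::MPI::sum(triangulation.memory_consumption() / 1048576., mpi_communicator);
  const double dofh_mb =
    Utilities::MPI::sum(dof_handler.memory_consumption() / 1048576., mpi_communicator);
  const double matrix_mb =
    Utilities::MPI::sum(system_matrix.memory_consumption() / 1048576., mpi_communicator);

  // What the operating system sees for the processes (VmRSS is in kB).
  Utilities::System::MemoryStats stats;
  Utilities::System::get_memory_stats(stats);
  const double rss_mb = Utilities::MPI::sum(stats.VmRSS / 1024., mpi_communicator);

  if (Utilities::MPI::this_mpi_process(mpi_communicator) == 0)
    std::cout << "triangulation: " << tria_mb << " MB\n"
              << "dof_handler  : " << dofh_mb << " MB\n"
              << "system_matrix: " << matrix_mb << " MB\n"
              << "total VmRSS  : " << rss_mb << " MB" << std::endl;
}
```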

@peterrum (Member) commented Mar 2, 2020

@marcfehling Have you had a look at the number of non-zeros in the matrices? I would guess that in 3D and at high order this might be a problem.
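
Something along these lines would show it (a sketch, assuming the assembled matrix is a distributed TrilinosWrappers::SparseMatrix named `system_matrix`):

```cpp
// Sketch: how dense the sparse matrix actually is, and a rough estimate of
// the memory it takes (one double value plus one int column index per entry).
#include <deal.II/lac/trilinos_sparse_matrix.h>

#include <iostream>

void print_matrix_stats(const dealii::TrilinosWrappers::SparseMatrix &system_matrix)
{
  const auto n_rows = system_matrix.m();                  // global number of rows
  const auto nnz    = system_matrix.n_nonzero_elements(); // global number of non-zeros

  const double approx_gb = nnz * (sizeof(double) + sizeof(int)) / 1.e9;

  std::cout << "rows                 : " << n_rows << '\n'
            << "non-zeros            : " << nnz << '\n'
            << "average non-zeros/row: " << static_cast<double>(nnz) / n_rows << '\n'
            << "approx. matrix memory: " << approx_gb << " GB" << std::endl;
}
```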

@bangerth (Member) commented Mar 2, 2020

Yes, @peterrum is right. For most problems with moderate polynomial degrees (say, 2 or 3), you end up with a few kB per unknown accumulated over all data structures. Say it is 3 kB per unknown; then you'd be able to fit about 40M unknowns into 128 GB, or about 1.2B unknowns onto 32 such nodes. That's about the size you see.

But if you increase the polynomial degree, you may end up with substantially more memory in the matrix, and it would not surprise me if you can fit substantially less than 1B unknowns into your 32 nodes.

That shouldn't stop you from investigating, but your numbers don't seem outlandish to me.
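
To spell out the back-of-envelope estimate (the 3 kB/unknown figure and the stencil-size bound below are rough assumptions from this discussion, not measured values):

```cpp
// Back-of-envelope only: unknowns that fit into the given RAM at ~3 kB each,
// and how the matrix row length of a continuous FE_Q element grows in 3D.
#include <cstdio>

int main()
{
  const double bytes_per_unknown = 3. * 1024;                 // ~3 kB per unknown
  const double ram_per_node      = 128. * 1024 * 1024 * 1024; // 128 GB per node
  const int    n_nodes           = 32;

  std::printf("unknowns per node    : %.2e\n", ram_per_node / bytes_per_unknown);
  std::printf("unknowns on %d nodes : %.2e\n",
              n_nodes,
              n_nodes * ram_per_node / bytes_per_unknown);

  // A vertex DoF of a continuous FE_Q element of degree p couples with every
  // DoF on the 2x2x2 patch of adjacent cells, i.e. up to (2p+1)^3 entries in
  // its matrix row; this is why higher degrees blow up the sparse matrix.
  for (int p = 1; p <= 4; ++p)
    std::printf("degree %d: up to %d non-zeros per row\n",
                p,
                (2 * p + 1) * (2 * p + 1) * (2 * p + 1));

  return 0;
}
```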

@tjhei (Member) commented Mar 4, 2020

> (2x12 cores, SMT enabled, 48 MPI processes per node)

Just a side comment: I have not seen a large performance increase when running 2 MPI ranks per physical core. Computation- and memory-heavy loads might be a little faster, but anything involving MPI communication is a little slower (you have twice as many ranks, and latency might go up?). You will also be a lot more memory constrained this way. This assumes good process pinning when you run with one rank per physical core.

@kronbichler (Member) commented

I don't think there is much left to do here, nor much reason to leave this issue open - can you verify, @marcfehling? In 3D, one should really not use sparse matrices for polynomial degrees >= 3 because the coupling between unknowns is too dense; it only leads to disappointments. And regarding performance, even for p=2 one leaves a factor of 3-5 on the table compared to matrix-free methods.
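
For reference, a minimal sketch of what the matrix-free alternative looks like in deal.II for a Laplace-type operator, with a fixed polynomial degree for brevity (`mapping`, `dof_handler`, and `constraints` are assumed to come from the surrounding program; the hp case needs additional machinery):

```cpp
// Sketch: a matrix-free Laplace operator instead of an assembled sparse
// matrix. Nothing here ever stores matrix entries.
#include <deal.II/base/quadrature_lib.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/mapping.h>
#include <deal.II/fe/update_flags.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/la_parallel_vector.h>
#include <deal.II/matrix_free/matrix_free.h>
#include <deal.II/matrix_free/operators.h>

#include <memory>

using namespace dealii;

constexpr int dim       = 3;
constexpr int fe_degree = 2;

using VectorType = LinearAlgebra::distributed::Vector<double>;

void setup_and_apply(const Mapping<dim> &             mapping,
                     const DoFHandler<dim> &          dof_handler,
                     const AffineConstraints<double> &constraints)
{
  typename MatrixFree<dim, double>::AdditionalData additional_data;
  additional_data.mapping_update_flags = update_gradients | update_JxW_values;

  const auto matrix_free = std::make_shared<MatrixFree<dim, double>>();
  matrix_free->reinit(mapping,
                      dof_handler,
                      constraints,
                      QGauss<1>(fe_degree + 1),
                      additional_data);

  MatrixFreeOperators::LaplaceOperator<dim, fe_degree, fe_degree + 1, 1, VectorType>
    laplace_operator;
  laplace_operator.initialize(matrix_free);
  laplace_operator.compute_diagonal(); // e.g. for a Jacobi/Chebyshev smoother

  // The operator provides vmult() and can be handed directly to a Krylov
  // solver such as SolverCG.
  VectorType dst, src;
  matrix_free->initialize_dof_vector(dst);
  matrix_free->initialize_dof_vector(src);
  laplace_operator.vmult(dst, src);
}
```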

@marcfehling (Member, Author) commented

> I don't think there is much left to do here, nor much reason to leave this issue open.

I agree with you, @kronbichler, especially since we have a matrix-free version running.

I'm closing this issue.
