jedbrown edited this page May 25, 2011 · 9 revisions

Solution of large-deformation elasticity using \(Q_5\) elements.

This shows a computed solution to a large-deformation elasticity problem using \(Q_5\) elements. We have a (manufactured) exact solution for this problem, and the computed solution here has converged to 5 significant digits in \( W^{1,p} \) (\( p=1,2,\infty \); equivalent to strain) and 6 to 7 digits in \(L^p\). The solve with \(Q_5\) elements takes about as long as with \(Q_1\) elements (a few seconds), but it would take \((10^5)^3\) \(Q_1\) elements (\(10^{15}\) nodes), or \(400^3\) \(Q_2\) elements (half a billion nodes), to reach similar accuracy (which is admittedly higher than you’re likely to need).
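The node counts above can be checked with a quick sketch, assuming a cubic mesh of \(n^3\) hexahedral \(Q_k\) elements with tensor-product nodes, so \((kn+1)^3\) nodes in total (the helper `nodes` is hypothetical, for illustration only):

```python
def nodes(k, n):
    """Total nodes for an n^3 mesh of tensor-product Q_k hexahedral elements."""
    return (k * n + 1) ** 3

# Q_1 with 10^5 elements per side: about 10^15 nodes.
print(f"{nodes(1, 10**5):.3e}")
# Q_2 with 400 elements per side: 801^3, roughly half a billion nodes.
print(f"{nodes(2, 400):.3e}")
```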

Memory and floating point costs of assembled versus unassembled representations of the Jacobian

This figure shows why we should care about high-order unassembled methods (e.g. high-order elements via tensor product). Both “assembled” and “tensor” apply the same operator, but “tensor” stores only the Jacobian information at quadrature points and applies the action of the Jacobian matrix-free. Note that the “tensor” memory requirements are much lower and scale better as the block size \(b\) (number of degrees of freedom per node) increases. Hardware can currently do about 2 to 6 flops per byte of memory bandwidth, and this ratio will increase in the future. Sparse matrix kernels (MatMult and MatSolve) can only use between 0.167 and 0.25 flops/byte and are thus overwhelmingly limited by memory bandwidth. The unassembled methods use about 8 flops/byte. This figure includes all the constants, assuming BAIJ\((b)\) block storage for the matrix, and no special structure for the tensor case (non-affine isoparametric coordinate mapping of equal order to the approximation space and fairly expensive coefficient storage).
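The 0.167 to 0.25 flops/byte range for sparse MatMult can be derived with a small sketch, assuming 8-byte values, 4-byte column indices, 2 flops (multiply and add) per stored entry, and one column index amortized over each \(b \times b\) block in BAIJ\((b)\) storage (the helper `matmult_intensity` is hypothetical, for illustration only):

```python
def matmult_intensity(b=1):
    """Flops per byte for sparse MatMult with b-by-b block storage."""
    flops_per_entry = 2              # one multiply + one add per stored value
    bytes_per_entry = 8 + 4 / b**2   # 8-byte value + amortized 4-byte block index
    return flops_per_entry / bytes_per_entry

print(round(matmult_intensity(1), 3))   # scalar AIJ: 2/12 = 0.167
print(round(matmult_intensity(8), 3))   # larger blocks approach 2/8 = 0.25
```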

Assembly cost for libMesh, deal.II, and Dohp.