
An idea to make MOOSE applications run 10 times faster #9014

Closed
YaqiWang opened this issue Apr 27, 2017 · 10 comments
Labels
C: Framework, P: minor, T: task

Comments

@YaqiWang
Contributor

YaqiWang commented Apr 27, 2017

Description of the enhancement or error report

Our current residual and Jacobian evaluation is significantly slower (maybe 10 times) than a hand-coded evaluation, mainly due to the use of numerical quadrature. It requires us to evaluate the shape functions, their gradients, variable values, JxW, etc. at every single quadrature point. While this provides ultimate flexibility for multiphysics simulations, there are cases where all of these evaluations are unnecessary, such as 2D first-order triangular meshes with properties that are constant over each element. In such a case the reference mass matrix is constant, and the stiffness matrix is a constant function of the element aspect ratio. The residual evaluation for the Diffusion kernel is then simply a matrix-vector product. It is also possible, instead of doing multiphysics coupling on a quadrature-point basis, to do the coupling on an element basis, i.e. make properties couple to element-averaged variable values.
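
For concreteness (this is a standard finite element identity, not something measured in this issue): on a first-order triangle with area A_e, the consistent mass matrix is a fixed reference matrix scaled by the area, so a Reaction residual needs no quadrature loop at all.

```latex
% Consistent mass matrix of a linear (P1) triangle T_e with area A_e:
%   (M_e)_{ij} = \int_{T_e} N_i N_j \, dA
M_e = \frac{A_e}{12}
\begin{pmatrix}
  2 & 1 & 1 \\
  1 & 2 & 1 \\
  1 & 1 & 2
\end{pmatrix},
\qquad
\text{Reaction residual: } r_e = M_e \, u_e .
```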

So I propose to implement a new type of kernel, possibly named ElementKernel, which inherits from KernelBase (a rough interface sketch follows the list):

  1. Move all quadrature-related members in KernelBase into Kernel;
  2. Disable all quadrature-point-related functions like coupledValue in ElementKernel, and only allow something like coupledNodal (which obtains the shape-function coefficients), coupledAveragedValue (which provides the element-averaged variable value), etc.;
  3. Implement ElementKernel::initialSetup(), where the necessary pieces are evaluated, like the reference mass matrix, the coefficient matrices for the stiffness matrix, etc.;
  4. Change Assembly and MooseVariable to skip most of the evaluations when the element only has ElementKernels and Materials that do not couple variables on quadrature points.
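
A minimal sketch of what such an interface might look like, using the names proposed above (none of this exists in MOOSE today, so every signature here is hypothetical):

```cpp
// Hypothetical sketch of the proposed ElementKernel interface.
// KernelBase, InputParameters, DenseMatrix, and DenseVector are existing
// MOOSE/libMesh types; everything else follows the proposal above.
class ElementKernel : public KernelBase
{
public:
  ElementKernel(const InputParameters & parameters);

  // Precompute element-constant pieces: the reference mass matrix,
  // the coefficient matrices for the stiffness matrix, etc.
  virtual void initialSetup() override;

protected:
  // Shape-function (nodal) coefficients of a coupled variable.
  const DenseVector<Real> & coupledNodal(const std::string & var_name);

  // Element-averaged value of a coupled variable.
  Real coupledAveragedValue(const std::string & var_name);

  // Element matrices assembled from the precomputed pieces.
  const DenseMatrix<Real> & massMatrix() const;
  const DenseMatrix<Real> & stiffnessMatrix() const;
};
```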

As an example of converting Reaction to ElementReaction, the code in ElementReaction::computeResidual would simply be

massMatrix() * _nodal_u;

Or similarly for ElementDiffusion

stiffnessMatrix() * _nodal_u;

Or ElementKernel can provide further-optimized functions like

DenseVector<Real> massProduct(const DenseVector<Real> & u);
DenseVector<Real> stiffnessProduct(const DenseVector<Real> & u);

We can add more pieces, for example

DenseVector<Real> convectionProduct(const DenseVector<Real> & u);

etc.
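
For 2D first-order triangles, massProduct() could even avoid forming the matrix: with the reference mass matrix above, the product only needs the element area. A rough sketch, assuming a constant unit coefficient over the element (_current_elem is the usual MOOSE member; the rest is hypothetical):

```cpp
// Sketch: mass-matrix-vector product for a linear triangle, no quadrature.
// Uses the P1 reference mass matrix (A/12) * [[2,1,1],[1,2,1],[1,1,2]].
DenseVector<Real>
ElementKernel::massProduct(const DenseVector<Real> & u)
{
  const Real a = _current_elem->volume() / 12.0; // volume() is the area in 2D
  DenseVector<Real> r(3);
  r(0) = a * (2 * u(0) + u(1) + u(2));
  r(1) = a * (u(0) + 2 * u(1) + u(2));
  r(2) = a * (u(0) + u(1) + 2 * u(2));
  return r;
}
```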

Rationale for the enhancement or information for reproducing the error

This can potentially make simulations run 10 times faster. Reducing the accuracy of the coupling is typically acceptable, and the required conditions are naturally satisfied in single-physics calculations.
To make this fully functional, we may need to do the same for BoundaryCondition, DGKernel, and AuxKernel, or we can provide an ElementInterface. We may need to have coupledBoundary (to get the coefficients on boundary elements).

Identified impact

(i.e. Internal object changes, limited interface changes, public API change, or a list of specific applications impacted)
It will be a new system, so it has no impact on our current capabilities. Nobody will ever complain that our applications are slow.

@YaqiWang
Contributor Author

Well, I agree this is revolutionary. But is it worth a try?

@friedmud
Contributor

friedmud commented Apr 27, 2017 via email

@andrsd
Contributor

andrsd commented Apr 27, 2017

@YaqiWang Why do you think the speedup will be ~10x? Did you try to implement this or is that just a SWAG?

@permcody
Member

It is time to start considering framework-level changes for optimization. However, this particular design would require an enormous amount of code duplication to be feasible, so this isn't the right idea. We can do better, but it might require help on the libMesh side too. If you could prove a 10x speedup for specific cases like the one you laid out, we could detect that case and run a different code path without changing all of the objects on top. I don't really want an explosion of special cases though, so we'd have to be smart about this.

I honestly think we need to continue shaking out the distributed mesh stuff in Rattlesnake first, then we'll start looking at some of these other optimizations. Never hurts to start the discussion.

@rwcarlsen is already working on profiling and finding ways to speedup the framework as a whole.

@permcody permcody added the C: Framework, P: minor, and T: task labels Apr 27, 2017
@rwcarlsen
Contributor

I've taken a detailed look at a fair number of simulation profiles (BISON, module tests, etc.) and I'm a bit skeptical about 10X speedup. I'd be very interested in any demonstration of such a drastic improvement.

@YaqiWang
Contributor Author

So far it is a SWAG, hehe. I need a numerical proof with a simple case. Hopefully I can provide it soon.

@YaqiWang
Contributor Author

For a simple 2D diffusion-reaction calculation with one variable, the speedup of the residual and Jacobian evaluation is above a factor of 3. If the number of variables is large, or in 3D, I expect the gain to be larger. 10 times faster is the ideal ;-) Solver time is also not counted though. But you get my idea. Actually, those small functions could be pushed into libMesh::Elem; then the change in MOOSE would be minor. We would need to implement those small functions for each element type.

@rwcarlsen
Contributor

I think a better, more scalable (to more use cases) approach - also in the category of big framework changes - would be a dependency value "pull" approach where, when a value is needed at a particular mesh location, a function is called to calculate/retrieve it (possibly from some sort of cache if it was already retrieved once before). This would work for materials, and most other things. Then we would automatically get super fine-grained only-calculate-what-you-need functionality.
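
A minimal sketch of the memoized "pull" idea, outside of any MOOSE machinery (all names here are hypothetical; dof_id_type is libMesh's id type):

```cpp
#include "libmesh/id_types.h" // libMesh::dof_id_type

#include <functional>
#include <map>
#include <utility>

// Sketch: a value that is computed on demand at a mesh location and cached,
// so dependents "pull" it and only pay for what they actually use.
template <typename Value>
class PulledValue
{
public:
  explicit PulledValue(std::function<Value(libMesh::dof_id_type)> compute)
    : _compute(std::move(compute))
  {
  }

  // Retrieve the value for an element, computing it only on first request.
  const Value & operator()(libMesh::dof_id_type elem_id)
  {
    auto it = _cache.find(elem_id);
    if (it == _cache.end())
      it = _cache.emplace(elem_id, _compute(elem_id)).first;
    return it->second;
  }

  // Drop cached values, e.g. when the solution or mesh changes.
  void invalidate() { _cache.clear(); }

private:
  std::function<Value(libMesh::dof_id_type)> _compute;
  std::map<libMesh::dof_id_type, Value> _cache;
};
```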

@friedmud
Contributor

friedmud commented May 1, 2017 via email

@friedmud
Contributor

This idea is really not happening - but we are growing more capability for specialized solves as we go. For instance, see the new ActuallyExplicitEuler. We now have the ability to compute a "mass matrix" and a "stiffness matrix" and manually call linear solvers...
