fix newton assembly performance 3 #2470

Merged

MFraters

Testing it again :)

tjhei commented Jun 26, 2018

Looks good. The changes in test output seem minor at first glance. Can you update the test results?

@MFraters

I am still surprised how much of a difference it makes.

In the code, the difference comes down to evaluating `(grads_phi_u[i] * (deta_deps * grads_phi_u[j])) * strain_rate`, which is what it used to be, versus `grads_phi_u[i] * ((deta_deps * grads_phi_u[j]) * strain_rate)`, which is what this pull request proposes. Looking at the paper, the second form seems more consistent with what we have written down, so I think this change is fine, and I should indeed update the test outputs.

Minimises the amount of tensor contractions by precomputing
them in a vector for later use.
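
As a rough illustration of the reordering discussed in this thread and of the precomputation the commit message describes, here is a minimal, self-contained C++ sketch. It is not the ASPECT code: `Tensor2`, `contract`, `scale`, `assemble_old`, `assemble_new`, and `max_abs_difference` are hypothetical names, and the sketch assumes that `grads_phi_u[k]`, `deta_deps`, and `strain_rate` are all rank-2 tensors for which `*` between two tensors is a full contraction to a scalar (as with deal.II's `SymmetricTensor<2,dim>`). Under that assumption both orderings are mathematically the same product of two scalars, so they should differ only by floating-point rounding, and the factor `(deta_deps * grads_phi_u[j]) * strain_rate` does not depend on `i`, so it can be computed once per `j`.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rank-2 tensor in 2D, stored row-major as
// {T_00, T_01, T_10, T_11}. The real code would use deal.II tensor types.
using Tensor2 = std::array<double, 4>;

// Full (double) contraction A : B = sum over all entries of A_rc * B_rc.
double contract(const Tensor2 &a, const Tensor2 &b)
{
  double sum = 0.0;
  for (std::size_t k = 0; k < a.size(); ++k)
    sum += a[k] * b[k];
  return sum;
}

// Multiply every entry of a tensor by a scalar.
Tensor2 scale(const Tensor2 &a, const double s)
{
  Tensor2 result = a;
  for (double &entry : result)
    entry *= s;
  return result;
}

// Old ordering: (grads_phi_u[i] * (deta_deps * grads_phi_u[j])) * strain_rate.
// Two contractions are evaluated for every (i, j) pair.
std::vector<double> assemble_old(const std::vector<Tensor2> &grads_phi_u,
                                 const Tensor2 &deta_deps,
                                 const Tensor2 &strain_rate)
{
  const std::size_t n = grads_phi_u.size();
  std::vector<double> matrix(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      {
        const double s = contract(deta_deps, grads_phi_u[j]);
        matrix[i * n + j] = contract(scale(grads_phi_u[i], s), strain_rate);
      }
  return matrix;
}

// New ordering: grads_phi_u[i] * ((deta_deps * grads_phi_u[j]) * strain_rate).
// The inner factor depends only on j, so it is stored in a vector first and
// reused for every i.
std::vector<double> assemble_new(const std::vector<Tensor2> &grads_phi_u,
                                 const Tensor2 &deta_deps,
                                 const Tensor2 &strain_rate)
{
  const std::size_t n = grads_phi_u.size();
  std::vector<Tensor2> precomputed(n);
  for (std::size_t j = 0; j < n; ++j)
    precomputed[j] = scale(strain_rate, contract(deta_deps, grads_phi_u[j]));

  std::vector<double> matrix(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      matrix[i * n + j] = contract(grads_phi_u[i], precomputed[j]);
  return matrix;
}

// Largest entry-wise absolute difference between two equally sized matrices.
double max_abs_difference(const std::vector<double> &a,
                          const std::vector<double> &b)
{
  assert(a.size() == b.size());
  double max_diff = 0.0;
  for (std::size_t k = 0; k < a.size(); ++k)
    max_diff = std::fmax(max_diff, std::fabs(a[k] - b[k]));
  return max_diff;
}
```

In this sketch, with `n` shape functions the old loop evaluates `2 * n * n` contractions, while the new one evaluates `n + n * n`. Calling `max_abs_difference(assemble_old(...), assemble_new(...))` on the same inputs gives a quick way to check that the two orderings agree up to rounding.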
@MFraters MFraters force-pushed the fix_newton_assembly_performance_3 branch from 1231b4a to 6e7e05b Compare June 26, 2018 20:14
@gassmoeller

Is this ready from your side, @MFraters?

@MFraters

Yes, I think it is ready. I will put a change file entry in a new pull request.

Final result (others are in #2381)

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      69.1s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| Assemble Stokes system          |         1 |      1.35s |         2% |
| Assemble Stokes system Newton   |         4 |      8.95s |        13% |
| Assemble Stokes system Picard   |         3 |      4.09s |       5.9% |
| Assemble Stokes system rhs      |         6 |      4.49s |       6.5% |
| Assemble temperature system     |         7 |      10.9s |        16% |
| Build Stokes preconditioner     |         7 |      15.6s |        23% |
| Solve Stokes system             |         7 |        12s |        17% |
| Initialization                  |         1 |       5.3s |       7.7% |
| Postprocessing                  |         1 |      2.25s |       3.3% |
| Setup dof systems               |         1 |     0.779s |       1.1% |
| Setup initial conditions        |         1 |     0.987s |       1.4% |
+---------------------------------+-----------+------------+------------+

@gassmoeller gassmoeller merged commit c7b2102 into geodynamics:master Jun 27, 2018
freddrichards pushed a commit to freddrichards/aspect that referenced this pull request May 20, 2019
…_performance_3

fix newton assembly performance 3

3 participants