Copying spectral radii back to main memory after blockette routines … #37

anilyil · 2020-04-17T21:53:38Z

... if updateDt is true. Without this change, reverse mode AD routines use outdated spectral radii values, which results in inaccurate sensitivities.

Purpose

This PR addresses issues #32 and #36.

With this change, the spectral radii values are copied back to main memory from the cached blockette memory after a blockette residual computation that also updates the time step. The copy sits inside the updateDt if check because these values are not needed for matrix-free matrix-vector products; however, we do want to copy them after every ANK and NK step.

The extra copy comes with additional cost due to the increased memory access. This cost is only incurred by blockette calls that also update the time step, which matrix-free operations do not require.
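As a rough illustration of the pattern described above, here is a minimal Python sketch of a cache-blocked loop that writes its local values back to the main array only when the time step is being updated. Every name in it (`blockette_residual`, `compute_radius`, `main_rad`, `update_dt`) is hypothetical and chosen for illustration; it is not the ADflow Fortran code.

```python
def compute_radius(i):
    """Hypothetical stand-in for the spectral radius of cell i."""
    return float(i) + 1.0


def blockette_residual(main_rad, update_dt, block_size=8):
    """Toy cache-blocked loop; all names here are assumptions, not ADflow's."""
    n = len(main_rad)
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Small local buffer standing in for the blockette (cached) arrays.
        local_rad = [compute_radius(i) for i in range(start, stop)]
        # ... the residual for this block would be computed from local_rad ...
        if update_dt:
            # Copy back only when the time step is updated, so that later
            # routines reading main_rad do not see stale values.
            main_rad[start:stop] = local_rad
    return main_rad


stale = [0.0] * 20
blockette_residual(stale, update_dt=False)   # main array left untouched
fresh = blockette_residual([0.0] * 20, update_dt=True)
```

With `update_dt=False` the main array keeps its old contents, which mirrors the matrix-free case where the copy is skipped to avoid the extra memory traffic.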

I have run tests to measure the performance difference caused by this change on one residual evaluation. The tests were run on Stampede2, on one Skylake node, compiled with the AVX2 instruction set, using 48 processors, a blockette size of 8, and test block sizes of 32^3 and 48^3 per processor. A few results follow:

Base speed: Millions of cells processed per processor per second with the default residual routines that operate on main arrays.
Blockette speed: Same speed metric, but with cache-blocked residual routines.
Speedup: Blockette speed / base speed

The timing results:

Old results before this change:

32^3:

Base speed: 0.676747811395
Blockette speed: 1.24815149274
Speedup: 1.844337686393324

48^3:

Base speed: 0.629510355349
Blockette speed: 1.29542792502
Speedup: 2.057834178600312

New results with this change:

32^3:

Base speed: 0.676336855482
Blockette speed: 1.21488916591
Speedup: 1.7962782244717301

48^3:

Base speed: 0.627697421135
Blockette speed: 1.25598801018
Speedup: 2.00094499019755

The extra memory access causes the speedup to decrease from 1.84 to 1.80 and from 2.06 to 2.00 for test blocks of size 32^3 and 48^3, respectively. This is a minor decrease in performance (under 3% in both cases). Furthermore, this "slowdown" only occurs in calls that update the time step, and the main computationally intensive calls (mat-free operations and preconditioner computations) do not update the time step. Therefore, the overall effect of this change on performance will be negligible.
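For reference, the speedup figures and the relative loss can be recomputed from the timings reported above. The short Python sketch below does that arithmetic; the dictionary layout and function name are illustrative choices, and the numbers are copied from the results in this description.

```python
# (base speed, blockette speed) in millions of cells per processor per second,
# copied from the results reported in this PR description.
results = {
    "32^3": {"old": (0.676747811395, 1.24815149274),
             "new": (0.676336855482, 1.21488916591)},
    "48^3": {"old": (0.629510355349, 1.29542792502),
             "new": (0.627697421135, 1.25598801018)},
}


def speedup(base, blockette):
    """Speedup as defined above: blockette speed / base speed."""
    return blockette / base


drops = {}
for size, r in results.items():
    old = speedup(*r["old"])
    new = speedup(*r["new"])
    # Relative decrease in speedup caused by the extra copy, in percent.
    drops[size] = (old - new) / old * 100.0
    print(f"{size}: speedup {old:.2f} -> {new:.2f} ({drops[size]:.1f}% lower)")
```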

Type of change

Bugfix (non-breaking change which fixes an issue)

Testing

Regressions pass. Furthermore, the test in #32 also passes.

Checklist

I have run unit and regression tests which pass locally with my changes

joanibal · 2020-04-17T22:04:40Z

I'm going to remove myself as a reviewer because we have already discussed this issue in depth.

I approve of the changes, but I don't want my review to count towards the two required.

anilyil · 2020-04-18T02:14:06Z

I was wrong; the bug would not cause adjoint sensitivities to be inaccurate. The intermediate arrays are updated in the call to master in: https://github.com/mdolab/adflow/blob/master/src/adjoint/adjointUtils.F90#L592 The bug would still affect issue #32.

joanibal · 2020-04-22T19:37:50Z

Oops, I thought it wouldn't count it since I removed myself as a reviewer.
(I approved so pull panda would stop reminding me)

Copying spectral radii back to main memory after blockette routines i…

b616192

anilyil requested a review from a team as a code owner April 17, 2020 21:53

anilyil requested review from sseraj, Xiaosong2105, camader and joanibal April 17, 2020 21:53

joanibal removed their request for review April 17, 2020 22:04

Merge remote-tracking branch 'mdolab/master'

bd3e51c

joanibal approved these changes Apr 22, 2020

Xiaosong2105 approved these changes Apr 24, 2020

Xiaosong2105 merged commit 3ec3273 into mdolab:master Apr 24, 2020

anilyil mentioned this pull request Apr 25, 2020

Copy intermediate variables from blockette memory #38

Merged
