Reused the gradient calculation in the adaptive algorithm. #336
The adaptive algorithm was calculating the gradient one extra time per cycle. It needs two gradient evaluations per cycle to compare how close the self-consistent iteration and Newton-Raphson steps get, but it was then calculating the gradient a third time to start the next cycle. Now it instead stores the gradient with the lowest norm and reuses it in the next cycle. Since Hessian calculations are often not much slower than gradient calculations, eliminating the extra gradient call is a meaningful saving.
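
A minimal sketch of the bookkeeping, not the actual implementation: the `f`, `grad`, and `hess` callables and the step updates here are placeholders, but the gradient-reuse pattern matches the change described above.

```python
import numpy as np

def adaptive_solve(f, grad, hess, x0, tol=1e-12, max_cycles=1000):
    """Each cycle tries a self-consistent-iteration (SCI) step and a
    Newton-Raphson (NR) step, keeps whichever candidate has the smaller
    gradient norm, and carries that candidate's gradient into the next
    cycle instead of recomputing it."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)  # computed once up front; reused across cycles thereafter
    for _ in range(max_cycles):
        if np.linalg.norm(g) < tol:
            break
        # Candidate 1: self-consistent iteration step (placeholder update)
        x_sci = f(x)
        g_sci = grad(x_sci)
        # Candidate 2: Newton-Raphson step using the stored gradient
        x_nr = x - np.linalg.solve(hess(x), g)
        g_nr = grad(x_nr)
        # Keep the candidate with the smaller gradient norm; its gradient
        # becomes the starting gradient for the next cycle, so no third
        # gradient call is needed.
        if np.linalg.norm(g_sci) <= np.linalg.norm(g_nr):
            x, g = x_sci, g_sci
        else:
            x, g = x_nr, g_nr
    return x
```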
This cuts the number of gradient calls by a third with no loss of accuracy. On a sample large problem (500 states, 1000 samples per state), it reduced the total time taken by ~10%, which is a good return for a 3-line change.