Fix Anderson for inputs of arbitrary dimension #223
Merged
7 commits
be2af9d Use Walker's defaults for m and droptol (devmotion)
d0adf70 Remove Anderson struct (devmotion)
ec61565 Check if values are finite (devmotion)
cb60ee3 Fix Anderson acceleration for inputs of arbitrary dimension (devmotion)
c0c4e88 Remove where clauses (devmotion)
e8c5d2c Add test (devmotion)
fd825cb Fix typo (devmotion)
@@ -1,8 +1,6 @@
 # Notations from Walker & Ni, "Anderson acceleration for fixed-point iterations", SINUM 2011
 # Attempts to accelerate the iteration xₙ₊₁ = xₙ + beta*f(xₙ)

-struct Anderson{m} end
-
 struct AndersonCache{Tx,To,Tdg,Tg,TQ,TR} <: AbstractSolverCache
     x::Tx
     g::Tx
@@ -14,34 +12,40 @@ struct AndersonCache{Tx,To,Tdg,Tg,TQ,TR} <: AbstractSolverCache
     R::TR
 end

-function AndersonCache(df, ::Anderson{m}) where m
+function AndersonCache(df, m)
     x = similar(df.x_f)
     g = similar(x)

-    fxold = similar(x)
-    gold = similar(x)
-
-    # maximum size of history
-    mmax = min(length(x), m)
-
-    # buffer storing the differences between g of the iterates, from oldest to newest
-    Δgs = [similar(x) for _ in 1:mmax]
-
-    T = eltype(x)
-    γs = Vector{T}(undef, mmax) # coefficients obtained from the least-squares problem
-
-    # matrices for QR decomposition
-    Q = Matrix{T}(undef, length(x), mmax)
-    R = Matrix{T}(undef, mmax, mmax)
+    if m > 0
+        fxold = similar(x)
+        gold = similar(x)
+
+        # maximum size of history
+        mmax = min(length(x), m)
+
+        # buffer storing the differences between g of the iterates, from oldest to newest
+        Δgs = [similar(x) for _ in 1:mmax]
+
+        T = eltype(x)
+        γs = Vector{T}(undef, mmax) # coefficients obtained from the least-squares problem
+
+        # matrices for QR decomposition
+        Q = Matrix{T}(undef, length(x), mmax)
+        R = Matrix{T}(undef, mmax, mmax)
+    else
+        fxold = nothing
+        gold = nothing
+        Δgs = nothing
+        γs = nothing
+        Q = nothing
+        R = nothing
+    end
Review comment: This is unstable. Not sure if that's a big problem in practice since small type unions are supposed to be fast these days, and anyway Anderson is mostly useful for large scale in which the overhead doesn't matter...

     AndersonCache(x, g, fxold, gold, Δgs, γs, Q, R)
 end

-AndersonCache(df, ::Anderson{0}) =
-    AndersonCache(similar(df.x_f), similar(df.x_f), nothing, nothing, nothing, nothing, nothing, nothing)
-
 @views function anderson_(df::Union{NonDifferentiable, OnceDifferentiable},
-                          initial_x::AbstractArray{T},
+                          initial_x::AbstractArray,
                           xtol::Real,
                           ftol::Real,
                           iterations::Integer,
@@ -51,7 +55,7 @@ AndersonCache(df, ::Anderson{0}) =
                           beta::Real,
                           aa_start::Integer,
                           droptol::Real,
-                          cache::AndersonCache) where T
+                          cache::AndersonCache)
     copyto!(cache.x, initial_x)
     tr = SolverTrace()
     tracing = store_trace || show_trace || extended_trace
@@ -65,9 +69,14 @@ AndersonCache(df, ::Anderson{0}) =
     while iter < iterations
         iter += 1

-        # fixed-point iteration
+        # evaluate function
         value!!(df, cache.x)
         fx = value(df)
+
+        # check that all values are finite
+        check_isfinite(fx)
+
+        # compute next iterate of fixed-point iteration
         @. cache.g = cache.x + beta * fx

         # save trace
@@ -80,7 +89,7 @@ AndersonCache(df, ::Anderson{0}) =
             update!(tr,
                     iter,
                     maximum(abs, fx),
-                    iter > 1 ? sqeuclidean(cache.g, cache.x) : convert(real(T),NaN),
+                    iter > 1 ? sqeuclidean(cache.g, cache.x) : convert(real(eltype(initial_x)), NaN),
                     dt,
                     store_trace,
                     show_trace)
@@ -90,7 +99,7 @@ AndersonCache(df, ::Anderson{0}) =
         x_converged, f_converged, converged = assess_convergence(cache.g, cache.x, fx, xtol, ftol)
         converged && break

-        # define next iterate
+        # update current iterate
         copyto!(cache.x, cache.g)

         # perform Anderson acceleration
@@ -144,7 +153,7 @@ AndersonCache(df, ::Anderson{0}) =

             # solve least squares problem
             γs = view(cache.γs, 1:m_eff)
-            ldiv!(R, mul!(γs, Q', fx))
+            ldiv!(R, mul!(γs, Q', vec(fx)))

             # update next iterate
             for i in 1:m_eff
@@ -161,31 +170,31 @@ AndersonCache(df, ::Anderson{0}) =
 end

 function anderson(df::Union{NonDifferentiable, OnceDifferentiable},
-                  initial_x::AbstractArray,
-                  xtol::Real,
-                  ftol::Real,
-                  iterations::Integer,
-                  store_trace::Bool,
-                  show_trace::Bool,
-                  extended_trace::Bool,
-                  m::Integer,
-                  beta::Real,
-                  aa_start::Integer,
-                  droptol::Real)
-    anderson(df, initial_x, xtol, ftol, iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, AndersonCache(df, Anderson{m}()))
+                  initial_x::AbstractArray,
+                  xtol::Real,
+                  ftol::Real,
+                  iterations::Integer,
+                  store_trace::Bool,
+                  show_trace::Bool,
+                  extended_trace::Bool,
+                  m::Integer,
+                  beta::Real,
+                  aa_start::Integer,
+                  droptol::Real)
+    anderson(df, initial_x, xtol, ftol, iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, AndersonCache(df, m))
 end

 function anderson(df::Union{NonDifferentiable, OnceDifferentiable},
-                  initial_x::AbstractArray{T},
-                  xtol::Real,
-                  ftol::Real,
-                  iterations::Integer,
-                  store_trace::Bool,
-                  show_trace::Bool,
-                  extended_trace::Bool,
-                  beta::Real,
-                  aa_start::Integer,
-                  droptol::Real,
-                  cache::AndersonCache) where T
-    anderson_(df, initial_x, convert(real(T), xtol), convert(real(T), ftol), iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, cache)
+                  initial_x::AbstractArray,
+                  xtol::Real,
+                  ftol::Real,
+                  iterations::Integer,
+                  store_trace::Bool,
+                  show_trace::Bool,
+                  extended_trace::Bool,
+                  beta::Real,
+                  aa_start::Integer,
+                  droptol::Real,
+                  cache::AndersonCache)
+    anderson_(df, initial_x, convert(real(eltype(initial_x)), xtol), convert(real(eltype(initial_x)), ftol), iterations, store_trace, show_trace, extended_trace, beta, aa_start, droptol, cache)
 end
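For context on the `vec(fx)` change in the least-squares step, here is a minimal sketch that is not taken from the PR. It assumes a matrix-shaped residual `fx` and a hypothetical history matrix `Δ` whose columns are flattened differences; the sizes and values are invented for illustration. Since `Q` has `length(x)` rows, `Q'` can only be applied to the residual once it has been flattened to a vector.

```julia
using LinearAlgebra

# Hypothetical data: a 2×3 residual and m_eff = 2 stored differences (as columns).
fx = [1.0 2.0 3.0; 4.0 5.0 6.0]
Δ = [1.0 0.0; 0.0 1.0; 1.0 1.0; 0.0 2.0; 2.0 0.0; 1.0 3.0]  # length(fx) × m_eff

F = qr(Δ)
Q = Matrix(F.Q)             # thin factor, length(fx) × m_eff
R = UpperTriangular(F.R)    # m_eff × m_eff

γs = Vector{Float64}(undef, size(Δ, 2))
# Q' is m_eff × length(fx), so the residual has to be flattened first.
ldiv!(R, mul!(γs, Q', vec(fx)))

# γs should now agree with the least-squares solution of Δ * γ ≈ vec(fx).
γs ≈ Δ \ vec(fx)
```

With a vector-valued `fx`, `vec` is a no-op, which is presumably why the missing call only surfaced for inputs of higher dimension.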
I'm guessing this was done to have m be known to the compiler
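To make that guess concrete, here is a small sketch of what carrying `m` in a type parameter provides. `HistoryTag`, `uses_history`, and `history_size` are hypothetical names modelled on the removed `struct Anderson{m} end`; they are not part of NLsolve.

```julia
# Hypothetical tag type, modelled on the removed `struct Anderson{m} end`.
struct HistoryTag{m} end

# Dispatch can single out m == 0, like the removed `AndersonCache(df, ::Anderson{0})` method.
uses_history(::HistoryTag{0}) = false
uses_history(::HistoryTag) = true

# The history size is recoverable from the type alone.
history_size(::HistoryTag{m}) where {m} = m

uses_history(HistoryTag{0}())   # false
history_size(HistoryTag{5}())   # 5
```

This only helps when the tag is written with a literal `m`; it says nothing about the case where `m` arrives as a runtime integer.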
I think the current implementation has the same type instability problem; it's just better hidden. In NLsolve.jl/src/solvers/anderson.jl (line 175 in ddd16a5), the creation of `Anderson{m}` is also type-unstable.

As far as I understand, the instabilities in both the current and the new implementation should not be a problem: the type instability should not affect the output type when calling `nlsolve`, and after the first call of `Anderson{m}` and `AndersonCache(df, m)` everything can be inferred. To me it seems the current implementation is just more complicated without providing any benefits.
OK, I retract my objection then!
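The argument in this thread can be illustrated with a toy function barrier. This is a sketch under stated assumptions, not NLsolve code: `MiniCache`, `make_cache`, `history_size`, and `solve` are hypothetical stand-ins for `AndersonCache`, its constructor, the solver kernel, and `nlsolve`.

```julia
# Hypothetical miniature of the cache: `hist` is `nothing` when m == 0.
struct MiniCache{H}
    x::Vector{Float64}
    hist::H
end

# Type-unstable: the concrete cache type depends on the runtime value of m.
make_cache(x::Vector{Float64}, m::Integer) =
    MiniCache(x, m > 0 ? Matrix{Float64}(undef, length(x), m) : nothing)

# Kernel behind the function barrier; compiled per concrete cache type.
history_size(::MiniCache{Nothing}) = 0
history_size(c::MiniCache) = size(c.hist, 2)

# Entry point, playing the role of `nlsolve` in this sketch.
solve(x::Vector{Float64}, m::Integer) = history_size(make_cache(x, m))

cache_type = only(Base.return_types(make_cache, (Vector{Float64}, Int)))
solve_type = only(Base.return_types(solve, (Vector{Float64}, Int)))
isconcretetype(cache_type)   # false: the cache type is not known statically
```

Inspecting `solve_type` shows whether the instability leaks into the result; in this toy case both kernel methods return an integer, so the entry point should remain inferable even though the cache is not.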