
Nicer Interfaces for ::Number Manifolds and Objectives #236

Merged
merged 50 commits into from
Apr 29, 2023

Conversation

kellertuer
Member

@kellertuer kellertuer commented Apr 10, 2023

Trying to resolve #223 and #235

Todo

  • check which algorithms can be provided in the way of “Implement a fill(p) prequel for every solver, where p might be a number” (#235); maybe not necessary for all, but it works in most cases
  • rework all interfaces according to “High-lever interfaces starting with solvername(M, obj, ...)” (#223), see list below
  • Algorithms to check / rework
    • Alternating Gradient Descent (not for numbers, since it alternates on an array anyway)
    • augmented Lagrangian Method
    • Conjugate Gradient
    • Cyclic Proximal Point Algorithm
    • Difference of Convex Algorithm
    • Difference of Convex Proximal Point Algorithm
    • DouglasRachford
    • EPM
    • FW
    • Gradient Descent
    • LevenbergMarquardt (not (yet) for Numbers, maybe still rename jacB kw)
    • NelderMead
    • PS
    • Quasi Newton
    • Stochastic Gradient Descent
    • Subgradient Method
    • TCG
    • trust regions (internally then already uses the right TCG)
  • check that the docs are fine (since I introduced one small new function here as well)

There are still type assertion problems; I have no clue why some part of the code still thinks we should work with floats in the type assertions.
@kellertuer
Member Author

kellertuer commented Apr 10, 2023

Ah, I at least see the problem.

The NonMonotoneLineSearch uses a storage for a point and a vector, but since it never “sees” an actual point during initialisation (i.e. how it is represented), it takes a random point, which is a float.
By the way, the storage worked fine with floats before (I was not even aware of that).

Now that we have a float storage, that of course errors when we come by with 1-element arrays. This is not only a problem here but a general problem whenever manifolds are used with other than their default representation. I did fix the number/vector issue when initialising storage, but will have to check for NMLinesearch. Probably the best is to pass an optional initial point and initial vector to that as well.
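The fix idea in that last sentence can be sketched generically (hypothetical types, not Manopt's actual storage API): a storage that is initialised from a concrete point and vector never has to guess their representation.

```julia
# Hypothetical sketch, not Manopt's actual storage implementation:
# a point/vector storage initialised from a concrete point and tangent
# vector, so the stored types always match the user's representation.
mutable struct PointVectorStorage{P,T}
    p::P
    X::T
end

# Store copies so later updates do not alias the user's data;
# Base.copy works for both numbers (returns the number) and arrays.
init_storage(p, X) = PointVectorStorage(copy(p), copy(X))

s1 = init_storage([0.1], [0.0])  # array representation
s2 = init_storage(0.1, 0.0)      # plain-number representation
```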

edit:
I think I am one step further; a remaining problem seems to be that

  • copy(Circle(), [0.1]) works fine
  • copy(Circle(), 0.1) errors, since 0.1 is not a Vector{Float64}; for a float we get
ERROR: MethodError: no method matching similar(::Float64, ::Type{Float64})
Closest candidates are:
  similar(::Union{LinearAlgebra.Adjoint{T, var"#s886"}, LinearAlgebra.Transpose{T, var"#s886"}} where {T, var"#s886"<:(AbstractVector)}, ::Type{T}) where T at ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:207
  similar(::Union{LinearAlgebra.Adjoint{T, S}, LinearAlgebra.Transpose{T, S}} where {T, S}, ::Type{T}) where T at ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:211
  similar(::Union{LinearAlgebra.Adjoint{T, S}, LinearAlgebra.Transpose{T, S}} where {T, S}, ::Type{T}, ::Tuple{Vararg{Int64, N}}) where {T, N} at ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:212
  ...
Stacktrace:
 [1] allocate(a::Float64, T::Type)
   @ ManifoldsBase ~/Repositories/Julia/ManifoldsBase.jl/src/ManifoldsBase.jl:65
[.... and a long stack trace ...]

So maybe parts of Manifolds are also not so well tested with `::Number`s.
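A hedged sketch of the workaround direction (using a stand-in Circle type, not the actual ManifoldsBase.jl code): for manifolds whose points are plain numbers, copy can simply return the immutable number instead of going through allocate.

```julia
import Base: copy

struct Circle end  # stand-in for ManifoldsBase.jl's Circle manifold

# Numbers are immutable, so "copying" a number point just returns it;
# this avoids the allocate(::Float64, ::Type) MethodError above.
copy(::Circle, p::Number) = p
# Array-valued points still get a real copy.
copy(::Circle, p::AbstractArray) = Base.copy(p)

copy(Circle(), 0.1)    # returns 0.1 instead of erroring
copy(Circle(), [0.1])  # returns a fresh [0.1]
```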

@kellertuer
Member Author

kellertuer commented Apr 10, 2023

While this turned out far more complex than I thought, I found out some interesting things after defining copy as above (for now only in a weak dependency in Manopt, maybe later in Manifolds).

I noticed

julia> 1.0 .* inverse_retract(Circle(), fill(0.0), fill(0.0))
0.0

julia> 1.0 * inverse_retract(Circle(), fill(0.0), fill(0.0))
0-dimensional Array{Float64, 0}:
0.0

So I am not so sure I want to use .* any longer to scale tangent vectors?
The same for .-

julia> 1.0 * inverse_retract(Circle(), fill(0.0), fill(0.0)) .- inverse_retract(Circle(), fill(0.0), fill(0.0))
0.0

julia> 1.0 * inverse_retract(Circle(), fill(0.0), fill(0.0)) - inverse_retract(Circle(), fill(0.0), fill(0.0))
0-dimensional Array{Float64, 0}:
0.0

since for now this is what we used in a lot of places, but it seems to change the type. Or in short:

julia> fill(0.0) .- fill(0.0)
0.0

julia> fill(0.0) - fill(0.0)
0-dimensional Array{Float64, 0}:
0.0

edit: As soon as we scale vectors using * (and not .*) and subtract vectors using - (and not .-) (currently this just affected Nesterov), tests are running again. I think I will ask about the fill thing on Discourse.
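The quirk above can be reproduced without any manifold code; a minimal plain-Julia demonstration (cf. JuliaLang/julia#28866):

```julia
# Broadcasting over only 0-dimensional arrays collapses the result to a
# scalar, while the non-broadcast operation keeps the 0-dim array wrapper.
a = fill(1.0)   # 0-dimensional Array{Float64, 0}
b = fill(2.0)

a .+ b          # a plain Float64
a + b           # a 0-dimensional array

# One-element vectors behave consistently for both variants:
[1.0] .+ [2.0]  # a Vector{Float64}
[1.0] + [2.0]   # a Vector{Float64}
```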

@codecov

codecov bot commented Apr 10, 2023

Codecov Report

Merging #236 (39d3a23) into master (2a5a62e) will decrease coverage by 0.17%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #236      +/-   ##
==========================================
- Coverage   99.85%   99.69%   -0.17%     
==========================================
  Files          65       64       -1     
  Lines        5558     5819     +261     
==========================================
+ Hits         5550     5801     +251     
- Misses          8       18      +10     
Impacted Files Coverage Δ
src/Manopt.jl 100.00% <ø> (ø)
src/plans/augmented_lagrangian_plan.jl 100.00% <ø> (ø)
src/plans/constrained_plan.jl 100.00% <ø> (ø)
src/plans/solver_state.jl 100.00% <ø> (ø)
src/plans/stepsize.jl 95.85% <ø> (-4.15%) ⬇️
src/plans/cache.jl 100.00% <100.00%> (ø)
src/plans/gradient_plan.jl 100.00% <100.00%> (ø)
src/plans/hessian_plan.jl 100.00% <100.00%> (ø)
src/plans/nonlinear_least_squares_plan.jl 100.00% <100.00%> (ø)
src/plans/objective.jl 100.00% <100.00%> (ø)
... and 18 more


@kellertuer
Member Author

The switch to 1-dim arrays (using [p] instead of fill(p); we could also use fill(p, 1), I think) makes this more reliable (I might even revert the changes in Nesterov).

See JuliaLang/julia#28866 thanks to the super-fast answer in https://discourse.julialang.org/t/broadcast-strange-on-0-dimensional-array/97311/2
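The switch can be summarised with a few minimal type checks on the representations discussed above:

```julia
p = 0.1

# [p] and fill(p, 1) are the same 1-dim, one-element vector ...
@assert [p] == fill(p, 1)
@assert [p] isa Vector{Float64}

# ... while fill(p) is a 0-dim array, whose broadcasts collapse to scalars.
@assert fill(p) isa Array{Float64, 0}
@assert (fill(p) .- fill(p)) isa Float64   # type changes under .-
@assert ([p] .- [p]) isa Vector{Float64}   # type preserved
```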

@kellertuer kellertuer changed the title remove old boilerplate code, sketch new scheme to work with nonmutati… Nicer Interfaces for ::Number Manifolds and Objectives Apr 10, 2023
@mateuszbaran
Member

  • copy(Circle(), 0.1) errors, since 0.1 is not a Vector{Float64}

Hm, copy(Circle(), 0.1) should just return 0.1. This should be fixed in ManifoldsBase.jl. I will make a patch.

So maybe parts of Manifolds are also not so well tested with `::Number`s.

Yes, I very rarely use plain numbers in Manifolds.jl.

So I am not so sure I want to use .* any longer to scale tangent vectors?
The same for .-

Oh, I didn't know about that problem with 0-dim arrays. Let's just use one-element 1-dim arrays with broadcasting then.

@kellertuer
Member Author

  • copy(Circle(), 0.1) errors, since 0.1 is not a Vector{Float64}

Hm, copy(Circle(), 0.1) should just return 0.1. This should be fixed in ManifoldsBase.jl. I will make a patch.

great!

So maybe parts of Manifolds are also not so well tested with `::Number`s.

Yes, I very rarely use plain numbers in Manifolds.jl.

Maybe this PR will check a few of those then :) but I think the copy might be the main thing.

So I am not so sure I want to use .* any longer to scale tangent vectors?
The same for .-

Oh, I didn't know about that problem with 0-dim arrays. Let's just use one-element 1-dim arrays with broadcasting then.

Well, it was interesting to learn; we just use 1-dim arrays and then it should be fine :)

@kellertuer
Member Author

This is a bit more work than I thought, but a nice occasion to unify and test all high-level interfaces. ALM took me about an hour, but the kwargs got nicer in quite a few places.

@mateuszbaran
Member

Yes, it indeed looks like quite a bit of work but I think it's a good change 🙂 .

@kellertuer
Member Author

Yes, I think both things are good ideas: the preprocessing for numbers (though adopting the functions takes a bit), and the interfaces alg(M, obj, p), which I like if you have an objective that can be run on different manifolds. And all changes are indeed non-breaking.

So this is definitely useful. It just also takes a bit to rework.

@kellertuer
Member Author

Phew, just 10 algorithms left. It is nice to also review a few tests and improve them, but going through all algorithms is indeed a bit of work by now.

But both new things are really nice to have (though I will not do the number thing for the more involved algorithms like RCPA and PDSSN).

@kellertuer
Member Author

For EPM it even made phrasing the defaults much nicer, because some defaults already required the objective. So this rework not only separates some code nicely, it also improves some code quite a bit.

@kellertuer
Member Author

I checked Chambolle-Pock and PDSSN today; they do not make so much sense in this new form, since they both have a lot of mandatory functions (which could surely be wrapped in their objectives) but also quite a few mandatory initial steps, so the interface with the objective would still be too crowded. I will think about something nicer for them in some next PR and removed them from the list here.

@kellertuer
Member Author

@mateuszbaran I have one small question on LevenbergMarquardt

  • Do you think a variant on Numbers (Circle/PositiveNumbers/Euclidean) would be useful? I am not sure how to test this.
  • expect_zero_residual was not documented; I already added a line. Can you add an explanation of what it does? I mean, I know what it indicates, but not how it changes the algorithm.

@mateuszbaran
Member

  • Do you think a variant on Numbers (Circle/PositiveNumbers/Euclidean) would be useful? I am not sure how to test this.

That would be a valid optimization problem but I would expect that there should be a better specialized algorithm for that case.

  • expect_zero_residual was not documented; I already added a line. Can you add an explanation of what it does? I mean, I know what it indicates, but not how it changes the algorithm.

I don't really have a good explanation how the algorithm is changed. BTW, expect_zero_residual is documented for LevenbergMarquardtState.

@kellertuer
Member Author

kellertuer commented Apr 29, 2023

  • Do you think a variant on Numbers (Circle/PositiveNumbers/Euclidean) would be useful? I am not sure how to test this.

That would be a valid optimization problem but I would expect that there should be a better specialized algorithm for that case.

Then I will not implement the special case here.

  • expect_zero_residual was not documented; I already added a line. Can you add an explanation of what it does? I mean, I know what it indicates, but not how it changes the algorithm.

I don't really have a good explanation how the algorithm is changed. BTW, expect_zero_residual is documented for LevenbergMarquardtState.

Ah, then I took that one over for now, though it does not help a user much to read that text ;)

@kellertuer kellertuer merged commit 34d1bf4 into master Apr 29, 2023
12 of 13 checks passed
@kellertuer kellertuer deleted the kellertuer/more-interfaces branch December 16, 2023 07:17
Successfully merging this pull request may close these issues.

High-lever interfaces starting with solvername(M, obj, ...)