Implemented Householder reflections with modifications. #8

Merged 53 commits on May 26, 2023


benedict-96
Collaborator

Householder reflections are needed because Gram-Schmidt (especially the symplectic version) is numerically very unstable. In addition, Householder reflections are much cheaper to compute and are expected to scale with $\mathcal{O}(N^2)$ as opposed to $\mathcal{O}(N^3)$ (but this has to be investigated further).

The most important new routines are in src/optimizers/householder.jl. HouseDecom is a struct that efficiently and implicitly stores the $Q$ and $R$ matrices (same as in the qr routine in LinearAlgebra.jl). Depending on whether transpose is set to false or true, (HD::HouseDecom)(X) will compute $QX$ or $Q^TX$.
Tests for these routines are in test/householder.jl. It is these routines (the application of $Q$ and $Q^T$, not the factorization $A=QR$) that made it necessary to re-implement the Householder reflections.
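
A minimal sketch of what "implicitly storing $Q$" can look like, assuming plain (non-symplectic) reflectors $H_k = I - 2v_kv_k^T$ with unit vectors $v_k$. The names `ReflectorStack`, `householder_vectors` and `apply_reflectors` are hypothetical and are not the `HouseDecom` API from this PR; the point is only that $QX$ and $Q^TX$ differ in the order in which the reflectors are applied.

```julia
using LinearAlgebra

# Hypothetical container: column k holds the unit Householder vector v_k,
# so that Q = H_1 * H_2 * ... * H_n with H_k = I - 2 v_k v_k'.
struct ReflectorStack{T}
    V::Matrix{T}
end

# Householder QR of an N×n matrix (N >= n); returns the reflectors and the n×n factor R.
function householder_vectors(A::Matrix{T}) where {T<:AbstractFloat}
    N, n = size(A)
    R = copy(A)                          # working copy, overwritten step by step
    V = zeros(T, N, n)
    for k in 1:n
        x = R[k:N, k]
        α = norm(x)
        α == 0 && continue               # nothing to eliminate, H_k = I
        v = copy(x)
        v[1] += (x[1] >= 0 ? α : -α)     # sign choice avoids cancellation
        v ./= norm(v)
        V[k:N, k] = v
        R[k:N, :] .-= 2 .* v * (v' * R[k:N, :])
    end
    return ReflectorStack(V), triu(R[1:n, :])
end

# Computes Q*X (transpose = false) or Q'*X (transpose = true) without forming Q.
function apply_reflectors(H::ReflectorStack, X::AbstractVecOrMat; transpose::Bool=false)
    Z = float.(X)                        # fresh array, X is not modified
    n = size(H.V, 2)
    order = transpose ? (1:n) : (n:-1:1) # Q' applies H_1 first, Q applies H_n first
    for k in order
        v = view(H.V, :, k)
        Z .-= 2 .* v * (v' * Z)
    end
    return Z
end
```

With `H, R = householder_vectors(A)`, calling `apply_reflectors(H, X)` gives $QX$ and `apply_reflectors(H, X; transpose=true)` gives $Q^TX$. Each reflector touches every entry of `X` once, so the $N \times N$ matrix $Q$ is never assembled.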

src/optimizers/lie_alg_lifts.jl computes the lift for the Stiefel manifold and maps to the global tangent space representation, i.e. $\mathtt{global\_rep}: T_Y\mathcal{M} \to \mathfrak{g}^{\mathrm{hor}}$. $\mathfrak{g}^\mathrm{hor}$ is also realized as a struct StiefelLieAlgHorMatrix in the file src/arrays/stiefel_lie_alg_hor.jl.
The function apply_projection is the canonical map that does $\mathfrak{g}^\mathrm{hor}\to{}T_Y\mathcal{M}$.
So far one test for this has been implemented in test/lie_alg_lifts.jl.
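
The two maps can be illustrated with dense matrices. This is a sketch under assumptions, not the code in this PR: it uses one standard lift $\Omega(\Delta) = (I - \tfrac12 YY^T)\Delta Y^T - Y\Delta^T(I - \tfrac12 YY^T)$, an orthogonal completion $\lambda = [Y, Y_\perp]$ obtained from `LinearAlgebra.qr`, and the hypothetical names `lift`, `completion`, `global_rep_sketch` and `to_tangent`. The actual routines work with `StiefelLieAlgHorMatrix` and may differ in detail.

```julia
using LinearAlgebra

# Lift of a tangent vector Δ at Y to a skew-symmetric N×N matrix with lift(Y, Δ) * Y == Δ.
function lift(Y::AbstractMatrix, Δ::AbstractMatrix)
    P = I - Y * Y' / 2
    return P * Δ * Y' - Y * Δ' * P
end

# Orthogonal N×N matrix λ whose first n columns are Y, i.e. λ * [I; 0] == Y.
function completion(Y::AbstractMatrix)
    N, n = size(Y)
    Qfull = qr(Y).Q * Matrix{eltype(Y)}(I, N, N)   # full (square) Q factor
    return hcat(Y, Qfull[:, n+1:N])
end

# Stand-in for global_rep: T_Y M -> g^hor, here returned as a dense matrix
# of the block form [A -B'; B 0] with A skew-symmetric.
function global_rep_sketch(Y::AbstractMatrix, Δ::AbstractMatrix)
    λ = completion(Y)
    return λ' * lift(Y, Δ) * λ
end

# Stand-in for the map g^hor -> T_Y M: take the first n columns and rotate back with λ.
function to_tangent(Y::AbstractMatrix, B::AbstractMatrix)
    n = size(Y, 2)
    return completion(Y) * B[:, 1:n]
end
```

In this sketch `to_tangent(Y, global_rep_sketch(Y, Δ))` returns `Δ` up to rounding, and the lower-right $(N-n)\times(N-n)$ block of the global representation vanishes, which is what makes a dedicated struct for $\mathfrak{g}^\mathrm{hor}$ worthwhile.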

@benedict-96
Collaborator Author

I have also started to implement retractions. I first tried the retraction found in src/optimizers/retractions.jl under Exp_euc (https://math.mit.edu/~edelman/publications/geometry_of_algorithms.pdf, "Geometry of Algorithms with Orthogonality Constraints", page 310), but this does not actually compute the geodesic we need. A better solution can be found in the same reference in section 2.4.2.
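
For orientation, a sketch of the geodesic for the canonical metric as I understand the closed form in that section: with $A = Y^TH$ and the thin QR decomposition $QR = (I - YY^T)H$, the geodesic is $Y(t) = YM(t) + QN(t)$, where $M(t)$ and $N(t)$ are the first $n$ columns of $\exp\left(t\begin{pmatrix} A & -R^T \\ R & 0\end{pmatrix}\right)$. The function name `stiefel_geodesic` is hypothetical and this is not the code in retractions.jl.

```julia
using LinearAlgebra

# Geodesic on the Stiefel manifold (canonical metric) starting at Y in direction H,
# evaluated at time t. Assumes Y'Y = I and Y'H skew-symmetric.
function stiefel_geodesic(Y::AbstractMatrix, H::AbstractMatrix, t::Real=1)
    n = size(Y, 2)
    A = Y' * H
    K = H - Y * A                      # part of H orthogonal to the columns of Y
    F = qr(K)
    Q = Matrix(F.Q)                    # thin Q factor, N×n
    R = F.R                            # n×n upper triangular
    G = vcat(hcat(A, -R'), hcat(R, zero(A)))   # 2n×2n, skew-symmetric
    MN = exp(t * G)[:, 1:n]            # stacked [M(t); N(t)]
    return Y * MN[1:n, :] + Q * MN[n+1:2n, :]
end
```

Only a $2n \times 2n$ matrix exponential is needed, independent of $N$, and no matrix inverse appears.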

@benedict-96
Collaborator Author

I implemented a custom retraction for the Stiefel manifold (true geodesic). A problem with the performance of the Householder retractions became apparent. Also: there may be a problem with the term $\exp(A)*A^{-1}$ - the inverse is not necessary and may lead to instabilities! (File src/optimizers/retractions.jl)

… changed bits of the Householder algorithm. The main routines are now using LinearAlgebra.qr for the moment, however. This might become an issue since there is no Julia implementation of symplectic Householder.
…specific random number generator at the initialization step. A new 'TrivialInitRNG' is used for initializing the optimizer caches.
…re now more or less in the format they should be in.
@benedict-96
Collaborator Author

A "problem" that has emerged is that Lux works only with single precision by default, and Float32 is hard-coded in places in a somewhat sloppy way, for example:

```julia
function glorot_uniform(rng::AbstractRNG, dims::Integer...; gain::Real=1)
    scale = Float32(gain) * sqrt(24.0f0 / sum(_nfan(dims...)))
    return (rand(rng, Float32, dims...) .- 0.5f0) .* scale
end
```

in Lux/src/utils.jl. I'm not sure if I should keep the option to also work with double precision, or just give up on it because keeping it would probably involve refactoring some Lux code.
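
To make the trade-off concrete, a sketch of what a precision-agnostic initializer could look like. The name `glorot_uniform_generic` and the explicit element-type argument are hypothetical and not part of Lux; the sketch also assumes a dense layer only, so that the fan-in/fan-out sum is simply `n_out + n_in`.

```julia
using Random

# Hypothetical variant of the initializer above with the element type T as an argument.
function glorot_uniform_generic(rng::AbstractRNG, ::Type{T}, n_out::Integer, n_in::Integer;
                                gain::Real=1) where {T<:AbstractFloat}
    # All constants are converted to T, so no Float32 literal leaks into the result.
    scale = T(gain) * sqrt(T(24) / T(n_out + n_in))
    return (rand(rng, T, n_out, n_in) .- T(0.5)) .* scale
end
```

Calling `glorot_uniform_generic(Random.default_rng(), Float64, 4, 3)` then yields a `Matrix{Float64}`, and passing `Float32` instead reproduces the single-precision behaviour. Supporting this throughout would mean touching every place in Lux where `Float32` or an `f0` literal is written out.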

@benedict-96
Collaborator Author

Also: I implemented a custom RNG TrivialInitRNG that initializes the weights for the optimizers, as Lux always takes an RNG as input for its setup function.

@benedict-96 benedict-96 marked this pull request as ready for review May 15, 2023 14:10
@michakraus michakraus merged commit 42c7fda into main May 26, 2023
@michakraus michakraus deleted the dev-manifold-optimizers branch May 26, 2023 11:10
2 participants