
Update gauge fixing to work with symmetries #19

Merged 13 commits into master on Apr 16, 2024

Conversation

@leburgel (Collaborator) commented Mar 21, 2024

Placeholder PR for an update of the gauge fixing algorithm to one that works for symmetric TensorMaps. This implementation fixes issue #18, but it breaks AD since it uses KrylovKit.eigsolve, which is not differentiable. Hopefully this works once we add the corresponding rrule.

We should probably also try the other approaches, but this seemed like the simplest one that worked, so I'm just adding it here for reference and discussion.
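
For concreteness, here is a toy sketch of the kind of approach meant here, using plain matrices instead of symmetric TensorMaps (all names are made up for illustration; this is not the actual PEPSKit code): the gauge transformation relating two gauge-equivalent tensors shows up as the dominant eigenvector of their mixed transfer map, which `KrylovKit.eigsolve` finds iteratively.

```julia
using KrylovKit, LinearAlgebra

# Toy setup: a right-isometric tensor A (split into p "physical" blocks) and a
# gauge-transformed copy B = U A U'. Plain matrices stand in for the symmetric
# TensorMaps used in the actual algorithm.
d, p = 6, 3
A = Matrix(qr(randn(ComplexF64, p * d, d)).Q)'          # d × (p*d), satisfies A * A' ≈ I
U = Matrix(qr(randn(ComplexF64, d, d)).Q)               # the "hidden" gauge unitary
As = [A[:, (s - 1) * d + 1:s * d] for s in 1:p]
Bs = [U * As[s] * U' for s in 1:p]

# The mixed transfer map has U' as its dominant eigenvector (eigenvalue 1),
# so the gauge is recovered from a single sparse eigensolve.
mixed_transfer(ρ) = sum(As[s] * ρ * Bs[s]' for s in 1:p)
vals, vecs, info = eigsolve(mixed_transfer, randn(ComplexF64, d, d), 1, :LM)
σ = vecs[1]                                             # ∝ U' up to normalization and phase
```

Since eigsolve only ever applies the linear map, the same pattern carries over to symmetric TensorMaps; the catch is exactly the one mentioned above, namely that the iterative solver itself has no rrule yet.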

@leburgel marked this pull request as draft on March 21, 2024
@pbrehmer (Collaborator) commented

I can try to code up the truncated eigensolve adjoint, similar to the truncated SVD adjoint (also detailed here), which would add a term to the conventional eigensolve adjoint that accounts for the truncated ranks. This could just be added to PR #15. (An SVD would also work instead of an eigendecomposition here, if we decide that implementing the KrylovKit.eigsolve adjoint is too much for now.)

@lkdvos (Collaborator) commented Mar 22, 2024

https://github.com/Jutho/KrylovKit.jl/tree/eigsolve_ad/src/adrules

This has been stuck there for quite a long time, but maybe I'll find the time to properly implement and test these things so that this can be merged. I am pretty sure that these rules work, and they should work for generic functions, so it might be worth a try.

@leburgel (Collaborator, Author) commented

I copied the eigsolve rrule from Jutho/KrylovKit.jl#56 with some minor tweaks, namely filtering the ZeroTangents here
https://github.com/quantumghent/PEPSKit.jl/blob/f7cf90456269a6085422a5e9d7a7ff2b46b37094/src/utility/eigsolve.jl#L177
and specifying the add!! here
https://github.com/quantumghent/PEPSKit.jl/blob/f7cf90456269a6085422a5e9d7a7ff2b46b37094/src/utility/eigsolve.jl#L248

This doesn't work yet; specifically, the linear problem rrule sometimes doesn't converge because things inside the anonymous function
https://github.com/quantumghent/PEPSKit.jl/blob/f7cf90456269a6085422a5e9d7a7ff2b46b37094/src/utility/eigsolve.jl#L230-L236
can already become NaN for some reason. This can be seen just from running test_gradients.jl in the examples folder.
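
For reference, a minimal, hypothetical illustration of the two tweaks mentioned above (not the code at the linked lines): ZeroTangent cotangents carry no information and can be dropped before setting up the cotangent linear problem, and `add!!` makes the accumulate-in-place-if-possible behaviour explicit.

```julia
using ChainRulesCore

# Drop ZeroTangent entries before they enter the linear problem.
Δvecs = Any[ZeroTangent(), randn(3), ZeroTangent(), randn(3)]
keep = findall(Δ -> !(Δ isa ZeroTangent), Δvecs)
Δvecs_nonzero = Δvecs[keep]

# Accumulate the remaining cotangents; add!! mutates the destination when it is
# safe to do so and falls back to out-of-place addition otherwise.
acc = zeros(3)
for Δ in Δvecs_nonzero
    acc = ChainRulesCore.add!!(acc, Δ)
end
```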

@lkdvos (Collaborator) commented Mar 26, 2024

Some questions I have:

  • There is a check_elementwise_convergence which seems to check whether the gauge fixing didn't do anything, and which always triggers for me. Is this expected/necessary?
  • Why is what I did different?

[EDIT]
I don't think I actually fixed anything; the @checkgrad just discards the gradient, so somehow just discarding that part of the gradient does not break the gradient "too much" while still working with AD. I would guess it is still wrong, though...

@pbrehmer (Collaborator) commented Mar 26, 2024

  • There is a check_elementwise_convergence which seems to check whether the gauge fixing didn't do anything, and which always triggers for me. Is this expected/necessary?

This is kind of expected, since the default tolerance for checking the element-wise convergence is set to ctmrg_tol=1e-12, which is pretty strict. Usually, the tensors only converge to within a few orders of magnitude above the CTMRG tolerance. If we set the tolerance to a more realistic default, this doesn't trigger anymore.
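
(As a rough sketch of what such a check amounts to, with made-up names and a made-up default rather than the actual check_elementwise_convergence:)

```julia
# Compare environments entry by entry against a tolerance that is a few orders
# of magnitude looser than ctmrg_tol = 1e-12; hypothetical helper for illustration.
elementwise_converged(env_old, env_new; atol=1e-6) = maximum(abs, env_old - env_new) < atol

elementwise_converged(ones(4, 4), ones(4, 4) .+ 1e-8)   # true at atol = 1e-6
```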

[EDIT] I don't think I actually fixed anything; the @checkgrad just discards the gradient, so somehow just discarding that part of the gradient does not break the gradient "too much" while still working with AD. I would guess it is still wrong, though...

Hmm, while I don't really understand why it works, I think the gradient comes out correct. If you run the optimization on the Heisenberg Hamiltonian, you seem to get correct gradients; the ground-state energy converges to high accuracy.

Edit: Also, after some further experimenting, I still don't understand why @checkgrad makes the gradient converge all of a sudden. I will try to find out where the NaNs come up in the first place.

@pbrehmer (Collaborator) commented Mar 28, 2024

While I haven't made sense of the @checkgrad thing, I finally managed to repair the gauge fixing for arbitrary unit cells. In particular, for unit cells larger than $(2, 2)$, the CTM tensors now also converge element-wise.

@lkdvos (Collaborator) commented Apr 2, 2024

I'm still confused about the gradient. I'm quite sure that the @checkgrad implementation I have just discards the gradient; you can test this by returning the gradient after printing, which breaks it again. The contribution might be small, so the Heisenberg example still converges, but it still feels wrong to just discard it.

@pbrehmer (Collaborator) commented Apr 3, 2024

I fully agree that just discarding the gradient is a bad idea; we should be able to figure out how to compute the adjoint properly. Still, I wasn't able to make any progress on the eigsolve adjoint: I can't figure out what the actual origin of the NaNs is. Usually, the first few times the rrule for eigsolve is called, the linear problem converges and Δvecs is still finite. But then Δvecs will only contain NaNs. I'm not sure, however, whether this is due to the linear problem not converging, or whether the linear problem doesn't converge because it gets NaNs as an input (through b, which depends on Δvecs).

Another thought I had is that the adjoint of an eigendecomposition contains possibly diverging terms with $F_{ij} = (\lambda_j - \lambda_i)^{-1}$. This would only be a problem if the eigenvalues are indeed degenerate or very close, right? And I don't think that should generally be the case, but maybe there is still something going on with that?
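
For reference, in the Hermitian case the standard eigendecomposition adjoint (the textbook formula, not code from this PR; the eigsolve here need not be Hermitian and conventions vary) reads

$$
\bar{A} = U \left[ \bar{\Lambda} + F \circ \left( U^\dagger \bar{U} \right) \right] U^\dagger ,
\qquad
F_{ij} = \begin{cases} (\lambda_j - \lambda_i)^{-1} & i \neq j \\ 0 & i = j \end{cases}
$$

so the potentially divergent terms only contribute when eigenvalues are (nearly) degenerate and the cotangent $\bar{U}$ actually couples the corresponding eigenvectors.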

@lkdvos (Collaborator) commented Apr 3, 2024

When I was playing around with it, I also felt like it wasn't actually the rrule of eigsolve that caused the problem, and that the NaNs actually appeared earlier. But it does feel like there are sometimes just some incredibly small values, which then seem to cause this behaviour.

@pbrehmer (Collaborator) commented Apr 3, 2024

Okay, so I think the problem must be somewhere in the differentiation of leftorth. I don't know whether there is a problem with the rrule or something else going on, but replacing leftorth with tsvd (and constructing a unitary matrix as $Q=UV^\dagger$) seems to resolve the AD problems of gauge_fix. Of course it would be nicer to do a QR decomposition instead of an SVD, but at least this works as a quick fix.
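
A minimal sketch of that workaround in TensorKit terms (the tensor below is just for illustration, not the actual gauge_fix code):

```julia
using TensorKit

t = TensorMap(randn, ComplexF64, ℂ^4 ⊗ ℂ^3, ℂ^3)

Q_qr, R = leftorth(t)   # QR-based isometry whose differentiation seems to cause the trouble
U, S, V = tsvd(t)       # tsvd returns the adjoint factor, i.e. t ≈ U * S * V
Q = U * V               # Q = U V† in the notation above; t ≈ Q * (V' * S * V)
```

Since this $Q$ only differs from the QR factor by a unitary on the new index, it should play the same role in the gauge fixing, at the cost of an SVD instead of a QR.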

Review threads (resolved): src/algorithms/ctmrg.jl (outdated), src/algorithms/ctmrg.jl, src/environments/ctmrgenv.jl, src/utility/rotations.jl
@lkdvos (Collaborator) commented Apr 8, 2024

Just putting these things here so I don't forget them; I'll try to address them myself tomorrow if I find some time.

@lkdvos marked this pull request as ready for review on April 15, 2024
@lkdvos (Collaborator) commented Apr 15, 2024

I cleaned up the VectorInterface part and made some minor tweaks. For me, this is ready to go.

@leburgel merged commit 93e74a3 into master on Apr 16, 2024
8 of 11 checks passed
@leburgel deleted the lb/symm_gauge_fixing branch on April 16, 2024