Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ITensors][BUG] Derivative of gate application of non-Hermitian MPOs with apply_dag=true #1314

Closed
ntausend opened this issue Jan 24, 2024 · 5 comments · Fixed by #1344
Closed
Labels
bug Something isn't working ITensors Issues or pull requests related to the `ITensors` package.

Comments

@ntausend
Copy link
Contributor

ntausend commented Jan 24, 2024

Description of bug

Hey, I found that the application of an ITensor gate $G$ acting on some MPO $A$ in the adjoint representation:

$G \to GAG^\dagger$

leads to a wrong pullback. The normal left action $G\to GA$ is working properly (see minimal example). Also the adjoint action on a plain ITensor is also working as expected.

Minimal code demonstrating the bug or unexpected behavior

Minimal runnable code

using ITensors
using Zygote: pullback
using Random: seed!

seed!(1234)
N = 2
elT = Float64

idx = Index(N)
# generate the test MPO
A = MPO(randomITensor(elT, prime(idx), dag(idx)), [idx])
# ITensor version
Ait = contract(A)
# fix the order to compare to numerical results
inds_order = inds(Ait)
# matrix version
Am = matrix(Ait)

# apply itensor
G = randomITensor(elT, prime(idx), dag(idx))
inds_order_G = inds(G)
# matrix version
Gm = matrix(G)

# generate a dual vector MPO
M = MPO(randomITensor(elT, prime(idx), dag(idx)), [idx])
# ITensor version
Mit = contract(M)
# matrix version
Mm = matrix(Mit)

# apply function without dagger apply
f(G) = apply([G], A)
fit(G) = apply([G], Ait)
fm(G) = G * Am

# compute the actions
fG   = matrix(permute(contract(f(G)), inds_order))
fGit = matrix(permute(fit(G), inds_order))
fGm  = fm(Gm) 

# should all result in the same matrix
println("Results from applying gate to MPO/ITensor/Matrix")
println("Are all versions of computing the contraction results in the same result? $((fG ≈ fGit) && (fGit ≈ fGm))")

# now we calculate the pullback of all these versions onto a test dual vetor
∇fG   = matrix(permute(pullback(f, G)[2](M)[1], inds_order_G))
∇fGit = matrix(permute(pullback(fit, G)[2](Mit)[1], inds_order_G))
∇fGm  =        pullback(fm, Gm)[2](Mm)[1]

println("Is the ITensor pullback equal to the matrix pullback? $(∇fGit ≈ ∇fGm)")
println("Is the MPO pullback equal to the matrix pullback? $(∇fG ≈ ∇fGm)")


# apply function with dagger apply
g(G) = apply([G], A; apply_dag = true)
git(G) = apply([G], Ait; apply_dag = true)
gm(G) = G * Am * G'

# compute the actions
gG   = matrix(permute(contract(g(G)), inds_order))
gGit = matrix(permute(git(G), inds_order))
gGm  = gm(Gm) 

# should all result in the same matrix
println("Results from applying gate to MPO/ITensor/Matrix")
println("Are all versions of computing the contraction results in the same result? $((gG ≈ gGit) && (gGit ≈ gGm))")

# now we calculate the pullback of all these versions onto a test dual vetor
∇gG   = matrix(permute(pullback(g, G)[2](M)[1], inds_order_G))
∇gGit = matrix(permute(pullback(git, G)[2](Mit)[1], inds_order_G))
∇gGm  =        pullback(gm, Gm)[2](Mm)[1]

println("Is the ITensor pullback equal to the matrix pullback? $(∇gGit ≈ ∇gGm)")
println("Is the MPO pullback equal to the matrix pullback? $(∇gG ≈ ∇gGm)")
println(∇gGit)
println(∇gG)

Expected output or behavior

I expect all the tests in the example to return true. This would be equivalent to the output:

Results from applying gate to MPO/ITensor/Matrix
Are all versions of computing the contraction results in the same result? true
Is the ITensor pullback equal to the matrix pullback? true
Is the MPO pullback equal to the matrix pullback? true
Results from applying gate to MPO/ITensor/Matrix
Are all versions of computing the contraction results in the same result? true
Is the ITensor pullback equal to the matrix pullback? true
Is the MPO pullback equal to the matrix pullback? true
[-0.1423413716309932 -0.3131385545124809; -1.5979554418491506 -0.19790363455598498]
[-0.1423413716309932 -0.3131385545124809; -1.5979554418491506 -0.19790363455598498]

Actual output or behavior

Instead of getting the expected behavior, the last check fails, indicating that the adjoint application of the ITensor gate onto the MPO leads to a different pullback compared to the matrix and the ITensor version. Also the numerical values are far of being comparable with the other versions.

Results from applying gate to MPO/ITensor/Matrix
Are all versions of computing the contraction results in the same result? true
Is the ITensor pullback equal to the matrix pullback? true
Is the MPO pullback equal to the matrix pullback? true
Results from applying gate to MPO/ITensor/Matrix
Are all versions of computing the contraction results in the same result? true
Is the ITensor pullback equal to the matrix pullback? true
Is the MPO pullback equal to the matrix pullback? false
[-0.1423413716309932 -0.3131385545124809; -1.5979554418491506 -0.19790363455598498]
[0.7198139139346542 -1.652092462241267; 1.9081148486222423 -4.176254592261031]

Version information

  • Output from versioninfo():
julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × 13th Gen Intel(R) Core(TM) i7-1355U
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
  Threads: 1 on 12 virtual cores
  • Output from using Pkg; Pkg.status("ITensors"):
julia> using Pkg; Pkg.status("ITensors")
[9136182c] ITensors v0.3.54
@ntausend ntausend added bug Something isn't working ITensors Issues or pull requests related to the `ITensors` package. labels Jan 24, 2024
@mtfishman
Copy link
Member

Thanks for the report. Good to hear it's only an issue with apply_dag=true and MPOs, thanks for checking the other cases.

The logic handling apply_dag=true for MPOs is here: https://github.com/ITensor/ITensors.jl/blob/v0.3.55/src/ITensorChainRules/mps/abstractmps.jl#L101-L105. It's been a while since I looked at that so it would take some time to debug. Maybe there is an issue with the priming. Does it seem like maybe it is performing $A \to GAG$ instead of $A \to GAG^\dagger$?

@mtfishman
Copy link
Member

Something that sticks out to me is that it seems like you are testing on a non-Hermitian MPO. Does it seem like it works on Hermitian MPOs?

Looking at https://github.com/ITensor/ITensors.jl/blob/v0.3.55/src/ITensorChainRules/mps/abstractmps.jl#L101-L105, I see that it only applies the gate to one side of the MPO, and then doubles the result to compute the pullback. That logic would only work for Hermitian MPOs.

I think we just didn't consider someone would use apply_dag=true for non-Hermitian MPOs. Of course we should make the non-Hermitian case work or throw an error in that case indicating it doesn't work.

@mtfishman
Copy link
Member

I confirmed that the issue is specific to taking derivatives through gate application of non-Hermitian MPOs using apply_dag=true. In #1335 I catch that case and throw an error so at least it doesn't silently give the wrong result like it is doing now. I'll leave this issue open as a reminder to fix it properly, though I'm not sure if I'll have time to look into it soon. As a workaround, users can apply the gates on either side of the MPO manually with apply_dag=false.

@mtfishman mtfishman changed the title [ITensors] [BUG] Apply ITensor to MPO with adjoint representation has wrong pullback [ITensors][ENHANCEMENT] Derivative of gate application of non-Hermitian MPOs with apply_dag=true Feb 16, 2024
@mtfishman mtfishman changed the title [ITensors][ENHANCEMENT] Derivative of gate application of non-Hermitian MPOs with apply_dag=true [ITensors][BUG] Derivative of gate application of non-Hermitian MPOs with apply_dag=true Feb 16, 2024
@ntausend
Copy link
Contributor Author

Hey sorry for not answering the last days.
Yes I think its because the rrule implicitly assumes hermiticity, at least this was my first guess.

I can try to write some function i case I find the time. Should be straight forward to define the rrule in this case.

@mtfishman
Copy link
Member

Sounds good, thanks. Would be great to get a proper fix for that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ITensors Issues or pull requests related to the `ITensors` package.
Projects
None yet
2 participants