
Implement gemm! for (LazyTensor, LazyProduct) combo #190

Closed
Datseris opened this issue Jan 5, 2018 · 5 comments

Datseris (Collaborator) commented Jan 5, 2018

For reference:

xmin = -30
xmax = 30
b_position = PositionBasis(xmin, xmax, 100)
b_momentum = MomentumBasis(b_position)

b_positiony = PositionBasis(xmin, xmax, 100)
b_momentumy = MomentumBasis(b_positiony)

Txp = transform(b_position, b_momentum)
Tpx = transform(b_momentum, b_position)
Typ = transform(b_positiony, b_momentumy)
Tpy = transform(b_momentumy, b_positiony)

Hkinx = LazyProduct(Txp, momentum(b_momentum)^2/2, Tpx)
Hkiny = LazyProduct(Typ, momentum(b_momentumy)^2/2, Tpy)

b_comp = b_position ⊗ b_positiony

Hkinx_comp = LazyTensor(b_comp, [1, 2], [Hkinx, one(b_positiony)])
Hkiny_comp = LazyTensor(b_comp, [1, 2], [one(b_position), Hkiny])

H = LazySum(Hkinx_comp, Hkiny_comp)

Now doing

ψx = gaussianstate(b_position, -10.0, 1.0, 2)
ψy = gaussianstate(b_positiony, 0, 0.5, 2)
ψ = ψx ⊗ ψy

T = collect(linspace(0., 10.0, 50))
tout, C = timeevolution.schroedinger(T, ψ, H)

gives the error that gemm! is not defined for LazyTensor and LazyProduct. Instead, one has to work around this by passing Hfull = sparse(full(H)) to timeevolution instead of H.

david-pl (Member) commented Jan 9, 2018

As a remark, the issue you are having can actually be traced back to the fact that no form of tensor product (lazy or normal) is implemented for the FFTOperator type.

To explain: while it is true that gemm! is not implemented for a LazyTensor of LazyProducts, it is always possible to rewrite such an instance as a LazyProduct of LazyTensors, which gemm! can handle.
As an example, say you have two LazyProducts, X = LazyProduct(x1, x2) and Y = LazyProduct(y1, y2). Forming the LazyTensor of these will give you the above error. However, you can rewrite X ⊗ Y = (x1 ⊗ y1)*(x2 ⊗ y2). So essentially you can rewrite LazyTensor(b_comp, [1, 2], [X, Y]) (which gives the error) as LazyProduct(LazyTensor(b_comp, [1, 2], [x1, y1]), LazyTensor(b_comp, [1, 2], [x2, y2])).

The problem in your example is that x1, x2, y1, y2 are of type FFTOperator, for which LazyTensor is not defined. The actual issue is therefore not the combination of LazyTensor and LazyProduct (those you can always rewrite as shown above). What you need is an (efficient) implementation of an FFTOperator on a composite basis, with the corresponding tensor product defined accordingly.
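The rewrite above relies on the mixed-product property of the tensor product, (x1*x2) ⊗ (y1*y2) = (x1 ⊗ y1)*(x2 ⊗ y2). This identity can be checked numerically with plain matrices (a NumPy sketch of the algebra only, not QuantumOptics.jl code; all names are made up):

```python
import numpy as np

# Check the mixed-product property used in the lazy-operator rewrite:
# (x1*x2) ⊗ (y1*y2) == (x1 ⊗ y1) * (x2 ⊗ y2)
rng = np.random.default_rng(0)
x1, x2, y1, y2 = (rng.standard_normal((4, 4)) for _ in range(4))

lhs = np.kron(x1 @ x2, y1 @ y2)          # tensor product of the two products
rhs = np.kron(x1, y1) @ np.kron(x2, y2)  # product of the two tensor products

assert np.allclose(lhs, rhs)
```

This is exactly why a LazyTensor of LazyProducts can always be pushed down into a LazyProduct of LazyTensors.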

david-pl (Member) commented Jan 16, 2018

I have implemented a first working version of the FFTOperator in more than one dimension here: https://github.com/qojulia/QuantumOptics.jl/tree/FFTOperator-ND. It already works for your example in the PR qojulia/QuantumOptics.jl-examples#11.

If you change the Hamiltonian as follows, everything works fine with the time evolution:

Hkinx = LazyTensor(b_momentum ⊗ b_momentumy, [1, 2], [momentum(b_momentum)^2/2, one(b_momentumy)])
Hkiny = LazyTensor(b_momentum ⊗ b_momentumy, [1, 2], [one(b_momentum), momentum(b_momentumy)^2/2])
Hkin = LazyProduct(Txp ⊗ Typ, LazySum(Hkinx, Hkiny), Tpx ⊗ Tpy)

Note that instead of Txp ⊗ Typ you could equivalently write transform(b_position ⊗ b_positiony, b_momentum ⊗ b_momentumy).
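The structure of Hkin above (transform to momentum space, multiply by p²/2, transform back) can be sketched in 1D with plain NumPy (an illustration of the FFT trick only; the grid and function names are made up, not QuantumOptics.jl API):

```python
import numpy as np

# 1D sketch of what the Txp * p^2/2 * Tpx product does:
# FFT to momentum space, multiply by k^2/2, FFT back.
N, L = 128, 2 * np.pi
x = np.arange(N) * (L / N)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)  # angular momentum grid matching the FFT

def apply_kinetic(psi):
    # Hkin psi = F^{-1}[ (k^2 / 2) * F[psi] ]
    return np.fft.ifft((k**2 / 2) * np.fft.fft(psi))

# On a plane wave e^{i m x}, the kinetic operator acts as multiplication by m^2/2.
m = 3
psi = np.exp(1j * m * x)
assert np.allclose(apply_kinetic(psi), (m**2 / 2) * psi)
```

The point of the lazy formulation is that this sequence is applied on the fly, so the composite-basis operator is never built as a dense matrix.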

A first quick round of benchmarks already looks quite promising:

@benchmark timeevolution.schroedinger(T, ψ, H)
BenchmarkTools.Trial: 
  memory estimate:  81.15 MiB
  allocs estimate:  50534
  --------------
  minimum time:     291.918 ms (1.53% GC)
  median time:      294.519 ms (2.15% GC)
  mean time:        305.641 ms (5.65% GC)
  maximum time:     390.605 ms (26.00% GC)
  --------------
  samples:          17
  evals/sample:     1

Compared to the very inefficient case that was previously used here:

Hsparse = sparse(full(H))
@benchmark timeevolution.schroedinger(T, ψ, Hsparse)
BenchmarkTools.Trial: 
  memory estimate:  6.43 MiB
  allocs estimate:  2036
  --------------
  minimum time:     17.420 s (0.00% GC)
  median time:      17.420 s (0.00% GC)
  mean time:        17.420 s (0.00% GC)
  maximum time:     17.420 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1

Even when comparing it to a more elegant implementation that was already possible, namely:

Hkinx_old = LazyTensor(b_comp, [1, 2], [momentum(b_position)^2/2, one(b_positiony)])
Hkiny_old = LazyTensor(b_comp, [1, 2], [one(b_position), momentum(b_positiony)^2/2])
H_old = LazySum(Hkinx_old, Hkiny_old, V_comp)
@benchmark timeevolution.schroedinger(T, ψ, H_old)
BenchmarkTools.Trial: 
  memory estimate:  7.98 MiB
  allocs estimate:  31514
  --------------
  minimum time:     4.025 s (0.00% GC)
  median time:      4.037 s (0.00% GC)
  mean time:        4.037 s (0.00% GC)
  maximum time:     4.049 s (0.00% GC)
  --------------
  samples:          2
  evals/sample:     1

we see a huge speed-up.

@Datseris Please have a go at this and let me know if everything works fine for you as well.

There are still some open issues I have to address, though:

  • Run benchmarks against the previous implementation in 1D. I used reshape in the current gemm! and gemv! implementations, which will certainly slow things down. The question is by how much, and how this can be avoided.
  • Allow tensor products of different bases. For composite bases one could then check which of the respective sub-bases is of type PositionBasis or MomentumBasis, and only apply the FFT to those.

Do any other things come to mind?

EDIT: One more thing for the list:

  • The current implementation of FFTOperator can be applied to both a Ket and a DenseOperator. To do this, it uses plan_fft for both a vector and a matrix. Naturally, one has to create the matrix in order to plan the FFT. However, this can lead to undesirably high memory consumption, especially when working only with lazy operations and a state that is a Ket. We should therefore implement a specialized form of the FFTOperator that only transforms Ket types. This avoids having to create the matrix and allows for much larger dimensions in problems that only involve Kets.
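The cost difference between the two cases can be illustrated with plain NumPy (a sketch of the underlying linear algebra only, not the package's plan_fft machinery): a ket needs a single length-N FFT, while a density operator ρ transforms as UρU†, i.e. an FFT along each of its two indices of an N×N array.

```python
import numpy as np

# Transforming a ket vs. a density operator under the (unitary) DFT.
N = 64
rng = np.random.default_rng(1)
psi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
rho = np.outer(psi, psi.conj())  # pure-state density matrix |psi><psi|

norm = 1 / np.sqrt(N)  # makes the DFT unitary
psi_k = norm * np.fft.fft(psi)                        # ket: one 1D FFT, O(N) memory
rho_k = np.fft.fft(np.fft.ifft(rho, axis=1), axis=0)  # operator: transform both indices, O(N^2) memory

# U rho U^dagger of a pure state is |psi_k><psi_k|
assert np.allclose(rho_k, np.outer(psi_k, psi_k.conj()))
```

A Ket-only FFTOperator would never need to allocate (or plan for) the N×N array in the second line.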

Datseris (Collaborator, Author) commented:

Hi David,

I have to say that your latest post blew my mind.

Unfortunately, at the moment I am extremely busy with other projects (involving Husimi functions), which is why I haven't made any moves yet.

I really hope I can get to this on Friday, else I will do something over the weekend.

david-pl (Member) commented:

The benchmarks keep giving me inconsistent results on different runs, even though I am reserving a CPU core for them. I am not sure why this is happening. Sometimes the new implementation is even faster than the old one in 1D, which is quite surprising. In any case, I think the difference is negligible.
(Figure: bench-fft benchmark plot; the blue line is the new implementation.)

Basically, we can now move on to the remaining points raised above. I will look into it.

@Datseris No worries, I know that time is always a very limited resource.

david-pl (Member) commented:

The FFTOperator is now implemented for composite bases. Regarding the title of this issue, namely the combination of lazy products and tensors in gemm!: you simply have to keep the order in mind, i.e. the LazyTensor needs to remain at the lowest level (you can always rewrite products and tensor products so that this is the case). A corresponding comment was added to the documentation, which is now online.
