Use real FFTs #36

antoine-levitt · 2019-10-28T17:54:51Z

Add an option to perform real-to-complex and complex-to-real FFTs with a kwarg, and use that to do the potentials as real arrays. Could also do dispatch on the eltype of f_real, but it's not compatible with G_to_r(f_fourier).

antoine-levitt · 2019-10-28T18:35:17Z

For bonus points, optimize on the kpoints that are real (there's gamma but also some others; see https://docs.abinit.org/theory/wavefunctions/ for a nice writeup) and store wavefunctions in packed mode. That should be relatively easy since the only place we actually use them is in the G_to_r functions: it's just that the basis field of the Kpoint struct only contains half the G vectors. NB: to get a unitary representation (orthogonal basis), a specific treatment is needed: each G component gets a factor sqrt(2) except for the DC one.

mfherbst · 2019-10-28T20:39:00Z

About the dispatch: I think it's fair to only do this if the in-place version of the FFTs is used or if an appropriate kwarg is used.

mfherbst · 2019-10-28T20:39:35Z

Related to #9

antoine-levitt · 2019-12-29T16:41:57Z

So I've been thinking about this, and the main difficulty is the in-place.

First, I've been doing some more serious benchmarking on this: https://gist.github.com/antoine-levitt/3495f47c99676f26824df1fbe9462040. So if I'm interpreting this right, a reasonable mental model is that an alloc, a fill and a copy take about the same time. That unit of time is about 1/8th of an FFT. (some of that might be due to julia issue 130. So allocations do matter, unfortunately (especially when going to eg threading).

The main problem is that we use in-place ffts, which is a lot more tricky with real FFTs, essentially because the input and output don't have the same size (they also don't have the same type, but we can work around that with reinterpret). The real array should be of size 2*(n/2+1) (see eg http://www.fftw.org/fftw3_doc/One_002dDimensional-DFTs-of-Real-Data.html and http://www.fftw.org/fftw3_doc/Multi_002dDimensional-DFTs-of-Real-Data.html#Multi_002dDimensional-DFTs-of-Real-Data). There's a julia interface with an open PR to FFTW but really it's just as simple to do it ourselves). So it means that we can't easily use that and keep the interface we have right now for our functions and in the rest of the code.

So the simplest option is to just drop the in-place FFTs entirely. Does that hurt us very much? In my tests I see a performance difference of about 10% in favour of in-place. So acceptable for the sake of a much cleaner interface.

So, roadmap:

1 loop and thread at the level of Hamiltonian application (rather than FFTs), so FFT functions only ever have to deal with one wavefunction at a time. Ie do for iband=1:nband in mul!(Y,Ham,X). There is no performance penalty in doing so, it simplifies the FFT code quite a bit, improves memory locality and helps us manage memory footprint.

2 do a barebones implementation with out-of-place FFTs keeping the data format for the Psi the same

3 benchmark that implementation. Maybe we need to play some tricks like reuse temporary buffers; we can do that with thread-local buffers (see https://julialang.org/blog/2019/07/multithreading paragraph "thread-local state" for an example; there is also the convenient let buf = ... function f() do_things(buf) end end for function-local variables)

4 switch to a compressed representation for the reciprocal-space representation of the psi. That involves annoying factors of sqrt(2) in both the psi and the nonlocal terms, but we can maybe abstract that away, with something like a on_basis(basis, kpt, gen) where gen(q) is the Fourier transform. That would also simplify some of these pesky sqrt(unit_cell_volume).

antoine-levitt · 2020-01-01T13:28:19Z

So I've started on this, and hit an unexpected fun thing. I was expecting for the zero kpoint to store only half the G vectors, and have a basis like e_G(r) = e^{iGr} + e^{-iGr} and keep the rest of the code as is, ie expand with complex amplitudes on this basis set. But it doesn't work like that: cG e_G isn't real when cG is complex. So the next idea is to expand like e_G(r) = cG e^{iGr} + cG* e^{-iGr}, but then inner products don't match: <c, c'> != <psi, psi'> (even ignoring the pesky sqrt(2) factors). Fundamentally, real eigenvalue problems of size 2N and complex eigenvalue problems of size N are different. Which explains both

antoine@beta ~/Desktop/qe-6.4.1 $ grep -ir _gamma * | wc -l
933

in QE (important functions have two versions, one for gamma and the others for the other kpoints) and

real(dp) :: cg(2,npw*nspinor*nband)

in ABINIT (complex numbers are not used for the storage of the wavefunctions, and npw gets divided by 2 in the case when k is the gamma point).

Dynamic languages in general and julia in particular really shine here. We just have to be careful to type Psi as a Vector{Any}, so that Psi[1] can be a matrix of reals and Psi[2] a matrix of complex.

mfherbst · 2022-10-04T07:02:03Z

Done in #722.

mfherbst added feature New feature or request performance Performance regression or performance-related labels Oct 28, 2019

antoine-levitt mentioned this issue Dec 29, 2019

FFTs only work one band at a time #96

Merged

mfherbst mentioned this issue Mar 4, 2022

Use time-reversal symmetry for orbitals at the gamma point #616

Open

mfherbst mentioned this issue Sep 11, 2022

Introduce irfft for inverse FFT of real-valued quantities #722

Merged

mfherbst closed this as completed Oct 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use real FFTs #36

Use real FFTs #36

antoine-levitt commented Oct 28, 2019

antoine-levitt commented Oct 28, 2019

mfherbst commented Oct 28, 2019

mfherbst commented Oct 28, 2019

antoine-levitt commented Dec 29, 2019 •

edited

Loading

antoine-levitt commented Jan 1, 2020

mfherbst commented Oct 4, 2022

Use real FFTs #36

Use real FFTs #36

Comments

antoine-levitt commented Oct 28, 2019

antoine-levitt commented Oct 28, 2019

mfherbst commented Oct 28, 2019

mfherbst commented Oct 28, 2019

antoine-levitt commented Dec 29, 2019 • edited Loading

antoine-levitt commented Jan 1, 2020

mfherbst commented Oct 4, 2022

antoine-levitt commented Dec 29, 2019 •

edited

Loading