cudnn complex convolution via gauss trick #517

nikopj · 2023-06-30T04:02:08Z

Addresses #510 using the same method as PyTorch (Gauss's trick: a complex convolution computed via three real convolutions).
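
For readers unfamiliar with the trick, here is a minimal sketch of the idea. It is not the code in this PR: it uses a plain-Julia 1D convolution instead of cuDNN, and the names `conv1d` and `gauss_conv1d` are hypothetical. It only relies on convolution being linear in each argument.

```julia
# Plain-Julia 1D "valid" convolution, written for clarity rather than speed.
# `conv1d` is a hypothetical helper, not an NNlib function.
function conv1d(x::AbstractVector, w::AbstractVector)
    n, k = length(x), length(w)
    T = promote_type(eltype(x), eltype(w))
    y = zeros(T, n - k + 1)
    for i in eachindex(y), j in 1:k
        # true convolution: the kernel is flipped relative to cross-correlation
        y[i] += x[i + j - 1] * w[k - j + 1]
    end
    return y
end

# Complex convolution from three real convolutions (Gauss's trick).
# With x = a + ib and w = c + id:
#   k1 = conv(a + b, c),  k2 = conv(a, d - c),  k3 = conv(b, c + d)
#   real(y) = k1 - k3,    imag(y) = k1 + k2
function gauss_conv1d(x::AbstractVector{<:Complex}, w::AbstractVector{<:Complex})
    a, b = real(x), imag(x)
    c, d = real(w), imag(w)
    k1 = conv1d(a .+ b, c)
    k2 = conv1d(a, d .- c)
    k3 = conv1d(b, c .+ d)
    return complex.(k1 .- k3, k1 .+ k2)
end

x = ComplexF64[1 + 2im, 3 - 1im, -2 + 0.5im, 4 + 4im]
w = ComplexF64[0.5 - 1im, 2 + 3im]
# The three-convolution result should match the direct complex convolution.
@assert gauss_conv1d(x, w) ≈ conv1d(x, w)
```

The direct approach needs four real convolutions (a∗c, b∗d, a∗d, b∗c); the trick trades one of them for a few elementwise additions, which is usually cheaper when the convolution dominates the cost.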

PR Checklist

Tests are added
Documentation: conv docstring updated

ext/NNlibCUDACUDNNExt/conv.jl

CarloLucibello · 2023-07-01T10:42:40Z

Great, looks good to go. Since this is an important feature, maybe we should document it somewhere? the conv docstring?

nikopj · 2023-07-02T20:20:56Z

> Great, looks good to go. Since this is an important feature, maybe we should document it somewhere? the conv docstring?

One difference I noticed between the CPU and CUDA versions is that the CUDA version could only handle the case where both arguments are complex, whereas the CPU version can handle a mix (though with errors for some of the pullback (∇) functions). The errors are understandable where the mix of real and complex doesn't make sense. E.g. for dx = ∇conv_data(dy, w), we can have (w real, dy complex) but not (w complex, dy real), because w complex implies y complex for y = conv(x, w).

I've now added a few more functions to allow the CUDA conv functions to also handle a mix of real and complex inputs (in the same way that the CPU version can), and an additional sentence to the conv docstring mentioning that real and complex inputs are allowed. I tried to make the implementation cleaner by writing an inline function _complex! which handles the beta if-statement.

There is one test that fails for the mixed case (∇conv_filter!, non-zero beta, flipkernel=true, all input sizes), though I can't seem to figure out why. I've left it commented out for now with a note. Minus this one bug, I think things are ready to go.

ToucheSir · 2023-07-04T20:02:10Z

I would make sure the case with non-zero beta isn't a correctness issue. We have some handling of that for single conv calls, but some of the fast paths (e.g. FluxML/NNlibCUDA.jl#53) may no longer hold if one chains a bunch of them with delegated arguments like this PR does.

nikopj · 2023-07-04T21:53:55Z

OK, I think the beta bug is an error in the CPU version. Here's an MWE for the mixed real-complex ∇conv_filter! with non-zero beta:

```julia
using NNlib, CUDA, cuDNN

# Compare CPU and GPU results over element type, beta, and flipkernel.
for T=(Float64, ComplexF64), beta=(0,1), flip=(false, true)
    @show T, beta, flip
    x_cpu = fill(T(1), 2, 1, 1)
    w_cpu = T.([1; -1;;;])
    x_gpu = CuArray(x_cpu)
    w_gpu = CuArray(w_cpu)
    cdims = NNlib.DenseConvDims(x_cpu, w_cpu; flipkernel=flip)
    y_cpu = fill(T(1), 1, 1, 1)
    y_gpu = CuArray(y_cpu)

    # x is passed as real, so for T=ComplexF64 this is the mixed real-complex case.
    w_cpu_2 = NNlib.∇conv_filter!(copy(w_cpu), real(x_cpu), y_cpu, cdims, alpha=T(1), beta=T(beta))
    w_gpu_2 = NNlib.∇conv_filter!(copy(w_gpu), real(x_gpu), y_gpu, cdims, alpha=T(1), beta=T(beta))
    @show w_cpu_2
    @show w_gpu_2
    @show w_cpu_2 ≈ Array(w_gpu_2)
end
```

Output:

```
(T, beta, flip) = (Float64, 0, false)
w_cpu_2 = [1.0; 1.0;;;]
w_gpu_2 = [1.0; 1.0;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (Float64, 0, true)
w_cpu_2 = [1.0; 1.0;;;]
w_gpu_2 = [1.0; 1.0;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (Float64, 1, false)
w_cpu_2 = [2.0; 0.0;;;]
w_gpu_2 = [2.0; 0.0;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (Float64, 1, true)
w_cpu_2 = [2.0; 0.0;;;]
w_gpu_2 = [2.0; 0.0;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (ComplexF64, 0, false)
w_cpu_2 = [1.0 + 0.0im; 1.0 + 0.0im;;;]
w_gpu_2 = [1.0 + 0.0im; 1.0 + 0.0im;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (ComplexF64, 0, true)
w_cpu_2 = [1.0 + 0.0im; 1.0 + 0.0im;;;]
w_gpu_2 = [1.0 + 0.0im; 1.0 + 0.0im;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (ComplexF64, 1, false)
w_cpu_2 = [2.0 + 0.0im; 0.0 + 0.0im;;;]
w_gpu_2 = [2.0 + 0.0im; 0.0 + 0.0im;;;]
w_cpu_2 ≈ Array(w_gpu_2) = true
(T, beta, flip) = (ComplexF64, 1, true)
w_cpu_2 = [0.0 + 0.0im; 2.0 + 0.0im;;;]
w_gpu_2 = [2.0 + 0.0im; 0.0 + 0.0im;;;]
w_cpu_2 ≈ Array(w_gpu_2) = false
```

So only the last case, (T=ComplexF64, beta=1, flipkernel=true), fails. As all of the arguments have zero imaginary part, we should expect the answer to be the same as for T=Float64, where the CPU and GPU results agree on [2.0; 0.0]. This points to the CPU version being incorrect.

Edit

I investigated this further in #518; the solution is in #519. This PR should be good to go once that is merged.

ToucheSir

I noticed that we already have GPU-related notes in https://github.com/FluxML/NNlib.jl/blob/v0.9.3/docs/src/reference.md?plain=1#L79. It would be good to add what @CarloLucibello mentioned there too. Otherwise this LGTM.

ToucheSir · 2023-07-15T23:53:37Z

It's great to have this feature now, thanks!

cudnn complex convolution via gauss trick

df7abef

CarloLucibello reviewed Jun 30, 2023

ext/NNlibCUDACUDNNExt/conv.jl (outdated)

CarloLucibello reviewed Jun 30, 2023

ext/NNlibCUDACUDNNExt/conv.jl

Nikola and others added 2 commits June 30, 2023 10:58

removed debug line, added gauss trick comment

2bc4a24

fix typo in comment

b5d7008

Nikola added 2 commits July 2, 2023 16:00

conv doc updated, mixed real and complex cuda conv added

b935ce3

Merge branch 'nikopj-complexconv' of github.com:nikopj/NNlib.jl into …

ca80b6c

…nikopj-complexconv

Nikola and others added 4 commits July 5, 2023 21:25

update complexconv tests

be44298

Merge branch 'master' into nikopj-complexconv

57b61b8

complex conv test updates

2bcbb58

Merge branch 'FluxML:master' into nikopj-complexconv

f3f8f9a

ToucheSir approved these changes Jul 11, 2023

doc reference mention complex conv on CUDA and CPU

970029c

ToucheSir merged commit 6824c6d into FluxML:master Jul 15, 2023
