make use of conv_bias_act #1302

Status: Open. Wants to merge 2 commits into master.
Conversation

@gartangh (Contributor) commented Aug 4, 2020

Make use of conv_bias_act from FluxML/NNlib.jl#228.
Should speed up Convolutional Neural Networks significantly after JuliaGPU/CUDA.jl#321.

@DhairyaLGandhi (Member)

Bump

@DhairyaLGandhi (Member)

bors try

bors bot added a commit that referenced this pull request Nov 2, 2020
@bors (Contributor) commented Nov 2, 2020

try

Build failed:

@DhairyaLGandhi (Member) left a comment

Bump

@@ -144,7 +144,7 @@ function (c::Conv)(x::AbstractArray)
   # ndims(x) == ndims(c.weight)-1 && return squeezebatch(c(reshape(x, size(x)..., 1)))
   σ, b = c.σ, reshape(c.bias, ntuple(_->1, length(c.stride))..., :, 1)
   cdims = DenseConvDims(x, c.weight; stride=c.stride, padding=c.pad, dilation=c.dilation)
-  σ.(conv(x, c.weight, cdims) .+ b)
+  conv_bias_act(x, c.weight, cdims, b, σ)
 end
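
For reference, a minimal CPU-only sketch of the equivalence this change relies on, using the generic conv_bias_act fallback from FluxML/NNlib.jl#228; the shapes and the relu activation are illustrative, not taken from this PR:

using NNlib

x = randn(Float32, 28, 28, 3, 4)   # WHCN input batch
w = randn(Float32, 3, 3, 3, 8)     # 3x3 kernel, 3 -> 8 channels
b = zeros(Float32, 1, 1, 8, 1)     # bias reshaped to broadcast over width, height and batch
cdims = DenseConvDims(x, w)

y_unfused = relu.(conv(x, w, cdims) .+ b)              # old path: conv, add bias, activate
y_fused   = NNlib.conv_bias_act(x, w, cdims, b, relu)  # new path: one fused call
y_unfused ≈ y_fused                                    # true, up to floating-point error

On the CPU the fallback is expected to perform essentially the same three steps, so this only checks correctness; the speedup comes from the CUDA/cuDNN method discussed below.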

@DhairyaLGandhi (Member) commented on the diff, Jan 28, 2021

Suggested change:
-conv_bias_act(x, w, cdims::DenseConvDims, b::Zeros, σ) = σ.(conv(x, w, cdims))
+function conv_bias_act(x::CuArray, w::CuArray{T}, cdims::DenseConvDims, b::Zeros, σ) where T
+    bz = CUDA.zeros(size(b)...)
+    NNlib.conv_bias_act(x, w, cdims, bz, σ)
+end

A Member replied:

We need this materialised here since cuDNN expects to get an actual array, and the performance gain over regular Conv calls is enough to justify allocating some memory and quickly putting it back into the pool. We need those 2x speedups on ResNets!

@DhairyaLGandhi (Member) commented Apr 23, 2021

Use reinterpret(reshape, ...) instead

DhairyaLGandhi mentioned this pull request Jan 29, 2021
DhairyaLGandhi added this to the v0.12 milestone Feb 5, 2021
@CarloLucibello (Member)

Using conv_bias_act is an inner implementation detail that should not block v0.12 (especially given that the author is unresponsive).

@DhairyaLGandhi (Member)

It's not an inner detail: conv being equivalent to the forward pass of Conv is essentially the functional form of the layer. Swapping it out for a different function is a breaking change, since conv and conv_bias_act are not compatible functions. (conv does not involve the bias term, for example; conv_bias_act also goes through a completely different code path on CUDA, with different assumptions about accuracy and about which convolution algorithms can be chosen.)

@CarloLucibello (Member)

> It's not an inner detail: conv being equivalent to the forward pass of Conv is essentially the functional form of the layer.

conv is not equivalent to the forward pass of Conv; Conv also adds a bias and an activation on top of that, which means that Conv is equivalent to conv_bias_act plus keeping an inner state.
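
To make the comparison concrete, a hedged sketch (the layer size, the bias reshape, and the reliance on NNlib's generic conv_bias_act fallback are all illustrative, not part of this PR):

using Flux, NNlib

c = Conv((3, 3), 3 => 8, relu)      # the layer's state: weight, bias and activation σ
x = randn(Float32, 28, 28, 3, 4)

cdims = DenseConvDims(x, c.weight; stride=c.stride, padding=c.pad, dilation=c.dilation)
b = reshape(c.bias, 1, 1, :, 1)     # make the stored bias broadcastable over the conv output

c(x) ≈ NNlib.conv_bias_act(x, c.weight, cdims, b, c.σ)  # same forward pass, up to rounding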

@ToucheSir (Member)

> conv_bias_act also goes through a completely different code path on CUDA, with different assumptions about accuracy and about which convolution algorithms can be chosen.

In that case, we should document these assumptions and the tolerances we're okay with in different scenarios, e.g. is Conv allowed to switch to conv_bias_act when bias is set and we're running in a @fastmath context? Whatever the solution, I'm strongly of the opinion that (a) we should make conv_bias_act easier to use, and (b) defining a separate layer for it is a bad idea.

@DhairyaLGandhi (Member)

Well, composition means that we don't need wrappers around kernels that effectively compute the raison d'être of the layer, without polluting the namespace/API. The API needed to extract all the details of a layer for such a "function" would be pretty ugly. It's also not a big deal for users to broadcast an activation over a conv.

@ToucheSir (Member) commented Jul 8, 2021

AIUI, the whole point of conv_bias_act is that certain runtimes (e.g. cuDNN) provide accelerated (e.g. fused) versions that run faster than the usual act.(conv(x, W) .+ b), which incurs at least 3 dispatches/kernel launches. There may be marginally different results because of how the different implementations work, but that's why I mentioned having something like @fastmath.

If we don't make use of conv_bias_act in Flux itself, I can count on one hand the number of people who would use it. There's very little harm in having it as a fast path of Conv that only triggers under specific circumstances.
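
A hypothetical way to check that claim on a GPU (the sizes, the padding, and the use of BenchmarkTools are illustrative; the CuArray method of conv_bias_act is assumed to be provided by the cuDNN-backed code path discussed above):

using CUDA, NNlib, BenchmarkTools

x = CUDA.randn(Float32, 112, 112, 64, 16)
w = CUDA.randn(Float32, 3, 3, 64, 64)
b = CUDA.zeros(Float32, 1, 1, 64, 1)
cdims = DenseConvDims(x, w; padding=1)

# unfused: conv kernel, broadcast add, broadcast relu (at least three launches)
@btime CUDA.@sync relu.(conv($x, $w, $cdims) .+ $b)

# fused: intended to lower to a single cuDNN conv+bias+activation call
@btime CUDA.@sync NNlib.conv_bias_act($x, $w, $cdims, $b, relu)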

@mcabbott (Member) commented Jul 8, 2021

Seems sensible for the fast way to be the default, no? Flux surely doesn't guarantee that exact floating-point values will be the same between versions, and it's definitely not a library targeting uses which care a lot about the 16th decimal place.

I don't see any discussion of how different this is; if it's more than minor floating-point stuff, can someone summarise and provide links? If it is, e.g., substantially less accurate, then you could argue for some Conv(; precise=true) or some global switch or something.

@DhairyaLGandhi (Member)

> some global switch or something.

CUDA.math_mode, @fastmath, etc.
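
As a hedged sketch of what gating on such a switch could look like (fast_conv_forward and the gating policy are hypothetical, not Flux API; CUDA.math_mode and CUDA.FAST_MATH are assumed from CUDA.jl):

using CUDA, NNlib

# Hypothetical helper: take the fused cuDNN path only when the user has opted
# into fast math globally; otherwise use the conservative unfused path.
function fast_conv_forward(x::CuArray, w::CuArray, b::CuArray, σ, cdims::DenseConvDims)
    if CUDA.math_mode() == CUDA.FAST_MATH
        return NNlib.conv_bias_act(x, w, cdims, b, σ)  # fused; algorithm choice may differ
    else
        return σ.(NNlib.conv(x, w, cdims) .+ b)        # unfused baseline
    end
end

# A user would opt in with: CUDA.math_mode!(CUDA.FAST_MATH)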

darsnack linked an issue Jan 12, 2022 that may be closed by this pull request
darsnack mentioned this pull request Jan 12, 2022
Successfully merging this pull request may close these issues:

Bottleneck in mapreducedim for convolutional layers