add bilinear upsample layer #1136
Conversation
- need docs
- need test
src/layers/upsample.jl (Outdated)

```julia
out = similar(x, (newW, newH, C, N))
for n = 1:N, c = 1:C, w = 1:newW, h = 1:newH
```
This scalar loop would be extremely slow when `x` is a CuArray.
Agreed, but I must admit I don't have experience optimizing Julia for CUDA usage. Do you have suggestions in this case?
Broadcast would be fast on the GPU.
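For illustration, here is a minimal sketch of what a broadcast/gather-style implementation might look like. It is not the PR's code: the function name, the WHCN layout, a floating-point eltype, integer scale factors, and the half-pixel sampling convention are all assumptions.

```julia
# Sketch: bilinear upsample via bulk vector indexing + broadcasting,
# avoiding any scalar loop over array elements (GPU-friendly).
function bilinear_upsample_broadcast(x::AbstractArray{T,4}, factor::Tuple{Int,Int}) where T
    W, H, C, N = size(x)
    newW, newH = W * factor[1], H * factor[2]
    # Fractional source coordinates for each output pixel (half-pixel centers)
    wr = ((1:newW) .- T(0.5)) ./ factor[1] .+ T(0.5)
    hr = ((1:newH) .- T(0.5)) ./ factor[2] .+ T(0.5)
    # Neighbouring source indices, clamped to the array bounds
    w0 = clamp.(floor.(Int, wr), 1, W);  w1 = min.(w0 .+ 1, W)
    h0 = clamp.(floor.(Int, hr), 1, H);  h1 = min.(h0 .+ 1, H)
    # Interpolation weights, reshaped so they broadcast over the W and H axes
    fw = reshape(clamp.(wr .- w0, T(0), T(1)), newW, 1, 1, 1)
    fh = reshape(clamp.(hr .- h0, T(0), T(1)), 1, newH, 1, 1)
    # Gather the four corner planes with vector indexing (one bulk op each)
    x00 = x[w0, h0, :, :];  x10 = x[w1, h0, :, :]
    x01 = x[w0, h1, :, :];  x11 = x[w1, h1, :, :]
    return (1 .- fw) .* (1 .- fh) .* x00 .+ fw .* (1 .- fh) .* x10 .+
           (1 .- fw) .* fh .* x01 .+ fw .* fh .* x11
end
```

Because the four corner planes are gathered in bulk and combined with broadcasting, no per-element iteration runs on the device.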
The build failed, perhaps due to the GitHub outage I was experiencing? I was getting HTTP 500 errors on clones. I don't know how to trigger a rebuild without pushing another commit.
The implementation looks good to me, although I'm not sure whether it's optimal with respect to CuArrays. Interpolation can be interpreted as a convolution operation, so a more efficient implementation might be possible, though I'm not sure. It would be better if someone familiar with this topic could help review.
BTW, can you also add this layer to the docs? I think it could go in https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/layers.md#L10
0.823648 0.658877 0.329336 0.164566
0.845325 0.675933 0.337149 0.167757
0.888679 0.710044 0.352773 0.174138
0.910357 0.7271 0.360586 0.177329```
Suggested change:
- 0.910357 0.7271 0.360586 0.177329```
+ 0.910357 0.7271 0.360586 0.177329
+ ```
Create an upsampling layer that uses bilinear interpolation.

The width and height dimensions grow by the `factor` tuple.
The 1st dimension is height, because Julia arrays are column-first.
Suggested change:
- The width and height dimensions grow by the `factor` tuple.
+ The input is commonly interpreted as a batch of images, where the height (1st) and width (2nd) dimensions grow by the `factor` tuple.
""" | ||
BilinearUpsample(factor::Tuple{Integer,Integer}) | ||
|
||
Create an upsampling layer that uses bilinear interpolation. |
Suggested change:
- Create an upsampling layer that uses bilinear interpolation.
+ Create an upsampling layer that uses bilinear interpolation to upscale the first two dimensions of a 4D input.
```julia
    factor::Tuple{T,T}
end

function (b::BilinearUpsample)(x::AbstractArray)
```
Suggested change:
- function (b::BilinearUpsample)(x::AbstractArray)
+ function (b::BilinearUpsample)(x::AbstractArray{T, 4}) where T
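For illustration, a hedged sketch of what the suggested constraint buys: with the 4D signature, a shape mistake fails immediately at dispatch time. The struct body and trivial method here are stand-ins, not the PR's code.

```julia
struct BilinearUpsample{T}
    factor::Tuple{T,T}
end

# With the constrained signature, only 4D (W, H, C, N) inputs dispatch;
# the body is elided, as this only demonstrates the method signature.
(b::BilinearUpsample)(x::AbstractArray{T,4}) where T = x

b = BilinearUpsample((2, 2))
b(rand(4, 4, 3, 1))   # ok: a W×H×C×N batch
# b(rand(4, 4, 3))    # MethodError: caught at dispatch instead of mid-loop
```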
```julia
end

function (b::BilinearUpsample)(x::AbstractArray)
    W, H, C, N = size(x)
```
It might be better to swap `W` and `H`, but it's still okay to keep it as it is, since most people confuse them.
The data in Flux is stored in WHCN order, isn't it?

Line 17 in 7a32a70:
> Data should be stored in WHCN order (width, height, # channels, batch size).
It is, but I prefer to read that as a common misunderstanding; I just wanted to mention it here in case you weren't aware of it.
It's okay to abuse the usage of WH, since it's relative to the column/row-first order.
I put this in a new file, which I was hesitant to do. Perhaps it belongs with the conv code and docs, since this is mainly used in convolutional networks?
I guess where it's put doesn't matter much. One thing I forgot to mention: the `BilinearUpsample` symbol should be exported.
Let's add GPU tests, and maybe we should think about performance on the GPU. It may be that we want to go with a broadcasting approach.
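A sketch of what such a GPU test might look like, under stated assumptions: the package was CuArrays.jl at the time of this PR, while the `CUDA`/`cu` API shown here is the later equivalent, and `BilinearUpsample` is the layer under review.

```julia
using Test, CUDA  # CuArrays.jl at the time of this PR; CUDA.jl assumed here

up = BilinearUpsample((2, 2))   # the layer under review
x  = rand(Float32, 4, 4, 3, 2)

# The GPU result should match the CPU result up to floating-point error
@test Array(up(cu(x))) ≈ up(x)
```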
Linking this code written for bilinear interpolation using Interpolations.jl for reference, in case it offers any value for optimizing this PR: comment
The approach in this PR seems neither Zygote- nor Flux-friendly; I don't think this is the way to go. @scimas, did you test your code with differentiation and on GPU?
@CarloLucibello I just got around to testing it today; it doesn't work with Zygote. I get errors about being unable to check bounds.
So I did some more testing, and Zygote calculates the gradient without any problem if I explicitly tell it to perform forward differentiation:

```julia
using Zygote

up = Upsampler(2, 2);
x = rand(2, 2, 1, 1);
model(x) = sum(up(x));

# This works
grad = gradient(x) do p
    Zygote.forwarddiff(p) do p
        model(p)
    end
end

# But this doesn't
grad = gradient(x) do x
    model(x)
end
```

where `Upsampler` is the layer from this PR. And frankly, I don't have enough expertise to figure out why backward diff doesn't work; simply figuring out the forwarddiff part took most of the day.
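For context, the usual failure mode here is array mutation: Zygote's reverse mode cannot differentiate through `setindex!`, which a loop-and-fill implementation relies on. A minimal, illustrative reproduction (not the PR's code):

```julia
using Zygote

function fill_loop(x)
    out = similar(x)
    for i in eachindex(x)
        out[i] = 2 * x[i]   # setindex! is unsupported by Zygote's reverse mode
    end
    return sum(out)
end

gradient(fill_loop, rand(3))
# ERROR: Mutating arrays is not supported
```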
Closing in favor of the superior #1180. Thanks everyone who reviewed this PR.
Implement a 2D bilinear upsample layer like the one found in TensorFlow.