
add bilinear upsample layer #1136

Closed · wants to merge 6 commits

Conversation

@kczimm commented Apr 21, 2020

Implement a 2D bilinear upsample layer like the one found in TensorFlow.

@johnnychen94 (Contributor) left a comment:

  • need docs
  • need test

src/layers/upsample.jl

```julia
out = similar(x, (newW, newH, C, N))

for n = 1:N, c = 1:C, w = 1:newW, h = 1:newH
```
Contributor:
This scalar loop would be extremely slow when `x` is a CuArray.

Author:
Agreed, but I must admit I don't have experience optimizing Julia for CUDA usage. Do you have suggestions in this case?

Broadcast would be fast on GPU
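For context, a loop-free formulation along the lines being suggested might look roughly like the sketch below. This is not the PR's code; the align-corners-style coordinate convention, the helper name, and the eltype handling are all assumptions, but gather-style vector indexing plus broadcasting is the kind of pattern that maps onto GPU kernels instead of scalar iteration.

```julia
# Sketch of a loop-free bilinear upsample over a WHCN array.
# Vector indexing plus broadcasting should lower to GPU kernels,
# unlike the scalar loop above; edge and eltype handling are simplified.
function upsample_bilinear_broadcast(x::AbstractArray{T,4}, newW, newH) where T
    W, H, C, N = size(x)
    ws = range(1, stop = W, length = newW)    # continuous source coords (1st dim)
    hs = range(1, stop = H, length = newH)    # continuous source coords (2nd dim)
    iw0 = floor.(Int, ws); iw1 = min.(iw0 .+ 1, W); fw = T.(ws .- iw0)
    ih0 = floor.(Int, hs); ih1 = min.(ih0 .+ 1, H); fh = T.(hs .- ih0)
    # four weighted corner gathers, broadcast over the channel and batch dims
    return (1 .- fw) .* (1 .- fh') .* x[iw0, ih0, :, :] .+
           (1 .- fw) .*        fh' .* x[iw0, ih1, :, :] .+
                  fw .* (1 .- fh') .* x[iw1, ih0, :, :] .+
                  fw .*        fh' .* x[iw1, ih1, :, :]
end
```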

@kczimm marked this pull request as draft April 23, 2020 12:02
@kczimm marked this pull request as ready for review April 23, 2020 13:27
@kczimm (Author) commented Apr 23, 2020

The build failed, perhaps due to the GitHub outage I was experiencing? I was getting HTTP 500 errors on clones. I don't know how to trigger a rebuild without pushing another commit.

@johnnychen94 (Contributor) left a comment:

The implementation looks good to me, although I'm not sure whether it is optimal with respect to CuArrays. Interpolation can be interpreted as a convolution operation, so a more efficient implementation might be possible, but I'm not sure. It would be better if someone familiar with this topic could help review.

BTW, can you also add this layer to the docs? I think it could go at https://github.com/FluxML/Flux.jl/blob/master/docs/src/models/layers.md#L10
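To illustrate the interpolation-as-convolution remark: bilinear upsampling by an integer factor can also be written as a transposed convolution with a fixed, separable kernel. A minimal sketch for the factor-2 case follows; `bilinear_kernel` and `bilinear_convtranspose` are hypothetical helpers, not part of this PR, and the edge behaviour differs slightly from true bilinear interpolation.

```julia
using Flux

# 1D bilinear weights for integer factor f; f = 2 gives [0.25, 0.75, 0.75, 0.25],
# and the 2D kernel is the separable outer product of that vector with itself.
bilinear_kernel(f) = (w = 1 .- abs.((1:2f) .- (f + 0.5)) ./ f; w * w')

# Fixed transposed convolution that upsamples each channel independently by f
# (even f only; odd factors need different kernel sizes and padding).
function bilinear_convtranspose(channels; f = 2)
    layer = ConvTranspose((2f, 2f), channels => channels; stride = f, pad = f ÷ 2)
    layer.weight .= 0
    k = bilinear_kernel(f)
    for c in 1:channels
        layer.weight[:, :, c, c] .= k   # channel c only reads from channel c
    end
    return layer
end
```

Whether this actually beats a dedicated kernel or a broadcast formulation on CuArrays is exactly the open question, so it is meant only as an illustration.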

````
0.823648 0.658877 0.329336 0.164566
0.845325 0.675933 0.337149 0.167757
0.888679 0.710044 0.352773 0.174138
0.910357 0.7271 0.360586 0.177329```
````
Contributor:

Suggested change (put the closing code fence on its own line):

- 0.910357  0.7271    0.360586  0.177329```
+ 0.910357  0.7271    0.360586  0.177329
+ ```


Create an upsampling layer that uses bilinear interpolation.

The width and height dimensions grow by the `factor` tuple.
Contributor:

The 1st dimension is height, because Julia arrays are column-major.

Suggested change:

- The width and height dimensions grow by the `factor` tuple.
+ The input is commonly interpreted as a batch of images, where the height (1st) and width (2nd) dimensions grow by the `factor` tuple.

"""
BilinearUpsample(factor::Tuple{Integer,Integer})

Create an upsampling layer that uses bilinear interpolation.
Contributor:

Suggested change:

- Create an upsampling layer that uses bilinear interpolation.
+ Create an upsampling layer that uses bilinear interpolation to upscale the first two dimensions of a 4D input.

factor::Tuple{T,T}
end

function (b::BilinearUpsample)(x::AbstractArray)
Contributor:

Suggested change:

- function (b::BilinearUpsample)(x::AbstractArray)
+ function (b::BilinearUpsample)(x::AbstractArray{T, 4}) where T

end

function (b::BilinearUpsample)(x::AbstractArray)
W, H, C, N = size(x)
Contributor:

It might be better to swap W and H, but it's still okay to keep it as is, since most people confuse them.

Author:

The data in Flux is stored in WHCN order, isn't it?

> Data should be stored in WHCN order (width, height, # channels, batch size).

@johnnychen94 (Contributor), Apr 24, 2020:

It is, but I prefer to read it as a misunderstanding. I just wanted to mention it here in case you're not aware of it.

It's okay to abuse the usage of WH since it's relative to the column/row-first order.

@kczimm (Author) commented Apr 24, 2020

I put this in a new file, which I was hesitant to do. Perhaps it belongs with the conv code and docs, since this is really used in convolutional networks?

@johnnychen94 (Contributor) commented Apr 24, 2020

> I put this in a new file, which I was hesitant to do. Perhaps it belongs with the conv code and docs, since this is really used in convolutional networks?

I guess where it's put doesn't matter much. One thing I forgot to mention: export the symbol `BilinearUpsample` in `src/Flux.jl`.

@DhairyaLGandhi (Member):
Let's add GPU tests, and maybe we could think about performance on the GPU. It might be that we want to go with a broadcasting approach.
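A minimal sketch of the kind of GPU test being suggested, assuming the CuArrays-era `gpu` helper and that a `(2, 2)` factor doubles the first two dimensions of a WHCN input:

```julia
using Flux, CuArrays, Test

up = BilinearUpsample((2, 2))
x = rand(Float32, 4, 4, 3, 2)      # WHCN input

y_cpu = up(x)
y_gpu = up(gpu(x))

@test size(y_cpu) == (8, 8, 3, 2)  # spatial dims doubled
@test collect(y_gpu) ≈ y_cpu       # CPU and GPU paths agree
```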

@scimas (Contributor) commented May 1, 2020

Linking this code, written for bilinear interpolation using Interpolations.jl, for reference in case it offers any value for optimizing this PR: comment.
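For reference, per-channel bilinear upsampling with Interpolations.jl looks roughly like the sketch below. This is not the linked code; a WHCN layout, integer factors, and the helper name are assumptions, and it runs on the CPU.

```julia
using Interpolations

# CPU sketch: upsample each channel/batch slice with a linear B-spline interpolant.
function upsample_interp(x::AbstractArray{T,4}, factor::NTuple{2,Int}) where T
    W, H, C, N = size(x)
    newW, newH = W * factor[1], H * factor[2]
    xs = range(1, stop = W, length = newW)   # query points along the 1st dim
    ys = range(1, stop = H, length = newH)   # query points along the 2nd dim
    out = similar(x, newW, newH, C, N)
    for n in 1:N, c in 1:C
        itp = interpolate(x[:, :, c, n], BSpline(Linear()))
        out[:, :, c, n] .= [itp(i, j) for i in xs, j in ys]
    end
    return out
end
```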

@CarloLucibello (Member):
The approach in this PR seems neither Zygote- nor Flux-friendly; I don't think this is the way to go. @scimas, did you test your code with differentiation and on GPU?

@scimas (Contributor) commented May 1, 2020

@CarloLucibello I just got around to testing it today; it doesn't work with Zygote. I get errors about being unable to check bounds.

@scimas (Contributor) commented May 2, 2020

So I did some more testing, and Zygote calculates the gradient without any problem if I explicitly tell it to use `forwarddiff`. I still have not tested on GPU.

```julia
using Zygote

up = Upsampler(2, 2);
x = rand(2, 2, 1, 1);
model(x) = sum(up(x));

# This works
grad = gradient(x) do p
    Zygote.forwarddiff(p) do p
        model(p)
    end
end

# But this doesn't
grad = gradient(x) do x
    model(x)
end
```

where `Upsampler` is from the previously linked code.

And frankly, I don't have enough expertise to figure out why backward diff doesn't work; simply figuring out the `forwarddiff` part took most of the day.
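For what it's worth, the working forward-mode pattern above can be folded into a small helper so callers don't need the explicit `forwarddiff` block each time. `fwdgrad` is a hypothetical name, and `model` and `x` are from the snippet above; this is just a convenience wrapper, not a fix for the reverse-mode issue.

```julia
using Zygote

# Differentiate f at x by routing the whole computation through forward mode,
# exactly as in the working example above.
fwdgrad(f, x) = gradient(p -> Zygote.forwarddiff(f, p), x)

grad = fwdgrad(model, x)   # same result as the explicit do-block version
```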

@kczimm (Author) commented May 13, 2020

Closing in favor of the superior #1180. Thanks to everyone who reviewed this PR.

@kczimm closed this May 13, 2020