Add fold and unfold #444

nikopj · 2022-11-23T16:23:17Z

Added fold/unfold operators for PyTorch feature parity. These are useful for nonlocal operators in convnets. I tried to follow the style of conv and meanpool, and an argument structure similar to PyTorch's fold/unfold.

I initially looked at pull #303 but ended up writing things a little differently.

I have not had the chance to test on GPU yet, but can give it a go soon.

Feedback is greatly appreciated!

PR Checklist

Tests are added (autodiff, wrapper, fold/unfold as inverses)
Documentation (of wrapper functions only)

ToucheSir

This looks great! Just one suggestion.

ToucheSir · 2022-11-24T04:44:23Z

test/fold.jl

+    gradtest(x -> sum(unfold(x, cdims)), x)
+
+    y = unfold(x, cdims)
+    gradtest(y -> sum(fold(y, size(x), cdims)), y)


Suggested change

-    gradtest(x -> sum(unfold(x, cdims)), x)
-
-    y = unfold(x, cdims)
-    gradtest(y -> sum(fold(y, size(x), cdims)), y)
+    gradtest(unfold, x, cdims; check_rrule=true)
+
+    y = unfold(x, cdims)
+    gradtest(fold, y, size(x), cdims; check_rrule=true)

Should save a lambda and test a little more at the same time.

nikopj · 2022-11-24

I think FiniteDifferences is causing an error in gradtest by trying to perturb the integer fields of cdims.

AutoDiff: spatial_rank=1: Error During Test at /home/nikopj/.julia/dev/NNlib/test/fold.jl:29
  Got exception outside of a @test
  TypeError: in new, expected Tuple{Int64}, got a value of type Tuple{Float64}
  ...

I can pass the finite-differences test by making the tested function depend on the input array only, and I can pass the CRC rrule test by calling it separately. Looking at test/conv.jl, it seems to play a similar game with gradtest. The change below passes both the finite-differences and the rrule tests.

gradtest(x -> unfold(x, cdims), x)
test_rrule(unfold, x, cdims)
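
The idea behind this workaround, differentiating with respect to the array only while a closure holds the dims object fixed, can be sketched in plain Julia. This is an illustration under stated assumptions, not NNlib's gradtest: WindowDims, window_energy and fd_gradient are hypothetical names invented for this example, and the toy function stands in for unfold.

```julia
# Hypothetical stand-in for a dims object whose fields are integers only.
struct WindowDims
    kernel::Tuple{Int}
end

# Toy scalar function of the data: sum of squares over every sliding window.
function window_energy(x::AbstractVector, d::WindowDims)
    k = d.kernel[1]
    return sum(sum(abs2, view(x, i:i+k-1)) for i in 1:(length(x) - k + 1))
end

# Central finite differences with respect to the array argument only.
function fd_gradient(f, x; h=1e-6)
    g = zeros(length(x))
    for i in eachindex(x)
        xp = copy(x); xp[i] += h
        xm = copy(x); xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2h)
    end
    return g
end

d = WindowDims((3,))
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# The closure captures d, so only x is ever perturbed.
g = fd_gradient(x -> window_energy(x, d), x)
```

Because d never reaches the finite-difference routine, its integer fields are not turned into floats.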

mcabbott · 2022-11-24T16:43:42Z

For my own understanding, and perhaps for docs, an example where the numbers show you what's going where:

julia> x = [100; 2; 3; 40; 5; 6; 700;;;];  # 1D data, 1 channel, batch of 1

julia> size(x)
(7, 1, 1)

julia> y3 = unfold(x, (3,1,1))  # sliding window of length 3, stride 1
5×3×1 Array{Int64, 3}:          # size = (5 windows, each 3 elements, 1 batch)
[:, :, 1] =
   3   2  100
  40   3    2
   5  40    3
   6   5   40
 700   6    5

julia> fold(ans, size(x), (3,1,1))  # sum of contributions in y. 100 appears once, 40 three times.
7×1×1 Array{Int64, 3}:
[:, :, 1] =
 100
   4
   9
 120
  15
  12
 700

julia> y4 = unfold(x, (4,1,1), stride=2, pad=2)  # sliding window of length 4, stride 2
4×4×1 Array{Int64, 3}:
[:, :, 1] =
  2  100   0    0
 40    3   2  100
  6    5  40    3
  0  700   6    5

julia> fold(y4, size(x), (4,1,1), stride=2, pad=2)
7×1×1 Array{Int64, 3}:
[:, :, 1] =
 200
   4
   6
  80
  10
  12
 700

julia> eachslice(unfold(x, (3,1,1), stride=3), dims=1) .|> vec .|> collect  # non-overlapping windows
2-element Vector{Vector{Int64}}:
 [3, 2, 100]
 [6, 5, 40]      # note that it omits 700 rather than pad

julia> Iterators.partition(x, 3) .|> collect  # Base's equivalent
3-element Vector{Vector{Int64}}:
 [100, 2, 3]
 [40, 5, 6]      # note that the order is reversed
 [700]

Some things to note:

The order of indices is the reverse of Python's, which seems right.
pad seems to be an upper bound on the padding, not a lower bound -- unfold will discard rather than pad.
The order of elements within each slice is reversed. I doubt it matters much so long as this is consistent, but is it deliberate or accidental?
The accumulation by fold is what you want for it to be the gradient of unfold.
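
To make the window extraction and the accumulation concrete, here is a sketch of reference loops for a single-channel, single-batch 1D signal. It is not the PR's implementation: unfold1d_sketch and fold1d_sketch are hypothetical names, and the flipped keyword only mimics the behaviour seen in this thread (flipped=false reverses each window).

```julia
# Hypothetical reference loops, assuming symmetric zero padding and no dilation.
function unfold1d_sketch(x::AbstractVector, k::Int; stride::Int=1, pad::Int=0, flipped::Bool=false)
    xp = vcat(zeros(eltype(x), pad), x, zeros(eltype(x), pad))  # zero-pad both ends
    n = (length(xp) - k) ÷ stride + 1                            # only windows that fit entirely
    y = zeros(eltype(x), n, k)
    for w in 1:n, j in 1:k
        col = flipped ? j : k - j + 1          # flipped=false stores each window reversed
        y[w, col] = xp[(w - 1) * stride + j]
    end
    return y
end

function fold1d_sketch(y::AbstractMatrix, len::Int, k::Int; stride::Int=1, pad::Int=0, flipped::Bool=false)
    xp = zeros(eltype(y), len + 2pad)
    for w in 1:size(y, 1), j in 1:k
        col = flipped ? j : k - j + 1
        xp[(w - 1) * stride + j] += y[w, col]  # overlapping windows accumulate
    end
    return xp[pad+1:pad+len]                   # drop the padded border
end

x = [100, 2, 3, 40, 5, 6, 700]
y3 = unfold1d_sketch(x, 3)
x3 = fold1d_sketch(y3, length(x), 3)
y4 = unfold1d_sketch(x, 4; stride=2, pad=2)
x4 = fold1d_sketch(y4, length(x), 4; stride=2, pad=2)
```

Under these assumptions, y3, x3, y4 and x4 reproduce the numbers in the example above, with the trailing singleton dimensions dropped.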

nikopj · 2022-11-24T17:38:50Z

@mcabbott, your example is nice, I will add it to the docs.

The windows follow what conv would see: only valid windows that fit the entire conv kernel are used, the same as a valid conv. For example, conv with similar arguments has only two output elements.

julia> x = [100; 2; 3; 40; 5; 6; 700;;;];

julia> w = [1;1;1;;;];

julia> conv(x, w; stride=3)
2×1×1 Array{Int64, 3}:
[:, :, 1] =
 105
  51
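
The window counts seen in this thread are consistent with the usual valid-convolution size formula. A small sketch, assuming symmetric padding and no dilation; num_windows is a hypothetical helper, not an NNlib function:

```julia
# Hypothetical helper: number of complete windows along one spatial dimension.
num_windows(len, k; stride=1, pad=0) = (len + 2pad - k) ÷ stride + 1

n_stride1 = num_windows(7, 3)                    # 5 windows of length 3
n_padded  = num_windows(7, 4; stride=2, pad=2)   # 4 windows of length 4
n_stride3 = num_windows(7, 3; stride=3)          # 2 windows, the trailing 700 does not fit
```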

The ordering of slice elements is reversed, but this is deliberate, for consistency with conv, and it can be changed with flipped=true. I'm happy to make flipped=true the default if you think that's more sensible. In any case, I'll add a word about it in the docs. Example:

julia> unfold(x, (3,1,1); stride=3, flipped=true)
2×3×1 Array{Int64, 3}:
[:, :, 1] =
 100  2  3
  40  5  6

I can mention in the docs that fold is the adjoint/transpose operator of unfold and hence the gradient of unfold.
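
One way to see the adjoint relationship is to write unfold as an explicit 0/1 matrix, so that fold is just its transpose. The sketch below assumes 1D data, one channel, no padding, and windows in natural (flipped=true) order; unfold_matrix_sketch is a hypothetical helper, not part of NNlib.

```julia
# Hypothetical helper: dense matrix A such that vec(unfold(x)) == A * x.
function unfold_matrix_sketch(len::Int, k::Int; stride::Int=1)
    n = (len - k) ÷ stride + 1
    A = zeros(Int, n * k, len)
    for w in 1:n, j in 1:k
        row = (j - 1) * n + w              # column-major index of entry (w, j) in an n×k array
        A[row, (w - 1) * stride + j] = 1
    end
    return A
end

x = [100, 2, 3, 40, 5, 6, 700]
A = unfold_matrix_sketch(length(x), 3)

y = reshape(A * x, :, 3)       # 5×3 array of windows
x_back = A' * vec(y)           # the transpose sums the overlapping contributions

# Adjoint identity <A*x, v> == <x, A'*v> for an arbitrary v
v = collect(1:size(A, 1))
lhs = sum((A * x) .* v)
rhs = sum(x .* (A' * v))
```

Since unfold is linear in x, the pullback of its output cotangent is multiplication by the transpose, which is the accumulation that fold performs.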

mcabbott · 2022-11-24T17:56:35Z

Oh, I missed the flipped keyword, sorry. I have a slight preference for the opposite convention, if it's easy to change. I see this doesn't change the channel order, which is good:

julia> unfold(hcat(x, zero(x), -x), (3,3,1), flipped=true)  # 1D data, 3 channels
5×9×1 Array{Int64, 3}:
[:, :, 1] =
 100   2    3  0  0  0  -100   -2    -3
   2   3   40  0  0  0    -2   -3   -40
   3  40    5  0  0  0    -3  -40    -5
  40   5    6  0  0  0   -40   -5    -6
   5   6  700  0  0  0    -5   -6  -700

And, re stride etc, following conv sounds like exactly the right thing to do.

mcabbott

This looks good, thanks! Nontrivial first PR around here.

I see you're already working on the GPU version, FluxML/NNlibCUDA.jl#59

My only reservation is name collisions from the export really.

mcabbott · 2022-11-28T15:48:38Z

src/NNlib.jl

@@ -61,6 +61,9 @@ export conv, conv!, ∇conv_data, ∇conv_data!, ∇conv_filter,
 include("conv_bias_act.jl")
 export conv_bias_act, conv_bias_act!

+include("fold.jl")
+export unfold, unfold!, fold, fold!


I'm a little worried these names may be too common to export. scatter collided with every plotting library...

It's not working for me right now but https://juliahub.com may be able to tell us.

https://juliahub.com/ui/Search?q=fold&type=symbols&t=function&u=define

ToucheSir · 2022-11-28

Possible name confusion with Base too. Given these functions are somewhat domain-specific, I agree it would be better to keep them unexported.

nikopj · 2022-11-28

No problem, makes sense. That juliahub tool is very useful, thanks for showing.
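
For readers less familiar with what leaving a name unexported means in practice, here is a self-contained sketch of the general Julia behaviour. FoldDemo is a hypothetical module standing in for a package, and its toy fold is not NNlib's.

```julia
module FoldDemo                  # hypothetical stand-in for a package
fold(xs) = sum(xs)               # defined, but deliberately not exported
end

using .FoldDemo

total = FoldDemo.fold([1, 2, 3])          # qualified access always works
leaked = isdefined(@__MODULE__, :fold)    # `using` brought no unqualified `fold` into scope
```

The same pattern applies to any package that keeps a common name unexported: callers write the module-qualified name, or import it explicitly, and other packages' functions of the same name do not collide.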

nikopj added 3 commits November 22, 2022 22:04

fold/unfold added

a61ca3c

fold kernel flipping

e182a67

docs, fix semicolon error

7d9bc02

nikopj changed the title ~~Added fold/unfold functions~~ Add fold and unfold Nov 23, 2022

ToucheSir reviewed Nov 24, 2022

nikopj added 2 commits November 24, 2022 16:22

unfold flipped=true default, added to docs, rrule test

eb71c99

doc example fix for julia 1.6 compat.

5e93219

nikopj mentioned this pull request Nov 28, 2022

added fold/unfold and gpu tests FluxML/NNlibCUDA.jl#59

Merged

1 task

ToucheSir requested a review from mcabbott November 28, 2022 15:32

mcabbott approved these changes Nov 28, 2022

removed fold/unfold from export

9ba19c3

mcabbott merged commit a36f15b into FluxML:master Nov 28, 2022

This was referenced Nov 28, 2022

Add fold.jl #303

Closed

PyTorch feature parity FluxML/Flux.jl#1431

Open
