2D/3D convolution/pooling (`ConvLayer`/`PoolLayer`/`TransposedConvLayer`) depends on order of axes #594

albertz · 2021-08-29T21:26:57Z

We want to avoid anything which depends on the order of axes.

albertz · 2021-09-23T20:23:12Z

Some related discussion is in #597. For example, there was the proposal of in_to_out_dims. For ConvLayer with 2D conv, it would have 3 entries: 2 for the two spatial dims, and one for the linear feature transformation. For PoolLayer and other layers which do not do a linear transformation, it would contain only entries for the spatial dims.

albertz · 2021-09-23T20:23:15Z

The filter_size and related options would not be a list/tuple but instead a dict[DimensionTag, int] which also specifies the spatial axes it operates on.
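
A minimal sketch of what such an option could look like. `DimTag` here is a hypothetical stand-in for a RETURNN dim tag (only identity matters for the example), and the names `time_dim` and `freq_dim` are made up for illustration:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DimTag:
    """Hypothetical stand-in for a RETURNN dim tag; only identity and a name matter here."""
    name: str


time_dim = DimTag("time")
freq_dim = DimTag("freq")

# Proposed style: the keys say which spatial axes the filter operates on,
# so the option itself implies no positional order.
filter_size: Dict[DimTag, int] = {time_dim: 3, freq_dim: 5}
same_filter_size: Dict[DimTag, int] = {freq_dim: 5, time_dim: 3}

# Dict equality ignores insertion order, which is the property the proposal relies on.
assert filter_size == same_filter_size
```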

albertz · 2021-09-23T20:27:55Z

One open question is the native format (shape, order of axes) of the tf.Variable. For that, the order of spatial axes matters.

Also, the native order of the input could change for various reasons, so we must not make it dependent on that.

JackTemaki · 2021-09-23T20:51:44Z

This would be a very welcome addition. I remember I had some cases where I had to play around a lot to make Returnn run the filters over the axes I wanted... Isn't it fine to have a logic that moves the axes in the optimal order? Just like the [T,B,F] transformation for the nativelstm? I am not sure what is happening right now, but I think a reordering based on the selected dimension tags is possible, and the tf.Variable can be defined accordingly.

albertz · 2021-09-23T21:02:24Z

One core principle of RETURNN is that the order of the axes should never matter in any way. This core principle allows layers to reorder axes for efficiency (e.g. like RecLayer). And this is exactly what this issue is about, because this principle is violated by ConvLayer for 2D/3D convolution, and similarly by PoolLayer and some other similar layers.

The proposal here would also not have any order. It would just be like filter_size={dimtag1: 3, dimtag2: 5} or so.

But then, the order of axes for the underlying tf.Variable is ambiguous.

albertz · 2021-09-23T21:09:20Z

> Isn't it fine to have a logic that moves the axes in the optimal order

We already do that. But this should not matter to the user.

> I had to play around a lot to make Returnn run the filters over the axes I wanted

This is not really what this issue is about. This issue is about the order of spatial axes, which matters, although it should not.

I assume in your case, you had some mixup between spatial and feature dim. You do not want to reorder the axes. You simply need to redefine which axis is the feature dim (channel dim). You can use ReinterpretDataLayer with set_axes: {F: ...} for that.
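
As a rough sketch, such a layer could look like this in a network dict. The layer name `reinterpret`, the source `data`, and in particular the axis description `"dim:40"` are assumptions for illustration; which axis description is right depends on the actual data and should be checked against the RETURNN version in use:

```python
# Hypothetical axis description: "the axis with static dimension 40".
# The value is an assumption for this sketch, not taken from the discussion.
feature_axis = "dim:40"

network = {
    "reinterpret": {
        "class": "reinterpret_data",  # ReinterpretDataLayer
        "from": "data",
        # Only redefines which axis counts as the feature dim; the data itself is not reordered.
        "set_axes": {"F": feature_axis},
    },
}
```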

JackTemaki · 2021-09-23T21:12:50Z

> But then, the order of axes for the underlying tf.Variable is ambiguous.

Ah, you mean for the cases where there is no "more efficient" order? Like it is not important if h or w comes first for images? Well... then just a heuristic is fine or not? Like sort by largest filter size first, or sort by name of dimension tag or so...

albertz · 2021-09-23T21:26:21Z

> > But then, the order of axes for the underlying tf.Variable is ambiguous.
>
> Ah, you mean for the cases where there is no "more efficient" order? Like it is not important if h or w comes first for images? Well... then just a heuristic is fine or not? Like sort by largest filter size first, or sort by name of dimension tag or so...

Yeah, something like that. I'm sure there are some possible solutions, although I'm not so happy with any of the potential solutions I have thought about so far.

I don't really like heuristics. And what happens when the filter sizes are the same (which is not so uncommon)?
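
To make the tie problem concrete, here is a small sketch of the "largest filter size first" heuristic. Dim tags are represented by plain strings, which is a simplification, and `order_largest_first` is a made-up helper, not something from RETURNN:

```python
from typing import Dict, List


def order_largest_first(filter_size: Dict[str, int]) -> List[str]:
    """Order spatial dims by filter size, largest first.

    Python's sort is stable, so dims with equal sizes keep their dict insertion order.
    """
    return sorted(filter_size, key=lambda dim: -filter_size[dim])


# Distinct sizes: the result does not depend on how the dict was written.
assert order_largest_first({"time": 3, "freq": 5}) == ["freq", "time"]
assert order_largest_first({"freq": 5, "time": 3}) == ["freq", "time"]

# Equal sizes: the result silently falls back to insertion order,
# so the heuristic alone no longer determines the kernel layout.
assert order_largest_first({"time": 3, "freq": 3}) == ["time", "freq"]
assert order_largest_first({"freq": 3, "time": 3}) == ["freq", "time"]
```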

I also don't really know which order of the filter dims is optimal. Maybe sorting is actually good, and largest first makes it most efficient. Who knows. It probably also depends on the hardware, or even on the CUDA or cuDNN version. E.g. it was true for a long time that batch-feature-major was more efficient on GPU than batch-spatial-major, but I recently read that some of the more recent professional Nvidia GPUs probably make batch-spatial-major more efficient.

The current best way I can think of is to use heuristics anyway, or maybe just use the input spatial axis order, but then also store the order explicitly somewhere in the TF checkpoint. And when we load the checkpoint, make sure the order is fine, or maybe reorder accordingly.
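
The "store the order, then reorder on load" idea could be sketched roughly as follows. This is only an illustration under assumptions: the axis names are hypothetical identifiers that would have to be saved next to the variable, and `kernel_permutation` and `permute_shape` are made-up helpers:

```python
from typing import List, Sequence


def kernel_permutation(stored_order: Sequence[str], wanted_order: Sequence[str]) -> List[int]:
    """Return perm such that axis i of the wanted layout is axis perm[i] of the stored kernel."""
    if sorted(stored_order) != sorted(wanted_order):
        # The stored and the configured spatial dims cannot be matched up at all.
        raise ValueError("spatial dims differ: %r vs %r" % (list(stored_order), list(wanted_order)))
    return [list(stored_order).index(name) for name in wanted_order]


def permute_shape(shape: Sequence[int], perm: Sequence[int]) -> List[int]:
    """Apply the permutation to a shape, the same way a transpose would reorder the axes."""
    return [shape[i] for i in perm]


# The checkpoint was written with (time, freq), the current config wants (freq, time).
perm = kernel_permutation(["time", "freq"], ["freq", "time"])
new_spatial_shape = permute_shape([3, 5], perm)
```

The same permutation, extended by the untouched input and output feature axes, is what a transpose of the loaded kernel would take. The `ValueError` branch corresponds to the case where the names do not match and the order cannot be recovered.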

The dim tag names are also not really good for identification. Especially the internal names (actually the description) are not really fixed and are changed from time to time. See also #634. But when new dim tags are specified explicitly by the user (#597), i.e. no internal names are used, this should be less ambiguous.

However, when you import some model, and the dim tag names do not match for whatever reason, should the import fail then?

JackTemaki · 2021-09-23T21:50:21Z

> However, when you import some model, and the dim tag names do not match for whatever reason, should the import fail then?

Oh, I see... yes then this is really not easy to choose...

albertz · 2021-09-23T21:55:12Z

One option is to make the order explicit in the config. So filter_size=OrderedDict(...).
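
A small sketch of how an explicit order in the config could be read. Plain strings stand in for dim tags here, which is a simplification:

```python
from collections import OrderedDict

# Strings stand in for dim tags in this sketch.
filter_size = OrderedDict([("time", 3), ("freq", 5)])
swapped = OrderedDict([("freq", 5), ("time", 3)])

# The spatial axis order for the kernel is read off the config.
spatial_order = list(filter_size)

# Unlike plain dicts, two OrderedDicts with different orders compare unequal,
# so the order really is part of the configuration.
assert filter_size != swapped
assert dict(filter_size) == dict(swapped)
```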

albertz · 2021-09-25T21:20:49Z

Or, another option (from the comment here) is to have an argument in_spatial_dims (list/tuple of dim tags / axes) which defines the order. Then filter_size and all the other arguments can stay as they are.
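
A sketch of how a layer definition might look with this option. `DimTag` is a hypothetical stand-in so that the example runs without RETURNN, and the concrete values (`n_out`, filter sizes, dim names) are made up. It assumes that the positional `filter_size` is read relative to `in_spatial_dims`:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimTag:
    """Hypothetical stand-in for a RETURNN dim tag, so the sketch runs without RETURNN."""
    name: str


time_dim = DimTag("time")
freq_dim = DimTag("freq")

conv_layer = {
    "class": "conv",
    "from": "data",
    "in_spatial_dims": [time_dim, freq_dim],  # fixes the order of the spatial axes
    "filter_size": (3, 5),  # assumed to be read relative to in_spatial_dims: 3 over time, 5 over freq
    "padding": "same",
    "n_out": 32,
}

# Pair each filter size with the axis it applies to.
per_axis = dict(zip(conv_layer["in_spatial_dims"], conv_layer["filter_size"]))
```

With this, the order of axes in the input tensor itself no longer plays a role; only the order given in `in_spatial_dims` does.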

albertz · 2021-11-29T11:05:38Z

in_spatial_dims is implemented now for ConvLayer and PoolLayer (#789).

However, it is not mandatory yet. Maybe it should be made mandatory as well, via a new behavior version.
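
A rough sketch of what such a version-gated check could look like. Everything here is hypothetical: `check_in_spatial_dims` is a made-up helper, `min_strict_version` is a placeholder threshold, and it assumes the requirement would only apply when there is more than one spatial dim:

```python
from typing import Optional, Sequence


def check_in_spatial_dims(
    num_spatial_dims: int,
    in_spatial_dims: Optional[Sequence[str]],
    behavior_version: int,
    min_strict_version: int,
) -> None:
    """Raise if several spatial dims are used without an explicit order under the strict behavior."""
    if behavior_version < min_strict_version:
        return  # old configs keep working unchanged
    if num_spatial_dims > 1 and in_spatial_dims is None:
        raise ValueError("in_spatial_dims is required for more than one spatial dim")


# Old behavior version: still accepted without in_spatial_dims.
check_in_spatial_dims(2, None, behavior_version=1, min_strict_version=2)
# New behavior version with an explicit order: accepted.
check_in_spatial_dims(2, ["time", "freq"], behavior_version=2, min_strict_version=2)
```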

albertz · 2021-11-29T13:03:20Z

TransposedConvLayer also has the same problems. Fixed via #791.

Fix #594.

albertz mentioned this issue Sep 24, 2021

Specify dim tags for layers that create new axes #597

Closed

albertz mentioned this issue Oct 1, 2021

SplitDimsLayer fix feature_dim_axis on feature-dim split #705

Merged

albertz mentioned this issue Nov 27, 2021

ConvLayer and PoolLayer, in_dim, in_spatial_dims, out_dim, out_spatial_dims #789

Merged

albertz mentioned this issue Nov 29, 2021

TransposedConvLayer in_dim, out_dim, in_spatial_dims, out_spatial_dims #791

Merged

albertz changed the title ~~2D/3D convolution/pooling (ConvLayer/PoolLayer) depends on order of axes~~ 2D/3D convolution/pooling (ConvLayer/PoolLayer/TransposedConvLayer) depends on order of axes Nov 29, 2021

albertz added a commit that referenced this issue Nov 29, 2021

require in_spatial_dims for more than one spatial dims

4cf3311

Fix #594.

albertz mentioned this issue Nov 29, 2021

Conv/Pool, require in_spatial_dims for more than one spatial dims #792

Merged

albertz added a commit that referenced this issue Nov 30, 2021

require in_spatial_dims for more than one spatial dims

23ab0a9

Fix #594.

albertz closed this as completed in #792 Dec 1, 2021

albertz added a commit that referenced this issue Dec 1, 2021

Conv/Pool, require in_spatial_dims for more than one spatial dims (#792)

1d6793e

Fix #594.
