2D/3D convolution/pooling (ConvLayer/PoolLayer/TransposedConvLayer) depends on order of axes
#594
Comments
Some related discussion is in #597. E.g. there was the proposal of
The
One open question is the native format (shape, order of axes) of the Also, the native order of the input could change for various reasons, so we must not make it dependent on that.
This would be a very welcome addition. I remember I had some cases where I had to play around a lot to make RETURNN run the filters over the axes I wanted... Isn't it fine to have a logic that moves the axes into the optimal order? Just like the [T,B,F] transformation for the nativelstm? I am not sure what is happening right now, but I think a reordering based on the selected dimension tags is possible, and the tf.Variable can be defined accordingly.
One core principle of RETURNN is that the order of the axes should never ever matter in any way. This core principle allows layers to reorder axes for efficiency (e.g. like The proposal here would also not have any order. It would just be like But then, the order of axes for the underlying
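To make the principle concrete, here is a minimal sketch of how a layer can move named axes into whatever layout an op prefers, so that the user never has to care about the storage order. The helper names (`permutation`, `transpose_shape`) and the axis names are made up for illustration; this is not RETURNN code.

```python
def permutation(current, wanted):
    """Return indices p such that [current[i] for i in p] == wanted."""
    if sorted(current) != sorted(wanted):
        raise ValueError(f"axis names differ: {current} vs {wanted}")
    return [current.index(name) for name in wanted]


def transpose_shape(shape, perm):
    """Apply an axis permutation to a shape tuple."""
    return tuple(shape[i] for i in perm)


# The same logical tensor, stored in two different layouts.
layout_a = (["batch", "time", "feature"], (8, 100, 40))
layout_b = (["time", "batch", "feature"], (100, 8, 40))

# Layout that some (hypothetical) time-major kernel expects.
wanted = ["time", "batch", "feature"]

shapes = []
for names, shape in (layout_a, layout_b):
    perm = permutation(names, wanted)
    shapes.append(transpose_shape(shape, perm))
```

Both inputs end up with the same shape because the axes are matched by name, not by position.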
We already do that. But this should not matter to the user.
That is not really what this issue is about. This issue is about the order of spatial axes, which matters, although it should not. I assume in your case, you had some mixup between spatial and feature dim. You do not want to reorder the axes. You simply need to redefine which axis is the feature dim (channel dim). You can use
Ah, you mean for the cases where there is no "more efficient" order? Like it is not important if h or w comes first for images? Well... then just a heuristic is fine, or not? Like sort by largest filter size first, or sort by name of dimension tag or so...
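A rough sketch of such a heuristic, just to see how it behaves: sort by filter size, largest first, and break ties by name. The function name and the dim names are hypothetical.

```python
def heuristic_order(filter_sizes):
    """Sort spatial dim names by filter size (largest first), then by name.

    filter_sizes: dict mapping a (hypothetical) dim name to its filter size.
    """
    return sorted(filter_sizes, key=lambda name: (-filter_sizes[name], name))


# Distinct sizes: the larger filter dim comes first.
order_distinct = heuristic_order({"width": 3, "height": 5})

# Equal sizes: only the name decides.
order_tied = heuristic_order({"width": 3, "height": 3})

# Same setup, but the height axis is now called "y": its position flips.
order_renamed = heuristic_order({"width": 3, "y": 3})
```

With equal filter sizes, the result depends only on the names, so renaming an axis changes the order.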
Yeah, something like that. I'm sure there are some possible solutions, although I'm not so happy with any of the ones I have thought about so far. I don't really like heuristics. And what if the filter sizes are the same (which is not so uncommon)? I also don't really know which order of the filter dims is optimal. Maybe sorting is actually good, and largest first makes it most efficient. Who knows. It probably also depends on the hardware, or even on the CUDA or cuDNN version. E.g. it was true for a long time that batch-feature-major was more efficient on GPU than batch-spatial-major, but I recently read that some of the more recent professional Nvidia GPUs probably make batch-spatial-major more efficient. The current best way I can think of is to use heuristics anyway, or maybe just use the input spatial axis order, but then also store the order explicitly somewhere in the TF checkpoint. And when we load the checkpoint, make sure the order is fine, or maybe reorder accordingly. The dim tag names are also not really good for identification. Especially the internal names ( However, when you import some model, and the dim tag names do not match for whatever reason, should the import fail then?
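The "store the order in the checkpoint and reorder on load" idea could look roughly like the sketch below. It uses a plain dict as a stand-in for the checkpoint and stores the filter as `{index_tuple: value}`; nothing here is the actual TF checkpoint format, and all names are made up. In this sketch, a name mismatch makes the load fail, which is only one of the possible answers to the question above.

```python
def permute_axes(tensor, perm):
    """Permute the axes of a tensor stored as {index_tuple: value}."""
    return {tuple(idx[p] for p in perm): v for idx, v in tensor.items()}


def save_filter(tensor, axis_names):
    """Bundle the filter values with the axis order they were stored in."""
    return {"values": dict(tensor), "axis_names": list(axis_names)}


def load_filter(checkpoint, wanted_axis_names):
    """Return the filter values reordered to `wanted_axis_names`."""
    stored = checkpoint["axis_names"]
    if sorted(stored) != sorted(wanted_axis_names):
        # Design choice of this sketch: fail loudly on mismatching names.
        raise ValueError(f"cannot match axes {stored} to {wanted_axis_names}")
    perm = [stored.index(name) for name in wanted_axis_names]
    return permute_axes(checkpoint["values"], perm)


# A 2x3 filter over (height, width); entry (h, w) has value 10*h + w.
filt = {(h, w): 10 * h + w for h in range(2) for w in range(3)}
ckpt = save_filter(filt, ["height", "width"])

same = load_filter(ckpt, ["height", "width"])
swapped = load_filter(ckpt, ["width", "height"])
```

Loading with the stored order returns the values unchanged; loading with the swapped order returns the transposed filter, so the model code can use either layout.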
Oh, I see... yes, then this is really not easy to choose...
One option is to make the order explicit in the config. So
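As an illustration of what an explicit order buys, here is a small sketch in which each entry of the filter size is tied to a named spatial dim, so the pairing no longer depends on how the input is stored. The option name `spatial_dims` and the helper `resolve_filter_axes` are hypothetical and are not meant as the actual RETURNN option.

```python
def resolve_filter_axes(input_axes, spatial_dims, filter_size):
    """Map each filter_size entry to an input axis by name.

    input_axes: axis names in whatever order the input happens to be stored.
    spatial_dims: user-given names, one per filter_size entry (hypothetical option).
    Returns a list of (name, axis_index_in_input, filter_size_entry).
    """
    if len(spatial_dims) != len(filter_size):
        raise ValueError("need exactly one spatial dim per filter size entry")
    return [
        (name, input_axes.index(name), size)
        for name, size in zip(spatial_dims, filter_size)
    ]


explicit = {"spatial_dims": ["height", "width"], "filter_size": (5, 3)}

# The same input, stored with the two spatial axes in different orders.
a = resolve_filter_axes(["batch", "height", "width", "channel"], **explicit)
b = resolve_filter_axes(["batch", "width", "height", "channel"], **explicit)
```

In both cases `height` gets filter size 5 and `width` gets 3; only the axis index that is looked up internally differs.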
Or, another option (from the comment here) is to have an argument
However, it is not mandatory yet. Maybe this should be done as well, via a new behavior version.
We want to avoid anything which depends on the order of axes.