
Data verify_out_shape not intuitive when there are implicit dims #1153

Open
albertz opened this issue Oct 19, 2022 · 4 comments


albertz commented Oct 19, 2022

(I use returnn-common code here, but nothing is really specific to returnn-common; nn.zeros is just ConstantLayer.)

Consider this code:

```python
x = nn.zeros((dim1, dim2))
x.verify_out_shape({dim1, dim2})
```

This is correct when dim1 and dim2 are static dims.

Once they are dynamic dims which introduce implicit dims, verify_out_shape fails because the given shape does not cover the implicit dims. E.g. when dim1 has a dyn_size_ext of shape [B], then currently you actually need:

```python
x.verify_out_shape({dim1, dim2, batch_dim})
```

Or:

```python
x.verify_out_shape({dim1, dim2, ImplicitDynSizeDim(batch_dim)})
```

Because the batch dim is an implicit dim.
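
For concreteness, a small sketch of the failing vs. passing checks in returnn-common style (how dim1/dim2 are constructed here is only illustrative, and the surrounding network/scope setup is omitted):

```python
from returnn_common import nn

dim2 = nn.FeatureDim("feature", 32)   # static dim
dim1 = nn.SpatialDim("time")          # dynamic dim; its dyn_size_ext has shape [B]

x = nn.zeros((dim1, dim2))

# Fails currently: the implicit batch dim (via dim1.dyn_size_ext) is not covered.
# x.verify_out_shape({dim1, dim2})

# Passes: the implicit batch dim is listed explicitly
# (equivalently via the ImplicitDynSizeDim(batch_dim) marker shown above).
x.verify_out_shape({dim1, dim2, nn.batch_dim})
```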

Why do we have implicit dims? What are those? They were introduced mostly for CumConcatLayer. See #391, #589.

Example of such error: rwth-i6/returnn_common#226

See also the earlier discussion on the introduction of out_shape and verify_out_shape: #706, #757


albertz commented Oct 19, 2022

I'm thinking about whether this automatic introduction of implicit dims was a good idea. This original code really should work, as it would be quite confusing otherwise:

```python
x = nn.zeros((dim1, dim2))
x.verify_out_shape({dim1, dim2})
```

Maybe these implicit dims should actually be explicitly defined. We would still have them, but they would be explicitly defined, e.g. a layer like CumConcatLayer would explicitly add them. How would that work? Is this attached to another dim tag, e.g. some attribute like Dim.implicit_dims? Or attached to the Data?
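
To make the question concrete, here is a rough, standalone sketch of the idea; none of these classes or attributes exist in RETURNN, it only illustrates what a Dim.implicit_dims-style attribute could mean for out_shape verification:

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class DimSketch:
    """Stand-in for a dim tag (hypothetical, not the real Dim class)."""
    name: str
    # Dims which are implied whenever this dim occurs in a tensor,
    # e.g. the batch dim for a dim whose dyn_size_ext has shape [B].
    # A layer like CumConcatLayer would set this explicitly.
    implicit_dims: frozenset = frozenset()


def expand_implicit(dims: Set[DimSketch]) -> Set[DimSketch]:
    """Expand a dim set by all implicit dims attached to its members."""
    out = set(dims)
    for d in dims:
        out |= set(d.implicit_dims)
    return out


batch = DimSketch("B")
time = DimSketch("T", implicit_dims=frozenset({batch}))  # dyn size per batch entry
feat = DimSketch("F")

# verify_out_shape could then compare against the expanded set,
# so {time, feat} would match a tensor whose dims are {batch, time, feat}.
assert expand_implicit({time, feat}) == {batch, time, feat}
```

Attaching this to the Data instead of the dim tag would look similar, just with the implicit set kept per tensor rather than per dim.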


albertz commented Oct 19, 2022

@Zettelkasten what do you think about this? Specifically, here (#706 (comment)) you said:

> Do we want to force the user to specify implicit dims if they specify out_shape? I tend to think yes (if you think about recurrent self attention e.g., it's super important that the keys tensor has an implicit query-time dim. I think, if I were to write this down on paper, I would also include it always.)


albertz commented Oct 19, 2022

We could also allow returnn-common to be a bit more relaxed about this: verify_out_shape on the returnn-common side could ignore implicit dims, or maybe get an option check_implicit_dims which is False by default.
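
As a rough illustration of what such a relaxed check could do (a standalone sketch, not the actual returnn-common implementation; the real option name and signature may differ):

```python
from typing import Collection, Set


def verify_out_shape_sketch(dims: Set[str],
                            implicit_dims: Set[str],
                            out_shape: Collection[str],
                            check_implicit_dims: bool = False) -> None:
    """Raise if out_shape does not match the tensor's dims.

    dims: explicit dims of the tensor.
    implicit_dims: dims only implied via dyn_size_ext (e.g. the batch dim).
    """
    expected = set(dims)
    given = set(out_shape)
    if check_implicit_dims:
        expected |= implicit_dims   # require implicit dims to be listed
    else:
        given -= implicit_dims      # tolerate, but do not require, implicit dims
    if given != expected:
        raise ValueError(f"out_shape {given} does not match dims {expected}")


# With the default check_implicit_dims=False, {dim1, dim2} passes
# even though the batch dim is implicit:
verify_out_shape_sketch({"dim1", "dim2"}, {"batch"}, {"dim1", "dim2"})
```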


albertz commented Oct 19, 2022

> We could also allow returnn-common to be a bit more relaxed about this: verify_out_shape on the returnn-common side could ignore implicit dims, or maybe get an option check_implicit_dims which is False by default.

I did that now, just to fix rwth-i6/returnn_common#226. However, I think the issue here remains, and we should think about it more. Maybe implement my suggestion above on handling implicit dims explicitly?
