
Remove deprecated generated layers #29

Closed
albertz opened this issue Oct 1, 2021 · 10 comments
albertz (Member) commented Oct 1, 2021

This is more a question at this point:

There are a couple of deprecated layers.

  • E.g. SelfAttentionLayer (and thus the generated SelfAttention module), which can be constructed more explicitly via CumConcatLayer etc. We should provide a single self-attention implementation, but maybe build it directly from other atomic layers (CumConcatLayer etc.); see the sketch below.
  • GaussWindowAttention
  • Etc.

(We should specify a full list here.)

So, should we remove them now? Better now than later, once people have started to use them.
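To illustrate the decomposition idea (purely illustrative NumPy, not RETURNN code; all names here are made up): a monolithic self-attention module is just a fixed composition of atomic operations, so exposing the atoms plus one explicit composition can replace the generated layer.

```python
import numpy as np

def softmax(x, axis=-1):
  # Numerically stable softmax over the given axis.
  e = np.exp(x - x.max(axis=axis, keepdims=True))
  return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_qkv):
  # x: [time, dim], w_qkv: [dim, 3 * dim].
  # One explicit composition of atomic ops (matmul, split, softmax)
  # instead of one opaque SelfAttention module.
  q, k, v = np.split(x @ w_qkv, 3, axis=-1)
  att = softmax((q @ k.T) / np.sqrt(q.shape[-1]), axis=-1)  # [time, time]
  return att @ v  # [time, dim]
```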

albertz (Member, Author) commented Oct 1, 2021

@Zettelkasten @Atticus1806 @JackTemaki opinions?

JackTemaki (Contributor) commented:

Yes, no deprecated layer should appear in the first "release" of the automatically generated API.

albertz (Member, Author) commented Oct 1, 2021

Another option I'm thinking about, which might make this less important:

Maybe we should not directly export the generated layer modules to the user at all (i.e. drop the `from ._generated_layers import *` in layers.py). Maybe this should only be done very explicitly, and otherwise be for internal usage only. E.g. see #30. Maybe we should not export the `Reduce` module at all and just provide `reduce` to the user. We would still need the `Reduce` module internally to implement `reduce`. Basically, the `reduce` function would look like:

```python
def reduce(input, **kwargs):
  return _generated_layers.Reduce(**kwargs)(input)
```

(Very similar to how we did many things in PyTorch-to-RETURNN.)

Edit: Ok, we have #30 now, i.e. functional layer modules do not get exported; their corresponding function is exported instead, all automatically (a sketch of this wrapping follows below).

But this does not fully solve the issue here: there might be functional layers which are deprecated, and there are also potentially non-functional deprecated layers.
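A minimal sketch of that automatic wrapping, under the assumption that functional layer modules are stateless; `Module`, `Reduce`, and `make_layer_func` here are hypothetical stand-ins, not the actual returnn_common code:

```python
class Module:
  """Hypothetical base class for generated layer modules."""
  def __init__(self, **options):
    self.options = options

class Reduce(Module):
  """Hypothetical generated module; stands in for RETURNN's ReduceLayer."""
  def __call__(self, source):
    return ("reduce", self.options, source)  # placeholder for actual layer creation

def make_layer_func(module_cls):
  # Turn a functional (stateless) layer module into a plain function,
  # so only the function gets exported to the user, not the module class.
  def func(source, **options):
    return module_cls(**options)(source)
  func.__name__ = module_cls.__name__.lower()
  func.__doc__ = module_cls.__doc__
  return func

reduce = make_layer_func(Reduce)  # what layers.py would export
print(reduce("input_layer", mode="max", axes="T"))
```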

Atticus1806 (Contributor) commented:

I agree with @JackTemaki: we should not have layers which are marked as deprecated in the generated code.

> Maybe we should not export the Reduce module at all, and just provide reduce to the user.

As long as this covers all of the options a user would have with the layer (in this example), that would be fine in my opinion. But I am not sure if it hurts to provide a nice interface via reduce while still making Reduce available (taking Reduce as an example for the general case).

albertz (Member, Author) commented Oct 1, 2021

> I am not sure if it hurts to provide a nice interface via reduce while still making Reduce available

But Reduce would not be available then. This is what I meant by:

> Maybe we should not directly export the generated layer modules to the user at all

Atticus1806 (Contributor) commented:

> Maybe we should not directly export the generated layer modules to the user at all

I see, but I am not sure whether that makes it tedious to keep managing interfaces for all these layers across RETURNN versions, because from what I understand, that is what would need to happen then.

JackTemaki (Contributor) commented:

PyTorch also provides both in certain cases, such as the activations, e.g. nn.ReLU vs relu (vs relu_)
albertz (Member, Author) commented Oct 1, 2021

> I am not sure whether that makes it tedious to keep managing interfaces for all these layers across RETURNN versions, because from what I understand, that is what would need to happen then.

I don't understand. What do you mean? We generate the generated layer modules automatically.

We just need a list of deprecated layers. Even that can maybe be generated automatically by scanning the docstrings for "DEPRECATED" or so; see the sketch below.
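A minimal sketch of such a docstring scan (iterating a generated module's classes; `find_deprecated` is a hypothetical helper, not existing code):

```python
import inspect

def find_deprecated(module):
  # Collect all classes in the given module whose docstring
  # marks them as deprecated.
  deprecated = []
  for name, cls in inspect.getmembers(module, inspect.isclass):
    doc = inspect.getdoc(cls) or ""
    if "deprecated" in doc.lower():
      deprecated.append(name)
  return sorted(deprecated)
```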

albertz (Member, Author) commented Oct 1, 2021

> PyTorch also provides both in certain cases, such as the activations, e.g. nn.ReLU vs relu (vs relu_)

I assume PyTorch mostly has the nn module variants for use with nn.Sequential, or for historical reasons. I don't really see any other reason.

We could make a variant of Sequential which would allow something like list[Module | function]; see the sketch at the end of this comment.

We do not have to copy exactly every single aspect of PyTorch. We always should think whether it makes sense for us.

Ok, for functions with options, e.g. dropout, layer_norm, etc., maybe it looks nicer to write:

```python
Sequential([
  ...,
  Dropout(dropout=0.5),
  ...
])
```

instead of

```python
Sequential([
  ...,
  partial(dropout, dropout=0.5),
  ...
])
```

Although you could also argue that partial is more standard here.
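A minimal sketch of such a Sequential variant that accepts both module instances and plain functions (hypothetical code, not the actual returnn_common Sequential):

```python
from functools import partial

class Sequential:
  """Hypothetical Sequential accepting any callables: module instances
  (callable objects with state) and plain functions alike."""
  def __init__(self, steps):
    self.steps = list(steps)

  def __call__(self, x):
    # Each step only needs to be callable on the running value.
    for step in self.steps:
      x = step(x)
    return x

# Purely illustrative placeholders:
def dropout(x, dropout=0.):
  return x  # identity here; stands in for a real dropout function

class Linear:
  def __init__(self, scale):
    self.scale = scale
  def __call__(self, x):
    return x * self.scale

net = Sequential([Linear(2.0), partial(dropout, dropout=0.5)])
print(net(3.0))  # -> 6.0
```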

albertz (Member, Author) commented Oct 6, 2021

Many of them are removed now. I think we can close this.

albertz closed this as completed Oct 6, 2021