MaskedComputationLayer: beam search bug inside recurrent loop #755

Open
robin-p-schmitt opened this issue Nov 22, 2021 · 5 comments
Labels
good first issue Should be a good starting point to get familiar with RETURNN, hopefully not too hard.

Comments

@robin-p-schmitt
Contributor

When a MaskedComputationLayer is inside a recurrent loop, uses a mask that depends on the previous output, and has an input that does not depend on the output, beam search fails with an error. I think the problem is that in this case the mask carries beam information while the input does not (because it does not depend on the output). I am going to create a PR with a corresponding test case.
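
To illustrate the suspected mismatch, here is a minimal sketch in plain Python. It is not RETURNN code: `tile_for_beam` and `apply_mask` are hypothetical helpers, and the numbers are made up. It only shows that a mask with a batch-times-beam layout cannot be applied to an input without a beam unless the input is first expanded to the same layout.

```python
from typing import List


def tile_for_beam(batch: List[List[float]], beam_size: int) -> List[List[float]]:
    """Repeat every batch entry beam_size times, giving a (batch * beam) layout."""
    return [row for row in batch for _ in range(beam_size)]


def apply_mask(rows: List[List[float]], mask: List[bool]) -> List[List[float]]:
    """Keep the rows whose mask entry is True; both need the same leading size."""
    if len(rows) != len(mask):
        raise ValueError(
            "batch mismatch: %d rows vs %d mask entries" % (len(rows), len(mask)))
    return [row for row, keep in zip(rows, mask) if keep]


beam_size = 3
data = [[0.1, 0.2], [0.3, 0.4]]                  # input without beam: 2 entries
mask = [True, False, True, True, False, False]   # mask with beam: 2 * 3 entries

# Applying the beam-sized mask to the beam-less input fails.
try:
    apply_mask(data, mask)
    mismatch = False
except ValueError:
    mismatch = True

# After expanding the input to the beam layout, the mask can be applied.
selected = apply_mask(tile_for_beam(data, beam_size), mask)
```

In RETURNN itself the failure shows up earlier, during layer construction, so this only mirrors the idea, not the actual code path.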

@albertz
Member

albertz commented Nov 22, 2021

Can you post the error?

@robin-p-schmitt
Contributor Author

The error is:
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')
The full log is in https://gist.github.com/robin-p-schmitt/29e3d77d93824e7020b82158f81f266d.

@albertz
Member

albertz commented Nov 22, 2021

Please always post the relevant snippets directly here as well.
In this case:

layer <network via test_MaskedComputationLayer_beam>/'data' output: Data{'data', [B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}
layer <network via test_MaskedComputationLayer_beam>/'output' output: Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}}
Rec layer 'output' (search True, train False) sub net:
  Input layers moved out of loop: (#: 0)
    None
  Output layers moved out of loop: (#: 0)
    None
  Layers in loop: (#: 5)
    output
    output_prob
    unmask
    mask
    output_is_not_0
  Unused layers: (#: 0)
    None
layer <network via test_MaskedComputationLayer_beam>/output(rec-subnet)/'mask' output: Data{'mask_output', [B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}
debug search choices:
  base: <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>
  network:
    layer: <RecStepInfoLayer output/':i' out_type=Data{[], dtype='int32'}>
    layer: <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>
    layer: <_TemplateLayer(MaskedComputationLayer)(:prev:masked_computation) output/'prev:mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
    layer: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
    layer: <_TemplateLayer(CompareLayer)(:prev:compare) output/'prev:output_is_not_0' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
    layer: <_TemplateLayer(UnmaskLayer)(:prev:unmask) output/'prev:unmask' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
  visit: <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>, search choices None
    sources: 'data' search choices None, 'output/prev:mask' search choices None, 'output/prev:output_is_not_0' search choices None, 'data' search choices None
  visit: <SelectSearchSourcesLayer 'data' <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)> out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>, search choices None
    sources: 'data' search choices None
  visit: <SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>, search choices None
    sources: None
  visit: <SelectSearchSourcesLayer 'prev:mask' <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)> out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}>, search choices None
    sources: 'output/prev:mask' search choices None
  visit: <_TemplateLayer(MaskedComputationLayer)(:prev:masked_computation) output/'prev:mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>, search choices None
    sources: 'data' search choices None, 'output/prev:output_is_not_0' search choices None
  visit: <_TemplateLayer(CompareLayer)(:prev:compare) output/'prev:output_is_not_0' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>, search choices None
    sources: 'output/prev:output' search choices <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
  visit: <_TemplateLayer(CompareLayer)(:template:compare) output/'output_is_not_0' out_type=Data{[B&Beam{'output/output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'mask')>, search choices None
    sources: 'output/output' search choices <SearchChoices owner='output' beam_size=3 beam_scores=None>
  visit: <_TemplateLayer(CompareLayer)(:prev:compare) output/'prev:output_is_not_0' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>, search choices None
    sources: 'output/prev:output' search choices <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
Relevant layers:
[<_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>]
Full dependency map:
{'data': [],
 'output/mask': ['output/prev:output'],
 'output/output_is_not_0': ['output/output'],
 'output/prev:mask': ['output/prev:output'],
 'output/prev:output_is_not_0': ['output/prev:output']}
-> search choices: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
Template network (check out types / shapes):
Template network (check out types / shapes):

ERROR: Got exception during in-loop construction of layer 'mask':
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')

output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
unmask: <_TemplateLayer(UnmaskLayer)(:template:unmask) output/'unmask' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
mask: <_TemplateLayer(MaskedComputationLayer)(:template:masked_computation) output/'mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'unmask')>
output_is_not_0: <_TemplateLayer(CompareLayer)(:template:compare) output/'output_is_not_0' out_type=Data{[B&Beam{'output/output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'mask')>


ERROR: Got exception during in-loop construction of layer 'unmask':
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')

output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
unmask: <_TemplateLayer(UnmaskLayer)(:template:unmask) output/'unmask' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
mask: <_TemplateLayer(MaskedComputationLayer)(:template:masked_computation) output/'mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'unmask')>
output_is_not_0: <_TemplateLayer(CompareLayer)(:template:compare) output/'output_is_not_0' out_type=Data{[B&Beam{'output/output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'mask')>


ERROR: Got exception during in-loop construction of layer 'output_prob':
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')

output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
unmask: <_TemplateLayer(UnmaskLayer)(:template:unmask) output/'unmask' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
mask: <_TemplateLayer(MaskedComputationLayer)(:template:masked_computation) output/'mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'unmask')>
Template network (check out types / shapes):
output_is_not_0: <_TemplateLayer(CompareLayer)(:template:compare) output/'output_is_not_0' out_type=Data{[B&Beam{'output/output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'mask')>
Template network (check out types / shapes):

Exception creating layer <network via test_MaskedComputationLayer_beam>/'output' of class RecLayer with opts:

ERROR: Got exception during in-loop construction of layer 'output':
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')

output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
unmask: <_TemplateLayer(UnmaskLayer)(:template:unmask) output/'unmask' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
mask: <_TemplateLayer(MaskedComputationLayer)(:template:masked_computation) output/'mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'unmask')>
output_is_not_0: <_TemplateLayer(CompareLayer)(:template:compare) output/'output_is_not_0' out_type=Data{[B&Beam{'output/output'}(3)], dtype='bool', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'mask')>

{'_name': 'output',
 '_network': <TFNetwork '<network via test_MaskedComputationLayer_beam>' train=False search>,
 'axis': DimensionTag{'time:var:extern_data:data'[B]},
 'n_out': <class 'returnn.util.basic.NotSpecified'>,
 'name': 'output',
 'network': <TFNetwork '<network via test_MaskedComputationLayer_beam>' train=False search>,
 'output': Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}},
 'sources': [<SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>],
 'unit': <_SubnetworkRecCell '<network via test_MaskedComputationLayer_beam>/output(rec-subnet)'>}
EXCEPTION
...
  File "/mnt/projects/i6/returnn/returnn/tf/layers/base.py", line 504, in <listcomp>
    line: get_layer(src_name)
    locals:
      get_layer = <local> <function _SubnetworkRecCell._construct.<locals>.get_layer at 0x7f90b86b73a0>
      src_name = <local> 'mask'
  File "/mnt/projects/i6/returnn/returnn/tf/layers/rec.py", line 1684, in _SubnetworkRecCell._construct.<locals>.get_layer
    line: assert (layer.output.beam == layer_template.output.beam and
                  layer_choices.beam_size == layer.output.beam.beam_size == layer_template.output.beam.beam_size), (
            "Layer %r has buggy search choices resolution." % layer,
            self.net.debug_search_choices(layer) or "see search choices debug output")
    locals:
      layer = <local> <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>
      layer.output = <local> Data{'mask_output', [B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}
      layer.output.beam = <local> SearchBeam(name='output/prev:output', beam_size=3)
      layer_template = <local> <_TemplateLayer(MaskedComputationLayer)(:template:masked_computation) output/'mask' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'unmask')>
      layer_template.output = <local> Data{'mask_output', [B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}
      layer_template.output.beam = <local> None
      layer_choices = <local> <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
      layer_choices.beam_size = <local> 3
      layer.output.beam.beam_size = <local> 3
      layer_template.output.beam.beam_size = <local> !AttributeError: 'NoneType' object has no attribute 'beam_size'
      self = <local> <_SubnetworkRecCell '<network via test_MaskedComputationLayer_beam>/output(rec-subnet)'>
      self.net = <local> <TFNetwork '<network via test_MaskedComputationLayer_beam>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionTag{F'classes:sparse-dim'(5)}}> train=False search>
      self.net.debug_search_choices = <local> <bound method TFNetwork.debug_search_choices of <TFNetwork '<network via test_MaskedComputationLayer_beam>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse_dim=DimensionT...
AssertionError: ("Layer <MaskedComputationLayer output/'mask' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}> has buggy search choices resolution.", 'see search choices debug output')
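
The failing check can be reproduced in isolation. The sketch below is a simplified stand-in, not RETURNN's implementation: `Beam` and `beams_consistent` are hypothetical names, and the values are copied from the locals in the traceback above. It shows that the comparison is already false at its first condition, because the real layer has a beam while the template layer has `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Beam:
    """Hypothetical stand-in for the beam info attached to a layer output."""
    name: str
    beam_size: int


def beams_consistent(layer_beam: Optional[Beam],
                     template_beam: Optional[Beam],
                     choices_beam_size: int) -> bool:
    """Simplified version of the asserted condition.

    Assumption: search choices exist for the layer, so both the constructed
    layer and its template are expected to carry the same beam.
    """
    if layer_beam is None or template_beam is None:
        return False
    return layer_beam == template_beam and choices_beam_size == layer_beam.beam_size


# Values from the traceback locals: the layer has a beam, the template does not.
layer_beam = Beam(name="output/prev:output", beam_size=3)
template_beam = None
ok = beams_consistent(layer_beam, template_beam, choices_beam_size=3)
print("consistent:", ok)
```

If the template had been constructed with the same beam as the real layer, the same check would pass.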

@albertz
Member

albertz commented Nov 22, 2021

Please also post the relevant config parts that produce the error. (The issue should contain all relevant information.)
In this case, the net dict:

dim = 5  # feature and class dimension, matching F'feature:data'(5) and 'classes:sparse-dim'(5) in the log above

network = {
  "output": {"class": "rec", "from": "data", "unit": {
    "mask": {
      "class": "masked_computation", "mask": "prev:output_is_not_0", "from": "base:data",
      "unit": {"class": "copy", "from": "data"}},
    "unmask": {"class": "unmask", "from": "mask", "mask": "prev:output_is_not_0"},
    "output_prob": {"class": "linear", "from": "unmask", "activation": "softmax", "n_out": dim},
    "output": {
      "class": "choice", "from": "output_prob", "beam_size": 3, "input_type": "prob", "target": "classes",
      "initial_output": 0},
    "output_is_not_0": {
      "class": "compare", "from": "output", "value": 0, "kind": "not_equal",
      "initial_output": True},
  }}
}
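
To see from the net dict alone which inputs of the "mask" layer depend on the choice layer, one can walk the "from" and "mask" entries. The following is a rough sketch in plain Python, independent of RETURNN: `direct_deps` and `depends_on_choice` are hypothetical helpers, the "prev:" and "base:" prefixes are handled in a simplified way, and `dim = 5` is taken from the dimensions in the log.

```python
dim = 5  # assumption: taken from the feature/class dimension shown in the log

subnet = {
    "mask": {
        "class": "masked_computation", "mask": "prev:output_is_not_0",
        "from": "base:data", "unit": {"class": "copy", "from": "data"}},
    "unmask": {"class": "unmask", "from": "mask", "mask": "prev:output_is_not_0"},
    "output_prob": {"class": "linear", "from": "unmask",
                    "activation": "softmax", "n_out": dim},
    "output": {"class": "choice", "from": "output_prob", "beam_size": 3,
               "input_type": "prob", "target": "classes", "initial_output": 0},
    "output_is_not_0": {"class": "compare", "from": "output", "value": 0,
                        "kind": "not_equal", "initial_output": True},
}


def direct_deps(layer):
    """Collect the layer names referenced via "from" and "mask"."""
    deps = []
    for key in ("from", "mask"):
        value = layer.get(key)
        if value is not None:
            deps.extend([value] if isinstance(value, str) else value)
    return deps


def depends_on_choice(name, net, seen=None):
    """True if `name` reaches a choice layer, also following "prev:" references."""
    seen = set() if seen is None else seen
    if name.startswith("prev:"):
        name = name[len("prev:"):]
    # "base:" layers live outside the loop and never carry a beam in this sketch.
    if name.startswith("base:") or name not in net or name in seen:
        return False
    seen.add(name)
    layer = net[name]
    if layer["class"] == "choice":
        return True
    return any(depends_on_choice(dep, net, seen) for dep in direct_deps(layer))


report = {dep: depends_on_choice(dep, subnet) for dep in direct_deps(subnet["mask"])}
print(report)
```

For this net dict, the mask input reaches the choice layer while the "base:data" input does not, which is the combination described at the top of this issue.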

@robin-p-schmitt
Contributor Author

Okay, I'll do that next time, thanks

@albertz albertz added the good first issue Should be a good starting point to get familiar with RETURNN, hopefully not too hard. label Oct 10, 2022