This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

how to share auxiliary states of batchnorm in rnn with variable length input #6115

Closed
R1ncy opened this issue May 5, 2017 · 2 comments

Comments

@R1ncy

R1ncy commented May 5, 2017

Hi, I want to add a BatchNorm layer after the i2h projection in a GRU. Here is the code:

    if dropout > 0.:
        indata = mx.sym.Dropout(data=indata, p=dropout)
    i2h = mx.sym.FullyConnected(data=indata,
                                weight=param.gates_i2h_weight,
                                bias=param.gates_i2h_bias,
                                num_hidden=num_hidden * 2,
                                name="t%d_l%d_gates_i2h" % (seqidx, layeridx))

    if is_batchnorm:
        i2h = mx.sym.BatchNorm(data=i2h, fix_gamma=False,
                               name="l%d_i2h_bn" % layeridx,
                               gamma=param.i2h_gamma,
                               beta=param.i2h_beta)

This way, the gamma and beta of the BatchNorm are shared across all time steps.
But the auxiliary states (moving_mean and moving_var) can't be shared like this, and I get this error:

  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/module/bucketing_module.py", line 200, in init_params
    force_init=force_init)
  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/module/module.py", line 283, in init_params
    self._exec_group.set_params(self._arg_params, self._aux_params)
  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/module/executor_group.py", line 332, in set_params
    exec_.copy_params_from(arg_params, aux_params)
  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/executor.py", line 281, in copy_params_from
    if name in self.aux_dict:
  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/executor.py", line 228, in aux_dict
    self._symbol.list_auxiliary_states(), self.aux_arrays)
  File "/usr/local/lib/python3.5/dist-packages/mxnet-0.9.5-py3.5.egg/mxnet/executor.py", line 69, in _get_dict
    raise ValueError('Duplicate names detected, %s' % str(names))
ValueError: Duplicate names detected, ['l0_i2h_bn_moving_mean', 'l0_i2h_bn_moving_var', 'l1_i2h_bn_moving_mean', 'l1_i2h_bn_moving_var', 'l0_i2h_bn_moving_mean', 'l0_i2h_bn_moving_var', 'l1_i2h_bn_moving_mean', ...]

It seems the aux_params are unrolled up to the max input length, so the same names appear once per time step, but duplicate names are not allowed.
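To make the collision concrete, here is a minimal sketch in plain Python. It is not MXNet code: `aux_names_for_unrolled_bn` and `get_dict` are hypothetical stand-ins that only mimic the naming pattern from the snippet above and the duplicate check reported in the traceback. It assumes each unrolled BatchNorm declares a `_moving_mean` and a `_moving_var` state under its own name, as the error message suggests.

```python
def aux_names_for_unrolled_bn(num_layers, seq_len, per_step_names=False):
    """List the aux-state names an unrolled network would declare (illustrative only)."""
    names = []
    for seqidx in range(seq_len):
        for layeridx in range(num_layers):
            if per_step_names:
                # Hypothetical alternative: include the time step in the name.
                bn_name = "t%d_l%d_i2h_bn" % (seqidx, layeridx)
            else:
                # Naming used in the snippet above: the same name at every time step.
                bn_name = "l%d_i2h_bn" % layeridx
            names.append(bn_name + "_moving_mean")
            names.append(bn_name + "_moving_var")
    return names


def get_dict(names, values):
    """Stand-in for the check in the traceback: reject repeated names."""
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError('Duplicate names detected, %s' % str(names))
        seen.add(name)
    return dict(zip(names, values))


if __name__ == "__main__":
    shared = aux_names_for_unrolled_bn(num_layers=2, seq_len=3)
    try:
        get_dict(shared, range(len(shared)))
    except ValueError as err:
        print("shared names rejected:", err)

    unique = aux_names_for_unrolled_bn(num_layers=2, seq_len=3, per_step_names=True)
    print("per-step names accepted:", len(get_dict(unique, range(len(unique)))))
```

Per-step names avoid the duplicate check, but the statistics are then no longer shared across time steps, which is the opposite of what is wanted here.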

I also tried the following code from the speech_recognition example, but it does not seem to handle the varying number of params that results from variable-length input in my case.

    if is_batchnorm:
        batchnorm_gamma = []
        batchnorm_beta = []
        for seqidx in range(seq_len):
            batchnorm_gamma.append(mx.sym.Variable(prefix + "t%d_i2h_gamma" % seqidx))
            batchnorm_beta.append(mx.sym.Variable(prefix + "t%d_i2h_beta" % seqidx))

and it fails with an aux_params index-out-of-range error.
I found several related issues, such as #3076 and #2663, but they do not seem to have been solved.
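As a rough illustration of why per-time-step names fit poorly with variable-length input, the sketch below (plain Python, no MXNet) lists the names the loop above would create for two sequence lengths. The bucket lengths 10 and 20, the prefix "l0_", and the helper `per_step_bn_params` are assumptions for illustration only. It shows only that the parameter set depends on seq_len; whether this is the actual cause of the index error is not confirmed.

```python
def per_step_bn_params(prefix, seq_len):
    """Names created by the per-time-step loop above for one sequence length."""
    names = []
    for seqidx in range(seq_len):
        names.append(prefix + "t%d_i2h_gamma" % seqidx)
        names.append(prefix + "t%d_i2h_beta" % seqidx)
    return names


# Hypothetical bucket lengths; real values depend on the data.
buckets = [10, 20]
params_by_bucket = {b: per_step_bn_params("l0_", b) for b in buckets}

# Names that exist for the longer bucket but not for the shorter one.
only_in_longer = set(params_by_bucket[20]) - set(params_by_bucket[10])

if __name__ == "__main__":
    for b in buckets:
        print("bucket %d declares %d BatchNorm params" % (b, len(params_by_bucket[b])))
    print("%d names are missing from the shorter bucket" % len(only_in_longer))
```

Each bucket ends up with a different number of BatchNorm parameters, so symbols built for different lengths do not agree on one fixed parameter list.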

Thanks for your time and effort.

@ashwin-cognitiv

@R1ncy Did you ever figure it out? Since I'm running the network manually and loading .params (arg_params) along with enabling batch norm, I'm running into the same error.

@szha
Member

szha commented Sep 30, 2017

This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!

@szha szha closed this as completed Sep 30, 2017