
GRU not working with layout 'NTC' #7382

Closed
peschn opened this issue Aug 8, 2017 · 2 comments


peschn commented Aug 8, 2017

For bugs or installation issues, please provide the following information.
The more information you provide, the more likely people will be able to help you.

Environment info

Operating System: Ubuntu 16.04.2 LTS

Package used (Python/R/Scala/Julia): Python

MXNet version: mxnet-cu80==0.10.1b20170803

If you are using python package, please provide

Python version and distribution: Python 2.7.13, used with Anaconda 4.2.23

Error Message:

Please paste the full error message, including stack trace.

Traceback (most recent call last):
  File "simply/mx/test_rnn.py", line 14, in <module>
    output, hn = layer(input, h0)
  File "/home/huxley/dev/miniconda2/envs/mxnet/lib/python2.7/site-packages/mxnet/gluon/block.py", line 251, in __call__
    return self.forward(*args)
  File "/home/huxley/dev/miniconda2/envs/mxnet/lib/python2.7/site-packages/mxnet/gluon/rnn/rnn_layer.py", line 162, in forward
    str(info['shape']), str(state.shape)))
ValueError: Invalid recurrent state shape. Expecting (2, 2L, 100), got (2L, 8L, 100L).

Minimum reproducible example

If you are using your own code, please provide a short script that reproduces the error.

Steps to reproduce

Or, if you are running standard examples, please provide the commands that led to the error.

import mxnet as mx

batch_size = 8
timesteps = 7
nhidden = 100
layers = 2
nin = 10

# Two-layer GRU configured for batch-major (N, T, C) input
layer = mx.gluon.rnn.GRU(nhidden, layers, layout='NTC')
layer.initialize()

# input: (batch, time, channels); h0: (layers, batch, hidden)
input = mx.nd.random_uniform(shape=(batch_size, timesteps, nin))
h0 = mx.nd.random_uniform(shape=(layers, batch_size, nhidden))
output, hn = layer(input, h0)

What have you tried to solve it?

The only workaround I have found is changing the layout to 'TNC' (and transposing the data accordingly).
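
For reference, a minimal sketch of that workaround (same sizes as above; the variable name data is just illustrative, and the input is transposed from (N, T, C) to (T, N, C) to match the time-major layout):

# Same GRU, but with the time-major 'TNC' layout
layer = mx.gluon.rnn.GRU(nhidden, layers, layout='TNC')
layer.initialize()

# Transpose the batch-major data from (N, T, C) to (T, N, C)
data = mx.nd.transpose(
    mx.nd.random_uniform(shape=(batch_size, timesteps, nin)),
    axes=(1, 0, 2))
h0 = mx.nd.random_uniform(shape=(layers, batch_size, nhidden))  # state shape is unchanged
output, hn = layer(data, h0)  # output comes back as (T, N, C)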


szha commented Aug 8, 2017

Looks like a bug in handling the unfused states. I'm looking into the fix.

In [2]: layer.state_info()
Out[2]: [{'__layout__': 'LNC', 'shape': (2, 0, 100)}]

In [4]: layer._unfuse().state_info()
Out[4]:
[{'__layout__': 'NC', 'shape': (0, 100)},
 {'__layout__': 'NC', 'shape': (0, 100)}]
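
The 0 in those shapes is the unset batch dimension, which gets filled in at call time. As a side note, a minimal sketch (assuming begin_state on the layer behaves as implemented in rnn_layer.py) that lets the layer build correctly shaped initial states itself, shown here with the working 'TNC' layout:

import mxnet as mx

layer = mx.gluon.rnn.GRU(100, 2, layout='TNC')
layer.initialize()

# Let the layer construct zero-valued initial states of the right shape
h0 = layer.begin_state(batch_size=8, func=mx.nd.zeros)
data = mx.nd.random_uniform(shape=(7, 8, 10))  # (T, N, C)
output, hn = layer(data, h0)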


szha commented Aug 8, 2017

Should already be fixed by #7385

szha closed this as completed Sep 17, 2017