Inefficient usage of resources #12
Comments
Thanks @zeryx, I'll also investigate.
It might have something to do with looping over the layers in a native Python loop.
I did a little digging as well; it looks like the hidden memory tensor is never actually updated. I made a local change that forces the DRNN layer to make use of the allocated hidden memory tensor. However, because the memory tensors are kept in a native Python list (each memory tensor has a different dimensionality, so you can't stack them conventionally), training requires the retain_graph argument to be set to True in the backward() call.
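For anyone following along, here is a minimal sketch (hypothetical sizes and plain GRUCell stand-ins, not this repository's code) of why carrying per-layer hidden states in a plain Python list across training steps ends up requiring retain_graph=True:

```python
import torch
import torch.nn as nn

# Sketch only: each "dilated" layer keeps a hidden state with a different
# first dimension, so the states live in a Python list, not one stacked tensor.
cells = nn.ModuleList([nn.GRUCell(16, 16) for _ in range(3)])
hidden = [torch.zeros(2 ** i, 16) for i in range(3)]  # 1, 2, 4 slots per layer

for step in range(2):
    x = torch.randn(1, 16)
    outs = []
    for i, cell in enumerate(cells):
        # reuse the pre-allocated hidden slot for this layer instead of rebuilding it
        hidden[i] = cell(x.expand(hidden[i].size(0), -1), hidden[i])
        outs.append(hidden[i].mean())
    loss = torch.stack(outs).sum()
    # the carried-over hidden states still reference the previous step's graph,
    # so backward() raises "Trying to backward through the graph a second time"
    # unless retain_graph=True is passed (or the states are detached each step)
    loss.backward(retain_graph=True)
```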
Ah cool @zeryx, please push your stuff, then we can try to figure it out and fix it!
^ that PR would expect the hidden tensor to be
I haven't been able to figure out a way to convert that into a torch.Variable, as each layer's hidden tensor has increasing dimensions.
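To make the shape problem concrete (the sizes below are invented): torch.stack needs identically sized tensors, so a list of per-layer hidden states whose first dimension grows can't be merged into a single tensor/Variable directly; zero-padding the ragged dimension is one possible workaround:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Hypothetical per-layer hidden states with growing first dimensions (1, 2, 4 rows)
hiddens = [torch.randn(2 ** i, 16) for i in range(3)]

# torch.stack requires equal shapes, so this raises a RuntimeError:
try:
    torch.stack(hiddens)
except RuntimeError as e:
    print("stack failed:", e)

# One workaround: zero-pad the ragged dimension so everything fits in one tensor
padded = pad_sequence(hiddens, batch_first=True)  # shape (3, 4, 16)
print(padded.shape)
```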
I was exploring this project for a general-purpose forecasting model I was working on, and I realized that with multiple layers (say 5) this model's performance is actually worse in both the forward and backward passes than a standard PyTorch GRU module with 5 layers.
https://gist.github.com/zeryx/c43fc53b4d3f71c4942dff44912aa3cb
From my understanding of the paper, the DRNN module should be at least as performant as the equivalent GRU module, if not dramatically superior, in both forward and backward pass compute time.
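For reference, a rough timing harness in the spirit of the linked gist might look like the sketch below (sizes are arbitrary, and the DRNN line is left commented out because its constructor lives in this repo):

```python
import time
import torch
import torch.nn as nn

# Rough timing sketch: measure an average forward + backward pass for a module
# that maps (seq_len, batch, features) to an output sequence.
def time_fwd_bwd(module, x, n_iters=20):
    out = module(x)                                   # warm-up pass
    out = out[0] if isinstance(out, tuple) else out
    out.sum().backward()
    start = time.time()
    for _ in range(n_iters):
        module.zero_grad()
        out = module(x)
        out = out[0] if isinstance(out, tuple) else out
        out.sum().backward()
    return (time.time() - start) / n_iters

x = torch.randn(128, 32, 64)            # (seq_len, batch, features)
gru = nn.GRU(64, 64, num_layers=5)      # the 5-layer baseline mentioned above
print("GRU fwd+bwd: %.4fs" % time_fwd_bwd(gru, x))

# drnn = DRNN(...)  # constructor arguments depend on this repo; see the gist for the real comparison
# print("DRNN fwd+bwd: %.4fs" % time_fwd_bwd(drnn, x))
```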