
About LSTM Layer in GroundHog #23

Closed
gaoyuankidult opened this issue Nov 11, 2014 · 6 comments

Comments

@gaoyuankidult

Hello

I have checked the LSTM layer in GroundHog/groundhog/layers/rec_layers.py. I wonder whether this is a complete, standard LSTM layer (e.g., as described in Alex Graves's paper, http://arxiv.org/pdf/1308.0850v5.pdf) or a prototype for now.

I didn't see a bias term in the `# input/output gate update` section. Did I miss it? By the way, do you have an example that uses this layer?

Thanks.

By the way, I could write a wiki (tutorial) about it if there is an example.

Formulas described in the paper:

[image: LSTM equations from Graves's paper]

Code of the LSTM layer:

[image: LSTM implementation from rec_layers.py]
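For reference, the Graves-style LSTM update with bias terms on every gate can be sketched as follows. This is a minimal NumPy sketch, not GroundHog's actual code: the parameter names are my own, and the peephole connections from Graves's paper are omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One Graves-style LSTM step. Note the bias terms b_i, b_f, b_c, b_o:
    these are the terms the question is about (peepholes omitted here)."""
    i = sigmoid(x @ p["W_xi"] + h_prev @ p["W_hi"] + p["b_i"])  # input gate
    f = sigmoid(x @ p["W_xf"] + h_prev @ p["W_hf"] + p["b_f"])  # forget gate
    c = f * c_prev + i * np.tanh(x @ p["W_xc"] + h_prev @ p["W_hc"] + p["b_c"])
    o = sigmoid(x @ p["W_xo"] + h_prev @ p["W_ho"] + p["b_o"])  # output gate
    h = o * np.tanh(c)                                           # new hidden state
    return h, c
```

Dropping the `b_*` vectors from the gate pre-activations gives the bias-free variant the GroundHog code appears to implement.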

@rizar

rizar commented Nov 13, 2014

@kyunghyuncho , you wrote the LSTM layer.

@kyunghyuncho

Sorry about the late reply!

As @gaoyuankidult correctly noticed, the implementation in GroundHog lacks the bias terms for the gaters, which I believe wouldn't make much difference. Also, note that there are a number of variants of LSTMs (see, e.g., http://arxiv.org/pdf/1409.2329.pdf).

You can train a neural machine translation model using an LSTM by setting:

    state['enc_rec_layer'] = 'LSTMLayer'                                                          
    state['enc_rec_gating'] = False
    state['enc_rec_reseting'] = False
    state['dec_rec_layer'] = 'LSTMLayer'                                                          
    state['dec_rec_gating'] = False
    state['dec_rec_reseting'] = False                                                             
    state['dim_mult'] = 4

That said, this feature was only tested internally, quite some time ago. If you run into any issues with it, please leave a comment here with the details and I'll take a look.
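For concreteness, GroundHog experiments configure models through a plain Python dict of state variables, so overrides like these are typically layered on top of a prototype state. A minimal sketch (the helper name and the base dict are illustrative, not GroundHog's actual code):

```python
def lstm_state(base_state):
    """Return a copy of a GroundHog-style state dict with the encoder and
    decoder recurrent layers switched to LSTMLayer. The LSTM computes four
    gate pre-activations per step, hence dim_mult = 4."""
    state = dict(base_state)  # don't mutate the shared prototype
    state['enc_rec_layer'] = 'LSTMLayer'
    state['enc_rec_gating'] = False
    state['enc_rec_reseting'] = False
    state['dec_rec_layer'] = 'LSTMLayer'
    state['dec_rec_gating'] = False
    state['dec_rec_reseting'] = False
    state['dim_mult'] = 4
    return state
```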

@infinitezxc
Copy link

Hello @kyunghyuncho

When I try to train a model using LSTM, I run into this error:

ValueError: dimension mismatch in args to gemm (512,2000)x(1000,1000)->(512,1000)
Apply node that caused the error: GpuDot22(GpuReshape{2}.0, W_0_dec_repr_readout)
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(512, 2000), (1000, 1000)]
Inputs strides: [(2000, 1), (1000, 1)]
Inputs values: ['not shown', 'not shown']

Do you have any suggestions? Thanks!
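The error itself is a plain shape check in a matrix multiply: the readout weight `W_0_dec_repr_readout` is sized (1000, 1000) but the incoming activations are (512, 2000), i.e. twice the expected width. One possible cause (speculation on my part, not confirmed in this thread) is that the LSTM's state is wider than the readout was sized for, e.g. because of `dim_mult` or concatenated hidden/cell states. The shape mismatch can be reproduced in plain NumPy:

```python
import numpy as np

batch, width, dim = 512, 2000, 1000
activations = np.zeros((batch, width), dtype=np.float32)  # decoder output
W_readout = np.zeros((dim, dim), dtype=np.float32)        # readout weight

try:
    activations @ W_readout  # (512,2000) x (1000,1000): same mismatch gemm reports
except ValueError as e:
    print("shape mismatch:", e)

# The product only goes through once the weight's input dimension
# matches the activation width:
W_fixed = np.zeros((width, dim), dtype=np.float32)
out = activations @ W_fixed
print(out.shape)  # (512, 1000)
```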

@kyunghyuncho

Can you provide your state variables?


@infinitezxc

Thanks for the quick reply :) @kyunghyuncho

I am using the default state from prototype_phrase_lstm_state(), with the source and target paths modified and a prototype_phrase_lstm_state entry added at the end of __init__.py.

@guxd

guxd commented Jan 13, 2016

@infinitezxc Have you solved the LSTM error? I ran into the same error.
