
questions about initializing the lstm hidden states #25

Closed
brisker opened this issue Nov 15, 2017 · 16 comments

brisker commented Nov 15, 2017

Here: https://github.com/ruotianluo/neuraltalk2.pytorch/blob/master/models/OldModel.py#L49
You seem to initialize the hidden states directly from fc_feats with a linear layer. If I want to implement an attention model where the LSTM takes fc_feats as input at step 0 and the start token as input at step 1, like the figure below, how should I initialize the LSTM hidden states?
[figure: LSTM takes fc_feats as input at step 0 and the start token as input at step 1]

ruotianluo (Owner) commented

Isn't the show-attend-tell model in the same file what you want?


brisker commented Nov 15, 2017

@ruotianluo
I know this is the show-attend-tell model, but this implementation seems to use fc_feats only to initialize the hidden states, not as the input at step 0. At step 0 you seem to output a word directly, whereas I just want step 0 to output a start token, leaving attention aside for now.

ruotianluo (Owner) commented

Initialize with zeros.
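
For reference, a minimal sketch of that setup, assuming placeholder module names (`img_embed`, `embed`) and sizes that are not the repo's actual attributes: the hidden state starts at zero, the projected image feature is the step-0 input, and the embedded start token is the step-1 input.

```python
import torch
import torch.nn as nn

batch_size, feat_dim, rnn_size, vocab_size = 16, 2048, 512, 9000
bos_idx = 1  # assume index 1 is the start token, as in the discussion below

img_embed = nn.Linear(feat_dim, rnn_size)   # project fc_feats to the LSTM input size
embed = nn.Embedding(vocab_size, rnn_size)  # word embedding
lstm = nn.LSTMCell(rnn_size, rnn_size)

fc_feats = torch.randn(batch_size, feat_dim)
bos = torch.full((batch_size,), bos_idx, dtype=torch.long)

# step 0: zero-initialized hidden state, projected image feature as the input
h = fc_feats.new_zeros(batch_size, rnn_size)
c = fc_feats.new_zeros(batch_size, rnn_size)
h, c = lstm(img_embed(fc_feats), (h, c))

# step 1: embedded start token as the input; later steps feed the previous word
h, c = lstm(embed(bos), (h, c))
```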


brisker commented Nov 15, 2017

@ruotianluo
Here: https://github.com/ruotianluo/neuraltalk2.pytorch/blob/master/train.py#L113
Why labels[:, 1:] and masks[:, 1:], and not labels[:, :] and masks[:, :]?

ruotianluo (Owner) commented

The zeroth column is the bos token.
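
In other words, column 0 of `labels` holds the start token, which is fed to the model as an input but never used as a prediction target, so the targets and masks are shifted by one column. A toy example of the alignment, assuming (as later in this thread) that 1 is the start token and 0 is the end/pad token:

```python
# one caption row: [bos, w1, w2, eos, pad]
labels = [1, 12, 20, 0, 0]

inputs  = labels[:-1]   # [1, 12, 20, 0]  -> fed into the decoder step by step
targets = labels[1:]    # [12, 20, 0, 0]  -> what the loss is computed against
```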


brisker commented Nov 15, 2017

@ruotianluo
What is bos?

ruotianluo (Owner) commented

The start token (bos = beginning of sequence).


brisker commented Nov 15, 2017

@ruotianluo
If so, what is the function of masks?


brisker commented Nov 15, 2017

@ruotianluo
If the target is [1, 12, 20, 0, 0] (1 is the start token and 0 is the end token), should the mask be [1, 1, 1, 0, 0]?

ruotianluo (Owner) commented

The mask would be [1, 1, 1, 1, 0].
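
The mask stays 1 through the first end token because the model still has to be trained to predict where the sequence ends; only the padding after it is masked out. A small sketch of how such a mask could be built (an illustration, not the repo's exact dataloader code):

```python
import numpy as np

target = np.array([1, 12, 20, 0, 0])   # 1 = start token, 0 = end/pad token

# keep every real word plus one extra position for the end token itself
length = int((target != 0).sum()) + 1
mask = np.zeros(len(target), dtype=np.float32)
mask[:length] = 1.0
print(mask)   # [1. 1. 1. 1. 0.]
```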


brisker commented Nov 16, 2017

@ruotianluo
Still on the same example: the target is [1, 12, 20, 0, 0] (1 is the start token and 0 is the end token). If the output of the RNN is [1, 12, 21, 0, 0] and the mask is [1, 1, 1, 1, 0], should the loss be written as
crit(RNNDecoder_output[:, 1:], labels[:, 1:], masks[:, 1:])
or
crit(RNNDecoder_output[:, :], labels[:, 1:], masks[:, 1:])?

ruotianluo (Owner) commented

The outputs are softmax probabilities (one distribution over the vocabulary per step), not word indices.
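
So the criterion does not compare word sequences; it gathers the log-probability of each target word at each step and zeroes out the padded steps with the mask. A hedged sketch of such a masked criterion (written from the discussion here, not copied from the repo):

```python
import torch
import torch.nn as nn

class MaskedLMCriterion(nn.Module):
    """Masked cross-entropy over a padded batch of captions."""
    def forward(self, logprobs, target, mask):
        # logprobs: (batch, seq_len, vocab) log-softmax outputs of the decoder
        # target, mask: (batch, seq_len), already shifted past the bos column
        target = target[:, :logprobs.size(1)]
        mask = mask[:, :logprobs.size(1)].float()
        loss = -logprobs.gather(2, target.unsqueeze(2)).squeeze(2) * mask
        return loss.sum() / mask.sum()
```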


brisker commented Nov 16, 2017

@ruotianluo
How can I add weights to different words' losses? For example, in the target [1, 12, 20, 0, 0], suppose I want to give word 12 a weight of 5 and word 20 a weight of 1. (Assume the target has a fixed length.)

ruotianluo (Owner) commented

You have to modify the crit. It's nontrivial.


brisker commented Nov 18, 2017

@ruotianluo
Is changing the mask from [1, 1, 1, 1, 0] to [1, 5, 1, 1, 0] not correct? Why not?

ruotianluo (Owner) commented

I see what you mean. I misunderstood. Yeah, it should work.
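
Concretely, because the criterion multiplies each step's loss by the corresponding mask entry before averaging, replacing a 1 with a larger value scales that word's contribution. A hedged sketch:

```python
import torch

target = torch.tensor([[1, 12, 20, 0, 0]])
mask = torch.tensor([[1., 1., 1., 1., 0.]])

# up-weight the loss on word 12 by a factor of 5
weights = mask.clone()
weights[target == 12] = 5.0    # -> [[1., 5., 1., 1., 0.]]
```

Whether to keep normalizing by the original mask.sum() or by the weighted sum is a separate design choice.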

brisker closed this as completed Nov 18, 2017