
questions about initializing the lstm hidden states #25

Closed
brisker opened this issue Nov 15, 2017 · 16 comments

brisker commented Nov 15, 2017

Here: https://github.com/ruotianluo/neuraltalk2.pytorch/blob/master/models/OldModel.py#L49
You seem to initialize the hidden states directly from fc_feats with a linear layer. If I want to implement an attention model where the LSTM takes fc_feats as input at step 0 and the start token as input at step 1, like the figure below, how should I initialize the LSTM hidden states?
[figure: LSTM takes fc_feats as input at step 0 and the start token as input at step 1]

ruotianluo (Owner) commented

Isn't the show-attend-tell model in the same file what you want?


brisker commented Nov 15, 2017

@ruotianluo
I know this is the show-attend-tell model, but this implementation seems to use fc_feats only to initialize the hidden states, not as the input at step 0. At step 0 you seem to output a word directly, whereas I just want step 0 to output a start token, leaving attention aside for now.

ruotianluo (Owner) commented

Initialize with zeros.
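
For reference, a minimal sketch of that setup, assuming placeholder module names (`img_embed`, `embed`) and sizes that are not the repo's actual attributes: the hidden state starts at zero, the projected image feature is the step-0 input, and the embedded start token is the step-1 input.

```python
import torch
import torch.nn as nn

batch_size, feat_dim, rnn_size, vocab_size = 16, 2048, 512, 9000
bos_idx = 1  # assume index 1 is the start token, as in the discussion below

img_embed = nn.Linear(feat_dim, rnn_size)   # project fc_feats to the LSTM input size
embed = nn.Embedding(vocab_size, rnn_size)  # word embedding
lstm = nn.LSTMCell(rnn_size, rnn_size)

fc_feats = torch.randn(batch_size, feat_dim)
bos = torch.full((batch_size,), bos_idx, dtype=torch.long)

# step 0: zero-initialized hidden state, projected image feature as the input
h = fc_feats.new_zeros(batch_size, rnn_size)
c = fc_feats.new_zeros(batch_size, rnn_size)
h, c = lstm(img_embed(fc_feats), (h, c))

# step 1: embedded start token as the input; later steps feed the previous word
h, c = lstm(embed(bos), (h, c))
```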


brisker commented Nov 15, 2017

@ruotianluo
Here: https://github.com/ruotianluo/neuraltalk2.pytorch/blob/master/train.py#L113
Why labels[:, 1:] and masks[:, 1:], and not labels[:, :] and masks[:, :]?

ruotianluo (Owner) commented

The zeroth column is the bos token.
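
In other words, column 0 of `labels` holds the start token, which is fed to the model as an input but never used as a prediction target, so the targets and masks are shifted by one column. A toy example of the alignment, assuming (as later in this thread) that 1 is the start token and 0 is the end/pad token:

```python
# one caption row: [bos, w1, w2, eos, pad]
labels = [1, 12, 20, 0, 0]

inputs  = labels[:-1]   # [1, 12, 20, 0]  -> fed into the decoder step by step
targets = labels[1:]    # [12, 20, 0, 0]  -> what the loss is computed against
```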


brisker commented Nov 15, 2017

@ruotianluo
What is bos?

ruotianluo (Owner) commented

The start token (bos = beginning of sequence).


brisker commented Nov 15, 2017

@ruotianluo
If so, what is the function of masks?


brisker commented Nov 15, 2017

@ruotianluo
If the target is [1, 12, 20, 0, 0] (1 is the start token and 0 is the end token), should the mask be [1, 1, 1, 0, 0]?

ruotianluo (Owner) commented

The mask would be [1, 1, 1, 1, 0].
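
The mask stays 1 through the first end token because the model still has to be trained to predict where the sequence ends; only the padding after it is masked out. A small sketch of how such a mask could be built (an illustration, not the repo's exact dataloader code):

```python
import numpy as np

target = np.array([1, 12, 20, 0, 0])   # 1 = start token, 0 = end/pad token

# keep every real word plus one extra position for the end token itself
length = int((target != 0).sum()) + 1
mask = np.zeros(len(target), dtype=np.float32)
mask[:length] = 1.0
print(mask)   # [1. 1. 1. 1. 0.]
```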


brisker commented Nov 16, 2017

@ruotianluo
Still on the same example: the target is [1, 12, 20, 0, 0] (1 is the start token and 0 is the end token). If the output of the RNN is [1, 12, 21, 0, 0] and the mask is [1, 1, 1, 1, 0], should the loss be written as
crit(RNNDecoder_output[:, 1:], labels[:, 1:], masks[:, 1:])
or
crit(RNNDecoder_output[:, :], labels[:, 1:], masks[:, 1:])?

ruotianluo (Owner) commented

The outputs are softmax probabilities (one distribution over the vocabulary per step), not word indices.
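
So the criterion does not compare word sequences; it gathers the log-probability of each target word at each step and zeroes out the padded steps with the mask. A hedged sketch of such a masked criterion (written from the discussion here, not copied from the repo):

```python
import torch
import torch.nn as nn

class MaskedLMCriterion(nn.Module):
    """Masked cross-entropy over a padded batch of captions."""
    def forward(self, logprobs, target, mask):
        # logprobs: (batch, seq_len, vocab) log-softmax outputs of the decoder
        # target, mask: (batch, seq_len), already shifted past the bos column
        target = target[:, :logprobs.size(1)]
        mask = mask[:, :logprobs.size(1)].float()
        loss = -logprobs.gather(2, target.unsqueeze(2)).squeeze(2) * mask
        return loss.sum() / mask.sum()
```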


brisker commented Nov 16, 2017

@ruotianluo
How can I add weights to different words' losses? For example, in the target [1, 12, 20, 0, 0], suppose I want to give word 12 a weight of 5 and word 20 a weight of 1. (Assume the target has a fixed length.)

ruotianluo (Owner) commented

You have to modify the crit. It's nontrivial.


brisker commented Nov 18, 2017

@ruotianluo
Is changing the mask from [1, 1, 1, 1, 0] to [1, 5, 1, 1, 0] not correct? Why not?

ruotianluo (Owner) commented

I see what you mean. I misunderstood. Yeah, it should work.
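
Concretely, because the criterion multiplies each step's loss by the corresponding mask entry before averaging, replacing a 1 with a larger value scales that word's contribution. A hedged sketch:

```python
import torch

target = torch.tensor([[1, 12, 20, 0, 0]])
mask = torch.tensor([[1., 1., 1., 1., 0.]])

# up-weight the loss on word 12 by a factor of 5
weights = mask.clone()
weights[target == 12] = 5.0    # -> [[1., 5., 1., 1., 0.]]
```

Whether to keep normalizing by the original mask.sum() or by the weighted sum is a separate design choice.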

brisker closed this as completed Nov 18, 2017