Question about Sequencer.lua #61
Comments
How do you initialize the Sequencer?
@guillaumekln

```lua
local Tree, parent = torch.class('onmt.Tree', 'onmt.Sequencer')

function Tree:__init(rvnn)
  self.rvnn = rvnn
  parent.__init(self, self.rvnn)
  self:resetPreallocation()
end

function Tree.load(pretrained)
  -- torch.factory returns a constructor; it must be called to get an instance,
  -- and the instance must be returned.
  local self = torch.factory('onmt.Tree')()
  self.rvnn = pretrained.modules[1]
  parent.__init(self, self.rvnn)
  self:resetPreallocation()
  return self
end

function Tree:training()
  parent.training(self)
end

function Tree:evaluate()
  parent.evaluate(self)
end

function Tree:serialize()
  return {
    modules = self.modules
  }
end

function Tree:maskPadding()
  self.maskPad = true
end

function Tree:resetPreallocation()
  self.headProto = torch.Tensor()
  self.depProto = torch.Tensor()
  self.gradFeedProto = torch.Tensor()
end

function Tree:forward(batch, f2s_)
  if self.train then
    self.inputs = {}
    self:_reset_noise()
  end

  local head_ = onmt.utils.Tensor.reuseTensor(self.headProto,
                                              {batch.size, self.rvnn.outSize})
  local dep_ = onmt.utils.Tensor.reuseTensor(self.depProto,
                                             {batch.size, self.rvnn.outSize})

  for t = 1, batch.headLength do
    -- Gather head and dependent states for step t, run the recursive cell,
    -- then scatter the composed states back into f2s_.
    onmt.utils.DepTree._get(head_, f2s_, batch.head[t])
    onmt.utils.DepTree._get(dep_, f2s_, batch.dep[t])
    local tree_input = {head_, dep_, batch.relation[t]}
    if self.train then
      self.inputs[t] = tree_input
    end
    onmt.utils.DepTree._set(f2s_, self:net(t):forward(tree_input), batch.update[t])
  end

  return f2s_
end

function Tree:backward(batch, gradFeedOutput)
  local gradFeed_ = onmt.utils.Tensor.reuseTensor(self.gradFeedProto,
                                                  {batch.size, self.rvnn.outSize})

  for t = batch.headLength, 1, -1 do
    -- Backpropagate through step t and accumulate gradients on the head and
    -- dependent nodes that fed it.
    onmt.utils.DepTree._get(gradFeed_, gradFeedOutput, batch.update[t])
    local dtree = self:net(t):backward(self.inputs[t], gradFeed_)
    onmt.utils.DepTree._add(gradFeedOutput, dtree[1], batch.head[t])
    onmt.utils.DepTree._add(gradFeedOutput, dtree[2], batch.dep[t])
    onmt.utils.DepTree._fill(gradFeedOutput, 0, batch.update[t])
  end

  return gradFeedOutput
end
```
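For reference, here is a hedged usage sketch of how such a Sequencer subclass might be driven during one training step. The names `rvnn`, `nodeStates`, and `gradNodeStates` are assumptions for illustration, not taken from the original post.

```lua
-- Hedged sketch only: assumes `rvnn` is the recursive cell, `batch` carries the
-- dependency-tree fields used above, and `nodeStates` is a pre-allocated Tensor
-- with batch.size rows and rvnn.outSize columns.
local tree = onmt.Tree(rvnn)
tree:training()

-- Forward composes node states in place, step by step over the tree.
nodeStates = tree:forward(batch, nodeStates)

-- ... compute gradNodeStates from the rest of the model ...

-- Backward walks the steps in reverse and scatters gradients back to the
-- head and dependent nodes.
gradNodeStates = tree:backward(batch, gradNodeStates)
```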
The difference is that you call:

```lua
function Module:backward(input, gradOutput, scale)
   scale = scale or 1
   self:updateGradInput(input, gradOutput)
   self:accGradParameters(input, gradOutput, scale)
   return self.gradInput
end
```

which expects `updateGradInput` to have set `self.gradInput`. On the other hand, the LSTM module is not directly exposed by the Sequencer and it only relies on the `nn.gModule` built around it. However, these lines:

```lua
self.gradInput = self.net:updateGradInput(input, gradOutput)
return self.gradInput
```

should also appear in the LSTM module for consistency. So thank you for your question.
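A minimal sketch of the wrapper pattern described above, assuming a module similar in spirit to `onmt.Network`; the class name `ExampleWrapper` and its structure are illustrative assumptions, not the repository's code.

```lua
require('nn')

-- A module that wraps an inner network must forward self.output and
-- self.gradInput, because nn.Module:backward returns self.gradInput rather
-- than whatever updateGradInput happens to return.
local Wrapper, parent = torch.class('ExampleWrapper', 'nn.Module')

function Wrapper:__init(net)
  parent.__init(self)
  self.net = net
end

function Wrapper:updateOutput(input)
  self.output = self.net:updateOutput(input)
  return self.output
end

function Wrapper:updateGradInput(input, gradOutput)
  -- Without this assignment, backward() would return the empty default
  -- self.gradInput (the "zero dimension" gradient mentioned in the question).
  self.gradInput = self.net:updateGradInput(input, gradOutput)
  return self.gradInput
end

function Wrapper:accGradParameters(input, gradOutput, scale)
  self.net:accGradParameters(input, gradOutput, scale)
end
```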
@guillaumekln That's really helpful. Thanks!
I've implemented a recursive net and initialize the Sequencer with it (also using the memory optimizer).
Source code is
But I found that it returns a gradient with zero dimension.
I have to change the updateGradInput function to
which is not necessary in LSTM.lua.
I can't find any difference between the Sequencer with an LSTM and the Sequencer with my recursive net.
I am wondering, in the current Sequencer implementation, how is `self.gradInput` redirected to `self.net.gradInput`?
Thanks.
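For context on the "zero dimension" gradient, here is a small hedged illustration (not the poster's code) of the default state that `nn.Module` starts from.

```lua
require('nn')

-- nn.Module initializes gradInput to an empty Tensor, and Module:backward
-- returns self.gradInput. If a custom updateGradInput never assigns it,
-- callers see this empty (zero-dimension) Tensor.
local m = nn.Module()
print(m.gradInput:dim())  --> 0
```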