It looks like I've encountered a little bug with batch_size=1 during CPU inference (haven't checked on GPU yet). I've found that, while forwarding in CPUForgetMult, each h gets squeezed across all dimensions when it is appended to the resulting list of tensors, concretely:

result.append(h.squeeze())

It turns out that h has size (1, batch_size, feats) at each iteration, so squeezing with batch_size=1 yields a tensor of size (feats,), and the final torch.stack(result) has size (seq_len, feats) rather than (seq_len, 1, feats).

This causes an error later, in QRNN's forward, where C[-1:, :, :] tries to access every sample in the batch dimension (of size 1), which no longer exists because of the squeeze. The fix is simply to squeeze dimension 0 only (with batch_first=False, which is the only option available at the moment).
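For reference, here is a minimal sketch of the fix, with assumed shapes rather than the actual CPUForgetMult source: squeezing only dimension 0 drops the length-1 time slice while keeping the batch dimension, even when batch_size=1.

import torch

# Minimal sketch (assumed shapes, not the actual CPUForgetMult code).
# f and x have shape (seq_len, batch_size, feats).
def forget_mult(f, x, hidden_init=None):
    result = []
    h = hidden_init  # (batch_size, feats) or None
    for t in range(f.size(0)):
        ft, xt = f[t:t + 1], x[t:t + 1]  # each (1, batch_size, feats)
        h = ft * xt if h is None else ft * xt + (1 - ft) * h
        # h.squeeze() would also drop the batch dim when batch_size == 1;
        # h.squeeze(0) removes only the leading length-1 time dimension.
        result.append(h.squeeze(0))  # (batch_size, feats)
    return torch.stack(result)  # (seq_len, batch_size, feats)

With batch_size=1 this returns a (seq_len, 1, feats) tensor, so the later C[-1:, :, :] indexing in QRNN's forward keeps working.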
Was going to file what I think is a similar issue, just thought I'd check if it's the same root cause.
Have been following the steps in https://github.com/salesforce/awd-lstm-lm to generate a QRNN model, which completed successfully, but am getting this when trying to use it to generate text:
$ python generate.py --data ./data/mydata --checkpoint MYQRNNMODEL.pt --cuda
Traceback (most recent call last):
  File "generate.py", line 65, in <module>
    output, hidden = model(input, hidden)
  File "/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/awd-lstm-lm/model.py", line 82, in forward
    raw_output, new_h = rnn(raw_output, hidden[l])
  File "/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda3/lib/python3.6/site-packages/torchqrnn/qrnn.py", line 60, in forward
    Xm1 = [self.prevX if self.prevX is not None else X[:1, :, :] * 0, X[:-1, :, :]]
  File "/miniconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 76, in __getitem__
    return Index.apply(self, key)
  File "/miniconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 16, in forward
    result = i.index(ctx.index)
ValueError: result of slicing is an empty tensor
(same with or without the --cuda flag)
Any idea what I can do to get generate.py to work with this model? Or should I file a separate issue?
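For reference, the line that fails slices X[:-1, :, :]. If generate.py feeds the model a single token per step (so seq_len is 1), that slice is empty, and the torch.autograd.Variable indexing path in the traceback rejects empty slices with exactly this ValueError. A quick shape check with hypothetical sizes:

import torch

# Hypothetical sizes: one generation step, batch of 1, 10 features.
X = torch.zeros(1, 1, 10)  # (seq_len, batch_size, feats)
print(X[:-1, :, :].shape)  # torch.Size([0, 1, 10]) -- an empty slice

On modern PyTorch this just prints an empty shape, but the old Variable indexing raises "result of slicing is an empty tensor" instead, which suggests the trigger here is seq_len=1 rather than batch_size=1.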