TorchScript pack_padded_sequence and pad_packed_sequence run time error #41869
Comments
@nlpconf Please provide more details about how to reproduce this issue, such as which input arguments to supply to the function. Ideally, provide a whole self-contained repro `.py` script.
Hi @wconstab, thanks for replying. `input1` and `input2` are just 2D encoded, padded text sequences of size `[batch_size, max_length]`, for example `[[1,2,3,4,5],[2,3,5,0,0], ...]`. `mask_len` is the real text length, for example `[5,3,...]`. The model is like this:
Without the `pack_padded_sequence` and `pad_packed_sequence` steps, it works fine.
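For context, here is a minimal, self-contained sketch of the pack/unpack pattern described above. The dimensions, layer sizes, and vocabulary size are my own illustrative choices, not taken from the original post:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical sizes for illustration only.
max_len, emb_dim, hidden = 5, 8, 16

# Padded batch like [[1,2,3,4,5],[2,3,5,0,0]] with real lengths [5,3].
tokens = torch.tensor([[1, 2, 3, 4, 5], [2, 3, 5, 0, 0]])
mask_len = torch.tensor([5, 3])

embed = nn.Embedding(10, emb_dim, padding_idx=0)
rnn = nn.LSTM(emb_dim, hidden, batch_first=True)

x = embed(tokens)  # [batch, max_len, emb_dim]
# Lengths must live on CPU; enforce_sorted=False handles unsorted batches.
packed = pack_padded_sequence(x, mask_len.cpu(), batch_first=True,
                              enforce_sorted=False)
out, _ = rnn(packed)
# total_length pads the output back to the original max_length.
out, lens = pad_packed_sequence(out, batch_first=True, total_length=max_len)
print(out.shape)  # [batch, max_len, hidden]
```

In eager mode this runs fine; the thread is about the same pattern failing once the model is converted to TorchScript.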
@nlpconf Could you provide a small self-contained script that I can run locally? It's hard for me to reproduce because many functions are missing; a smaller self-contained script would help me root-cause the issue faster :)
Here I prepared a small piece of code with fake data. Please let me know if this works.
@nlpconf Have you resolved this issue in any way? I think I have something similar.
@BartlomiejSkwira @nlpconf: Did you try the workaround suggested above?
@nikithamalgifb I don't see any workaround in this thread. What did you mean exactly? In my case the line `X = X.cuda()` was causing an exception. My workaround was to comment out this part (it had to be commented out, not just disabled with an `if`). In the end I trained and served my model on CPU.
I am referring to this: #41869 (comment), which @wanchaol had suggested earlier.
This seems like an interesting case, especially since you mentioned guarding it with an `if`.
@gmagogsfm I have removed the CUDA code entirely and trained/served my model on CPU. By the way: I think during compilation all code paths have to be visited, so guarding the statement with an `if` is not enough.
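The claim that all code paths are visited can be checked with a small sketch (my own example, not from this thread): `torch.jit.script` compiles the whole function body, so even a branch that can never execute at runtime must still compile.

```python
import torch

def f(x: torch.Tensor) -> torch.Tensor:
    # This branch is dead for any realistic input, but
    # torch.jit.script still has to compile it.
    if x.numel() > 10**9:
        return x.nonexistent_method()
    return x + 1

try:
    torch.jit.script(f)
    print("scripted OK")
except Exception as e:
    # Scripting fails even though the branch would never run.
    print("scripting failed:", type(e).__name__)
```

This is a key difference from `torch.jit.trace`, which records only the path actually executed, so a guarded `.cuda()` call can slip through a trace but not through scripting.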
I've just recently encountered this issue and have a minimal reproducer on the latest stable PyTorch (2.0.1). My reproducer in question is in a custom implementation of top-k mask construction:

```python
def topk_mask(score, num_dense_elem):
    indices = torch.sort(score, dim=-1, descending=True).indices
    iota = torch.arange(
        indices.shape[-1],
        dtype=num_dense_elem.dtype,
        device=num_dense_elem.device,
    )
    in_topk = iota <= num_dense_elem
    indices = torch.where(in_topk, indices, indices[..., 0:1])
    mask = torch.zeros_like(score, dtype=torch.bool)
    mask = mask.scatter(-1, indices, torch.tensor(True))
    return mask
```

Calling this in eager PyTorch, I get a valid result:

```python
In : topk_mask(torch.arange(10, dtype=torch.float32), torch.tensor(4))
Out: tensor([False, False, False, False, False,  True,  True,  True,  True,  True])
```

But when I trace the function with TorchScript, I get the following error:

```
In : torch.jit.trace(topk_mask, (torch.arange(10, dtype=torch.float32), torch.tensor(4)))
...
<ipython-input-31-8fd7c943c439> in topk_mask(score, num_dense_elem)
     10
     11     mask = torch.zeros_like(score, dtype=torch.bool)
---> 12     mask = mask.scatter(-1, indices, torch.tensor(True))
     13     return mask
     14

RuntimeError: Expected index [10] to be smaller than self [10] apart from dimension 0 and to be smaller size than src []
```

The error appears to be coming from this check.
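For what it's worth, the 0-dim `src` tensor appears to be what trips the size check: `torch.tensor(True)` selects the tensor-`src` overload of `scatter`, whose shape check compares `index` against a scalar `src` (note `src []` in the error message). One possible workaround, sketched below as my own suggestion and not confirmed by the maintainers, is to pass a `src` tensor shaped like `indices`:

```python
import torch

def topk_mask_fixed(score, num_dense_elem):
    indices = torch.sort(score, dim=-1, descending=True).indices
    iota = torch.arange(
        indices.shape[-1],
        dtype=num_dense_elem.dtype,
        device=num_dense_elem.device,
    )
    in_topk = iota <= num_dense_elem
    indices = torch.where(in_topk, indices, indices[..., 0:1])
    mask = torch.zeros_like(score, dtype=torch.bool)
    # Give scatter a src tensor shaped like `indices` instead of a
    # 0-dim tensor, so the tensor-src shape check sees matching sizes.
    src = torch.ones_like(indices, dtype=torch.bool)
    mask = mask.scatter(-1, indices, src)
    return mask

traced = torch.jit.trace(
    topk_mask_fixed,
    (torch.arange(10, dtype=torch.float32), torch.tensor(4)),
)
```

Passing a plain Python `True` (the scalar-value overload of `scatter`) may also sidestep the tensor-`src` shape check.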
❓ Questions and Help
Hi, I am facing this problem and have been searching for answers for a day. Can anyone help?
The part of the code causing the problem is here. I was trying to use an RNN; I removed the RNN layer and the problem persists.
My code runs without a problem until I try to convert it to TorchScript. And TorchScript works fine until I add the RNN layer (using `pack_padded_sequence` and `pad_packed_sequence`).
cc @suo @gmagogsfm