DataParallel is not compatible with pack_padded_sequence #2312

Open
ZiJianZhao opened this issue Aug 7, 2017 · 6 comments
Labels
awaiting response (deprecated) · module: data parallel · triaged

Comments

@ZiJianZhao

If the model uses pack_padded_sequence, then wrapping it in DataParallel fails with "ValueError: lengths array has incorrect size".

@ngimel
Collaborator

ngimel commented Aug 7, 2017

Can you please provide a minimal reproducer?

@soumith
Member

soumith commented Aug 30, 2017

@ZiJianZhao still waiting on a response.

@jgc128

jgc128 commented Nov 7, 2017

Hi,

I have the same error (see also issue #1591). The code below works on one GPU (CUDA_VISIBLE_DEVICES=0 python pack_padded_sequence_data_parallel.py), but fails with "ValueError: lengths array has incorrect size" on two GPUs (CUDA_VISIBLE_DEVICES=0,1 python pack_padded_sequence_data_parallel.py):

import numpy as np
import torch
from torch.autograd import Variable


class RNNDataParallel(torch.nn.Module):
    """Toy module that only packs its padded input."""

    def __init__(self):
        super(RNNDataParallel, self).__init__()

    def forward(self, inputs, lengths):
        # DataParallel scatters the `inputs` tensor across GPUs, but
        # `lengths` is a plain Python list, so every replica receives the
        # full list while holding only a slice of the batch.
        packed = torch.nn.utils.rnn.pack_padded_sequence(inputs, lengths, batch_first=True)

        return packed


model = RNNDataParallel()
model = torch.nn.DataParallel(model)
model = model.cuda()

# Batch of two padded sequences, sorted by decreasing length.
inputs = Variable(torch.from_numpy(np.array([
    [1, 2, 3],
    [4, 5, 0],
])))
lengths = [3, 2]

packed = model(inputs, lengths)

print(packed)

My PyTorch version is 0.2.0+e02f7bf.

@ahmedmagdiosman

ahmedmagdiosman commented Nov 20, 2017

I encountered the same issue as @jgc128.

EDIT: I think the issue is that DataParallel does not slice CPU data such as the lengths list.

EDIT2: I "fixed" this by converting the lengths to a Variable(LongTensor.cuda()) before the forward pass and converting it back to a list before calling pack_padded_sequence. A sketch follows below.
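
A minimal sketch of that workaround, assuming two GPUs and the toy module from the reproducer above (an illustration, not the exact code from the thread):

import numpy as np
import torch
from torch.autograd import Variable


class RNNDataParallel(torch.nn.Module):
    def forward(self, inputs, lengths):
        # Each replica receives its own slice of `lengths` as a CUDA
        # Variable; convert it back to a Python list, which is what
        # pack_padded_sequence expects in 0.2.0.
        lengths = lengths.data.cpu().numpy().tolist()
        return torch.nn.utils.rnn.pack_padded_sequence(inputs, lengths, batch_first=True)


model = torch.nn.DataParallel(RNNDataParallel()).cuda()

inputs = Variable(torch.from_numpy(np.array([
    [1, 2, 3],
    [4, 5, 0],
])).cuda())
# Wrapping lengths in a CUDA tensor makes DataParallel scatter it along
# dim 0 together with `inputs`, instead of replicating the full list.
lengths = Variable(torch.cuda.LongTensor([3, 2]))

packed = model(inputs, lengths)
print(packed)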

@jekbradbury
Contributor

pack_padded_sequence now supports the lengths being provided as a Tensor or Variable, I think?
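
If so, a sketch under that assumption (a newer PyTorch, where each replica can use its scattered slice of a lengths tensor directly; recent versions want the lengths as a 1-D CPU int64 tensor):

import torch


class RNNDataParallel(torch.nn.Module):
    def forward(self, inputs, lengths):
        # `lengths` arrives as this replica's slice of the batch; move it
        # to the CPU, as recent versions of pack_padded_sequence require.
        return torch.nn.utils.rnn.pack_padded_sequence(
            inputs, lengths.cpu(), batch_first=True)


model = torch.nn.DataParallel(RNNDataParallel()).cuda()

inputs = torch.tensor([[1, 2, 3],
                       [4, 5, 0]]).cuda()
lengths = torch.tensor([3, 2]).cuda()  # scattered along dim 0 with inputs

packed = model(inputs, lengths)
print(packed)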

@ahmedmagdiosman

@jekbradbury Not on my build (conda pytorch 0.2.0). Even so, the issue lies within DataParallel, which does not slice lengths. I fixed it with a simple hack (see the second edit in my previous comment).
