
transfer_batch_to_gpu not moving List[torch.Tensor] and similar data structures to GPU #888

@Laksh1997

Description

🐛 Bug

trainer.transfer_batch_to_gpu(torch.randn(3, 3), 0) gives:

tensor([[-1.0600,  0.3306,  1.1276],
        [ 1.0012, -0.2687, -0.5493],
        [-0.5619, -1.7161, -0.8625]], device='cuda:0')

But trainer.transfer_batch_to_gpu([torch.randn(3, 3)], 0) gives:

[tensor([[ 1.3631,  0.3408, -1.1074],
         [-1.1176, -0.8056,  0.2937],
         [-0.4235,  0.7321, -0.8811]])]

which is not the expected behaviour, since all tensors inside the data structure should be moved onto the GPU.

The same issue occurs for tuple and dict data structures; a sketch of the expected behaviour is given below.
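For reference, I would expect something like the following recursive transfer (a minimal sketch of my own; the helper name transfer_to_device is hypothetical, not Lightning's internal function):

import torch

def transfer_to_device(batch, device):
    # Recurse into common containers and move every tensor found
    # onto the target device; rebuild lists/tuples/dicts around the results.
    if isinstance(batch, torch.Tensor):
        return batch.to(device)
    if isinstance(batch, (list, tuple)):
        return type(batch)(transfer_to_device(x, device) for x in batch)
    if isinstance(batch, dict):
        return {k: transfer_to_device(v, device) for k, v in batch.items()}
    return batch  # non-tensor leaves are returned unchanged

# assumes a CUDA device is available
moved = transfer_to_device([torch.randn(3, 3)], "cuda:0")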

Also, would it be possible to support GPU transfer for argparse.Namespace objects, or for custom class-based data structures?

That is, if I define a class like this:

class Batch:
    def __init__(self,
                 src_input_ids,
                 tgt_input_ids,
                 src_attn_mask,
                 tgt_attn_mask,
                 src_seg_ids,
                 tgt_labels):
        # store every constructor argument as an attribute,
        # dropping the spurious `self` entry from locals()
        attrs = locals()
        attrs.pop("self")
        self.__dict__.update(attrs)

it would be great to pass this object in directly and have all of its tensor attributes transferred to the GPU.
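In the meantime, a workaround I can use (my own convention, not an existing Lightning hook) is to give the class a to() method in the style of nn.Module; shown here with the field list trimmed for brevity:

import torch

class Batch:
    def __init__(self, src_input_ids, tgt_input_ids):
        self.src_input_ids = src_input_ids
        self.tgt_input_ids = tgt_input_ids

    def to(self, device):
        # move every tensor attribute onto the target device, in place
        for name, value in self.__dict__.items():
            if isinstance(value, torch.Tensor):
                setattr(self, name, value.to(device))
        return self

# assumes a CUDA device is available
batch = Batch(torch.randn(2, 8), torch.randn(2, 8)).to("cuda:0")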

Thanks a lot!

To Reproduce

Instantiate a Trainer as in any of the examples, i.e. trainer = Trainer(...), then run the snippets above.
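For completeness, a self-contained script (assuming a machine with at least one CUDA device, and that transfer_batch_to_gpu keeps the (batch, gpu_id) signature used above):

import torch
from pytorch_lightning import Trainer

trainer = Trainer()

# a bare tensor is moved as expected
print(trainer.transfer_batch_to_gpu(torch.randn(3, 3), 0).device)       # cuda:0

# a list of tensors comes back unchanged, still on the CPU (the bug)
print(trainer.transfer_batch_to_gpu([torch.randn(3, 3)], 0)[0].device)  # cpu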

Code sample

See the script above.

Expected behavior

All tensors nested inside lists, tuples, and dicts should end up on the GPU, exactly as a bare tensor does.

Environment

Python 3.6
PyTorch 1.3.1
CUDA 10.1
PyTorch Lightning master (as of today)
