🐛 Bug
trainer.transfer_batch_to_gpu(torch.randn(3, 3), 0) gives:

tensor([[-1.0600,  0.3306,  1.1276],
        [ 1.0012, -0.2687, -0.5493],
        [-0.5619, -1.7161, -0.8625]], device='cuda:0')

But trainer.transfer_batch_to_gpu([torch.randn(3, 3)], 0) gives:

[tensor([[ 1.3631,  0.3408, -1.1074],
         [-1.1176, -0.8056,  0.2937],
         [-0.4235,  0.7321, -0.8811]])]
which is not the expected behaviour: all tensors nested inside the data structure should be moved onto the GPU. The same issue occurs for tuple and dictionary data structures.
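For illustration, here is roughly the recursion I'd expect (a minimal sketch in plain PyTorch; move_to_device is a made-up helper name, not Lightning's actual implementation):

```python
import torch

def move_to_device(batch, device):
    # Hypothetical helper, not Lightning's API: recursively move every
    # tensor found in common container types to the target device.
    if isinstance(batch, torch.Tensor):
        return batch.to(device)
    if isinstance(batch, (list, tuple)):
        return type(batch)(move_to_device(x, device) for x in batch)
    if isinstance(batch, dict):
        return {k: move_to_device(v, device) for k, v in batch.items()}
    return batch  # leave non-tensor leaves untouched

# Prints [tensor(..., device='cuda:0')] -- the list is preserved,
# its tensor moved (requires a CUDA device).
print(move_to_device([torch.randn(3, 3)], torch.device('cuda:0')))
```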
Also, would it be possible to support GPU transfer for argparse.Namespace objects, or for instances of user-defined classes?
i.e. if I make a class like so:

class Batch(object):
    def __init__(self,
                 src_input_ids,
                 tgt_input_ids,
                 src_attn_mask,
                 tgt_attn_mask,
                 src_seg_ids,
                 tgt_labels):
        args = locals()
        args.pop('self')  # avoid storing a self-reference in __dict__
        self.__dict__.update(args)
it would be cool if I could pass such an object in directly and have all of its tensor attributes moved to the GPU. One possible way this could work is sketched below.
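Just a sketch of an idea, not an existing Lightning API (move_object_to_device is a made-up name): the transfer logic could fall back to walking an arbitrary object's __dict__ and moving any tensor-valued attributes:

```python
import torch

def move_object_to_device(obj, device):
    # Hypothetical extension of the container recursion above:
    # move every tensor-valued attribute of a plain object in place.
    for name, value in vars(obj).items():
        if isinstance(value, torch.Tensor):
            setattr(obj, name, value.to(device))
    return obj
```

Alternatively, the trainer could simply call a user-defined .to(device) method if the batch object provides one.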
Thanks a lot!
To Reproduce
Just copy one of the examples to get the Trainer class, then do trainer = Trainer(...) and run the code above.
Code sample
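Something like the following should reproduce it (assuming a machine with a CUDA device and Lightning master; I construct a bare Trainer() here only to get at the transfer helper):

```python
import torch
from pytorch_lightning import Trainer

trainer = Trainer()

# A bare tensor is moved to cuda:0 as expected...
print(trainer.transfer_batch_to_gpu(torch.randn(3, 3), 0))

# ...but a tensor wrapped in a list (same for tuple/dict) stays on the CPU.
print(trainer.transfer_batch_to_gpu([torch.randn(3, 3)], 0))
```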
Expected behavior
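All tensors nested inside lists, tuples, and dicts should end up on the requested GPU, just like a bare tensor does.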
Environment
- Python: 3.6
- PyTorch: 1.3.1
- CUDA: 10.1
- PyTorch Lightning: master (as of today)