`torch.multinomial(torch.FloatTensor(10).cuda().normal_(), 3)` returns a FloatTensor on GPU, whereas the same call on CPU returns a LongTensor.
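
A minimal sketch for checking the result type on both devices, in case it helps with reproducing this. The helper name `multinomial_result_dtype` is hypothetical, and the weights here are drawn with `torch.rand` rather than `normal_()` (an assumption on my part, since `normal_()` can produce negative entries, which are not valid sampling weights). The `device=` keyword may not exist in older releases, where `.cuda()` as in the snippet above would be used instead.

```python
import torch


def multinomial_result_dtype(device):
    """Sample 3 indices from 10 weights on `device` and return the result dtype."""
    # Strictly positive weights (assumption: replaces normal_() from the report,
    # which can yield negative values).
    weights = torch.rand(10, device=device) + 0.1
    samples = torch.multinomial(weights, 3)
    return samples.dtype


cpu_dtype = multinomial_result_dtype("cpu")
print("cpu :", cpu_dtype)

# Only query the GPU path when a CUDA device is actually present.
if torch.cuda.is_available():
    print("cuda:", multinomial_result_dtype("cuda"))
```

If both lines print the same integer dtype, the two devices agree; a floating-point dtype on the `cuda` line would match the behavior described here.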