
Tensors are the central data abstraction in PyTorch.

In [None]:
import torch
import math

#Creating Tensors

The simplest way to create a tensor is with the `torch.empty()` call:

In [None]:
x = torch.empty(3, 4)
print(type(x))
print(x)

<class 'torch.Tensor'>
tensor([[1.5758e+03, 3.0789e-41, 3.3631e-44, 0.0000e+00],
        [       nan, 6.4460e-44, 1.1578e+27, 1.1362e+30],
        [7.1547e+22, 4.5828e+30, 1.2121e+04, 7.1846e+22]])


* We created a tensor using one of the numerous factory methods attached to the `torch` module.
* The tensor itself is 2D, having 3 rows and 4 columns.
* The type of this object is `torch.Tensor`; by default, PyTorch tensors are populated with 32-bit floating point numbers.
* The `torch.empty()` call allocates memory for the tensor, but does not initialize it with any values, so what we see printed is whatever was in memory at the time of allocation.
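
The points above can be confirmed by inspecting the tensor's attributes. This is a minimal sketch that assumes the default dtype has not been changed (for example with `torch.set_default_dtype()`):

```python
import torch

x = torch.empty(3, 4)

print(x.ndim)     # number of dimensions: 2
print(x.shape)    # torch.Size([3, 4])
print(x.dtype)    # torch.float32, assuming the default dtype is unchanged
print(x.numel())  # total number of elements: 12
```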

More often we want to initialize tensors with some value. Common cases are all zeros, all ones, or random values:

In [None]:
zeros = torch.zeros(2, 3)
print(zeros)

tensor([[0., 0., 0.],
        [0., 0., 0.]])


In [None]:
ones = torch.ones(2, 3)
print(ones)

tensor([[1., 1., 1.],
        [1., 1., 1.]])


In [None]:
torch.manual_seed(1)
random = torch.rand(2, 3)
print(random)

tensor([[0.7576, 0.2793, 0.4031],
        [0.7347, 0.0293, 0.7999]])


#Random Tensors and Seeding
Remember to manually set the random number generator's seed when you need reproducible results:

In [None]:
torch.manual_seed(1)
random1 = torch.rand(2, 3)
print('Random1: ')
print(random1)

random2 = torch.rand(2, 3)
print('\nRandom2: ')
print(random2)

torch.manual_seed(1)
random3 = torch.rand(2, 3)
print('\nRandom3: ')
print(random3)
print('Should be identical to Random1')

random4 = torch.rand(2, 3)
print('\nRandom4: ')
print(random4)
print('Should be identical to Random2')

Random1: 
tensor([[0.7576, 0.2793, 0.4031],
        [0.7347, 0.0293, 0.7999]])

Random2: 
tensor([[0.3971, 0.7544, 0.5695],
        [0.4388, 0.6387, 0.5247]])

Random3: 
tensor([[0.7576, 0.2793, 0.4031],
        [0.7347, 0.0293, 0.7999]])
Should be identical to Random1

Random4: 
tensor([[0.3971, 0.7544, 0.5695],
        [0.4388, 0.6387, 0.5247]])
Should be identical to Random2
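
`torch.manual_seed()` sets the seed of the global generator. As an alternative sketch (not part of the walkthrough above), a separate `torch.Generator` object can be passed to the random factory functions, which keeps the seeded sequence independent of other random calls. The names `gen_a` and `gen_b` are illustrative only:

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(1)
gen_b = torch.Generator().manual_seed(1)

sample_a = torch.rand(2, 3, generator=gen_a)
sample_b = torch.rand(2, 3, generator=gen_b)

# Same seed, same sequence, regardless of the global generator's state
print(torch.equal(sample_a, sample_b))  # True
```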


#Tensor Shapes
Often, when performing operations on two or more tensors, we expect them to be of the same shape. For that, we have the `torch.*_like()` methods:

In [None]:
x = torch.empty(2, 2, 3)
print(x.shape)
print(x)

torch.Size([2, 2, 3])
tensor([[[1.5760e+03, 3.0789e-41, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]],

        [[0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]]])


In [None]:
empty_like_x = torch.empty_like(x)
print(empty_like_x.shape)
print(empty_like_x)

torch.Size([2, 2, 3])
tensor([[[1.5760e+03, 3.0789e-41, 3.3631e-44],
         [0.0000e+00,        nan, 0.0000e+00]],

        [[1.1578e+27, 1.1362e+30, 7.1547e+22],
         [4.5828e+30, 1.2121e+04, 7.1846e+22]]])


In [None]:
zeros_like_x = torch.zeros_like(x)
print(zeros_like_x.shape)
print(zeros_like_x)

torch.Size([2, 2, 3])
tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])


In [None]:
ones_like_x = torch.ones_like(x)
print(ones_like_x.shape)
print(ones_like_x)

torch.Size([2, 2, 3])
tensor([[[1., 1., 1.],
         [1., 1., 1.]],

        [[1., 1., 1.],
         [1., 1., 1.]]])


In [None]:
rand_like_x = torch.rand_like(x)
print(rand_like_x.shape)
print(rand_like_x)

torch.Size([2, 2, 3])
tensor([[[0.6826, 0.3051, 0.4635],
         [0.4550, 0.5725, 0.4980]],

        [[0.9371, 0.6556, 0.3138],
         [0.1980, 0.4162, 0.2843]]])
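
Besides the shape, the `torch.*_like()` functions also take the dtype and device from the source tensor unless told otherwise. A small sketch, using a hypothetical `float64` source tensor named `x64`:

```python
import torch

x64 = torch.empty(2, 2, 3, dtype=torch.float64)

zeros_like_x64 = torch.zeros_like(x64)
print(zeros_like_x64.dtype)                 # torch.float64, inherited from x64
print(zeros_like_x64.device == x64.device)  # True

# The dtype can still be overridden explicitly
ones_int = torch.ones_like(x64, dtype=torch.int32)
print(ones_int.dtype)                       # torch.int32
```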


Another way to create a tensor is to specify its data directly from a Python collection:

In [None]:
some_constants = torch.tensor([[1.1, 2.2], [3.3, 4.4]])
print(some_constants)

tensor([[1.1000, 2.2000],
        [3.3000, 4.4000]])


In [None]:
some_integers = torch.tensor((2,3,5,7,11,13,17,19))
print(some_integers)

tensor([ 2,  3,  5,  7, 11, 13, 17, 19])


In [None]:
more_integers = torch.tensor(((2,4,6), [3,6,9]))
print(more_integers)

tensor([[2, 4, 6],
        [3, 6, 9]])


Using `torch.tensor()` is the most straightforward way to create a tensor if we already have data in a Python tuple or list. As the last example shows, nested collections produce a multi-dimensional tensor.

Note: `torch.tensor()` creates a copy of data.
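
A short sketch of what the copy means in practice: changing the original Python list afterwards does not affect the tensor. The variable names are illustrative.

```python
import torch

data = [[1.0, 2.0], [3.0, 4.0]]
t = torch.tensor(data)   # copies the values out of the list

data[0][0] = 99.0        # modify the source list afterwards

print(data[0][0])        # 99.0
print(t[0, 0].item())    # 1.0, the tensor kept its own copy
```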

#Tensor Data Types
Setting the datatype of a tensor:

In [None]:
a = torch.ones((2,3), dtype=torch.int16)
print(a)

tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)


In [None]:
b = torch.rand((2,3), dtype=torch.float64) * 20.
print(b)

tensor([[14.3641,  7.6902,  1.7959],
        [ 2.3491, 12.8048,  3.9353]], dtype=torch.float64)


In [None]:
c = b.to(torch.int32)
print(c)

tensor([[14,  7,  1],
        [ 2, 12,  3]], dtype=torch.int32)


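Notice that converting `b` from a 64-bit float to a 32-bit integer dropped the fractional part of each value. Available data types include: `torch.bool`, `torch.int8`, `torch.uint8`, `torch.int16`, `torch.int32`, `torch.int64`, `torch.half`, `torch.float`, `torch.double`, and `torch.bfloat16`.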
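
Several of these names are aliases for sized types, and converting floats to integers truncates toward zero rather than rounding. A minimal sketch to check both points:

```python
import torch

# Short names are aliases for the explicitly sized dtypes
print(torch.half == torch.float16)    # True
print(torch.float == torch.float32)   # True
print(torch.double == torch.float64)  # True

# Float-to-integer conversion truncates toward zero
values = torch.tensor([-1.7, -0.2, 0.9, 1.7])
as_int = values.to(torch.int32)
print(as_int.tolist())                # [-1, 0, 0, 1]
```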

#Math & Logic with PyTorch Tensors

See how tensors interact with simple scalars:

In [None]:
ones = torch.zeros(2,2) + 1
print(ones)

tensor([[1., 1.],
        [1., 1.]])


In [None]:
twos = torch.ones(2,2) * 2
print(twos)

tensor([[2., 2.],
        [2., 2.]])


In [None]:
threes = (torch.ones(2,2) * 7 - 1) / 2
print(threes)

tensor([[3., 3.],
        [3., 3.]])


In [None]:
fours = twos ** 2
print(fours)

tensor([[4., 4.],
        [4., 4.]])


In [None]:
sqrt2s = twos ** 0.5
print(sqrt2s)

tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])


Operations between two tensors:

In [None]:
powers2 = twos ** torch.tensor([[1,2],[3,4]])
print(powers2)

tensor([[ 2.,  4.],
        [ 8., 16.]])


In [None]:
fives = ones + fours
print(fives)

tensor([[5., 5.],
        [5., 5.]])


In [None]:
dozens = threes * fours
print(dozens)

tensor([[12., 12.],
        [12., 12.]])


These element-wise operations require that all tensors have identical shapes.

#In Brief: Tensor Broadcasting
The exception to the same-shape rule is *tensor broadcasting*:

In [None]:
rand = torch.rand(2,4)
doubled = rand * (torch.ones(1,4) * 2)

print(rand)
print(doubled)

tensor([[0.9391, 0.4167, 0.7140, 0.2676],
        [0.9906, 0.2885, 0.8750, 0.5059]])
tensor([[1.8781, 0.8334, 1.4280, 0.5353],
        [1.9812, 0.5769, 1.7499, 1.0118]])


Why is there no error? Broadcasting is a way to perform an operation between tensors that have similarities in their shapes. In the example above, the 1-row, 4-column tensor is multiplied by both rows of the 2-row, 4-column tensor.

This is an important operation in Deep Learning. The common example is multiplying a tensor of learning weights by a *batch* of input tensors, applying the operation to each instance in the batch separately, and returning a tensor of identical shape - just like the (2,4)*(1,4) example above returned a tensor of shape (2,4).

The rules for broadcasting:
* Each tensor must have at least one dimension - no empty tensors.
* Comparing the dimension sizes of the two tensors, *going from last to first*:
    * Each dimension must be equal, or
    * One of the dimensions must be size 1, or
    * The dimension does not exist in one of the tensors
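
One way to try these rules without allocating any data is `torch.broadcast_shapes()`, which returns the resulting shape or raises a `RuntimeError` when the shapes are incompatible. This is a sketch that assumes a PyTorch version recent enough to include that function (1.8 or later):

```python
import torch

# Compatible: trailing dims match, or one of them is 1, or is absent
print(torch.broadcast_shapes((4, 3, 2), (3, 2)))  # torch.Size([4, 3, 2])
print(torch.broadcast_shapes((4, 3, 2), (3, 1)))  # torch.Size([4, 3, 2])
print(torch.broadcast_shapes((4, 3, 2), (1, 2)))  # torch.Size([4, 3, 2])

# Incompatible: comparing last-to-first, 2 vs 3 and 3 vs 4 do not match
try:
    torch.broadcast_shapes((4, 3, 2), (4, 3))
    compatible = True
except RuntimeError:
    compatible = False
print(compatible)  # False
```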

Here are some examples of situations that honor the above rules and allow broadcasting:

In [None]:
a =     torch.ones(4, 3, 2)

b = a * torch.rand(   3, 2) # 3rd & 2nd dims identical to a, dim 1 absent
print(b)

tensor([[[0.2366, 0.7570],
         [0.2346, 0.6471],
         [0.3556, 0.4452]],

        [[0.2366, 0.7570],
         [0.2346, 0.6471],
         [0.3556, 0.4452]],

        [[0.2366, 0.7570],
         [0.2346, 0.6471],
         [0.3556, 0.4452]],

        [[0.2366, 0.7570],
         [0.2346, 0.6471],
         [0.3556, 0.4452]]])


* The multiplication operation that created `b` was broadcast over every "layer" of `a`.

In [None]:
a =     torch.ones(4, 3, 2)
c = a * torch.rand(   3, 1) # 3rd dim = 1, 2nd dim identical to a
print(c)

tensor([[[0.0193, 0.0193],
         [0.2616, 0.2616],
         [0.7713, 0.7713]],

        [[0.0193, 0.0193],
         [0.2616, 0.2616],
         [0.7713, 0.7713]],

        [[0.0193, 0.0193],
         [0.2616, 0.2616],
         [0.7713, 0.7713]],

        [[0.0193, 0.0193],
         [0.2616, 0.2616],
         [0.7713, 0.7713]]])


The operation that created `c` was broadcast over every layer and row of `a` - every 3-element column is identical.

In [None]:
a =     torch.ones(4, 3, 2)
d = a * torch.rand(   1, 2) # 3rd dim identical to a, 2nd dim = 1
print(d)

tensor([[[0.3785, 0.9980],
         [0.3785, 0.9980],
         [0.3785, 0.9980]],

        [[0.3785, 0.9980],
         [0.3785, 0.9980],
         [0.3785, 0.9980]],

        [[0.3785, 0.9980],
         [0.3785, 0.9980],
         [0.3785, 0.9980]],

        [[0.3785, 0.9980],
         [0.3785, 0.9980],
         [0.3785, 0.9980]]])


For `d`, we switched it around - every row is identical, across layers and columns.

In [None]:
# Each of the following multiplications raises a RuntimeError
a =     torch.ones(4, 3, 2)

b = a * torch.rand(4, 3)    # dimensions must match last-to-first

c = a * torch.rand(   2, 3) # both 3rd & 2nd dims different

d = a * torch.rand((0, ))   # can't broadcast with an empty tensor

#More Math with Tensors
Some samples from some of the major categories of operations:

In [None]:
# common functions
a = torch.rand(2, 4) * 2 - 1
print('Common functions:')
print(torch.abs(a))
print(torch.ceil(a))
print(torch.floor(a))
print(torch.clamp(a, -0.5, 0.5))

Common functions:
tensor([[0.8016, 0.0468, 0.6675, 0.6090],
        [0.3104, 0.6464, 0.6495, 0.6071]])
tensor([[1., -0., -0., 1.],
        [1., -0., 1., 1.]])
tensor([[ 0., -1., -1.,  0.],
        [ 0., -1.,  0.,  0.]])
tensor([[ 0.5000, -0.0468, -0.5000,  0.5000],
        [ 0.3104, -0.5000,  0.5000,  0.5000]])


In [None]:
# trigonometric functions and their inverses
angles = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
sines = torch.sin(angles)
inverses = torch.asin(sines)
print('Sine and arcsine:')
print(angles)
print(sines)
print(inverses)

Sine and arcsine:
tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7854, 1.5708, 0.7854])


In [None]:
# bitwise operations
print('Bitwise XOR:')
b = torch.tensor([1, 5, 11])
c = torch.tensor([2, 7, 10])
print(torch.bitwise_xor(b, c))

Bitwise XOR:
tensor([3, 2, 1])


In [None]:
# comparisons:
print('Broadcasted, element-wise equality comparison:')
d = torch.tensor([[1., 2.], [3., 4.]])
e = torch.ones(1, 2) # many comparison ops support broadcasting!
print(torch.eq(d, e)) # return a tensor of type bool

Broadcasted, element-wise equality comparison:
tensor([[ True, False],
        [False, False]])


In [None]:
# reductions:
print('Reduction ops:')
print(torch.max(d)) # returns a single-element tensor
print(torch.max(d).item()) # extracts the value from the returned tensor
print(torch.mean(d)) # average
print(torch.std(d)) # std
print(torch.prod(d)) # product of all numbers
print(torch.unique(torch.tensor([1, 2, 1, 2, 1, 2]))) # filter unique elements

Reduction ops:
tensor(4.)
4.0
tensor(2.5000)
tensor(1.2910)
tensor(24.)
tensor([1, 2])
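
The reductions above collapse the whole tensor to a single value. Most of them also accept a `dim` argument to reduce along one dimension only. A brief sketch using the same values as `d`:

```python
import torch

d = torch.tensor([[1., 2.], [3., 4.]])

col_sums = torch.sum(d, dim=0)    # reduce over rows: one sum per column
row_means = torch.mean(d, dim=1)  # reduce over columns: one mean per row
row_max = torch.max(d, dim=1)     # returns both the values and their indices

print(col_sums.tolist())          # [4.0, 6.0]
print(row_means.tolist())         # [1.5, 3.5]
print(row_max.values.tolist())    # [2.0, 4.0]
print(row_max.indices.tolist())   # [1, 1]
```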


In [None]:
# vector and linear algebra
v1 = torch.tensor([1., 0., 0.])         # x unit vector
v2 = torch.tensor([0., 1., 0.])         # y unit vector
m1 = torch.rand(2, 2)                   # random matrix
m2 = torch.tensor([[3., 0.], [0., 3.]]) # three times identity matrix

print('Vectors & Matrices:')
print(torch.cross(v2, v1)) # negative of z unit vector
print(m1)

m3 = torch.matmul(m1, m2)
print(m3) # 3 times m1
print(torch.svd(m3)) # svd

Vectors & Matrices:
tensor([ 0.,  0., -1.])
tensor([[0.5730, 0.1205],
        [0.1452, 0.7720]])
tensor([[1.7191, 0.3616],
        [0.4356, 2.3160]])
torch.return_types.svd(
U=tensor([[ 0.4394,  0.8983],
        [ 0.8983, -0.4394]]),
S=tensor([2.5158, 1.5199]),
V=tensor([[ 0.4557,  0.8901],
        [ 0.8901, -0.4557]]))


#Altering Tensors in Place
Most binary operations on tensors will return a third, new tensor.

There are times that we may wish to alter a tensor in place. For this, most of the math functions have a version with an appended underscore (`_`) that will alter a tensor in place.

In [None]:
a = torch.tensor([0, math.pi/4, math.pi/2, 3*math.pi/4])
print('a:')
print(a)
print('\nsin(a):')
print(torch.sin(a)) # this op creates a new tensor in memory
print('check a:')
print(a) # a has not changed

a:
tensor([0.0000, 0.7854, 1.5708, 2.3562])

sin(a):
tensor([0.0000, 0.7071, 1.0000, 0.7071])
check a:
tensor([0.0000, 0.7854, 1.5708, 2.3562])


In [None]:
b = torch.tensor([0, math.pi/4, math.pi/2, 3*math.pi/4])
print('b:')
print(b)
print('\nsin(b):')
print(torch.sin_(b)) # note the underscore
print('check b:')
print(b) # b has changed

b:
tensor([0.0000, 0.7854, 1.5708, 2.3562])

sin(b):
tensor([0.0000, 0.7071, 1.0000, 0.7071])
check b:
tensor([0.0000, 0.7071, 1.0000, 0.7071])


For arithmetic operations:

In [None]:
a = torch.ones(2,2)
b = torch.rand(2,2)

print('Before:')
print('a:\n', a)
print('b:\n', b)
print('\nAfter adding:')
print('a+b:\n', a.add_(b))
print('a:\n', a)
print('b:\n', b)
print('\nAfter multiplying')
print('b*b:\n', b.mul_(b))
print('b:\n', b)

Before:
a:
 tensor([[1., 1.],
        [1., 1.]])
b:
 tensor([[0.7517, 0.1484],
        [0.1227, 0.5304]])

After adding:
a+b:
 tensor([[1.7517, 1.1484],
        [1.1227, 1.5304]])
a:
 tensor([[1.7517, 1.1484],
        [1.1227, 1.5304]])
b:
 tensor([[0.7517, 0.1484],
        [0.1227, 0.5304]])

After multiplying
b*b:
 tensor([[0.5650, 0.0220],
        [0.0151, 0.2813]])
b:
 tensor([[0.5650, 0.0220],
        [0.0151, 0.2813]])


Note that these in-place arithmetic functions are methods on the `torch.Tensor` object, not attached to the `torch` module like many other functions.
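
To see that an in-place method really reuses the tensor's memory, one option is to compare `data_ptr()` (the address of the first element) before and after the operation. A minimal sketch:

```python
import torch

a = torch.ones(2, 2)
ptr_before = a.data_ptr()

a.add_(1)                 # in-place: writes the result into a's own memory
same_storage = a.data_ptr() == ptr_before
print(same_storage)       # True

b = a + 1                 # out-of-place: result lives in newly allocated memory
new_storage = b.data_ptr() != a.data_ptr()
print(new_storage)        # True
```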

Many of the methods and functions have an `out` argument that lets us specify a tensor to receive the output. If the `out` tensor has the correct shape and `dtype`, this can happen without a new memory allocation:

In [None]:
a = torch.rand(2,2)
b = torch.rand(2,2)
c = torch.zeros(2,2)
old_id = id(c)

print(c)

tensor([[0., 0.],
        [0., 0.]])


In [None]:
d = torch.matmul(a, b, out=c)
print(c) # contents of c have changed

tensor([[0.3869, 0.2564],
        [0.2603, 0.4632]])


In [None]:
assert c is d 
# test c & d are same object, not just containing equal values

In [None]:
assert id(c) == old_id
# make sure that new c is the same object as the old one

In [None]:
torch.rand(2,2, out=c) # works for creation too
print(c)
assert id(c) == old_id # still the same object

tensor([[0.3486, 0.9579],
        [0.4075, 0.7819]])


#Copying Tensors
As with any object in Python, assigning a tensor to a variable makes the variable a *label* of the tensor, and does not copy it:

In [None]:
a = torch.ones(2,2)
b = a

a[0][1] = 561
print(b)

tensor([[  1., 561.],
        [  1.,   1.]])


To make a separate copy, use the `clone()` method:

In [None]:
a = torch.ones(2,2)
b = a.clone()

assert b is not a       # different objects in memory
print(torch.eq(a,b))    # but still with the same contents

tensor([[True, True],
        [True, True]])


In [None]:
a[0][1] = 561
print(b)        # but b is still all ones

tensor([[1., 1.],
        [1., 1.]])


NOTE: If your source tensor has autograd enabled, then so will the clone.

In many cases, this will be what you want. For example, if your model has multiple computation paths in its `forward()` method, and both the original tensor and its clone contribute to the model's output, then to enable model learning we want autograd turned on for both tensors.

If our source tensor has autograd enabled, then we will get the result we want.

There is an exceptional case: Imagine we are performing a computation in our model's `forward()` function, where gradients are turned on for everything by default, but we want to pull out some values mid-stream to generate some metrics. In this case, we do not want the cloned copy of our source tensor to track gradients - performance is improved with autograd's history tracking turned off. For this, we can use the `.detach()` method on the source tensor:

In [None]:
a = torch.rand(2,2, requires_grad=True) # turn on autograd
print(a)

tensor([[0.7165, 0.1768],
        [0.0748, 0.9799]], requires_grad=True)


* Here, we created `a` with `requires_grad=True` turned on.
* When we print `a`, it informs us that autograd and computation history tracking are turned on.

In [None]:
b = a.clone()
print(b)

tensor([[0.7165, 0.1768],
        [0.0748, 0.9799]], grad_fn=<CloneBackward0>)


* We clone `a` and label it `b`.
* When we print `b`, we can see that it is tracking its computation history - it has inherited `a`'s autograd settings, and added to the computation history.

In [None]:
c = a.detach().clone()
print(c)

tensor([[0.7165, 0.1768],
        [0.0748, 0.9799]])


* We clone `a` into `c`, but call `.detach()` first.
* Printing `c`, we see no computation history and no `requires_grad=True`.

The `.detach()` method detaches the tensor from its computation history.

In [None]:
print(a)

tensor([[0.7165, 0.1768],
        [0.0748, 0.9799]], requires_grad=True)
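
One detail worth checking: `.detach()` on its own returns a tensor that shares memory with the source, which is why it is followed by `.clone()` when an independent copy is wanted. A small sketch, with illustrative names `shared` and `copy`:

```python
import torch

a = torch.rand(2, 2, requires_grad=True)

shared = a.detach()          # no gradient tracking, but same memory as a
copy = a.detach().clone()    # no gradient tracking, and its own memory

print(shared.requires_grad, copy.requires_grad)  # False False
print(shared.data_ptr() == a.data_ptr())         # True
print(copy.data_ptr() == a.data_ptr())           # False
print(a.requires_grad)                           # True, a itself is unchanged
```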


#Moving to GPU
PyTorch offers robust acceleration on CUDA-compatible Nvidia GPUs. ("CUDA" stands for "Compute Unified Device Architecture", which is Nvidia's platform for parallel computing.)

In [None]:
if torch.cuda.is_available():
    print('We have a GPU!')
else:
    print('Sorry, CPU only.')

We have a GPU!


Once we know a GPU is available, we can put data on it. Whenever we want to perform a computation on a device, we must move all the data needed for that computation to memory accessible by that device.

We can do this at creation time:

In [None]:
if torch.cuda.is_available():
    gpu_rand = torch.rand(2,2, device='cuda')
    print(gpu_rand)
else:
    print('Sorry, CPU only')

tensor([[0.2457, 0.5791],
        [0.0895, 0.9328]], device='cuda:0')


By default, new tensors are created on the CPU, so we have to specify when we want to create our tensor on the GPU with the optional `device` argument.

We can query the number of GPUs with `torch.cuda.device_count()`.

In [None]:
if torch.cuda.is_available():
    print(torch.cuda.device_count())
else:
    print('Sorry, CPU only')

1


As a coding practice, specifying our devices everywhere with string constants is pretty fragile. We can create a device handle that can be passed to our tensors instead of a string:

In [None]:
if torch.cuda.is_available():
    my_device = torch.device('cuda')
else:
    my_device = torch.device('cpu')
print('Device: {}'.format(my_device))

x = torch.rand(2,2, device=my_device)
print(x)

Device: cuda
tensor([[0.8620, 0.4889],
        [0.6784, 0.0840]], device='cuda:0')


If we have an existing tensor living on one device, we can move it to another with the `to()` method.

In [None]:
y = torch.rand(2,2)
y = y.to(my_device)
print(y)

tensor([[0.1905, 0.7696],
        [0.3790, 0.8259]], device='cuda:0')


To do computation involving two or more tensors, all of the tensors must be on the same device. Otherwise, a runtime error is thrown:

In [None]:
x = torch.rand(2,2)
y = torch.rand(2,2, device='cuda')
z = x + y # exception will be thrown

RuntimeError: ignored
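
A common way to avoid this error is to move one operand to the other's device before combining them. The sketch below picks the device at run time, so it also works on a CPU-only machine (where both tensors simply stay on the CPU):

```python
import torch

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.rand(2, 2)                 # created on the CPU by default
y = torch.rand(2, 2, device=device)  # created on the chosen device

z = x.to(y.device) + y               # move x next to y, then add
print(z.device == y.device)          # True
```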

#Changing the Number of Dimensions
PyTorch models generally expect batches of input, so we may need to change the number of dimensions when we pass a single instance of input to the model.

For example, imagine having a model that works on 3x226x226 images. When we load and transform an image, we get a tensor of shape `(3, 226, 226)`. Our model, though, is expecting input of shape `(N, 3, 226, 226)`, where `N` is the number of images in the batch.

In [None]:
a = torch.rand(3, 226, 226)
b = a.unsqueeze(0)

print('Shape of a:')
print(a.shape)
print('\nShape of b:')
print(b.shape)

Shape of a:
torch.Size([3, 226, 226])

Shape of b:
torch.Size([1, 3, 226, 226])


The `unsqueeze()` method adds a dimension of extent 1. `unsqueeze(0)` adds it as a new zeroth dimension.
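
The argument to `unsqueeze()` can be any valid position, including negative indices counted from the end. Indexing with `None` is a commonly used shorthand for adding a leading dimension. A quick sketch:

```python
import torch

a = torch.rand(3, 226, 226)

print(a.unsqueeze(0).shape)   # torch.Size([1, 3, 226, 226])
print(a.unsqueeze(1).shape)   # torch.Size([3, 1, 226, 226])
print(a.unsqueeze(-1).shape)  # torch.Size([3, 226, 226, 1])
print(a[None].shape)          # torch.Size([1, 3, 226, 226])
```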

Any dimension of extent 1 does not change the number of elements in the tensor:

In [None]:
c = torch.rand(1,1,1,1,1)
print(c)
print(c.shape)

tensor([[[[[0.3852]]]]])
torch.Size([1, 1, 1, 1, 1])


If the model output is a 20-element vector for each input, we expect the output to have shape `(N, 20)`, where `N` is the number of instances in the input batch. For a single-input batch, we get an output of shape `(1, 20)`.

In [None]:
a = torch.rand(1, 20)
print(a.shape)
print(a)

torch.Size([1, 20])
tensor([[0.5901, 0.9794, 0.8320, 0.0432, 0.8902, 0.2500, 0.8457, 0.9606, 0.1602,
         0.0629, 0.3874, 0.4036, 0.4176, 0.2175, 0.6039, 0.0071, 0.7547, 0.9400,
         0.0295, 0.6803]])


In [None]:
b = a.squeeze(0)
print(b.shape)
print(b)

torch.Size([20])
tensor([0.5901, 0.9794, 0.8320, 0.0432, 0.8902, 0.2500, 0.8457, 0.9606, 0.1602,
        0.0629, 0.3874, 0.4036, 0.4176, 0.2175, 0.6039, 0.0071, 0.7547, 0.9400,
        0.0295, 0.6803])


In [None]:
c = torch.rand(2, 2)
print(c.shape)
print(c)

torch.Size([2, 2])
tensor([[0.3590, 0.4314],
        [0.9649, 0.9728]])


In [None]:
d = c.squeeze(0)
print(d.shape)
print(d)

torch.Size([2, 2])
tensor([[0.3590, 0.4314],
        [0.9649, 0.9728]])
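
As the last two cells suggest, `squeeze(dim)` only removes a dimension of extent 1 and otherwise returns the shape unchanged. Called with no argument, `squeeze()` removes every dimension of extent 1. A short sketch with a hypothetical `(1, 20, 1)` tensor named `t`:

```python
import torch

t = torch.rand(1, 20, 1)

print(t.squeeze().shape)    # torch.Size([20]): all extent-1 dims removed
print(t.squeeze(0).shape)   # torch.Size([20, 1]): only dim 0 removed
print(t.squeeze(1).shape)   # torch.Size([1, 20, 1]): dim 1 has extent 20, so no change
```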


Another place we may use `unsqueeze()` is to ease broadcasting.

In [None]:
a =     torch.ones(4, 3, 2)
c = a * torch.rand(   3, 1) # 3rd dim = 1, 2nd dim identical to a
print(c)

tensor([[[0.6835, 0.6835],
         [0.1502, 0.1502],
         [0.9035, 0.9035]],

        [[0.6835, 0.6835],
         [0.1502, 0.1502],
         [0.9035, 0.9035]],

        [[0.6835, 0.6835],
         [0.1502, 0.1502],
         [0.9035, 0.9035]],

        [[0.6835, 0.6835],
         [0.1502, 0.1502],
         [0.9035, 0.9035]]])


The overall effect of that was to broadcast the operation over dimensions 0 and 2, causing the random 3x1 tensor to be multiplied element-wise by every 3-element column in `a`.

If the random tensor had been just a 3-element vector, broadcasting would fail, because the final dimensions would not match up according to the broadcasting rules. In that case, `unsqueeze()` comes to the rescue:

In [None]:
a = torch.ones(4, 3, 2)
b = torch.rand(   3)        # trying to multiply a * b would raise a runtime error
c = b.unsqueeze(1)          # change to a 2D tensor, adding new dim at the end
print(c.shape)
print(a * c)                # broadcasting works again

torch.Size([3, 1])
tensor([[[0.6546, 0.6546],
         [0.8733, 0.8733],
         [0.2461, 0.2461]],

        [[0.6546, 0.6546],
         [0.8733, 0.8733],
         [0.2461, 0.2461]],

        [[0.6546, 0.6546],
         [0.8733, 0.8733],
         [0.2461, 0.2461]],

        [[0.6546, 0.6546],
         [0.8733, 0.8733],
         [0.2461, 0.2461]]])


The `squeeze()` and `unsqueeze()` methods also have in-place versions:

In [None]:
batch_me = torch.rand(3, 226, 226)
print(batch_me.shape)

torch.Size([3, 226, 226])


In [None]:
batch_me.unsqueeze_(0)
print(batch_me.shape)

torch.Size([1, 3, 226, 226])


Sometimes we want to change the shape of a tensor more radically, while still preserving the number of elements and their contents. This happens at the interface between a convolutional layer of a model and a linear layer.

A convolution kernel will yield an output tensor of shape *features x width x height*, but the following linear layer expects a 1-dimensional input. `reshape()` will do this for us, provided that the dimensions you request yield the same number of elements as the input tensor has:

In [None]:
output3d = torch.rand(6, 20, 20)
print(output3d.shape)

torch.Size([6, 20, 20])


In [None]:
input1d = output3d.reshape(6*20*20)
print(input1d.shape)

torch.Size([2400])


In [None]:
# can also call it as a function on the torch module:
print(torch.reshape(output3d, (6*20*20,)).shape)

torch.Size([2400])
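
Instead of computing the product by hand, one dimension passed to `reshape()` may be given as `-1`, in which case PyTorch infers it from the number of elements. If the requested shape cannot hold exactly the same number of elements, a `RuntimeError` is raised. A sketch under those assumptions:

```python
import torch

output3d = torch.rand(6, 20, 20)

flat = output3d.reshape(-1)        # infer the single dimension: 2400
batched = output3d.reshape(1, -1)  # keep a batch dimension of 1
print(flat.shape)                  # torch.Size([2400])
print(batched.shape)               # torch.Size([1, 2400])

# 2400 elements cannot be split evenly into 7 rows
try:
    output3d.reshape(7, -1)
    reshape_ok = True
except RuntimeError:
    reshape_ok = False
print(reshape_ok)                  # False
```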


#NumPy Bridge

In [None]:
import numpy as np

In [None]:
numpy_array = np.ones((2,3))
print(numpy_array)

[[1. 1. 1.]
 [1. 1. 1.]]


In [None]:
pytorch_tensor = torch.from_numpy(numpy_array)
print(pytorch_tensor)

tensor([[1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)


In [None]:
pytorch_rand = torch.rand(2,3)
print(pytorch_rand)

numpy_rand = pytorch_rand.numpy()
print(numpy_rand)

tensor([[0.8753, 0.1445, 0.1919],
        [0.8524, 0.6418, 0.7399]])
[[0.8753493  0.14447767 0.19185531]
 [0.85236156 0.6417764  0.7398762 ]]


These converted objects are using the same underlying memory as their source objects, meaning that changes to one are reflected in the other:

In [None]:
numpy_array[1,1] = 23
print(pytorch_tensor)

tensor([[ 1.,  1.,  1.],
        [ 1., 23.,  1.]], dtype=torch.float64)


In [None]:
pytorch_rand[1,1] = 17
print(numpy_rand)

[[ 0.8753493   0.14447767  0.19185531]
 [ 0.85236156 17.          0.7398762 ]]
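
When the shared memory is not wanted, the data can be copied instead. One option, sketched below, is to pass the array to `torch.tensor()`, which copies its input, rather than to `torch.from_numpy()`. The names `shared` and `independent` are illustrative:

```python
import numpy as np
import torch

numpy_array = np.ones((2, 3))

shared = torch.from_numpy(numpy_array)   # shares memory with the array
independent = torch.tensor(numpy_array)  # copies the data

numpy_array[0, 0] = 5                    # modify the source array

print(shared[0, 0].item())               # 5.0, the change is visible
print(independent[0, 0].item())          # 1.0, the copy is unaffected
```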
