# Chapter 1. Getting started with PyTorch

In [2]:
import torch
print(torch.cuda.is_available())
print(torch.rand(2,2))

True
tensor([[0.8494, 0.7562],
        [0.2044, 0.3554]])


In [3]:
x = torch.tensor([[0,0,1]])
x.shape

torch.Size([1, 3])

In [4]:
x

tensor([[0, 0, 1]])

In [5]:
x[0][0] = 5

In [6]:
x

tensor([[5, 0, 1]])

In [7]:
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [9]:
torch.ones(1,2) + torch.ones(1,2)

tensor([[2., 2.]])

In [12]:
# Only single-element tensors can be converted to Python scalars
torch.rand(1).item()

0.3006139397621155

**Tensors can live on the CPU or on the GPU. The to() function returns a copy of a tensor on the device you ask for:**

In [13]:
cpu_tensor = torch.rand(1)

In [15]:
cpu_tensor.device

device(type='cpu')

In [16]:
gpu_tensor = cpu_tensor.to("cuda")

In [17]:
gpu_tensor.device

device(type='cuda', index=0)
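The call to("cuda") above only works on a machine with a CUDA-capable GPU. A minimal sketch of a device-agnostic variant follows, assuming you want to fall back to the CPU when no GPU is present. The names `device`, `moved_tensor`, and `back_on_cpu` are illustrative and not part of the original notebook.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cpu_tensor = torch.rand(1)
moved_tensor = cpu_tensor.to(device)

# Prints "cuda" or "cpu", depending on the machine.
print(moved_tensor.device.type)

# Copy the data back to the CPU and check that the value survived the round trip.
back_on_cpu = moved_tensor.to("cpu")
print(torch.equal(cpu_tensor, back_on_cpu))
```

Writing code against a `device` variable like this lets the same notebook run unchanged on machines with and without a GPU.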

## Tensor Operations

**We often need to find the maximum item in a tensor as well as the index that contains the maximum value (as this often corresponds to the class that the neural network has decided upon in its final prediction). You can do both with the max() and argmax() functions. Each returns a tensor, so we can also use item() to extract a standard Python value from any tensor that holds a single element.**

In [18]:
torch.rand(2,2).max()

tensor(0.5977)

In [20]:
torch.rand(2,2).max().item()

0.7306172251701355
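The cells above show max() but not argmax(). The following sketch uses a small hand-written tensor of made-up class scores (the values and the names `scores` and `batch` are assumptions for illustration) to show both functions, plus the per-row form you would typically use on a batch of predictions.

```python
import torch

# Hypothetical scores for one input over three classes.
scores = torch.tensor([0.1, 0.7, 0.2])

best_value = scores.max()     # 0-dimensional tensor holding the largest score
best_index = scores.argmax()  # 0-dimensional tensor holding its position

print(best_value.item())      # a Python float
print(best_index.item())      # 1

# For a batch, pass dim to get one result per row.
batch = torch.tensor([[0.1, 0.7, 0.2],
                      [0.9, 0.05, 0.05]])
values, indices = batch.max(dim=1)
print(indices.tolist())       # [1, 0]
```

When called with dim, max() returns both the values and their indices, so a separate argmax() call is not needed in that case.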

In [21]:
long_tensor = torch.tensor([[0,0,1],[1,1,1],[0,0,0]])
long_tensor.type()

'torch.LongTensor'

In [22]:
float_tensor = torch.tensor([[0,0,1],[1,1,1],[0,0,0]]).to(dtype=torch.float32)
float_tensor.type()

'torch.FloatTensor'

**Most functions that operate on a tensor and return a tensor create a new tensor to store the result. However, if you want to save memory, check whether an in-place version is defined. It has the same name as the original function with an underscore (_) appended, and it overwrites the tensor it is called on.**

In [31]:
random_tensor = torch.rand(2,2)
random_tensor.log2()

tensor([[-2.9835, -1.2427],
        [-3.6090, -0.9113]])

In [32]:
random_tensor.log2_()

tensor([[-2.9835, -1.2427],
        [-3.6090, -0.9113]])

In [33]:
random_tensor

tensor([[-2.9835, -1.2427],
        [-3.6090, -0.9113]])
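Because the cells above use random values, it is hard to see at a glance which call changed the tensor. Here is a small sketch with fixed inputs (the tensor `t` and its values are chosen for illustration) that makes the difference between log2() and log2_() visible.

```python
import torch

t = torch.tensor([1.0, 2.0, 4.0])

# Out-of-place: the result goes into a new tensor and t keeps its values.
out = t.log2()
print(t.tolist())    # still the original values 1, 2, 4
print(out.tolist())  # the base-2 logarithms: 0, 1, 2 (up to floating-point rounding)

# In-place: t's own storage is overwritten with the result.
same = t.log2_()
print(t.tolist())    # now holds the logarithms
print(same is t)     # True: the in-place call returns the tensor it modified
```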

Another common operation is **reshaping** a tensor. This often comes up because a neural network layer requires a slightly different input shape than the one you currently have. For example, the Modified National Institute of Standards and Technology (MNIST) dataset of handwritten digits is a collection of 28 × 28 images, but it is packaged as arrays of length 784. To use the networks we are constructing, we need to turn those back into 1 × 28 × 28 tensors. The leading 1 is the number of channels (normally three: red, green, and blue), but as MNIST digits are grayscale, we have only one channel. We can do this with either view() or reshape():

In [34]:
flat_tensor = torch.rand(784)
viewed_tensor = flat_tensor.view(1,28,28)
viewed_tensor.shape

torch.Size([1, 28, 28])

In [35]:
reshaped_tensor = flat_tensor.reshape(1,28,28)
reshaped_tensor.shape

torch.Size([1, 28, 28])

Note that the reshaped tensor’s shape has to have the same number of total elements as the original. If you try flat_tensor.reshape(3,28,28), you’ll see an error like this:

In [36]:
flat_tensor.reshape(3,28,28)

RuntimeError: shape '[3, 28, 28]' is invalid for input of size 784

Now you might wonder what the **difference is between view() and reshape()**. The answer is that view() never copies data: it returns a tensor that shares memory with the original, so if the underlying data is changed, the view will change too (and vice versa). Because of that, view() throws an error when the requested shape cannot be laid over the tensor's existing memory layout, which typically happens when the tensor is not contiguous (for example, after a transpose or a permute). If this happens, you have to call tensor.contiguous() before you can use view(). reshape() does all that behind the scenes: it returns a view when it can and copies the data when it cannot. So in general, I recommend using reshape() rather than view().
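The following sketch illustrates both points on a six-element tensor: that a view shares memory with its source, and that view() fails on a non-contiguous tensor where reshape() succeeds. The tensor names are illustrative and not taken from the notebook.

```python
import torch

base = torch.arange(6)  # tensor([0, 1, 2, 3, 4, 5])

# view() shares memory with the original tensor.
grid = base.view(2, 3)
grid[0, 0] = 100
print(base[0].item())               # 100

# Transposing changes the strides, so the result is no longer contiguous.
transposed = grid.t()               # shape [3, 2]
print(transposed.is_contiguous())   # False

try:
    transposed.view(6)
except RuntimeError as err:
    print("view() failed:", err)

# reshape() falls back to copying when a view is impossible.
flat_copy = transposed.reshape(6)
print(flat_copy.tolist())           # [100, 3, 1, 4, 2, 5]

# The manual route: make a contiguous copy first, then view it.
flat_view = transposed.contiguous().view(6)
print(torch.equal(flat_copy, flat_view))  # True
```

One consequence worth keeping in mind: since reshape() may or may not copy, you should not rely on writes to its result showing up in the original tensor.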

**Finally, you might need to rearrange the dimensions of a tensor. You will likely come across this with images, which are often stored as [height, width, channel] tensors, whereas PyTorch prefers to deal with them in [channel, height, width] order. You can use permute() to handle this in a fairly straightforward manner:**

In [37]:
hwc_tensor = torch.rand(640, 480, 3)
chw_tensor = hwc_tensor.permute(2,0,1)
chw_tensor.shape

torch.Size([3, 640, 480])
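To check that permute() really moves the channel axis rather than scrambling the pixel data, here is a small sketch on a tiny 2 × 4 "image" with three channels. The sizes and names are assumptions chosen to keep the output readable.

```python
import torch

hwc = torch.arange(24).reshape(2, 4, 3)  # [height=2, width=4, channel=3]
chw = hwc.permute(2, 0, 1)               # [channel=3, height=2, width=4]

print(chw.shape)                          # torch.Size([3, 2, 4])

# Channel 0 of the permuted tensor is the same data as channel 0 of the original.
print(torch.equal(chw[0], hwc[:, :, 0]))  # True

# permute() returns a view with reordered strides, so it is not contiguous.
print(chw.is_contiguous())                # False

# Make a contiguous copy if a later operation (such as view()) needs one.
chw_copy = chw.contiguous()
print(chw_copy.is_contiguous())           # True
```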

## Tensor Broadcasting

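Borrowed from NumPy, broadcasting allows you to perform operations between a tensor and a smaller tensor. Two tensors can be broadcast together if, comparing their dimensions pairwise and working backward from the trailing dimension, every pair satisfies one of the following:

- The two dimensions are equal.
- One of the dimensions is 1.
- One of the tensors has no dimension at that position (it is treated as size 1).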

In [42]:
torch.tensor([[1, 2], [2, 3]])

tensor([[1, 2],
        [2, 3]])

In [44]:
torch.tensor([[1, 2], [2, 3]]) + 1

tensor([[2, 3],
        [3, 4]])

In [45]:
torch.tensor([[1, 2], [2, 3]]) + torch.tensor([1, 2])

tensor([[2, 4],
        [3, 5]])
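The examples above only show shapes that happen to work. The sketch below, using tensors of ones with illustrative names, prints the resulting shape for two compatible pairs and shows the error raised for an incompatible pair. A tensor with fewer dimensions is lined up against the trailing dimensions of the larger one, with the missing leading dimensions treated as size 1.

```python
import torch

# [3, 1] with [1, 4]: each size-1 dimension is stretched to match the other tensor.
column = torch.ones(3, 1)
row = torch.ones(1, 4)
print((column + row).shape)      # torch.Size([3, 4])

# [2, 3] with [3]: the trailing dimensions match, so the vector is added to every row.
matrix = torch.ones(2, 3)
vector = torch.ones(3)
print((matrix + vector).shape)   # torch.Size([2, 3])

# [2, 3] with [2]: the trailing dimensions are 3 and 2, which are neither equal nor 1.
bad_vector = torch.ones(2)
try:
    matrix + bad_vector
except RuntimeError as err:
    print("cannot broadcast:", err)
```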