
# PyTorch Tutorial

Mingyang Cui

Sep 29, 2020




# Python Objects

List:

In [None]:
[4,5,6]

[4, 5, 6]

Tuple:

In [None]:
(4,5,6)

(4, 5, 6)

Set:

In [None]:
{4,5,6,'a','b','c',2,4,5}

{2, 4, 5, 6, 'a', 'b', 'c'}

Dictionary:

In [None]:
a = {'a':4,'b':5,'c':6}
a['b']

5

Loop & Index:

In [None]:
for i in range(5):
  i = i+1
  print(i)

1
2
3
4
5


# PyTorch

PyTorch is an open source machine learning framework. At its core, PyTorch provides a few key features:


1. A multidimensional Tensor object, similar to a NumPy array but with GPU acceleration.

2. An optimized autograd engine for automatically computing derivatives.

3. A clean, modular API for building and deploying deep learning models.

You can find more information about PyTorch by following one of the official tutorials or by reading the documentation.


To use PyTorch, we first need to import the torch package.

In [None]:
import torch
print(torch.__version__)

1.6.0+cu101


# 1. Tensor Basics

Creating and Accessing tensors

A torch tensor is a multidimensional grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the tensor; the shape of a tensor is a tuple of integers giving the size of the array along each dimension.

We can initialize a torch tensor from nested Python lists, and we can access or mutate its elements using square brackets.

Accessing an element from a PyTorch tensor returns a PyTorch scalar; we can convert this to a Python scalar using the .item() method:

In [None]:
# Create a two-dimensional tensor
b = torch.tensor([[0, 1, 2], [3,4,5]])
print('Here is b:',b)
print('The rank of b is:', b.dim())
print('The b.shape is: ', b.shape)

# Access elements from a multidimensional tensor
print()
print('b[0, 2]:', b[0, 2])
print('b[1, 1]:', b[1, 1])

# Mutate elements of a multidimensional tensor
b[1, 1] = 9
print() 
print('b after mutating:',b)

Here is b: tensor([[0, 1, 2],
        [3, 4, 5]])
The rank of b is: 2
The b.shape is:  torch.Size([2, 3])

b[0, 2]: tensor(2)
b[1, 1]: tensor(4)

b after mutating: tensor([[0, 1, 2],
        [3, 9, 5]])
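
The cell above prints the indexed elements as `tensor(2)` and `tensor(4)`, which are still tensors. As a minimal sketch of the `.item()` conversion mentioned earlier (the variable names `elem` and `value` are illustrative only):

```python
import torch

b = torch.tensor([[0, 1, 2], [3, 4, 5]])

elem = b[0, 2]       # indexing returns a zero-dimensional tensor
value = elem.item()  # .item() unwraps it into a plain Python number

print(type(elem), elem.dim())
print(type(value), value)
```

Note that `.item()` only works on tensors holding exactly one element.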



# Tensor constructors

PyTorch provides many convenience methods for constructing tensors; this avoids the need to use Python lists. 

torch.zeros: Creates a tensor of all zeros

torch.ones: Creates a tensor of all ones

torch.eye: Creates an identity matrix

torch.rand: Creates a tensor of uniform random numbers in [0, 1)

In [None]:
# Create a tensor of all zeros
a = torch.zeros(2, 3)
print('tensor of zeros:\n',a)

# Create a tensor of all ones
b = torch.ones(1, 2)
print('\ntensor of ones:\n',b)

# Create a 4x4 identity matrix
c = torch.eye(4)
print('\nidentity matrix:\n',c)

# Tensor of random values
d = torch.rand(5, 6)
print('\nrandom tensor:\n',d)


tensor of zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

tensor of ones:
 tensor([[1., 1.]])

identity matrix:
 tensor([[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.]])

random tensor:
 tensor([[0.5075, 0.7186, 0.1237, 0.7745, 0.8763, 0.1630],
        [0.6456, 0.5508, 0.2588, 0.2919, 0.8370, 0.3563],
        [0.1753, 0.6445, 0.5912, 0.4226, 0.0853, 0.3655],
        [0.0161, 0.7670, 0.4772, 0.0752, 0.9582, 0.7318],
        [0.3085, 0.5389, 0.8767, 0.4821, 0.6292, 0.0942]])


# Datatypes

In the examples above, you may have noticed that some of our tensors contained floating-point values, while others contained integer values.

PyTorch provides a large set of numeric datatypes that you can use to construct tensors. PyTorch tries to guess a datatype when you create a tensor; functions that construct tensors typically have a dtype argument that you can use to explicitly specify a datatype.

Each tensor has a dtype attribute that you can use to check its data type:

In [None]:
# Let torch choose the datatype
a = torch.tensor([5, 7])   # List of integers
b = torch.tensor([5.24, 7.58]) # List of floats
c = torch.tensor([5.24, 7])  # Mixed list
print('dtype when torch chooses for us:')
print('List of integers:', a.dtype)
print('List of floats:', b.dtype)
print('Mixed list:', c.dtype)

# Force a particular datatype
e = torch.tensor([0, 1], dtype=torch.float32)  # 32-bit float
f = torch.tensor([0, 1], dtype=torch.int32)    # 32-bit (signed) integer
g = torch.tensor([0, 1], dtype=torch.int64)    # 64-bit (signed) integer
print('\ndtype when we force a datatype:')
print('32-bit float: ', e.dtype)
print('32-bit integer: ', f.dtype)
print('64-bit integer: ', g.dtype)

dtype when torch chooses for us:
List of integers: torch.int64
List of floats: torch.float32
Mixed list: torch.float32

dtype when we force a datatype:
32-bit float:  torch.float32
32-bit integer:  torch.int32
64-bit integer:  torch.int64


We can cast a tensor to another datatype using convenience methods such as .char(), .float(), and .double(), each of which casts to a particular datatype:

In [None]:
x0 = torch.eye(8, dtype=torch.int64)
x1 = x0.float()  # Cast to 32-bit float
x2 = x0.double() # Cast to 64-bit float
x3 = x0.char()   # Cast to 8-bit int
print('x0:', x0.dtype)
print('x1:', x1.dtype)
print('x2:', x2.dtype)
print('x3:', x3.dtype)

x0: torch.int64
x1: torch.float32
x2: torch.float64
x3: torch.int8
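
The convenience methods are shorthand for the more general `.to()` method, which accepts a dtype directly. A small sketch, assuming the same kind of integer identity matrix as above (a smaller size is used here just to keep it short):

```python
import torch

x0 = torch.eye(3, dtype=torch.int64)
x1 = x0.to(torch.float32)  # same effect as x0.float()
x2 = x0.to(torch.float64)  # same effect as x0.double()

print(x1.dtype, x2.dtype)
# Casting returns a new tensor; the original keeps its dtype
print(x0.dtype)
```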


PyTorch provides several ways to create a tensor with the same datatype as another tensor:

PyTorch provides tensor constructors such as torch.zeros_like() that create new tensors with the same shape and type as a given tensor

Tensor objects have instance methods such as .new_zeros() that create tensors of the same type but possibly different shapes

The tensor instance method .to() can take a tensor as an argument, in which case it casts to the datatype of the argument.

In [None]:
x0 = torch.eye(5, dtype=torch.float64)  # Shape (5, 5), dtype torch.float64
x1 = torch.zeros_like(x0)               # Shape (5, 5), dtype torch.float64
x2 = x0.new_zeros(3, 4)                 # Shape (3, 4), dtype torch.float64
x3 = torch.ones(5, 6).to(x0)            # Shape (5, 6), dtype torch.float64
print('x0 shape is %r, dtype is %r' % (x0.shape, x0.dtype))
print('x1 shape is %r, dtype is %r' % (x1.shape, x1.dtype))
print('x2 shape is %r, dtype is %r' % (x2.shape, x2.dtype))
print('x3 shape is %r, dtype is %r' % (x3.shape, x3.dtype))

x0 shape is torch.Size([5, 5]), dtype is torch.float64
x1 shape is torch.Size([5, 5]), dtype is torch.float64
x2 shape is torch.Size([3, 4]), dtype is torch.float64
x3 shape is torch.Size([5, 6]), dtype is torch.float64


Even though PyTorch provides a large number of numeric datatypes, the most commonly used datatypes are:

torch.float32: Standard floating-point type; used to store learnable parameters, network activations, etc. Nearly all arithmetic is done using this type.

torch.int64: Typically used to store indices

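torch.bool: Used to store boolean values (True and False), for example the masks produced by comparisons. Older PyTorch code used torch.uint8 for this purpose, with 0 as false and 1 as true.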



# 2. Tensor indexing

We have already seen how to get and set individual elements of PyTorch tensors. PyTorch also provides many other ways of indexing into tensors. Getting comfortable with these options makes it easy to read and modify different parts of a tensor.

Slice indexing
Similar to Python lists and numpy arrays, PyTorch tensors can be sliced using the syntax start:stop or start:stop:step. The stop index is always non-inclusive: it is the first element not to be included in the slice.

Start and stop indices can be negative, in which case they count backward from the end of the tensor.

In [None]:
a = torch.tensor([0, 1, 27, 32.4, 0.24, 3, 12,7.8])
print(0, a)        # (0) Original tensor
print(1, a[2:5])   # (1) Elements from index 2 up to (not including) index 5
print(2, a[3:])    # (2) Elements from index 3 onward
print(3, a[:4])    # (3) Elements before index 4
print(4, a[:])     # (4) All elements
print(5, a[1:6:3]) # (5) Every third element between indices 1 and 6
print(6, a[:-1])   # (6) All but the last element
print(7, a[-4::2]) # (7) Every second element, starting from the fourth-last

0 tensor([ 0.0000,  1.0000, 27.0000, 32.4000,  0.2400,  3.0000, 12.0000,  7.8000])
1 tensor([27.0000, 32.4000,  0.2400])
2 tensor([32.4000,  0.2400,  3.0000, 12.0000,  7.8000])
3 tensor([ 0.0000,  1.0000, 27.0000, 32.4000])
4 tensor([ 0.0000,  1.0000, 27.0000, 32.4000,  0.2400,  3.0000, 12.0000,  7.8000])
5 tensor([1.0000, 0.2400])
6 tensor([ 0.0000,  1.0000, 27.0000, 32.4000,  0.2400,  3.0000, 12.0000])
7 tensor([ 0.2400, 12.0000])


In [None]:
# Create the following rank 2 tensor with shape (3, 4)
a = torch.tensor([[0,1.2,43,3.56], [5,6,7,8], [2.2,0.23,8.8,2.89]])
print('Original tensor')
print(a)

row_r1 = a[1, :]    # Rank 1 view of the second row of a  
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print('\nTwo ways of accessing a single row:')
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)

# We can make the same distinction when accessing columns:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print('\nTwo ways of accessing a single column:')
print(col_r1, col_r1.shape)
print(col_r2, col_r2.shape)

Original tensor
tensor([[ 0.0000,  1.2000, 43.0000,  3.5600],
        [ 5.0000,  6.0000,  7.0000,  8.0000],
        [ 2.2000,  0.2300,  8.8000,  2.8900]])

Two ways of accessing a single row:
tensor([5., 6., 7., 8.]) torch.Size([4])
tensor([[5., 6., 7., 8.]]) torch.Size([1, 4])

Two ways of accessing a single column:
tensor([1.2000, 6.0000, 0.2300]) torch.Size([3])
tensor([[1.2000],
        [6.0000],
        [0.2300]]) torch.Size([3, 1])


In [None]:
# Create a tensor, a slice, and a clone of a slice
a = torch.tensor([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])
b = a[1, 2:]
c = a[1, 2:].clone()
print('Before mutating:')
print(a)
print(b)
print(c)

a[1, 2] = 122  # a[1, 2] and b[0] point to the same element
b[1] = 130     # b[1] and a[1, 3] point to the same element
c[1] = 540     # c is a clone, so it has its own data
print('\nAfter mutating:')
print(a)
print(b)
print(c)

print(a.storage().data_ptr() == c.storage().data_ptr())

Before mutating:
tensor([[ 1,  2,  3,  4,  5,  6],
        [ 7,  8,  9, 10, 11, 12]])
tensor([ 9, 10, 11, 12])
tensor([ 9, 10, 11, 12])

After mutating:
tensor([[  1,   2,   3,   4,   5,   6],
        [  7,   8, 122, 130,  11,  12]])
tensor([122, 130,  11,  12])
tensor([  9, 540,  11,  12])
False


Integer tensor indexing

When you index into a torch tensor using slicing, the resulting tensor view will always be a subarray of the original tensor. This is powerful, but can be restrictive.

We can also use index arrays to index tensors; this lets us construct new tensors with a lot more flexibility than using slices.

As an example, we can use index arrays to reorder the rows or columns of a tensor:

In [None]:
# Create the following rank 2 tensor with shape (3, 5)
# [[ 1  2  3  4  5]
#  [ 6  7  8  9 10]
#  [11 12 13 14 15]]
a = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
print('Original tensor:')
print(a)

# Create a new tensor of shape (5, 5) by reordering rows from a:
# - First and third rows are the same as the first row of a
# - Second row is the same as the second row of a
# - Fourth and fifth rows are the same as the last row of a
idx = [0, 1, 0, 2, 2]  # index arrays can be Python lists of integers
print('\nReordered rows:')
print(a[idx])

# Create a new tensor of shape (3, 5) by reversing the columns from a
idx = torch.tensor([4, 3, 2, 1, 0])  # Index arrays can be int64 torch tensors
print('\nReordered columns:')
print(a[:, idx])

Original tensor:
tensor([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10],
        [11, 12, 13, 14, 15]])

Reordered rows:
tensor([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10],
        [ 1,  2,  3,  4,  5],
        [11, 12, 13, 14, 15],
        [11, 12, 13, 14, 15]])

Reordered columns:
tensor([[ 5,  4,  3,  2,  1],
        [10,  9,  8,  7,  6],
        [15, 14, 13, 12, 11]])


In [None]:
a = torch.tensor([[3, 4, 5], [2, 1, 2], [5, 6, 3]])
print('Original tensor:')
print(a)

idx = [0, 1, 2]
print('\nGet the diagonal:')
print(a[idx, idx])

# Modify the diagonal
a[idx, idx] = torch.tensor([56, 34, 22])
print('\nAfter setting the diagonal:')
print(a)

Original tensor:
tensor([[3, 4, 5],
        [2, 1, 2],
        [5, 6, 3]])

Get the diagonal:
tensor([3, 1, 3])

After setting the diagonal:
tensor([[56,  4,  5],
        [ 2, 34,  2],
        [ 5,  6, 22]])


In [None]:
# Create a new tensor from which we will select elements
a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print('Original tensor:')
print(a)

# Take one element from each row of a:
# from row 0, take element 1;
# from row 1, take element 0;
# from row 2, take element 2;
# from row 3, take element 2;
idx0 = torch.arange(a.shape[0])  # Quick way to build [0, 1, 2, 3]
idx1 = torch.tensor([1, 0, 2, 2])
print('\nSelect one element from each row:')
print(a[idx0, idx1])

# Now set each of those elements to zero
a[idx0, idx1] = 0
print('\nAfter modifying one element from each row:')
print(a)

Original tensor:
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])

Select one element from each row:
tensor([ 2,  4,  9, 12])

After modifying one element from each row:
tensor([[ 1,  0,  3],
        [ 0,  5,  6],
        [ 7,  8,  0],
        [10, 11,  0]])


# Boolean tensor indexing

Boolean tensor indexing lets you pick out arbitrary elements of a tensor according to a boolean mask. Frequently this type of indexing is used to select or modify the elements of a tensor that satisfy some condition.

In PyTorch, boolean masks are tensors of dtype torch.bool, holding True and False values. Comparison operators such as > and <= return masks of this type.

In [None]:
a = torch.tensor([[0, 1, 2], [3, 3, 3], [2, 4, 6]])
print('Original tensor:')
print(a)

# Find the elements of a that are bigger than 2. The mask has the same shape as
# a, where each element of mask tells whether the corresponding element of a
# is greater than 2.
mask = (a > 2)
print('\nMask tensor:')
print(mask)

# We can use the mask to construct a rank-1 tensor containing the elements of a
# that are selected by the mask
print('\nSelecting elements with the mask:')
print(a[mask])

# We can also use boolean masks to modify tensors; for example this sets all
# elements <= 2 to zero:
a[a <= 2] = 0
print('\nAfter modifying with a mask:')
print(a)

Original tensor:
tensor([[0, 1, 2],
        [3, 3, 3],
        [2, 4, 6]])

Mask tensor:
tensor([[False, False, False],
        [ True,  True,  True],
        [False,  True,  True]])

Selecting elements with the mask:
tensor([3, 3, 3, 4, 6])

After modifying with a mask:
tensor([[0, 0, 0],
        [3, 3, 3],
        [0, 4, 6]])
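
Masks can also be combined and counted. The following is a small sketch rather than part of the original walkthrough; the name `between` is illustrative only:

```python
import torch

a = torch.tensor([[0, 1, 2], [3, 3, 3], [2, 4, 6]])

mask = (a > 2)
print(mask.dtype)  # comparison results are boolean tensors

# Combine conditions elementwise with & (and), | (or), ~ (not)
between = (a > 1) & (a < 4)
print(a[between])

# Summing a boolean mask counts the True entries
print(mask.sum().item())
```

The parentheses around each comparison matter, because `&` binds more tightly than `>` and `<` in Python.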


# 3. Reshaping operations

View

PyTorch provides many ways to manipulate the shapes of tensors. The simplest example is .view(): This returns a new tensor with the same number of elements as its input, but with a different shape.

We can use .view() to flatten matrices into vectors, and to convert rank-1 vectors into rank-2 row or column matrices:

In [None]:
x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('Original tensor:')
print(x)
print('shape:', x.shape)

# Flatten x into a rank 1 vector of shape (9,)
y = x.view(9)
print('\nFlattened tensor:')
print(y)
print('shape:', y.shape)

# Convert y to a rank 2 "row vector" of shape (1, 9)
z = y.view(1, 9)
print('\nRow vector:')
print(z)
print('shape:', z.shape)

# Convert y to a rank 2 "column vector" of shape (9, 1)
k = y.view(9, 1)
print('\nColumn vector:')
print(k)
print('shape:', k.shape)

# Convert y to a rank 3 tensor of shape (1, 3, 3):
m = y.view(1, 3, 3)
print('\nRank 3 tensor:')
print(m)
print('shape:', m.shape)

Original tensor:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
shape: torch.Size([3, 3])

Flattened tensor:
tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
shape: torch.Size([9])

Row vector:
tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
shape: torch.Size([1, 9])

Column vector:
tensor([[1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7],
        [8],
        [9]])
shape: torch.Size([9, 1])

Rank 3 tensor:
tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])
shape: torch.Size([1, 3, 3])


In [None]:
# Passing -1 for one dimension lets .view() infer its size from the number
# of elements, so the same call works for tensors of different shapes
x = torch.tensor([[1, 2, 3], [4, 5, 6],[7, 8, 9],[10, 11, 12]])
y = x.view(-1, )
z = x.view(-1, 2)

print('x:')
print(x)

print('y:')
print(y)

print('z:')
print(z)

x:
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
y:
tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
z:
tensor([[ 1,  2],
        [ 3,  4],
        [ 5,  6],
        [ 7,  8],
        [ 9, 10],
        [11, 12]])



As its name implies, a tensor returned by .view() shares the same data as the input, so changes to one will affect the other and vice-versa:

In [None]:
x = torch.tensor([[1, 2, 3], [4, 5, 6],[7, 8, 9],[10, 11, 12]])
x_flat = x.view(-1)
print('x before modifying:')
print(x)
print('x_flat before modifying:')
print(x_flat)

x[0, 0] = 10   # x[0, 0] and x_flat[0] point to the same data
x_flat[1] = 20 # x_flat[1] and x[0, 1] point to the same data

print('\nx after modifying:')
print(x)
print('x_flat after modifying:')
print(x_flat)

x before modifying:
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
x_flat before modifying:
tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

x after modifying:
tensor([[10, 20,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
x_flat after modifying:
tensor([10, 20,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])


# Swapping axes

Another common reshape operation you might want to perform is transposing a matrix. You might be surprised if you try to transpose a matrix with .view(): The view() function takes elements in row-major order, so you cannot transpose matrices with .view().

In general, you should only use .view() to add new dimensions to a tensor, or to collapse adjacent dimensions of a tensor.
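
As a rough illustration of those two safe uses of `.view()`, here is a sketch on a hypothetical tensor of shape (2, 3, 4); the names `collapsed` and `expanded` are not from the original notebook:

```python
import torch

x = torch.arange(24).view(2, 3, 4)  # shape (2, 3, 4)

# Collapse the last two adjacent dimensions into one: (2, 3, 4) -> (2, 12)
collapsed = x.view(2, 12)

# Add a new leading dimension of size 1: (2, 3, 4) -> (1, 2, 3, 4)
expanded = x.view(1, 2, 3, 4)

print(collapsed.shape)
print(expanded.shape)
```

In both cases the elements keep their row-major order; only the way they are grouped changes.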

For other types of reshape operations, you usually need to use a function that can swap axes of a tensor. The simplest such function is .t(), specifically for transposing matrices. It is available both as a function in the torch module and as a tensor instance method:

In [None]:
x = torch.tensor([[1, 2, 3], [4, 5, 6],[7, 8, 9],[10, 11, 12]])
print('Original matrix:')
print(x)
print('\nTransposing with view DOES NOT WORK!')
print(x.view(3, 4))
print('\nTransposed matrix:')
print(torch.t(x))
print(x.t())

Original matrix:
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])

Transposing with view DOES NOT WORK!
tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

Transposed matrix:
tensor([[ 1,  4,  7, 10],
        [ 2,  5,  8, 11],
        [ 3,  6,  9, 12]])
tensor([[ 1,  4,  7, 10],
        [ 2,  5,  8, 11],
        [ 3,  6,  9, 12]])



For tensors with more than two dimensions, we can use the function torch.transpose to swap arbitrary dimensions, or the .permute method to arbitrarily permute dimensions:

In [None]:
# Create a tensor of shape (2, 3, 3)
x0 = torch.tensor([
     [[1,  2,  3],
      [4, 5,  6 ],
      [7, 8, 9 ]],
     [[10, 11, 12 ],
      [13, 14, 15 ],
      [16, 17, 18]]])
print('Original tensor:')
print(x0)
print('shape:', x0.shape)

# Swap axes 0 and 1; shape is (3, 2, 3)
x1 = x0.transpose(0, 1)
print('\nSwap axes 0 and 1:')
print(x1)
print(x1.shape)

# Permute axes; the argument (1, 2, 0) means:
# - Make the old dimension 1 appear at dimension 0;
# - Make the old dimension 2 appear at dimension 1;
# - Make the old dimension 0 appear at dimension 2
# This results in a tensor of shape (3, 3, 2)
x2 = x0.permute(1, 2, 0)
print('\nPermute axes')
print(x2)
print('shape:', x2.shape)

Original tensor:
tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[10, 11, 12],
         [13, 14, 15],
         [16, 17, 18]]])
shape: torch.Size([2, 3, 3])

Swap axes 0 and 1:
tensor([[[ 1,  2,  3],
         [10, 11, 12]],

        [[ 4,  5,  6],
         [13, 14, 15]],

        [[ 7,  8,  9],
         [16, 17, 18]]])
torch.Size([3, 2, 3])

Permute axes
tensor([[[ 1, 10],
         [ 2, 11],
         [ 3, 12]],

        [[ 4, 13],
         [ 5, 14],
         [ 6, 15]],

        [[ 7, 16],
         [ 8, 17],
         [ 9, 18]]])
shape: torch.Size([3, 3, 2])


# Contiguous tensors

Some combinations of reshaping operations will fail with cryptic errors. The exact reasons for this have to do with the way that tensors and views of tensors are implemented, and are beyond the scope of this tutorial. However, if you're curious, this blog post by Edward Yang gives a clear explanation of the problem.

We can typically overcome these sorts of errors either by calling .contiguous() before .view(), or by using .reshape() instead of .view().

In [None]:
x0 = torch.randn(2, 3, 4)

try:
  # This sequence of reshape operations will crash
  x1 = x0.transpose(1, 2).view(8, 3)
except RuntimeError as e:
  print(type(e), e)
  
# We can solve the problem using either .contiguous() or .reshape()
x1 = x0.transpose(1, 2).contiguous().view(8, 3)
x2 = x0.transpose(1, 2).reshape(8, 3)
print('x1 shape: ', x1.shape)
print('x2 shape: ', x2.shape)

<class 'RuntimeError'> view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
x1 shape:  torch.Size([8, 3])
x2 shape:  torch.Size([8, 3])
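
One way to see what is going on is to inspect a tensor's strides, that is, how many elements to skip in storage to advance one step along each dimension. This is only a sketch to build intuition, not a full account of the implementation:

```python
import torch

x0 = torch.randn(2, 3, 4)
xt = x0.transpose(1, 2)  # shape (2, 4, 3), shares storage with x0

print(x0.is_contiguous(), x0.stride())  # contiguous; strides (12, 4, 1)
print(xt.is_contiguous(), xt.stride())  # not contiguous; strides (12, 1, 4)

# .contiguous() copies the data into row-major order for the new shape
xc = xt.contiguous()
print(xc.is_contiguous(), xc.stride())  # contiguous again; strides (12, 3, 1)
```

The transposed tensor only swaps strides without moving data, so its elements are no longer laid out in row-major order, which is what `.view()` requires here.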


# 4. Tensor operations

Elementwise operations


In [None]:
x = torch.tensor([[5, 15, 13, 26]], dtype=torch.float32)
y = torch.tensor([[7 , 9, 64, 27]], dtype=torch.float32)

# Elementwise sum; these all give the same result
print('Elementwise sum:', x+y)
print(torch.add(x, y))
print(x.add(y))

# Elementwise difference
print('\nElementwise difference:')
print(x - y)
print(torch.sub(x, y))
print(x.sub(y))

# Elementwise product
print('\nElementwise product:')
print(x * y)
print(torch.mul(x, y))
print(x.mul(y))

# Elementwise division
print('\nElementwise division')
print(x / y)
print(torch.div(x, y))
print(x.div(y))

# Elementwise power
print('\nElementwise power')
print(x ** y)
print(torch.pow(x, y))
print(x.pow(y))

Elementwise sum: tensor([[12., 24., 77., 53.]])
tensor([[12., 24., 77., 53.]])
tensor([[12., 24., 77., 53.]])

Elementwise difference:
tensor([[ -2.,   6., -51.,  -1.]])
tensor([[ -2.,   6., -51.,  -1.]])
tensor([[ -2.,   6., -51.,  -1.]])

Elementwise product:
tensor([[ 35., 135., 832., 702.]])
tensor([[ 35., 135., 832., 702.]])
tensor([[ 35., 135., 832., 702.]])

Elementwise division
tensor([[0.7143, 1.6667, 0.2031, 0.9630]])
tensor([[0.7143, 1.6667, 0.2031, 0.9630]])
tensor([[0.7143, 1.6667, 0.2031, 0.9630]])

Elementwise power
tensor([[7.8125e+04, 3.8443e+10,        inf, 1.6006e+38]])
tensor([[7.8125e+04, 3.8443e+10,        inf, 1.6006e+38]])
tensor([[7.8125e+04, 3.8443e+10,        inf, 1.6006e+38]])


In [None]:
x = torch.tensor([[81, 24.5, 0.996, 4]], dtype=torch.float32)

print('Square root:')
print(torch.sqrt(x))
print(x.sqrt())

print('\nTrig functions:')
print(torch.sin(x))
print(x.sin())
print(torch.cos(x))
print(x.cos())

Square root:
tensor([[9.0000, 4.9497, 0.9980, 2.0000]])
tensor([[9.0000, 4.9497, 0.9980, 2.0000]])

Trig functions:
tensor([[-0.6299, -0.5914,  0.8393, -0.7568]])
tensor([[-0.6299, -0.5914,  0.8393, -0.7568]])
tensor([[ 0.7767,  0.8064,  0.5437, -0.6536]])
tensor([[ 0.7767,  0.8064,  0.5437, -0.6536]])



Reduction operations

The simplest reduction operation is summation. We can use the .sum() function to reduce either an entire tensor, or to reduce along only one dimension of the tensor using the dim argument:

In [None]:
a = torch.tensor([[10, 13.22, 82.57], 
                  [31, 17.5, 0.87]], dtype=torch.float32)
print('Original tensor:')
print(a)

print('\nSum over entire tensor:')
print(torch.sum(a))
print(a.sum())

# Summing along dim=0 collapses the rows, giving the sum of each column:
print('\nSum of each column:')
print(torch.sum(a, dim=0))
print(a.sum(dim=0))

# Summing along dim=1 collapses the columns, giving the sum of each row:
print('\nSum of each row:')
print(torch.sum(a, dim=1))
print(a.sum(dim=1))

Original tensor:
tensor([[10.0000, 13.2200, 82.5700],
        [31.0000, 17.5000,  0.8700]])

Sum over entire tensor:
tensor(155.1600)
tensor(155.1600)

Sum of each column:
tensor([41.0000, 30.7200, 83.4400])
tensor([41.0000, 30.7200, 83.4400])

Sum of each row:
tensor([105.7900,  49.3700])
tensor([105.7900,  49.3700])


In [None]:

# Finding the overall minimum only returns a single value
print('\nOverall minimum: ', a.min())

# Compute the minimum along each column; we get both the value and location:
# The minimum of the first column is 10, and it appears at index 0;
# the minimum of the second column is 13.22 and it appears at index 0; etc
col_min_vals, col_min_idxs = a.min(dim=0)
print('\nMinimum along each column:')
print('values:', col_min_vals)
print('idxs:', col_min_idxs)
 
# Compute the minimum along each row; we get both the value and location 
row_min_vals, row_min_idxs = a.min(dim=1)
print('\nMinimum along each row:')
print('values:', row_min_vals)
print('idxs:', row_min_idxs)


Overall minimum:  tensor(0.8700)

Minimum along each column:
values: tensor([10.0000, 13.2200,  0.8700])
idxs: tensor([0, 0, 1])

Minimum along each row:
values: tensor([10.0000,  0.8700])
idxs: tensor([0, 2])


In [None]:
x = torch.randn(12, 55, 56, 122, 64)
print(x.shape)

# Take the mean over dimension 1; shape is now (12, 56, 122, 64)
x = x.mean(dim=1)
print(x.shape)

# Take the sum over dimension 2; shape is now (12, 56, 64)
x = x.sum(dim=2)
print(x.shape)

# Take the mean over dimension 1, but keep the dimension from being eliminated
# by passing keepdim=True; shape is now (12, 1, 64)
x = x.mean(dim=1, keepdim=True)
print(x.shape)

torch.Size([12, 55, 56, 122, 64])
torch.Size([12, 56, 122, 64])
torch.Size([12, 56, 64])
torch.Size([12, 1, 64])


Matrix operations

PyTorch provides a number of linear algebra functions that compute different types of vector and matrix products. The most commonly used are:

torch.dot: Computes inner product of vectors
torch.mm: Computes matrix-matrix products
torch.mv: Computes matrix-vector products
torch.addmm / torch.addmv: Computes matrix-matrix and matrix-vector multiplications plus a bias
torch.bmm / torch.baddbmm: Batched versions of torch.mm and torch.addmm, respectively
torch.matmul: General matrix product that performs different operations depending on the rank of the inputs; this is similar to np.dot in numpy.

In [None]:
m = torch.tensor([15, 22, 10], dtype=torch.float32)
n = torch.tensor([48, 122, 56], dtype=torch.float32)

# Inner product of vectors
print('Dot products:')
print(torch.dot(m, n))
print(m.dot(n))

# we use mm for matrix-matrix products:
x = torch.tensor([[1,2,3],[4,5,6]], dtype=torch.float32)
y = torch.tensor([[7,8],[9,10],[11,12]], dtype=torch.float32)
print('\nMatrix-matrix product:')
print(torch.mm(x, y))
print(x.mm(y))

# Matrix-vector multiply with torch.mv produces a rank-1 output
print('\nMatrix-vector product with torch.mv (rank 1 output)')
print(torch.mv(x, m))
print(x.mv(m))

Dot products:
tensor(3964.)
tensor(3964.)

Matrix-matrix product:
tensor([[ 58.,  64.],
        [139., 154.]])
tensor([[ 58.,  64.],
        [139., 154.]])

Matrix-vector product with torch.mv (rank 1 output)
tensor([ 89., 230.])
tensor([ 89., 230.])
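
The cell above does not exercise torch.matmul or torch.bmm. As a short sketch reusing the same `x`, `y`, and `m` values (the batched tensors `xb` and `yb` are made up for illustration):

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)       # (2, 3)
y = torch.tensor([[7, 8], [9, 10], [11, 12]], dtype=torch.float32)  # (3, 2)
m = torch.tensor([15, 22, 10], dtype=torch.float32)                 # (3,)

# With two rank-2 inputs, matmul behaves like torch.mm
print(torch.matmul(x, y))
# With a rank-2 and a rank-1 input, it behaves like torch.mv
print(torch.matmul(x, m))
# With two rank-1 inputs, it behaves like torch.dot
print(torch.matmul(m, m))

# torch.bmm multiplies a batch of matrices: (B, N, M) x (B, M, P) -> (B, N, P)
xb = torch.stack([x, 2 * x])  # (2, 2, 3)
yb = torch.stack([y, y])      # (2, 3, 2)
print(torch.bmm(xb, yb).shape)
```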


# 5. Broadcasting

Broadcasting is a powerful mechanism that allows PyTorch to work with tensors of different shapes when performing arithmetic operations. Frequently we have a smaller tensor and a larger tensor, and we want to use the smaller tensor multiple times to perform some operation on the larger tensor.

In [None]:

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = torch.tensor([[15,22,31,45], [200,133,47,65], [8,19,23,48], [18, 252, 546,21]])
v = torch.tensor([21, 16, 57, 25])
y = torch.zeros_like(x)   # Create a matrix of zeros with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v
print(y)

tensor([[ 36,  38,  88,  70],
        [221, 149, 104,  90],
        [ 29,  35,  80,  73],
        [ 39, 268, 603,  46]])


In [None]:
vv = v.repeat((4, 1))  # Stack 4 copies of v on top of each other
print(vv)             

tensor([[21, 16, 57, 25],
        [21, 16, 57, 25],
        [21, 16, 57, 25],
        [21, 16, 57, 25]])


In [None]:

y = x + vv  # Add x and vv elementwise
print(y)

tensor([[ 36,  38,  88,  70],
        [221, 149, 104,  90],
        [ 29,  35,  80,  73],
        [ 39, 268, 603,  46]])


In [None]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = torch.tensor([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = torch.tensor([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)

tensor([[ 2,  2,  4],
        [ 5,  5,  7],
        [ 8,  8, 10],
        [11, 11, 13]])


In [None]:
# Compute outer product of vectors
m = torch.tensor([1, 2, 3, 4])  
n = torch.tensor([5, 6, 7, 8, 9])    
# To compute an outer product, we first reshape m to be a column
# vector of shape (4, 1); we can then broadcast it against n to yield
# an output of shape (4, 5), which is the outer product of m and n:
print(m.view(4, 1) * n)

tensor([[ 5,  6,  7,  8,  9],
        [10, 12, 14, 16, 18],
        [15, 18, 21, 24, 27],
        [20, 24, 28, 32, 36]])
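
To make the shape matching explicit, here is a small sketch with made-up tensors `col` and `row`. It uses torch.broadcast_tensors to show the expanded operands without doing any arithmetic:

```python
import torch

col = torch.tensor([[1], [2], [3], [4]])  # shape (4, 1)
row = torch.tensor([10, 20, 30])          # shape (3,)

# Shapes are aligned from the right: (4, 1) and (3,) -> (4, 3).
# Dimensions of size 1 (or missing) are stretched to match.
out = col + row
print(out.shape)
print(out)

# Inspect the shapes both operands are expanded to
col_b, row_b = torch.broadcast_tensors(col, row)
print(col_b.shape, row_b.shape)
```

If two aligned dimensions differ and neither is 1, the operation raises an error instead of broadcasting.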


# 6. Running on GPU

PyTorch can use graphics processing units (GPUs) to accelerate its tensor operations.


In [None]:
import torch

if torch.cuda.is_available():
  print('PyTorch can use GPUs!')
else:
  print('PyTorch cannot use GPUs.')

PyTorch can use GPUs!


In [None]:
# Construct a tensor on the CPU
x0 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
print('x0 device:', x0.device)

# Move it to the GPU using .to()
x1 = x0.to('cuda')
print('x1 device:', x1.device)

# Move it to the GPU using .cuda()
x2 = x0.cuda()
print('x2 device:', x2.device)

# Move it back to the CPU using .to()
x3 = x1.to('cpu')
print('x3 device:', x3.device)

# Move it back to the CPU using .cpu()
x4 = x2.cpu()
print('x4 device:', x4.device)

# We can construct tensors directly on the GPU as well
y = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float64, device='cuda')
print('y device / dtype:', y.device, y.dtype)

# Calling x.to(y) where y is a tensor will return a copy of x with the same
# device and dtype as y
x5 = x0.to(y)
print('x5 device / dtype:', x5.device, x5.dtype)

x0 device: cpu
x1 device: cuda:0
x2 device: cuda:0
x3 device: cpu
x4 device: cpu
y device / dtype: cuda:0 torch.float64
x5 device / dtype: cuda:0 torch.float64


In [None]:
import time

a_cpu = torch.randn(10000, 10000, dtype=torch.float32)
b_cpu = torch.randn(10000, 10000, dtype=torch.float32)

a_gpu = a_cpu.cuda()
b_gpu = b_cpu.cuda()
torch.cuda.synchronize()

t0 = time.time()
c_cpu = a_cpu + b_cpu
t1 = time.time()
c_gpu = a_gpu + b_gpu
torch.cuda.synchronize()
t2 = time.time()

# Check that they computed the same thing
diff = (c_gpu.cpu() - c_cpu).abs().max().item()
print('Max difference between c_gpu and c_cpu:', diff)

cpu_time = 1000.0 * (t1 - t0)
gpu_time = 1000.0 * (t2 - t1)
print('CPU time: %.2f ms' % cpu_time)
print('GPU time: %.2f ms' % gpu_time)
print('GPU speedup: %.2f x' % (cpu_time / gpu_time))

Max difference between c_gpu and c_cpu: 0.0
CPU time: 244.77 ms
GPU time: 7.23 ms
GPU speedup: 33.85 x
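
The cells above assume a GPU is present. A common pattern, sketched here under that caveat, is to choose the device once and pass it around so the same code also runs on a CPU-only machine (the variable name `device` is just a convention):

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.randn(1000, 1000, device=device)
b = torch.randn(1000, 1000, device=device)
c = a + b  # runs on whichever device holds a and b

print('device:', c.device)
```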
