# My summary

## Tensors and Variables

PyTorch provides two kinds of data abstractions called tensors and variables (since PyTorch 0.4, variables have been merged into tensors). Tensors are similar to NumPy arrays, but they can also be used on GPUs, which can speed up certain operations considerably, and PyTorch provides easy methods for moving them between GPUs and CPUs. Machine learning algorithms can only work with different forms of data once that data is represented as tensors of numbers.

Tensors are like Python arrays and can change in size. For example, an image can be represented as a three-dimensional array (height, width, channel (RGB)). It is common in deep learning to use tensors with up to five dimensions. Some of the commonly used tensors are as follows:
- Scalar (0-D tensors)
- Vector (1-D tensors)
- Matrix (2-D tensors)
- 3-D tensors
- Slicing tensors
- 4-D tensors
- 5-D tensors
- Tensors on GPU
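
The switching between CPUs and GPUs mentioned above can be sketched as follows. This is a minimal example, not the only way to do it: it assumes a GPU may or may not be present and falls back to the CPU, and the variable names are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3)      # tensors are created on the CPU by default
x_dev = x.to(device)      # move (or keep) the tensor on the chosen device
x_back = x_dev.cpu()      # bring the data back to the CPU

print(x_dev.device)
print(torch.equal(x, x_back))
```

Writing the code against a `device` variable keeps the same script usable on machines with and without a GPU.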

### Scalar (0-D tensors)

A tensor containing only one element is called a scalar. It will generally be of type FloatTensor or LongTensor. Older PyTorch releases had no special tensor with zero dimensions, so the example below uses a one-dimensional tensor; since version 0.4, zero-dimensional tensors are supported.

In [28]:
import torch

In [29]:
x = torch.rand(10)
print(x)
print(x.size())

tensor([0.9424, 0.2152, 0.7102, 0.4808, 0.1377, 0.8498, 0.3962, 0.5214, 0.7608,
        0.8140])
torch.Size([10])
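
In recent PyTorch releases (0.4 and later), a true 0-D tensor can be created with `torch.tensor`. The following is a minimal sketch with illustrative variable names; it contrasts a 0-D tensor with a 1-D tensor that happens to hold one element.

```python
import torch

s = torch.tensor(3.14)      # a zero-dimensional (scalar) tensor
print(s.dim())              # number of dimensions
print(s.size())             # an empty torch.Size
print(s.item())             # the value as a plain Python float

one_elem = torch.rand(1)    # a 1-D tensor that holds a single element
print(one_elem.size())
```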


### Vector (1-D tensors)

A vector is simply an array of elements. For example, we can use a vector to store the average temperature for the last week.

In [32]:
temp = torch.FloatTensor([23, 24, 24.5, 26, 27, 27.2, 23.0])
print(temp.size())

torch.Size([7])


In [4]:
# This is a 1D-tensor
a = torch.tensor([2,2,1])
print(a)

tensor([2, 2, 1])
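
Once the temperatures are stored in a vector, simple statistics are one method call away. The sketch below rebuilds the same values with `torch.tensor`; the choice of statistics is only an illustration.

```python
import torch

# The week's temperatures from the example above
temp = torch.tensor([23, 24, 24.5, 26, 27, 27.2, 23.0])

print(temp.mean())   # average over the week
print(temp.max())    # warmest day
print(temp[0])       # first element, returned as a 0-D tensor
```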


### Matrix (2-D tensors)

Most structured data is represented in the form of tables or matrices. PyTorch provides a utility function called from_numpy, which converts a NumPy array into a torch tensor.
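
One detail worth knowing is that `torch.from_numpy` does not copy the data: the tensor and the array share the same memory. The sketch below uses a small hypothetical array to show this.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]])   # float64 NumPy array
t = torch.from_numpy(arr)                  # shares memory with arr

arr[0, 0] = 10.0                           # modify the NumPy array in place
print(t)                                   # the tensor reflects the change
print(t.dtype)                             # dtype follows the array

back = t.numpy()                           # convert back to a NumPy array
print(back)
```

If an independent copy is needed, calling `.clone()` on the tensor is one way to get it.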

In [5]:
# This is a 2D-tensor
b = torch.tensor([[2,1,4], [3,5,4], [1,2,0], [4,3,2]])
print(b)

tensor([[2, 1, 4],
        [3, 5, 4],
        [1, 2, 0],
        [4, 3, 2]])


In [6]:
# The size of the tensors
print(a.shape)
print(b.shape)
print(a.size())
print(b.size())

torch.Size([3])
torch.Size([4, 3])
torch.Size([3])
torch.Size([4, 3])


In [7]:
# Get the height/number of rows of b
print(b.shape[0])

4


In [8]:
# This is a 2D-float-tensor
b = torch.FloatTensor([[2,1,4], [3,5,4], [1,2,0], [4,3,2]])
print(b)

tensor([[2., 1., 4.],
        [3., 5., 4.],
        [1., 2., 0.],
        [4., 3., 2.]])


In [9]:
# This is a 1D-float-tensor
a = torch.tensor([2,2,1], dtype = torch.float)
print(a)

tensor([2., 2., 1.])


In [10]:
# This is a 2D-double-tensor
b = torch.DoubleTensor([[2,1,4], [3,5,4], [1,2,0], [4,3,2]])
print(b)

tensor([[2., 1., 4.],
        [3., 5., 4.],
        [1., 2., 0.],
        [4., 3., 2.]], dtype=torch.float64)


In [11]:
# This is a 1D-double-tensor
a = torch.tensor([2,2,1], dtype = torch.double)
print(a)

tensor([2., 2., 1.], dtype=torch.float64)


In [12]:
# print tensor type
print(a.dtype)

torch.float64


In [13]:
# Reshape b
# view returns a new tensor with the same data but a different shape.
# If one of the dimensions is -1, its size is inferred from the other dimensions.
print(b)
print(b.view(-1, 1))
print(b.view(12))
print(b.view(-1, 4))
print(b.view(3, 4))
b = b.view(-1, 1)

tensor([[2., 1., 4.],
        [3., 5., 4.],
        [1., 2., 0.],
        [4., 3., 2.]], dtype=torch.float64)
tensor([[2.],
        [1.],
        [4.],
        [3.],
        [5.],
        [4.],
        [1.],
        [2.],
        [0.],
        [4.],
        [3.],
        [2.]], dtype=torch.float64)
tensor([2., 1., 4., 3., 5., 4., 1., 2., 0., 4., 3., 2.], dtype=torch.float64)
tensor([[2., 1., 4., 3.],
        [5., 4., 1., 2.],
        [0., 4., 3., 2.]], dtype=torch.float64)
tensor([[2., 1., 4., 3.],
        [5., 4., 1., 2.],
        [0., 4., 3., 2.]], dtype=torch.float64)
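
Because `view` returns a tensor with the same data, changes made through the view are visible in the original. The sketch below illustrates this on a small hypothetical tensor, and also shows `reshape` as an alternative for tensors whose memory layout does not allow a view.

```python
import torch

m = torch.arange(12).view(4, 3)   # 4x3 matrix holding 0..11
flat = m.view(-1)                 # -1 lets PyTorch infer the size (12)
cols = m.view(-1, 6)              # inferred as 2 rows of 6

flat[0] = 100                     # a view shares storage with the original
print(m[0, 0])                    # so the change is visible through m
print(cols.shape)

# A transposed tensor is not contiguous in memory, so view(-1) would fail;
# reshape works here because it copies the data when it has to.
t = m.t()
print(t.is_contiguous())
print(t.reshape(-1).shape)
```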


#### Example Boston House Prices

We will use a dataset called Boston House Prices, which ships with older versions of the Python scikit-learn machine learning library (load_boston was deprecated in scikit-learn 1.0 and removed in 1.2).

In [14]:
from sklearn import datasets
boston = datasets.load_boston()
boston_tensor = torch.from_numpy(boston.data)
print(boston_tensor)
print(boston_tensor.size())
print(boston_tensor[:2])

tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00,  ..., 1.5300e+01, 3.9690e+02,
         4.9800e+00],
        [2.7310e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9690e+02,
         9.1400e+00],
        [2.7290e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9283e+02,
         4.0300e+00],
        ...,
        [6.0760e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         5.6400e+00],
        [1.0959e-01, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9345e+02,
         6.4800e+00],
        [4.7410e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         7.8800e+00]], dtype=torch.float64)
torch.Size([506, 13])
tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00, 0.0000e+00, 5.3800e-01, 6.5750e+00,
         6.5200e+01, 4.0900e+00, 1.0000e+00, 2.9600e+02, 1.5300e+01, 3.9690e+02,
         4.9800e+00],
        [2.7310e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01, 6.4210e+00,
         7.8900e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02, 1.7800e+01, 3.9690e+02,
         9.1400e+00]], dtype=torch.float64)

##### Other examples

In [15]:
# Create a matrix with random numbers between 0 and 1
r = torch.rand(4, 4)
print(r)
print(r.dtype)

tensor([[0.7552, 0.2980, 0.5711, 0.1019],
        [0.3721, 0.7840, 0.2033, 0.1114],
        [0.3158, 0.3965, 0.4299, 0.0122],
        [0.0571, 0.4342, 0.4824, 0.9562]])
torch.float32


In [16]:
# Create a matrix with random numbers drawn from a normal distribution
# with mean 0 and variance 1
r2 = torch.randn(4, 4)
print(r2)
print(r2.dtype)

tensor([[-0.5987,  2.7831, -1.3662, -1.8046],
        [ 0.8115, -0.7716,  0.0513, -0.2709],
        [-0.8423,  0.8892, -0.3247,  0.2348],
        [-0.0828,  0.4432, -1.6563, -1.6077]])
torch.float32


In [17]:
# Create an array of 5 random integers between 6 and 9
# (the upper bound 10 is exclusive)
in_array = torch.randint(6, 10, (5,))
print(in_array)
print(in_array.dtype)

tensor([8, 9, 8, 8, 9])
torch.int64


In [18]:
# Create a 2D-array (matrix) of size 3x3 filled with random integers
# between 6 and 9 (the upper bound 10 is exclusive)
in_array2 = torch.randint(6, 10, (3,3))
print(in_array2)
print(in_array2.dtype)

tensor([[6, 6, 9],
        [7, 7, 8],
        [9, 6, 6]])
torch.int64


In [19]:
# Get the number of elements in in_array
print(torch.numel(in_array))
# Get the number of elements in in_array2
print(torch.numel(in_array2))

5
9


In [20]:
# Construct a 3x3 matrix of zeros and of dtype long
z = torch.zeros(3, 3, dtype=torch.long)
print(z)
print(z.dtype)
# Construct a 3x3 matrix of ones (no dtype given, so the default float32 is used)
z = torch.ones(3, 3)
print(z)
print(z.dtype)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
torch.int64
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
torch.float32


In [21]:
# Create a new random tensor with the same shape as r2 but of dtype double
# (randn_like draws new values; it does not convert r2)
r2_like = torch.randn_like(r2, dtype=torch.double)
print(r2_like)
print(r2_like.dtype)

tensor([[ 0.2528,  0.1718, -1.2319,  1.0089],
        [-1.1698,  1.1939,  0.7745,  0.2273],
        [ 0.5267,  0.8730,  1.0343,  0.8326],
        [-0.9843,  0.7574,  1.3067,  0.3439]], dtype=torch.float64)
torch.float64
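
Note that `randn_like` draws fresh random values. To change the dtype of an existing tensor while keeping its values, `Tensor.to` or shorthand methods such as `.double()` and `.long()` can be used. A minimal sketch with illustrative names:

```python
import torch

r = torch.rand(2, 2)              # float32 by default
r_double = r.to(torch.double)     # same values, converted to float64
r_long = r.long()                 # shorthand for .to(torch.int64); truncates

print(r.dtype, r_double.dtype, r_long.dtype)
print(torch.allclose(r.double(), r_double))
print(r_long)                     # values in [0, 1) truncate to 0
```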


In [22]:
# Add two tensors
add_result = torch.add(r, r2)
print(add_result)

tensor([[ 0.1565,  3.0811, -0.7951, -1.7028],
        [ 1.1836,  0.0123,  0.2547, -0.1595],
        [-0.5265,  1.2856,  0.1052,  0.2470],
        [-0.0257,  0.8773, -1.1739, -0.6515]])


In [23]:
# In place addition
print(r2)
r2.add_(r)
print(r2)

tensor([[-0.5987,  2.7831, -1.3662, -1.8046],
        [ 0.8115, -0.7716,  0.0513, -0.2709],
        [-0.8423,  0.8892, -0.3247,  0.2348],
        [-0.0828,  0.4432, -1.6563, -1.6077]])
tensor([[ 0.1565,  3.0811, -0.7951, -1.7028],
        [ 1.1836,  0.0123,  0.2547, -0.1595],
        [-0.5265,  1.2856,  0.1052,  0.2470],
        [-0.0257,  0.8773, -1.1739, -0.6515]])
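
In PyTorch, a trailing underscore in a method name marks an in-place operation. The sketch below contrasts `add` and `add_` on two small hypothetical tensors.

```python
import torch

a = torch.ones(2, 2)
b = torch.full((2, 2), 2.0)

c = a.add(b)      # out-of-place: returns a new tensor, a is unchanged
print(a)
a.add_(b)         # in-place: the trailing underscore means a is overwritten
print(a)
print(torch.equal(a, c))
```

In-place operations save memory, but they overwrite the original values, so they need care when those values are still needed elsewhere.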


### 3-D tensors

When we stack multiple matrices together, we get a 3-D tensor. 3-D tensors are used to represent data such as images. An image can be represented as matrices of numbers that are stacked together. An example of an image shape is 224, 224, 3, where the first index represents height, the second represents width, and the third represents a channel (RGB). Let's see how a computer sees an image, using SciPy's sample raccoon face in the next code snippet:

In [24]:
from PIL import Image
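# misc.face() returns SciPy's sample raccoon image
# (newer SciPy releases provide it as scipy.datasets.face() instead)
from scipy import misc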

face_numpy = misc.face()
face_tensor = torch.from_numpy(face_numpy)
print(face_tensor.size())

torch.Size([768, 1024, 3])
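
The idea of stacking matrices into a 3-D tensor can be sketched with `torch.stack`. The three matrices below are hypothetical stand-ins for the red, green, and blue channels of a tiny 4 x 5 image.

```python
import torch

# Three hypothetical 4x5 matrices, one per colour channel
red = torch.zeros(4, 5)
green = torch.ones(4, 5)
blue = torch.full((4, 5), 2.0)

# Stack along a new last dimension to get (height, width, channel)
image = torch.stack([red, green, blue], dim=2)
print(image.size())
print(image[0, 0])     # the three channel values of the top-left pixel
```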


### Slicing tensors

In [25]:
print(boston_tensor)
print(boston_tensor[:,1])
print(boston_tensor[:,:2])
print(boston_tensor[:3,:])
num_ten = boston_tensor[2, 3]
print(num_ten)
print(num_ten.item())
print(boston_tensor[2,:])

tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00,  ..., 1.5300e+01, 3.9690e+02,
         4.9800e+00],
        [2.7310e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9690e+02,
         9.1400e+00],
        [2.7290e-02, 0.0000e+00, 7.0700e+00,  ..., 1.7800e+01, 3.9283e+02,
         4.0300e+00],
        ...,
        [6.0760e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         5.6400e+00],
        [1.0959e-01, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9345e+02,
         6.4800e+00],
        [4.7410e-02, 0.0000e+00, 1.1930e+01,  ..., 2.1000e+01, 3.9690e+02,
         7.8800e+00]], dtype=torch.float64)
tensor([ 18.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,  12.5000,
         12.5000,  12.5000,  12.5000,  12.5000,  12.5000,  12.5000,   0.0000,
          0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
          0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
          0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0
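
The slicing patterns used on the Boston tensor are easier to follow on a small tensor whose values are known. The sketch below applies the same kinds of index expressions to a hypothetical 3 x 4 tensor.

```python
import torch

t = torch.arange(12).view(3, 4)   # rows: [0..3], [4..7], [8..11]

print(t[:, 1])          # second column
print(t[:, :2])         # first two columns
print(t[:2, :])         # first two rows
print(t[2, 3])          # single element, returned as a 0-D tensor
print(t[2, 3].item())   # the same element as a Python number
```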

##### Other examples

In [26]:
# Create a 3D tensor with 2 channels, 3 rows and 4 columns (channels, rows, columns)
three_dim = torch.rand(2, 3, 4)
print(three_dim)
# Reshape to 2 rows and 12 columns
print(three_dim.view(2, 12))
print(three_dim.view(2, -1))

tensor([[[0.9397, 0.4973, 0.6941, 0.8952],
         [0.0063, 0.3394, 0.5473, 0.4390],
         [0.1016, 0.6307, 0.5780, 0.9456]],

        [[0.9246, 0.1236, 0.3158, 0.5507],
         [0.1612, 0.6510, 0.0571, 0.9151],
         [0.2204, 0.2493, 0.5071, 0.7180]]])
tensor([[0.9397, 0.4973, 0.6941, 0.8952, 0.0063, 0.3394, 0.5473, 0.4390, 0.1016,
         0.6307, 0.5780, 0.9456],
        [0.9246, 0.1236, 0.3158, 0.5507, 0.1612, 0.6510, 0.0571, 0.9151, 0.2204,
         0.2493, 0.5071, 0.7180]])
tensor([[0.9397, 0.4973, 0.6941, 0.8952, 0.0063, 0.3394, 0.5473, 0.4390, 0.1016,
         0.6307, 0.5780, 0.9456],
        [0.9246, 0.1236, 0.3158, 0.5507, 0.1612, 0.6510, 0.0571, 0.9151, 0.2204,
         0.2493, 0.5071, 0.7180]])


### 4-D tensors

One common example of a four-dimensional tensor is a batch of images. Modern CPUs and GPUs are optimized to perform the same operation on multiple examples in parallel, so processing a batch of images can take about as long as processing a single image. It is therefore common to use a batch of examples rather than a single image at a time. Choosing the batch size is not straightforward; it depends on several factors. One major restriction on using a bigger batch or the complete dataset is GPU memory; 16, 32, and 64 are commonly used batch sizes.

Let's look at an example where we load a batch of cat images of shape 64 x 224 x 224 x 3, where 64 represents the batch size or the number of images, 224 represents height and width, and 3 represents channels:

In [27]:
import glob
import numpy as np
# Read cat images from disk
cats = glob.glob('data/dogscats/train/cats/'+'*.jpg')
# Convert images into numpy arrays
cat_imgs = np.array([np.array(Image.open(cat).resize((224, 224))) for cat in cats[:64]])
cat_imgs = cat_imgs.reshape(-1, 224, 224, 3)
cat_tensors = torch.from_numpy(cat_imgs)
print(cat_tensors.size())

torch.Size([64, 224, 224, 3])
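
When the image files are not at hand, the same shape can be reproduced with random data. The sketch below uses random bytes as a stand-in for real photos, and also shows how to move the channel dimension to the front, since many PyTorch layers (for example `torch.nn.Conv2d`) expect input in (batch, channel, height, width) order.

```python
import torch

# Stand-in for 64 RGB images of 224x224 pixels (random bytes, not real photos)
batch = torch.randint(0, 256, (64, 224, 224, 3), dtype=torch.uint8)
print(batch.size())

# Reorder (N, H, W, C) to the channels-first layout (N, C, H, W)
batch_nchw = batch.permute(0, 3, 1, 2)
print(batch_nchw.size())

first_image = batch[0]      # indexing the batch dimension gives one image
print(first_image.size())
```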


### 5-D tensors

One common example where you may have to use a five-dimensional tensor is video data. Videos can be split into frames; for example, a 30-second video of a panda playing with a ball may contain 30 frames, which could be represented as a tensor of shape (1 x 30 x 224 x 224 x 3). A batch of such videos can be represented as a tensor of shape (32 x 30 x 224 x 224 x 3), where 30 is the number of frames in each video clip and 32 is the number of such clips.
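
These shapes can be sketched with placeholder data. The example below uses blank frames instead of a real video and a batch of only 2 clips to keep memory use small; the names are illustrative.

```python
import torch

frames, height, width, channels = 30, 224, 224, 3

# One clip: a hypothetical video of 30 blank frames
clip = torch.zeros(frames, height, width, channels, dtype=torch.uint8)

single = clip.unsqueeze(0)          # add a batch dimension of size 1
print(single.size())

batch = torch.stack([clip, clip])   # a small batch of 2 clips
print(batch.size())
print(batch.dim())
```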