In [1]:
import torch
import numpy as np

In [246]:
def same_storage(a,b): # function to check if two tensors have the same storage
    return (a.storage().data_ptr() == b.storage().data_ptr())

# Chapter 3: Tensors

Deep learning systems derive meaning from data by extracting common attributes between different examples from the same class of data. This is done by first converting the input data into floating point numbers. In PyTorch we deal with floating point numbers using tensors.

Tensors are multi-dimensional arrays and are the fundamental data structure of PyTorch. PyTorch tensors are similar to NumPy arrays, but they have some added benefits:

- We can perform very fast operations on them on GPUs.

- We can distribute operations on multiple devices/machines.

- We can keep track of the computational graph that created them.

## 3.2. Tensors: Multidimensional arrays

Tensors are indexed the same way as Python lists. We start by creating a tensor of a specific size filled with 1s.

In [16]:
a = torch.ones(3) 

The above command creates a one-dimensional tensor of size 3. 

In [17]:
print(a)
print(a.shape)

tensor([1., 1., 1.])
torch.Size([3])


We can access the elements of the tensor using a zero-based index.

In [18]:
for i in range(a.shape[0]):
    print(a[i])

tensor(1.)
tensor(1.)
tensor(1.)


Indexing into our 1-d tensor returns each element as a zero-dimensional tensor. To get the plain Python value at a location we use a[index].item()

In [19]:
for i in range(a.shape[0]):
    print(a[i].item())

1.0
1.0
1.0


### 3.2.3 Why use tensors and not lists

A Python list is an array of pointers that point to different Python objects. These objects are stored in different locations in memory and need not be of the same data type. While this makes Python lists very flexible, it also makes them inefficient. Lists require more memory, because they store the pointers in addition to the objects the pointers refer to, and operations on lists take more time.

Tensors, on the other hand, are contiguous blocks of memory, like NumPy arrays. Tensors require less memory, and numeric operations on them are much faster. In deep learning we deal with huge volumes of data, so being able to store it compactly and operate on it quickly is very important. That is why we use tensors over Python lists.

While PyTorch tensors and NumPy arrays have almost the same benefits, we can store PyTorch tensors in GPUs in order to perform massively parallel, fast computations. We cannot do this with NumPy arrays. That is why we use PyTorch Tensors over NumPy arrays.
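The memory argument can be checked with a rough sketch like the one below. The variable names are illustrative, and the byte counts reported by `sys.getsizeof` depend on the Python build, so the list figure should be read as approximate.

```python
import sys
import torch

n = 1000
py_list = [float(i) for i in range(n)]
tensor = torch.arange(n, dtype=torch.float32)

# A list stores one pointer per element, and each element is a separate float object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A tensor stores the raw 4-byte values back to back in a single buffer.
tensor_bytes = tensor.element_size() * tensor.nelement()

print(list_bytes, tensor_bytes)
```

The tensor needs exactly 4 bytes per element, while the list pays for a pointer plus a full Python object per element.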

## 3.4 Named tensors

The dimensions (or axes) of our tensors usually index something like pixel locations or color channels. This means when we want to index into a tensor, we need to remember the ordering of the dimensions and write our indexing accordingly. As data is transformed through multiple tensors, keeping track of which dimension contains what data can be error-prone. To solve this we can add names to the dimensions. For the tensor operations where we pass dimensions as arguments, we can now pass the names as argument.

*This feature is experimental*

In [74]:
named_a = torch.tensor([[[1,2,3],[4,5,6]], [[7,8,9],[10,11,12]]], names=['a','b','c'])
print(named_a)
print(named_a.shape)

tensor([[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]], names=('a', 'b', 'c'))
torch.Size([2, 2, 3])


In [84]:
flat_named_a = named_a.flatten(['a','b','c'], 'ab')
print(flat_named_a)
print(flat_named_a.shape)

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12], names=('ab',))
torch.Size([12])
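As a further sketch of passing a name where a dimension index is normally expected, the example below sums over a dimension by name. The tensor `img` and its dimension names are made up for illustration, and because named tensors are experimental the behavior may differ between PyTorch versions.

```python
import warnings
import torch

# Named tensors emit a UserWarning about being experimental; silence it here.
warnings.simplefilter("ignore", UserWarning)

img = torch.ones(3, 4, 5, names=('channels', 'rows', 'columns'))

# Reduce over a dimension by name instead of by position.
by_name = img.sum('channels')

# The same reduction by position, on an unnamed copy of the tensor.
by_index = img.rename(None).sum(0)

print(by_name.names)
print(by_name.shape)
```

The reduced dimension disappears from the result, and the remaining dimensions keep their names.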


## Tensor element types

While creating a tensor, we can specify its datatype using the dtype argument. The possible values of the dtype argument are:

- torch.float32 or torch.float : 32-bit floating-point
- torch.float64 or torch.double : 64-bit, double-precision floating-point
- torch.float16 or torch.half : 16-bit, half-precision floating-point
- torch.int8 : signed 8-bit integers
- torch.uint8 : unsigned 8-bit integers
- torch.int16 or torch.short : signed 16-bit integers
- torch.int32 or torch.int : signed 32-bit integers
- torch.int64 or torch.long : signed 64-bit integers
- torch.bool : Boolean

The default datatype for tensors is 32 bit floating-point.

Creating a tensor from integer values will create a 64-bit integer tensor by default.

In [108]:
a_dt = torch.tensor([1,2,3,4]) # by default creates 64 bit integer tensor aka torch.LongTensor
print(a_dt.dtype)
b_dt = torch.tensor([1.,2.,3.,4.]) # by default creates 32 bit floating point
print(b_dt.dtype)
c_dt = torch.tensor([1.,2.,3.,4.],dtype=torch.float16)
print(c_dt.dtype)

torch.int64
torch.float32
torch.float16


We can also change the datatype of a tensor using the to method. The to method returns a copy of the tensor with the dtype passed as the argument (or the tensor itself if it already has that dtype).

In [115]:
d_dt = a_dt.to(dtype=torch.short)
print(d_dt.dtype)

torch.int16
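A short sketch of what "returns a copy" means in practice: changing the converted tensor leaves the original untouched, and when the requested dtype already matches, the to method hands back the same tensor rather than copying. The names `a_int`, `a_short` and `same` are illustrative.

```python
import torch

a_int = torch.tensor([1, 2, 3, 4])        # int64 by default
a_short = a_int.to(dtype=torch.short)     # new tensor with dtype int16

a_short[0] = 100                          # modify the converted copy
print(a_int)                              # the original still starts with 1

# No conversion is needed here, so no copy is made.
same = a_int.to(dtype=torch.int64)
print(same is a_int)
```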


## 3.7 How tensors are stored in memory

Values of tensors are allocated in contiguous chunks of memory managed by torch.Storage instances. A Storage is a one-dimensional array of numerical data that is a contiguous block of memory containing numbers of a given data type.

A PyTorch tensor instance is a view of such a Storage instance that is capable of indexing into that storage using an offset and per-dimension strides.

Multiple tensors can index the same storage even if they index into the data differently.
![](img/storage.png)

*Fig 1: Tensors are views of a storage instance. Source: Deep Learning with PyTorch. Section 3.7*

In [124]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

In [125]:
points_storage = points.storage() # to access the storage of the points tensor
print(points_storage)

 4.0
 1.0
 5.0
 3.0
 2.0
 1.0
[torch.FloatStorage of size 6]


A storage instance is always a 1-d array. If we modify the values of the storage instance, the change is also reflected in the tensors that use that storage instance. In our example the points tensor uses the points_storage instance, so if we modify a value in points_storage, the corresponding value of the points tensor changes as well.

In [126]:
print(points)
points_storage[0] = 100
print(points)

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])
tensor([[100.,   1.],
        [  5.,   3.],
        [  2.,   1.]])


In addition to the operations on tensors introduced in the previous section, a small number of operations exist only as methods of the Tensor object. They are recognizable from a trailing underscore in their name, like zero_, which indicates that the method operates in place by modifying the input instead of creating a new output tensor and returning it. For instance, the zero_ method zeros out all the elements of the input. Any method without the trailing underscore leaves the source tensor unchanged and instead returns a new tensor.

In [127]:
print(points)
points.zero_() 
print(points)

tensor([[100.,   1.],
        [  5.,   3.],
        [  2.,   1.]])
tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])


## 3.8 Tensor metadata: Size, offset, and stride

To index into a storage instance properly, a tensor relies on three pieces of metadata: size, offset, and stride. Together with the storage, they determine which storage element each tensor index refers to.

**Size:** Tuple indicating how many elements across each dimension the tensor represents.

**Storage offset:** The index in the storage corresponding to the first element in the tensor.

**Stride:** A tuple indicating the number of elements in the storage that have to be
skipped when the index is increased by 1 in each dimension.

Accessing an element i, j in a 2D tensor results in accessing the storage_offset +
stride[0] * i + stride[1] * j element in the storage.

![](img/metadata.png)

*Relationship between a tensor’s offset, size, and stride. Here the tensor is a view
of a larger storage, like one that might have been allocated when creating a larger tensor. Source:  Deep Learning with PyTorch. Section: 3.8.* 
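The indexing rule above can be checked directly. The sketch below takes a subtensor of a contiguous 3x4 tensor and confirms that storage_offset + stride[0] * i + stride[1] * j lands on the right element; the helper `storage_index` is a hypothetical name introduced only for this example.

```python
import torch

t = torch.arange(12).reshape(3, 4)   # contiguous, so its storage holds 0..11 in order
sub = t[1:, 1:]                      # 2x3 view into the same storage
flat = t.reshape(-1)                 # the values in storage order

def storage_index(tensor, i, j):
    """Position in the storage of element (i, j) of a 2D tensor."""
    stride_0, stride_1 = tensor.stride()
    return tensor.storage_offset() + stride_0 * i + stride_1 * j

print(sub.storage_offset(), sub.stride())   # 5 (4, 1)

for i in range(sub.shape[0]):
    for j in range(sub.shape[1]):
        # The computed storage position holds the same value as sub[i, j].
        assert flat[storage_index(sub, i, j)] == sub[i, j]
```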



In [136]:
a_temp = torch.tensor([[1,2,3],[4,5,6],[7,8,9]])
print(a_temp)
b_temp = a_temp[1]
print(b_temp)
print(b_temp.storage_offset())
print(b_temp.size())
print(a_temp.stride())

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([4, 5, 6])
3
torch.Size([3])
(3, 1)


The connection between tensors and storage makes some operations, such as transposing a tensor or extracting a subtensor, very inexpensive: they do not reallocate memory but instead create a new tensor with a different size, offset, or stride that accesses the same storage.

If we change a value in a tensor, the value is changed in its storage, and the change is therefore visible in all other tensors accessing the same storage. If we do not want this, we can clone the tensor or subtensor into a new tensor and make our changes there.

In [156]:
a_temp = torch.ones(3,3)
print("a_temp is:\n{}".format(a_temp))
b_temp = a_temp
print("b_temp is:\n{}".format(b_temp))
b_temp[0,0] = 2
print("after changing b_temp, b_temp is: \n{}".format(b_temp))
print("after changing b_temp, a_temp is: \n{}".format(a_temp))
id(b_temp) == id(a_temp) # True: b_temp and a_temp are the same Python object, since plain assignment does not copy the tensor

a_temp is:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
b_temp is:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
after changing b_temp, b_temp is: 
tensor([[2., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
after changing b_temp, a_temp is: 
tensor([[2., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


True

In [157]:
c_temp = a_temp.clone()
print("c_temp is:\n{}".format(c_temp))
c_temp[0,0] = 100
print("after changing c_temp, c_temp is: \n{}".format(c_temp))
print("after changing c_temp, a_temp is: \n{}".format(a_temp))
id(c_temp) == id(a_temp) # False: clone() returns a separate tensor object backed by its own storage

c_temp is:
tensor([[2., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
after changing c_temp, c_temp is: 
tensor([[100.,   1.,   1.],
        [  1.,   1.,   1.],
        [  1.,   1.,   1.]])
after changing c_temp, a_temp is: 
tensor([[2., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


False
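The same behavior can be seen with a genuine subtensor rather than a second name for the same tensor. In this minimal sketch, `row` is a view that shares storage with `a`, while `row_copy` is an independent clone; all names are illustrative.

```python
import torch

a = torch.ones(3, 3)

row = a[1]                 # a view: same storage as a, starting at offset 3
row[0] = 7.0               # writes through to a
print(a[1, 0].item())      # 7.0

row_copy = a[1].clone()    # independent copy with its own storage
row_copy[0] = -1.0         # does not touch a
print(a[1, 0].item())      # still 7.0

# The view starts 3 elements into a's buffer; the clone lives elsewhere.
print(row.data_ptr() == a.data_ptr() + 3 * a.element_size())
print(row_copy.data_ptr() != row.data_ptr())
```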

### 3.8.4 Contiguous tensors

Some tensor operations in PyTorch only work on contiguous tensors. In that case, PyTorch will throw an informative exception and require us to call contiguous explicitly. It’s worth noting that
calling contiguous will do nothing (and will not hurt performance) if the tensor is
already contiguous. In our case, a_temp is contiguous, while if we transpose a_temp, it will not be contiguous as only the stride will change and therefore we will not be accessing the values of the storage contiguously.

In [185]:
a_temp = torch.rand(3,3)
a_transpose = a_temp.transpose(0,1)
print(a_temp)
print(a_transpose)
print(a_temp.stride())
print(a_transpose.stride())

tensor([[0.6225, 0.3725, 0.6758],
        [0.5325, 0.8845, 0.8656],
        [0.6403, 0.3875, 0.5479]])
tensor([[0.6225, 0.5325, 0.6403],
        [0.3725, 0.8845, 0.3875],
        [0.6758, 0.8656, 0.5479]])
True
(3, 1)
(1, 3)


In [193]:
a_transpose[0,0] = 99
print(a_transpose)
print(a_temp)

tensor([[99.0000,  0.5325,  0.6403],
        [ 0.3725,  0.8845,  0.3875],
        [ 0.6758,  0.8656,  0.5479]])
tensor([[99.0000,  0.3725,  0.6758],
        [ 0.5325,  0.8845,  0.8656],
        [ 0.6403,  0.3875,  0.5479]])


In [192]:
b_transpose = a_temp.clone()
b_transpose = b_transpose.t()
b_transpose[0,0]= 100
print(b_transpose)
print(a_temp)

tensor([[100.0000,   0.5325,   0.6403],
        [  0.3725,   0.8845,   0.3875],
        [  0.6758,   0.8656,   0.5479]])
tensor([[0.6225, 0.3725, 0.6758],
        [0.5325, 0.8845, 0.8656],
        [0.6403, 0.3875, 0.5479]])


We can check if a tensor is contiguous or not using the is_contiguous method. If a tensor is not contiguous, the contiguous method returns a new tensor where the storage is contiguous for that tensor.

In [194]:
a_transpose.is_contiguous()

False

In [197]:
b_transpose.is_contiguous()

False

In [198]:
b_transpose.storage()

 100.0
 0.3725470304489136
 0.6758251190185547
 0.5325178503990173
 0.8845298290252686
 0.865615963935852
 0.6403409242630005
 0.38754844665527344
 0.5479137301445007
[torch.FloatStorage of size 9]

In [201]:
b_cont = b_transpose.contiguous()
b_cont.is_contiguous()

True

In [202]:
b_cont.storage()

 100.0
 0.5325178503990173
 0.6403409242630005
 0.3725470304489136
 0.8845298290252686
 0.38754844665527344
 0.6758251190185547
 0.865615963935852
 0.5479137301445007
[torch.FloatStorage of size 9]
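One operation that requires a compatible memory layout is view. The sketch below, using a small made-up tensor, shows view raising an error on a transposed tensor and succeeding once contiguous has been called.

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.t()                          # transposed view, not contiguous

try:
    tt.view(6)                      # view cannot flatten this layout
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

flat = tt.contiguous().view(6)      # copy into a contiguous layout first
print(flat)

# Calling contiguous on an already contiguous tensor does not copy the data.
print(t.contiguous().data_ptr() == t.data_ptr())
```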


## 3.9 Moving tensors to GPU

PyTorch tensors can also be stored on GPUs to perform massively parallel, fast computations. In addition to dtype, a PyTorch Tensor also has the notion of device, which is where on the computer the tensor data is placed.

In [203]:
points_gpu = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]], device='cuda')

Tensors created on the CPU can be copied to a GPU using the device argument of the to method.

In [209]:
a_temp = a_temp.to(device='cuda') # the to method creates a copy of the tensor
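The cells above assume a CUDA-capable GPU is present. A common pattern, sketched here with illustrative variable names, is to pick the device at runtime so the same code also runs on a machine without a GPU.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_dev = points.to(device=device)

doubled = 2 * points_dev            # runs on whichever device holds points_dev
result = doubled.to(device='cpu')   # bring the result back to the CPU

print(points_dev.device, result.device)
```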

## 3.10 Numpy interoperability

PyTorch tensors can be converted to NumPy arrays and vice versa very efficiently. By doing so, we can take advantage of the huge swath of functionality in the wider Python ecosystem that has built up around the NumPy array type.




In [210]:
points = torch.ones(3, 4)
points_np = points.numpy()
points_np

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], dtype=float32)

The returned numpy array shares the same underlying buffer with the tensor storage. This means any changes made to the numpy array will be reflected on the tensor as well.

In [215]:
points_np[0,0]=10
print(points_np)
print(points)

[[10.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
tensor([[10.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])


In [217]:
points = torch.from_numpy(points_np)
print(points)
points[0,0]=20
print(points)
print(points_np)

tensor([[10.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])
tensor([[20.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])
[[20.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
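One detail to watch when converting from NumPy: arrays created by NumPy default to 64-bit floats, and from_numpy keeps that dtype. The sketch below (names are illustrative) shows the dtype carrying over, and that converting to float32 afterwards produces a copy that no longer shares the buffer.

```python
import numpy as np
import torch

arr = np.ones((3, 4))                  # NumPy defaults to float64
t64 = torch.from_numpy(arr)            # shares the buffer, keeps float64
t32 = t64.to(dtype=torch.float32)      # dtype conversion makes a copy

arr[0, 0] = 10                         # visible through t64, not through t32
print(t64.dtype, t32.dtype)
print(t64[0, 0].item(), t32[0, 0].item())
```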


## 3.12 Saving tensors

In [218]:
points

tensor([[20.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])

We can save the points tensor using the following command:

In [219]:
torch.save(points, 'points.t')

We can load the points tensor from the filesystem using the following command:

In [220]:
points = torch.load('points.t')

In [226]:
points

tensor([[20.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])

A limitation of this approach is that the file uses PyTorch's own serialization format, so the saved tensor can only be loaded with PyTorch.
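torch.save and torch.load also accept open file objects instead of a path. The following round trip is a minimal sketch that writes to a temporary directory (an assumption made here so the example cleans up after itself) and checks that the restored tensor matches the original.

```python
import os
import tempfile
import torch

points = torch.ones(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'points.t')

    # Save through a file handle opened in binary write mode.
    with open(path, 'wb') as f:
        torch.save(points, f)

    # Load through a file handle opened in binary read mode.
    with open(path, 'rb') as f:
        restored = torch.load(f)

print(torch.equal(points, restored))
```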

## 3.14 Exercise

In [283]:
a = torch.tensor(list(range(10)))

In [237]:
print(a)
b = a.view(2, 5)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


In [240]:
b[0,0] = 10
print(b)
print(a)

tensor([[10,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]])
tensor([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])


In [250]:
same_storage(a,b)

True

In [251]:
b

tensor([[10,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]])

In [252]:
c = b[1:,1:]

In [256]:
c.storage()

 10
 1
 2
 3
 4
 5
 6
 7
 8
 9
[torch.LongStorage of size 10]

In [257]:
print(c)

tensor([[6, 7, 8, 9]])


In [258]:
c.shape

torch.Size([1, 4])

In [261]:
c.stride()

(5, 1)

In [262]:
c.storage_offset()

6
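As a cross-check of these values, the subtensor can be rebuilt from the flat tensor using only its size, stride, and storage offset. This sketch recreates a, b and c locally and uses torch.as_strided for the reconstruction; the name `rebuilt` is illustrative.

```python
import torch

a = torch.arange(10)
b = a.view(2, 5)
c = b[1:, 1:]

# Rebuild c from the flat tensor using only its metadata.
rebuilt = torch.as_strided(a, size=(1, 4), stride=(5, 1), storage_offset=6)

print(rebuilt)
print(torch.equal(c, rebuilt))
```

Both tensors read the same four storage elements, starting at position 6.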

In [286]:
a_sqrt = torch.sqrt(a.float())

In [287]:
a

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [288]:
a_sqrt

tensor([0.0000, 1.0000, 1.4142, 1.7321, 2.0000, 2.2361, 2.4495, 2.6458, 2.8284,
        3.0000])

In [297]:
a=a.to(dtype=torch.float32)
a.sqrt_()

tensor([0.0000, 1.0000, 1.4142, 1.7321, 2.0000, 2.2361, 2.4495, 2.6458, 2.8284,
        3.0000])

In [298]:
a

tensor([0.0000, 1.0000, 1.4142, 1.7321, 2.0000, 2.2361, 2.4495, 2.6458, 2.8284,
        3.0000])
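The difference between the two square-root calls can be made explicit by checking where the results live. In this sketch (variable names are illustrative), torch.sqrt allocates a new tensor and leaves its input alone, while sqrt_ overwrites the existing buffer.

```python
import torch

a = torch.arange(10, dtype=torch.float32)
ptr_before = a.data_ptr()

out = torch.sqrt(a)                 # out of place: new tensor, a is unchanged
value_after_sqrt = a[4].item()      # still 4.0

a.sqrt_()                           # in place: a itself now holds the roots
value_after_sqrt_ = a[4].item()     # now 2.0

print(out.data_ptr() != ptr_before)   # the result lives in new memory
print(a.data_ptr() == ptr_before)     # a still uses its original memory
```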