# **Introduction to Torch's tensor library**

PyTorch is an open-source machine learning library. It has gained widespread popularity for its dynamic computation graph and ease of use, making it an excellent choice for both research and production. 

It is widely used for applications such as natural language processing and computer vision. 

PyTorch provides a flexible platform for building and training neural networks, with a particular focus on tensor operations and automatic differentiation. At its core, PyTorch offers a robust tensor library that allows for efficient computation and manipulation of high-dimensional data.

PyTorch provides two key features: 

+ a flexible and efficient tensor computation library similar to NumPy, but with GPU acceleration, and 
+ an automatic differentiation library that is essential for training neural networks.
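
As a rough illustration of the first feature, the sketch below moves data from NumPy into a tensor, runs a computation on a GPU when one is available, and converts the result back. The names `device`, `np_array`, `t`, `doubled`, and `back_to_numpy` are illustrative only, and the code assumes NumPy is installed alongside PyTorch.

```python
import numpy as np
import torch

# Use a GPU if PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

np_array = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
t = torch.from_numpy(np_array).to(device)   # NumPy array -> tensor on the chosen device

doubled = t * 2                             # runs on the GPU when device is "cuda"
back_to_numpy = doubled.cpu().numpy()       # move to the CPU before converting back
print(device, back_to_numpy)
```

The same code runs unchanged on machines with or without a GPU, because only the `device` value differs.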

### **Importing Libraries & Setting the Seed for Reproducibility**

In [60]:
import torch        # Core library for tensor operations.
import torch.autograd as autograd        # Automatic differentiation.
import torch.nn as nn        # Contains neural network layers.
import torch.nn.functional as F        # Contains functions for neural network operations.
import torch.optim as optim        # Optimization algorithms.

# This line sets a manual seed to ensure that random numbers generated are reproducible.
torch.manual_seed(42)

<torch._C.Generator at 0x159a6aecdd0>

### **Creating Tensors** 

Tensors can be created from Python lists with the torch.tensor() function.

In [61]:
# Create a 1D Tensor (Vector)
V_data = [4., 5., 6.]  
# torch.tensor(data) creates a torch.Tensor object with the given data.    
V = torch.tensor(V_data)       # converts a list into a PyTorch tensor.
print(V)

tensor([4., 5., 6.])


In [62]:
# Create a 2D Tensor (Matrix)
M_data = [[7., 8., 9.], [10., 11., 12.]]        # a list of lists representing a matrix.
# torch.tensor(data) creates a torch.Tensor object with the given data.    
M = torch.tensor(M_data)       # converts the list of lists into a PyTorch tensor.
print(M)

tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [63]:
# Create a 3D Tensor of size 2x2x2.
T_data = [[[13., 14.], [15., 16.]],
           [[17., 18.], [19., 20.]]]        # a list of lists of lists, representing a 3D tensor.
# torch.tensor(data) creates a torch.Tensor object with the given data.    
T = torch.tensor(T_data)      # converts a list of lists of lists into a PyTorch tensor.
print(T)

tensor([[[13., 14.],
         [15., 16.]],

        [[17., 18.],
         [19., 20.]]])


What is a 3D tensor, anyway? 

+ If you have a vector, indexing into the vector gives you a scalar.
+ If you have a matrix, indexing into the matrix gives you a vector.
+ If you have a 3D tensor, indexing into the tensor gives you a matrix!

In this tutorial, "tensor" refers to any torch.Tensor object. Matrices and vectors are special cases of torch.Tensor, with dimensions 2 and 1, respectively. When discussing 3D tensors, I will explicitly use the term "3D tensor."
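
The relationship between vectors, matrices, and 3D tensors can be checked directly. The following sketch rebuilds the three tensors from the cells above and prints the number of dimensions before and after indexing; it is only a quick check, not part of the original notebook.

```python
import torch

V = torch.tensor([4., 5., 6.])
M = torch.tensor([[7., 8., 9.], [10., 11., 12.]])
T = torch.tensor([[[13., 14.], [15., 16.]],
                  [[17., 18.], [19., 20.]]])

for name, t in [("V", V), ("M", M), ("T", T)]:
    # .dim() is the number of dimensions; .shape is the size along each one.
    print(name, t.dim(), tuple(t.shape), "-> indexing gives", t[0].dim(), "dims")
```

Each index removes one dimension, which is why indexing a 3D tensor yields a matrix and indexing a vector yields a 0-dimensional scalar tensor.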

### **Indexing Tensors** 

In [64]:
# Index into V and get a scalar (0 dimensional tensor)
print(V[0], "\n")
# converts this scalar tensor into a Python number.
print(V[0].item(), "\n")

# Index into M and get a vector
print(M[1], "\n")     # gets the second row of the matrix M.

# Index into T and get a matrix
print(T[1])     # gets the second matrix from the 3D tensor T.

tensor(4.) 

4.0 

tensor([10., 11., 12.]) 

tensor([[17., 18.],
        [19., 20.]])


### **Tensors with Different Data Types**

You can also create tensors of other data types. To create a tensor of integer types, try torch.tensor([[1, 2], [3, 4]]) (where all elements in the list are integers).

You can also specify a data type by passing in ``dtype=torch.data_type``.

In [65]:
int_tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.int)        # creates a tensor with integer data type.
print(int_tensor)

tensor([[1, 2],
        [3, 4]], dtype=torch.int32)
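
When no `dtype` is given, PyTorch infers one from the data. The sketch below shows that inference and a conversion with `.to()`; it assumes the default floating-point type has not been changed with `torch.set_default_dtype`. The variable names are illustrative.

```python
import torch

ints = torch.tensor([[1, 2], [3, 4]])        # all Python ints -> torch.int64
floats = torch.tensor([[1., 2.], [3., 4.]])  # Python floats -> the default float dtype
print(ints.dtype, floats.dtype)

# .to(dtype) returns a converted copy; the original tensor is unchanged.
as_float = ints.to(torch.float64)
print(as_float.dtype, ints.dtype)
```

Note that `torch.int` in the cell above is an alias for `torch.int32`, whereas inference from Python integers gives `torch.int64`.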


### **Tensors with Random Data** 

You can create a tensor of the supplied shape, filled with samples from the standard normal distribution, with torch.randn().

In [66]:
"""Tuple of Dimensions: This tuple specifies the shape or dimensions of the tensor.
2: The number of matrices (2D tensors) in the resulting tensor.
3: The number of rows in each matrix.
4: The number of columns in each matrix."""

random_tensor = torch.randn((2, 3, 4))      # creates a tensor with random values and specified dimensions.
print(random_tensor)

tensor([[[ 1.9269,  1.4873,  0.9007, -2.1055],
         [ 0.6784, -1.2345, -0.0431, -1.6047],
         [ 0.3559, -0.6866, -0.4934,  0.2415]],

        [[-1.1109,  0.0915, -2.3169, -0.2168],
         [-0.3097, -0.3957,  0.8034, -0.6216],
         [-0.5920, -0.0631, -0.8286,  0.3309]]])
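
The seed set at the top of the notebook is what makes these values repeatable. As a small sketch of that behaviour, re-seeding the generator before each call yields identical tensors (the names `first` and `second` are illustrative):

```python
import torch

torch.manual_seed(42)
first = torch.randn(2, 3, 4)

torch.manual_seed(42)               # reset the generator to the same state
second = torch.randn(2, 3, 4)

print(torch.equal(first, second))   # identical draws after re-seeding
print(first.shape)
```

Without the second `torch.manual_seed(42)` call, `second` would continue from the generator's current state and would be expected to differ from `first`.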


### **Tensor Operations**

In [67]:
# a and b are tensors.
a = torch.tensor([1., 2., 3.])
b = torch.tensor([7., 8., 9.])
c = a + b       # performs element-wise addition.
print(c)

d = a * b       # performs element-wise multiplication.
print(d)

tensor([ 8., 10., 12.])
tensor([ 7., 16., 27.])
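
Because `*` is element-wise, it should not be confused with a dot product. The sketch below contrasts the two on the same vectors, using `torch.dot` for the dot product; the variable names are illustrative.

```python
import torch

a = torch.tensor([1., 2., 3.])
b = torch.tensor([7., 8., 9.])

elementwise = a * b        # same shape as the inputs
dot = torch.dot(a, b)      # a single number: the sum of the element-wise products
print(elementwise, dot)
print(torch.isclose(dot, elementwise.sum()))
```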


### **Concatenation of Tensors**

In [68]:
a1 = torch.randn(2, 3)
b1 = torch.randn(2, 3)
# Concatenates a1 and b1 along the first axis (rows).
concat_0 = torch.cat([a1, b1])
print("a1\n", a1)
print("b1\n", b1)
print("concat_0\n", concat_0)

a1
 tensor([[ 1.3525,  0.6863, -0.3278],
        [ 0.7950,  0.2815,  0.0562]])
b1
 tensor([[ 0.5227, -0.2384, -0.0499],
        [ 0.5263, -0.0085,  0.7291]])
concat_0
 tensor([[ 1.3525,  0.6863, -0.3278],
        [ 0.7950,  0.2815,  0.0562],
        [ 0.5227, -0.2384, -0.0499],
        [ 0.5263, -0.0085,  0.7291]])


In [69]:
# Concatenate columns
a2 = torch.randn(2, 2)
b2 = torch.randn(2, 4)
# concatenates a2 and b2 along the second axis (columns).
# second arg specifies which axis to concat along
concat_1 = torch.cat([a2, b2], 1)
print("a2\n", a2)
print("b2\n", b2)
print("concat_1\n", concat_1)

# If the tensors are not compatible, torch will complain. Uncomment to see the error.
# torch.cat([a2, b2])     # default axis is 0, but a2 and b2 differ in size along axis 1
# RuntimeError: Sizes of tensors must match except in dimension 0. 
# Expected size 2 but got size 4 for tensor number 1 in the list.

a2
 tensor([[ 0.1331,  0.8640],
        [-1.0157, -0.8887]])
b2
 tensor([[ 0.1498, -0.2089, -0.3870,  0.9912],
        [ 0.4679, -0.2049, -0.7409,  0.3618]])
concat_1
 tensor([[ 0.1331,  0.8640,  0.1498, -0.2089, -0.3870,  0.9912],
        [-1.0157, -0.8887,  0.4679, -0.2049, -0.7409,  0.3618]])
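
The shape rule behind `torch.cat` is that only the concatenation axis may differ between inputs. A minimal sketch, using the same shapes as the cells above (the names `rows`, `cols`, and `failed` are illustrative):

```python
import torch

a1, b1 = torch.randn(2, 3), torch.randn(2, 3)
a2, b2 = torch.randn(2, 2), torch.randn(2, 4)

rows = torch.cat([a1, b1])           # axis 0 grows: 2 + 2 rows, 3 columns kept
cols = torch.cat([a2, b2], dim=1)    # axis 1 grows: 2 + 4 columns, 2 rows kept
print(rows.shape, cols.shape)

try:
    torch.cat([a2, b2])              # axis 0: column counts 2 and 4 do not match
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)
```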


### **Reshaping Tensors**

Use the .view() method to reshape a tensor. This method receives heavy use, because many neural network components expect their inputs to have a certain shape. Often you will need to reshape before passing your data to the component.

In [70]:
x = torch.randn(2, 3, 4)
print(x, "\n")
print(x.view(2, 12), "\n")  # reshapes the tensor x to 2 rows and 12 columns.
# reshapes the tensor with one dimension inferred from the other.
print(x.view(2, -1))

tensor([[[ 0.7281, -0.7106, -0.6021,  0.9604],
         [ 0.4048, -1.3543, -0.4976,  0.4747],
         [-1.4570, -0.1023, -0.5992,  0.4771]],

        [[ 0.7262,  0.0912, -0.3891,  0.5279],
         [-0.0127,  0.2408,  0.1325,  0.7642],
         [ 1.0950,  0.3399,  0.7200,  0.4114]]]) 

tensor([[ 0.7281, -0.7106, -0.6021,  0.9604,  0.4048, -1.3543, -0.4976,  0.4747,
         -1.4570, -0.1023, -0.5992,  0.4771],
        [ 0.7262,  0.0912, -0.3891,  0.5279, -0.0127,  0.2408,  0.1325,  0.7642,
          1.0950,  0.3399,  0.7200,  0.4114]]) 

tensor([[ 0.7281, -0.7106, -0.6021,  0.9604,  0.4048, -1.3543, -0.4976,  0.4747,
         -1.4570, -0.1023, -0.5992,  0.4771],
        [ 0.7262,  0.0912, -0.3891,  0.5279, -0.0127,  0.2408,  0.1325,  0.7642,
          1.0950,  0.3399,  0.7200,  0.4114]])
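
One detail worth knowing is that `.view()` does not copy data: the result shares memory with the original tensor. The sketch below demonstrates this with easily traceable values, and also shows `.reshape()`, which can be used when the memory layout is not compatible with `.view()`, for example after a transpose. The variable names are illustrative.

```python
import torch

x = torch.arange(24.).view(2, 3, 4)   # values 0..23 so positions are easy to follow
flat = x.view(2, -1)                  # -1 is inferred as 3 * 4 = 12
print(flat.shape)

flat[0, 0] = 100.                     # write through the view
print(x[0, 0, 0])                     # the original tensor sees the change

# .view() needs a compatible memory layout; after a transpose, .reshape()
# still works because it copies the data when it has to.
xt = x.transpose(1, 2)                # shape (2, 4, 3), no longer contiguous
reshaped = xt.reshape(2, -1)
print(xt.is_contiguous(), reshaped.shape)
```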


### **Computation Graphs and Automatic Differentiation**

The concept of a computation graph is crucial for efficient deep learning, as it automates the backpropagation of gradients. A computation graph defines how data is combined to produce an output. It records the parameters and operations involved, providing all necessary information to compute derivatives.

From a programmer's perspective, torch.Tensor objects store data and shape information. By default, when two tensors are added, the resulting tensor only knows its data and shape, not how it was derived. However, if an input was created with ``requires_grad=True``, the result also records the operation that produced it (its ``grad_fn``), which is essential for gradient computation.

<img src="https://miro.medium.com/max/726/1*W6-39saZm_QqL-wQvGESGQ.png"> 

#### **Basics of Computation Graph**

In [71]:
# Tensor factory methods have a ``requires_grad`` flag, which enables automatic differentiation.
# With requires_grad=True, you can still do all the operations you previously could
x = torch.tensor([3., 4., 5.], requires_grad=True)
y = torch.tensor([6., 7., 8.], requires_grad=True)
z = x + y       # computes the element-wise sum.
print(z)
print(z.grad_fn)        # shows the function used to create z.

tensor([ 9., 11., 13.], grad_fn=<AddBackward0>)
<AddBackward0 object at 0x00000159AAC1A490>


#### **Backpropagation** 

In [72]:
s = z.sum()     # computes the sum of all elements in z.
print(s)
print(s.grad_fn)

tensor(33., grad_fn=<SumBackward0>)
<SumBackward0 object at 0x00000159A99F8190>


To find the derivative of the sum with respect to the first component of 𝑥, we need to compute

\begin{align}\frac{\partial s}{\partial x_0}\end{align}

Here, 𝑠 is the sum of the tensor 𝑧, which itself is the sum of 𝑥 and 𝑦. So,

\begin{align}s = \overbrace{x_0 + y_0}^\text{$z_0$} + \overbrace{x_1 + y_1}^\text{$z_1$} + \overbrace{x_2 + y_2}^\text{$z_2$}\end{align}

From this, we can see that the derivative of 𝑠 with respect to $x_0$ is 1, because $x_0$ appears exactly once in the sum, with coefficient 1.

In practice, PyTorch computes this derivative for us. Its implementations of sum() and + each know how to compute their own gradients, and the backpropagation algorithm chains them together. The details of these algorithms are beyond the scope of this tutorial, but we can use PyTorch to compute the gradient. Note that each call to .backward() adds to the values already stored in the .grad property rather than overwriting them, so running the gradient computation multiple times increments the gradient. This accumulation is useful for many models.


In [73]:
# Calling .backward() on any tensor runs backprop, starting from it.
s.backward()        # computes the gradients of s w.r.t. the leaf tensors x and y.
print(x.grad)       # contains the gradient of s w.r.t. x.

tensor([1., 1., 1.])
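
The accumulation behaviour of `.grad` is easy to observe. The following sketch rebuilds the same `x` and `y`, runs two backward passes on freshly built graphs, and then resets the gradient in place; the names `after_first` and `after_second` are illustrative.

```python
import torch

x = torch.tensor([3., 4., 5.], requires_grad=True)
y = torch.tensor([6., 7., 8.], requires_grad=True)

(x + y).sum().backward()
after_first = x.grad.clone()     # tensor([1., 1., 1.])

# A second backward pass adds to the stored gradient instead of replacing it.
(x + y).sum().backward()
after_second = x.grad.clone()    # tensor([2., 2., 2.])

x.grad.zero_()                   # reset in place before the next pass
print(after_first, after_second, x.grad)
```

This is why training loops typically clear gradients between iterations.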


#### **Requires Grad and Detach** 

In [74]:
p = torch.randn(3, 3)
q = torch.randn(3, 3)
# By default, user-created tensors have ``requires_grad=False``, so no history is recorded and there is nothing to backprop through.
print(p.requires_grad, q.requires_grad)

False False


In [75]:
# ``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad`` flag in-place. 
# The input flag defaults to ``True`` if not given.
p = p.requires_grad_()
q = q.requires_grad_()
# r contains enough information to compute gradients, as we saw above
r = p + q
print(r.grad_fn)    # shows the function used to create r.

<AddBackward0 object at 0x00000159A99FCD90>


In [76]:
# If any input to an operation has ``requires_grad=True``, so will the output
print(r.requires_grad)

True


In [77]:
# Now r has computation history that relates itself to p and q
# Can we just take its values, and **detach** it from its history?
detached_r = r.detach()

# ... does detached_r have information to backprop to p and q?
# NO!
print(detached_r.grad_fn)
# And how could it? ``r.detach()`` returns a tensor that shares the same storage as ``r``, 
# but with the computation history forgotten. It doesn't know anything about how it was computed.
# In essence, we have broken the Tensor away from its past history

None


+ requires_grad_() changes the requires_grad flag in-place.

+ detach() creates a tensor that does not track gradients.
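
To make the effect of detach() concrete, the sketch below checks that the detached tensor shares memory with the original but carries no history, so later computations on it cannot reach `p`. The values and the name `loss` are illustrative.

```python
import torch

p = torch.ones(3, requires_grad=True)
r = p * 2
detached_r = r.detach()

print(detached_r.requires_grad, detached_r.grad_fn)   # False None
print(detached_r.data_ptr() == r.data_ptr())          # True: same underlying memory

# Work done on the detached tensor is not recorded, so it cannot reach p.
loss = (detached_r * 3).sum()
print(loss.requires_grad)                             # False
```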

#### **No Grad Context** 

In [78]:
print(p.requires_grad)
print((p ** 2).requires_grad)

with torch.no_grad():
    print((p ** 2).requires_grad)

True
True
False


+ ``with torch.no_grad():`` disables gradient tracking for the operations inside the block.
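
A common use of the no-grad context is updating a parameter by hand without recording the update itself in the graph. The following is a minimal sketch of one gradient-descent step under that assumption; the tensor `w`, the loss, and the learning rate `lr` are hypothetical and chosen only for illustration.

```python
import torch

w = torch.tensor([1., 2.], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()                 # w.grad is 2 * w = [2., 4.]

lr = 0.1                        # hypothetical learning rate
with torch.no_grad():
    w -= lr * w.grad            # in-place update, not recorded in the graph
w.grad.zero_()                  # clear the accumulated gradient

print(w)
print(w.requires_grad)          # still True: w remains a trainable leaf tensor
```

In practice the optimizers in torch.optim perform this kind of update, so the manual version is mainly useful for understanding what happens underneath.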