## **Intro to PyTorch**

PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (FAIR). It is widely used for various machine learning tasks, including deep learning. PyTorch is known for its dynamic computational graph, which allows for more intuitive and flexible model development compared to static computational graphs used in some other frameworks.

Follow the instructions in the official introductory tutorial:
https://pytorch.org/tutorials/beginner/introyt/introyt1_tutorial.html



**Tensors**: Similar to TensorFlow, PyTorch uses tensors as the fundamental building blocks for numerical computations. Tensors are multi-dimensional arrays that can be used for various mathematical operations.

**Autograd**: PyTorch includes an automatic differentiation library called Autograd. This feature automatically computes gradients of tensors with respect to a given objective, which is essential for training machine learning models through techniques like backpropagation.

**Neural Network Module**: PyTorch provides a `torch.nn` module that simplifies the process of building neural networks. It includes pre-defined layers and loss functions, while the optimization algorithms used for training live in the companion `torch.optim` package. Together they make it easier for developers to construct and train models.
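
As a rough illustration of how tensors, autograd, and `torch.nn` fit together, here is a minimal sketch of a single training step. The toy data, layer sizes, and learning rate are arbitrary assumptions made for this example, not values taken from the tutorial.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random initialization repeatable

# Toy data (made up for illustration): 8 samples, 3 features, 1 target each
inputs = torch.randn(8, 3)
targets = torch.randn(8, 1)

model = nn.Linear(3, 1)                                    # layer from torch.nn
loss_fn = nn.MSELoss()                                     # loss from torch.nn
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # optimizer from torch.optim

loss_before = loss_fn(model(inputs), targets)  # forward pass records the graph
optimizer.zero_grad()                          # clear any stale gradients
loss_before.backward()                         # autograd fills .grad on each parameter
optimizer.step()                               # apply one gradient-descent update

loss_after = loss_fn(model(inputs), targets)
print(loss_before.item(), loss_after.item())
```

A real training loop would repeat these steps over many batches; the single step here is only meant to show where each component enters.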

In [None]:
!pip install torch numpy

In [2]:
import torch
import numpy as np


## **Manipulating Tensors in PyTorch**

In [2]:
data = [3, 4]
x_data = torch.tensor(data)
x_data

tensor([3, 4])

Tensors can be created from NumPy arrays and vice versa. This allows for seamless integration with existing NumPy code and libraries.

In [3]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
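
One detail worth knowing about this bridge: `torch.from_numpy()` shares memory with the source array, whereas `torch.tensor()` copies the data. The short sketch below illustrates the difference on a CPU tensor; the variable names are chosen for this example only.

```python
import numpy as np
import torch

np_array = np.array([3, 4])
shared = torch.from_numpy(np_array)   # shares memory with np_array
copied = torch.tensor(np_array)       # makes an independent copy

np_array[0] = 100                     # modify the NumPy array in place
print(shared)   # reflects the change
print(copied)   # still holds the original values

back_to_numpy = shared.numpy()        # tensor to NumPy, again without copying
print(back_to_numpy)
```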

Using built-in functions:


In [4]:
# Tensor with random values between 0 and 1
rand_tensor = torch.rand(2, 3)

# Tensor with random values from a standard normal distribution
randn_tensor = torch.randn(2, 3)

# Tensor with all ones
ones_tensor = torch.ones(2, 3)

# Tensor with all zeros
zeros_tensor = torch.zeros(2, 3)

# Identity matrix
eye_tensor = torch.eye(3)

# Tensor with the integers 0 through 9 (the end value 10 is excluded)
arange_tensor = torch.arange(10)

# Tensor with 10 values evenly spaced between 0 and 1
linspace_tensor = torch.linspace(0, 1, 10)

## **Tensor Attributes**

Understanding tensor attributes is crucial for effective manipulation:



*   Shape: A tuple of tensor dimensions, accessed using `tensor.shape` or `tensor.size()`. This determines the number of elements and their arrangement within the tensor.
*   Data type: The type of data stored in the tensor (e.g., `torch.float32`, `torch.int64`), accessed using `tensor.dtype`. This affects the precision and memory usage of the tensor.
*   Device: The device where the tensor is stored (CPU or GPU), accessed using `tensor.device`. This is important for utilizing GPU acceleration for faster computations.
*   Layout: The way the tensor is stored in memory (e.g., strided, sparse), accessed using `tensor.layout`. This can affect performance and memory usage.
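
The attributes listed above can be inspected directly. The following sketch assumes PyTorch's default settings (default dtype `float32`, default device CPU) and moves the tensor to a GPU only if one is available.

```python
import torch

t = torch.rand(3, 4)

print(t.shape)    # torch.Size([3, 4])
print(t.size())   # same information via a method call
print(t.dtype)    # torch.float32 under the default settings
print(t.device)   # cpu, since no device argument was given
print(t.layout)   # torch.strided, the default dense layout

# Move to a GPU only if one is available; otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
t_on_device = t.to(device)

# Converting the dtype returns a new tensor and leaves t unchanged
t_double = t.to(torch.float64)
print(t_on_device.device, t_double.dtype)
```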

## **Tensor Operations**

PyTorch supports a wide range of tensor operations:

Indexing and slicing: Accessing specific elements or sub-tensors using standard indexing and slicing techniques. This allows for extracting and manipulating parts of the tensor.



In [5]:
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

First row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


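Reshaping: Changing the shape of a tensor without altering its data, using methods like `tensor.view()` and `tensor.reshape()`, or swapping two dimensions with `tensor.transpose()`. `view()` always returns a tensor that shares memory with the original, while `reshape()` returns a view when possible and a copy otherwise. Reshaping is often used to adapt tensors for different operations or layers in a neural network.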

In [6]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
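
The difference between `view()` and `reshape()` shows up with non-contiguous tensors, such as the result of a transpose. The sketch below is an illustrative example with made-up values, not part of the original tutorial.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.transpose(0, 1)           # shape (3, 2); same storage, different strides

print(xt.is_contiguous())        # False

# view() needs compatible strides, so flattening the transposed tensor fails
try:
    xt.view(6)
    view_worked = True
except RuntimeError as err:
    view_worked = False
    print("view failed:", err)

flat = xt.reshape(6)             # reshape() falls back to copying when needed
print(flat)                      # tensor([0, 3, 1, 4, 2, 5])
```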


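Concatenation: Joining tensors with `torch.cat()` or `torch.stack()`. `torch.cat()` joins tensors along an existing dimension, while `torch.stack()` joins them along a new dimension, so the result has one more dimension than the inputs. Both are useful for combining data or features.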

In [12]:
# Create two tensors
tensor1 = torch.tensor([1, 2, 3])
tensor2 = torch.tensor([4, 5, 6])

# Concatenate along the existing axis (dim=0): result has shape (6,)
concatenated_tensor = torch.cat((tensor1, tensor2), dim=0)
# Stack along a new leading axis (dim=0): result has shape (2, 3)
stacked_tensor = torch.stack((tensor1, tensor2), dim=0)
print(concatenated_tensor)
print(stacked_tensor)



tensor([1, 2, 3, 4, 5, 6])
tensor([[1, 2, 3],
        [4, 5, 6]])


Mathematical operations: Performing arithmetic, linear algebra, matrix manipulation, and other mathematical operations on tensors. PyTorch provides a comprehensive set of functions for these operations.

In [14]:
# Matrix multiplication
y1 = stacked_tensor @ stacked_tensor.T
y2 = stacked_tensor.matmul(stacked_tensor.T)

# Element-wise multiplication
z1 = stacked_tensor * stacked_tensor
z2 = stacked_tensor.mul(stacked_tensor)

# In-place element-wise addition (modifies stacked_tensor itself)
stacked_tensor.add_(5)

tensor([[ 6,  7,  8],
        [ 9, 10, 11]])

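In-place operations: Operations that modify the tensor directly, denoted by an underscore suffix (e.g., `tensor.add_()`). These can be more memory-efficient, but they should be used with caution: they overwrite the original values, every other reference to the same tensor sees the change, and they can interfere with gradient computation.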
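
The following sketch illustrates both caveats: an in-place update is visible through every name that refers to the tensor, and autograd rejects in-place changes to a leaf tensor that requires gradients. The variable names are hypothetical.

```python
import torch

a = torch.ones(3)
alias = a            # a second name for the same tensor object

a.add_(5)            # in-place: modifies the existing storage
print(alias)         # tensor([6., 6., 6.]) (the alias sees the change)

a = a + 1            # out-of-place: binds `a` to a brand-new tensor
print(a)             # tensor([7., 7., 7.])
print(alias)         # tensor([6., 6., 6.]) (unchanged this time)

# Autograd refuses in-place changes to a leaf tensor that requires gradients
w = torch.ones(3, requires_grad=True)
try:
    w.add_(1)
    in_place_allowed = True
except RuntimeError as err:
    in_place_allowed = False
    print("in-place update rejected:", err)
```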

## **Using the Autograd Module**

The autograd module is PyTorch's automatic differentiation engine, crucial for training neural networks. This section covers gradients, Jacobians, Hessians, and computational graphs.

## **1. Gradients**

`requires_grad`: To compute gradients with respect to a tensor, create it with `requires_grad=True`. This makes autograd track every operation on that tensor and record it in a computational graph.

In [16]:
x = torch.ones(2, 2, requires_grad=True)

`.backward()`: Computes the gradients of a scalar tensor (such as a loss) with respect to the leaf tensors in its computational graph that have `requires_grad=True`. This is the core of backpropagation.

In [17]:
y = x + 2
z = y * y * 3
out = z.mean()
out.backward()
print(x.grad)

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])


`.grad`: Accesses the accumulated gradients for a tensor. This attribute stores the gradients computed during the `.backward()` call.
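
Because `.grad` accumulates, calling `.backward()` twice without resetting adds the second gradient to the first. A minimal sketch of this behavior, using a made-up function `sum(x**2)` whose gradient is `2 * x`:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

(x ** 2).sum().backward()
first = x.grad.clone()     # gradient 2 * x, i.e. [2., 4.]

(x ** 2).sum().backward()  # second call adds to the stored gradient
second = x.grad.clone()    # now [4., 8.]

x.grad.zero_()             # reset before the next independent computation
print(first, second, x.grad)
```

This is why training loops typically zero the gradients before each backward pass.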

## **2. Jacobian Product**

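For vector-valued functions, autograd does not build the full Jacobian matrix. Instead it computes a vector-Jacobian product: given a vector `v`, calling `y.backward(v)` yields `v^T J`, where `J` is the Jacobian of `y` with respect to `x`. This is more efficient than computing the full Jacobian matrix. When the full matrix is needed, it can be recovered one row at a time by back-propagating from each output component, as the cell below does.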

In [18]:

# Define a function
def func(x):
    return torch.stack([x[0]**2, x[1]**3])

# Create an input tensor with requires_grad=True
x = torch.tensor([1.0, 2.0], requires_grad=True)

# Compute the output of the function
y = func(x)

# Build the full Jacobian one row at a time (row i = gradient of y[i])
jacobian = torch.zeros((2, 2))  # initialize a tensor to hold the Jacobian matrix
for i in range(2):
    y[i].backward(retain_graph=True)
    jacobian[i] = x.grad
    x.grad.zero_()  # zero the gradients for the next computation

print(jacobian)


tensor([[ 2.,  0.],
        [ 0., 12.]])
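
For comparison, here is a sketch of the vector-Jacobian product itself, obtained by passing a vector to `.backward()`. The weighting vector `v` is an arbitrary choice for this example, and `torch.autograd.functional.jacobian` is used only to cross-check the result against the explicit matrix.

```python
import torch
from torch.autograd.functional import jacobian

def func(x):
    return torch.stack([x[0] ** 2, x[1] ** 3])

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = func(x)

v = torch.tensor([1.0, 0.5])    # arbitrary weighting vector chosen for the demo
y.backward(v)                   # accumulates v^T J into x.grad
print(x.grad)                   # tensor([2., 6.])

# Cross-check against the explicit Jacobian [[2, 0], [0, 12]]
full_jacobian = jacobian(func, x.detach())
print(v @ full_jacobian)        # same values as x.grad
```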


## **3. Hessians**

The Hessian matrix contains the second-order partial derivatives of a scalar-valued function. You can use the `torch.autograd.functional.hessian()` function to calculate it.

In [19]:
from torch.autograd.functional import hessian

def f(x):
    return x[0]**2 + x[1]**2

x = torch.tensor([1.0, 2.0], requires_grad=True)
hessian_matrix = hessian(f, x)
print(hessian_matrix)

tensor([[2., 0.],
        [0., 2.]])


## **4. Computational Graph**

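Autograd maintains a directed acyclic graph (DAG) representing the operations performed on tensors. The leaves of the graph are the input tensors and the roots are the outputs. The graph is dynamic: it is built as operations run during the forward pass, and after each `.backward()` call autograd starts populating a new graph from scratch. Understanding this graph is crucial for debugging and optimizing your models.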
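
The graph can be inspected through the `is_leaf` and `grad_fn` attributes of each tensor. The sketch below reuses the small computation from the gradients section and also shows that, by default, a graph cannot be traversed a second time once `.backward()` has released its buffers.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
out = (y * y * 3).mean()

print(x.is_leaf, x.grad_fn)     # leaf created by the user, so grad_fn is None
print(y.is_leaf, y.grad_fn)     # not a leaf; grad_fn is an AddBackward0 node
print(out.grad_fn)              # a MeanBackward0 node at the root of the graph

out.backward()                  # by default the graph's buffers are freed here
try:
    out.backward()              # a second pass over the freed graph fails
    second_pass_ok = True
except RuntimeError as err:
    second_pass_ok = False
    print("second backward failed:", err)
```

Passing `retain_graph=True` to the first `.backward()` call keeps the buffers alive, which is what the Jacobian cell earlier in this notebook relies on.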

## **5. Disabling Gradient Tracking**

Gradient tracking can be disabled using `torch.no_grad()` to speed up computations when gradients are not needed, such as during inference or when evaluating a model.

In [20]:
with torch.no_grad():
    print((x ** 2).requires_grad)

False
