**Import Libraries**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
from collections import Counter
from sklearn.datasets import load_diabetes, load_iris, fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.metrics import precision_score, recall_score, f1_score, explained_variance_score
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, roc_curve, auc
import torch

**1. What is Autograd in PyTorch?**

Autograd is a core feature of PyTorch that enables automatic differentiation for all operations on tensors. This is particularly useful for optimizing models in machine learning, especially in the training phase where gradients are needed for backpropagation. Here's an overview of how Autograd works:

**Key Features of Autograd:**
  - **Automatic Gradient Computation:** Autograd automatically calculates gradients of a scalar value (usually a loss) with respect to the tensors that produced it. It does this by recording operations in a directed acyclic graph (DAG) of function objects, in which the leaves are the input tensors and the roots are the output tensors.

  - **Dynamic Computation Graph:** Unlike static computation graphs (as seen in some other frameworks), PyTorch uses a dynamic computation graph. The graph is rebuilt during every forward pass, so ordinary Python control flow (loops, conditionals) can change the network's behavior from one iteration to the next. This makes models easier to debug and experiment with.

  - **Backward Pass:** Once the forward pass is complete and you have computed a loss, you can call the .backward() method on the loss tensor. This computes the gradient of the loss with respect to every leaf tensor that has requires_grad=True and stores it in that tensor's .grad attribute, ready for gradient descent or another optimization technique.

  - **Tracking History:** Autograd tracks all operations performed on tensors that have requires_grad set to True, allowing it to compute gradients using the chain rule during the backward pass.

  - **Custom Gradients:** While Autograd handles most gradient computations automatically, it also allows users to define custom gradients for an operation when needed, by subclassing torch.autograd.Function and writing its forward and backward methods.

In [2]:
# Create a tensor with requires_grad set to True
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x + 1  # Define a function of x

# Perform backpropagation to compute the gradient
y.backward()

# Access the gradient
print(x.grad)  # Outputs: tensor(7.)

tensor(7.)
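
The "Custom Gradients" feature listed above can be sketched with torch.autograd.Function. The class name `Square` is hypothetical and chosen only for illustration; the hand-written backward simply reproduces the derivative Autograd would have computed on its own.

```python
import torch

class Square(torch.autograd.Function):
    """Hypothetical example op: y = x**2 with a hand-written backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep x so backward can use it
        return x ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Chain rule: dL/dx = dL/dy * dy/dx, with dy/dx = 2x
        return grad_output * 2 * x

x = torch.tensor(3.0, requires_grad=True)
y = Square.apply(x)   # custom Functions are invoked through .apply
y.backward()
custom_grad = x.grad.item()
print(custom_grad)
```

In practice a custom Function is mostly useful when the automatic derivative is numerically unstable or when the forward pass uses code Autograd cannot trace.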


**2. How does automatic differentiation work in PyTorch?**

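- Automatic differentiation is how PyTorch computes gradients without hand-derived formulas. During the forward pass, each operation on a tensor with requires_grad=True records a backward function, exposed as the grad_fn attribute of the result. Calling .backward() walks these records from the output back to the inputs and applies the chain rule at each step, which is what makes backpropagation through a neural network possible.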
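
As a rough illustration of that recording step, the sketch below inspects the graph attributes of a small computation and compares Autograd's result with the derivative worked out by hand. The intermediate name `u` is introduced here only for illustration.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
u = x ** 2             # intermediate result, recorded in the graph
y = u + 3 * x + 1      # final result, also recorded

# x was created by the user, so it is a leaf with no grad_fn;
# u and y were produced by operations, so each carries a grad_fn.
print(x.is_leaf, x.grad_fn)
print(u.grad_fn)
print(y.grad_fn)

y.backward()
manual = 2 * x.item() + 3   # dy/dx = 2x + 3, derived by hand
print(x.grad.item(), manual)
```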



In [3]:
# Define a tensor with requires_grad=True
x = torch.tensor(2.0, requires_grad=True)

# Define a function
y = x ** 2 + 3 * x + 1

# Perform backward pass to compute gradients
y.backward()

# Access the gradient
print(f"The gradient of y with respect to x is: {x.grad.item()}")  # Outputs: 7.0

# Zero out the gradients before the next iteration
x.grad.zero_()

The gradient of y with respect to x is: 7.0


tensor(0.)

**3. How do you enable gradient tracking for a tensor?**

In [4]:
# Directly When Creating the Tensor
x = torch.tensor([2.0, 3.0], requires_grad=True)
x

tensor([2., 3.], requires_grad=True)

In [5]:
# Using .requires_grad_() Method on an Existing Tensor
x = torch.tensor([2.0, 3.0])  # Gradient tracking is off by default
x.requires_grad_()  # Enables gradient tracking

tensor([2., 3.], requires_grad=True)

In [6]:
# Using .detach() to cut a tensor off from its graph, then .requires_grad_(True)
# to make the detached copy a new leaf tensor with gradient tracking enabled
y = torch.tensor([1.0, 2.0], requires_grad=True)
z = y.detach().requires_grad_(True)
z

tensor([1., 2.], requires_grad=True)

In [7]:
# torch.enable_grad() turns gradient mode back on (for example inside a
# torch.no_grad() block); it does not change a tensor's requires_grad flag
with torch.enable_grad():
    x = torch.tensor([2.0, 3.0], requires_grad=True)
x

tensor([2., 3.], requires_grad=True)
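
A minimal sketch of where torch.enable_grad() actually makes a difference: nested inside a torch.no_grad() block. The names `a` and `b` are illustrative only.

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)

with torch.no_grad():
    a = x * 2                    # gradient mode is off: not tracked
    with torch.enable_grad():
        b = x * 2                # gradient mode is back on: tracked

print(a.requires_grad, b.requires_grad)
```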

**4. What does .backward() do in PyTorch?**

- The .backward() function in PyTorch is a key part of the backpropagation process. When called on a tensor (typically a scalar loss), it computes the gradient of that tensor with respect to every leaf tensor in its graph that has requires_grad=True, and accumulates the result in each leaf's .grad attribute.

In [8]:
# Define a tensor with requires_grad=True
x = torch.tensor(2.0, requires_grad=True)

# Define a function of x
y = x ** 2 + 3 * x + 1  # y is now a function of x

# Compute gradients
y.backward()  # This computes dy/dx and stores it in x.grad

# Access the gradient
print(x.grad)  # Outputs the gradient of y with respect to x, which is 7

tensor(7.)
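
The example above calls .backward() on a scalar. For a non-scalar output, a sketch of the usual workaround is to pass a `gradient` tensor of the same shape, which supplies the upstream gradient for each element.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # non-scalar output

# y.backward() with no arguments would raise a RuntimeError here, because
# the gradient can be created implicitly only for scalar outputs.
# Passing ones has the same effect as y.sum().backward().
y.backward(gradient=torch.ones_like(y))
vector_grad = x.grad.clone()
print(vector_grad)
```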


**5. How do you access the gradients of a tensor?**

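- .grad will be None if .backward() hasn’t been called yet, if requires_grad=False, or if the tensor is an intermediate (non-leaf) result, since those do not keep their gradient unless .retain_grad() was called on them.
- Gradients accumulate by default: each .backward() call adds to the value already in .grad. It is therefore common to reset them at the beginning of each training step with optimizer.zero_grad() or x.grad.zero_().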

In [9]:
# Define a tensor with requires_grad=True to track gradients
x = torch.tensor([2.0, 3.0], requires_grad=True)

# Define a function of x
y = x ** 2 + 3 * x + 1  # Some computation

# Compute gradients
y.sum().backward()  # Sum to make it scalar, then backpropagate

# Access and print the gradients
print(x.grad)  # Outputs the gradient of y with respect to x

tensor([7., 9.])
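
The accumulation behavior described in the notes above can be seen in a small sketch: two backward passes over the same expression add up in .grad until it is reset.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
before = x.grad              # None: backward has not been called yet

(x ** 2).backward()
first = x.grad.item()        # d(x^2)/dx at x = 2

(x ** 2).backward()          # a second call adds to the stored gradient
second = x.grad.item()

x.grad.zero_()               # reset in place
after_reset = x.grad.item()

print(before, first, second, after_reset)
```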


**6. How can you prevent a tensor from calculating gradients?**

In [10]:
# Using torch.no_grad()
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
with torch.no_grad():
    y = x * 2  # No gradients will be calculated for y
print(y.requires_grad)  # Outputs: False

False


In [11]:
# Using .detach()
z = x.detach()  # Creates a new tensor without gradient tracking
print(z.requires_grad)  # Outputs: False

False


In [12]:
# Setting requires_grad=False Directly
x.requires_grad = False  # Directly disable gradient tracking for x
print(x.requires_grad)  # Outputs: False

False


In [13]:
# Disabling Gradient Tracking Globally (Less Common)
torch.set_grad_enabled(False)
# All subsequent operations will not track gradients until turned back on

# Enable it again
torch.set_grad_enabled(True)

<torch.autograd.grad_mode.set_grad_enabled at 0x7e06703f8790>

**7. What is the purpose of torch.no_grad()?**

- The purpose of torch.no_grad() in PyTorch is to temporarily disable gradient tracking during computations, which can reduce memory usage and improve performance. This is especially useful in scenarios like model inference or evaluation, where gradient calculations are unnecessary since we are not updating the model’s weights.
- Do not wrap the training forward pass in torch.no_grad(): no graph is recorded inside the block, so loss.backward() would have nothing to differentiate. Use it for evaluation and inference, for running a frozen part of a model, or for any other computation whose result should not take part in backpropagation.
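
A minimal inference sketch, assuming a toy nn.Linear model and random inputs that are not part of the original notebook:

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(3, 1)          # toy model, for illustration only
inputs = torch.randn(4, 3)

model.eval()                     # switch layers such as dropout to eval mode
with torch.no_grad():
    preds = model(inputs)        # no graph is recorded for this call

tracked = model(inputs)          # same computation, with graph recording

print(preds.requires_grad, tracked.requires_grad)
```

Both calls produce the same values; only the second one keeps the bookkeeping needed for a backward pass.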

**8. How can you clear gradients in PyTorch?**

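- Using optimizer.zero_grad(), which clears the gradients of every parameter the optimizer manages.
- Using model.zero_grad(), which clears the gradients of every parameter in the module.
- Resetting gradients manually (advanced), for example with param.grad.zero_() or param.grad = None.

- Gradients accumulate by default in PyTorch, meaning they add up after each .backward() call. To ensure that each update is based only on the current batch, clear the gradients at the start of each iteration, before calling .backward().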
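
The sketch below places optimizer.zero_grad() in a short training loop and then shows the manual alternative. The model, data, and learning rate are placeholder assumptions chosen only to make the loop runnable.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(2, 1)                                  # toy model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

inputs = torch.randn(8, 2)                               # placeholder data
targets = torch.randn(8, 1)

losses = []
for step in range(3):
    optimizer.zero_grad()        # clear gradients left from the last step
    loss = loss_fn(model(inputs), targets)
    loss.backward()              # compute fresh gradients
    optimizer.step()             # update the parameters
    losses.append(loss.item())

had_grads = all(p.grad is not None for p in model.parameters())

# Manual alternative: drop the gradient of each parameter directly
for p in model.parameters():
    p.grad = None

print(losses, had_grads)
```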

**9. When should you use .detach()?**

Use .detach() when you want to:
- Prevent gradients from being tracked for intermediate tensors.
- Work with tensor values without influencing the backpropagation.
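- Stop gradients from flowing back into part of a model, for example the output of a pretrained feature extractor during transfer learning (freezing the layers themselves is more commonly done by setting requires_grad=False on their parameters).
- Accumulate metrics or statistics without affecting the training process.
- Convert tensors to numpy arrays: calling .numpy() on a tensor that requires gradients raises an error, so use .detach().numpy().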
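
A short sketch of three of these uses. The variable names are illustrative; the last part shows that a detached factor is treated as a constant during backpropagation.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 3

y_values = y.detach()            # same values, no gradient tracking
as_numpy = y.detach().numpy()    # y.numpy() alone would raise an error

# Treat y as a constant: gradients flow only through the explicit x factor.
z = (x * y.detach()).sum()
z.backward()

print(y_values.requires_grad, as_numpy, x.grad)
```

Without the .detach() call in the last expression, the gradient would be 6x rather than the constant factor 3x evaluated at the current values.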

**10. How do you compute gradients for functions involving multiple variables?**

In [14]:
# Step 1: Define the variables with requires_grad=True
x = torch.tensor(3.0, requires_grad=True)  # Variable x
y = torch.tensor(4.0, requires_grad=True)  # Variable y

# Step 2: Define the function
f = x**2 + y**2  # f(x, y) = x^2 + y^2

# Step 3: Compute the gradients
f.backward()  # Perform backpropagation to compute gradients

# Step 4: Access the gradients
print(f"Gradient with respect to x: {x.grad}")  # ∂f/∂x
print(f"Gradient with respect to y: {y.grad}")  # ∂f/∂y

Gradient with respect to x: 6.0
Gradient with respect to y: 8.0
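
As an alternative sketch, torch.autograd.grad returns the partial derivatives directly instead of storing them in .grad. The function used here, f(x, y) = x^2 * y + y^3, is a different example chosen so that the two partial derivatives interact.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

f = x ** 2 * y + y ** 3

# Returns one gradient per input, in the order given; .grad is left untouched.
df_dx, df_dy = torch.autograd.grad(f, (x, y))

print(df_dx.item())   # 2xy at (3, 4)
print(df_dy.item())   # x^2 + 3y^2 at (3, 4)
```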


**11. What is the difference between requires_grad=True and requires_grad=False in PyTorch tensors?**

**requires_grad=True**
  - **Gradient Tracking:** When a tensor has requires_grad=True, PyTorch will track all operations on that tensor. This means that it will construct a computation graph dynamically, allowing you to compute gradients with respect to that tensor.
  - **Backward Propagation:** You can call .backward() on a scalar (single-valued) tensor that results from operations involving tensors with requires_grad=True. This computes the gradient of that scalar with respect to each leaf tensor with requires_grad=True that contributed to it, and stores the result in the leaf's .grad attribute.
  - **Use Case:** This is typically used for parameters in models (like weights and biases) that need to be updated during training through optimization algorithms like gradient descent.

**requires_grad=False**
  - **No Gradient Tracking:** When a tensor has requires_grad=False, PyTorch does not record operations for it. If every input to an operation has requires_grad=False, no computation graph is built. If another input does require gradients, the graph is still built, but no gradient is computed for this tensor.
  - **No Backward Propagation:** Calling .backward() on a tensor that does not require gradients and has no grad_fn raises a RuntimeError, whether or not the tensor is a scalar, because there is no graph to traverse.
  - **Use Case:** This is often used for input data or any tensor that is not a parameter needing optimization. It helps reduce memory usage and computation time since no gradients will be computed for these tensors.
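
The contrast can be sketched with one tensor standing in for a parameter and one standing in for input data. The names `w` and `data` are illustrative assumptions.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)   # stands in for a model parameter
data = torch.tensor(5.0)                    # stands in for input data

out = w * data
out.backward()
print(w.grad)      # gradient is stored for the parameter
print(data.grad)   # nothing is stored for the untracked input

# A result built only from untracked tensors has no graph to differentiate.
const = data * 2
try:
    const.backward()
    error_raised = False
except RuntimeError:
    error_raised = True
print(error_raised)
```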