# 1 Tensor Basics
**Objective:** Learn the basics of PyTorch’s Tensor library.

In [22]:
import torch

---
## 1.1 Understanding Tensors
### Creating Tensors

In [23]:
# Create a tensor from a list
tensor_from_list = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Display the tensor
display(tensor_from_list)
display(tensor_from_list.size())

# Create a tensor of zeros with size 2x3
zero_tensor = torch.zeros(2, 3)

# Create a tensor of ones with size 2x3
ones_tensor = torch.ones(2, 3)

# Create a tensor with random values of size 2x3
rand_tensor = torch.rand(2, 3)

tensor([[1, 2, 3],
        [4, 5, 6]])

torch.Size([2, 3])
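The creation functions above differ in the dtype they produce, which matters once tensors are combined. The following is a minimal sketch, assuming PyTorch's default dtype has not been changed; the variable names are illustrative and not part of the notebook.

```python
import torch

# dtype is inferred from the data: Python ints give int64, floats give float32
int_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])
float_tensor = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(int_tensor.dtype)    # torch.int64
print(float_tensor.dtype)  # torch.float32

# zeros, ones, and rand produce the default floating-point dtype
zero_tensor = torch.zeros(2, 3)
print(zero_tensor.dtype, zero_tensor.shape)

# An explicit dtype overrides the inferred one
explicit = torch.tensor([[1, 2, 3]], dtype=torch.float32)

# The *_like helpers copy shape and dtype from an existing tensor
ones_like_int = torch.ones_like(int_tensor)
print(ones_like_int.dtype)  # torch.int64
```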

### Tensor Operations

In [24]:
# Addition
sum_tensor = tensor_from_list + ones_tensor

# Subtraction
diff_tensor = tensor_from_list - ones_tensor

# Multiplication (element-wise)
product_tensor = tensor_from_list * ones_tensor

# Division (element-wise)
quotient_tensor = tensor_from_list / ones_tensor
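The cell above mixes an integer tensor with a float tensor, so the results are promoted to float. The sketch below makes that visible and contrasts element-wise multiplication with matrix multiplication; the names `a` and `b` are stand-ins for `tensor_from_list` and `ones_tensor`.

```python
import torch

a = torch.tensor([[1, 2, 3], [4, 5, 6]])  # int64
b = torch.ones(2, 3)                       # float32

# Mixing int64 and float32 promotes the result to float32
sum_ab = a + b
print(sum_ab.dtype)  # torch.float32

# `*` multiplies matching elements, so the shape stays (2, 3)
elementwise = a * b

# `@` is matrix multiplication: (2, 3) @ (3, 2) -> (2, 2).
# Both operands are given the same dtype explicitly here.
matrix_product = a.float() @ b.T
print(matrix_product)
```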

---

## 1.2 Basic Operations in PyTorch

### Reshaping Tensors

In [19]:
# Reshape a tensor to 3x2
reshaped_tensor = tensor_from_list.view(3, 2)
display(reshaped_tensor)
display(reshaped_tensor.size())

tensor([[1, 2],
        [3, 4],
        [5, 6]])

torch.Size([3, 2])
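`view` returns a tensor that shares memory with the original and requires a compatible memory layout. The sketch below illustrates both points and shows `reshape` as the fallback for non-contiguous tensors; the tensor `t` is a fresh example, not the notebook's `tensor_from_list`.

```python
import torch

t = torch.tensor([[1, 2, 3], [4, 5, 6]])

# A view shares storage with the original: writing to it changes t too
v = t.view(3, 2)
v[0, 0] = 100
print(t[0, 0])  # tensor(100)

# -1 lets PyTorch infer one dimension from the number of elements
flat = t.view(-1)
print(flat.shape)  # torch.Size([6])

# A transposed tensor is not contiguous, so view(6) would raise a RuntimeError
transposed = t.T
print(transposed.is_contiguous())  # False

# reshape copies the data when a view is not possible
r = transposed.reshape(6)
print(r)
```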

### Concatenating Tensors

In [20]:
# Concatenate along dim=0 (rows); the result is float because ones_tensor is float
concatenated_tensor = torch.cat((tensor_from_list, ones_tensor), dim=0)
display(concatenated_tensor)
display(concatenated_tensor.size())

tensor([[1., 2., 3.],
        [4., 5., 6.],
        [1., 1., 1.],
        [1., 1., 1.]])

torch.Size([4, 3])
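The `dim` argument decides which axis grows. As a small sketch with illustrative tensors `a` and `b`, compare concatenating along rows, concatenating along columns, and `torch.stack`, which adds a new axis instead of extending an existing one.

```python
import torch

a = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = torch.ones(2, 3)

# dim=0 appends rows: (2, 3) + (2, 3) -> (4, 3)
rows = torch.cat((a, b), dim=0)

# dim=1 appends columns: (2, 3) + (2, 3) -> (2, 6)
cols = torch.cat((a, b), dim=1)

# stack creates a new leading dimension: two (2, 3) tensors -> (2, 2, 3)
stacked = torch.stack((a, b), dim=0)

print(rows.shape, cols.shape, stacked.shape)
```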

### Slicing Tensors

In [21]:
# Slice tensor to get the first row
first_row = tensor_from_list[0, :]
display(first_row)
display(first_row.size())

tensor([1, 2, 3])

torch.Size([3])

---
## 1.3 Autograd: Automatic Differentiation
PyTorch's autograd engine records the differentiable operations applied to tensors created with `requires_grad=True` and computes their gradients automatically. This is essential for training neural networks.

### Using Autograd

In [25]:
# Create a tensor and set requires_grad=True to track computation with it
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# Define a simple operation
y = x + 2

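# y is not a scalar, so backward() needs an upstream gradient of the same shape;
# passing ones computes the gradient of y.sum() with respect to x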
y.backward(torch.ones_like(x))

# The gradients w.r.t x are now stored in x.grad
display(x.grad)

tensor([[1., 1.],
        [1., 1.]])
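Because `y = x + 2` has a constant derivative, the result above does not depend on `x`. The following sketch uses a function whose gradient does depend on `x`, and shows that `backward()` adds to `x.grad` rather than overwriting it; the intermediate names are illustrative.

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# Reduce to a scalar so backward() needs no argument: d/dx sum(x^2) = 2x
y = (x ** 2).sum()
y.backward()
grad_first = x.grad.clone()
print(grad_first)  # equals 2 * x

# A second backward pass ADDS to x.grad: 2x + 3
z = (3 * x).sum()
z.backward()
grad_accumulated = x.grad.clone()
print(grad_accumulated)

# Zero the gradient in place to start fresh
x.grad.zero_()
w = (3 * x).sum()
w.backward()
grad_clean = x.grad.clone()
print(grad_clean)  # every entry is 3
```

Training loops rely on this behavior, which is why they reset gradients before each backward pass.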

---
## Differentiation in Autograd

### Gradients from Partial Derivatives
Consider the function $f(x, y) = x^2 + y^2 - xy$. (This is not the product $(x + y) \cdot (x - y)$, which expands to $x^2 - y^2$.) We will calculate the partial derivatives of $f$ with respect to $x$ and $y$ and evaluate them at the point $(x=2, y=3)$.

The function $f(x, y)$ is defined as:

$$
f(x, y) = x^2 + y^2 - xy
$$

1. The partial derivative of $f$ with respect to $y$, denoted as $\frac{\partial f}{\partial y}$, is given by:

$$
\frac{\partial f}{\partial y} = 2y - x
$$

2. The partial derivative of $f$ with respect to $x$, denoted as $\frac{\partial f}{\partial x}$, is given by:

$$
\frac{\partial f}{\partial x} = 2x - y
$$

Now, let's evaluate these partial derivatives at the point $(x=2, y=3)$:

1. The partial derivative $\frac{\partial f}{\partial y}$ at $(x=2, y=3)$ is:

$$
\frac{\partial f}{\partial y}\Bigg|_{x=2, y=3} = 2(3) - 2 = 6 - 2 = 4
$$

2. The partial derivative $\frac{\partial f}{\partial x}$ at $(x=2, y=3)$ is:

$$
\frac{\partial f}{\partial x}\Bigg|_{x=2, y=3} = 2(2) - 3 = 4 - 3 = 1
$$

So, at the point $(x=2, y=3)$, the partial derivative of $f$ with respect to $y$ is 4, and the partial derivative with respect to $x$ is 1.
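The hand-computed values can be checked numerically without any autograd machinery. This is a minimal sketch using central finite differences in plain Python; the helper `central_difference` and the step size `h` are illustrative choices, not part of the original notebook.

```python
def f(x, y):
    """The function from the text: f(x, y) = x^2 + y^2 - x*y."""
    return x ** 2 + y ** 2 - x * y


def central_difference(func, x, y, h=1e-5):
    """Approximate (df/dx, df/dy) at (x, y) with central differences."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return df_dx, df_dy


df_dx, df_dy = central_difference(f, 2.0, 3.0)
print(round(df_dx, 6), round(df_dy, 6))  # 1.0 4.0
```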

### Gradients with Autograd

Autograd records the operations applied to each tracked tensor and applies the chain rule backward through that graph. For a given function, the result equals the analytic partial derivatives regardless of how the expression is written. The two cells below produce different gradients only because they compute different functions: $(x + y)(x - y) = x^2 - y^2$ has gradient $(2x, -2y) = (4, -6)$ at $(x=2, y=3)$, whereas $x^2 + y^2 - xy$ has gradient $(2x - y, 2y - x) = (1, 4)$.

In [30]:
# Define the variables x and y, and set requires_grad=True to track computations
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)
display(x, y)

tensor(2., requires_grad=True)

tensor(3., requires_grad=True)

In [31]:
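# Define the product (x + y) * (x - y), which expands to x^2 - y^2
f = (x + y) * (x - y)

# Compute the gradients
f.backward()

# This is a different function from x^2 + y^2 - x*y, so its gradients differ
# from the partial derivatives above: df/dx = 2x = 4 and df/dy = -2y = -6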
print(f'Gradient of f w.r.t x: {x.grad.item()}')
print(f'Gradient of f w.r.t y: {y.grad.item()}')


Gradient of f w.r.t x: 4.0
Gradient of f w.r.t y: -6.0


In [28]:
# Re-create x and y so gradients from the previous cell do not accumulate
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)
display(x, y)

# Define f(x, y) = x^2 + y^2 - x*y
f = (x ** 2) + (y ** 2) - (x * y)

# Compute the gradients
f.backward()

# The gradients match the partial derivatives computed by hand
print(f'Gradient of f w.r.t x: {x.grad.item()}')  # Output: Gradient of f w.r.t x: 1.0
print(f'Gradient of f w.r.t y: {y.grad.item()}')  # Output: Gradient of f w.r.t y: 4.0

tensor(2., requires_grad=True)

tensor(3., requires_grad=True)

Gradient of f w.r.t x: 1.0
Gradient of f w.r.t y: 4.0
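To see how the algebraic form of an expression affects autograd, the sketch below evaluates the factorized product $(x + y)(x - y)$ and its true expansion $x^2 - y^2$ at the same point, alongside $x^2 + y^2 - xy$. The helper `gradients` is hypothetical and creates fresh leaf tensors on every call so that no gradients accumulate between runs.

```python
import torch


def gradients(func, x0=2.0, y0=3.0):
    """Return (df/dx, df/dy) of func at (x0, y0) using autograd."""
    x = torch.tensor(x0, requires_grad=True)
    y = torch.tensor(y0, requires_grad=True)
    func(x, y).backward()
    return x.grad.item(), y.grad.item()


factorized = gradients(lambda x, y: (x + y) * (x - y))
expanded = gradients(lambda x, y: x ** 2 - y ** 2)
other = gradients(lambda x, y: x ** 2 + y ** 2 - x * y)

print(factorized)  # (4.0, -6.0)
print(expanded)    # (4.0, -6.0)
print(other)       # (1.0, 4.0)
```

The first two results agree because they are the same function written two ways; the third differs because it is a different function.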
