# Exercise 1 : Creating Tensors

In [1]:
import torch

In [2]:
# 1. Create a 1D tensor with values [1, 2, 3, 4]
# 2. Create a 2D tensor of shape (3, 3) filled with random values
# 3. Create a tensor of zeros with shape (2, 5)
# 4. Create a tensor of ones with shape (4, 4)
# 5. Create a tensor with values from 0 to 9

1. Create a 1D tensor with values [1, 2, 3, 4]

In [3]:
oneD_tensor = torch.tensor([1, 2, 3, 4])

oneD_tensor

tensor([1, 2, 3, 4])

In [4]:
oneD_tensor.shape

torch.Size([4])

2. Create a 2D tensor of shape (3, 3) filled with random values

In [6]:
# torch.rand draws uniform random values from [0, 1), so the printed numbers differ on every run
twoD_tensor = torch.rand(3, 3)

In [7]:
twoD_tensor.shape

torch.Size([3, 3])
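
The task asks for random values, so it may help to compare the common random constructors side by side. The following is a minimal sketch, not the only way to do it; the seed value `0` and the variable names `uniform`, `normal`, and `ints` are chosen here purely for illustration.

```python
import torch

torch.manual_seed(0)  # fix the seed so repeated runs give the same draws

uniform = torch.rand(3, 3)            # uniform samples from [0, 1)
normal = torch.randn(3, 3)            # samples from a standard normal distribution
ints = torch.randint(0, 10, (3, 3))   # integers from 0 (inclusive) to 10 (exclusive)

print(uniform.shape, uniform.dtype)
print(normal.shape, normal.dtype)
print(ints.shape, ints.dtype)
```

All three calls take the target shape and return a new tensor; only the distribution and the resulting dtype differ.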

3. Create a tensor of zeros with shape (2, 5)

In [8]:
tensor_zeros_2by5 = torch.zeros(2, 5)

In [9]:
tensor_zeros_2by5

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])

In [10]:
tensor_zeros_2by5.shape

torch.Size([2, 5])

4. Create a tensor of ones with shape (4, 4)

In [11]:
tensor_ones_4by4 = torch.ones(4, 4)
tensor_ones_4by4, tensor_ones_4by4.shape

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.Size([4, 4]))

5. Create a tensor with values from 0 to 9

In [14]:
tensor_0to9 = torch.arange(0, 10)

In [15]:
tensor_0to9

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [16]:
torch.linspace(0, 9, steps=10, dtype=torch.int32)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=torch.int32)

In [17]:
torch.tensor([0,1,2,3,4,5,6,7,8,9])

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [18]:
torch.tensor(list(range(10)))

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
torch.arange(10).reshape(2,5)

tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])
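
The cells above reach the same values through several routes, but the routes do not all produce the same dtype. A small sketch of the difference between `torch.arange` and `torch.linspace`, assuming the default floating-point dtype has not been changed (the names `a` and `l` are illustrative):

```python
import torch

a = torch.arange(10)                  # integer steps, end value is exclusive
l = torch.linspace(0, 9, steps=10)    # evenly spaced points, end value is inclusive

print(a.dtype)  # torch.int64
print(l.dtype)  # torch.float32

# Same values, different dtypes: compare after casting to a common type.
print(torch.allclose(a.to(torch.float32), l))
```

This is why the `linspace` cell above passes `dtype=torch.int32` explicitly: without it the result would be a float tensor.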

# Exercise 2 : Tensor Properties

- Given:
  - x = torch.randn(3, 4)

- Print the shape, datatype, and device of x

In [20]:
x = torch.randn(3, 4)

In [21]:
x.shape, x.dtype, x.device

(torch.Size([3, 4]), torch.float32, device(type='cpu'))

In [23]:
x.get_device()

-1
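
The `-1` returned by `get_device()` indicates a CPU tensor. As a hedged sketch of how the device property is typically used, the snippet below moves a tensor to a GPU only when one is available; the names `device` and `x_dev` are illustrative.

```python
import torch

x = torch.randn(3, 4)

# Pick a GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_dev = x.to(device)  # returns x itself if it is already on that device

print(x.device, x.get_device())   # the original tensor stays on the CPU
print(x_dev.device, x_dev.is_cuda)
```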

In [None]:
help(x)

# Exercise 3 : Indexing, Slicing, Reshaping

## Indexing and Slicing

```
x = torch.tensor([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])
```



1. Get the first row
2. Get the last column
3. Get a sub-tensor containing the center 2x2 block


In [25]:
x = torch.tensor([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])

1. Get the first row

In [26]:
x[0]

tensor([10, 20, 30])

In [29]:
x.reshape(-1)

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])

In [34]:
x[:]

tensor([[10, 20, 30],
        [40, 50, 60],
        [70, 80, 90]])

In [35]:
x[:, 1]

tensor([20, 50, 80])

In [36]:
x[1, :]

tensor([40, 50, 60])

2. Get the last column

In [37]:
# last column:

print(x[:, -1])

tensor([30, 60, 90])


3. Get a sub-tensor containing the center 2x2 block (a 3x3 tensor has no unique center 2x2 block, so the top-left block is taken here)

In [40]:
x[:2, :2]

tensor([[10, 20],
        [40, 50]])
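
A 3x3 tensor contains four different 2x2 blocks, and `x[:2, :2]` selects the top-left one. The sketch below shows two of the others, plus the fact that basic slicing returns a view rather than a copy. The names `top_left`, `bottom_right`, `center`, `y`, and `block` are illustrative.

```python
import torch

x = torch.tensor([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])

top_left = x[:2, :2]       # rows 0-1, columns 0-1
bottom_right = x[1:, 1:]   # rows 1-2, columns 1-2
center = x[1, 1]           # the single center element, a 0-D tensor

# Basic slicing returns a view: writing through it changes the source tensor.
y = x.clone()              # work on a copy so x stays unchanged
block = y[1:, 1:]
block[0, 0] = -1
print(y[1, 1].item())      # -1
print(x[1, 1].item())      # 50
```

Call `.clone()` on a slice when an independent copy is needed.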

## Reshaping


```
x = torch.arange(12)
```


1. Reshape x into (3, 4)
2. Flatten it back to a 1D tensor
3. Reshape into (2, 2, 3)


In [45]:
x = torch.arange(12)

In [46]:
x

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [47]:
x.shape

torch.Size([12])

1. Reshape x into (3, 4)

In [48]:
x = x.reshape(3, 4)

In [49]:
x.shape

torch.Size([3, 4])

In [50]:
x

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

2. Flatten it back to a 1D tensor

In [51]:
# different ways to flatten back to 1D

x_flat = x.reshape(-1)
x_flat

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [52]:
x_flat.shape

torch.Size([12])

In [53]:
x_view_flat = x.view(-1)
x_view_flat, x_view_flat.shape

(tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]), torch.Size([12]))

In [54]:
x_flatten = x.flatten()

x_flatten, x_flatten.shape

(tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]), torch.Size([12]))

The table below summarizes the key differences between three ways of flattening a tensor to 1D in PyTorch.

| Feature                  | `reshape(-1)`                                                                 | `view(-1)`                                                                 | `flatten()`                                                       |
|--------------------------|-------------------------------------------------------------------------------|----------------------------------------------------------------------------|------------------------------------------------------------------|
| **Purpose**              | General reshaping to any shape. Can flatten or change dimensions.             | Efficient reshaping to any shape using the same memory.                    | Flatten tensor to 1D (or specific dimensions).                  |
| **Contiguous requirement** | Works on both contiguous and non-contiguous tensors. May copy data internally. | Requires a shape compatible with the tensor's existing strides, which for `view(-1)` typically means a contiguous tensor; otherwise it raises a `RuntimeError`, so call `.contiguous()` first. | Works on both contiguous and non-contiguous tensors.             |
| **Memory behavior**      | May return a **copy** if input is non-contiguous; otherwise, no copy.        | Does **not copy** memory; just a new view of the same data.                | May return a **copy** if input is non-contiguous; otherwise, shares memory. |
| **Syntax**               | `x.reshape(-1)`                                                               | `x.view(-1)`                                                               | `x.flatten()`                                                     |
| **Flexibility**          | Can reshape to any shape, including 1D, 2D, 3D, etc.                          | Can reshape to any shape if tensor is contiguous.                           | Primarily used for flattening dimensions; can choose `start_dim` and `end_dim`. |
| **Common Use Case**      | General reshaping operations, safe for any tensor.                            | Memory-efficient reshaping when tensor is contiguous.                       | Flattening before feeding into fully connected layers in NN.     |
| **When to use**          | Use when tensor may not be contiguous or you want a general reshape.          | Use when tensor is contiguous and you want zero-copy reshaping.             | Use when you need a convenient flatten operation, especially in neural networks. |

> **Tip:**  
> - Use `flatten()` when you specifically want a 1D tensor.  
> - Use `reshape(-1)` for safe reshaping, works with any tensor.  
> - Use `view(-1)` if the tensor is contiguous and you want a guarantee that no copy is made: it raises an error instead of silently copying.
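
The differences in the table only show up on a non-contiguous tensor. The sketch below uses a transpose to create one and compares storage pointers with `data_ptr()`; the helper names `same_storage`, `view_failed`, and `copied` are illustrative.

```python
import torch

x = torch.arange(12).reshape(3, 4)
t = x.t()  # transpose: same storage as x, but non-contiguous strides

print(x.is_contiguous(), t.is_contiguous())  # True False

# On a contiguous tensor, all three calls return views of the same storage.
same_storage = all(
    flat.data_ptr() == x.data_ptr()
    for flat in (x.reshape(-1), x.view(-1), x.flatten())
)

# On the non-contiguous transpose, view(-1) raises a RuntimeError ...
try:
    t.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# ... while reshape(-1) falls back to copying the data into new storage.
r = t.reshape(-1)
copied = r.data_ptr() != t.data_ptr()

print(same_storage, view_failed, copied)
```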


3. Reshape into (2, 2, 3)


In [60]:
x_2_2_3 = x.reshape(2, 2, 3)

x_2_2_3.shape

torch.Size([2, 2, 3])

In [61]:
x_2_2_3

tensor([[[ 0,  1,  2],
         [ 3,  4,  5]],

        [[ 6,  7,  8],
         [ 9, 10, 11]]])

# Exercise 4 : Tensor Operations


```
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# 1. Add a and b
# 2. Multiply elementwise
# 3. Compute dot product
# 4. Compute elementwise square of a

```



In [62]:
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

In [63]:
a, b

(tensor([1, 2, 3]), tensor([4, 5, 6]))

In [64]:
a + b

tensor([5, 7, 9])

In [66]:
a * b

tensor([ 4, 10, 18])

In [67]:
a @ b

tensor(32)

In [68]:
a ** 2

tensor([1, 4, 9])



```
A = torch.randn(2, 3)
B = torch.randn(3, 4)

# 1. Perform matrix multiplication
# 2. Transpose A
# 3. Multiply A with B using both @ and torch.matmul
```



In [69]:
A = torch.randn(2, 3)
B = torch.randn(3, 4)

In [72]:
A.shape, B.shape

(torch.Size([2, 3]), torch.Size([3, 4]))

In [73]:
A @ B

tensor([[-2.2955,  0.0594,  0.8891,  0.7478],
        [ 3.0988,  0.2320, -0.6287, -1.3747]])

The `@` operator performs **dot product** for 1D tensors and **matrix multiplication** for 2D tensors.

| Case                  | Tensor Example         | Operation Type          | Output Type | Description |
|-----------------------|----------------------|-----------------------|-------------|-------------|
| **1D tensors**        | `a = torch.tensor([1,2,3])` <br> `b = torch.tensor([4,5,6])` | Dot product           | 0-D tensor (scalar) | Computes the sum of elementwise products: `1*4 + 2*5 + 3*6 = 32` |
| **2D tensors**        | `A = torch.randn(2,3)` <br> `B = torch.randn(3,4)` | Matrix multiplication | 2D tensor   | Each element is dot product of a row of `A` with a column of `B`. Output shape: `(2,4)` |
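
Both rows of the table can be checked directly. This is a small sketch reusing the shapes from the exercise; the flag `mismatch_raised` is an illustrative name.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
A = torch.randn(2, 3)
B = torch.randn(3, 4)

# 1D @ 1D gives the same result as torch.dot, returned as a 0-D tensor.
print((a @ b).item(), torch.dot(a, b).item())  # 32 32
print((a @ b).ndim)                            # 0

# 2D @ 2D gives the same result as torch.matmul.
print(torch.allclose(A @ B, torch.matmul(A, B)))
print((A @ B).shape)

# The inner dimensions must match, otherwise a RuntimeError is raised.
try:
    B @ A  # (3, 4) @ (2, 3): 4 != 2
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)
```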


In [75]:
A.T

tensor([[ 0.1612,  0.6446],
        [-0.2690,  0.4325],
        [ 1.2236, -1.2099]])

In [76]:
A

tensor([[ 0.1612, -0.2690,  1.2236],
        [ 0.6446,  0.4325, -1.2099]])

In [77]:
torch.matmul(A, B)

tensor([[-2.2955,  0.0594,  0.8891,  0.7478],
        [ 3.0988,  0.2320, -0.6287, -1.3747]])

# Exercise 5 : Gradient and Autograd

In [83]:
# tensor with gradient tracking
x = torch.tensor(2.0, requires_grad=True)

# function of x
y = 3 * x**2 + 2 * x + 1

# derivative (dy/dx)
y.backward()   # this tells PyTorch to compute ∂y/∂x

In [84]:
x

tensor(2., requires_grad=True)

In [85]:
y

tensor(17., grad_fn=<AddBackward0>)

In [86]:
print("Value of y:", y.item())
print("Gradient dy/dx:", x.grad.item())

Value of y: 17.0
Gradient dy/dx: 14.0


In [87]:
x.grad

tensor(14.)

When you call `y.backward()`, PyTorch computes how the scalar output `y` changes **with respect to** each leaf tensor that has `requires_grad=True` and was used to compute `y`.  
The resulting derivatives (gradients) are stored **in those input tensors**, which is why you read `x.grad` rather than `y.grad`.

| Concept | Explanation | Example | Mathematical View | PyTorch Behavior |
|----------|--------------|----------|--------------------|------------------|
| **Tracked Tensor** | Any tensor created with `requires_grad=True`. PyTorch will record all operations on it. | `x = torch.tensor(2.0, requires_grad=True)` | \( x \) | `x.grad` will store gradients |
| **Output Tensor** | The final scalar result computed using tracked tensors. | `y = 3*x**2 + 2*x + 1` | \( y = 3x^2 + 2x + 1 \) | Gradients flow *from here* |
| **Backward Call** | Starts backpropagation (reverse-mode autodiff) from the output. | `y.backward()` | Computes \( \frac{dy}{dx} \) | PyTorch traverses the computation graph backward |
| **Gradient Storage** | The gradient of the output w.r.t each input is stored **in the input tensor’s `.grad` attribute**. | `x.grad` | \( \frac{dy}{dx} = 6x + 2 \) | `x.grad` → `14.0` when `x=2` |
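| **Why not `y.grad`?** | Because `y` is the output of the computation, not an input whose gradient we need. Gradients describe **how `y` changes w.r.t. its inputs**, not w.r.t. itself. | – | – | Only leaf tensors with `requires_grad=True` (here `x`) get `.grad` populated; `y` is a non-leaf tensor, so `y.grad` is `None` unless `y.retain_grad()` is called before `backward()` |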
| **Multiple Variables** | Each input variable’s `.grad` stores its own partial derivative (here `y` is a second input, not the output from the rows above). | `z = x**2 + 3*y + 2*x*y` | \( \frac{\partial z}{\partial x} = 2x + 2y,\; \frac{\partial z}{\partial y} = 3 + 2x \) | `x.grad = 6`, `y.grad = 5` when `x=1, y=2` |
| **In Neural Networks** | Model parameters (`weights`, `biases`) play the role of `x`: we want to know how the loss changes w.r.t. each of them. | `loss.backward()` | \( \frac{\partial L}{\partial w} \) | Each parameter’s `.grad` is used by the optimizer to update the weights |
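
The "Why not `y.grad`?" row can be checked with `is_leaf` and `retain_grad()`. A minimal sketch using the same function as above:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = 3 * x**2 + 2 * x + 1

print(x.is_leaf, y.is_leaf)  # True False

y.retain_grad()  # opt in to keeping the gradient on the non-leaf tensor y
y.backward()

print(x.grad)    # tensor(14.)
print(y.grad)    # tensor(1.), because dy/dy = 1
```

Without the `retain_grad()` call, `y.grad` would be `None` after `backward()`, since PyTorch only keeps gradients on leaf tensors by default.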


In [88]:
x = torch.tensor(1.0, requires_grad=True)
y = torch.tensor(2.0, requires_grad=True)

z = x**2 + 3*y + 2*x*y
z.backward()

In [90]:
x.grad.item()

6.0

In [91]:
y.grad.item()

5.0
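
These gradients are what an optimizer consumes. The sketch below illustrates two points with the same `z`: gradients accumulate across `backward()` calls, and a manual gradient-descent update has to run outside of gradient tracking. The learning rate `lr = 0.1` is a hypothetical value chosen only for illustration.

```python
import torch

x = torch.tensor(1.0, requires_grad=True)
y = torch.tensor(2.0, requires_grad=True)

# Gradients accumulate: two backward passes add up in .grad.
for _ in range(2):
    z = x**2 + 3*y + 2*x*y
    z.backward()
print(x.grad.item(), y.grad.item())  # 12.0 10.0

# Reset the accumulated gradients before the next step.
x.grad.zero_()
y.grad.zero_()

z = x**2 + 3*y + 2*x*y
z.backward()

lr = 0.1  # hypothetical learning rate, for illustration only
with torch.no_grad():  # the update itself must not be recorded by autograd
    x -= lr * x.grad   # 1.0 - 0.1 * 6.0
    y -= lr * y.grad   # 2.0 - 0.1 * 5.0
print(x.item(), y.item())
```

In practice an optimizer such as `torch.optim.SGD` performs the update, and its `zero_grad()` method performs the reset.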