<a href="https://colab.research.google.com/github/mfmanberg/AquaticEcosystems/blob/main/discussion_1_DL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CDSDS 542 Deep Learning for Data Science - Discussion 1

### Introduction
- Introduction to PyTorch
- Fundamental Tensor Operations and Methods

Install and import PyTorch:
- install instructions: https://pytorch.org/get-started/previous-versions/
- PyTorch tutorials: https://pytorch.org/tutorials/


In [None]:
# CPU
#!pip3 install -q torch torchvision --index-url https://download.pytorch.org/whl/cpu
# GPU
#!pip3 install -q torch torchvision --index-url https://download.pytorch.org/whl/cu126

In [None]:
# imports
import numpy as np

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim

In [None]:
# check version
print("PyTorch version:", torch.__version__)

# check CUDA availability
print("CUDA available:", torch.cuda.is_available())

PyTorch version: 2.8.0+cu126
CUDA available: True


### 1. Introduction to PyTorch
####  Origin Story

Picture the early 2000s, when there was a need for ML frameworks that were flexible enough for daily research tasks. Out of this need, the Torch framework was born. Torch was released in 2002 out of the efforts of Ronan Collobert, Koray Kavukcuoglu, and Clement Farabet. It was later picked up by Facebook AI Research and by many other people from several universities and research groups. Although Torch did provide the required flexibility, it was written in Lua, which was not a very popular language, so the major drawback for the community was having to learn a new language.

The widespread acceptance of Python in the ML community made the developers pivot to Python, and thus the Python-based version of Torch, a.k.a. PyTorch, was born. PyTorch began as an internship project by Adam Paszke, who was working under Soumith Chintala, a core developer of Torch.

Ref: https://subscription.packtpub.com/book/data/9781788834131/1/ch01lvl1sec02/understanding-pytorch-s-history

#### Deep Learning Framework \& Why PyTorch?

PyTorch is an open-source machine learning framework that allows you to write your own neural networks and optimize them efficiently. It was primarily developed by Facebook's AI Research Lab (FAIR). It is based on the Torch library, which is a scientific computing framework with wide support for machine learning algorithms. PyTorch is a Python package that provides two high-level features:
* A replacement for NumPy to use the power of GPUs and other accelerators.
* An automatic differentiation library that is useful to implement neural networks.
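
As a minimal sketch of these two features (the variable names are illustrative and not part of the original notebook), the snippet below does NumPy-style arithmetic on a tensor and then asks the automatic differentiation engine for a gradient:

```python
import torch

# Feature 1: NumPy-style array math (on CPU here; the same operations work on a GPU tensor)
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()        # y = 1^2 + 2^2 + 3^2

# Feature 2: automatic differentiation
y.backward()              # computes dy/dx and stores it in x.grad
print(y.item())           # 14.0
print(x.grad)             # tensor([2., 4., 6.]), i.e. 2 * x
```

The gradient of a sum of squares is `2 * x`, which is what `x.grad` holds after `backward()`.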

#### Why use PyTorch?
1. PyTorch is Pythonic: popular Python packages such as NumPy and SciPy can be used alongside PyTorch to extend its functionality.
2. Easy to learn: its syntax is simple and intuitive.
3. Strong community support: PyTorch has a strong community of developers and researchers who have contributed to the framework. When in doubt, you will most likely find a solution to your problem on https://discuss.pytorch.org/.

PyTorch Governance: https://pytorch.org/docs/stable/community/governance.html

### 2. Understanding Tensors

Tensors are the basic building blocks of modern deep learning frameworks. A tensor is a generalization of vectors and matrices and is easily understood as a multidimensional array. A tensor can be a number, a vector, a matrix, or an n-dimensional array, or in general terms a container storing numerical values.

Recommended viewing to understand tensors: What's a Tensor? - Dan Fleisch (https://www.youtube.com/watch?v=f5liqUk0ZTw)

##### Types of Tensor:
* Scalar / 0-D tensor: A container with a single value. (Ex: 23, 5, 2, etc)
* Vector / 1-D tensor: A container with multiple values. (Ex: [1, 2, 3, 4, 5])
* Matrix / 2-D tensor: A container with multiple values arranged in rows and columns. (Ex: [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
* In general, 2-D tensors are called matrices (matrix, if singular), and anything above 2-D is simply called a tensor.

##### Rank of a Tensor:

A tensor of rank 1 is a vector, which is a one-dimensional array,
```python
[a,b]
```
A tensor of rank 2 is a vector of vectors, or a matrix, or a two-dimensional array,
```python
[[a,b],[c,d]]
```
A tensor of rank 3 is a vector of vectors of vectors, so something with three nestings,
```python
[[[a,b],[c,d]],[[e,f],[g,h]]]
```
and so on...

Broadly, a tensor can be understood to be a vector of vectors of vectors of vectors... and so on. The rank is the number of nestings of "of vectors".

In other words, the rank is the number of pieces of information (indices) needed to specify a particular element of a tensor. For example, to specify an element in a 2-D tensor, we need 2 pieces of information: the row and the column. So, the rank of a 2-D tensor is 2.
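
The idea that the rank equals the number of indices needed can be checked directly. The following is a small sketch (the tensor `T` and its shape are chosen only for illustration):

```python
import torch

T = torch.arange(24).reshape(2, 3, 4)   # a rank-3 tensor
print(T.ndim)                           # 3 -> three indices are needed

element = T[1, 2, 3]                    # one index per dimension selects a single element
print(element.ndim, element.item())     # 0 23

row = T[1, 2]                           # only two indices: a rank-1 tensor is left over
print(row.ndim, tuple(row.shape))       # 1 (4,)
```

Each index that is supplied removes one dimension, so a rank-3 tensor needs exactly three indices to reach a scalar.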

#### Pitfall:
##### Are Matrices and 2-D tensors the same?    

A 2-D tensor can be represented by a matrix, but there is more to a tensor than just its arrangement of numerical values. A tensor is a geometric object whose components obey certain transformation laws. A matrix is just a collection of numbers in a rectangular array; there is no inherent requirement that it obey any transformation rules.

Video is a series of images correlated over time. We can use tensors to represent that correlation better and more intuitively than trying to convert it down to two-dimensional matrices. A third-rank tensor can encode all the aspects of each image (height, width, and color), while a rank-4 tensor could also hold information about time or order for the images.
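
To make the image and video example concrete, here is a minimal sketch. The sizes are hypothetical and the (height, width, color) axis order simply follows the sentence above:

```python
import torch

# Hypothetical sizes, chosen only for illustration
height, width, channels, frames = 4, 6, 3, 10

image = torch.zeros(height, width, channels)          # rank 3: one RGB image
video = torch.zeros(frames, height, width, channels)  # rank 4: images stacked over time

print(image.ndim, tuple(image.shape))   # 3 (4, 6, 3)
print(video.ndim, tuple(video.shape))   # 4 (10, 4, 6, 3)
print(tuple(video[0].shape))            # (4, 6, 3): indexing the time axis returns one image
```

Note that torchvision usually stores images channels-first, as (channels, height, width); the layout here is only meant to mirror the text.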

#### 2.1 Create Tensors

In [None]:
t = torch.tensor([1, 2, 3],
                 dtype=torch.float32,
                 device='cpu',
                 requires_grad=True)
t

tensor([1., 2., 3.], requires_grad=True)

In [None]:
# scalar
scalar = torch.tensor(10)

# vector
vector = torch.tensor([10, 20, 30])

# matrix
matrix = torch.tensor([[10, 20, 30], [40, 50, 60]])

# To check the shape of a tensor, use the .shape property
print(f'Shape of scalar: {scalar.shape}, vector: {vector.shape}, matrix: {matrix.shape}')

# To check the number of dimensions of a tensor, use the .ndim property
print(f'Number of dimensions of scalar: {scalar.ndim}, vector: {vector.ndim}, matrix: {matrix.ndim}')

Shape of scalar: torch.Size([]), vector: torch.Size([3]), matrix: torch.Size([2, 3])
Number of dimensions of scalar: 0, vector: 1, matrix: 2


In [None]:
# create Tensor from NumPy array
np_array = np.array([10, 20, 30])
tensor_from_np = torch.from_numpy(np_array)
print(tensor_from_np, tensor_from_np.shape, tensor_from_np.ndim)

tensor([10, 20, 30]) torch.Size([3]) 1


In [None]:
# create tensors of specific shapes and values
t = torch.zeros(2, 3)                # creates a tensor with 0s, by passing the shape as arguments
t_zeros = torch.zeros_like(t)        # zeros_like returns a new tensor, with the same shape as the input tensor, but filled with 0s
t_ones = torch.ones(2, 3)            # creates a tensor with 1s
t_ones_like = torch.ones_like(t)     # ones_like returns a new tensor, with the same shape as the input tensor, but filled with 1s
t_fives = torch.empty(2, 3).fill_(5) # creates a non-initialized tensor and fills it with 5
t_rand = torch.rand(2, 3)          # creates a uniform random tensor
t_rand_like = torch.rand_like(t)     # rand_like returns a new tensor, with the same shape as the input tensor, but filled with random values
t_normal = torch.randn(2, 3)         # creates a normal random tensor
t_randint = torch.randint(low=0, high=10, size=(2, 3)) # creates a random tensor with integers between low and high (exclusive)

np_array = np.zeros((2, 3))
t_from_np = torch.from_numpy(np_array) # creates a tensor from a numpy array

print(t)
print(t_zeros)
print(t_ones)
print(t_fives)
print(t_rand)
print(t_rand_like)
print(t_normal)
print(t_randint)
print(t_from_np)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[5., 5., 5.],
        [5., 5., 5.]])
tensor([[0.1587, 0.6749, 0.3393],
        [0.3154, 0.9015, 0.1423]])
tensor([[0.5234, 0.1261, 0.9999],
        [0.5749, 0.6171, 0.0021]])
tensor([[ 2.1225, -1.3086, -1.4496],
        [ 1.6222,  0.1481,  0.6676]])
tensor([[3, 0, 2],
        [7, 0, 4]])
tensor([[0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float64)


In [None]:
# tensors created on cpu by default
print('shape:', t.shape, 'dtype:', t.dtype, 'device:', t.device)

# to create on gpu
t_gpu = torch.ones(4, 5, dtype=torch.float64, device="cuda")  # GPU + float64
t_like = torch.zeros_like(t_gpu)

print("shape:", t_gpu.shape, "dtype:", t_gpu.dtype, "device:", t_gpu.device)
print("t_like.shape:", t_like.shape, "dtype:", t_like.dtype, "device:", t_like.device)

shape: torch.Size([2, 3]) dtype: torch.float32 device: cpu
shape: torch.Size([4, 5]) dtype: torch.float64 device: cuda:0
t_like.shape: torch.Size([4, 5]) dtype: torch.float64 device: cuda:0


In [None]:
# tensor round-trip GPU<->CPU
x = torch.arange(5)
if torch.cuda.is_available():
    x_gpu = x.to("cuda")
    print("On GPU:", x_gpu, x_gpu.device)
    x_cpu = x_gpu.to("cpu")
    print("Back to CPU:", x_cpu, x_cpu.device)
else:
    print("No CUDA detected; staying on CPU:", x, x.device)

On GPU: tensor([0, 1, 2, 3, 4], device='cuda:0') cuda:0
Back to CPU: tensor([0, 1, 2, 3, 4]) cpu


The trailing underscore in `t_fives = torch.empty(2, 3).fill_(5)` signifies that the operation is performed in-place, modifying the tensor itself instead of returning a new one. For example, `t_fives.add_(5)` will add 5 to each element of the tensor `t_fives` in-place. Other examples of in-place operations are `t_fives.mul_(5)`, `t_fives.div_(5)`, etc.


In [None]:
t_fives_add = t_fives + 5
print(t_fives_add)
t_fives.add_(5) # in-place addition
print(t_fives)

tensor([[10., 10., 10.],
        [10., 10., 10.]])
tensor([[10., 10., 10.],
        [10., 10., 10.]])


In [None]:
# Accessing the scalar value of a tensor
print(t_fives[0, 1].item()) # item() returns the value of a tensor as a standard Python number
print(type(t_fives[0, 1].item()))

10.0
<class 'float'>


#### 2.2 Operate on Tensors

In [None]:
# Reshape
t = torch.rand(2, 3)
print(t)
print(t.reshape(3, 2))
print(t.reshape(6))
print(t.reshape(1, 6))
print(t.reshape(6, 1))
print(t.reshape(3, -1)) # -1 means "infer the correct value from the shape of the tensor"
print(t.view(3, -1)) # view is an alternative to reshape

tensor([[0.0047, 0.7853, 0.8918],
        [0.0644, 0.0137, 0.2335]])
tensor([[0.0047, 0.7853],
        [0.8918, 0.0644],
        [0.0137, 0.2335]])
tensor([0.0047, 0.7853, 0.8918, 0.0644, 0.0137, 0.2335])
tensor([[0.0047, 0.7853, 0.8918, 0.0644, 0.0137, 0.2335]])
tensor([[0.0047],
        [0.7853],
        [0.8918],
        [0.0644],
        [0.0137],
        [0.2335]])
tensor([[0.0047, 0.7853],
        [0.8918, 0.0644],
        [0.0137, 0.2335]])
tensor([[0.0047, 0.7853],
        [0.8918, 0.0644],
        [0.0137, 0.2335]])


`view` and `reshape` gave the same result. What's the difference?

`view()` changes the shape of the tensor while keeping the underlying memory allocation the same, so the data is shared between the two tensors. If the requested shape is not compatible with the tensor's memory layout (for example, after a transpose or permute), `view()` raises an error. `reshape()` returns a view when it can and creates a new underlying memory allocation only if necessary.

In other words, when you use `reshape()` instead of `view()`, the tensor is made contiguous by copying if that is necessary, but otherwise the same data is used. This is convenient if you don't care much about memory use. If you want to be sure that you're not accidentally copying a large tensor, `view()` is the way to go.
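
The difference can be observed with `data_ptr()`, which returns the memory address of a tensor's first element. This is a minimal sketch on a small tensor (the variable names are illustrative):

```python
import torch

a = torch.arange(6).reshape(2, 3)

# Contiguous tensor: view and reshape both share memory with `a`
v = a.view(3, 2)
r = a.reshape(3, 2)
print(v.data_ptr() == a.data_ptr(), r.data_ptr() == a.data_ptr())  # True True

# A transpose is not contiguous in memory
at = a.t()
print(at.is_contiguous())    # False

try:
    at.view(6)               # view cannot express this layout without copying
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)           # True

copied = at.reshape(6)       # reshape falls back to a copy
print(copied.data_ptr() == a.data_ptr())   # False
print(copied)                # tensor([0, 3, 1, 4, 2, 5])
```

The copy made by `reshape()` is silent, which is why `view()` is the safer choice when an unexpected copy would be costly.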

Additional Reading: https://stackoverflow.com/questions/49643225/whats-the-difference-between-reshape-and-view-in-pytorch, https://discuss.pytorch.org/t/whats-the-difference-between-torch-reshape-vs-torch-view/159172/3

#### PyTorch Data Types

Available data types include:

```python
torch.bool
torch.int8
torch.uint8
torch.int16
torch.int32
torch.int64
torch.float16   # alias: torch.half
torch.float32   # alias: torch.float (default datatype)
torch.float64   # alias: torch.double
torch.bfloat16
```

When not specified, floating-point tensors default to `torch.float32`, which differs from
NumPy and plain Python, where the default floating-point type is 64-bit (`float64`). Tensors created from Python integers default to `torch.int64`.
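
The defaults can be checked directly. This short sketch assumes the global default dtype has not been changed with `torch.set_default_dtype`:

```python
import numpy as np
import torch

print(torch.get_default_dtype())          # torch.float32
print(torch.tensor([1.0, 2.0]).dtype)     # torch.float32 (from Python floats)
print(torch.tensor([1, 2]).dtype)         # torch.int64   (from Python ints)
print(np.array([1.0, 2.0]).dtype)         # float64       (NumPy's default)

# from_numpy keeps NumPy's dtype, so the result is float64, not float32
from_np = torch.from_numpy(np.array([1.0, 2.0]))
print(from_np.dtype)                      # torch.float64
```

This is also why a tensor created from a NumPy array of floats prints with `dtype=torch.float64`.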

Ref: https://pytorch.org/tutorials/beginner/introyt/tensors_deeper_tutorial.html

In [None]:
a = torch.ones((2, 3), dtype=torch.int16) # creates a tensor with 1s of type int16
print(a)

b = torch.rand((2, 3), dtype=torch.float64) * 20 # creates a random tensor of type float64 and multiplies it by 20
print(b)

c = b.to(torch.int32) # converts the tensor of type float64 to type int32
print(c)

d = torch.randn([2, 3]) # creates a random tensor of type float32
print(d)
print(d.dtype)

tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)
tensor([[16.5479, 14.9651, 18.5607],
        [12.1787,  6.1444, 11.1268]], dtype=torch.float64)
tensor([[16, 14, 18],
        [12,  6, 11]], dtype=torch.int32)
tensor([[ 1.1928,  0.6979, -0.4607],
        [ 0.6199, -0.2942,  0.5860]])
torch.float32


Also note that when the tensor is the default datatype, the `dtype` is not printed.


In [None]:
t = torch.randn(5, 6)
print(t, t.dtype)
t = t.double()  # converts to 64-bit float
print(t)
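t = t.byte()    # converts to unsigned 8-bit integer (fractions are truncated; negative values fall outside uint8's range and show up wrapped below, e.g. -1 -> 255)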
print(t)

tensor([[ 0.7435,  0.0837, -0.0284,  1.1052, -1.7039,  0.8512],
        [-2.8086,  2.1147, -0.3527,  1.8120,  0.4511,  1.1122],
        [-0.2346, -0.4508, -0.7801, -0.4661, -1.2432,  0.0638],
        [ 0.3198, -0.5995, -1.3254, -0.8066, -0.8307,  1.0488],
        [-1.4712, -0.6267,  0.9095, -0.3076, -0.9028,  0.4656]]) torch.float32
tensor([[ 0.7435,  0.0837, -0.0284,  1.1052, -1.7039,  0.8512],
        [-2.8086,  2.1147, -0.3527,  1.8120,  0.4511,  1.1122],
        [-0.2346, -0.4508, -0.7801, -0.4661, -1.2432,  0.0638],
        [ 0.3198, -0.5995, -1.3254, -0.8066, -0.8307,  1.0488],
        [-1.4712, -0.6267,  0.9095, -0.3076, -0.9028,  0.4656]],
       dtype=torch.float64)
tensor([[  0,   0,   0,   1, 255,   0],
        [254,   2,   0,   1,   0,   1],
        [  0,   0,   0,   0, 255,   0],
        [  0,   0, 255,   0,   0,   1],
        [255,   0,   0,   0,   0,   0]], dtype=torch.uint8)


#### ``torch.float16`` vs ``torch.float32``

- Reduced Memory Usage:   
    - ``torch.float16`` uses half as much memory per element compared to ``torch.float32``. This can be significant when dealing with large neural networks or datasets, especially on GPUs with limited memory.    
    - It allows you to store and process larger models or batches of data.  
    
- Faster computation on hardware that supports it, and reduced bandwidth when transferring data (from CPU to GPU, or across network connections)

- Trade-offs:
    - Reduced Precision - may lead to rounding errors in calculations.
    - Not all deep learning models and operations are compatible with ``torch.float16``. Some operations may require ``torch.float32`` or higher precision to maintain accuracy.
    - Training with ``torch.float16`` can be more challenging as it can lead to convergence issues, especially in complex models.
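
The memory saving and the loss of precision can both be seen on a two-element tensor. This is a minimal sketch, with values picked only to make the rounding visible:

```python
import torch

x32 = torch.tensor([2049.0, 0.1], dtype=torch.float32)
x16 = x32.to(torch.float16)

# Memory: bytes per element
print(x32.element_size(), x16.element_size())   # 4 2

# Precision: float16 has an 11-bit significand, so 2049 cannot be represented exactly
print(x32[0].item())          # 2049.0
print(x16[0].item())          # 2048.0

# 0.1 is stored less accurately in float16 than in float32
err16 = abs(x16[1].item() - 0.1)
err32 = abs(x32[1].item() - 0.1)
print(err16 > err32)          # True
```

Above 2048, consecutive `float16` values are 2 apart, so odd integers in that range are rounded to a neighbor.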

#### Mathematical Operations

In [None]:
a = torch.ones(2, 3)
b = torch.ones(2, 3)
print(f'a+b = {a+b}\n')

c = torch.ones(2, 3) * 2
print(f'c = {c}\n')

d = torch.ones(2, 3) + 1 # broadcasting
print(f'd = {d}\n')

fours = c ** 2 # element-wise exponentiation
print(f'fours = {fours}\n')

powers2 = c ** torch.tensor([1, 2, 3]) # element-wise exponentiation with broadcasting
print(f'powers2 = {powers2}\n')

print(f'c/powers2 = {c/powers2}\n')

a+b = tensor([[2., 2., 2.],
        [2., 2., 2.]])

c = tensor([[2., 2., 2.],
        [2., 2., 2.]])

d = tensor([[2., 2., 2.],
        [2., 2., 2.]])

fours = tensor([[4., 4., 4.],
        [4., 4., 4.]])

powers2 = tensor([[2., 4., 8.],
        [2., 4., 8.]])

c/powers2 = tensor([[1.0000, 0.5000, 0.2500],
        [1.0000, 0.5000, 0.2500]])



#### Matrix Multiplication:

Things to keep in mind:
* Inner dimensions must match (Ex: 2x3 and 3x4)
* The resulting matrix has the shape of the outer dimensions (Ex: 2x4)
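
A quick sketch of what happens when the first rule is violated, and one way to fix it by transposing (the tensor `A` is illustrative):

```python
import torch

A = torch.rand(2, 3)

try:
    torch.mm(A, A)               # (2, 3) x (2, 3): inner dimensions 3 and 2 differ
    mismatch_failed = False
except RuntimeError:
    mismatch_failed = True
print(mismatch_failed)           # True

# Transposing the second operand makes the inner dimensions match: (2, 3) x (3, 2)
C = torch.mm(A, A.t())
print(tuple(C.shape))            # (2, 2)
```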

In [None]:
# Matrix multiplication

A = torch.rand(2, 3)
B = torch.rand(3, 4)
C = torch.mm(A, B) # matrix multiplication
print(C, C.shape)


tensor([[0.7698, 0.3378, 0.6939, 0.9779],
        [0.2282, 0.2790, 0.2625, 0.4850]]) torch.Size([2, 4])


There are other ways of performing matrix multiplication in PyTorch, such as `torch.matmul()`, `torch.bmm()`, and the `@` operator. Read more at: https://www.geeksforgeeks.org/python-matrix-multiplication-using-pytorch/
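
As a brief sketch of these alternatives (inputs chosen only for illustration): for 2-D inputs `torch.mm`, `@`, and `torch.matmul` agree, while `torch.bmm` multiplies a batch of matrices at once.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = torch.arange(12, dtype=torch.float32).reshape(3, 4)

# For 2-D inputs, these three give the same result
same = torch.allclose(torch.mm(A, B), A @ B) and torch.allclose(A @ B, torch.matmul(A, B))
print(same)                              # True

# torch.bmm works on batches: (b, n, m) x (b, m, p) -> (b, n, p)
batch_A = torch.stack((A, A))            # shape (2, 2, 3)
batch_B = torch.stack((B, B))            # shape (2, 3, 4)
out = torch.bmm(batch_A, batch_B)
print(tuple(out.shape))                  # (2, 2, 4)
print(torch.allclose(out[0], A @ B))     # True
```

`torch.mm` accepts only 2-D tensors and `torch.bmm` only 3-D tensors, whereas `torch.matmul` and `@` also handle batched and broadcast inputs.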

#### Indexing and Slicing
In general, PyTorch tensors behave similarly to NumPy arrays. They are zero-indexed and support slicing.

In [None]:
t = torch.rand(2, 3)
print(t)
print(t[0, 0]) # access a single element
print(t[0, :]) # access a row
print(t[:, 0]) # access a column
print(t[0:2, 0:2]) # access a sub-matrix


tensor([[0.6889, 0.4788, 0.5620],
        [0.4587, 0.0522, 0.8728]])
tensor(0.6889)
tensor([0.6889, 0.4788, 0.5620])
tensor([0.6889, 0.4587])
tensor([[0.6889, 0.4788],
        [0.4587, 0.0522]])


In [None]:
t[0, 0] = 1 # modify a single element
print(t)
t[0, :] = 2 # modify a row with implicit broadcasting
print(t)
t[:, 0] = 3 # modify a column with implicit broadcasting
print(t)
t[0:2, 0:2] = 4 # modify a sub-matrix with implicit broadcasting
print(t)

tensor([[1.0000, 0.4788, 0.5620],
        [0.4587, 0.0522, 0.8728]])
tensor([[2.0000, 2.0000, 2.0000],
        [0.4587, 0.0522, 0.8728]])
tensor([[3.0000, 2.0000, 2.0000],
        [3.0000, 0.0522, 0.8728]])
tensor([[4.0000, 4.0000, 2.0000],
        [4.0000, 4.0000, 0.8728]])


#### Other Commonly used tensor methods

In [None]:
t = torch.randn(2,3)
t.max()                 # returns the maximum value in a tensor
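t.argmax()              # returns the index of the maximum value (an index into the flattened tensor when no dim is given)
t.sum(dim=0)            # sums over dim 0 (down the rows): one value per column, shape (3,)
t.sum(dim=1)            # sums over dim 1 (along the columns): one value per row, shape (2,)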
t.t()                   # transpose
t.numel()               # number of elements in tensor
t.nonzero()             # indices of non-zero elements
t.squeeze()             # removes size 1 dimensions
t.unsqueeze(0)          # inserts a dimension

t = torch.rand(2, 3, 4)
print('Before permute:', t.shape)
t = t.permute(1, 0, 2)         # permutes dimensions: 1st dim becomes 2nd, 2nd becomes 1st, 3rd remains 3rd
print('After permute:', t.shape)
t = t.flatten(start_dim=1)     # flattens a tensor from the 2nd dimension
print('After flatten on 2nd dim:', t.shape)
t = t.flatten()                # flattens a tensor from the 1st dimension by default
print('After flattening the resultant tensor:', t.shape)

torch.dist(torch.tensor([3.0, 1.0]), torch.tensor([1.0, 2.0]), p=2) # distance between two tensors: the p-norm of their difference (here the Euclidean norm of [2., -1.])

torch.arange(0, 10)     # tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
torch.eye(3, 3)         # creates a 3x3 matrix with 1s on the diagonal (the identity matrix, since it is square)
t = torch.arange(0, 3)  # tensor([0, 1, 2])
torch.cat((t, t))       # tensor([0, 1, 2, 0, 1, 2])
torch.stack((t, t))     # tensor([[0, 1, 2],
                        #         [0, 1, 2]])

Before permute: torch.Size([2, 3, 4])
After permute: torch.Size([3, 2, 4])
After flatten on 2nd dim: torch.Size([3, 8])
After flattening the resultant tensor: torch.Size([24])


tensor([[0, 1, 2],
        [0, 1, 2]])

#### Randomness

Having a piece of code that behaves randomly and produces different results every time you run it is not a good idea when you need reproducible results. As programmers, however, we do not deal with true randomness; we deal with pseudo-randomness. But `torch.randn` does seem to give different results every time we run it, so by definition it is random, isn't it?

In a way, yes. But, the random numbers are not quite truly random. They are pseudo-random numbers, which means that a number generator is used to generate a sequence of numbers that appear to be random, but they are not. The sequence of numbers generated by a pseudo-random number generator is determined by a fixed initial value called the seed. Every time we give it the same seed, it will give us the same sequence of numbers.

In [None]:
# Without setting the seed, the results of the random operations will be different every time

random_1 = torch.rand(2, 3)
random_2 = torch.rand(2, 3)

print(random_1 == random_2)

tensor([[False, False, False],
        [False, False, False]])


In [None]:
# By setting the seed, the results of the random operations will be the same every time

torch.manual_seed(42)
random_1 = torch.rand(2, 3)
random_2 = torch.rand(2, 3)     # the second call to torch.rand(2, 3) will return different values than the first call
print(random_1 == random_2)
print()

torch.manual_seed(42)
random_3 = torch.rand(2, 3)
random_4 = torch.rand(2, 3)
print(random_1 == random_3)
print(random_2 == random_4)
print()

torch.manual_seed(42)
random_5 = torch.rand(2, 3)
torch.manual_seed(42)           # need to reset the seed again, to get the same values for the second call to torch.rand(2, 3)
random_6 = torch.rand(2, 3)
print(random_5 == random_6)

tensor([[False, False, False],
        [False, False, False]])

tensor([[True, True, True],
        [True, True, True]])
tensor([[True, True, True],
        [True, True, True]])

tensor([[True, True, True],
        [True, True, True]])


When training a neural network, we use random initialization of weights, shuffling of data during training, and dropout, among other techniques. All these involve randomness, which can lead to slightly different results each time we run the code unless we set a seed. This can be problematic when trying to replicate someone else’s results or debugging your own code.
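
The same seeding idea carries over to shuffling and dropout. The following is a minimal sketch; `one_run` is a hypothetical helper written for this illustration, and `torch.randperm` stands in for shuffling a dataset:

```python
import torch
import torch.nn as nn

def one_run(seed):
    """Hypothetical helper: seed, then shuffle indices and apply dropout."""
    torch.manual_seed(seed)
    order = torch.randperm(8)              # stands in for shuffling a dataset
    dropout = nn.Dropout(p=0.5)            # a new module is in training mode by default
    dropped = dropout(torch.ones(8))       # zeroes random entries, scales the rest by 1 / (1 - p)
    return order, dropped

order_a, dropped_a = one_run(0)
order_b, dropped_b = one_run(0)

print(torch.equal(order_a, order_b))       # True: same seed, same shuffle
print(torch.equal(dropped_a, dropped_b))   # True: same seed, same dropout mask
```

This covers CPU random number generation only; fully reproducible GPU training may need further settings that are beyond the scope of this discussion.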

Additional Reading: https://pieriantraining.com/how-to-set-the-seed-in-pytorch-for-reproducible-results/

### Exercise 1 — Create & Inspect a Tensor
**Task:** Create a 1D tensor with values `[0, 1, 2, 3, 4]` on CPU (default).  
Print **exactly**: `device=..., dtype=..., shape=..., data=...` in **one line** using the shown format.


In [None]:
# === TODO: fill only the blanks marked with __FILL__ ===

x = torch.arange(__FILL__)                   # -> should be [0,1,2,3,4]
print(f"device={x.device}, dtype={x.dtype}, shape={tuple(x.shape)}, data={x.tolist()}")

# Expected: device=cpu, dtype=torch.int64, shape=(5,), data=[0, 1, 2, 3, 4]



### Exercise 2 — Tensor Ranks
**Task:** Create (a) scalar `s=7`, (b) vector `v=[1,2,3,4]`, (c) matrix `M=[[1,2],[3,4]]`, (d) rank-3 `T` of shape `(2,2,3)` filled with `0..11` in row-major order.  
Print `name: ndim=?, shape=?` for each in the order `s, v, M, T`.


In [None]:
# === TODO: fill only the blanks ===

s = torch.tensor(__FILL__)
v = torch.tensor(__FILL__)
M = torch.tensor(__FILL__)
T = torch.arange(__FILL__).reshape(__FILL__)

for name, t in [("s", s), ("v", v), ("M", M), ("T", T)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")

# Expected shapes: s:(), v:(4,), M:(2,2), T:(2,2,3)

### Exercise 3 — Same Shape, Different Constructors

**Task:** Create three tensors of shape `(3,4)` using: `zeros`, `ones`, and `arange(...).reshape(3,4)`.  
Convert all to `torch.float32` and print `name: dtype=?, first_row=?` for each in order `A, B, C`.


In [None]:
# === TODO: fill only the blanks ===

A = torch.zeros(__FILL__)
B = torch.ones(__FILL__)
C = torch.arange(__FILL__).reshape(__FILL__)

A = A.__FILL__()     # convert to float32
B = B.__FILL__()
C = C.__FILL__()

for name, t in [("A", A), ("B", B), ("C", C)]:
    print(f"{name}: dtype={t.dtype}, first_row={t[0].tolist()}")

# Expected first rows: A->[0.0,...], B->[1.0,...], C->[0.0,1.0,2.0,3.0]

### Exercise 4 — In-Place Effects with `view` vs `reshape`
**Task:** Create `a = ones((2,3))`, then `b = a.view(-1)`, `c = a.reshape(-1)`.  
Call `a.add_(5)`. Print `a`, `b`, `c` before and after.  
(You only need to fill the creation lines and the in-place op.)


In [None]:
# === TODO: fill only the blanks ===

a = torch.__FILL__(2, 3)
b = a.view(__FILL__)
c = a.reshape(__FILL__)

print("Before add_:", "a=", a, " ", "b=", b, " ", "c=", c, sep="")
a.__FILL__(5)  # in-place add
print("\nAfter add_:\n", "a=", a, "\n", "b=", b, "\n", "c=", c, sep="")

### Exercise 5 — Contiguity and `view`

**Task:** Create `t = torch.arange(24).reshape(2,3,4)`, then `u = t.permute(1,0,2)`.  
Attempt `u.view(-1)` in a try/except and record whether it failed. Then call `u = u.contiguous()` and view again.  
Print `first_view_failed=?, after_contiguous_shape=?`.


In [None]:
# === TODO: fill only the blanks ===

t = torch.arange(24).reshape(2, 3, 4)
u = t.permute(1, 0, 2)

failed_first = False
try:
    _ = u.view(-1)
except RuntimeError:
    failed_first = True

u = u.__FILL__()
flat = u.view(__FILL__)
print(f"first_view_failed={failed_first}, after_contiguous_shape={tuple(flat.shape)}")

# Expected: first_view_failed=True, after_contiguous_shape=(24,)


### Exercise 6 — Casting & Exact Comparison

**Task:** Create `x16 = [[1.5,2.25],[3.5,4.75]]` with dtype `float16`, cast to `float32` as `x32`.  
Print both and `torch.allclose(x32, x16.float())`.


In [None]:
# === TODO: fill only the blanks ===

x16 = torch.tensor([[1.5, 2.25], [3.5, 4.75]], dtype=torch.float16)
x32 = x16.__FILL__()
print("x16:", x16)
print("x32:", x32)
print("allclose:", torch.allclose(x32, x16.__FILL__()))



### Exercise 7 — Broadcasting with Fixed Values

**Task:** Create `A = ones((2,3))`, `b = tensor([10,20,30])`. Compute `A + b`, `A * b`.  
Attempt `A + tensor([1,2])` in try/except and report if it failed.  
Print results in the given labels.


In [None]:
# === TODO: fill only the blanks ===

A = torch.ones(2, 3)
b = torch.tensor([10, 20, 30])

sum_ok = __FILL__
mul_ok = __FILL__

failed = False
try:
    _ = A + torch.tensor([1, 2])
except RuntimeError:
    failed = True

print("A+b:\n", sum_ok)
print("A*b:\n", mul_ok)
print("invalid_broadcast_failed:", failed)


### Exercise 8 — Compare matmul APIs with Constants

**Task:** Let  
`P = [[1,2,3],[4,5,6]]` (float32),  
`Q = [[1,0,0,1,2],[0,1,0,2,3],[0,0,1,3,4]]` (float32).  
Compute `P@Q`, `torch.mm(P,Q)`, `torch.matmul(P,Q)` and print whether all results are equal.


In [None]:
# === TODO: fill only the blanks ===
P = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
Q = torch.tensor([[1, 0, 0, 1, 2],
                  [0, 1, 0, 2, 3],
                  [0, 0, 1, 3, 4]], dtype=torch.float32)

out1 = __FILL__
out2 = __FILL__
out3 = __FILL__

print("all_equal:", torch.allclose(out1, out2) and torch.allclose(out2, out3))


### Exercise 9 — Advanced Slicing & Assignment

**Task:** Create `T = torch.arange(1,13).reshape(3,4)`.  
(a) Set last column to zeros; (b) Swap first two rows; (c) Extract submatrix rows `[0:2]`, cols `[1:3]`.  
Print *Original*, then after (a), after (b), and the submatrix for (c).


In [None]:
# === TODO: fill only the blanks ===

T = torch.arange(1, 13).reshape(3, 4)
print("Original:\n", T)

T[:, -1] = __FILL__
print("\n(a) Last col zeroed:\n", T)

T[[0, 1]] = T[[__FILL__]]
print("\n(b) First two rows swapped:\n", T)

sub = T[0:2, __FILL__]
print("\n(c) Submatrix [0:2, 1:3]:\n", sub)


### Exercise 10 — Reproducible Mini-Experiment

**Task:** Set torch manual seed as 123. Create `m1 = nn.Linear(4,2)`. Use fixed input `x = torch.arange(20).reshape(5,4).float()`.  
Re-seed manual seed = 123 and create `m2 = nn.Linear(4,2)`.  
Compute forward outputs and print `equal_forwards: True/False` (should be True).


In [None]:
# === TODO: fill only the blanks ===

torch.__FILL__(123)
m1 = nn.Linear(4, 2)
x = torch.arange(20).reshape(5, 4).float()
out1 = m1(x)

torch.__FILL__(123)
m2 = nn.Linear(4, 2)
out2 = m2(x)

print("equal_forwards:", torch.allclose(out1, out2))


## End of Discussion