### <div align="center">Getting Started With PyTorch</div>

##### 4.2: Matrix Fundamentals
- A tensor is the basic building block of deep learning.
- Depending on its number of dimensions, a tensor is called:
  - a Scalar when it is a single variable or number (0 dimensions).
  - a Vector when it has 1 dimension.
  - a Matrix when it has 2 dimensions.
  - a Cube when it has 3 dimensions.
- A matrix is a table-like arrangement of numbers in rows and columns.
- Matrix arithmetic (Addition, Subtraction, Multiplication etc.) helps in solving many business problems.
- In neural networks, weights can be efficiently multiplied with the output from the previous layer using matrix multiplication. If you are using a GPU, this becomes even faster as it will use multiple cores to compute dot products in parallel.
- Neural networks require a lot of matrix multiplications and that is the reason why GPUs got very popular for deep learning as it helps in parallel processing.
- You can multiply two matrices in two ways:
  1. Element-wise multiplication (a.k.a. Hadamard product), which requires both matrices to have the same shape
  2. Matrix multiplication
- The basic rule for matrix multiplication is that the number of columns in the first matrix must equal the number of rows in the second matrix.
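
The two kinds of multiplication and the shape rule can be sketched as follows. The tensors `A` and `B` and their values are made up for illustration and are not part of the course material.

```python
import torch

# Illustrative tensors (names and values are assumptions, not from the notes)
A = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])      # shape (2, 3)
B = torch.tensor([[1., 0.],
                  [0., 1.],
                  [2., 2.]])          # shape (3, 2)

# Element-wise (Hadamard) product: both operands have the same shape
hadamard = A * A                      # shape (2, 3)

# Matrix multiplication: columns of A (3) equal rows of B (3)
product = A @ B                       # shape (2, 2)

print(hadamard)
print(product)

# Mismatched inner dimensions raise a RuntimeError
try:
    A @ A                             # (2, 3) @ (2, 3) breaks the rule
except RuntimeError as err:
    print("Shape mismatch:", err)
```

Here `A @ B` works because the inner dimensions agree, while `A @ A` fails because 3 columns cannot be paired with 2 rows.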

##### PyTorch Tensor Basics
- Tensor is a generic term for a scalar (0-dimensional tensor), a vector (1-dimensional tensor), a matrix (2-dimensional tensor), etc.
- A tensor can have any number of dimensions.
- Using torch.Tensor, you can create a tensor object. Tensor objects look very much like NumPy arrays. NumPy arrays cannot be created on a GPU directly, whereas a tensor object can be created directly in GPU memory.
- A tensor has NumPy- and DataFrame-like attributes such as dtype and shape, plus a device attribute that tells you where it is stored.
- The view() method lets you reshape a tensor, as long as the total number of elements stays the same.
- zeros(), ones(), and rand() create a new tensor filled with zeros, ones, or uniform random values respectively.
- You don’t need to memorize these APIs. You can always look up the syntax with ChatGPT, Google, or the PyTorch documentation.
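
As a quick illustration of the attributes and helper functions listed above, here is a minimal sketch. The variable names are arbitrary, and the code falls back to the CPU when no GPU is present.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a 2x3 tensor directly on the chosen device
t = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]], device=device)
print(t.dtype, t.shape, t.device)

# view() reshapes without changing the number of elements (2*3 == 3*2)
reshaped = t.view(3, 2)
print(reshaped.shape)

# Factory functions take the desired shape
z = torch.zeros(2, 3)   # all zeros
o = torch.ones(2, 3)    # all ones
r = torch.rand(2, 3)    # uniform random values in [0, 1)
print(z, o, r, sep="\n")
```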

##### 4.5: Derivative and Partial Derivative
- The derivative of a function at a given point is the slope of its tangent line at that point.
- Power rule: $\frac{d}{dx} x^n = n x^{n-1}$
- Slope is used for linear equations, whereas the derivative is used for non-linear equations.
- The slope of a line is constant, whereas the derivative is a function whose value changes from point to point.
- The purpose of a partial derivative is to measure how a function changes as one of its variables is varied while the other variables are kept constant.
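
To make partial derivatives concrete, the sketch below uses a made-up function $f(x, y) = x^2 y + 3y$ (an assumption for illustration only) and compares the hand-derived partials with what PyTorch's autograd, covered in section 4.7, reports.

```python
import torch

# Hypothetical example function: f(x, y) = x^2 * y + 3y
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(5.0, requires_grad=True)

f = x**2 * y + 3 * y
f.backward()

# By hand: df/dx = 2xy (y held constant), df/dy = x^2 + 3 (x held constant)
print(x.grad)  # 2 * 2 * 5 = 20
print(y.grad)  # 2**2 + 3 = 7
```
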
##### 4.6: Chain Rule
- The chain rule is a technique for computing the derivative of a composite function, i.e. a function built from multiple functions.
- The chain rule will be used in the gradient descent technique.
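
A small worked example may help here. Assume a composite function $y = u^2$ with $u = 3x + 1$ (chosen purely for illustration). The chain rule gives $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = 2u \cdot 3$, and the sketch below checks that against autograd.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
u = 3 * x + 1      # inner function, u = 7 at x = 2
y = u ** 2         # outer function, y = 49

y.backward()

# Manual chain rule: dy/dx = (dy/du) * (du/dx)
dy_du = 2 * u.item()   # 14
du_dx = 3.0
manual = dy_du * du_dx

print(manual, x.grad.item())  # both are 42.0
```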

##### 4.7: Autograd in PyTorch
- The autograd feature calculates gradients (i.e. partial derivatives) automatically. While training a neural network, gradients must be computed during the backpropagation step, and autograd takes care of this.
- torch.no_grad() can be used as a context manager when you want to temporarily stop tracking gradients.

In [5]:
import torch

##### Gradient for a single input

In [6]:
material_cost = torch.tensor(10, requires_grad=True, dtype=torch.float16)
labor_cost = torch.tensor(5, requires_grad=True, dtype=torch.float16)

total_cost = 3*material_cost**2 + 5*labor_cost + 100
total_cost

tensor(425., dtype=torch.float16, grad_fn=<AddBackward0>)

In [7]:
total_cost.backward()
material_cost.grad, labor_cost.grad

(tensor(60., dtype=torch.float16), tensor(5., dtype=torch.float16))

In [8]:
total_cost.requires_grad

True

##### Disable gradient computation

In [9]:
x = torch.tensor(4, requires_grad=True, dtype=torch.float16)
y = x**2 + 5
print("Outside no_grad: ", y.requires_grad)

# y.backward()
# print(x.grad)
    
with torch.no_grad():
    y = x**2 + 5
    print("Inside no_grad: ", y.requires_grad)

Outside no_grad:  True
Inside no_grad:  False


##### Gradient for a vector

In [11]:
x = torch.tensor([1,2,3],requires_grad=True,dtype=torch.float16)
# x = torch.tensor([[1,2],[3,4]],requires_grad=True,dtype=torch.float16)

y = 2 * x ** 3 + 7 
y

tensor([ 9., 23., 61.], dtype=torch.float16, grad_fn=<AddBackward0>)

In [12]:
result = y.sum()
result

tensor(93., dtype=torch.float16, grad_fn=<SumBackward0>)

In [13]:
result.backward()
x.grad

tensor([ 6., 24., 54.], dtype=torch.float16)

##### 4.8: Numpy Arrays Vs PyTorch Tensors
- PyTorch tensors and NumPy arrays offer similar functionality, but tensors have 3 key benefits over NumPy arrays that are useful in deep learning.
  - Benefit 1: Tensors come with built-in support for GPU acceleration.
  - Benefit 2: Tensors have an autograd feature that computes gradients automatically. NumPy arrays do not have this feature.
  - Benefit 3: Tensors are tightly integrated with the PyTorch ecosystem, which makes them easier to use in deep learning tasks.
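
The sketch below shows how the two types interoperate, assuming NumPy is installed alongside PyTorch. The variable names and values are illustrative only.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

# from_numpy() creates a CPU tensor that shares memory with the array
t = torch.from_numpy(arr)
arr[0] = 10.0
print(t)            # the tensor sees the change made to the array

# Autograd is available on tensors, not on NumPy arrays
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()
print(w.grad)       # gradient of sum(w^2) is 2*w

# Going back to NumPy: detach from the autograd graph first
back = w.detach().numpy()
print(type(back))
```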

### Problem Statement: **Welcome to AI Town!**

##### You’ve been hired as an AI engineer in **AI Town**, a futuristic city powered by artificial intelligence. Your job is to use PyTorch to solve foundational challenges that AI Town faces. Each task requires you to demonstrate your expertise with PyTorch basics.

In [14]:
# Imports and CUDA
import torch

# Check if CUDA (GPU) is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

Using device: cpu


**Task1: Inventory Management**

AI Town’s warehouse uses sensors to record the inventory of 5 items every day. Each day’s data is represented as a list of integers (number of units).

In [15]:
#1 Create a PyTorch Tensor from the following inventory data
inventory = [[12, 15, 10, 0, 5],
             [10, 8, 7, 5, 4],
             [20, 10, 15, 5, 2]]

inventory_tensor = torch.tensor(inventory)
print(inventory_tensor)

tensor([[12, 15, 10,  0,  5],
        [10,  8,  7,  5,  4],
        [20, 10, 15,  5,  2]])


In [16]:
#2 Find the total inventory for each item across all days
total_inventory = inventory_tensor.sum(dim=0) # sum across rows
print(total_inventory)

tensor([42, 33, 32, 10, 11])


In [17]:
#3 Find the average inventory per day
# mean() requires a floating-point (or complex) tensor, e.g. float32 or float64,
# so the integer tensor is cast with .float() first
average_inventory = inventory_tensor.float().mean(dim=1) # mean across columns
print(average_inventory)

tensor([ 8.4000,  6.8000, 10.4000])


OR

In [18]:
average_inventory = inventory_tensor.mean(dim=1, dtype=torch.float32) # mean across columns
print(average_inventory)

tensor([ 8.4000,  6.8000, 10.4000])


**KEY INSIGHTS**

* dim=0 means summing across rows (column-wise), while dim=1 sums across columns (row-wise).
* Tensor operations are efficient and avoid explicit loops.
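
The effect of `dim` on the output shape can be checked directly. The sketch below reuses the inventory values from this task. The `keepdim=True` variant and the `share` calculation are additions for illustration, not part of the original exercise.

```python
import torch

data = torch.tensor([[12, 15, 10, 0, 5],
                     [10, 8, 7, 5, 4],
                     [20, 10, 15, 5, 2]])   # shape (3 days, 5 items)

per_item = data.sum(dim=0)   # collapses the day axis   -> shape (5,)
per_day = data.sum(dim=1)    # collapses the item axis  -> shape (3,)

# keepdim=True keeps the reduced axis with size 1 -> shape (3, 1)
per_day_col = data.sum(dim=1, keepdim=True)

# The (3, 1) shape lines up with each row, giving every item's share per day
share = data / per_day_col

print(per_item.shape, per_day.shape, per_day_col.shape)
print(share)
```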

**Task2: Monitoring Vehicle Flow**

AI Town uses a sensor to monitor the number of vehicles passing through two main roads every hour. The data for one day (24 hours) is represented as two 1×24 tensors.

In [19]:
#1 Simulate this data using PyTorch's Random Functions
road1 = torch.randint(50, 200, (24,))
road2 = torch.randint(50, 200, (24,))
print(road1)
print(road2)

tensor([147, 163,  98,  93, 199, 184, 133, 191, 189, 126, 155,  98, 109,  65,
        193, 140,  67,  71, 121, 183, 110, 140, 178, 122])
tensor([157, 123,  80,  92,  97, 165,  76, 181, 156, 155, 175,  84, 136, 151,
        183, 124, 102, 132, 188, 166, 167,  64, 128, 161])


In [20]:
#2 Calculate the total vehicle flow for each road across the entire day
total_flow_road1 = road1.sum() # total for road1
total_flow_road2 = road2.sum() # total for road2
print(total_flow_road1)
print(total_flow_road2)

tensor(3275)
tensor(3243)


In [21]:
#3 Calculate the total vehicle flow for each hour across both roads
total_flow_hourly = road1 + road2  # element-wise sum, one value per hour
print(total_flow_hourly)




**KEY INSIGHTS**

* Use torch.randint() for simulating random data in a specific range.
* Element-wise addition works seamlessly on tensors of the same shape.
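
Because `torch.randint()` produces different values on every run, a seed helps when results need to be repeatable. The sketch below is one way to do that. The seed value `0` and the `busiest_hour` lookup are assumptions added for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so repeated runs give the same data

# low (50) is inclusive, high (200) is exclusive
road1 = torch.randint(50, 200, (24,))
road2 = torch.randint(50, 200, (24,))

# Element-wise addition: both tensors have shape (24,)
hourly = road1 + road2

# Index (hour) with the largest combined flow
busiest_hour = hourly.argmax().item()
print(busiest_hour, hourly[busiest_hour].item())
```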

---

**Task3: Fitness Matrix**

The AI Gym tracks members’ fitness scores using a 3×3 matrix for *strength*, *stamina*, and *flexibility*. Each row represents a different member, and each column represents a specific metric.

In [23]:
#1 Create a 3*3 tensor matrix and multiply the scores of each member by a weight factor: [0.8, 1.2, 1.5]
fitness = torch.tensor([[10, 20, 30],
                        [40, 50, 60],
                        [70, 80, 90]])
weights = torch.tensor([0.8, 1.2, 1.5])
weighted_fitness = weights * fitness
print(weighted_fitness)

tensor([[  8.0000,  24.0000,  45.0000],
        [ 32.0000,  60.0000,  90.0000],
        [ 56.0000,  96.0000, 135.0000]])


In [24]:
#2 Find the row-wise and column-wise maximum scores
row_max = weighted_fitness.max(dim=1) # max for each row
col_max = weighted_fitness.max(dim=0) # max for each column
print(row_max)
print(col_max)

torch.return_types.max(
values=tensor([ 45.,  90., 135.]),
indices=tensor([2, 2, 2]))
torch.return_types.max(
values=tensor([ 56.,  96., 135.]),
indices=tensor([2, 2, 2]))


In [25]:
#3 Transpose the fitness matrix and interpret its new structure (shape)
transposed_fitness = fitness.t()
print(transposed_fitness)

tensor([[10, 40, 70],
        [20, 50, 80],
        [30, 60, 90]])


**KEY INSIGHTS**

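* Read up on [Broadcasting](https://pytorch.org/docs/stable/notes/broadcasting.html): the 1-D `weights` tensor is stretched across every row of `fitness`. Try changing how 'weighted fitness' is written (for example `fitness * weights` instead of `weights * fitness`); the output will remain the same.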
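
To see broadcasting at work, the sketch below applies the same weights in two ways: once per metric (per column, as in the task) and once per member (per row). The per-member variant is an extra illustration and not part of the original task.

```python
import torch

fitness = torch.tensor([[10., 20., 30.],
                        [40., 50., 60.],
                        [70., 80., 90.]])
weights = torch.tensor([0.8, 1.2, 1.5])

# (3, 3) * (3,): the weights are stretched across rows, one weight per column
per_metric = fitness * weights

# (3, 3) * (3, 1): reshaped weights are stretched across columns, one per row
per_member = fitness * weights.view(3, 1)

print(per_metric)
print(per_member)

# Operand order does not change the result
print(torch.equal(weights * fitness, fitness * weights))
```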

**Task4: Chain Rule in Action**

AI Lab is running experiments to understand the effect of temperature ($x$) on a chemical reaction rate ($y$). The relationship is given as:
$y = 2x^3 + 5x^2 - 3x + 7$

In [27]:
#1 Use PyTorch to compute 'y' for x = 4

# let's define x first
x = torch.tensor(4.0, requires_grad=True)

# now define y
y = 2*x**3 + 5*x**2 - 3*x + 7

print(y)

tensor(203., grad_fn=<AddBackward0>)


In [28]:
#2 Calculate dy/dx (gradient) using PyTorch's autograd
y.backward(retain_graph = True)
gradient = x.grad

print(gradient)

tensor(133.)


**KEY INSIGHTS**

* If you need to call .backward() on the same graph more than once, pass retain_graph=True to .backward().
* *Why?* By default, PyTorch frees the computational graph after a backward pass to save memory.
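
One related behaviour is worth seeing in code: gradients accumulate in `.grad` across backward passes. The sketch below reuses the reaction-rate formula from this task. The variable names `first` and `second` are illustrative.

```python
import torch

x = torch.tensor(4.0, requires_grad=True)
y = 2*x**3 + 5*x**2 - 3*x + 7

# First pass: keep the graph so it can be reused
y.backward(retain_graph=True)
first = x.grad.clone()      # dy/dx = 6x^2 + 10x - 3 = 133 at x = 4

# Second pass on the same graph: the new gradient is added to the old one
y.backward()
second = x.grad.clone()     # 133 + 133 = 266

# Reset the accumulated gradient before starting a fresh computation
x.grad.zero_()

print(first, second, x.grad)
```

This accumulation is why training loops usually clear gradients between steps.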

**Task5: Camera Calibration**

AI Town’s surveillance cameras need to align their focus. You are given the following matrices for two cameras’ focus adjustments:

In [29]:
camera1 = torch.tensor([[1, 2], [3, 4]])
camera2 = torch.tensor([[5, 6], [7, 8]])

In [30]:
#1 Perform an element-wise multiplication of the matrices (Hadamard product)
elementwise_product = camera1 * camera2
print(elementwise_product)

tensor([[ 5, 12],
        [21, 32]])


In [31]:
#2 Compute the matrix product of the two matrices (each entry is a row-by-column dot product)
dot_product = torch.matmul(camera1, camera2)
print(dot_product)

tensor([[19, 22],
        [43, 50]])


In [32]:
#3 Compute the Determinant of each matrix
det_camera1 = torch.det(camera1.float())
det_camera2 = torch.det(camera2.float())
print(det_camera1)
print(det_camera2)

tensor(-2.)
tensor(-2.0000)


**KEY INSIGHTS**:

* Use **torch.matmul()** for matrix multiplication and "*" for element-wise multiplication.
* Determinants are defined only for square matrices, and torch.det() expects a floating-point tensor, hence the .float() cast above.
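
A few properties of these operations can be verified with the same camera matrices. The sketch below is illustrative: it uses the `@` operator and `torch.linalg.det()`, which do not appear in the original cells.

```python
import torch

camera1 = torch.tensor([[1., 2.], [3., 4.]])
camera2 = torch.tensor([[5., 6.], [7., 8.]])

# The @ operator is shorthand for torch.matmul()
prod = torch.matmul(camera1, camera2)
same = camera1 @ camera2

# Matrix multiplication is not commutative: order matters
reverse = camera2 @ camera1

# The determinant of a product equals the product of the determinants
det1 = torch.linalg.det(camera1)
det2 = torch.linalg.det(camera2)
det_prod = torch.linalg.det(prod)

print(prod, reverse, sep="\n")
print(det1 * det2, det_prod)   # both approximately 4
```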

**Task6: Neural Network Foundations**

Our AI University uses a simplified single-neuron model (perceptron):
$$y = wx + b$$

In [34]:
#1 Create tensors for w  = 2, b = 1, x = [1, 2, 3, 4]
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

x = torch.tensor([1, 2, 3, 4])
print(w)
print(b)
print(x)

tensor(2., requires_grad=True)
tensor(1., requires_grad=True)
tensor([1, 2, 3, 4])


In [35]:
#2 Compute tensors for y
y = w * x + b
print(y)

tensor([3., 5., 7., 9.], grad_fn=<AddBackward0>)


PyTorch tracks gradients for all **requires_grad=True** tensors.
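
As a possible next step, the gradients of this neuron with respect to `w` and `b` can be obtained by reducing `y` to a scalar and calling `.backward()`. Summing `y` is an assumption made here for illustration. In real training a loss function would play that role.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])   # input data, not tracked by autograd

y = w * x + b

# backward() without arguments needs a scalar, so reduce y with sum()
y.sum().backward()

print(w.grad)   # d(sum(y))/dw = sum(x) = 10
print(b.grad)   # d(sum(y))/db = number of elements = 4
print(x.grad)   # None, because x was created without requires_grad=True
```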