### Day 1: PyTorch Basics and Tensors

### 1. Introduction to PyTorch

PyTorch is an open-source machine learning library originally developed by Facebook's AI Research lab (FAIR, now part of Meta AI). It is widely used for deep learning applications such as computer vision and natural language processing.

Why PyTorch?
- Dynamic computational graph (which allows for flexible and fast experimentation).
- Strong GPU acceleration.
- Easy to learn and use, especially for Python users.

PyTorch vs. TensorFlow:
- TensorFlow 1.x built static computational graphs that were defined first and executed later; TensorFlow 2.x runs eagerly by default and compiles static graphs through `tf.function`. PyTorch builds its graph dynamically as the code runs, which many users find more intuitive for debugging and for building complex models.
- PyTorch is often described as more "Pythonic" and easier for beginners to pick up.
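
To make the idea of a dynamic computational graph concrete, here is a minimal sketch. The function name `forward` and the input values are chosen purely for illustration. Ordinary Python control flow decides which operations get recorded, and gradients follow whichever branch actually ran.

```python
import torch


def forward(x):
    # Plain Python control flow: the graph is built while this code runs,
    # so only the branch that executes is recorded.
    if x.sum() > 0:
        return (x * 2).sum()
    return (x ** 2).sum()


x = torch.tensor([1.0, -3.0], requires_grad=True)
y = forward(x)   # x.sum() is -2, so the squared branch runs: 1 + 9 = 10
y.backward()     # gradient of sum(x^2) is 2x

print("y =", y.item())
print("dy/dx =", x.grad)
```

Because the graph is rebuilt on every call, a different input can take the other branch without any change to the code.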

### 2. Tensors: The Basic Building Blocks

Tensors generalize scalars, vectors, and matrices to an arbitrary number of dimensions. In PyTorch, tensors are similar to NumPy arrays, but they can also run on a GPU for accelerated computation.

#### 2.1. Creating Tensors

Let's create some basic tensors.

In [8]:
import torch

# Scalar (0-dimensional tensor)
scalar = torch.tensor(5)
print("Scalar:", scalar)
print("Shape of scalar:", scalar.shape)

# Vector (1-dimensional tensor)
vector = torch.tensor([1, 2, 3])
print("\nVector:", vector)
print("Shape of vector:", vector.shape)

# Matrix (2-dimensional tensor)
matrix = torch.tensor([[1, 2], [3, 4]])
print("\nMatrix:", matrix)
print("Shape of matrix:", matrix.shape)

# Random tensor
random_tensor = torch.rand(2, 3)  # 2 rows, 3 columns
print("\nRandom Tensor (2x3):", random_tensor)

Scalar: tensor(5)
Shape of scalar: torch.Size([])

Vector: tensor([1, 2, 3])
Shape of vector: torch.Size([3])

Matrix: tensor([[1, 2],
        [3, 4]])
Shape of matrix: torch.Size([2, 2])

Random Tensor (2x3): tensor([[0.3521, 0.0691, 0.1474],
        [0.0759, 0.4095, 0.3250]])
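
Besides `torch.tensor` and `torch.rand`, a few other creation helpers come up often. The sketch below also shows the NumPy connection mentioned earlier; it assumes NumPy is installed, and the variable names are illustrative only.

```python
import numpy as np
import torch

zeros = torch.zeros(2, 3)        # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3)          # 2x3 tensor filled with 1.0
steps = torch.arange(0, 10, 2)   # tensor([0, 2, 4, 6, 8])

# torch.from_numpy shares memory with the source array (CPU only),
# so a change to the array is visible through the tensor.
np_array = np.array([1.0, 2.0, 3.0])
shared = torch.from_numpy(np_array)
np_array[0] = 10.0
print("Shared tensor:", shared)

# .numpy() goes the other way for CPU tensors.
back_to_numpy = shared.numpy()
print("Type after .numpy():", type(back_to_numpy))
```

Note that `torch.tensor(np_array)` would copy the data instead of sharing it, which is the safer choice when the array is modified later.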


#### 2.2. Tensor Operations

Tensors support various operations. Let's look at a few.

In [3]:
# Addition
a = torch.tensor([1, 2])
b = torch.tensor([3, 4])
c = a + b
print("a + b =", c)

# Multiplication (element-wise)
d = a * b
print("a * b =", d)

# Matrix multiplication
e = torch.tensor([[1, 2], [3, 4]])
f = torch.tensor([[5, 6], [7, 8]])
g = torch.matmul(e, f)
print("\nMatrix multiplication result:\n", g)

a + b = tensor([4, 6])
a * b = tensor([3, 8])

Matrix multiplication result:
 tensor([[19, 22],
        [43, 50]])
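
A few related details are easy to mix up, so here is a short sketch contrasting them: the `@` operator as shorthand for `torch.matmul`, element-wise multiplication of matrices, and broadcasting. The extra tensors (`row`, `m`, `n`) are made up for illustration.

```python
import torch

e = torch.tensor([[1, 2], [3, 4]])
f = torch.tensor([[5, 6], [7, 8]])

# The @ operator is shorthand for torch.matmul.
g = e @ f
print("e @ f =\n", g)

# Element-wise product of the same matrices, for contrast.
h = e * f
print("e * f =\n", h)

# Broadcasting: the 1-D row is added to every row of e.
row = torch.tensor([10, 20])
shifted = e + row
print("e + row =\n", shifted)

# Matrix multiplication needs matching inner dimensions: (2x3) @ (3x2) -> (2x2).
m = torch.ones(2, 3)
n = torch.ones(3, 2)
print("Shape of m @ n:", (m @ n).shape)
```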


#### 2.3. Tensor Shape and Data Type

Understanding the shape and data type of tensors is crucial for debugging.

In [4]:
# Let's check the shape and dtype of the random tensor we created
print("Shape of random_tensor:", random_tensor.shape)
print("Data type of random_tensor:", random_tensor.dtype)

# We can also change the data type (if needed).
# Casting float to int truncates toward zero, so values in [0, 1) all become 0.
int_tensor = random_tensor.to(torch.int32)
print("\nAfter converting to int32:", int_tensor.dtype)
print("Integer tensor:\n", int_tensor)

Shape of random_tensor: torch.Size([2, 3])
Data type of random_tensor: torch.float32

After converting to int32: torch.int32
Integer tensor:
 tensor([[0, 0, 0],
        [0, 0, 0]], dtype=torch.int32)
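
The all-zero result above follows from how float-to-integer casting works: the fractional part is dropped. The sketch below, using made-up values, contrasts plain casting with rounding first, and shows how to set a dtype at creation time.

```python
import torch

values = torch.tensor([0.2, 0.7, 1.6, -1.6])

# Casting truncates toward zero.
truncated = values.to(torch.int32)
print("Truncated:", truncated)   # [0, 0, 1, -1]

# Rounding before the cast keeps the nearest integer instead.
rounded = values.round().to(torch.int32)
print("Rounded:", rounded)       # [0, 1, 2, -2]

# The dtype can also be chosen when the tensor is created.
floats = torch.tensor([1, 2, 3], dtype=torch.float32)
print("dtype at creation:", floats.dtype)

# Mixing integer and float tensors promotes the result to float.
mixed = torch.tensor([1, 2]) + torch.tensor([0.5, 0.5])
print("dtype after int + float:", mixed.dtype)
```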


#### 2.4. GPU vs CPU

PyTorch allows you to move tensors to a GPU for faster computation. Tensors are created on the CPU by default, so if no GPU is available the same code simply keeps running on the CPU.

In [5]:
# Check if GPU is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print("Using device:", device)

# Move a tensor to the GPU (if available)
if device == 'cuda':
    gpu_tensor = random_tensor.to(device)
    print("Tensor on GPU:", gpu_tensor.device)
else:
    print("GPU not available, tensor stays on CPU.")

Using device: cuda
Tensor on GPU: cuda:0
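
A common pattern is to write device-agnostic code: pick the device once and pass it everywhere, so the same script runs with or without a GPU. This is a minimal sketch of that pattern, with illustrative variable names.

```python
import torch

# Pick the device once; everything below works on either CPU or GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(2, 3, device=device)   # created directly on the chosen device
y = torch.ones(2, 3).to(device)       # created on the CPU, then moved

# Operands must live on the same device, otherwise PyTorch raises an error.
z = x + y
print("Result lives on:", z.device)

# Move back to the CPU before converting to NumPy or plain Python values.
z_cpu = z.cpu()
print("After .cpu():", z_cpu.device)
```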


### 3. Introduction to MLP (Multi-Layer Perceptron)

A multi-layer perceptron (MLP) is a class of feedforward artificial neural network (ANN). It consists of at least three layers:
1. Input layer
2. One or more hidden layers
3. Output layer

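Each layer is fully connected to the next: every neuron feeds every neuron in the following layer. Neurons in the hidden layers apply a nonlinear activation function; the input layer only passes the features through, and the output layer may use a task-specific activation or none at all.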
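
The layer structure described above can be sketched with `torch.nn`. The sizes here (4 input features, 8 hidden units, 3 outputs) and the batch of random inputs are arbitrary choices for illustration, not values from this tutorial.

```python
import torch
from torch import nn

torch.manual_seed(0)  # only to make the random numbers repeatable

# Hypothetical sizes: 4 input features -> 8 hidden units -> 3 outputs.
mlp = nn.Sequential(
    nn.Linear(4, 8),   # input features -> hidden layer (fully connected)
    nn.ReLU(),         # nonlinear activation on the hidden layer
    nn.Linear(8, 3),   # hidden layer -> output layer
)

batch = torch.rand(5, 4)   # 5 samples with 4 features each
output = mlp(batch)
print("Output shape:", output.shape)

# Each nn.Linear holds a weight matrix and a bias vector.
num_params = sum(p.numel() for p in mlp.parameters())
print("Trainable parameters:", num_params)
```

The input layer has no module of its own here; it is simply the feature dimension expected by the first `nn.Linear`.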

Why MLP?
- Can learn complex patterns.
- Universal function approximator: with enough hidden units, an MLP can in theory approximate any continuous function on a compact (closed and bounded) input domain to arbitrary accuracy.

#### 3.1. Key Components

- **Neurons**: Basic units that compute a weighted sum of their inputs, add a bias, and apply an activation function.
- **Layers**:
  - Input layer: Receives the input features.
  - Hidden layers: Transform the inputs into intermediate representations that are passed on to the next layer.
  - Output layer: Produces the final output.
- **Activation functions**: Introduce non-linearity; without them, stacked linear layers would collapse into a single linear transformation. Common choices are ReLU, Sigmoid, and Tanh.
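
To tie these components together, here is a single neuron computed by hand: weighted sum, plus bias, then an activation. The input, weight, and bias values are invented for the example.

```python
import torch

inputs = torch.tensor([1.0, 2.0, 3.0])
weights = torch.tensor([0.5, -1.0, 0.25])
bias = torch.tensor(0.5)

# Weighted sum plus bias: 0.5 - 2.0 + 0.75 + 0.5 = -0.25
pre_activation = torch.dot(inputs, weights) + bias
print("Pre-activation:", pre_activation.item())

# The same value passed through three common activation functions.
relu_out = torch.relu(pre_activation)         # max(0, x)
sigmoid_out = torch.sigmoid(pre_activation)   # squashes into (0, 1)
tanh_out = torch.tanh(pre_activation)         # squashes into (-1, 1)

print("ReLU:", relu_out.item())
print("Sigmoid:", round(sigmoid_out.item(), 4))
print("Tanh:", round(tanh_out.item(), 4))
```

A full layer performs this computation for many neurons at once, which is why it reduces to a matrix multiplication followed by an element-wise activation.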


