
### 1. **Introduction**

- **Tensors in Neural Networks**:
  - In PyTorch, data is represented using **tensors**. A tensor is essentially a multi-dimensional array (similar to a NumPy array) with added support for GPU acceleration and automatic differentiation.
  - Neural networks process data in the form of tensors, which are passed through layers, transformed, and produce outputs.

- **Dimensions in Neural Networks**:
  - **Batch size**: Refers to the number of data samples processed simultaneously by the network. For instance, a batch size of 32 means the model processes 32 samples at once.
  - **Input dimensions**: Refers to the number of features in each input sample. For example, for a dataset of grayscale images flattened into vectors, the input dimension is the number of pixels (height × width) per image.
  - **Output dimensions**: Typically corresponds to the number of target classes or regression outputs the model predicts.

  **Observation**:
  - Tensors allow us to handle multi-dimensional data efficiently, which is crucial for large datasets and high-dimensional inputs like images or time series.
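
To make the three dimensions concrete, the sketch below passes one batch through a single fully connected layer. The sizes (32 samples, 10 features, 3 outputs) are illustrative assumptions, not values taken from a particular dataset.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumed for this sketch)
batch_size, input_dim, output_dim = 32, 10, 3

x = torch.rand(batch_size, input_dim)     # one row per sample
layer = nn.Linear(input_dim, output_dim)  # maps input_dim features to output_dim values
y = layer(x)

print(x.shape)  # torch.Size([32, 10])
print(y.shape)  # torch.Size([32, 3])
```

The batch dimension passes through unchanged; only the last dimension is transformed by the layer.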




---



### 2. **Generating Input Data**

- **Creating Random Tensors**:
  - To simulate data, we can generate random tensors using PyTorch’s `torch.rand()`. This function draws numbers uniformly from the half-open interval [0, 1).

  - **Shape of the Tensor**:
    - The shape is critical in neural networks. In this case, the shape `(batch_size, input_dim)` means we have a batch of `batch_size` samples, where each sample has `input_dim` features.

  **Code Explanation**:
  - For a batch size of 2 and an input dimension of 3:
  

In [2]:
import torch # Imports the PyTorch library. This line makes 'torch' available for use

batch_size = 2  # Defines the number of samples (batch size) to be processed at once
input_dim = 3  # Defines the number of features in each input sample
x_input = torch.rand(batch_size, input_dim)  # Generates a random input tensor with shape (batch_size, input_dim)
print("Random input tensor:", x_input)  # Prints the generated random tensor


Random input tensor: tensor([[0.3881, 0.7579, 0.5920],
        [0.0592, 0.7366, 0.1417]])



  **Demonstration**:
  - Generate the random tensor and print its shape and values:


In [3]:
print("Input tensor shape:", x_input.shape)  # Prints the shape of the tensor, which is (batch_size, input_dim)
print("Input tensor values:\n", x_input)  # Prints the actual values of the input tensor


Input tensor shape: torch.Size([2, 3])
Input tensor values:
 tensor([[0.3881, 0.7579, 0.5920],
        [0.0592, 0.7366, 0.1417]])



- **Role of Batch Size and Input Dimensions**:
  - **Batch size**: Controls how many data samples are processed at once. A larger batch size can speed up training, but it also requires more memory.
  - **Input dimensions**: Define how much information is passed to the network per sample. For example, if each sample represents an image, the input dimension could be the number of pixels.

  **Observation**:
  - Batch size and input dimensions directly affect memory usage and computation time. It’s essential to choose appropriate values for batch size based on your dataset and hardware.

---



### 3. **Understanding Tensor Shapes**

- **`describe()` Function**:
  - Understanding the **shape** and **type** of tensors is crucial when designing neural networks, as mismatched shapes can cause errors.
  - The `describe()` function will print out key information about the tensor:
    - **Type**: Type of tensor (e.g., `torch.FloatTensor`).
    - **Shape**: Dimensions of the tensor (e.g., `(batch_size, input_dim)`).
    - **Values**: The actual data contained within the tensor.

  **Code**:


In [4]:
def describe(x):
    # Prints the type of the tensor
    print("Type: {}".format(x.type()))

    # Prints the shape of the tensor
    print("Shape: {}".format(x.shape))

    # Prints the actual values of the tensor
    print("Values: \n{}".format(x))



  **Demonstration**:
  - Use the `describe()` function to display details about the tensor created earlier:


In [5]:
describe(x_input)


Type: torch.FloatTensor
Shape: torch.Size([2, 3])
Values: 
tensor([[0.3881, 0.7579, 0.5920],
        [0.0592, 0.7366, 0.1417]])



  **Observation**:
  - Understanding the structure of your data is critical. In neural networks, tensors must have the correct shape for layers to process them properly (e.g., input and output shapes need to align with the layers’ expectations).
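
As a minimal sketch of what a shape mismatch looks like in practice, the example below feeds a correctly shaped and an incorrectly shaped batch to the same layer. The layer sizes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 4)   # expects the last input dimension to be 3

good = torch.rand(2, 3)   # last dimension matches in_features=3
print(layer(good).shape)  # torch.Size([2, 4])

bad = torch.rand(2, 5)    # last dimension is 5, not 3
mismatch_raised = False
try:
    layer(bad)
except RuntimeError as err:
    mismatch_raised = True
    print("Shape mismatch:", err)
```

Printing the shape with `describe()` before the failing call is usually the quickest way to locate the offending tensor.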

---



### 4. **Exercise**

- **Changing Batch Size and Input Dimensions**:
  - Explore how modifying the batch size or input dimensions affects the tensor shape. Varying these values shows how the same code handles datasets of different sizes.
  
  **Task**:
  - Change the batch size to 4 and input dimensions to 5, then print the new tensor shape and values:


In [6]:
batch_size = 4
input_dim = 5
x_input = torch.rand(batch_size, input_dim)  # Generate new random tensor
describe(x_input)


Type: torch.FloatTensor
Shape: torch.Size([4, 5])
Values: 
tensor([[0.5139, 0.5055, 0.2539, 0.4155, 0.4862],
        [0.9612, 0.8655, 0.6660, 0.2281, 0.9330],
        [0.5295, 0.7268, 0.6760, 0.9847, 0.6083],
        [0.5106, 0.7859, 0.8158, 0.4364, 0.4476]])



  **Observation**:
  - Increasing the batch size increases the number of samples being processed simultaneously. This change will also affect memory usage and how fast the network can train.

- **Creating Additional Tensors with Different Shapes**:
  - Experiment with creating tensors of different shapes, such as:
    - 1D tensor (vector).
    - 2D tensor (matrix).
    - 3D tensor (cube-like data for image processing).

  **Task**:
  - Create and describe different tensors:


In [7]:

# 1D tensor (vector)
x_vector = torch.rand(10)  # Creates a 1D tensor with 10 random values (uniformly sampled between 0 and 1)
describe(x_vector)  # Call the describe function to output type, shape, and values of the 1D tensor

# 2D tensor (matrix)
x_matrix = torch.rand(3, 4)  # Creates a 2D tensor with 3 rows and 4 columns (3x4 matrix) filled with random values
describe(x_matrix)  # Call the describe function to output type, shape, and values of the 2D tensor

# 3D tensor (for example, 2 samples with 3 channels each)
x_3d = torch.rand(2, 3, 5)  # Creates a 3D tensor: 2 samples, 3 channels, and 5 values per channel
describe(x_3d)  # Call the describe function to output type, shape, and values of the 3D tensor

Type: torch.FloatTensor
Shape: torch.Size([10])
Values: 
tensor([0.1835, 0.7427, 0.2901, 0.1761, 0.7132, 0.7392, 0.3549, 0.1595, 0.2869,
        0.8143])
Type: torch.FloatTensor
Shape: torch.Size([3, 4])
Values: 
tensor([[0.8827, 0.8895, 0.8898, 0.1312],
        [0.1597, 0.6499, 0.4837, 0.7159],
        [0.6974, 0.1937, 0.9633, 0.6133]])
Type: torch.FloatTensor
Shape: torch.Size([2, 3, 5])
Values: 
tensor([[[0.9016, 0.0044, 0.3400, 0.0975, 0.1876],
         [0.8151, 0.5927, 0.7756, 0.2288, 0.9963],
         [0.9532, 0.9311, 0.0167, 0.8669, 0.9576]],

        [[0.1763, 0.2425, 0.9991, 0.3652, 0.1233],
         [0.6588, 0.9498, 0.1700, 0.7415, 0.0975],
         [0.7149, 0.1856, 0.2802, 0.3512, 0.6118]]])



  **Observation**:
  - The shape of the tensor changes how data is represented. A 1D tensor might represent a simple list of numbers, while a 3D tensor could represent an image with multiple color channels.

---



### 5. **Conclusion**

- **Recap**:
  - Tensors are the foundation for data representation in neural networks. Understanding their shapes, sizes, and how they flow through the network is critical for successful model building.
  - **Batch size** controls how many data samples are processed at once, while **input dimensions** define how much information each sample carries.

- **Importance of Understanding Tensor Operations**:
  - Tensor operations like reshaping, slicing, and broadcasting are key for feeding data into the model, passing data through layers, and adjusting for mismatches.
  - Misaligned tensor shapes can cause errors, so it’s important to ensure that tensors are correctly sized before passing them through the network.

  **Demonstration**:
  - As a final demonstration, reshape the 3D tensor `x_3d` created in the exercise, a common operation in neural networks:


In [8]:
# Reshape a 3D tensor into a 2D tensor (flattening the last two dimensions)
x_reshaped = x_3d.view(2, -1)  # Reshape to (2, 15), combining the last two dimensions
describe(x_reshaped)


Type: torch.FloatTensor
Shape: torch.Size([2, 15])
Values: 
tensor([[0.9016, 0.0044, 0.3400, 0.0975, 0.1876, 0.8151, 0.5927, 0.7756, 0.2288,
         0.9963, 0.9532, 0.9311, 0.0167, 0.8669, 0.9576],
        [0.1763, 0.2425, 0.9991, 0.3652, 0.1233, 0.6588, 0.9498, 0.1700, 0.7415,
         0.0975, 0.7149, 0.1856, 0.2802, 0.3512, 0.6118]])



- **Takeaway**:
  - Mastering tensor operations is a fundamental skill in deep learning with PyTorch. Ensuring that your tensors are correctly sized and shaped will save you time debugging errors and allow for smooth model training.

---



## Observations


### 1. **Observation: Random Tensor Generation Reproducibility**
   - Every time you generate random tensors using `torch.rand()`, the output differs unless a random seed is set.
   - **Demonstration**:


In [9]:
torch.manual_seed(42)  # Set random seed for reproducibility
x_input = torch.rand(2, 3)  # Generate random tensor
print("Random tensor with seed 42:\n", x_input)

torch.manual_seed(42)  # Set the same seed again
x_input_again = torch.rand(2, 3)
print("Reproduced tensor with same seed:\n", x_input_again)


Random tensor with seed 42:
 tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])
Reproduced tensor with same seed:
 tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])
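
`torch.manual_seed()` sets the global random state. If only part of a program needs to be reproducible, one option is a dedicated `torch.Generator`, sketched below. The variable names are illustrative.

```python
import torch

# Two independent generators seeded identically
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(2, 3, generator=gen_a)
b = torch.rand(2, 3, generator=gen_b)
print(torch.equal(a, b))  # True: same seed, same sequence

# A second draw from gen_a continues its sequence instead of restarting it
c = torch.rand(2, 3, generator=gen_a)
print(c.shape)  # torch.Size([2, 3])
```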



### 2. **Observation: Effect of Batch Size on Memory Usage**
   - Larger batch sizes allow more data to be processed at once but increase memory usage.
   - **Demonstration**:


In [10]:
# Test with different batch sizes
batch_size_small = 2
batch_size_large = 64
input_dim = 100

x_input_small = torch.rand(batch_size_small, input_dim)
x_input_large = torch.rand(batch_size_large, input_dim)

print(f"Small batch size memory: {x_input_small.element_size() * x_input_small.nelement()} bytes")
print(f"Large batch size memory: {x_input_large.element_size() * x_input_large.nelement()} bytes")


Small batch size memory: 800 bytes
Large batch size memory: 25600 bytes
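
Memory also depends on the data type, not only on the shape. The sketch below reuses the same byte calculation for three floating-point types; `tensor_bytes` is a hypothetical helper introduced here, and the example assumes the default dtype has not been changed from `float32`.

```python
import torch

def tensor_bytes(t):
    """Bytes used by the tensor's elements (hypothetical helper)."""
    return t.element_size() * t.nelement()

x32 = torch.rand(64, 100)  # float32: 4 bytes per element
x64 = x32.double()         # float64: 8 bytes per element
x16 = x32.half()           # float16: 2 bytes per element

print(tensor_bytes(x32))  # 25600
print(tensor_bytes(x64))  # 51200
print(tensor_bytes(x16))  # 12800
```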



### 3. **Observation: Tensor Operations for Multi-Dimensional Data**
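   - Tensors can represent complex data. A single grayscale image is a 2D matrix `(height, width)` and a single RGB image is a 3D tensor `(channels, height, width)`. Image batches add a leading batch dimension, giving the 4D shapes `(batch, channels, height, width)` used below.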
   - **Demonstration**:


In [11]:
# Create tensors to represent different data types
rgb_image = torch.rand(1, 3, 32, 32)  # 1 image, 3 channels (RGB), 32x32 pixels
grayscale_image = torch.rand(1, 1, 32, 32)  # 1 image, 1 channel (grayscale), 32x32 pixels

print("RGB Image shape:", rgb_image.shape)
print("Grayscale Image shape:", grayscale_image.shape)


RGB Image shape: torch.Size([1, 3, 32, 32])
Grayscale Image shape: torch.Size([1, 1, 32, 32])
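
A feed-forward network expects one feature vector per sample, so image batches are usually flattened first. The following sketch assumes a batch of 8 RGB images of 32x32 pixels (sizes chosen for illustration).

```python
import torch

images = torch.rand(8, 3, 32, 32)   # (batch, channels, height, width)
flat = images.flatten(start_dim=1)  # keep the batch dimension, merge the rest

print(flat.shape)  # torch.Size([8, 3072])  since 3 * 32 * 32 = 3072
```

Here 3072 would be the `input_dim` of the first linear layer.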



### 4. **Observation: Understanding Tensor Broadcasting**
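   - Broadcasting lets tensors of different shapes be combined in element-wise operations: shapes are compared from the last dimension backward, and any dimension of size 1 (or a missing dimension) is treated as if it were repeated to match the other tensor.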
   - **Demonstration**:


In [13]:
a = torch.rand(3, 1)
b = torch.rand(1, 4)
broadcasted_result = a + b  # Broadcasting allows (3, 1) + (1, 4)
print("Result after broadcasting:\n", broadcasted_result)


Result after broadcasting:
 tensor([[1.2550, 1.2105, 1.1784, 1.0308],
        [1.1506, 1.1061, 1.0741, 0.9265],
        [1.1320, 1.0875, 1.0554, 0.9079]])
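
With fixed values instead of random ones, the effect of broadcasting is easier to follow. This sketch also shows that the result matches an explicit expansion, and that incompatible shapes raise an error.

```python
import torch

a = torch.tensor([[1.0], [2.0], [3.0]])       # shape (3, 1)
b = torch.tensor([[10.0, 20.0, 30.0, 40.0]])  # shape (1, 4)

result = a + b                              # broadcast to shape (3, 4)
explicit = a.expand(3, 4) + b.expand(3, 4)  # same computation, expanded by hand
print(result)
print(torch.equal(result, explicit))  # True

# Shapes (3, 2) and (3, 4) cannot be broadcast: 2 != 4 and neither is 1
broadcast_failed = False
try:
    torch.rand(3, 2) + torch.rand(3, 4)
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)  # True
```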



### 5. **Observation: Reshaping Tensors with `view()`**
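   - `view()` returns a tensor with a new shape that shares the same underlying data as the original (no copy is made), which is useful for adjusting input formats for layers in neural networks. It requires a memory layout compatible with the new shape, which holds for contiguous tensors such as the one below.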
   - **Demonstration**:


In [14]:
# Reshape 3D tensor into 2D tensor
x_3d = torch.rand(2, 3, 5)  # Shape: (2, 3, 5)
x_reshaped = x_3d.view(2, -1)  # Flatten last two dimensions
print("Original 3D tensor shape:", x_3d.shape)
print("Reshaped 2D tensor shape:", x_reshaped.shape)


Original 3D tensor shape: torch.Size([2, 3, 5])
Reshaped 2D tensor shape: torch.Size([2, 15])
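
Two properties of `view()` are worth seeing in code: the result shares memory with the original, and it fails on non-contiguous tensors (for example after a transpose), where `reshape()` can be used instead. A small sketch with integer values:

```python
import torch

x = torch.arange(6).view(2, 3)  # [[0, 1, 2], [3, 4, 5]]
v = x.view(3, 2)
v[0, 0] = 100                   # writes through to x: same underlying data
print(x[0, 0].item())           # 100

t = x.t()                       # transposed view, no longer contiguous
print(t.is_contiguous())        # False

view_failed = False
try:
    t.view(-1)
except RuntimeError:
    view_failed = True

flat = t.reshape(-1)            # reshape copies when a view is not possible
print(flat.tolist())            # [100, 3, 1, 4, 2, 5]
```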



### 6. **Observation: Effect of Input Dimension on Model Complexity**
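   - For a fully connected layer, the parameter count is `input_dim × output_dim` weights plus `output_dim` biases, so increasing the input dimension increases the number of parameters and makes the model more complex.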
   - **Demonstration**:


In [19]:
# Define input dimensions for two different networks
input_dim_small = 2
input_dim_large = 10
hidden_dim = 15

model_small = torch.nn.Linear(input_dim_small, hidden_dim)
model_large = torch.nn.Linear(input_dim_large, hidden_dim)

print("Number of parameters (small input):", sum(p.numel() for p in model_small.parameters()))
print("Number of parameters (large input):", sum(p.numel() for p in model_large.parameters()))


# In PyTorch, when you call .parameters() on a model (e.g., model_small.parameters()), it returns an iterator over all the learnable parameters (weights and biases) of the model.




Number of parameters (small input): 45
Number of parameters (large input): 165


In [20]:
# Print the model structure
print("Model structure:\n", model_small)

# Access the model's parameters
print("\nModel parameters:")
for param in model_small.parameters():
    print(param)


Model structure:
 Linear(in_features=2, out_features=15, bias=True)

Model parameters:
Parameter containing:
tensor([[ 0.5803,  0.4982],
        [ 0.3824,  0.5655],
        [ 0.5694,  0.4177],
        [ 0.5108, -0.0908],
        [ 0.1842, -0.6078],
        [-0.2534, -0.0661],
        [-0.0345,  0.6974],
        [ 0.5770,  0.2227],
        [ 0.6593, -0.3605],
        [ 0.1375, -0.5609],
        [-0.6015,  0.3781],
        [ 0.2270, -0.0381],
        [-0.4575,  0.5097],
        [-0.0542,  0.4827],
        [-0.4885,  0.5434]], requires_grad=True)
Parameter containing:
tensor([-0.0334, -0.2210, -0.0584, -0.5454, -0.1293,  0.0414, -0.5235,  0.3819,
         0.6680,  0.3646, -0.3473, -0.4394, -0.5604, -0.4359,  0.4221],
       requires_grad=True)
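
The two counts above can be checked by hand: a linear layer holds an `(out_features, in_features)` weight matrix plus one bias per output. The sketch below compares that formula with PyTorch's own count; `linear_param_count` is a hypothetical helper written for this example.

```python
import torch.nn as nn

def linear_param_count(in_features, out_features):
    """Weights plus biases of one fully connected layer (hypothetical helper)."""
    return in_features * out_features + out_features

counts = {}
for in_features in (2, 10):
    layer = nn.Linear(in_features, 15)
    actual = sum(p.numel() for p in layer.parameters())
    counts[in_features] = (actual, linear_param_count(in_features, 15))
    print(in_features, counts[in_features])
# 2 (45, 45)
# 10 (165, 165)
```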



### 7. **Observation: How Tensor Slicing Affects Data**
   - Tensor slicing allows you to access specific elements or rows/columns in a tensor, useful for selecting parts of a dataset.
   - **Demonstration**:


In [22]:
# Create a tensor and slice it
x = torch.rand(4, 5)  # 4 samples, each with 5 features
slice_1 = x[1, :]  # Access second row (second sample)
slice_2 = x[:, 2]  # Access third column (third feature across all samples)

print("Original tensor:\n", x)
print("Second sample (row 1):", slice_1)
print("Third feature (column 2):", slice_2)


Original tensor:
 tensor([[0.2096, 0.9536, 0.7689, 0.5212, 0.2507],
        [0.8813, 0.9706, 0.6002, 0.1777, 0.5172],
        [0.7436, 0.6460, 0.9499, 0.9503, 0.7533],
        [0.1522, 0.4730, 0.5968, 0.3332, 0.6496]])
Second sample (row 1): tensor([0.8813, 0.9706, 0.6002, 0.1777, 0.5172])
Third feature (column 2): tensor([0.7689, 0.6002, 0.9499, 0.5968])
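
Slices can also select ranges, and basic slices are views of the original tensor rather than copies. The sketch below uses sequential values so the effect is visible.

```python
import torch

x = torch.arange(20.0).view(4, 5)  # values 0..19 in 4 rows of 5

first_two = x[:2]                  # first two samples, shape (2, 5)
middle = x[:, 1:3]                 # features 1 and 2 of every sample, shape (4, 2)
print(first_two.shape, middle.shape)

row = x[1, :]                      # a view of the second row
row[0] = -1.0                      # modifies x as well
print(x[1, 0].item())              # -1.0
```

Call `.clone()` on a slice when an independent copy is needed.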



### 8. **Observation: Checking the Data Type of a Tensor**
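   - It's important to check that tensors have the data type an operation expects. For example, `torch.nn.Linear` creates `float32` parameters by default and raises an error when given `float64` input. Data type also matters on GPUs, where lower-precision types are often used to save memory.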
   - **Demonstration**:


In [23]:
x_input = torch.rand(2, 3)
print("Tensor type:", x_input.type())  # Default is usually float32

# Convert to another type (e.g., double)
x_double = x_input.double()
print("Converted to double type:", x_double.type())


Tensor type: torch.FloatTensor
Converted to double type: torch.DoubleTensor
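
The `.dtype` attribute is an alternative to `.type()` that returns a `torch.dtype` object instead of a string. The sketch below shows a few common cases, assuming PyTorch's default settings.

```python
import torch

x = torch.rand(2, 3)
print(x.dtype)      # torch.float32

ints = torch.tensor([1, 2, 3])
print(ints.dtype)   # torch.int64 (Python integers become 64-bit integers)

mixed = x + x.double()
print(mixed.dtype)  # torch.float64 (promoted to the wider type)

as_float = ints.to(torch.float32)  # explicit conversion before feeding a layer
print(as_float.dtype)  # torch.float32
```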



### 9. **Observation: Using `.unsqueeze()` to Add Dimensions**
   - `.unsqueeze()` adds dimensions to a tensor, often necessary when batch sizes or channel dimensions need to be included in input data.
   - **Demonstration**:


In [24]:
# Original tensor without a batch dimension
x = torch.rand(5)  # Shape: (5,)
x_unsqueezed = x.unsqueeze(0)  # Add batch dimension
print("Original shape:", x.shape)
print("After unsqueeze (with batch dimension):", x_unsqueezed.shape)


Original shape: torch.Size([5])
After unsqueeze (with batch dimension): torch.Size([1, 5])
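
The argument to `.unsqueeze()` decides where the new dimension goes. A brief sketch of the common positions, including the equivalent indexing form with `None`:

```python
import torch

x = torch.rand(5)

as_row = x.unsqueeze(0)      # new leading dimension: shape (1, 5)
as_column = x.unsqueeze(1)   # new trailing dimension: shape (5, 1)
as_last = x.unsqueeze(-1)    # negative index counts from the end: shape (5, 1)
via_index = x[None, :]       # indexing with None is equivalent to unsqueeze(0)

print(as_row.shape, as_column.shape, as_last.shape, via_index.shape)
```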



### 11. **Observation: Tensor Concatenation**
   - You can concatenate two tensors along a specified dimension to combine data from different sources.
   - **Demonstration**:


In [26]:
# Create two tensors
tensor1 = torch.rand(2, 3)
tensor2 = torch.rand(2, 3)

# Concatenate along dimension 0 (rows) and dimension 1 (columns)
concat_dim0 = torch.cat((tensor1, tensor2), dim=0)
concat_dim1 = torch.cat((tensor1, tensor2), dim=1)

print("Concatenated along rows (dim=0):\n", concat_dim0)
print("Concatenated along columns (dim=1):\n", concat_dim1)


Concatenated along rows (dim=0):
 tensor([[0.8429, 0.3490, 0.1078],
        [0.4759, 0.8217, 0.7287],
        [0.7685, 0.2914, 0.4680],
        [0.0355, 0.6551, 0.5548]])
Concatenated along columns (dim=1):
 tensor([[0.8429, 0.3490, 0.1078, 0.7685, 0.2914, 0.4680],
        [0.4759, 0.8217, 0.7287, 0.0355, 0.6551, 0.5548]])



### 12. **Observation: Tensor Stacking**
   - **Stacking** joins tensors of identical shape along a new axis, so the result has one more dimension than the inputs. It is often used to combine individual samples into a batch.
   - **Demonstration**:


In [27]:
# Create two tensors and stack them along a new dimension
tensor1 = torch.rand(2, 3)
tensor2 = torch.rand(2, 3)

stacked = torch.stack((tensor1, tensor2), dim=0)  # Stack along a new axis
print("Stacked tensor shape:", stacked.shape)


Stacked tensor shape: torch.Size([2, 2, 3])
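
One way to see the difference from concatenation: stacking gives the same result as adding a new dimension to each tensor and then concatenating along it. A sketch with constant tensors:

```python
import torch

t1 = torch.zeros(2, 3)
t2 = torch.ones(2, 3)

stacked = torch.stack((t1, t2), dim=0)                          # shape (2, 2, 3)
via_cat = torch.cat((t1.unsqueeze(0), t2.unsqueeze(0)), dim=0)  # same result
print(torch.equal(stacked, via_cat))  # True

# The new axis can be placed anywhere, here at the end
stacked_last = torch.stack((t1, t2), dim=2)
print(stacked_last.shape)  # torch.Size([2, 3, 2])
```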



### 13. **Observation: Tensor Transposition with `transpose()`**
   - You can **transpose** tensors to swap dimensions, which is essential when working with different data formats.
   - **Demonstration**:


In [28]:
# Create a 2D tensor and transpose its dimensions
tensor = torch.rand(3, 4)
transposed = tensor.transpose(0, 1)  # Swap dimensions

print("Original tensor shape:", tensor.shape)
print("Transposed tensor shape:", transposed.shape)


Original tensor shape: torch.Size([3, 4])
Transposed tensor shape: torch.Size([4, 3])
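
`transpose()` swaps exactly two dimensions. For tensors with more dimensions, `permute()` reorders all of them at once. The sketch below converts an image batch from channels-first to channels-last layout; the batch size and image size are illustrative.

```python
import torch

batch_nchw = torch.rand(8, 3, 32, 32)         # (batch, channels, height, width)
batch_nhwc = batch_nchw.permute(0, 2, 3, 1)   # (batch, height, width, channels)
print(batch_nhwc.shape)            # torch.Size([8, 32, 32, 3])

# permute returns a view with reordered strides, so it is not contiguous
print(batch_nhwc.is_contiguous())  # False
dense = batch_nhwc.contiguous()    # copy into a contiguous layout if needed
print(dense.is_contiguous())       # True
```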


### 14. **Observation: Squeezing Out Extra Dimensions**
   - The `.squeeze()` function removes dimensions of size 1 from a tensor, which is useful for reducing unnecessary dimensions.
   - **Demonstration**:


In [29]:
# Create a tensor with an extra dimension
tensor = torch.rand(1, 3, 1, 4)

squeezed = tensor.squeeze()  # Remove dimensions of size 1
print("Original shape:", tensor.shape)
print("Squeezed shape:", squeezed.shape)


Original shape: torch.Size([1, 3, 1, 4])
Squeezed shape: torch.Size([3, 4])
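
Calling `.squeeze()` without arguments removes every dimension of size 1, which can unintentionally drop a batch dimension when the batch holds a single sample. Passing a dimension index limits the effect, as this sketch shows:

```python
import torch

x = torch.rand(1, 3, 1, 4)

all_removed = x.squeeze()   # removes dims 0 and 2: shape (3, 4)
only_dim2 = x.squeeze(2)    # removes only dim 2: shape (1, 3, 4)
unchanged = x.squeeze(1)    # dim 1 has size 3, so nothing is removed

print(all_removed.shape, only_dim2.shape, unchanged.shape)
```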


### 15. **Observation: Expanding Dimensions with `.expand()`**
   - The `.expand()` function returns a view in which dimensions of size 1 appear repeated to a larger size. No data is copied: the expanded rows all refer to the same underlying memory.
   - **Demonstration**:


In [30]:
# Create a 1D tensor
tensor = torch.rand(3)

# Add a leading dimension of size 1, then expand it to 5 rows
expanded = tensor.unsqueeze(0).expand(5, 3)  # Shape (1, 3) -> (5, 3), no data copied
print("Expanded tensor shape:", expanded.shape)


Expanded tensor shape: torch.Size([5, 3])
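
To check that no data is copied, the sketch below compares `.expand()` with `.repeat()`, which produces the same values but allocates new memory.

```python
import torch

row = torch.tensor([1.0, 2.0, 3.0])

expanded = row.unsqueeze(0).expand(5, 3)  # view onto row's memory
repeated = row.unsqueeze(0).repeat(5, 1)  # real copy with 5 rows

print(torch.equal(expanded, repeated))           # True: same values
print(expanded.data_ptr() == row.data_ptr())     # True: shares storage with row
print(repeated.data_ptr() == row.data_ptr())     # False: separate storage
print(expanded.stride()[0])                      # 0: every row maps to the same memory
```

Because the rows of an expanded tensor alias each other, call `.clone()` or use `.repeat()` before writing to it in place.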


### 16. **Observation: Calculating Basic Statistics on Tensors**
   - PyTorch allows you to calculate basic statistics (mean, std, max, etc.) directly on tensors.
   - **Demonstration**:


In [31]:
tensor = torch.rand(10)

print("Mean:", tensor.mean())
print("Standard deviation:", tensor.std())
print("Max value:", tensor.max())
print("Min value:", tensor.min())


Mean: tensor(0.5568)
Standard deviation: tensor(0.3316)
Max value: tensor(0.9480)
Min value: tensor(0.0205)
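
For batched data, statistics are usually computed per feature or per sample by passing `dim`. The sketch below uses a small fixed tensor. It also notes that `std()` applies Bessel's correction (the sample standard deviation) by default.

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])  # 2 samples, 3 features

print(x.mean().item())  # 3.5 (over all elements, as a Python float)
print(x.mean(dim=0))    # tensor([2.5000, 3.5000, 4.5000])  per feature
print(x.mean(dim=1))    # tensor([2., 5.])                  per sample

values, indices = x.max(dim=1)  # per-sample maximum and its position
print(values, indices)          # tensor([3., 6.]) tensor([2, 2])

sample_std = x.std()                     # divides by N - 1 (default)
population_std = x.std(unbiased=False)   # divides by N
print(sample_std > population_std)       # tensor(True)
```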


### 17. **Observation: Normalizing Tensors**
   - Normalization rescales data to a common scale, typically either to the range [0, 1] (min-max scaling) or to zero mean and unit variance (standardization).
   - **Demonstration**:


In [32]:
# Normalize tensor to have values between 0 and 1
tensor = torch.rand(5, 5)
min_val = tensor.min()
max_val = tensor.max()

normalized_tensor = (tensor - min_val) / (max_val - min_val)
print("Original tensor:\n", tensor)
print("Normalized tensor:\n", normalized_tensor)


Original tensor:
 tensor([[0.7233, 0.6234, 0.0753, 0.9029, 0.7256],
        [0.1539, 0.4359, 0.1715, 0.9640, 0.0454],
        [0.6139, 0.7571, 0.8677, 0.0027, 0.6088],
        [0.3461, 0.4988, 0.0255, 0.4740, 0.4436],
        [0.8906, 0.6684, 0.2413, 0.3218, 0.5840]])
Normalized tensor:
 tensor([[0.7496, 0.6457, 0.0755, 0.9365, 0.7520],
        [0.1573, 0.4506, 0.1756, 1.0000, 0.0444],
        [0.6358, 0.7848, 0.8998, 0.0000, 0.6305],
        [0.3572, 0.5160, 0.0237, 0.4902, 0.4586],
        [0.9236, 0.6925, 0.2482, 0.3319, 0.6047]])
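
The demonstration above covers min-max scaling. A minimal sketch of the other variant, standardization to zero mean and unit variance per feature, might look as follows. The data is random and the sizes are illustrative.

```python
import torch

torch.manual_seed(0)
data = torch.rand(100, 5)  # 100 samples, 5 features

# Per-feature statistics; keepdim=True keeps shape (1, 5) for broadcasting
mean = data.mean(dim=0, keepdim=True)
std = data.std(dim=0, keepdim=True)

standardized = (data - mean) / std

print(standardized.mean(dim=0))  # each entry close to 0
print(standardized.std(dim=0))   # each entry close to 1
```

If a feature can be constant, its standard deviation is zero, so a small epsilon is often added to the denominator.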


### 18. **Observation: Element-Wise Multiplication of Tensors**
   - PyTorch supports element-wise operations such as multiplication, where corresponding elements in two tensors are multiplied.
   - **Demonstration**:


In [33]:
tensor1 = torch.rand(3, 3)
tensor2 = torch.rand(3, 3)

elementwise_product = tensor1 * tensor2  # Element-wise multiplication
print("Element-wise multiplication result:\n", elementwise_product)


Element-wise multiplication result:
 tensor([[0.2287, 0.5928, 0.1999],
        [0.1619, 0.1881, 0.0052],
        [0.5981, 0.3062, 0.4026]])
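
Element-wise multiplication (`*`) is easy to confuse with matrix multiplication (`@`), which is what linear layers compute. A sketch with small fixed matrices shows the difference:

```python
import torch

a = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0],
                  [7.0, 8.0]])

elementwise = a * b  # multiplies matching positions
matmul = a @ b       # rows of a dotted with columns of b

print(elementwise)   # [[5, 12], [21, 32]]
print(matmul)        # [[19, 22], [43, 50]]
```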


### 19. **Observation: Cloning a Tensor**
   - The `.clone()` function creates a copy of the tensor, which is independent of the original tensor.
   - **Demonstration**:



In [34]:
tensor = torch.rand(2, 3)

tensor_clone = tensor.clone()  # Create a copy of the tensor
tensor_clone[0, 0] = 999  # Modify the clone without affecting the original

print("Original tensor:\n", tensor)
print("Cloned and modified tensor:\n", tensor_clone)


Original tensor:
 tensor([[0.7962, 0.7779, 0.0420],
        [0.3859, 0.3467, 0.4566]])
Cloned and modified tensor:
 tensor([[9.9900e+02, 7.7787e-01, 4.1979e-02],
        [3.8589e-01, 3.4673e-01, 4.5663e-01]])
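
For contrast, plain assignment does not copy a tensor. It only creates a second name for the same object. The sketch below places both cases side by side.

```python
import torch

original = torch.zeros(2, 3)

alias = original          # second name for the same tensor
copy = original.clone()   # independent copy

alias[0, 0] = 1.0         # changes original as well
copy[0, 1] = 2.0          # changes only the copy

print(original[0, 0].item())  # 1.0
print(original[0, 1].item())  # 0.0
```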


### 20. **Observation: Moving Tensors Between Devices (CPU and GPU)**
   - Tensors can be moved between the CPU and GPU using `.to()` to utilize GPU acceleration when available.
   - **Demonstration**:


In [35]:
tensor = torch.rand(3, 3)

# Check if a GPU is available and move the tensor to GPU
if torch.cuda.is_available():
    tensor_gpu = tensor.to('cuda')
    print("Tensor moved to GPU:\n", tensor_gpu)
else:
    print("GPU not available, tensor remains on CPU.")


GPU not available, tensor remains on CPU.
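
A common pattern is to pick the device once and use it for both the model and the data, so the same code runs with or without a GPU. The following is a sketch of that pattern with an arbitrary small layer, not a prescribed setup.

```python
import torch
import torch.nn as nn

# Choose the device once; falls back to CPU when no GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = nn.Linear(3, 2).to(device)    # move the parameters
x = torch.rand(4, 3, device=device)   # create the input directly on the device

y = layer(x)                          # inputs and parameters share a device
print(y.device.type == device.type)   # True

y_cpu = y.detach().cpu()              # bring results back, e.g. for NumPy or printing
print(y_cpu.shape)                    # torch.Size([4, 2])
```

Operations that mix tensors from different devices raise an error, so the model and every input should be moved to the same device.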
