<a href="https://colab.research.google.com/github/tnzmnjm/christmas-coding-challenge-2024/blob/main/notebooks/Day%203/Pytorch_day3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
import torch

**A.2.3 Common PyTorch tensor operations**

Comprehensive coverage of all the different PyTorch tensor operations and commands is outside the scope of this book. However, I will briefly describe relevant operations as we introduce them throughout the book.

We have already introduced the `torch.tensor()` function to create new tensors:

In [3]:
tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(tensor2d)
print(tensor2d.shape)

tensor([[1, 2, 3],
        [4, 5, 6]])
torch.Size([2, 3])


Reshaping tensors is an essential operation in machine learning and numerical computing because it lets you adapt data to the shape a specific algorithm or model expects without changing the underlying values. I'm learning PyTorch while following the book Build a Large Language Model (From Scratch), and reshaping comes up often there because different steps in building and training a model (especially in deep learning) require tensors in specific shapes.


As you can see, `.shape` returns `torch.Size([2, 3])`, meaning the tensor has two rows and three columns. To reshape the tensor into a `3 × 2` tensor, we can use the `.reshape` method:

In [4]:
print(tensor2d.reshape(3, 2))

tensor([[1, 2],
        [3, 4],
        [5, 6]])


However, note that the more common command for reshaping tensors in PyTorch is `.view()`:

In [5]:
print(tensor2d.view(3, 2))

tensor([[1, 2],
        [3, 4],
        [5, 6]])


As with `.reshape` and `.view`, PyTorch often offers multiple syntax options for the same computation. PyTorch initially followed the original `Lua Torch`¹ syntax convention but then, by popular request, added syntax to make it similar to `NumPy`. (The subtle difference between `.view()` and `.reshape()` lies in how they handle memory layout: `.view()` never copies, so it only works when the requested shape is compatible with the tensor's existing memory layout, which is always the case for a contiguous tensor, and it raises an error otherwise. `.reshape()` works regardless, returning a view when it can and copying the data when it must.)
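To make the difference concrete, here is a minimal sketch. It assumes a non-contiguous tensor produced by slicing columns off `tensor2d`; the names `first_two_cols`, `flat`, and `view_failed` are only illustrative.

```python
import torch

tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Slicing off the last column returns a view that skips elements in
# memory, so it is no longer contiguous.
first_two_cols = tensor2d[:, :2]
print(first_two_cols.is_contiguous())  # False

# .reshape() succeeds; here it has to copy the data.
flat = first_two_cols.reshape(4)
print(flat)  # tensor([1, 2, 4, 5])

# .view() never copies, so it raises for this layout.
view_failed = False
try:
    first_two_cols.view(4)
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# After an explicit copy with .contiguous(), .view() works again.
print(first_two_cols.contiguous().view(4))  # tensor([1, 2, 4, 5])
```

In practice this means `.reshape()` is the safer default, while `.view()` is useful when you want a guarantee that no copy is made.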

**Transpose a Tensor**

Transposing reorders the dimensions (axes) of a tensor. For example:
- A 2D tensor (matrix) with shape (rows, columns) becomes (columns, rows).
- For higher-dimensional tensors, specific axes are swapped.

We may need to transpose tensors for various reasons, most commonly to align the dimensions of two tensors for matrix multiplication. For example, in an LLM, we might have:

- Input sequence embeddings with shape `(batch_size, sequence_length, embedding_dim)`.
- A weight matrix stored with shape `(hidden_dim, embedding_dim)`, which is the layout `torch.nn.Linear` uses for its weights.

To multiply the two, the inner dimensions must match, so we transpose the weight matrix to `(embedding_dim, hidden_dim)`.
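The shape bookkeeping can be sketched as follows. This is only an illustration: the sizes are made up, the tensors are random, and it assumes the weight is stored as `(hidden_dim, embedding_dim)`, the layout `torch.nn.Linear` uses.

```python
import torch

torch.manual_seed(0)

# Illustrative sizes, chosen only to make the shapes easy to tell apart
batch_size, sequence_length, embedding_dim, hidden_dim = 2, 4, 8, 16

embeddings = torch.randn(batch_size, sequence_length, embedding_dim)
# Assumed storage layout: (hidden_dim, embedding_dim)
weight = torch.randn(hidden_dim, embedding_dim)

# embeddings @ weight would raise an error because the inner
# dimensions (8 and 16) do not match. Transposing the weight gives
# shape (embedding_dim, hidden_dim), so the inner dimensions line up.
projected = embeddings @ weight.T
print(projected.shape)  # torch.Size([2, 4, 16])
```

The batch and sequence dimensions pass through unchanged; only the last dimension goes from `embedding_dim` to `hidden_dim`.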


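We can use `.T` to transpose a tensor, which means flipping it across its diagonal. Note that the result has the same shape as `tensor2d.reshape(3, 2)`, but it is not the same operation: transposing turns rows into columns, so the elements end up in a different order, as the following result shows: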

In [8]:
print(tensor2d.shape)
print(tensor2d.T)

torch.Size([2, 3])
tensor([[1, 4],
        [2, 5],
        [3, 6]])
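As a quick check of how transposing differs from reshaping, and of what "swapping specific axes" looks like for a higher-dimensional tensor, here is a small sketch. The 3D tensor `tensor3d` is a made-up example, and `.transpose(dim0, dim1)` is used because it lets us name the two axes to swap.

```python
import torch

tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])

reshaped = tensor2d.reshape(3, 2)  # keeps the row-by-row reading order
transposed = tensor2d.T            # turns rows into columns

print(reshaped.shape == transposed.shape)  # True
print(torch.equal(reshaped, transposed))   # False

# For a 3D tensor, swap only the last two axes
tensor3d = torch.arange(24).reshape(2, 3, 4)
swapped = tensor3d.transpose(1, 2)
print(swapped.shape)  # torch.Size([2, 4, 3])
```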


**Matrix Multiplication**

I recommend watching these two videos from the 3Blue1Brown YouTube channel:
  - [Linear transformations and matrices](https://www.youtube.com/watch?v=kYB8IZa5AuE&ab_channel=3Blue1Brown)
  - [Matrix multiplication as composition](https://www.youtube.com/watch?v=XkY2DOUCWMU&ab_channel=3Blue1Brown)

Some notes from the videos:
  - Multiplying two matrices composes their transformations: the product `AB` applies `B` first and then `A`.
  - In matrix multiplication, order matters (`AB != BA` in general).
  - Matrix multiplication is associative (`(AB)C = A(BC)`).

The common way to multiply two matrices in PyTorch is the `.matmul` method:

In [9]:
print(tensor2d.matmul(tensor2d.T))

tensor([[14, 32],
        [32, 77]])


1. `Lua Torch` syntax: Lua Torch was a machine learning framework for the Lua language with a compact and expressive syntax, especially for defining neural networks and performing tensor operations. Key features such as GPU support, flexibility, and modular design influenced the development of PyTorch. While Lua Torch is largely obsolete, its core ideas live on in modern frameworks like PyTorch.

In [10]:
print(tensor2d)


tensor([[1, 2, 3],
        [4, 5, 6]])


However, we can also adopt the `@` operator, which accomplishes the same thing more compactly:

In [11]:
print(tensor2d @ tensor2d.T)

tensor([[14, 32],
        [32, 77]])
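The notes from the videos about order and associativity can be checked with the same operator. This is a minimal sketch using three made-up 2 × 2 integer matrices `A`, `B`, and `C`; integers keep the comparison exact, whereas floating-point results can differ by small rounding errors.

```python
import torch

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[0, 1], [1, 0]])
C = torch.tensor([[2, 0], [0, 3]])

# Order matters: A @ B and B @ A give different results
print(A @ B)  # tensor([[2, 1], [4, 3]])
print(B @ A)  # tensor([[3, 4], [1, 2]])
print(torch.equal(A @ B, B @ A))  # False

# Grouping does not matter: (AB)C equals A(BC)
print(torch.equal((A @ B) @ C, A @ (B @ C)))  # True
```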


If you would like to browse all the tensor operations available in `PyTorch`, you can check out the official documentation at https://pytorch.org/docs/stable/tensors.html.