# PyTorch Matrix Operations Tutorial

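This notebook walks through PyTorch functions for building and slicing triangular and diagonal matrices: `torch.tril`, `torch.triu`, `torch.diag`, `torch.diagflat`, and `torch.eye`. It ends with a practical use: building masks for attention.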

In [1]:
import torch

## 1. Basic Matrix Creation

First, let's create a sample matrix to work with:

In [4]:
# Create a sample matrix
x = torch.tensor([[1, 2, 3, 4],
                  [4, 5, 6, 5],
                  [7, 8, 9, 6],
                  [10, 11, 12, 7]])
print("Original matrix:\n", x)

Original matrix:
 tensor([[ 1,  2,  3,  4],
        [ 4,  5,  6,  5],
        [ 7,  8,  9,  6],
        [10, 11, 12,  7]])


## 2. Lower Triangular Matrix (torch.tril)

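`torch.tril` returns the lower triangular part of a matrix: elements above the chosen diagonal are set to zero. The `diagonal` argument selects that diagonal. `0` (the default) is the main diagonal, positive values keep that many extra diagonals above it, and negative values exclude the main diagonal and, as the value decreases, more diagonals below it.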

In [7]:
# Get lower triangular matrix
lower = torch.tril(x)
print("Lower triangular:\n", lower)

# With different diagonal parameters
print("\ntril with diagonal=0 (default):\n", torch.tril(x, diagonal=0))
print("\ntril with diagonal=1:\n", torch.tril(x, diagonal=1))
print("\ntril with diagonal=2:\n", torch.tril(x, diagonal=2))
print("\ntril with diagonal=-1:\n", torch.tril(x, diagonal=-1))
print("\ntril with diagonal=-2:\n", torch.tril(x, diagonal=-2))

Lower triangular:
 tensor([[ 1,  0,  0,  0],
        [ 4,  5,  0,  0],
        [ 7,  8,  9,  0],
        [10, 11, 12,  7]])

tril with diagonal=0 (default):
 tensor([[ 1,  0,  0,  0],
        [ 4,  5,  0,  0],
        [ 7,  8,  9,  0],
        [10, 11, 12,  7]])

tril with diagonal=1:
 tensor([[ 1,  2,  0,  0],
        [ 4,  5,  6,  0],
        [ 7,  8,  9,  6],
        [10, 11, 12,  7]])

tril with diagonal=2:
 tensor([[ 1,  2,  3,  0],
        [ 4,  5,  6,  5],
        [ 7,  8,  9,  6],
        [10, 11, 12,  7]])

tril with diagonal=-1:
 tensor([[ 0,  0,  0,  0],
        [ 4,  0,  0,  0],
        [ 7,  8,  0,  0],
        [10, 11, 12,  0]])

tril with diagonal=-2:
 tensor([[ 0,  0,  0,  0],
        [ 0,  0,  0,  0],
        [ 7,  0,  0,  0],
        [10, 11,  0,  0]])
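One way to read the `diagonal` argument: `torch.tril(x, diagonal=d)` keeps the element at row `i`, column `j` when `j - i <= d`. The sketch below rebuilds that rule from index tensors and checks it against `torch.tril` for the offsets used above. The helper name `tril_by_index` is made up for this illustration and is not part of PyTorch.

```python
import torch

x = torch.tensor([[1, 2, 3, 4],
                  [4, 5, 6, 5],
                  [7, 8, 9, 6],
                  [10, 11, 12, 7]])

def tril_by_index(m, diagonal=0):
    """Hypothetical helper: keep m[i, j] only where j - i <= diagonal."""
    rows = torch.arange(m.shape[0]).unsqueeze(1)  # column vector of row indices
    cols = torch.arange(m.shape[1]).unsqueeze(0)  # row vector of column indices
    keep = (cols - rows) <= diagonal              # boolean mask via broadcasting
    return torch.where(keep, m, torch.zeros_like(m))

# Compare the index rule with torch.tril for each offset used in this section
all_match = all(
    torch.equal(tril_by_index(x, d), torch.tril(x, diagonal=d))
    for d in (-2, -1, 0, 1, 2)
)
print(all_match)  # True
```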


## 3. Upper Triangular Matrix (torch.triu)

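`torch.triu` returns the upper triangular part of a matrix: elements below the chosen diagonal are set to zero. The `diagonal` argument works as in `torch.tril`, but in the opposite direction. `0` (the default) keeps the main diagonal and everything above it, positive values exclude the main diagonal and then further diagonals above it, and negative values keep that many extra diagonals below it.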

In [8]:
# Get upper triangular matrix
upper = torch.triu(x)
print("Upper triangular:\n", upper)

# With different diagonal parameters
print("\ntriu with diagonal=0 (default):\n", torch.triu(x, diagonal=0))
print("\ntriu with diagonal=1:\n", torch.triu(x, diagonal=1))
print("\ntriu with diagonal=-1:\n", torch.triu(x, diagonal=-1))

Upper triangular:
 tensor([[1, 2, 3, 4],
        [0, 5, 6, 5],
        [0, 0, 9, 6],
        [0, 0, 0, 7]])

triu with diagonal=0 (default):
 tensor([[1, 2, 3, 4],
        [0, 5, 6, 5],
        [0, 0, 9, 6],
        [0, 0, 0, 7]])

triu with diagonal=1:
 tensor([[0, 2, 3, 4],
        [0, 0, 6, 5],
        [0, 0, 0, 6],
        [0, 0, 0, 0]])

triu with diagonal=-1:
 tensor([[ 1,  2,  3,  4],
        [ 4,  5,  6,  5],
        [ 0,  8,  9,  6],
        [ 0,  0, 12,  7]])
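Two relationships between `torch.triu` and `torch.tril` follow from the definitions and can help when checking offsets. The sketch below uses the same sample matrix: the strictly lower part plus the upper part reconstructs the original, and `triu` with offset `d` matches `tril` on the transpose with offset `-d`.

```python
import torch

x = torch.tensor([[1, 2, 3, 4],
                  [4, 5, 6, 5],
                  [7, 8, 9, 6],
                  [10, 11, 12, 7]])

# The strictly lower part (diagonal=-1) and the upper part (diagonal=0)
# do not overlap, so their sum restores the original matrix.
strict_lower = torch.tril(x, diagonal=-1)
upper = torch.triu(x, diagonal=0)
recombined = strict_lower + upper
print(torch.equal(recombined, x))  # True

# triu with offset d equals tril of the transpose with offset -d, transposed back.
via_transpose = torch.tril(x.T, diagonal=-1).T
print(torch.equal(via_transpose, torch.triu(x, diagonal=1)))  # True
```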


## 4. Diagonal Operations (torch.diag)

`torch.diag` works in two directions. Given a 2-D tensor, it returns the diagonal as a 1-D tensor. Given a 1-D tensor, it builds a square matrix with those values on the diagonal and zeros elsewhere.

In [9]:
# Extract diagonal elements
diagonal = torch.diag(x)
print("Diagonal elements:", diagonal)

# Create matrix from diagonal
new_matrix = torch.diag(torch.tensor([1, 2, 3]))
print("\nNew matrix from diagonal:\n", new_matrix)

Diagonal elements: tensor([1, 5, 9, 7])

New matrix from diagonal:
 tensor([[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]])
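`torch.diag` also accepts a `diagonal` offset, which the cell above leaves at its default of `0`. As a short sketch on the same sample matrix: a positive offset reads (or writes) a diagonal above the main one, a negative offset one below it.

```python
import torch

x = torch.tensor([[1, 2, 3, 4],
                  [4, 5, 6, 5],
                  [7, 8, 9, 6],
                  [10, 11, 12, 7]])

# Extract the first diagonal above and below the main diagonal
super_diag = torch.diag(x, diagonal=1)
sub_diag = torch.diag(x, diagonal=-1)
print(super_diag.tolist())  # [2, 6, 6]
print(sub_diag.tolist())    # [4, 8, 12]

# Placing 3 values on the first superdiagonal needs a 4x4 matrix
shifted = torch.diag(torch.tensor([1, 2, 3]), diagonal=1)
print(shifted.tolist())
# [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3], [0, 0, 0, 0]]

# Applying diag twice keeps only the main diagonal of x
only_diag = torch.diag(torch.diag(x))
print(only_diag.tolist())
# [[1, 0, 0, 0], [0, 5, 0, 0], [0, 0, 9, 0], [0, 0, 0, 7]]
```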


## 5. Flattened Diagonal (torch.diagflat)

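`torch.diagflat` builds a square matrix with the input values on its diagonal. For a 1-D input it gives the same result as `torch.diag`. For an input with more dimensions, it flattens the input first instead of extracting a diagonal.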

In [10]:
# Create matrix with flattened diagonal
flat_diag = torch.diagflat(torch.tensor([1, 2, 3]))
print("Flattened diagonal matrix:\n", flat_diag)

Flattened diagonal matrix:
 tensor([[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]])
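The difference from `torch.diag` only shows up with a 2-D input, which the cell above does not cover. A minimal sketch, using a small 2x2 tensor chosen for this illustration:

```python
import torch

m = torch.tensor([[1, 2],
                  [3, 4]])

# diag on a 2-D input extracts the main diagonal
extracted = torch.diag(m)
print(extracted.tolist())  # [1, 4]

# diagflat flattens the input to [1, 2, 3, 4] and puts it on a 4x4 diagonal
flattened = torch.diagflat(m)
print(flattened.tolist())
# [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]

# For a 1-D input the two functions agree
v = torch.tensor([1, 2, 3])
same_for_1d = torch.equal(torch.diag(v), torch.diagflat(v))
print(same_for_1d)  # True

# diagflat names its shift argument `offset` rather than `diagonal`
offset_shape = tuple(torch.diagflat(v, offset=1).shape)
print(offset_shape)  # (4, 4)
```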


## 6. Identity Matrix (torch.eye)

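`torch.eye` creates an identity matrix of the specified size: ones on the main diagonal and zeros elsewhere. Unless a `dtype` is passed, the result uses the default floating-point type, which is why the output below shows `1.` and `0.` rather than integers.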

In [11]:
# Create identity matrix
identity = torch.eye(3)
print("Identity matrix:\n", identity)

Identity matrix:
 tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
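`torch.eye` takes a few more arguments than the cell above uses. The sketch below shows a rectangular result, an explicit integer `dtype`, and the defining property that multiplying by the identity leaves a matrix unchanged. The small matrix `a` is made up for the illustration.

```python
import torch

# A second size argument gives a rectangular matrix with ones on the main diagonal
rect = torch.eye(2, 3)
print(rect.tolist())  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Request integers instead of the default floating-point type
int_eye = torch.eye(3, dtype=torch.int64)
print(int_eye.dtype)  # torch.int64

# Multiplying by the identity returns the same matrix
a = torch.tensor([[1., 2.],
                  [3., 4.]])
unchanged = torch.equal(a @ torch.eye(2), a)
print(unchanged)  # True
```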


## 7. Practical Example: Creating Attention Masks

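These operations are commonly used in attention mechanisms to create masks. Applying `torch.triu` with `diagonal=1` to a matrix of ones gives a causal mask: a 1 at row `i`, column `j` typically marks a future position (`j > i`) that position `i` should not attend to. Nesting `torch.triu` inside `torch.tril` keeps only a band around the main diagonal.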

In [12]:
# Create causal mask for attention
context_length = 5
mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)
print("Causal mask:\n", mask)

# Create banded matrix
banded = torch.tril(torch.triu(x, diagonal=-1), diagonal=1)
print("\nBanded matrix:\n", banded)

Causal mask:
 tensor([[0., 1., 1., 1., 1.],
        [0., 0., 1., 1., 1.],
        [0., 0., 0., 1., 1.],
        [0., 0., 0., 0., 1.],
        [0., 0., 0., 0., 0.]])

Banded matrix:
 tensor([[ 1,  2,  0,  0],
        [ 4,  5,  6,  0],
        [ 0,  8,  9,  6],
        [ 0,  0, 12,  7]])
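The cell above builds the causal mask but does not apply it. One common way to use such a mask, sketched here under assumptions, is to set the masked scores to negative infinity before the softmax so that future positions receive zero weight. The random `scores` tensor is a stand-in for real query-key attention scores and is not from the original notebook.

```python
import torch

torch.manual_seed(0)  # fixed seed so the stand-in scores are reproducible

context_length = 5
# Stand-in attention scores (assumption: real ones would come from queries @ keys.T)
scores = torch.randn(context_length, context_length)

# Same causal mask as above: 1 marks a future position
mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)

# Replace masked scores with -inf, then normalize each row with softmax
masked_scores = scores.masked_fill(mask.bool(), float("-inf"))
weights = torch.softmax(masked_scores, dim=-1)
print(weights)

# Each row still sums to 1, and no weight lands above the main diagonal
rows_sum_to_one = torch.allclose(weights.sum(dim=-1), torch.ones(context_length))
no_future_weight = torch.equal(torch.triu(weights, diagonal=1),
                               torch.zeros(context_length, context_length))
print(rows_sum_to_one, no_future_weight)  # True True
```

Every row keeps at least its diagonal entry unmasked, so the softmax never sees a row made up entirely of `-inf`.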
