This is a minimal tutorial for our lab. It covers only basic functions and the functions we will use in the lab. If you are interested in more tensor operations and PyTorch's design concepts, see the [official tutorial](https://pytorch.org/tutorials/beginner/introyt.html).

In [None]:
import torch

## Tensor
A tensor is the basic computing object in PyTorch.   
It can be a scalar, a vector, a matrix, or an array with more dimensions.
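
As a quick illustration (a minimal sketch; the variable names are ours, not part of the lab code), the number of dimensions is what distinguishes a scalar, a vector, and a matrix:

```python
import torch

scalar = torch.tensor(3.0)                       # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])           # 1 dimension
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # 2 dimensions
cube = torch.zeros(2, 3, 4)                      # 3 dimensions

# .dim() gives the number of dimensions, .shape the size along each one
for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("cube", cube)]:
    print(name, t.dim(), tuple(t.shape))
```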

## Constructing a tensor

In [None]:
# construct a tensor with random numbers drawn uniformly from [0, 1)
tensor_a = torch.rand(2,5) # this will create a tensor/matrix with shape 2*5
tensor_b = torch.rand(5,2)

print(tensor_a, tensor_b)
print(tensor_a.shape, tensor_b.shape) # .shape tells you the shape of the tensor

tensor([[0.8183, 0.5711, 0.0145, 0.7580, 0.4126],
        [0.2131, 0.8202, 0.0451, 0.2402, 0.0909]]) tensor([[0.2436, 0.9331],
        [0.7286, 0.4861],
        [0.5188, 0.5906],
        [0.2985, 0.6062],
        [0.8484, 0.2746]])
torch.Size([2, 5]) torch.Size([5, 2])


In [None]:
# construct a tensor with all zeros
tensor_a = torch.zeros(2,5)
# construct a tensor with all ones
tensor_b = torch.ones(2,5)

tensor_a, tensor_b

(tensor([[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]),
 tensor([[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]))

In [None]:
# construct a tensor from a (nested) list
number_list = [[1, 2, 3],
               [4, 5, 6]]
tensor_a = torch.tensor(number_list)

tensor_a

tensor([[1, 2, 3],
        [4, 5, 6]])
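
Note that the output above has no decimal points: `torch.tensor` infers the data type from the list contents. The sketch below (variable names are illustrative) shows the inferred types and how to request one explicitly with the `dtype` argument, assuming PyTorch's default settings:

```python
import torch

ints = torch.tensor([[1, 2, 3], [4, 5, 6]])               # Python ints -> int64
floats = torch.tensor([[1.0, 2.0, 3.0]])                  # Python floats -> float32 by default
forced = torch.tensor([[1, 2, 3]], dtype=torch.float32)   # explicit dtype overrides inference

print(ints.dtype, floats.dtype, forced.dtype)
```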

In [None]:
# a tensor has a data type (dtype); you can manually cast it to another data type with ".to"
# the default data type for floating-point tensors is float32
tensor_a = torch.rand(2,5)
print(tensor_a)
tensor_a = tensor_a.to(torch.float64)
print(tensor_a)
tensor_a = tensor_a.to(torch.int8) # the fractional part is dropped, so values in [0, 1) become 0
print(tensor_a)
# you can also use ".to" to move tensor to another device ('cuda', 'cpu')
tensor_a = tensor_a.to('cpu') # modify this to 'cuda' if gpu is available
print(tensor_a)

tensor([[0.5005, 0.5656, 0.2767, 0.4751, 0.2111],
        [0.4641, 0.1607, 0.0685, 0.0403, 0.5200]])
tensor([[0.5005, 0.5656, 0.2767, 0.4751, 0.2111],
        [0.4641, 0.1607, 0.0685, 0.0403, 0.5200]], dtype=torch.float64)
tensor([[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]], dtype=torch.int8)
tensor([[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]], dtype=torch.int8)
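
The all-zero result above comes from the float-to-integer cast dropping the fractional part. A minimal sketch of that behaviour, together with one common way to choose the device (the names `values`, `as_int`, `device`, and `on_device` are ours):

```python
import torch

# Casting float -> integer truncates toward zero.
values = torch.tensor([0.9, 1.7, -1.2])
as_int = values.to(torch.int8)
print(as_int.tolist())  # [0, 1, -1]

# Use the GPU only if one is available, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
on_device = values.to(device)
print(on_device.device)
```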


In [None]:
# you can also get scalar/list from a tensor
tensor_a = torch.rand(2,5)
print(tensor_a)
# get scalar with ".item()"
print(tensor_a[0, 0].item())
# get list with ".tolist()"
print(tensor_a[0].tolist())

tensor([[0.0488, 0.3261, 0.5420, 0.6538, 0.3788],
        [0.3946, 0.3052, 0.2000, 0.6714, 0.6928]])
0.048793256282806396
[0.048793256282806396, 0.3260539174079895, 0.5420272350311279, 0.6538013815879822, 0.378842830657959]


In [None]:
# construct a tensor with given shape and filled with a given number
full_matrix = torch.full((2,3),42) # number: 42, shape: 2*3

full_matrix

tensor([[42, 42, 42],
        [42, 42, 42]])

## Matrix multiplication

[torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html)

In [None]:
a = [[1, 1],
     [2, 3]]
b = [[1, 2],
     [3, 3]]
tensor_a = torch.tensor(a)
tensor_b = torch.tensor(b)
print(torch.matmul(tensor_a,tensor_b))

tensor([[ 4,  5],
        [11, 13]])
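
For reference, a small sketch of the shape rule for 2-D inputs and of the `@` operator, which is shorthand for `torch.matmul`. The tensors here are random and the variable names are illustrative:

```python
import torch

a = torch.rand(2, 5)
b = torch.rand(5, 2)

product = torch.matmul(a, b)  # (2, 5) x (5, 2) -> (2, 2); inner sizes must match
same = a @ b                  # the @ operator performs the same matrix multiplication
elementwise = a * a           # "*" is element-wise, so the shape stays (2, 5)

print(product.shape, elementwise.shape)
print(torch.allclose(product, same))
```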


## Outer product
Compute the outer product of two vectors:
[torch.outer](https://pytorch.org/docs/stable/generated/torch.outer.html)

In [None]:
tensor_c = torch.arange(1,5)
tensor_d = torch.arange(3,7)

out_product = torch.outer(tensor_c,tensor_d)

tensor_c, tensor_d, out_product

(tensor([1, 2, 3, 4]),
 tensor([3, 4, 5, 6]),
 tensor([[ 3,  4,  5,  6],
         [ 6,  8, 10, 12],
         [ 9, 12, 15, 18],
         [12, 16, 20, 24]]))
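
Entry `[i, j]` of the outer product is `tensor_c[i] * tensor_d[j]`. As a sketch of what that means, the same table can be built with broadcasting (the names `c`, `d`, `outer`, and `by_broadcast` are ours):

```python
import torch

c = torch.arange(1, 5)
d = torch.arange(3, 7)

outer = torch.outer(c, d)
# A (4, 1) column times a (1, 4) row broadcasts to the same (4, 4) table.
by_broadcast = c[:, None] * d[None, :]

print(torch.equal(outer, by_broadcast))  # True
```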

## Extract the upper triangle of a matrix

[torch.triu](https://pytorch.org/docs/stable/generated/torch.triu.html)

In [None]:
original_matrix = torch.rand(5,5)
print(original_matrix)

upper_triangle = torch.triu(original_matrix)
print(upper_triangle)

tensor([[0.3469, 0.5627, 0.8815, 0.6117, 0.9736],
        [0.1828, 0.8382, 0.3507, 0.3997, 0.1586],
        [0.9514, 0.8613, 0.2656, 0.3286, 0.6707],
        [0.6839, 0.0701, 0.4457, 0.1063, 0.4885],
        [0.9282, 0.6690, 0.6848, 0.8794, 0.5329]])
tensor([[0.3469, 0.5627, 0.8815, 0.6117, 0.9736],
        [0.0000, 0.8382, 0.3507, 0.3997, 0.1586],
        [0.0000, 0.0000, 0.2656, 0.3286, 0.6707],
        [0.0000, 0.0000, 0.0000, 0.1063, 0.4885],
        [0.0000, 0.0000, 0.0000, 0.0000, 0.5329]])
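
`torch.triu` also accepts a `diagonal` argument, and `torch.tril` is its lower-triangle counterpart. A minimal sketch on a small integer matrix (the matrix `m` is an illustrative example, not lab data):

```python
import torch

m = torch.arange(1, 10).reshape(3, 3)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

upper = torch.triu(m)                     # keeps the main diagonal and everything above it
strict_upper = torch.triu(m, diagonal=1)  # also zeroes the main diagonal
lower = torch.tril(m)                     # keeps the main diagonal and everything below it

print(upper)
print(strict_upper)
print(lower)
```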


## Softmax

Softmax is calculated by

![](https://drive.google.com/uc?id=1qEJAgU3s4QLN-ELv5PjlE0lzlA7wT9uU)

Softmax replaces each value with its exponential divided by the sum of exponentials along a given dimension, so the outputs along that dimension are positive and sum to 1.

Simply put, the process is as follows:

1. Take the exponential of each element in the matrix.
2. Sum the exponentials along the given dimension.
3. Divide each exponential from step 1 by the corresponding sum from step 2.

In [None]:
# Let's demonstrate how to derive softmax manually
# Here, we calculate softmax for the first row
Row = upper_triangle[0]
Sum_of_this_row = torch.sum(torch.exp(Row))
print("Sum of exponentials of the first row: ", Sum_of_this_row)
for i in range(upper_triangle.size(1)):
  print(f"original value: {Row[i]}")
  print(f"exponential value: {torch.exp(Row[i])}")
  print(f"elements divided by sum of exponentials: {torch.exp(Row[i])/Sum_of_this_row}")


# Built-in softmax over dim=1 (each row sums to 1)
softmax_1 = torch.nn.functional.softmax(upper_triangle,dim=1)
print(softmax_1) # the first row matches the manual computation above

Sum of exponentials of the first row:  tensor(10.0754)
original value: 0.346870481967926
exponential value: 1.4146335124969482
elements divided by sum of exponentials: 0.1404043287038803
original value: 0.5627419352531433
exponential value: 1.755479335784912
elements divided by sum of exponentials: 0.17423374950885773
original value: 0.8814697861671448
exponential value: 2.4144458770751953
elements divided by sum of exponentials: 0.239637091755867
original value: 0.6116546392440796
exponential value: 1.8434791564941406
elements divided by sum of exponentials: 0.18296785652637482
original value: 0.9735735058784485
exponential value: 2.647387981414795
elements divided by sum of exponentials: 0.26275691390037537
tensor([[0.1404, 0.1742, 0.2396, 0.1830, 0.2628],
        [0.1352, 0.3126, 0.1920, 0.2017, 0.1585],
        [0.1504, 0.1504, 0.1962, 0.2089, 0.2941],
        [0.1742, 0.1742, 0.17

In [None]:
# softmax over columns instead of rows
softmax_2 = torch.nn.functional.softmax(upper_triangle,dim=0)
print(softmax_2)

tensor([[0.2613, 0.2484, 0.3382, 0.2697, 0.2906],
        [0.1847, 0.3271, 0.1989, 0.2182, 0.1287],
        [0.1847, 0.1415, 0.1827, 0.2032, 0.2147],
        [0.1847, 0.1415, 0.1401, 0.1627, 0.1789],
        [0.1847, 0.1415, 0.1401, 0.1463, 0.1871]])


In [None]:
# Use this helper function to inspect what softmax does internally.
import torch
import torch.nn.functional as F
def manual_softmax_dim0(x):
    # Step 1: Compute the exponential of each element
    exp_x = torch.exp(x)

    # Step 2: Compute the sum of exponentials along dimension 0
    sum_exp_x = torch.sum(exp_x, dim=0, keepdim=True)

    # Step 3: Divide each exponential by the sum
    softmax_x = exp_x / sum_exp_x

    return softmax_x

# Uncomment the lines below to compare the manual result with PyTorch's built-in softmax.

# Calculate softmax manually
# manual_result = manual_softmax_dim0(upper_triangle)

# Calculate softmax using PyTorch's built-in function
# torch_result = F.softmax(upper_triangle, dim=0)

# print("Manual Softmax result:")
# print(manual_result)
# print("\nPyTorch Softmax result:")
# print(torch_result)

# Verify that the results are the same
# print("\nAre the results equal?", torch.allclose(manual_result, torch_result))
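
One caveat of the three-step recipe: `torch.exp` overflows for large inputs. A common remedy is to subtract the maximum along the chosen dimension before exponentiating, which does not change the result mathematically. The sketch below assumes float32 inputs; `stable_softmax` is a hypothetical helper name of ours, not a PyTorch function:

```python
import torch
import torch.nn.functional as F

def stable_softmax(x, dim):
    # Shift by the maximum along `dim` so the largest exponent is exp(0) = 1.
    shifted = x - torch.max(x, dim=dim, keepdim=True).values
    exp_x = torch.exp(shifted)
    return exp_x / torch.sum(exp_x, dim=dim, keepdim=True)

x = torch.tensor([[1.0, 2.0, 3.0],
                  [1000.0, 1000.0, 1000.0]])

# Naive recipe: exp(1000) overflows to inf in float32, and inf / inf is nan.
naive = torch.exp(x) / torch.sum(torch.exp(x), dim=1, keepdim=True)
print(naive[1])

stable = stable_softmax(x, dim=1)
print(stable)
print(torch.allclose(stable, F.softmax(x, dim=1)))
```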

## Reshape tensor

[torch.Tensor.view](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html)

or

[torch.reshape](https://pytorch.org/docs/stable/generated/torch.reshape.html)

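These two functions produce the same result whenever both succeed. Tensor.view never copies data: it returns a tensor that shares memory with the original, which requires a compatible memory layout (for example, a contiguous tensor). torch.reshape returns a view when possible and otherwise allocates new memory and copies the data, which is slightly slower.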

For details, please refer to [this blog](https://myscale.com/blog/torch-reshape-vs-torch-view-pytorch/)

Practically, torch.reshape is recommended as it can handle more cases. We guarantee that all tensors are contiguous in our lab, so we will use Tensor.view to decrease computation time.
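
To make the difference concrete, here is a minimal sketch (variable names are ours) showing that a view shares memory with the original tensor, and that `view` fails on a non-contiguous tensor while `reshape` falls back to a copy:

```python
import torch

x = torch.arange(6).reshape(2, 3)

# view never copies: writing through v also changes x.
v = x.view(3, 2)
v[0, 0] = 100
print(x[0, 0].item())  # 100

# Transposing produces a non-contiguous tensor.
xt = x.t()
print(xt.is_contiguous())  # False

# view cannot reinterpret this layout and raises a RuntimeError.
try:
    xt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# reshape copies the data when a view is impossible.
flat = xt.reshape(6)
print(flat.tolist())  # [100, 3, 1, 4, 2, 5]
```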

In [None]:
before_reshaped = torch.randn((2,5))
print(before_reshaped)

after_reshaped = before_reshaped.reshape(5,2)
after_viewed = before_reshaped.view(5,2)

print(after_reshaped)
print(after_viewed)

tensor([[ 0.3369,  0.9678, -0.8020, -0.4791,  1.1977],
        [ 0.5820, -1.3635, -0.4105, -1.2536,  0.4773]])
tensor([[ 0.3369,  0.9678],
        [-0.8020, -0.4791],
        [ 1.1977,  0.5820],
        [-1.3635, -0.4105],
        [-1.2536,  0.4773]])
tensor([[ 0.3369,  0.9678],
        [-0.8020, -0.4791],
        [ 1.1977,  0.5820],
        [-1.3635, -0.4105],
        [-1.2536,  0.4773]])


## Get the index of max value

[torch.argmax](https://pytorch.org/docs/stable/generated/torch.argmax.html)

argmax ("argument of the maximum") returns the index of the max value, in contrast to max, which returns the max value itself.

(This can also be achieved by taking the indices returned by torch.max along a dimension, but it will be slightly slower.)
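
As a sketch of the alternative mentioned above: when `torch.max` is given a `dim`, it returns both `.values` and `.indices`, and the indices agree with `torch.argmax`. The matrix `m` below is an illustrative example:

```python
import torch

m = torch.tensor([[1.0, 5.0, 3.0],
                  [7.0, 2.0, 4.0]])

result = torch.max(m, dim=1)    # named tuple with .values and .indices
print(result.values.tolist())   # [5.0, 7.0]
print(result.indices.tolist())  # [1, 0]

# The indices from torch.max match torch.argmax along the same dimension.
print(torch.equal(result.indices, torch.argmax(m, dim=1)))  # True
```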

In [None]:
matrix = torch.randn(2,3)
print(matrix)

max_value=torch.max(matrix,dim=1).values
max_index=torch.argmax(matrix,dim=1)
print(max_value)
print(max_index)

tensor([[-0.2352, -0.6111,  1.6632],
        [-0.0459, -0.9726, -0.4152]])
tensor([ 1.6632, -0.0459])
tensor([2, 0])
