## Summary:
(1) Create tensors from a list or a numpy array.   
(2) Change datatype.   
(3) Reshape.   
(4) Mathematical operations.    
(5) Concatenate, stack, split, reduce dimension.    

In [1]:
import torch
import numpy as np

## Create tensor from list or numpy array

In [2]:
# Create a tensor from a list
a = [1,2,3]
t_a = torch.tensor(a)
print(t_a)

tensor([1, 2, 3])


In [11]:
# The default type is int64
print(t_a.dtype)
print(type(a[0]))

torch.int64
<class 'int'>


In [5]:
# Create a tensor from a numpy array
b = np.array([4,5,6])
t_b = torch.tensor(b)
print(t_b)
# Note that the type is int32!

tensor([4, 5, 6], dtype=torch.int32)


In [9]:
b.dtype

dtype('int32')

The default element type differs depending on whether the tensor is created from a list or from a numpy array, because the two sources carry different type information. The elements of the list 'a' are Python 'int' objects, which have no fixed width; torch.tensor infers torch.int64 for them. The numpy array 'b', on the other hand, already has a dtype, which is int32 on this laptop (numpy's default integer type depends on the platform and the numpy version). torch.tensor does not change the dtype of an array. It maps it to the corresponding type in torch: int64-->torch.int64, int32-->torch.int32
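
The mapping described above can be checked directly. The following is a minimal sketch; the arrays are given explicit dtypes so that the result does not depend on the platform, and the variable names are only illustrative.

```python
import numpy as np
import torch

# Python ints carry no fixed width; torch.tensor infers torch.int64 for them
t_list = torch.tensor([1, 2, 3])

# numpy arrays already have a dtype, which is mapped to the matching torch dtype
t_i32 = torch.tensor(np.array([4, 5, 6], dtype=np.int32))
t_i64 = torch.tensor(np.array([4, 5, 6], dtype=np.int64))

# The dtype argument overrides the inferred type
t_f32 = torch.tensor([1, 2, 3], dtype=torch.float32)

print(t_list.dtype, t_i32.dtype, t_i64.dtype, t_f32.dtype)
```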

In [13]:
# We can specify the data type manually
# Below, the data type of the tensor is not shown. This is because it has the default data type, which is torch.int64
c = np.array([4,5,6], dtype=np.int64)
t_c = torch.tensor(c)
print(t_c)

tensor([4, 5, 6])


In [14]:
# Tensor whose elements are all 1's
t_ones = torch.ones(2,3)
print(t_ones)
print(t_ones.dtype)

tensor([[1., 1., 1.],
        [1., 1., 1.]])
torch.float32


In [16]:
# A random tensor
# Each element is sampled uniformly between 0 and 1
t_rand = torch.rand(2,3)
print(t_rand)
print(t_rand.dtype)

tensor([[0.4436, 0.6921, 0.8738],
        [0.1679, 0.3789, 0.4707]])
torch.float32


## Change data types

In [19]:
print(t_a.dtype)
t_a_int32 = t_a.to(torch.int32)
print(t_a_int32)

torch.int64
tensor([1, 2, 3], dtype=torch.int32)
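
A few more conversions, as a small sketch. The tensor t_a is redefined here so that the cell runs on its own; the other names are illustrative.

```python
import torch

t_a = torch.tensor([1, 2, 3])

# .to returns a new tensor; the original tensor keeps its dtype
t_float = t_a.to(torch.float32)

# .float() is a shorthand for .to(torch.float32)
t_float2 = t_a.float()

# Converting floats to integers drops the fractional part (truncation toward zero)
t_trunc = torch.tensor([1.7, -1.7]).to(torch.int64)

print(t_a.dtype, t_float.dtype, t_float2.dtype)
print(t_trunc)
```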


## Change shape

In [9]:
# Transpose
t = torch.rand(2,3)
t_transpose = torch.transpose(t, 0, 1) # Switch dimension 0 and 1.
print(t)
print(t_transpose)
print(t.shape)
print(t_transpose.shape)

tensor([[0.4821, 0.1011, 0.1499],
        [0.7022, 0.2854, 0.4937]])
tensor([[0.4821, 0.7022],
        [0.1011, 0.2854],
        [0.1499, 0.4937]])
torch.Size([2, 3])
torch.Size([3, 2])


In [21]:
# torch.transpose does not copy the original tensor
# It just provides another way of indexing
t = torch.ones(2,3)
t_transpose = torch.transpose(t, 0, 1)
print(t)
print(t_transpose)
t_transpose[1,1]=10
print(t_transpose)
print(t)

tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
tensor([[ 1.,  1.],
        [ 1., 10.],
        [ 1.,  1.]])
tensor([[ 1.,  1.,  1.],
        [ 1., 10.,  1.]])


In [23]:
# Reshape
t = torch.ones(2*3)
t_reshape = t.reshape(2,3)
print(t)
print(t_reshape)

tensor([1., 1., 1., 1., 1., 1.])
tensor([[1., 1., 1.],
        [1., 1., 1.]])


In [24]:
# Similarly, t.reshape does not copy the original tensor here
# It returns a view whenever the memory layout allows it, and a copy only when a view is not possible
t_reshape[1,0] = 10
print(t_reshape)
print(t)

tensor([[ 1.,  1.,  1.],
        [10.,  1.,  1.]])
tensor([ 1.,  1.,  1., 10.,  1.,  1.])
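
Whether reshape shares memory with the original tensor depends on the memory layout. The sketch below compares data_ptr() values to tell a view from a copy; it assumes nothing beyond standard tensor methods, and the variable names are illustrative.

```python
import torch

t = torch.arange(6.0)      # contiguous 1D tensor: 0., 1., ..., 5.
v = t.reshape(2, 3)        # a view: it shares memory with t
print(v.data_ptr() == t.data_ptr())

# A transposed tensor is not contiguous, so flattening it requires a copy
tt = v.transpose(0, 1)     # still a view of t
c = tt.reshape(6)          # a copy, since no view can express this layout
c[0] = 100.0
print(t[0])                # t is not affected by the change to c

# clone() requests an independent copy explicitly
independent = v.clone()
independent[0, 0] = -1.0
print(t[0])                # still unchanged
```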


In [13]:
# Squeeze: remove dimensions of size 1
t = torch.zeros(1,2,1,4,1)
t_squeeze = torch.squeeze(t) # This will remove all dimensions of size 1
print(t.shape)
print(t_squeeze.shape)

torch.Size([1, 2, 1, 4, 1])
torch.Size([2, 4])


In [14]:
# Remove only a specific dimension (here dimension 2, which has size 1)
t_squeeze = torch.squeeze(t, 2)
print(t_squeeze.shape)

torch.Size([1, 2, 4, 1])


In [15]:
# Try to remove a specific dimension (here dimension 3, which has size 4)
# If the size is not 1, nothing happens.
t_squeeze = torch.squeeze(t, 3)
print(t_squeeze.shape)

torch.Size([1, 2, 1, 4, 1])
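
Two related details, shown as a short sketch: the dimension index may be negative, and torch.unsqueeze is the inverse operation that inserts a dimension of size 1. The variable names are illustrative.

```python
import torch

t = torch.zeros(1, 2, 1, 4, 1)

# Negative indices count from the end: -1 is the last dimension
t_last = torch.squeeze(t, -1)
print(t_last.shape)

# unsqueeze inserts a dimension of size 1 at the given position,
# so it undoes squeeze on the same dimension
t_back = torch.unsqueeze(torch.squeeze(t, 2), 2)
print(t_back.shape)
```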


## Mathematical operations

In [25]:
# Setting the random seed to a fixed value ensures that the same sequence of random numbers will be generated every time the code is run.
torch.manual_seed(1)
# Create a tensor whose entries are in [-1, 1)
t1 = 2 * torch.rand(5,2) - 1 
# Create a tensor whose entries follow the standard normal distribution
t2 = torch.normal(mean=0, std=1, size=(5,2))
print(t1)
print(t2)

tensor([[ 0.5153, -0.4414],
        [-0.1939,  0.4694],
        [-0.9414,  0.5997],
        [-0.2057,  0.5087],
        [ 0.1390, -0.1224]])
tensor([[ 0.8590,  0.7056],
        [-0.3406, -1.2720],
        [-1.1948,  0.0250],
        [-0.7627,  1.3969],
        [-0.3245,  0.2879]])


In [17]:
# Elementwise product
t3 = torch.multiply(t1, t2)
print(t3)

tensor([[ 0.4426, -0.3114],
        [ 0.0660, -0.5970],
        [ 1.1249,  0.0150],
        [ 0.1569,  0.7107],
        [-0.0451, -0.0352]])


In [27]:
# Mean along a given axis
print(torch.mean(t1, axis=0))
# Mean of all entries
print(torch.mean(t1))

tensor([-0.1373,  0.2028])
tensor(0.0327)
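
To see which axis is reduced, it may help to use a small tensor with known entries. This is a sketch with an illustrative tensor m; keepdim is an optional argument of torch.mean that keeps the reduced dimension with size 1.

```python
import torch

m = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])   # shape (3, 2)

# dim=0 averages over the rows, giving one value per column
col_mean = torch.mean(m, dim=0)
# dim=1 averages over the columns, giving one value per row
row_mean = torch.mean(m, dim=1)
# keepdim=True keeps the reduced dimension, so the result has shape (3, 1)
row_mean_keep = torch.mean(m, dim=1, keepdim=True)

print(col_mean)
print(row_mean)
print(row_mean_keep.shape)
```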


In [28]:
# Matrix multiplication
t5 = torch.matmul(t1, torch.transpose(t2, 0, 1)) # (5,2) x (2,5) -> (5,5)
print(t5)
# To transpose t2, we can also use t2.transpose(0, 1)
t5 = torch.matmul(t1, t2.transpose(0, 1))
print(t5)
t6 = torch.matmul(t1.transpose(0, 1), t2)
print(t6)

tensor([[ 0.1312,  0.3860, -0.6267, -1.0096, -0.2943],
        [ 0.1647, -0.5310,  0.2434,  0.8035,  0.1980],
        [-0.3855, -0.4422,  1.1399,  1.5558,  0.4781],
        [ 0.1822, -0.5771,  0.2585,  0.8676,  0.2132],
        [ 0.0330,  0.1084, -0.1692, -0.2771, -0.0804]])
tensor([[ 0.1312,  0.3860, -0.6267, -1.0096, -0.2943],
        [ 0.1647, -0.5310,  0.2434,  0.8035,  0.1980],
        [-0.3855, -0.4422,  1.1399,  1.5558,  0.4781],
        [ 0.1822, -0.5771,  0.2585,  0.8676,  0.2132],
        [ 0.0330,  0.1084, -0.1692, -0.2771, -0.0804]])
tensor([[ 1.7453,  0.3392],
        [-1.6038, -0.2180]])
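
The shape rule behind the results above, (n,k) x (k,m) -> (n,m), can be sketched with small tensors whose products are easy to verify by hand. The operator @ and the attribute .T are equivalent shorthands for torch.matmul and a 2D transpose; x and y are illustrative names.

```python
import torch

x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])   # shape (3, 2)
y = torch.ones(3, 2)           # shape (3, 2)

# (3,2) x (2,3) -> (3,3): entry (i, j) is the sum of row i of x
p1 = x @ y.T
# (2,3) x (3,2) -> (2,2): entry (i, j) is the sum of column i of x
p2 = x.T @ y

print(p1)
print(p2)
```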


In [29]:
# Norms
# Frobenius norm (the default norm, the square root of the sum of the squares of the elements)
norm_Frobenius_t1 = torch.linalg.norm(t1)
print(norm_Frobenius_t1)
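# Norm with ord=1: for a 2D tensor this is the matrix 1-norm (the maximum absolute column sum)
norm_L1_t1 = torch.linalg.norm(t1, ord=1)
print(norm_L1_t1)
# L1 norm along a specific dimension (here: the sum of the absolute values in each row)
norm_L1_t1_dim1 = torch.linalg.norm(t1, ord=1, dim=1)
print(norm_L1_t1_dim1)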

tensor(1.5165)
tensor(2.1417)
tensor([0.9566, 0.6632, 1.5412, 0.7145, 0.2615])


In [30]:
# Validation of the norms
# Validate the Frobenius norm
print("Frobenius direct calculation", torch.sqrt(torch.sum(torch.multiply(t1, t1))))
print("Frobenius", norm_Frobenius_t1)

# Validate the norm with ord=1
# For a 2D tensor, this norm is not the sum of the absolute values of all elements
# It is the matrix 1-norm: torch.max(torch.sum(torch.abs(t1), dim=0))
# Namely, we take the absolute values of all elements, then add them along the first dimension (i.e., within each column), and finally select the maximum column sum
print("Sum of absolute values", torch.sum(torch.abs(t1)))
print("L1", norm_L1_t1)


Frobenius direct calculation tensor(1.5165)
Frobenius tensor(1.5165)
Sum of absolute values tensor(4.1370)
L1 tensor(2.1417)


In [31]:
# Another example of L1 norm
t = torch.tensor([[1,2],[3,4]], dtype=torch.float32)
print(t)
print(torch.linalg.norm(t, ord=1))
# The result is the maximum column sum, max(1+3, 2+4) = 6, not the sum of all absolute values (10).

tensor([[1., 2.],
        [3., 4.]])
tensor(6.)
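
The two notions of "L1 norm" can be compared side by side. This sketch assumes a PyTorch version that provides torch.linalg.vector_norm, which treats the tensor as a flat vector; the variable names are illustrative.

```python
import torch

t = torch.tensor([[1., -2.],
                  [3., 4.]])

# Matrix 1-norm: maximum absolute column sum, max(1+3, 2+4) = 6
matrix_1norm = torch.linalg.norm(t, ord=1)
by_hand = torch.max(torch.sum(torch.abs(t), dim=0))

# Entrywise L1 norm: sum of the absolute values of all elements, 1+2+3+4 = 10
entrywise_l1 = torch.linalg.vector_norm(t, ord=1)

print(matrix_1norm, by_hand, entrywise_l1)
```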


## Split and join tensors

In [32]:
t = torch.rand(6)
print(t)
# Split into three parts
t_splits = torch.chunk(t, 3)
print(t_splits)

tensor([0.6397, 0.9743, 0.8300, 0.0444, 0.0246, 0.2588])
(tensor([0.6397, 0.9743]), tensor([0.8300, 0.0444]), tensor([0.0246, 0.2588]))


In [33]:
# When the number of elements is not divisible by the number of chunks, the result can be surprising
# The following code tries to divide t into 4 parts. However, the result is a tuple of 3 tensors
# As mentioned in the book, all chunks have the same size, except the last one, which may be smaller than the others
# If the first three chunks had size 1, the last one would have size 3, which is larger
# If the first three chunks have size 2 = ceil(6/4), nothing is left for a fourth chunk, and this is the result we see below
t_splits = torch.chunk(t, 4)
print(t_splits)

(tensor([0.6397, 0.9743]), tensor([0.8300, 0.0444]), tensor([0.0246, 0.2588]))
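
The chunk sizes seen above are consistent with a chunk size of ceil(n / chunks). The sketch below checks this on a tensor of 6 elements and, as a possible alternative, uses torch.tensor_split, which returns exactly the requested number of parts (assuming a PyTorch version that provides it). The variable names are illustrative.

```python
import math
import torch

t = torch.arange(6)
chunks = 4

# Every chunk except possibly the last one has this size: ceil(6 / 4) = 2
chunk_size = math.ceil(t.numel() / chunks)
sizes_chunk = [c.numel() for c in torch.chunk(t, chunks)]
print(chunk_size, sizes_chunk)

# tensor_split always returns `chunks` parts; the first parts get the extra elements
sizes_tensor_split = [c.numel() for c in torch.tensor_split(t, chunks)]
print(sizes_tensor_split)
```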


In [34]:
# Specify split size of each chunk
t = torch.rand(5)
print(t)
t_splits = torch.split(t, split_size_or_sections=[3,2])
print(t_splits)
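# When a list is passed, it must contain the size of every chunk, and the sizes must add up to the length of t.
# It seems that there is no way to specify the sizes of only some of the chunks.
# Passing a single int instead requests chunks of that size, with a smaller last chunk if needed.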

tensor([0.9391, 0.4167, 0.7140, 0.2676, 0.9906])
(tensor([0.9391, 0.4167, 0.7140]), tensor([0.2676, 0.9906]))
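
For comparison, torch.split also accepts a single integer. A minimal sketch with illustrative names:

```python
import torch

t = torch.arange(5)

# A single int is the size of each chunk; the last chunk holds what is left
parts_int = torch.split(t, 2)
print([p.tolist() for p in parts_int])

# A list gives the size of every chunk; the sizes must sum to len(t)
parts_list = torch.split(t, [1, 4])
print([p.tolist() for p in parts_list])
```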


In [35]:
# Concatenate
A = torch.ones(3)
B = torch.zeros(2)
C = torch.cat([A,B], axis=0)
print(A)
print(B)
print(C)

tensor([1., 1., 1.])
tensor([0., 0.])
tensor([1., 1., 1., 0., 0.])


In [36]:
# Stack
A = torch.ones(3)
B = torch.zeros(3)
C = torch.stack([A,B], axis=1)
print(A)
print(B)
print(C)
C = torch.stack([A,B], axis=0)
print(C)

tensor([1., 1., 1.])
tensor([0., 0., 0.])
tensor([[1., 0.],
        [1., 0.],
        [1., 0.]])
tensor([[1., 1., 1.],
        [0., 0., 0.]])


In [None]:
# Explanation from ChatGPT
# torch.cat concatenates a sequence of tensors along an existing dimension
# torch.stack joins a sequence of tensors along a new dimension
# Therefore, C = torch.cat([A,B], axis=1) will not work for the 1D tensors A and B, since dimension 1 does not exist
# But torch.stack([A,B], axis=1) will work
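
The difference can be demonstrated with a short sketch. It catches both IndexError and RuntimeError, because the exact exception type raised for an invalid dimension is an assumption here. It also shows that stacking is equivalent to adding the new dimension first and then concatenating along it.

```python
import torch

A = torch.ones(3)
B = torch.zeros(3)

# A and B are 1D, so dimension 1 does not exist and torch.cat fails
try:
    torch.cat([A, B], dim=1)
    cat_failed = False
except (IndexError, RuntimeError):
    cat_failed = True

# torch.stack creates dimension 1 itself
stacked = torch.stack([A, B], dim=1)

# Equivalent: insert the new dimension with unsqueeze, then concatenate along it
via_cat = torch.cat([A.unsqueeze(1), B.unsqueeze(1)], dim=1)

print(cat_failed, stacked.shape, torch.equal(stacked, via_cat))
```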

In [37]:
# In order to concatenate two row vectors into a matrix, we need to define them as 2D tensors
E = torch.tensor([[1,2,3]])
F = torch.tensor([[4,5,6]])
G = torch.cat([E,F], axis=0)
print(E)
print(F)
print(G)

tensor([[1, 2, 3]])
tensor([[4, 5, 6]])
tensor([[1, 2, 3],
        [4, 5, 6]])


In [39]:
# However, in this situation, torch.stack will not give the desired result, since it always adds a new dimension
G = torch.stack([E,F], axis=0)
print(G)
print(G.shape)
G = torch.stack([E,F], axis=1)
print(G)
print(G.shape)

tensor([[[1, 2, 3]],

        [[4, 5, 6]]])
torch.Size([2, 1, 3])
tensor([[[1, 2, 3],
         [4, 5, 6]]])
torch.Size([1, 2, 3])


In [42]:
G = torch.stack([E,F], axis=0)
G[1]==F

tensor([[True, True, True]])

In [48]:
G = torch.stack([E,F], axis=1)
G[:,1,:]==F

tensor([[True, True, True]])

It may be easier to understand torch.stack from an algebraic point of view. E and F both have shape (1,3). 'torch.stack([E,F], axis=0)' inserts a new dimension as the first dimension, so that the result has shape (2,1,3). Thus, G[0]==E and G[1]==F. 'torch.stack([E,F], axis=1)' inserts a new dimension as the second dimension, so that the result has shape (1,2,3). In this case, G[:,0,:]==E and G[:,1,:]==F.
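
The same rule holds for every position of the new dimension. The sketch below loops over all three positions and uses Tensor.select(dim, index) to pick one slice along the new dimension; the list 'shapes' is only there to collect the results.

```python
import torch

E = torch.tensor([[1, 2, 3]])
F = torch.tensor([[4, 5, 6]])

shapes = []
for dim in range(3):
    G = torch.stack([E, F], dim=dim)
    shapes.append(tuple(G.shape))
    # Index 0 along the new dimension recovers E, index 1 recovers F
    assert torch.equal(G.select(dim, 0), E)
    assert torch.equal(G.select(dim, 1), F)

print(shapes)
```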