# Chapter 1 Tensor
## 1.1 Introduction
In algebra, a tensor is a generalization of vectors and matrices. For example, a scalar is a zero-order tensor, a vector is a first-order tensor, and a matrix is a second-order tensor.

In PyTorch, ```torch.Tensor``` is the main tool for storing and transforming data, much like the array type in NumPy. However, ```Tensor``` provides additional features such as GPU computation and automatic gradient calculation, which make it well suited to deep learning.
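
The points above can be illustrated with a minimal sketch. The variable names (`scalar`, `vector`, `matrix`, `w`, `loss`) are chosen here for illustration only, and the code assumes nothing beyond a standard PyTorch installation; it falls back to the CPU when no GPU is present.

```python
import torch

# Order (number of dimensions) of a scalar, a vector, and a matrix
scalar = torch.tensor(3.0)
vector = torch.tensor([1.0, 2.0, 3.0])
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(scalar.dim(), vector.dim(), matrix.dim())  # 0 1 2

# Use a GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# requires_grad=True asks autograd to track operations on w
w = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()   # computes d(loss)/dw = 2 * w
print(w.grad)     # gradient values are 2., 4., 6.
```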

## 1.2 Creation
In this section, we introduce some common ways to create a tensor.
1. Initialize a random matrix using ```torch.rand()```

In [1]:
import torch
x = torch.rand(4,3)  # 4x3 tensor with values sampled uniformly from [0, 1)
print(x)

tensor([[0.1394, 0.8346, 0.3913],
        [0.9298, 0.5708, 0.0141],
        [0.7222, 0.5374, 0.3818],
        [0.3647, 0.7618, 0.4842]])


2. Create an all-zero matrix using ```torch.zeros()```, setting the data type to long. In addition, the in-place method ```zero_()``` fills an existing tensor with zeros, and ```torch.zeros_like()``` creates a new all-zero tensor with the same shape as a given tensor.

In [2]:
import torch
x = torch.zeros(4,3, dtype=torch.long)
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
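
The difference between ```zeros_like()``` and ```zero_()``` can be sketched as follows. The names `a` and `b` are illustrative, and the printed dtype assumes PyTorch's default floating-point type has not been changed.

```python
import torch

a = torch.rand(2, 3)

# zeros_like returns a NEW all-zero tensor with the same shape and dtype as a
b = torch.zeros_like(a)
print(b.shape, b.dtype)   # torch.Size([2, 3]) torch.float32

# zero_() overwrites a itself; the trailing underscore marks an in-place method
a.zero_()
print(a)
```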


3. Construct a tensor directly from data using ```torch.tensor()```.

In [3]:
import torch
x = torch.tensor([5.5, 3])
print(x)

tensor([5.5000, 3.0000])
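
As a brief sketch of how ```torch.tensor()``` treats its input, the example below shows that the data type is inferred from the data unless it is given explicitly, and that the input is copied. The variable names are illustrative, and the inferred types assume PyTorch's default settings.

```python
import torch

f = torch.tensor([5.5, 3])   # a float in the data gives a floating-point tensor
i = torch.tensor([5, 3])     # all-integer data gives an integer tensor
d = torch.tensor([5, 3], dtype=torch.float64)  # dtype can be set explicitly
print(f.dtype, i.dtype, d.dtype)  # torch.float32 torch.int64 torch.float64

# torch.tensor copies its input, so later changes to the list do not affect t
data = [1.0, 2.0]
t = torch.tensor(data)
data[0] = 100.0
print(t)  # tensor([1., 2.])
```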


4. Create a new tensor based on an existing tensor

In [4]:
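x = x.new_ones(4, 3, dtype=torch.double)  # reuses x's dtype and device unless overridden
print(x)
x = torch.randn_like(x, dtype=torch.float)  # same shape as x, with the dtype overridden
print(x)
print(x.size())  # size() and shape return the same torch.Size
print(x.shape)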

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[-1.4695,  0.6198,  1.4924],
        [ 0.7782,  1.5784,  0.6864],
        [-1.7062,  0.6917, -1.5334],
        [-0.5476,  0.0465,  0.2455]])
torch.Size([4, 3])
torch.Size([4, 3])
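
To make explicit which properties are carried over from the existing tensor, here is a small sketch. The tensor `base` and the names `a`, `b`, `c` are hypothetical and exist only for this illustration.

```python
import torch

base = torch.zeros(2, 2, dtype=torch.float64)

a = base.new_ones(4, 3)                     # dtype is inherited from base
b = base.new_ones(4, 3, dtype=torch.int32)  # dtype is overridden
c = torch.randn_like(base)                  # same shape and dtype as base

print(a.dtype, b.dtype)   # torch.float64 torch.int32
print(c.shape, c.dtype)   # torch.Size([2, 2]) torch.float64
```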


5. Common functions for constructing tensors

Function | Application
-------- | -----------
Tensor(sizes) | basic constructor
tensor(data) | similar to np.array
ones(sizes)  | all ones
zeros(sizes) | all zeros
eye(sizes) | ones on the diagonal, zeros elsewhere
arange(s,e,step) | from s to e (e excluded), in increments of step
linspace(s,e,steps) | from s to e (both included), as steps evenly spaced points
rand/randn(sizes) | rand samples the uniform distribution on $[0,1)$, randn samples the standard normal distribution $N(0,1)$
normal(mean, std) | normal distribution with the given mean and standard deviation (std)
randperm(m)  |  random permutation of the integers 0 to m-1
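
A few of the functions in the table can be tried out as follows. This is a minimal sketch with arbitrarily chosen arguments; the outputs of ```randperm``` and ```normal``` are random, so they differ from run to run.

```python
import torch

print(torch.eye(3))             # 3x3 identity matrix
print(torch.arange(0, 10, 2))   # tensor([0, 2, 4, 6, 8]); the end value 10 is excluded
print(torch.linspace(0, 1, 5))  # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
print(torch.randperm(5))        # a random ordering of 0, 1, 2, 3, 4
print(torch.normal(mean=0.0, std=1.0, size=(2, 3)))  # 2x3 samples from N(0, 1)
```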

## 1.3 Tensor operations
1. Addition

In [5]:
import torch
# method 1
y = torch.rand(4,3)
print(x+y)

# method 2
print(torch.add(x,y))

# method 3 in-place
y.add_(x)
print(y)

tensor([[-1.0674,  1.1073,  2.3698],
        [ 1.7068,  2.0475,  1.6389],
        [-1.5183,  0.9545, -1.5015],
        [ 0.0825,  0.2446,  0.3342]])
tensor([[-1.0674,  1.1073,  2.3698],
        [ 1.7068,  2.0475,  1.6389],
        [-1.5183,  0.9545, -1.5015],
        [ 0.0825,  0.2446,  0.3342]])
tensor([[-1.0674,  1.1073,  2.3698],
        [ 1.7068,  2.0475,  1.6389],
        [-1.5183,  0.9545, -1.5015],
        [ 0.0825,  0.2446,  0.3342]])
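
The three methods give the same values but differ in where the result is stored. The sketch below uses small hand-picked tensors and ```data_ptr()``` (the address of a tensor's first element) to make that visible; the names `z` and `ptr_before` are illustrative.

```python
import torch

x = torch.ones(2, 2)
y = torch.full((2, 2), 2.0)

z = x + y                   # allocates a new tensor; x and y are unchanged
ptr_before = y.data_ptr()   # address of y's underlying data
y.add_(x)                   # in-place: the result is written into y

print(torch.equal(z, y))             # True: same values either way
print(y.data_ptr() == ptr_before)    # True: y still uses the same memory
print(z.data_ptr() == y.data_ptr())  # False: z lives in separate memory
```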


2. Index
   
   Note that the result of indexing shares memory with the original data, so modifying one also modifies the other. If you do not want this, make a copy with ```clone()```.

In [6]:
import torch
x=torch.rand(4,3)
print(x[:,1])

y = x[0,:]
y += 1
print(y)
print(x[0, :]) # original tensor is changed

tensor([0.9816, 0.9673, 0.9888, 0.9800])
tensor([1.1708, 1.9816, 1.5410])
tensor([1.1708, 1.9816, 1.5410])
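
For comparison, the following sketch shows an indexed row next to a cloned row. It starts from an all-zero tensor so that the effect is easy to read; `view_row` and `copy_row` are illustrative names.

```python
import torch

x = torch.zeros(4, 3)

view_row = x[0, :]           # shares memory with x
copy_row = x[0, :].clone()   # independent copy

copy_row += 1                # does not touch x
print(x[0, :])               # tensor([0., 0., 0.])

view_row += 1                # writes through to x
print(x[0, :])               # tensor([1., 1., 1.])
```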


3. Changing the shape

    Use the ```view()``` method or ```torch.reshape()```.

In [7]:
x = torch.randn(4,4)
y = x.view(16)
z = x.view(-1,8)  # -1 means this dimension is determined by other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])


Note that the tensor returned by ```view()``` shares memory with the original tensor, just as indexing does. It only changes how the same data is viewed.

In [8]:
x += 1
print(x)
print(y)

tensor([[ 1.3847,  1.0480,  0.6591,  0.4128],
        [ 2.0038,  1.2163,  1.6236,  0.4877],
        [ 0.2315,  1.1471, -0.3989,  0.5407],
        [ 1.5644,  0.9461,  2.2039,  1.9636]])
tensor([ 1.3847,  1.0480,  0.6591,  0.4128,  2.0038,  1.2163,  1.6236,  0.4877,
         0.2315,  1.1471, -0.3989,  0.5407,  1.5644,  0.9461,  2.2039,  1.9636])


Often we want the reshaped tensor to be independent of the original. The second method, ```torch.reshape()```, also changes the shape, but it does not guarantee that the result is a copy: it may return a view. We therefore recommend first creating a copy with ```clone()``` and then calling ```view()``` on it.
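
The behavior described above can be checked with a short sketch. It compares ```data_ptr()``` values to see whether two tensors share memory, and it uses a transposed tensor as one example of a case where ```reshape()``` cannot return a view; the variable names are illustrative.

```python
import torch

x = torch.arange(6).view(2, 3)

# On a contiguous tensor, reshape() can return a view that shares memory
r = x.reshape(3, 2)
print(r.data_ptr() == x.data_ptr())     # True

# On a non-contiguous tensor (here a transpose), reshape() has to copy
t = x.t()
flat = t.reshape(6)
print(flat.data_ptr() == x.data_ptr())  # False

# clone() followed by view() gives an independent tensor
c = x.clone().view(6)
c[0] = 100
print(x[0, 0].item())   # 0
```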

In [9]:
import torch
x = torch.randn(1) 
print(type(x)) 
print(type(x.item()))  # item() returns the value of a one-element tensor as a Python number

<class 'torch.Tensor'>
<class 'float'>
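
As a small companion sketch, ```item()``` applies to a single element, while ```tolist()``` converts a tensor with several elements into nested Python lists. The tensors `s` and `m` are made up for this illustration.

```python
import torch

s = torch.tensor([1.5])
print(s.item())        # 1.5

m = torch.tensor([[1, 2], [3, 4]])
print(m.tolist())      # [[1, 2], [3, 4]]
print(m[1, 0].item())  # 3
```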


## 1.4 Broadcasting

When an element-wise operation is applied to two tensors of different shapes, the broadcasting mechanism may be triggered: elements are copied as needed so that the two tensors have the same shape, and the operation is then applied element-wise.

In [10]:
x = torch.arange(1, 3).view(1, 2)
print(x)
y = torch.arange(1, 4).view(3, 1)
print(y)
print(x + y)

tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])


Here x has 1 row and 2 columns, and y has 3 rows and 1 column. To compute x + y, the 2 elements in the first row of x are broadcast (copied) to the second and third rows, and the 3 elements in the first column of y are broadcast (copied) to the second column.
The two resulting matrices, each with 3 rows and 2 columns, can then be added element-wise.
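
The copying step can be made visible with ```torch.broadcast_tensors()```, which returns both inputs expanded to their common shape. This is only a sketch for inspecting the mechanism; in ordinary code, x + y performs the expansion implicitly. The names `xb` and `yb` are illustrative.

```python
import torch

x = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

# broadcast_tensors returns both inputs expanded to the common shape
xb, yb = torch.broadcast_tensors(x, y)
print(xb.shape, yb.shape)   # torch.Size([3, 2]) torch.Size([3, 2])
print(xb)
print(yb)

# Adding the expanded tensors gives the same result as x + y
print(torch.equal(xb + yb, x + y))   # True
```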
