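# Tensor Operations

PyTorch Tensors support more than 100 operations, including mathematical operations, linear algebra, and tensor manipulation (transposing, indexing, slicing, and so on). All of these operations can run on either the CPU or the GPU, which is one of the strengths of PyTorch Tensors.

This [page](https://pytorch.org/docs/stable/torch.html) gives a general overview of all the operations Tensors support.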

In [1]:
import torch
import numpy as np
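
As a minimal sketch of the CPU/GPU point above, the snippet below runs the same arithmetic on whichever device is available. The variable names (`device`, `x`, `y`) are illustrative, and the code falls back to the CPU when no CUDA device is visible.

```python
import torch

# Pick a GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.arange(6, dtype=torch.float32).reshape(2, 3).to(device)
y = x * 2 + 1          # element-wise arithmetic runs on the device that holds x

print(y.device.type)   # "cuda" or "cpu", depending on the machine
print(y.cpu())         # copy the result back to the CPU for printing
```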

# Indexing

We can index and slice a torch.Tensor in the same ways as a numpy.ndarray.

In [2]:
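t = torch.arange(12).reshape(3,4)
print(f't: {t}')
print(f'All elements in the second row of t: {t[1]}')
print(f'All elements in the last column of t: {t[:, -1]}')
print(f'All elements from the third column of t through the last: {t[:, 2:]}')
print(f'The element of t at index (2, 3): {t[2, 3]}')

t: tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
All elements in the second row of t: tensor([4, 5, 6, 7])
All elements in the last column of t: tensor([ 3,  7, 11])
All elements from the third column of t through the last: tensor([[ 2,  3],
        [ 6,  7],
        [10, 11]])
The element of t at index (2, 3): 11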


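**Single-element Tensors**

When we index a single element of a Tensor, the result is still a `Tensor` object (a zero-dimensional tensor) rather than a built-in Python type. Call the Tensor's `item()` method to get the value as a plain Python scalar.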

In [3]:
type(t[2,3])

torch.Tensor

In [4]:
type(t[2,3].item())

int

# Joining

## torch.cat

```
cat(tensors, dim=0) -> Tensor
```

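`torch.cat` concatenates the given sequence of tensors (`tensors`) along the given dimension. This requires all of the tensors to have the same shape in every dimension except the one being concatenated.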

In [5]:
t1 = torch.randn(2,3)
t2 = torch.randn(3,3)
torch.cat([t1, t2], dim=0)

tensor([[ 1.8282, -1.3961, -1.6330],
        [-0.5860, -1.5504,  0.5470],
        [-0.4857, -0.4318, -0.0308],
        [ 0.1696, -0.7582,  0.8282],
        [-0.1592, -0.7631,  2.8051]])

## torch.stack

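`torch.stack` has the same call signature as `torch.cat`, but rather than concatenating along an existing dimension, it stacks the tensors along a new dimension.

This requires every tensor in the sequence to have exactly the same shape.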

In [6]:
t1 = torch.randn(2,3)
t2 = torch.randn(2,3)
torch.stack([t1, t2], dim=0)

tensor([[[ 0.9310, -2.5397,  0.7603],
         [ 0.5423,  0.0121,  2.4951]],

        [[-0.9212,  1.1312,  0.4553],
         [-0.3836, -2.2080, -0.8785]]])
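
To make the difference between `torch.cat` and `torch.stack` concrete, the following sketch compares the output shapes for the same pair of inputs. The tensors and variable names are illustrative only.

```python
import torch

t1 = torch.zeros(2, 3)
t2 = torch.ones(2, 3)

cat0 = torch.cat([t1, t2], dim=0)      # joins along the existing dim 0
stack0 = torch.stack([t1, t2], dim=0)  # inserts a new leading dimension
stack2 = torch.stack([t1, t2], dim=2)  # inserts a new trailing dimension

print(cat0.shape)    # torch.Size([4, 3])
print(stack0.shape)  # torch.Size([2, 2, 3])
print(stack2.shape)  # torch.Size([2, 3, 2])
```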

## torch.gather

# Splitting

## torch.split
```python
split(tensor, split_size_or_sections, dim=0)
```
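`split` splits the tensor along the specified dimension into a tuple of Tensors. When `split_size_or_sections` is an int, it gives the size of each chunk. If the dimension is not evenly divisible by that size, the last chunk is smaller.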

In [7]:
a = torch.arange(10).view(5,2)
a

tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])

In [8]:
torch.split(a, 2)

(tensor([[0, 1],
         [2, 3]]),
 tensor([[4, 5],
         [6, 7]]),
 tensor([[8, 9]]))

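`split_size_or_sections` can also be a list of ints, in which case each element gives the size of the corresponding chunk, and the sizes must add up to the size of the dimension being split.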

In [9]:
a1, a2, a3 = torch.split(a, (1,3,1))
a1,a2,a3

(tensor([[0, 1]]),
 tensor([[2, 3],
         [4, 5],
         [6, 7]]),
 tensor([[8, 9]]))

In [10]:
# The tensors produced by split share storage with the original tensor
a1[0, 0] = 42
a

tensor([[42,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9]])

## torch.chunk

```python
chunk(input, chunks, dim=0) -> List of Tensors
```
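`chunk` is similar to `split`. The difference is that its second argument directly specifies the number of chunks; the last chunk may hold fewer elements than the others.

The resulting Tensors all share the underlying storage with the original Tensor; in other words, each chunk is a view of the original Tensor.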
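
A short sketch of `chunk`, reusing the 5x2 tensor shape from the `split` examples. It shows the smaller last chunk and that writing through a chunk changes the original tensor.

```python
import torch

a = torch.arange(10).view(5, 2)

# Ask for 3 chunks along dim 0; 5 rows do not divide evenly by 3.
chunks = torch.chunk(a, 3)
print([tuple(c.shape) for c in chunks])  # [(2, 2), (2, 2), (1, 2)]

# Each chunk is a view, so an in-place write shows up in the original tensor.
chunks[0][0, 0] = 42
print(a[0, 0].item())  # 42
```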

# Reshaping Operations

## torch.reshape

```python
reshape(input, shape) -> Tensor
```
`reshape` returns a Tensor with the same data and the same number of elements as the original Tensor, but with the specified shape.
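
A minimal sketch of `reshape` follows. The `-1` placeholder, which asks PyTorch to infer one dimension from the element count, is shown here as a convenience and is not discussed in the text above.

```python
import torch

x = torch.arange(12)

y = torch.reshape(x, (3, 4))  # function form
z = x.reshape(2, -1)          # method form; -1 is inferred as 6

print(y.shape)  # torch.Size([3, 4])
print(z.shape)  # torch.Size([2, 6])
print(y.numel() == x.numel() == z.numel())  # True: element count is unchanged
```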

## torch.view

## torch.view vs. torch.reshape

`view()` has existed for a long time. It returns a tensor with the new shape, and the returned tensor shares the underlying data with the original tensor.

`torch.reshape`, on the other hand, was introduced in version 0.4. According to the documentation, this method:

> Returns a tensor with the same data and number of elements as input, but with the specified shape. When possible, the returned tensor will be a view of input. Otherwise, it will be a copy. Contiguous inputs and inputs with compatible strides can be reshaped without copying, but you should not depend on the copying vs. viewing behavior.

This means `torch.reshape` may return either a copy or a view of the original tensor, and you cannot count on which one you get. According to the developer:

> if you need a copy use clone() if you need the same storage use view(). The semantics of reshape() are that it may or may not share the storage and you don't know beforehand.

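Another difference is that `reshape()` works on both contiguous and non-contiguous tensors, whereas `view()` only works when the tensor is contiguous (more precisely, when its strides are compatible with the requested shape). The meaning of contiguous is covered in the contiguous section below.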
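
The difference can be sketched as follows. The example uses a transposed tensor as the non-contiguous input and compares `data_ptr()` values to tell a view from a copy; `view_failed` is only an illustrative flag.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.t()                    # transposed view of x, not contiguous
print(xt.is_contiguous())     # False

view_failed = False
try:
    xt.view(6)                # this layout cannot be flattened without a copy
except RuntimeError:
    view_failed = True
print(view_failed)            # True

flat = xt.reshape(6)          # reshape falls back to copying in this case
print(flat.tolist())          # [0, 3, 1, 4, 2, 5]
print(flat.data_ptr() == x.data_ptr())  # False: flat owns new memory

v = x.view(3, 2)              # contiguous input: view shares storage
print(v.data_ptr() == x.data_ptr())     # True
```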



## torch.transpose

```python
transpose(input, dim0, dim1) -> Tensor
```
Swaps the two specified dimensions of `input`. The returned Tensor shares storage with the original Tensor.

In [11]:
x = torch.rand(2,3,4)
x

tensor([[[0.7794, 0.2979, 0.1634, 0.8068],
         [0.6452, 0.9279, 0.0745, 0.8439],
         [0.9692, 0.1972, 0.3226, 0.2139]],

        [[0.3651, 0.6311, 0.7130, 0.4922],
         [0.0043, 0.5986, 0.3140, 0.2075],
         [0.4185, 0.1512, 0.0444, 0.4115]]])

In [12]:
torch.transpose(x, 0, 2)

tensor([[[0.7794, 0.3651],
         [0.6452, 0.0043],
         [0.9692, 0.4185]],

        [[0.2979, 0.6311],
         [0.9279, 0.5986],
         [0.1972, 0.1512]],

        [[0.1634, 0.7130],
         [0.0745, 0.3140],
         [0.3226, 0.0444]],

        [[0.8068, 0.4922],
         [0.8439, 0.2075],
         [0.2139, 0.4115]]])

## torch.permute

## contiguous

There are a few operations on Tensors in PyTorch that do not change the contents of a tensor, but change the way the data is organized. These operations include:

`narrow()`, `view()`, `expand()` and `transpose()`

For example: when you call transpose(), PyTorch doesn't generate a new tensor with a new layout, it just modifies meta information in the Tensor object so that the offset and stride describe the desired new shape. In this example, the transposed tensor and original tensor share the same memory:

In [13]:
x = torch.randn(3,2)
y = torch.transpose(x, 0, 1)
x[0, 0] = 42
print(y[0,0])

tensor(42.)


This is where the concept of contiguous comes in. In the example above, x is contiguous but y is not because its memory layout is different to that of a tensor of same shape made from scratch. Note that the word "contiguous" is a bit misleading because it's not that the content of the tensor is spread out around disconnected blocks of memory. Here bytes are still allocated in one block of memory but the order of the elements is different!

When you call `contiguous()` on a non-contiguous tensor, it makes a copy of the tensor such that the order of its elements in memory is the same as if it had been created from scratch with the same data. If the tensor is already contiguous, it is returned as-is.

Normally you don't need to worry about this. It is generally safe to assume everything will work, and to add a call to `contiguous()` only when PyTorch raises `RuntimeError: input is not contiguous` because an operation expects a contiguous tensor.
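
To see what this looks like in practice, the sketch below inspects strides with `stride()` and `is_contiguous()` before and after calling `contiguous()`. The tensor values are arbitrary.

```python
import torch

x = torch.arange(6).reshape(2, 3)
y = x.transpose(0, 1)          # same memory as x, different strides

print(x.stride(), x.is_contiguous())  # (3, 1) True
print(y.stride(), y.is_contiguous())  # (1, 3) False

z = y.contiguous()             # copies the data into row-major order
print(z.stride(), z.is_contiguous())  # (2, 1) True
print(z.data_ptr() == x.data_ptr())   # False: z has its own memory
print(torch.equal(y, z))              # True: same values, different layout
```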

# Math Operations

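## Trigonometric Functions

## Arithmetic Operations

We can apply operators such as `+`, `-`, `*`, and `/` directly to Tensors; they are computed element-wise. This requires the operand Tensors to have the same shape, or shapes that are compatible under broadcasting.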

In [14]:
t1 = torch.randn(2,3)
t2 = torch.randn(2,3)
print(f't1 = {t1}')
print(f't2 = {t2}')

t1 = tensor([[-1.4154,  0.1441,  1.1729],
        [ 0.4147, -0.3196,  1.4436]])
t2 = tensor([[ 2.5182, -0.7883, -0.8254],
        [-0.3024, -2.4285, -0.0451]])


In [15]:
# Equivalent to t1.add(t2) or torch.add(t1, t2)
t1 + t2

tensor([[ 1.1028, -0.6442,  0.3475],
        [ 0.1123, -2.7481,  1.3985]])

In [16]:
# Equivalent to t1.sub(t2) or torch.sub(t1, t2)
t1 - t2

tensor([[-3.9336,  0.9324,  1.9983],
        [ 0.7171,  2.1088,  1.4887]])

In [17]:
# Equivalent to t1.mul(t2) or torch.mul(t1, t2)
t1 * t2

tensor([[-3.5643, -0.1136, -0.9681],
        [-0.1254,  0.7762, -0.0651]])

In [18]:
# Equivalent to t1.div(t2) or torch.div(t1, t2)
t1 / t2

tensor([[ -0.5621,  -0.1828,  -1.4210],
        [ -1.3714,   0.1316, -32.0229]])

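Note that the multiplication here is not the matrix multiplication from mathematics. For matrix multiplication of Tensors, use `torch.matmul` or the `@` operator.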

In [19]:
# Equivalent to t1.matmul(t2.T) or torch.matmul(t1, t2.T)
t1 @ t2.T

tensor([[-4.6460,  0.0251],
        [ 0.1047,  0.5857]])
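
The examples in this section all use operands of the same shape. As a small sketch of the broadcasting case mentioned at the start of the section, the snippet below combines tensors of different but compatible shapes; the values are chosen only to make the results easy to check.

```python
import torch

m = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # shape (2, 3)
row = torch.tensor([10.0, 20.0, 30.0])                  # shape (3,)
col = torch.tensor([[1.0], [2.0]])                      # shape (2, 1)

# row is broadcast across both rows of m
print((m + row).tolist())  # [[10.0, 21.0, 32.0], [13.0, 24.0, 35.0]]

# col is broadcast across all three columns of m
print((m * col).tolist())  # [[0.0, 1.0, 2.0], [6.0, 8.0, 10.0]]

# (3,) and (2, 1) broadcast to a common shape of (2, 3)
print((row + col).shape)   # torch.Size([2, 3])
```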

## Logarithms, Exponentials, and Powers

## Numerical Truncation

## Other Operations

# Reduction Operations

## torch.mean

```python
'''
Args:
  input (Tensor): the input tensor.
  dim (int or tuple of ints): the dimension or dimensions to reduce.
  keepdim (bool): whether the output tensor has :attr:`dim` retained or not.
'''
mean(input, dim, keepdim=False, *, out=None) -> Tensor
```

Computes the mean of `input` along dimension `dim`, so the specified dimension is reduced away. With `keepdim=True`, that dimension is retained with size 1.

In [20]:
t = torch.randn(5,6)
t

tensor([[ 0.8801,  0.3556,  1.3300, -0.8489, -1.8893, -0.7004],
        [-1.1341,  0.7043,  0.0767, -0.9126, -0.9413, -0.5077],
        [-0.8743, -2.0277,  0.5664,  0.4266,  2.9812,  0.9459],
        [ 0.1711, -2.1501, -1.3418, -1.8992,  0.6031, -0.8814],
        [ 0.8263,  1.1446, -1.6875,  1.1150, -0.2767, -0.7673]])

In [21]:
# Reduce along dim=0 (down each column), collapsing the Tensor to 1-D
torch.mean(t, dim=0)

tensor([-0.0262, -0.3947, -0.2112, -0.4238,  0.0954, -0.3822])

In [22]:
torch.mean(t, dim=1, keepdim=True)

tensor([[-0.1455],
        [-0.4524],
        [ 0.3364],
        [-0.9164],
        [ 0.0591]])

For higher-dimensional Tensors, we can also reduce over several dimensions at once to compute the mean.

In [23]:
t = torch.randn(2,3,4)
t

tensor([[[ 1.0852,  1.2012,  0.0452,  0.3130],
         [-0.8747, -0.2138,  1.4990, -0.9035],
         [-0.0255,  1.0016, -1.6872,  1.2495]],

        [[ 1.3288,  0.9973, -1.3797,  1.6270],
         [ 2.6580, -0.2791,  0.3662,  1.7222],
         [ 2.5107,  0.3394,  1.1392,  0.2362]]])

In [24]:
# Equivalent to reducing dim 0 to get a 3x4 Tensor, then reducing dim 1 of that result to get a vector of shape (3,)
torch.mean(t, dim=(0,2))

tensor([0.6523, 0.4968, 0.5955])

In [25]:
t.mean(0).mean(1)

tensor([0.6523, 0.4968, 0.5955])

## torch.sum

`torch.sum` is used in much the same way as `torch.mean`, except that the reduction op is a sum rather than a mean.

In [26]:
torch.sum(t, dim=(0,2))

tensor([5.2181, 3.9745, 4.7637])

## torch.max, torch.min

# Comparison Operations

## torch.eq, torch.ge, torch.le, torch.gt, torch.lt

In [27]:
a = torch.randn(3,3)
b = torch.randn(3,3)
a,b

(tensor([[ 1.1975, -0.8859, -0.6680],
         [ 1.7126,  0.5976,  0.1704],
         [-0.7816,  1.4601,  1.1260]]),
 tensor([[-0.4304,  0.7686,  0.1742],
         [ 0.3721, -1.7648, -0.4187],
         [-0.2756, -0.2015, -2.1038]]))

In [28]:
torch.ge(a,b)

tensor([[ True, False, False],
        [ True,  True,  True],
        [False,  True,  True]])

## torch.sort

```python
sort(input, dim=-1, descending=False, *, out=None) -> (Tensor, LongTensor)
```
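`sort` sorts `input` along the given `dim`, in ascending order by default (pass `descending=True` for descending order). It returns the sorted Tensor together with a Tensor holding the indices of those elements in the original Tensor.

`dim` defaults to the last dimension of the Tensor.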

In [29]:
torch.sort(a, dim=1, descending=True)

torch.return_types.sort(
values=tensor([[ 1.1975, -0.6680, -0.8859],
        [ 1.7126,  0.5976,  0.1704],
        [ 1.4601,  1.1260, -0.7816]]),
indices=tensor([[0, 2, 1],
        [0, 1, 2],
        [1, 2, 0]]))
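
To clarify what the returned `indices` mean, here is a small sketch on a fixed integer tensor. It uses `torch.gather`, which this document has not covered yet, purely to show that looking up the original tensor with those indices reproduces the sorted values.

```python
import torch

a = torch.tensor([[3, 1, 2],
                  [9, 7, 8]])

values, indices = torch.sort(a, dim=1, descending=True)
print(values.tolist())   # [[3, 2, 1], [9, 8, 7]]
print(indices.tolist())  # [[0, 2, 1], [0, 2, 1]]

# indices[i][j] is the column in row i of a that ended up at position j.
rebuilt = torch.gather(a, 1, indices)
print(torch.equal(rebuilt, values))  # True
```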

## torch.topk

```python
topk(input, k, dim=None, largest=True, sorted=True, *, out=None) -> (Tensor, LongTensor)
```
`topk` returns the `k` largest elements of `input` along the specified dimension, together with their indices.

In [30]:
a = torch.randn(5)
a

tensor([ 0.4168, -1.7439, -0.4161, -0.0458,  0.5801])

In [31]:
torch.topk(a, 3)

torch.return_types.topk(
values=tensor([ 0.5801,  0.4168, -0.0458]),
indices=tensor([4, 0, 3]))
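
The signature above also lists `dim` and `largest`. As an illustrative sketch with made-up values, the snippet below takes the top 2 entries per row of a 2-D tensor, then uses `largest=False` to get the smallest entry per row instead.

```python
import torch

a = torch.tensor([[1.0, 5.0, 3.0],
                  [4.0, 2.0, 6.0]])

top = torch.topk(a, 2, dim=1)             # two largest values in each row
print(top.values.tolist())                # [[5.0, 3.0], [6.0, 4.0]]
print(top.indices.tolist())               # [[1, 2], [2, 0]]

low = torch.topk(a, 1, dim=1, largest=False)  # smallest value in each row
print(low.values.tolist())                # [[1.0], [2.0]]
print(low.indices.tolist())               # [[0], [1]]
```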

## torch.kthvalue

```python
kthvalue(input, k, dim=None, keepdim=False, *, out=None) -> (Tensor, LongTensor)
```
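`kthvalue` returns the `k`-th smallest element of the input Tensor along the specified dimension, together with its index. If `dim` is not specified, it defaults to the last dimension of the Tensor.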

In [32]:
a = torch.randn(4, 3)
a

tensor([[-1.8665,  1.0052, -0.3483],
        [ 0.3371, -0.3111, -0.8334],
        [-0.9578,  1.3914, -1.5777],
        [-1.4825,  1.5743, -0.7107]])

In [33]:
torch.kthvalue(a, 2, dim=0)

torch.return_types.kthvalue(
values=tensor([-1.4825,  1.0052, -0.8334]),
indices=tensor([3, 0, 1]))

# Spectral Operations

# Other Operations

## In-place Operations

PyTorch Tensors support many in-place operations; their method names end with a trailing `_`.

In [34]:
t1 = torch.ones(2,3)
print(f't1 = {t1}')
t1.add_(2)
print(f'after plus 2: t1 = {t1}')

t1 = tensor([[1., 1., 1.],
        [1., 1., 1.]])
after plus 2: t1 = tensor([[3., 3., 3.],
        [3., 3., 3.]])
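
For contrast, the following sketch compares the out-of-place `add` with the in-place `add_`. It checks `data_ptr()` to confirm that the in-place call reuses the same memory; the variable names are illustrative.

```python
import torch

t = torch.ones(2, 3)
ptr = t.data_ptr()

out = t.add(2)      # out-of-place: returns a new tensor, t is unchanged
print(t[0, 0].item(), out[0, 0].item())  # 1.0 3.0

same = t.add_(2)    # in-place: modifies t and returns t itself
print(same is t)            # True
print(t.data_ptr() == ptr)  # True: no new memory was allocated for t
print(t[0, 0].item())       # 3.0
```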


## 转换为其他数据类型

We can call the `numpy` method to get a numpy.ndarray object, or the `tolist` method to get a Python list.

In [35]:
t = torch.tensor([1,2,3,4,5,6])

In [36]:
# The returned ndarray shares storage with t
t.numpy()

array([1, 2, 3, 4, 5, 6])

In [37]:
t.reshape(2,3).tolist()

[[1, 2, 3], [4, 5, 6]]
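
The shared-storage behaviour of `numpy()` can be checked with a short sketch like the one below, which assumes a CPU tensor. It also shows `torch.from_numpy`, the reverse conversion, which is not mentioned above and is included only for comparison.

```python
import torch

t = torch.tensor([1, 2, 3])

arr = t.numpy()            # shares memory with t
arr[0] = 100
print(t.tolist())          # [100, 2, 3]: the write is visible through t

lst = t.tolist()           # an independent Python list
lst[1] = -1
print(t.tolist())          # [100, 2, 3]: t is unaffected

back = torch.from_numpy(arr)  # also shares memory with arr
back[2] = 7
print(arr.tolist())        # [100, 2, 7]
```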