In [1]:
%matplotlib inline


# Tensors

Tensors are a specialized data structure, very similar to arrays
and matrices. In PyTorch, we use tensors to encode the inputs and
outputs of a model, as well as the model’s parameters.

Tensors are similar to NumPy’s ndarrays, **except that tensors can run on
GPUs or other specialized hardware to accelerate computing**. If you’re familiar with ndarrays, you’ll
be right at home with the Tensor API. If not, follow along in this quick
API walkthrough.




In [2]:
import torch
import numpy as np

## Tensor Initialization

Tensors can be initialized in various ways. Take a look at the following examples:

**Directly from data**

Tensors can be created directly from data. The data type is automatically inferred.



Note: you will also see code that calls `torch.Tensor` directly. That style is generally discouraged; for the reasons, see [stackoverflow](https://stackoverflow.com/questions/51911749/what-is-the-difference-between-torch-tensor-and-torch-tensor) and [discuss.pytorch.org](https://discuss.pytorch.org/t/difference-between-torch-tensor-and-torch-tensor/30786).

In [3]:
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)

In [4]:
type(x_data)

torch.Tensor
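To make the difference between the two spellings concrete, here is a minimal sketch. It assumes the default dtype has not been changed with `torch.set_default_dtype`; the variable names `inferred` and `legacy` are illustrative only.

```python
import torch

data = [[1, 2], [3, 4]]

# torch.tensor is a factory function: it copies the data and infers the dtype.
inferred = torch.tensor(data)
# torch.Tensor is the tensor class; its constructor does not infer the dtype
# from the data and produces the default floating-point type instead.
legacy = torch.Tensor(data)

print(inferred.dtype)  # torch.int64
print(legacy.dtype)    # torch.float32 (assuming the default dtype is unchanged)
```

Both calls return an object whose type is `torch.Tensor`, which is why `type(x_data)` above prints `torch.Tensor` even though the tensor was built with `torch.tensor`.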

**From a NumPy array**

Tensors can be created from NumPy arrays (and vice versa; see the "Bridge with NumPy" section below).



In [5]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

**From another tensor:**

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.



In [6]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float32) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.8785, 0.8628],
        [0.9422, 0.9890]]) 



Note: `torch.float` is an alias for `torch.float32`; `torch.float64` is also available.
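The dtype aliases can be checked directly. The following is a small sketch; `x_int` and `x_double` are illustrative names, not part of the tutorial.

```python
import torch

# The short names are aliases of the explicit-width dtypes.
print(torch.float == torch.float32)   # True
print(torch.double == torch.float64)  # True

x_int = torch.tensor([[1, 2], [3, 4]])
# Overriding dtype in a *_like call keeps the shape but changes the type.
x_double = torch.ones_like(x_int, dtype=torch.float64)
print(x_int.dtype, x_double.dtype)    # torch.int64 torch.float64
```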

**With random or constant values:**

``shape`` is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.



In [7]:
shape = (2,3,)  # (2, 3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor: 
 tensor([[0.9515, 0.4952, 0.4327],
        [0.9830, 0.9622, 0.5797]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])


--------------




## Tensor Attributes

Tensor attributes describe their shape, datatype, and the device on which they are stored.



In [8]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


--------------




In [9]:
tensor.size()

torch.Size([3, 4])

Note that `shape` and `size()` are equivalent; `shape` exists to match the NumPy spelling more closely. For the discussion, see this [issue](https://github.com/pytorch/pytorch/issues/5544).
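A short sketch of the equivalence, using a fresh tensor `t` (an illustrative name) so that it runs on its own:

```python
import torch

t = torch.rand(3, 4)

# The attribute and the method report the same torch.Size.
print(t.shape == t.size())    # True
# torch.Size is a tuple subclass, so unpacking and indexing work as usual.
rows, cols = t.shape
print(rows, cols)             # 3 4
# size() also accepts a dimension index, equivalent to indexing shape.
print(t.size(1), t.shape[1])  # 4 4
```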

## Tensor Operations

Over 100 tensor operations, including transposing, indexing, slicing,
mathematical operations, linear algebra, random sampling, and more, are
comprehensively described
[here](https://pytorch.org/docs/stable/torch.html).

Each of them can be run on the GPU (at typically higher speeds than on a
CPU). If you’re using Colab, allocate a GPU by going to Edit > Notebook
Settings.




In [10]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
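A common variation, sketched here as one possible pattern rather than the tutorial's own method, is to pick the device once and pass it around, so the same code runs with or without a GPU. The names `device` and `direct` are illustrative.

```python
import torch

# Choose the device once; falls back to the CPU when no GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.rand(3, 4)
tensor = tensor.to(device)  # .to() returns a tensor on the target device
print(tensor.device)

# Tensors can also be created directly on the device, avoiding a later copy.
direct = torch.ones(3, 4, device=device)
print(direct.device)
```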

Try out some of the operations from the list.
If you're familiar with the NumPy API, you'll find the Tensor API a breeze to use.




**Standard numpy-like indexing and slicing:**



In [11]:
tensor = torch.ones(4, 4)
tensor[:,1] = 0
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
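A few more indexing forms may help. This sketch uses `torch.arange` so that each element is distinguishable; the variable names are illustrative.

```python
import torch

grid = torch.arange(16).reshape(4, 4)

first_row = grid[0]        # values 0, 1, 2, 3
first_col = grid[:, 0]     # values 0, 4, 8, 12
last_col = grid[..., -1]   # values 3, 7, 11, 15
block = grid[1:3, 1:3]     # 2x2 block from the middle

print(first_row, first_col, last_col, block, sep="\n")

# Basic slicing returns a view: writing through the slice changes the original.
block[:] = -1
print(grid[1, 1].item())   # -1
```

As in NumPy, basic slices are views on the same storage, which is why the assignment `tensor[:,1] = 0` above modifies `tensor` itself.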


**Joining tensors** You can use ``torch.cat`` to concatenate a sequence of tensors along a given dimension.
See also [torch.stack](https://pytorch.org/docs/stable/generated/torch.stack.html),
another tensor joining op that is subtly different from ``torch.cat``.



In [12]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])


The experiments below show how `cat` and `stack` behave.

### torch.cat & np.concatenate

In [13]:
torch.cat([tensor, tensor, tensor], dim=0).shape

torch.Size([12, 4])

In [14]:
torch.cat([tensor, tensor, tensor], dim=1).shape

torch.Size([4, 12])

As shown above, `cat` is concatenation in the usual sense. The behavior can be stated more precisely.

A call `torch.cat([t1, ..., tm], dim=catDim)` concatenates m tensors, subject to these requirements:
 - a tensor `ti` may be empty (`torch.tensor([])`)
 - all non-empty tensors must have the same size in every dimension except the **concatenation dimension `catDim`**

If `ti` has shape `(d0, d1, ..., d_catDim_i, ..., dD)`, then every dimension other than `d_catDim_i` must agree across the tensors, and the result has shape `(d0, d1, ..., sum(d_catDim_1, ..., d_catDim_m), ..., dD)`.

In the example above we concatenate three tensors of shape `(4, 4)`, so `dim=0` gives `(4+4+4, 4)`, i.e. `(12, 4)`, and `dim=1` gives `(4, 4+4+4)`, i.e. `(4, 12)`.
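The shape rule can be checked with tensors whose sizes differ along the concatenation dimension. This is a minimal sketch; `a`, `b`, `joined`, and `with_empty` are illustrative names.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(4, 3)

# Sizes differ only along dim 0, so concatenating along dim 0 is allowed.
joined = torch.cat([a, b], dim=0)
print(joined.shape)  # torch.Size([6, 3])

# Along dim 1 the other dimension (2 vs 4) does not match, so cat fails.
try:
    torch.cat([a, b], dim=1)
except RuntimeError as err:
    print("cat failed:", type(err).__name__)

# A 1-D empty tensor is accepted and contributes nothing to the result.
with_empty = torch.cat([torch.tensor([]), a], dim=0)
print(with_empty.shape)  # torch.Size([2, 3])
```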

NumPy has a corresponding operation, `np.concatenate`.

In [15]:
array = tensor.numpy()
array

array([[1., 0., 1., 1.],
       [1., 0., 1., 1.],
       [1., 0., 1., 1.],
       [1., 0., 1., 1.]], dtype=float32)

In [16]:
np.concatenate([array, array, array], axis=0).shape

(12, 4)

In [17]:
np.vstack([array, array, array]).shape

(12, 4)

In [18]:
np.concatenate([array, array, array], axis=1).shape

(4, 12)

In [19]:
np.hstack([array, array, array]).shape

(4, 12)

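For these 2-D arrays, `vstack` and `hstack` are shorthand for concatenation with `axis=0` and `axis=1` respectively; they perform the same joins.

**Do not confuse NumPy's `vstack` and `hstack` with `np.stack` and `torch.stack` below; they are fundamentally different.**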
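The difference is easiest to see in the number of dimensions of the result. The following sketch rebuilds a `(4, 4)` array so that it runs on its own:

```python
import numpy as np

array = np.ones((4, 4), dtype=np.float32)

# vstack/hstack join along an existing axis: the result still has 2 dimensions.
v = np.vstack([array, array])
h = np.hstack([array, array])
print(v.shape, h.shape)  # (8, 4) (4, 8)

# stack inserts a new axis: the result has 3 dimensions.
s = np.stack([array, array], axis=0)
print(s.shape)           # (2, 4, 4)
```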

### torch.stack &  np.stack

In [20]:
torch.stack([tensor, tensor, tensor], dim=0).shape

torch.Size([3, 4, 4])

In [21]:
torch.stack([tensor, tensor, tensor], dim=1).shape

torch.Size([4, 3, 4])

In [22]:
torch.stack([tensor, tensor, tensor], dim=2).shape

torch.Size([4, 4, 3])

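`torch.stack` concatenates a sequence of tensors along a **new** dimension. The key point is that `stack` creates a new dimension and joins the tensors along it.

Unlike `torch.cat`, whose shape requirement is relatively loose, `stack` requires every input tensor to have exactly the same shape `(d0, d1, ..., dD)`; no dimension is exempt. The result of stacking m tensors can be described just as formally.

Stacking at `d_stackDim` (**note that 0 <= d_stackDim <= D+1**) yields a tensor of shape `(m, d0, d1, ..., dD)` for dim 0, `(d0, m, d1, ..., dD)` for dim 1, ..., and `(d0, d1, ..., dD, m)` for dim D+1.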
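One way to see the relationship between the two operations: stacking at `dim` gives the same result as adding a size-1 dimension at `dim` to every input and then concatenating along it. A minimal sketch, with illustrative names `tensors`, `stacked`, and `via_cat`:

```python
import torch

# Three (4, 4) tensors filled with 0, 1 and 2 so they can be told apart.
tensors = [torch.full((4, 4), float(i)) for i in range(3)]

for dim in range(3):  # valid positions for the new dimension: 0, 1, 2
    stacked = torch.stack(tensors, dim=dim)
    # Equivalent formulation: unsqueeze each input at `dim`, then cat along it.
    via_cat = torch.cat([t.unsqueeze(dim) for t in tensors], dim=dim)
    print(dim, tuple(stacked.shape), torch.equal(stacked, via_cat))
```

The loop should report the shapes `(3, 4, 4)`, `(4, 3, 4)` and `(4, 4, 3)`, matching the cells above, with the two formulations agreeing each time.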

`np.stack` works the same way.

In [23]:
np.stack([array, array, array], axis=0).shape

(3, 4, 4)

In [24]:
np.stack([array, array, array], axis=1).shape

(4, 3, 4)

In [25]:
np.stack([array, array, array], axis=2).shape

(4, 4, 3)

**Multiplying tensors**

- element-wise product: `tensor.mul(tensor)`, `tensor * tensor`
- matrix multiplication: `tensor.matmul(tensor)`, `tensor @ tensor`


In [26]:
# This computes the element-wise product
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")
# Alternative syntax:
print(f"tensor * tensor \n {tensor * tensor}")

tensor.mul(tensor) 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor * tensor 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


This computes the matrix multiplication between two tensors:



In [27]:
print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")
# Alternative syntax:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")

tensor.matmul(tensor.T) 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

tensor @ tensor.T 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
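With square all-ones tensors the two kinds of product are easy to mix up, so here is a sketch with non-square shapes. The names `a` and `b` are illustrative.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 4)

# Matrix multiplication contracts the inner dimension: (2, 3) @ (3, 4) -> (2, 4).
product = a @ b
print(product.shape)      # torch.Size([2, 4])

# The element-wise product needs equal (or broadcastable) shapes.
elementwise = a * a
print(elementwise.shape)  # torch.Size([2, 3])

try:
    a * b                 # (2, 3) and (3, 4) cannot be broadcast together
except RuntimeError as err:
    print("element-wise product failed:", type(err).__name__)
```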


**In-place operations**
Operations that have a ``_`` suffix are in-place. For example: ``x.copy_(y)``, ``x.t_()``, will change ``x``.



In [28]:
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])


<div class="alert alert-info"><h4>Note</h4><p>In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss
     of history. Hence, their use is discouraged.</p></div>
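The following sketch illustrates one way the problem shows up in practice: autograd refuses an in-place update of a leaf tensor that requires gradients, while the out-of-place form works. The tensor `w` is an illustrative stand-in for a model parameter.

```python
import torch

w = torch.ones(3, requires_grad=True)

# In-place update of a leaf tensor that requires grad is rejected by autograd.
try:
    w.add_(1)
except RuntimeError as err:
    print("in-place op refused:", type(err).__name__)

# The out-of-place form creates a new tensor and keeps the graph intact.
y = (w + 1).sum()
y.backward()
print(w.grad)  # tensor([1., 1., 1.])
```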



--------------





## Bridge with NumPy

Tensors on the **CPU** and NumPy arrays **can share their underlying memory**
locations, and changing one will change the other.



### Tensor to NumPy array



In [29]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the NumPy array.



In [30]:
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


### NumPy array to Tensor



In [31]:
n = np.ones(5)
t = torch.from_numpy(n)

Changes in the NumPy array are reflected in the tensor.



In [32]:
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
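When sharing is not wanted, a copy breaks the link. This sketch contrasts `torch.from_numpy`, which shares memory, with `torch.tensor`, which copies; `shared` and `copied` are illustrative names.

```python
import numpy as np
import torch

n = np.ones(5)
shared = torch.from_numpy(n)  # shares memory with n
copied = torch.tensor(n)      # copies the data

n += 1  # in-place update of the NumPy array

print(shared)  # follows the change: every element is 2
print(copied)  # unaffected: every element is still 1
```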
