# Chapter 3: It starts with a tensor

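Because floating-point numbers are how a network processes information, we need a way to encode the real-world data we want to process into something the network can digest, and then decode its output back into something we can understand and use.
A deep neural network typically learns the transformation from one form of data to another in stages, so the partially transformed data between stages can be thought of as a sequence of intermediate representations.
In general, these intermediate representations are collections of floating-point numbers that characterize the input and capture the structure of the data in a way that helps describe how inputs map to the network's outputs.
Before we start converting data into floating-point input, we need a solid understanding of how PyTorch handles and stores data: as input, as intermediate representations, and as output.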


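In the context of deep learning, a tensor generalizes vectors and matrices to an arbitrary number of dimensions.
Another name for the same concept is a multidimensional array. The number of dimensions of a tensor equals the number of indices needed to refer to a scalar value within it.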
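
As a small illustration of the link between dimensions and indices, the sketch below builds tensors with zero to three dimensions and checks how many indices are needed to reach a single scalar. The variable names are chosen here for illustration only.

```python
import torch

scalar = torch.tensor(3.0)               # 0 dimensions: no index needed
vector = torch.tensor([4.0, 1.0])        # 1 dimension: one index
matrix = torch.tensor([[4.0, 1.0],
                       [5.0, 3.0]])      # 2 dimensions: two indices
cube = torch.zeros(2, 3, 4)              # 3 dimensions: three indices

for t in (scalar, vector, matrix, cube):
    print(t.ndim, tuple(t.shape))

# Indexing with as many indices as there are dimensions leaves a 0-dimensional tensor
print(cube[1, 2, 3].ndim)
```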

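A torch tensor or NumPy array is typically a view over a contiguous block of memory. Here each element (not each byte) is a 32-bit (4-byte) floating-point number.
This means that a 1D tensor of 1,000,000 floats requires exactly 4,000,000 contiguous bytes,
plus a small overhead for metadata such as the dimensions and the numeric type.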
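
The arithmetic above can be checked directly. This is a minimal sketch that assumes PyTorch's default dtype (torch.float32) has not been changed.

```python
import torch

values = torch.zeros(1_000_000)              # default dtype is torch.float32

bytes_per_element = values.element_size()    # size of one element in bytes
total_bytes = values.numel() * bytes_per_element

print(values.dtype, bytes_per_element, total_bytes)
```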

In [1]:
# An example: a 2D triangle with vertices at (4, 1), (5, 3), and (2, 1)

## We can use a one-dimensional tensor
import torch
points = torch.zeros(6)
points[0] = 4.0
points[1] = 1.0
points[2] = 5.0
points[3] = 3.0
points[4] = 2.0
points[5] = 1.0
print(points)
## Or pass a Python list to the constructor to the same effect
points = torch.tensor([4.0, 1.0, 5.0, 3.0, 2.0, 1.0])
print(points)
## To get the coordinates of the first point:
float(points[0]), float(points[1])

tensor([4., 1., 5., 3., 2., 1.])
tensor([4., 1., 5., 3., 2., 1.])


(4.0, 1.0)

In [2]:
# We can also use a two-dimensional tensor
points = torch.tensor([[4.0,1.0],[5.0,3.0],[2.0,1.0]])
print(points)
print(points[0])
print(points[0,1])

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])
tensor([4., 1.])
tensor(1.)


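**Remark:**
Note that points[0] above returns another tensor: a new 1D tensor of size 2 that references the values in the first row of points.
Does this mean that a new chunk of memory was allocated, the values were copied into it, and the new memory was returned wrapped in a new tensor object?
No. The new tensor is a view onto the same underlying memory, so nothing is copied.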
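
One way to convince ourselves that no copy is made is to compare memory addresses. The sketch below uses data_ptr(), which returns the address of a tensor's first element, and assumes float32 elements of 4 bytes each.

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
first_row = points[0]
second_row = points[1]

# The first row starts at the same address as the whole tensor
same_start = first_row.data_ptr() == points.data_ptr()
# The second row starts two float32 elements (2 * 4 bytes) further along
byte_gap = second_row.data_ptr() - points.data_ptr()

print(same_start, byte_gap)
```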

In [3]:
print(points.shape,"\n")
print(points[1:])
print(points[1:,:]) # All rows after the first; all columns
print(points[1:,0]) # All rows after the first; first column
print(points[None]) # Adds a leading dimension of size 1, like unsqueeze(0)

torch.Size([3, 2]) 

tensor([[5., 3.],
        [2., 1.]])
tensor([[5., 3.],
        [2., 1.]])
tensor([5., 2.])
tensor([[[4., 1.],
         [5., 3.],
         [2., 1.]]])


## 3.4 Named tensors

In [4]:
img_t = torch.randn(3,5,5) # shape [channels, rows, columns]
weights = torch.tensor([0.2126, 0.7152, 0.0722])
weights.shape

torch.Size([3])

In [5]:
batch_t = torch.randn(2,3,5,5)

In [6]:
img_gray_naive = img_t.mean(-3) # -3 counts from the end: the third-to-last dimension (channels)
batch_gray_naive = batch_t.mean(-3) # same here: channels are third from the end
img_gray_naive.shape, batch_gray_naive.shape

(torch.Size([5, 5]), torch.Size([2, 5, 5]))

In [7]:
unsqueezed_weights = weights.unsqueeze(-1).unsqueeze_(-1)
unsqueezed_weights.shape

torch.Size([3, 1, 1])

**<font color =  'red'> Remark </font >**

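unsqueeze() is a PyTorch method that inserts a new dimension of size 1 at the specified position of a tensor.

For example, given a tensor of shape (3, 4), unsqueeze(0) inserts a new dimension at position 0 and yields a tensor of shape (1, 3, 4).

The argument to unsqueeze() can be non-negative or negative. A non-negative argument gives the position of the new dimension counted from the front; a negative argument gives the position counted from the end of the resulting tensor.

For example, given a tensor of shape (3, 4), unsqueeze(-1) inserts a new dimension in the last position and yields a tensor of shape (3, 4, 1).
Adding size-1 dimensions in this way makes it easy to broadcast two tensors against each other so that their shapes become compatible.

For a detailed explanation, see <https://stackoverflow.com/questions/57237352/what-does-unsqueeze-do-in-pytorch/65831759#65831759>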

In [8]:
# An example of unsqueeze()
a = torch.tensor([[1,2],
                 [3,4]])
print(a.shape)
print(torch.unsqueeze(a, 2) == torch.unsqueeze(a, -1))
print(torch.unsqueeze(a, 1) == torch.unsqueeze(a, -2))
print(torch.unsqueeze(a, 0) == torch.unsqueeze(a, -3))
print(torch.unsqueeze(a, 1) == torch.unsqueeze(a, -3))
a.unsqueeze(-3).shape,a.unsqueeze(-2).shape,a.unsqueeze(-1).shape,a.unsqueeze(0).shape,a.unsqueeze(1).shape,a.unsqueeze(2).shape

torch.Size([2, 2])
tensor([[[True],
         [True]],

        [[True],
         [True]]])
tensor([[[True, True]],

        [[True, True]]])
tensor([[[True, True],
         [True, True]]])
tensor([[[ True,  True],
         [False, False]],

        [[False, False],
         [ True,  True]]])


(torch.Size([1, 2, 2]),
 torch.Size([2, 1, 2]),
 torch.Size([2, 2, 1]),
 torch.Size([1, 2, 2]),
 torch.Size([2, 1, 2]),
 torch.Size([2, 2, 1]))

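PyTorch allows us to multiply tensors of the same shape, as well as tensors where one operand has size 1 in a given dimension.

It also automatically prepends leading dimensions of size 1. This feature is called broadcasting.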
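
The broadcasting rules can be explored on shapes alone. The sketch below uses torch.broadcast_shapes for the compatible cases and an actual multiplication for an incompatible one. The shapes mirror the image and weight tensors used in this section.

```python
import torch

# Shapes are aligned from the right; missing leading dimensions count as size 1
print(torch.broadcast_shapes((3, 5, 5), (3, 1, 1)))
print(torch.broadcast_shapes((2, 3, 5, 5), (3, 1, 1)))

# Without the two trailing size-1 dimensions the weights do not line up with
# the channel dimension: the last sizes are 5 and 3, and neither is 1
try:
    torch.zeros(3, 5, 5) * torch.zeros(3)
    compatible = True
except RuntimeError:
    compatible = False
print(compatible)
```

This is why the weights are unsqueezed to shape [3, 1, 1] before being multiplied with the image.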

In [9]:
img_weights = (img_t * unsqueezed_weights) # shapes only: [3,5,5] * [3,1,1] -> [3,5,5]
batch_weights = (batch_t * unsqueezed_weights)  # [2,3,5,5] * [3,1,1] -> [2,3,5,5]
img_gray_weighted = img_weights.sum(-3) # sum over the third dimension from the end (the 3 channels)
batch_gray_weighted = batch_weights.sum(-3)
batch_weights.shape, batch_t.shape, unsqueezed_weights.shape

(torch.Size([2, 3, 5, 5]), torch.Size([2, 3, 5, 5]), torch.Size([3, 1, 1]))
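
As a possible alternative to unsqueezing and summing by hand, the same weighted sum can be written with torch.einsum. This is only a sketch: the tensors are re-created here with a fixed seed so the block is self-contained, and the index letters c, h, w are labels chosen for channels, height, and width.

```python
import torch

torch.manual_seed(0)
img_t = torch.randn(3, 5, 5)
batch_t = torch.randn(2, 3, 5, 5)
weights = torch.tensor([0.2126, 0.7152, 0.0722])

# Broadcasting version, as in the cell above
unsqueezed_weights = weights.unsqueeze(-1).unsqueeze(-1)
img_gray_weighted = (img_t * unsqueezed_weights).sum(-3)
batch_gray_weighted = (batch_t * unsqueezed_weights).sum(-3)

# einsum version: multiply along the channel index c and sum it away;
# "..." stands for any leading dimensions, such as the batch
img_gray_einsum = torch.einsum('...chw,c->...hw', img_t, weights)
batch_gray_einsum = torch.einsum('...chw,c->...hw', batch_t, weights)

print(img_gray_einsum.shape, batch_gray_einsum.shape)
print(torch.allclose(img_gray_weighted, img_gray_einsum, atol=1e-6))
```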

## 3.6 The tensor API

The vast majority of operations on and between tensors are available in the torch module and can also be called as methods of a tensor object.

In [10]:
# Example: transpose
## Note the difference between functions (torch.transpose) and methods (a.transpose)
a = torch.ones(3,2)
a_t = torch.transpose(a, 0, 1)
print(a.shape, a_t.shape)

a = torch.ones(3,2)
a_t = a.transpose(0,1)
print(a.shape, a_t.shape)


torch.Size([3, 2]) torch.Size([2, 3])
torch.Size([3, 2]) torch.Size([2, 3])


## 3.7 Tensors: Scenic views of storage

The values in a tensor are allocated in contiguous chunks of memory managed by torch.Storage instances.

A storage is a one-dimensional array of numerical data: that is, a contiguous block of memory containing numbers of a given type, such as float (32 bits representing a floating-point number) or int64 (64 bits representing an integer).

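A PyTorch Tensor instance is a view of such a Storage instance that is capable of **indexing** into that storage using an **offset** and **per-dimension strides**.

Multiple tensors can index the same storage even if they index into the data differently.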

In [11]:
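## Indexing into storage

points = torch.tensor([[4.0,1.0],[5.0,3.0],[2.0,1.0]])
points.storage()  # storage is a method; without the parentheses we only get the bound method back

 4.0
 1.0
 5.0
 3.0
 2.0
 1.0
[torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6]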

In [12]:
##  index into a storage manually

points_storage = points.storage()
print("points_storage[0] = {}".format(points_storage[0]))
print("points.storage()[1] = {}".format(points.storage()[1]))

points_storage[0] = 4.0
points.storage()[1] = 1.0




In [13]:
## Changing a value in a storage changes the content of the tensor that refers to it

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]]) 
points_storage = points.storage()
points_storage[0] = 2.0
points

tensor([[2., 1.],
        [5., 3.],
        [2., 1.]])

In [14]:
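## zero_() zeroes out all elements of the tensor in place; any method without the trailing underscore leaves the source tensor unchanged and returns a new tensor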

a = torch.ones(3,2)
a.zero_()
print(a)

b = torch.ones(3,2)
print(b.sum())
print(b)

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
tensor(6.)
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])


## 3.8 Tensor metadata: Size, offset, and stride

In order to index into a storage, tensors rely on a few pieces of information that, together with their storage, unequivocally define them: **<font  color = 'red' >size, offset, and stride</font>**.

**The size** (or shape, in NumPy parlance) is a tuple indicating how many elements across each dimension the tensor represents. 

**The storage offset** is the index in the storage corresponding to the first element in the tensor.

**The stride** is the number of elements in the storage that need to be skipped over to obtain the next element along each dimension.

<font color = 'red'>Accessing an element i, j in a 2D tensor results in accessing the storage_offset +
stride[0] * i + stride[1] * j element in the storage. </font>
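
The indexing formula can be verified by hand. In this sketch, flat is a flattened view of the contiguous points tensor and stands in for the storage; sub is a subtensor with a nonzero offset. Both names are introduced here for illustration.

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
flat = points.flatten()      # the six values in storage order: 4, 1, 5, 3, 2, 1

sub = points[1:]             # the last two points: a view with a nonzero offset
offset = sub.storage_offset()
stride = sub.stride()

for i in range(sub.shape[0]):
    for j in range(sub.shape[1]):
        storage_index = offset + stride[0] * i + stride[1] * j
        # The value found through the formula matches ordinary indexing
        print((i, j), storage_index, flat[storage_index].item(), sub[i, j].item())
```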

This indirection between Tensor and Storage makes some operations inexpensive, like transposing a tensor or extracting a subtensor, because they do not lead to memory reallocations. Instead, they consist of allocating a new Tensor object with a different value for size, storage offset, or stride.
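
To illustrate that these operations only change metadata, the sketch below rebuilds a transpose and a subtensor by hand with torch.as_strided, passing the size, stride, and offset explicitly. This is meant as a demonstration of the idea, not as a recommended way to transpose or slice.

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

# Transpose: swap size and stride, keep offset 0
manual_t = torch.as_strided(points, size=(2, 3), stride=(1, 2))
# Second point: size (2,), stride (1,), offset 2
manual_second = torch.as_strided(points, size=(2,), stride=(1,), storage_offset=2)

print(torch.equal(manual_t, points.t()))
print(torch.equal(manual_second, points[1]))
print(manual_t.data_ptr() == points.data_ptr())   # same memory, no copy
```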

In [15]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1]
print("points.storage_offset() = {}\n".format(points.storage_offset()),
      "second_point.storage_offset() = {}".format(second_point.storage_offset()), sep="")

print(second_point.stride())
print(points.stride())

print(second_point.size())
print(second_point.shape)


points.storage_offset() = 0
second_point.storage_offset() = 2
(1,)
(2, 1)
torch.Size([2])
torch.Size([2])
(1,)
(2, 1)
torch.Size([2])
torch.Size([2])


<font color=red>**This also means that changing the subtensor has a side effect on the original tensor. Cloning the subtensor first with clone(), as in In [16], avoids this.**</font>
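
A minimal sketch of the side effect itself, without clone(): writing through the subtensor changes the original tensor because both refer to the same storage.

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1]      # a view, not a copy
second_point[0] = 10.0        # writes through to the shared storage

print(points)                 # the second row now starts with 10.
```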

In [16]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]]) 
second_point = points[1].clone()
second_point[0] = 10.0
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])

In [17]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
print(points)
points_t = points.t()
points_t

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])


tensor([[4., 5., 2.],
        [1., 3., 1.]])

In [28]:
some_t = torch.ones(3,4,5)
some_t,some_t.stride(),some_t.size()

(tensor([[[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
 
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
 
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]]]),
 (20, 5, 1),
 torch.Size([3, 4, 5]))

In [23]:
transpose_t = some_t.transpose(0,2)
transpose_t.shape,transpose_t.stride(),transpose_t

(torch.Size([5, 4, 3]),
 (1, 5, 20),
 tensor([[[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],
 
         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],
 
         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],
 
         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],
 
         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]]]))
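
The strides printed above follow a simple pattern: for a contiguous tensor, the stride of a dimension is the product of the sizes of all dimensions after it. The helper contiguous_strides below is a hypothetical function written for this note, not part of PyTorch.

```python
import torch

def contiguous_strides(shape):
    """Strides (in elements) of a row-major, contiguous tensor of this shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))

some_t = torch.ones(3, 4, 5)
print(contiguous_strides(some_t.shape), some_t.stride())

# After transpose(0, 2) the strides are swapped rather than recomputed,
# so they no longer match the contiguous pattern for the new shape
transpose_t = some_t.transpose(0, 2)
print(contiguous_strides(transpose_t.shape), transpose_t.stride())
```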

### 3.8.4 Contiguous tensors

In our case, points is contiguous, while its transpose is not.

+ <font color=red> Attention: the tensor is contiguous rather than continuous!</font>

In [30]:
points.is_contiguous(),points_t.is_contiguous()

(True, False)

In [34]:
# After contiguous() (see In [35]) the content of the tensor is the same, but the stride changes, as does the storage. Before the call:

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_t = points.t()
points_t.stride(),points_t.storage(),points_t

((1, 2),
  4.0
  1.0
  5.0
  3.0
  2.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6],
 tensor([[4., 5., 2.],
         [1., 3., 1.]]))

In [35]:
points_t_cont = points_t.contiguous()
points_t_cont.stride(),points_t_cont.storage(),points_t_cont

((3, 1),
  4.0
  5.0
  2.0
  1.0
  3.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6],
 tensor([[4., 5., 2.],
         [1., 3., 1.]]))
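
The sketch below summarizes what contiguous() does in the two cases, again using data_ptr() to tell a copy from a view. It also shows one practical reason to care: view() is used here as an example of an operation that needs a compatible memory layout.

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_t = points.t()

# Already contiguous: contiguous() hands back the same memory
print(points.contiguous().data_ptr() == points.data_ptr())

# Not contiguous: contiguous() copies the values into a new, reordered block
points_t_cont = points_t.contiguous()
print(points_t_cont.data_ptr() == points_t.data_ptr())

# view() cannot flatten the transpose without a copy, so it raises an error
try:
    points_t.view(6)
    view_ok = True
except RuntimeError:
    view_ok = False
print(view_ok, points_t_cont.view(6))
```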

## 3.9 Moving tensors to the GPU

In [None]:
# omission