转置卷积（transposed convolution）

转置卷积（Transposed Convolution，也称为反卷积或上采样卷积）是一种用于**放大特征图尺寸**的操作，常用于生成器网络（如GAN）或图像分割模型（如U-Net）的上采样阶段。

尽管名字是“转置卷积”，它并不是普通卷积的数学转置，而是通过增加步长来实现特征图的放大。下面我们通过一个简单的例子来解释它的操作过程。

---

### **普通卷积回顾**
#### 输入：

- 输入矩阵：

$$
\text{Input} = \begin{bmatrix} 
1 & 2 & 3 \\ 
4 & 5 & 6 \\ 
7 & 8 & 9 
\end{bmatrix}
$$


- 卷积核：

 $$
\text{Kernel} = \begin{bmatrix}
1 & 0 \\ 
0 & -1 
\end{bmatrix}
$$

- 步长（stride）：1  

#### 普通卷积计算：
1. 将卷积核在输入矩阵上滑动，每次计算对应位置的点积。
2. 输出大小为：  

$$
   \text{Output size} = \frac{\text{Input size} - \text{Kernel size}}{\text{Stride}} + 1 = \frac{3 - 2}{1} + 1 = 2 \times 2
$$

   输出矩阵：  

$$
\text{Output} = \begin{bmatrix} 
(1-5) & (2-6) \\ 
(4-8) & (5-9)
\end{bmatrix}
= \begin{bmatrix}
 -4 & -4 \\
 -4 & -4 
\end{bmatrix}
$$

---

### **转置卷积**

转置卷积的目标是**反向放大特征图**，即根据小的输入和卷积核产生更大的输出。

#### 输入：
- 输入矩阵： 

$$
\text{Input} = \begin{bmatrix}
1 & 2 \\ 
3 & 4 
\end{bmatrix}
$$

- 卷积核：

$$
\text{Kernel} = \begin{bmatrix}
1 & 0 \\
0 & -1
\end{bmatrix}
$$

- 步长（stride）：1  

#### 转置卷积计算：

1. **插入零填充**（如果步长 > 1）：
   - 步长为 1 时，不需要插入零。
   - 如果步长为 2，则在输入矩阵中每个元素之间插入一个零。

2. **反向滑动卷积核**：
   - 对输入矩阵的每个元素，取该值乘以整个卷积核，并将结果累加到输出矩阵对应的位置。

4. **输出大小计算**：

$$
   \text{Output size} = (\text{Input size} - 1) \times \text{Stride} + \text{Kernel size}
$$

   即 $$ (2 - 1) \times 1 + 2 = 3 \times 3 $$。

#### 具体操作：
1. 输入矩阵的第一个元素 $ 1 $：
 
$$
\begin{bmatrix}
1 & 0 \\ 
0 & -1 
\end{bmatrix}
$$

   放置到输出矩阵的左上角。

2. 第二个元素 $ 2 $：

 $$
\begin{bmatrix}
2 & 0 \\ 
0 & -2
\end{bmatrix}
 $$

   放置到输出矩阵的对应位置（向右平移）。

3. 第三个元素 $ 3 $：

 $$
\begin{bmatrix}
3 & 0 \\ 
0 & -3
\end{bmatrix}
 $$

   放置到输出矩阵的左下角。

4. 第四个元素 $ 3 $：

 $$
\begin{bmatrix}
4 & 0 \\ 
0 & -4
\end{bmatrix}
 $$

   放置到输出矩阵的右下角。

5. 累加位置上的值。

#### 最终输出：

```
    [1  0]      [2  0]
    [0 -1]  +  [0 -2]   =   [1  2  0]
                                     [0 -1 -2]

    [3  0]      [4  0]
    [0 -3]  +  [0 -4]   =   [3  4  0]
                                     [0 -3 -4]

Combining these:

[1   2   0]
[0  -1  -2]
[3   4   0]
[0  -3  -4]

The element at (1,0) of the output is 0 + 3 = 3
The element at (1,1) of the output is -1 + 4 = 3
The element at (1,2) of the output is -2 + 0 = -2

Overlapping and summing gives us:

[1   2   0]
[3   3  -2]
[0  -3  -4]
```

$$
\text{Output} = \begin{bmatrix}
1 & 2 & 0 \\ 
3 & 3 & -2 \\ 
0 & -3 & -4
\end{bmatrix}
$$

---

### **与普通卷积的区别**
1. **输入与输出的尺寸**：
   - 普通卷积：输入大，输出小。
   - 转置卷积：输入小，输出大。

2. **滑动方式**：
   - 普通卷积在输入矩阵上滑动卷积核。
   - 转置卷积在输出矩阵上滑动卷积核的“贡献”。

---

### **总结**
转置卷积通过插值（隐式或显式）和卷积操作实现了特征图的放大。它的核心思想是利用小的输入特征图生成更大的输出，同时保留空间信息，是上采样的一种有效方式。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

转置卷积（transposed convolution） #9

普通卷积回顾

输入：

普通卷积计算：

转置卷积

输入：

转置卷积计算：

具体操作：

最终输出：

与普通卷积的区别

总结

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

转置卷积（transposed convolution） #9

Description

普通卷积回顾

输入：

普通卷积计算：

转置卷积

输入：

转置卷积计算：

具体操作：

最终输出：

与普通卷积的区别

总结

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions