# Common Whiteboard Code in Computer Vision

A whiteboard interview comes in two forms: writing real code, or writing pseudocode.

[TOC]


## Computing IoU

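Two things to watch when computing IoU:
- In the xy coordinate system of a 2D image, the origin is usually the top-left corner and the positive y axis points down.
- Use vectorized (broadcast) operations, so that all N×M box pairs are handled without Python loops.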
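
For reference, here is the quantity being computed, assuming both boxes $A$ and $B$ are given by their top-left and bottom-right corners $(x_1, y_1, x_2, y_2)$, which is the layout used by the code in this section:

$$
\operatorname{IoU}(A, B)=\frac{|A \cap B|}{|A|+|B|-|A \cap B|}, \quad |A \cap B|=\max \left(0, \min \left(x_2^A, x_2^B\right)-\max \left(x_1^A, x_1^B\right)\right) \cdot \max \left(0, \min \left(y_2^A, y_2^B\right)-\max \left(y_1^A, y_1^B\right)\right)
$$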

The code is as follows:

In [1]:
'''
pytorch=1.4
'''

import torch


def iou(box1, box2):
    N = box1.size(0)
    M = box2.size(0)
    
    # top-left corner (lt) of the intersection
    lt = torch.max(
        box1[:, :2].unsqueeze(1).expand(N, M, 2),  # [N,2]->[N,1,2]->[N,M,2]
        box2[:, :2].unsqueeze(0).expand(N, M, 2),  # [M,2]->[1,M,2]->[N,M,2]
    )
    
    # bottom-right corner (rb) of the intersection
    rb = torch.min(
        box1[:, 2:].unsqueeze(1).expand(N, M, 2),
        box2[:, 2:].unsqueeze(0).expand(N, M, 2),
    )

    wh = rb - lt  # [N,M,2]
    wh[wh < 0] = 0  # no overlap: zero out negative extents so two negatives cannot multiply into a positive area
    inter = wh[:, :, 0] * wh[:, :, 1]  # [N,M]

    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])  # (N,)
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])  # (M,)
    area1 = area1.unsqueeze(1).expand(N, M)  # (N,M)
    area2 = area2.unsqueeze(0).expand(N, M)  # (N,M)

    iou = inter / (area1 + area2 - inter)
    return iou


if __name__ == '__main__':
    box1 = torch.tensor([[15., 20., 35., 40.], [20., 30., 40., 50.]])
    box2 = torch.tensor([[15., 19., 15., 20.], [15., 20., 35., 40.]])

    print(iou(box1, box2))


tensor([[0.0000, 1.0000],
        [0.0000, 0.2308]])
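
To sanity-check the vectorized version by hand, a scalar version for a single pair of boxes can help. The following is a minimal pure-Python sketch. The name `iou_pair` is made up for this illustration, and the guard against an empty union is an added assumption (the tensor code above would divide by zero if both boxes had zero area).

```python
def iou_pair(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    # Intersection corners: max of the top-left corners, min of the bottom-right corners.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp each side at zero so that disjoint boxes give a zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Assumption: return 0.0 when the union is empty instead of dividing by zero.
    return inter / union if union > 0 else 0.0


# Same pair as box1[1] and box2[1] above: intersection 15 * 10 = 150, union 650.
print(round(iou_pair((20., 30., 40., 50.), (15., 20., 35., 40.)), 4))  # 0.2308
```

The intersection is 15 × 10 = 150 and the union is 400 + 400 − 150 = 650, which matches the 0.2308 entry in the tensor output.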


## NMS

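NMS (non-maximum suppression):
1. Split the detections by class.
2. For each class, sort by score in descending order to get the list list_k.
3. Take the first element of list_k (the highest-scoring box), keep it, compute its IoU with the remaining elements, and remove every element whose IoU exceeds the threshold.
4. Repeat step 3 until list_k is empty.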

In [2]:
import torch


def nms(bboxes, scores, iou_thres=0.5):
    x1 = bboxes[:, 0]
    y1 = bboxes[:, 1]
    x2 = bboxes[:, 2]
    y2 = bboxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)  # [N,] area of each bbox
    _, order = scores.sort(0, descending=True)  # box indices sorted by descending score

    keep = []
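    while order.numel() > 0:  # numel() returns the number of elements in the tensor
        if order.numel() == 1:  # only one candidate box left
            i = order.item()
            keep.append(i)
            break
        else:
            i = order[0].item()  # keep box[i], the highest-scoring remaining box
            keep.append(i)

        # IoU between box[i] and every remaining box, vectorized with clamp
        # clamp bounds values from below (min) or above (max); from here on the vectors have length len(order) - 1
        xx1 = x1[order[1:]].clamp(min=x1[i])  # same as torch.max(x1[order[1:]], x1[i])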
        yy1 = y1[order[1:]].clamp(min=y1[i])
        xx2 = x2[order[1:]].clamp(max=x2[i])
        yy2 = y2[order[1:]].clamp(max=y2[i])
        inter = (xx2 - xx1).clamp(min=0) * (yy2 - yy1).clamp(min=0)

        iou = inter / (areas[i] + areas[order[1:]] - inter)
        idx = (iou <= iou_thres).nonzero().squeeze()
        if idx.numel() == 0:
            break
        order = order[idx + 1]  # idx indexes into order[1:], so +1 maps it back to positions in order
    return torch.LongTensor(keep)  # PyTorch index tensors are LongTensor


if __name__ == '__main__':
    bboxes = torch.tensor([[15., 20., 35., 40.], [20., 30., 40., 50.], [15., 19., 15., 20.], [15., 20., 35., 40.]])
    scores = torch.tensor([0.8, 0.9, 0.7, 0.78])

    print(nms(bboxes, scores))

tensor([1, 0, 2])
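
The `nms` function above covers steps 2 to 4 for a single class; step 1 (splitting by class) is not shown. The following is a minimal pure-Python sketch of the class-wise wrapper. It assumes a hypothetical `labels` list holding one class id per box, and the names `_iou` and `nms_per_class` are illustrative only.

```python
from collections import defaultdict


def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes; 0.0 when the union is empty."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(bboxes, scores, labels, iou_thres=0.5):
    """Greedy NMS run independently per class; returns kept indices by descending score."""
    # Step 1: group box indices by class id (labels is an assumed input).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    keep = []
    for indices in by_class.values():
        # Step 2: sort this class's boxes by descending score.
        order = sorted(indices, key=lambda i: scores[i], reverse=True)
        # Steps 3-4: keep the best box, drop boxes overlapping it too much, repeat.
        while order:
            best = order.pop(0)
            keep.append(best)
            order = [j for j in order if _iou(bboxes[best], bboxes[j]) <= iou_thres]
    return sorted(keep, key=lambda i: scores[i], reverse=True)


bboxes = [[15., 20., 35., 40.], [20., 30., 40., 50.], [15., 19., 15., 20.], [15., 20., 35., 40.]]
scores = [0.8, 0.9, 0.7, 0.78]
print(nms_per_class(bboxes, scores, labels=[0, 0, 0, 0]))  # [1, 0, 2]
print(nms_per_class(bboxes, scores, labels=[0, 0, 0, 1]))  # [1, 0, 3, 2]
```

When all boxes share one class, the result matches the tensor output above. In the second call box 3 is given its own class, so it is no longer suppressed by box 0 even though the two boxes are identical.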


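Soft-NMS:
1. Split the detections by class.
2. For each class, sort by score in descending order to get the list list_k.
3. Take the first element $\mathcal{M}$ of list_k, compute its IoU with every other element $b_i$, and update each score as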

$$
s_{i}=\left\{\begin{array}{ll}s_{i}, & \operatorname{iou}\left(\mathcal{M}, b_{i}\right)<N_{t} \\s_{i} \cdot f\left(1-\operatorname{iou}\left(\mathcal{M}, b_{i}\right)\right), & \operatorname{iou}\left(\mathcal{M}, b_{i}\right) \geq N_{t}
\end{array}\right.
$$

4. Repeat step 3 until list_k is empty.
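
The steps above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not a reference implementation. $N_t$ is taken to be the IoU threshold (`iou_thres`), and $f$ is read as the identity, which gives the linear penalty $s_i \cdot (1-\operatorname{iou})$. The `gaussian` branch (decay $e^{-\operatorname{iou}^2/\sigma}$) is a commonly used variant that is not part of the formula above. The `score_thres` pruning step and the names `_iou` and `soft_nms` are likewise assumptions made for this example.

```python
import math


def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes; 0.0 when the union is empty."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(bboxes, scores, iou_thres=0.3, sigma=0.5, score_thres=0.001, method="linear"):
    """Soft-NMS for one class: decay the scores of overlapping boxes instead of removing them.

    Returns (kept indices, their scores at selection time), in selection order.
    """
    if method not in ("linear", "gaussian"):
        raise ValueError("method must be 'linear' or 'gaussian'")
    scores = list(scores)  # work on a copy; the caller's scores stay untouched
    remaining = list(range(len(bboxes)))
    keep, kept_scores = [], []
    while remaining:
        # Scores change every round, so re-select the current maximum each time.
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        keep.append(m)
        kept_scores.append(scores[m])
        for i in remaining:
            overlap = _iou(bboxes[m], bboxes[i])
            if method == "linear":
                # Piecewise rule: only boxes with IoU >= N_t are penalized.
                if overlap >= iou_thres:
                    scores[i] *= 1.0 - overlap
            else:
                # Gaussian variant (assumption): continuous decay, no threshold.
                scores[i] *= math.exp(-(overlap * overlap) / sigma)
        # Assumed pruning step: discard boxes whose score has become negligible.
        remaining = [i for i in remaining if scores[i] >= score_thres]
    return keep, kept_scores


bboxes = [[15., 20., 35., 40.], [20., 30., 40., 50.], [15., 19., 15., 20.], [15., 20., 35., 40.]]
scores = [0.8, 0.9, 0.7, 0.78]
keep, kept_scores = soft_nms(bboxes, scores, iou_thres=0.2)
print(keep)  # [1, 2, 0]
```

With `iou_thres=0.2`, box 0 is not removed by box 1 (IoU ≈ 0.23); its score only decays from 0.8 to about 0.62, so it is selected after box 2. Box 3 coincides with box 0, so its score drops to 0 and it is pruned.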


