# Exercise 9: Vehicle Detection
--------
## Introduction

In this exercise, we use the YOLO model to detect and localize vehicles.


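Before starting the exercise, **download the following data files and upload them**:

- images.tgz: the dataset
- model_data.tgz: the pretrained model
- out.tgz: the folder for output files

The exercise includes the following **assignments**:

| Assignment | Points |
|--|--|
|[Filtering with a threshold on class scores](#1)| 20 |
|[Computing intersection over union](#2)| 10 |
|[Implementing non-max suppression](#3)| 20 |
|[Wrapping up the filtering](#4)| 30 |
|[Implementing the model prediction function](#5)| 20 |

**Import the required packages**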

In [22]:
# Extraction takes a while
!tar xzvf images.tgz
!tar xzvf model_data.tgz
!tar xzvf out.tgz
!tar xzvf yad2k.tgz

images/
images/0001.jpg
images/0002.jpg
images/0003.jpg
images/0004.jpg
images/0005.jpg
images/0006.jpg
images/0007.jpg
images/0008.jpg
images/0009.jpg
images/0010.jpg
images/0011.jpg
images/0012.jpg
images/0013.jpg
images/0014.jpg
images/0015.jpg
images/0016.jpg
images/0017.jpg
images/0018.jpg
images/0019.jpg
images/0020.jpg
images/0021.jpg
images/0022.jpg
images/0023.jpg
images/0024.jpg
images/0025.jpg
images/0026.jpg
images/0027.jpg
images/0028.jpg
images/0029.jpg
images/0030.jpg
images/0031.jpg
images/0032.jpg
images/0033.jpg
images/0034.jpg
images/0035.jpg
images/0036.jpg
images/0037.jpg
images/0038.jpg
images/0039.jpg
images/0040.jpg
images/0041.jpg
images/0042.jpg
images/0043.jpg
images/0044.jpg
images/0045.jpg
images/0046.jpg
images/0047.jpg
images/0048.jpg
images/0049.jpg
images/0050.jpg
images/0051.jpg
images/0052.jpg
images/0053.jpg
images/0054.jpg
images/0055.jpg
images/0056.jpg
images/0057.jpg
images/0058.jpg
images/0059.jpg
images/0060.jpg
images/0061.jpg
images/0062.jpg


In [2]:
import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Lambda, Conv2D
from keras.models import load_model, Model
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body

%matplotlib inline

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [3]:
a = np.random.randn(19*19, 5, 1)
b = np.random.randn(19*19, 5, 80)
c = a * b # shape of c will be (19*19, 5, 80)

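1. For each box:
    1. Find the class with the highest score (1 out of 80).
    2. Record the corresponding score.
2. Create a threshold mask. For example, `([0.9, 0.3, 0.4, 0.5, 0.1] < 0.4)` returns `[False, True, False, False, True]`. The mask should be True for the boxes you want to keep.
3. Use TensorFlow to apply the mask to `box_class_scores`, `boxes` and `box_classes` to filter out the unwanted boxes.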
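
The three steps above can be sketched in plain NumPy before moving to TensorFlow. This is only an illustration under stated assumptions: the toy sizes (3 boxes, 2 classes), the values, and the threshold of 0.25 are made up, and boolean indexing stands in for `tf.boolean_mask`.

```python
import numpy as np

# Toy data (assumed for illustration): 3 boxes and 2 classes
# instead of 19*19*5 boxes and 80 classes.
box_confidence = np.array([[0.9], [0.5], [0.2]])                   # shape (3, 1)
box_class_probs = np.array([[0.1, 0.8], [0.6, 0.4], [0.5, 0.5]])   # shape (3, 2)
boxes = np.array([[0, 0, 1, 1], [1, 1, 2, 2], [2, 2, 3, 3]])       # shape (3, 4)

# Step 1: per-class scores via broadcasting, then the best class and its score per box.
box_scores = box_confidence * box_class_probs     # shape (3, 2)
box_classes = np.argmax(box_scores, axis=-1)      # shape (3,)
box_class_scores = np.max(box_scores, axis=-1)    # shape (3,)

# Step 2: the mask is True for the boxes we want to keep.
threshold = 0.25
filtering_mask = box_class_scores >= threshold

# Step 3: boolean indexing plays the role of tf.boolean_mask here.
kept_scores = box_class_scores[filtering_mask]
kept_boxes = boxes[filtering_mask]
kept_classes = box_classes[filtering_mask]

print(kept_scores.shape, kept_boxes.shape, kept_classes.shape)
```

With these toy values, the third box has a best score of 0.1 and is dropped, so two boxes remain. The number of kept boxes depends on the data, which is why the TensorFlow version reports shapes such as `(?,)`.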

In [7]:
def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6):
    """Filter YOLO boxes by thresholding on object and class confidence.
    """
    # Step 1: Compute box scores
    box_scores = box_confidence * box_class_probs

    # Step 2: Find box_classes from the max box_scores and keep track of the corresponding score
    box_classes = K.argmax(box_scores, axis=-1)
    box_class_scores = K.max(box_scores, axis=-1)

    # Step 3: Create a filtering mask based on "box_class_scores" by using "threshold"
    filtering_mask = box_class_scores >= threshold

    # Step 4: Apply the mask to scores, boxes and classes
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)

    return scores, boxes, classes


In [8]:
# Test yolo_filter_boxes
with tf.Session() as test_a:
    box_confidence = tf.random_normal([19, 19, 5, 1], mean=1, stddev=4, seed = 1)
    boxes = tf.random_normal([19, 19, 5, 4], mean=1, stddev=4, seed = 1)
    box_class_probs = tf.random_normal([19, 19, 5, 80], mean=1, stddev=4, seed = 1)
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = 0.5)
    print("scores[2] = " + str(scores[2].eval()))
    print("boxes[2] = " + str(boxes[2].eval()))
    print("classes[2] = " + str(classes[2].eval()))
    print("scores.shape = " + str(scores.shape))
    print("boxes.shape = " + str(boxes.shape))
    print("classes.shape = " + str(classes.shape))

W0515 07:25:16.103945 140737354041152 deprecation.py:323] From /opt/conda/lib/python3.6/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


scores[2] = 10.750582
boxes[2] = [ 8.426533   3.2713668 -0.5313436 -4.9413733]
classes[2] = 7
scores.shape = (?,)
boxes.shape = (?, 4)
classes.shape = (?,)


**Expected output**:
```
scores[2] = 10.7506
boxes[2] = [ 8.42653275  3.27136683 -0.5313437  -4.94137383]
classes[2] = 7
scores.shape = (?,)
boxes.shape = (?, 4)
classes.shape = (?,)
```

In [9]:
def iou(box1, box2):
    """Compute the intersection over union (IoU) between box1 and box2.

    Each box is given by its corners as (x1, y1, x2, y2).
    """
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    inter_area = max(0, xi2 - xi1) * max(0, yi2 - yi1)

    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area

    iou = inter_area / union_area

    return iou


In [10]:
# Test iou
box1 = (2, 1, 4, 3)
box2 = (1, 2, 3, 4) 
print("iou = " + str(iou(box1, box2)))

iou = 0.14285714285714285


**Expected output**:
```
iou = 0.14285714285714285
```

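You are now ready to implement non-max suppression. The key steps are:
1. Select the box with the highest score.
2. Compute the IoU of this box with every other box, and remove the boxes whose IoU exceeds `iou_threshold`.
3. Repeat steps 1 and 2 until no candidate boxes remain.

This removes all boxes that overlap heavily with a selected box and keeps only the best ones.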
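
The loop described above can be sketched in pure Python. This is a minimal illustration, not the TensorFlow routine used in the exercise: `greedy_nms` is a hypothetical helper, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the sample boxes and scores are made up.

```python
def iou(box1, box2):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0, xi2 - xi1) * max(0, yi2 - yi1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)


def greedy_nms(boxes, scores, iou_threshold=0.5, max_boxes=10):
    """Return the indices of the boxes kept by greedy non-max suppression."""
    # Candidate indices, highest score first.
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while remaining and len(keep) < max_boxes:
        best = remaining.pop(0)   # step 1: highest-scoring box still in play
        keep.append(best)
        # Step 2: drop every remaining box that overlaps the selected one too much.
        remaining = [i for i in remaining
                     if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed; the third box does not overlap anything and is kept.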

**Exercise: implement yolo_non_max_suppression() using TensorFlow**

Useful TensorFlow functions:
<span id='3'></span>
- tf.image.non_max_suppression() # so you do not need your own iou implementation
- K.gather()


In [11]:
def yolo_non_max_suppression(scores, boxes, classes, max_boxes=10, iou_threshold=0.5):
    """
    Apply non-max suppression (NMS) to a set of boxes.
    """
    # Tensor holding the maximum number of boxes, used in tf.image.non_max_suppression()
    max_boxes_tensor = K.variable(max_boxes, dtype='int32')
    K.get_session().run(tf.variables_initializer([max_boxes_tensor]))

    # Indices of the boxes to keep
    nms_indices = tf.image.non_max_suppression(boxes, scores, max_boxes_tensor, iou_threshold)

    scores = K.gather(scores, nms_indices)
    boxes = K.gather(boxes, nms_indices)
    classes = K.gather(classes, nms_indices)

    return scores, boxes, classes


In [12]:

# Test yolo_non_max_suppression

with tf.Session() as test_b:
    scores = tf.random_normal([54,], mean=1, stddev=4, seed = 1)
    boxes = tf.random_normal([54, 4], mean=1, stddev=4, seed = 1)
    classes = tf.random_normal([54,], mean=1, stddev=4, seed = 1)
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes)
    print("scores[2] = " + str(scores[2].eval()))
    print("boxes[2] = " + str(boxes[2].eval()))
    print("classes[2] = " + str(classes[2].eval()))
    print("scores.shape = " + str(scores.eval().shape))
    print("boxes.shape = " + str(boxes.eval().shape))
    print("classes.shape = " + str(classes.eval().shape))


W0515 07:26:07.069611 140737354041152 module_wrapper.py:139] From /opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

W0515 07:26:07.070883 140737354041152 module_wrapper.py:139] From /opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

W0515 07:26:07.071945 140737354041152 module_wrapper.py:139] From /opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

W0515 07:26:07.082874 140737354041152 module_wrapper.py:139] From /opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.



scores[2] = 6.938395
boxes[2] = [-5.299932    3.1379814   4.450367    0.95942086]
classes[2] = -2.2452729
scores.shape = (10,)
boxes.shape = (10, 4)
classes.shape = (10,)


**Expected output**:
```
scores[2] = 6.9384
boxes[2] = [-5.299932    3.13798141  4.45036697  0.95942086]
classes[2] = -2.24527
scores.shape = (10,)
boxes.shape = (10, 4)
classes.shape = (10,)
```

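### 2.4 Wrapping up the filtering<span id='4'></span>
Next, we implement a function that takes the output of the deep convolutional neural network (the 19x19x5x85 encoding) and filters the boxes with the functions implemented above.

**Exercise: implement yolo_eval()**

yolo_eval takes the encoded YOLO output and filters the boxes using score thresholding and non-max suppression.

There are several ways to represent a box, for example by its top-left and bottom-right corners, or by its center together with its width and height. YOLO converts between these representations as needed during computation.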
```python
# (x, y, w, h) --> (x1, y1, x2, y2)
# Convert to the corner format expected by yolo_filter_boxes
boxes = yolo_boxes_to_corners(box_xy, box_wh)
# Scale the boxes according to the image size
boxes = scale_boxes(boxes, image_shape)
```
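
To make the two conversions concrete, here is a small NumPy sketch. It is an assumption-laden illustration: `boxes_to_corners` and `scale_to_image` are hypothetical stand-ins, not the course helpers, and the `(x1, y1, x2, y2)` ordering follows the comment above. The real `yolo_boxes_to_corners` and `scale_boxes` may order the coordinates differently (for example y before x).

```python
import numpy as np


def boxes_to_corners(box_xy, box_wh):
    """Convert (center, size) boxes to (x1, y1, x2, y2) corners."""
    box_mins = box_xy - box_wh / 2.0
    box_maxes = box_xy + box_wh / 2.0
    return np.concatenate([box_mins, box_maxes], axis=-1)


def scale_to_image(boxes, image_shape):
    """Scale normalized (x1, y1, x2, y2) boxes to pixels; image_shape is (height, width)."""
    height, width = image_shape
    return boxes * np.array([width, height, width, height])


# One box centered in the image, 20% of the width wide and 40% of the height tall.
box_xy = np.array([[0.5, 0.5]])
box_wh = np.array([[0.2, 0.4]])

corners = boxes_to_corners(box_xy, box_wh)         # normalized corners
pixels = scale_to_image(corners, (720.0, 1280.0))  # corners in pixels
print(corners)
print(pixels)
```

For this box the normalized corners are approximately `(0.4, 0.3, 0.6, 0.7)`, which become roughly `(512, 216, 768, 504)` pixels on a 720 x 1280 image.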

In [14]:
def yolo_eval(yolo_outputs, image_shape=(720., 1280.), max_boxes=10, score_threshold=.6, iou_threshold=.5):
    """
    Convert the encoded YOLO output (many boxes) into predicted boxes along with their scores, box coordinates and classes.
    """
    box_confidence, box_xy, box_wh, box_class_probs = yolo_outputs

    boxes = yolo_boxes_to_corners(box_xy, box_wh)
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, score_threshold)
    boxes = scale_boxes(boxes, image_shape)

    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)

    return scores, boxes, classes


In [15]:

# Test yolo_eval

with tf.Session() as test_b:
    yolo_outputs = (tf.random_normal([19, 19, 5, 1], mean=1, stddev=4, seed = 1),
                    tf.random_normal([19, 19, 5, 2], mean=1, stddev=4, seed = 1),
                    tf.random_normal([19, 19, 5, 2], mean=1, stddev=4, seed = 1),
                    tf.random_normal([19, 19, 5, 80], mean=1, stddev=4, seed = 1))
    scores, boxes, classes = yolo_eval(yolo_outputs)
    print("scores[2] = " + str(scores[2].eval()))
    print("boxes[2] = " + str(boxes[2].eval()))
    print("classes[2] = " + str(classes[2].eval()))
    print("scores.shape = " + str(scores.eval().shape))
    print("boxes.shape = " + str(boxes.eval().shape))
    print("classes.shape = " + str(classes.eval().shape))


scores[2] = 138.79124
boxes[2] = [1292.3297  -278.52167 3876.9893  -835.56494]
classes[2] = 54
scores.shape = (10,)
boxes.shape = (10, 4)
classes.shape = (10,)


**Expected output**:
```

scores[2] = 138.791
boxes[2] = [ 1292.32971191  -278.52166748  3876.98925781  -835.56494141]
classes[2] = 54
scores.shape = (10,)
boxes.shape = (10, 4)
classes.shape = (10,)
```

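**Summary of YOLO**
- The input image has shape (608, 608, 3).
- The image goes through a CNN, which produces an output of shape (19, 19, 5, 85).
- Flattening the last two dimensions of the output gives shape (19, 19, 425).
- Each cell of the 19x19 grid therefore holds 425 numbers.
- 425 = 5 x 85, because each cell contains predictions for 5 boxes, one per anchor box.
- 85 = 5 + 80, where the 5 values are (pc, bx, by, bh, bw) and 80 is the number of classes to detect.
- A few boxes are then selected according to these rules:
    - Score thresholding: discard boxes whose score is below the threshold.
    - Non-max suppression: compute the IoU and avoid selecting overlapping boxes for the same object.
- This gives the final output of YOLO.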
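
The shape arithmetic in this summary can be checked with a few lines of NumPy. The array below is random and only stands in for a real network output, and the slicing layout (confidence first, then four box values, then 80 class scores) is an assumption based on the `(pc, bx, by, bh, bw)` ordering given above.

```python
import numpy as np

# Random stand-in for the CNN output of one image (used for shape checking only).
encoding = np.random.RandomState(0).rand(19, 19, 5, 85)

# Flattening the last two dimensions gives 5 * 85 = 425 numbers per grid cell.
flat = encoding.reshape(19, 19, 5 * 85)
print(flat.shape)  # (19, 19, 425)

# Assumed layout of the 85 values: pc, four box values, then 80 class scores.
box_confidence = encoding[..., 0:1]    # shape (19, 19, 5, 1)
box_coords = encoding[..., 1:5]        # shape (19, 19, 5, 4)
box_class_probs = encoding[..., 5:]    # shape (19, 19, 5, 80)

# Total number of candidate boxes before any filtering.
num_boxes = 19 * 19 * 5
print(num_boxes)  # 1805
```

Score thresholding and non-max suppression then reduce these 1805 candidates to at most `max_boxes` detections.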

## 3 Testing the pretrained YOLO model
Create a session.

In [19]:
sess = K.get_session()

### 3.1 Defining classes, anchors and image shape
The classes and anchors are stored in separate files. The original images are (720, 1280); they are preprocessed to (608, 608).

In [20]:
class_names = read_classes("model_data/coco_classes.txt")
anchors = read_anchors("model_data/yolo_anchors.txt")
image_shape = (720., 1280.)   

FileNotFoundError: [Errno 2] No such file or directory: 'model_data/coco_classes.txt'
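
As a rough idea of what reading these two files involves, here is a sketch with hypothetical helpers `parse_classes` and `parse_anchors`. The file formats are assumptions (one class name per line; a single line of comma-separated anchor values paired as width and height), and the in-memory sample data is made up. The course helpers `read_classes` and `read_anchors` may behave differently.

```python
import io

import numpy as np


def parse_classes(text_file):
    """Assumed format: one class name per line."""
    return [line.strip() for line in text_file if line.strip()]


def parse_anchors(text_file):
    """Assumed format: one line of comma-separated floats, paired as (width, height)."""
    values = [float(x) for x in text_file.readline().split(",")]
    return np.array(values).reshape(-1, 2)


# In-memory stand-ins for the real files, so the sketch runs without model_data/.
classes_file = io.StringIO("person\nbicycle\ncar\n")
anchors_file = io.StringIO("0.5,0.8, 1.0,2.0, 3.0,4.0")

sample_classes = parse_classes(classes_file)
sample_anchors = parse_anchors(anchors_file)
print(sample_classes)
print(sample_anchors.shape)
```

The `FileNotFoundError` above means `model_data/coco_classes.txt` was not present when the cell ran, so `model_data.tgz` must be uploaded and extracted before this cell is executed.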

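### 3.2 Loading the pretrained model
The model comes from the official YOLO website and is stored as `yolo.h5` (the cell below loads it from `model_data/yolov2.h5`).
> Note: the model converts a batch of preprocessed images of shape (m, 608, 608, 3) into a tensor of shape (m, 19, 19, 5, 85).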

In [13]:
yolo_model = load_model("model_data/yolov2.h5")
yolo_model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 608, 608, 3)  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 608, 608, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 608, 608, 32) 128         conv2d_1[0][0]                   
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 608, 608, 32) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
max_poolin



### 3.3 Converting the model output to bounding-box tensors

In [14]:
yolo_outputs = yolo_head(yolo_model.output, anchors, len(class_names))

Next, pass `yolo_outputs` to `yolo_eval`.

### 3.4 Filtering boxes
`yolo_outputs` is already in the right format. Call `yolo_eval`, defined above, to select the best boxes.

In [15]:
scores, boxes, classes = yolo_eval(yolo_outputs, image_shape)

### 3.5 Running the model on an image
<span id='5'></span>

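Steps:
1. Create a session.
2. Feed `yolo_model.input` to `yolo_model` to compute `yolo_model.output`.
3. Pass `yolo_model.output` to `yolo_head`, which converts it to `yolo_outputs`.
4. Filter `yolo_outputs` with `yolo_eval`, which returns the predictions: scores, boxes, classes.

**Exercise: implement the prediction function `predict()`**

Hint: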
```python
image, image_data = preprocess_image("images/" + image_file, model_image_size = (608, 608))
```

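It returns:

- image: a PIL representation of the image, used to draw the boxes on it; you do not need to use it here
- image_data: a NumPy array representing the image, used as the input to the CNN

Because the model uses BatchNorm, the `feed_dict` needs one extra placeholder: `{K.learning_phase(): 0}`.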

In [20]:
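def predict(sess, image_file):
    """
    Run the graph stored in "sess" to predict boxes for "image_file". Print and plot the predictions.

    Arguments:
    sess -- the TensorFlow/Keras session containing the YOLO graph
    image_file -- name of an image stored in the "images" folder

    Returns:
    out_scores -- tensor of shape (None,), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None,), class indices of the predicted boxes

    Note: "None" stands for the number of predicted boxes, which varies between 0 and max_boxes.
    """

    # Preprocess your image
    image, image_data = preprocess_image("images/" + image_file, model_image_size = (608, 608))

    # Run the session with the correct tensors and choose the correct placeholders in the feed_dict.
    # feed_dict={yolo_model.input: ... , K.learning_phase(): 0})
    ### START CODE HERE ### (≈ 1 line)
    out_scores, out_boxes, out_classes = sess.run([scores, boxes, classes],
                                                  feed_dict={yolo_model.input: image_data, K.learning_phase(): 0})
    ### END CODE HERE ###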

    # Print predictions info
    print('Found {} boxes for {}'.format(len(out_boxes), image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    output_image = scipy.misc.imread(os.path.join("out", image_file))
    imshow(output_image)

    return out_scores, out_boxes, out_classes

In [None]:

# Test on test.jpg
out_scores, out_boxes, out_classes = predict(sess, "test.jpg")

**Expected output**:
```
 Found 7 boxes for test.jpg
 car 0.60 (925, 285) (1045, 374)
 car 0.66 (706, 279) (786, 350)
 bus 0.67 (5, 266) (220, 407)
 car 0.70 (947, 324) (1280, 705)
 car 0.74 (159, 303) (346, 440)
 car 0.80 (761, 282) (942, 412)
 car 0.89 (367, 300) (745, 648)
```
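The model you just ran can detect the 80 classes listed in coco_classes.txt. You can try it yourself.

**What to remember**
- YOLO is a high-performing detection model that is both fast and accurate.
- The input image goes through a CNN, which outputs a tensor of dimension 19x19x5x85.
- Each cell of the 19x19 grid can be seen as holding the information for 5 boxes.
- The boxes are filtered using non-max suppression:
    - Score thresholding removes low-scoring detections and keeps only the high-scoring ones.
    - IoU thresholding eliminates overlapping boxes.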
