Update the lib code and example for review comment
Signed-off-by: khalid-davis <huangqinkai1@huawei.com>
khalid-huang authored and llhuii committed Jan 28, 2021
1 parent 595291e commit 7cecee8
Showing 13 changed files with 234 additions and 339 deletions.
@@ -10,4 +10,4 @@ ENV PYTHONPATH "/home/lib"
WORKDIR /home/work
COPY ./lib /home/lib

ENTRYPOINT ["python"]
ENTRYPOINT ["python"]
70 changes: 0 additions & 70 deletions examples/helmet_detection/training/train.py

This file was deleted.

@@ -1,34 +1,51 @@
# Using Incremental Learning Job in Helmet Detection Scenario

This document introduces how to use an incremental learning job in the helmet detection scenario. Using the incremental learning job, our application can automatically retrain, evaluate, and update models based on the data generated at the edge.
This document introduces how to use an incremental learning job in the helmet detection scenario.
Using the incremental learning job, our application can automatically retrain, evaluate,
and update models based on the data generated at the edge.

## Helmet Detection Experiment

### Prepare Worker Image
Build the worker image by referring to the [dockerfile](/build/worker/base_images/tensorflow/tensorflow-1.15.Dockerfile)
and set that image as the `imageHub` value in `gm-config.yaml` when you [Install Neptune](#install-neptune).
In this demo, we need to replace the contents of `requirement.txt` with:
```
flask==1.1.2
keras==2.4.3
opencv-python==4.4.0.44
websockets==8.1
Pillow==8.0.1
requests==2.24.0
tqdm==4.56.0
matplotlib==3.3.3
```
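A minimal sketch of building and publishing the worker image (the image name and tag here are only examples; use whatever name you later configure as `imageHub` in `gm-config.yaml`):
```
# Build the TensorFlow 1.15 worker base image from the repository root
docker build -f build/worker/base_images/tensorflow/tensorflow-1.15.Dockerfile \
    -t <your-registry>/neptune-tf-worker:1.15 .
# Push it so that the cloud and edge nodes can pull it
docker push <your-registry>/neptune-tf-worker:1.15
```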
### Install Neptune

Follow the [Neptune installation document](/docs/setup/install.md) to install Neptune.

### Prepare Data and Model

Download dataset and model to your node:
* step 1: download [dataset](https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz)
* step 1: create the dataset directory and extract the dataset into it:
```
mkdir -p /data/helmet_detection
cd /data/helmet_detection
wget https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
```

* step 2: download [base model](https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/model.tar.gz)
```
mkdir /model
cd /model
wget https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/model.tar.gz
tar -zxvf model.tar.gz
```
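A quick check that the data and the base model landed in the expected directories on the node:
```
ls /data/helmet_detection
ls /model
```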
### Prepare Script
Download the [scripts](/examples/helmet_detection/training) to the `code` path of your node.
Download the [scripts](/examples/helmet_detection_incremental_train/training) to the `code` path of your node.
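One possible way to fetch those scripts (the repository URL and the target directory are assumptions; adjust them to your setup):
```
# Clone the repository and copy the incremental-learning example scripts to the node
git clone https://github.com/edgeai-neptune/neptune.git
mkdir -p /code
cp -r neptune/examples/helmet_detection_incremental_train/training/* /code/
```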


### Create Incremental Job

Create Namespace: `kubectl create ns neptune-test`

Create Dataset

```
@@ -45,7 +62,7 @@ spec:
EOF
```
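To confirm that the Dataset resource was created (assuming the CRD exposes the plural name `datasets`):
```
kubectl get datasets -n neptune-test
```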

Create Initial Model
Create Initial Model to simulate the initial model in the incremental learning scenario.

```
kubectl create -f - <<EOF
@@ -163,10 +180,10 @@ EOF

### Mock Video Stream for Inference in Edge Side

* step1: install the open source video streaming server [EasyDarwin](https://github.com/EasyDarwin/EasyDarwin/tree/dev).
* step2: start EasyDarwin server.
* step3: download [video](https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz).
* step4: push a video stream to the url (e.g., `rtsp://localhost/video`) that the inference service can connect to.
* step 1: install the open source video streaming server [EasyDarwin](https://github.com/EasyDarwin/EasyDarwin/tree/dev).
* step 2: start EasyDarwin server.
* step 3: download [video](https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/video.tar.gz).
* step 4: push a video stream to the url (e.g., `rtsp://localhost/video`) that the inference service can connect to.

```
wget https://github.com/EasyDarwin/EasyDarwin/releases/download/v8.1.0/EasyDarwin-linux-8.1.0-1901141151.tar.gz --no-check-certificate
@@ -180,13 +197,41 @@ tar -zxvf video.tar.gz
ffmpeg -re -i /data/video/helmet-detection.mp4 -vcodec libx264 -f rtsp rtsp://localhost/video
```
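An optional sanity check of the stream, assuming `ffprobe` is installed alongside ffmpeg:
```
# Stream metadata should be printed if the push is working
ffprobe rtsp://localhost/video
```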


### Check Incremental Job Result

### Check Incremental Learning Job
Query the service status:
```
kubectl get incrementallearningjob helmet-detection-demo -n neptune-test
```
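To follow the job as it progresses through its train, eval, and deploy stages, the standard kubectl options can be used:
```
# Watch for state changes and inspect the job's conditions
kubectl get incrementallearningjob helmet-detection-demo -n neptune-test -w
kubectl describe incrementallearningjob helmet-detection-demo -n neptune-test
```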
In the `IncrementalLearningJob` resource helmet-detection-demo, the following trigger is configured:
```
trigger:
  checkPeriodSeconds: 60
  timer:
    start: 02:00
    end: 04:00
  condition:
    operator: ">"
    threshold: 500
    metric: num_of_samples
```
In the real world, we need to label the hard examples saved under `HE_SAVED_URL` with annotation tools and then add the labeled examples to the `Dataset`'s url.
Without annotation tools, we can simulate the `num_of_samples` condition in the following way:
Download the [dataset](https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz) to the cloud0 node:
```
cd /data/helmet_detection
wget https://edgeai-neptune.obs.cn-north-1.myhuaweicloud.com/examples/helmet-detection/dataset.tar.gz
tar -zxvf dataset.tar.gz
```
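A rough way to check how many samples are now available (the index-file path here is an assumption about this dataset's layout):
```
# The train trigger fires once num_of_samples exceeds the threshold of 500
wc -l /data/helmet_detection/train_data/train_data.txt
```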
The LocalController component will check the number of samples, find that the trigger conditions are met, and notify the GlobalManager component to start the train worker.
When the train worker finishes, we can view the updated model in the `/output` directory on the cloud0 node.
Then the eval worker will start to evaluate the model that the train worker generated.

After the job completes, we can view the updated model in the `/output` directory on the cloud0 node.

If the eval result satisfies the `deploySpec`'s trigger
```
trigger:
  condition:
    operator: ">"
    threshold: 0.1
    metric: precision_delta
```
the deploy worker will load the new model and provide the inference service.
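A generic way to check the state of the worker pods after the update (pod names will differ per cluster):
```
kubectl get pods -n neptune-test -o wide
```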
@@ -215,13 +215,10 @@ def read_data(self, annotation_line, input_shape=416, random=True, max_boxes=50,
return image_data, box_data

def preprocess_true_boxes(self, true_boxes, in_shape=416):
"""
Introduction
------------
Preprocess the ground truth boxes of the training data
Parameters
----------
true_boxes: ground truth box with shape [boxes, 5], x_min, y_min, x_max, y_max, class_id
"""Preprocesses the ground truth box of the training data
:param true_boxes: ground truth box shape is [boxes, 5], x_min, y_min,
x_max, y_max, class_id
"""

num_layers = self.anchors.shape[0] // 3
@@ -238,20 +235,21 @@ def preprocess_true_boxes(self, true_boxes, in_shape=416):
grid_shapes = [input_shape // 32, input_shape // 16, input_shape // 8]
y_true = [np.zeros((m, grid_shapes[l][0], grid_shapes[l][1], len(anchor_mask[l]), 5 + self.num_classes),
dtype='float32') for l in range(num_layers)]
# The dimension is expanded here so that broadcasting can later be used to compute the IoU between every box and every anchor in each image
# The dimension is expanded to calculate the IOU between the
# anchors of all boxes in each graph by broadcasting
anchors = np.expand_dims(self.anchors, 0)
anchors_max = anchors / 2.
anchors_min = -anchors_max
# Because the boxes were padded earlier, the all-zero rows need to be removed
# Because we padded the box before, we need to remove all 0 lines
valid_mask = boxes_wh[..., 0] > 0

for b in range(m):
wh = boxes_wh[b, valid_mask[b]]
if len(wh) == 0: continue

# Expand the dimensions to apply broadcasting
# Expanding dimensions for broadcasting applications
wh = np.expand_dims(wh, -2)
# The shape of wh is [box_num, 1, 2]
# wh shape is [box_num, 1, 2]
boxes_max = wh / 2.
boxes_min = -boxes_max

@@ -263,7 +261,10 @@ def preprocess_true_boxes(self, true_boxes, in_shape=416):
anchor_area = anchors[..., 0] * anchors[..., 1]
iou = intersect_area / (box_area + anchor_area - intersect_area)

# Find the anchor box with the largest IoU with the ground truth box, then set the positions at the corresponding scales responsible for that ground truth box to the ground truth box coordinates
# Find the anchor box that has the largest IoU with the ground truth
# box, and then set the corresponding positions at the different
# scales responsible for that ground truth box to the
# coordinates of the ground truth box
best_anchor = np.argmax(iou, axis=-1)
for t, n in enumerate(best_anchor):
for l in range(num_layers):
@@ -19,13 +19,10 @@ def main():

model = validate

model = neptune.incremental_learning.evaluate(model=model,
test_data=test_data,
class_names=class_names,
input_shape=input_shape)

# Save the model based on the config.
# kubeedge_ai.incremental_learning.save_model(model)
neptune.incremental_learning.evaluate(model=model,
test_data=test_data,
class_names=class_names,
input_shape=input_shape)


if __name__ == '__main__':
@@ -1,9 +1,9 @@
import logging
import os
import time

import cv2
import numpy as np
import os

import neptune
from neptune.incremental_learning import InferenceResult
@@ -165,7 +165,7 @@ def avg_checkpoints(self):

logging.info("average checkpoints end .......")

def save_model_pb(self):
def save_model_pb(self, saved_model_name):
"""
save model as a single pb file from checkpoint
"""
@@ -189,6 +189,6 @@ def save_model_pb(self):
print('output_tensors : ', output_tensors)
output_tensors = [t.op.name for t in output_tensors]
graph = tf.graph_util.convert_variables_to_constants(sess, input_graph_def, output_tensors)
tf.train.write_graph(graph, model.model_dir, 'model.pb', False)
tf.train.write_graph(graph, model.model_dir, saved_model_name, False)

logging.info("save model as .pb end .......")
