# Container Playground Yolo-V5 full stack sample

### First part: Build your model on DevCloud
The first part varies with the training framework, algorithm, and training dataset. For example, you might build a YOLOv5 neural network in PyTorch and train it on the COCO dataset, or build a simple neural network in PyTorch and train it on the MNIST dataset.

### Second part: Training on AWS
The second part submits training tasks through the AWS interface. First install devcloud_sagemaker, provided by AWS. Import three functions from it: create_training_job, get_training_job_status, and get_model. Then submit the training task, query the task status, and download the trained model, in that order, to complete training on the AWS platform.

### Third part: Deploy model on DevCloud Edge Nodes
The third part is largely the same as in other DevCloud samples: convert the model, deploy it, and observe performance metrics.

### Related concepts
#### YOLOv5 introduction
YOLO, an acronym for 'You Only Look Once', is an object detection algorithm that divides an image into a grid. Each cell in the grid is responsible for detecting objects within itself.

YOLO is one of the most famous object detection algorithms due to its speed and accuracy.
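As a rough illustration of the grid idea (a simplified sketch, not YOLOv5's actual anchor-based assignment logic), the cell responsible for an object can be located from the object's normalized center coordinates. The function name and the default 20x20 grid size are assumptions chosen for this example.

```python
from typing import Tuple


def responsible_cell(x_center: float, y_center: float, grid_size: int = 20) -> Tuple[int, int]:
    """Return (row, col) of the grid cell that contains a normalized box center.

    x_center and y_center are fractions of the image width and height in [0, 1].
    grid_size is hypothetical; real detectors use several grid resolutions.
    """
    if not (0.0 <= x_center <= 1.0 and 0.0 <= y_center <= 1.0):
        raise ValueError("center coordinates must be normalized to [0, 1]")
    # Clamp so a center lying exactly on the right or bottom edge stays in the last cell.
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col
```

For example, a box centered in the middle of the image falls into cell (10, 10) of a 20x20 grid.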


### Build and train the model     

#### Download yolov5 
Python>=3.6.0 is required with all requirements.txt installed including PyTorch>=1.7:

In [None]:
!git clone https://github.com/ultralytics/yolov5.git

#### Install third-party packages

In [None]:
!pip3 install -r yolov5/requirements.txt

In [None]:
!pip3 install torchvision torchaudio

#### Train the model locally and save it (optional)

YOLOv5 maintains its own dataset repository. Running the training script automatically downloads the dataset used for training. In the dataset, each jpg image has a corresponding txt label file that describes it.
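To make the label files more concrete, here is a minimal sketch that assumes the usual YOLO text format: one object per line, written as `class x_center y_center width height`, with coordinates normalized to the image size. The helper name `parse_label_line` is hypothetical and not part of YOLOv5.

```python
from typing import Tuple


def parse_label_line(line: str, img_w: int, img_h: int) -> Tuple[int, float, float, float, float]:
    """Convert one label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized values back to pixel units.
    xc_px, w_px = float(xc) * img_w, float(w) * img_w
    yc_px, h_px = float(yc) * img_h, float(h) * img_h
    # Convert center/size form to corner form.
    return (
        int(class_id),
        xc_px - w_px / 2,
        yc_px - h_px / 2,
        xc_px + w_px / 2,
        yc_px + h_px / 2,
    )
```

Reading a whole label file is then a matter of calling this helper on each non-empty line.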

To use your own dataset, modify the dataset description yaml file in the data folder. You can also modify parameters in the train.py script, such as the number of epochs.

To reproduce results on the COCO dataset, run train.py (the dataset auto-downloads on first use). Training times for YOLOv5s/m/l/x are 2/4/6/8 days on a single V100 (multi-GPU training is faster). Use the largest --batch-size your GPU allows (the batch sizes recommended by YOLOv5 assume 16 GB devices).
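The training invocation can be sketched as a small command builder. This is an illustration rather than the notebook's original cell: the helper name is hypothetical, and the per-model batch sizes are the values commonly listed in the YOLOv5 README for 16 GB GPUs, so treat them as assumptions and lower them if memory runs out.

```python
import sys
from typing import List

# Assumed batch sizes for 16 GB GPUs; adjust to what your device can hold.
BATCH_SIZES = {"yolov5s": 64, "yolov5m": 40, "yolov5l": 24, "yolov5x": 16}


def build_train_command(model: str, epochs: int = 300) -> List[str]:
    """Build the argument list for training one YOLOv5 variant on COCO from scratch."""
    if model not in BATCH_SIZES:
        raise ValueError("unknown model: " + model)
    return [
        sys.executable, "train.py",
        "--data", "coco.yaml",
        "--cfg", model + ".yaml",
        "--weights", "",  # empty string: start from randomly initialized weights
        "--batch-size", str(BATCH_SIZES[model]),
        "--epochs", str(epochs),
    ]
```

The resulting list could be passed to `subprocess.run(build_train_command("yolov5s"), cwd="yolov5", check=True)`, assuming the repository was cloned into a folder named yolov5.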

## Training on AWS
    
1. Install devcloud_sagemaker
2. Create a training task
3. Query the status of the training task
4. Download the trained model from AWS
5. Convert the model format 
6. Run benchmark tests on DevCloud nodes

### Install devcloud_sagemaker

In [2]:
!pip3 install devcloud_sagemaker_user --upgrade

Collecting devcloud_sagemaker_user
  Downloading devcloud_sagemaker_user-1.8.tar.gz (5.0 kB)
  Preparing metadata (setup.py) ... done
Collecting pytest==2.9.2
  Downloading pytest-2.9.2-py2.py3-none-any.whl (162 kB)
     162.1/162.1 kB 324.1 kB/s eta 0:00:00
Collecting py>=1.4.29
  Downloading py-1.11.0-py2.py3-none-any.whl (98 kB)
     98.7/98.7 kB 506.7 kB/s eta 0:00:00
Building wheels for collected packages: devcloud_sagemaker_user
  Building wheel for devcloud_sagemaker_user (setup.py) ... done
  Created wheel for devcloud_sagemaker_user: filename=devcloud_sagemaker_user-1.8-py3-none-any.whl size=5749 sha256=a41f3afd72c14755c4c329e70a65325754c948a68c2eabe50648561d4d50e7c0
  Stored in directory: /home/build/.cache/pip/wheels/2d/35/f6/99e52445d0c80670069987e8f72317f

In [3]:
from devcloud_sagemaker_user.sm_client import *
import tarfile, os
import shutil

## Query the available training device list and corresponding prices

In [4]:
query_device_list()

{'instance_type': 'ml.m5.large', 'price_per_hour': 1.114, 'instance_note': '8 GiB of Memory, 2 vCPUs, EBS only, 64-bit platform'}
{'instance_type': 'ml.m5.4xlarge', 'price_per_hour': 8.917, 'instance_note': '64 GiB of Memory, 16 vCPUs, EBS only, 64-bit platform'}
{'instance_type': 'ml.m5.24xlarge', 'price_per_hour': 53.497, 'instance_note': '384 GiB of Memory, 96 vCPUs, EBS only, 64-bit platform'}
{'instance_type': 'ml.m4.2xlarge', 'price_per_hour': 6.186, 'instance_note': '32 GiB of memory, 8 vCPUs, EBS-only, 64-bit platform'}
{'instance_type': 'ml.m5.xlarge', 'price_per_hour': 2.229, 'instance_note': '16 GiB of Memory, 4 vCPUs, EBS only, 64-bit platform'}
{'instance_type': 'ml.c4.8xlarge', 'price_per_hour': 19.955, 'instance_note': '60 GiB of memory, 36 vCPUs, 64-bit platform'}
{'instance_type': 'ml.c5.9xlarge', 'price_per_hour': 14.64, 'instance_note': '72 GiB of memory, 36 vCPUs, 64-bit platform'}
{'instance_type': 'ml.c4.2xlarge', 'price_per_hour': 4.989, 'instance_note': '15 GiB 
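The price list can be used to estimate the cost of a job before submitting it. The sketch below copies three entries from the output above into a plain list; whether `query_device_list()` returns such a list or only prints it is not shown here, and the price unit (assumed to be account credits per hour) is also an assumption. Both helper names are hypothetical.

```python
from typing import Dict, List

# Entries copied by hand from the printed device list.
DEVICES = [
    {"instance_type": "ml.m5.large", "price_per_hour": 1.114},
    {"instance_type": "ml.m5.xlarge", "price_per_hour": 2.229},
    {"instance_type": "ml.m5.4xlarge", "price_per_hour": 8.917},
]


def estimate_cost(device: Dict, hours: float, instance_count: int = 1) -> float:
    """Estimated cost: hourly price x training hours x number of instances."""
    return device["price_per_hour"] * hours * instance_count


def affordable_devices(devices: List[Dict], hours: float, budget: float) -> List[str]:
    """Return the instance types whose estimated cost fits within the budget."""
    return [d["instance_type"] for d in devices if estimate_cost(d, hours) <= budget]
```

Comparing the estimate against the credits reported for the account gives a quick check that a job is likely to fit the remaining balance.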

### Creating a training task
The submitted parameters consist of 4 parts:
- Training dataset. This can be a local dataset folder or the URL of a dataset in an S3 bucket. Using an S3 bucket link avoids training task failures caused by a failed upload of the local dataset.
- Packaged source code. The source code is packaged before it is submitted. Note that train.py must be the startup file in the source code folder.
- Pre-trained model. In this example, the value is yolov5s.pt.
- Training parameters, in two parts:
    - Device definition, including instance_type, instance_count, and framework_version.
    - Training hyperparameters. Each hyperparameter must correspond to a parameter in train.py.
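The training parameters can be assembled with a small helper like the one below. It is only a sketch: the helper name is hypothetical, and converting every value to a string is an assumption based on the string values used in this notebook's submission calls, not a documented requirement of devcloud_sagemaker_user.

```python
from typing import Any, Dict


def build_training_params(
    instance_type: str,
    framework_version: str,
    hyperparameters: Dict[str, Any],
    instance_count: int = 1,
) -> Dict[str, Any]:
    """Assemble the device definition and hyperparameters into one dictionary."""
    return {
        # Device definition
        "instance_type": instance_type,
        "instance_count": str(instance_count),
        "framework_version": framework_version,
        # Hyperparameters: keys are expected to match train.py argument names
        "hyperparameters": {name: str(value) for name, value in hyperparameters.items()},
    }
```

Building the dictionary in one place makes it easier to reuse the same settings for both a local-dataset submission and an S3-based one.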

In [None]:
login("yaru", "intelpass")  # log in to get a token

#### **Manually modify the source code folder yolov5**
Before packaging the source code, we need to make some modifications to the cloned yolov5 source code folder.

1. Modify the requirements.txt in the yolov5 directory, where you need to modify the version number of the torch and torchvision modules, and then comment out opencv. After the modification, it is as follows:

torch==1.9.1              
torchvision==0.10.1             
#opencv-python>=4.1.2              

2. Modify the train.py file in the yolov5 directory. Three parts need to be modified.

In line 17, add the following content below **import os**:   

os.system('/opt/conda/bin/python3.6 -m pip install -r requirements.txt -i https://opentuna.cn/pypi/web/simple')


In line 403, add the following content below **torch.save(ckpt, best)**:

#sagemaker output                  
sage_output = os.environ["SM_MODEL_DIR"]                     
best_sg = os.path.join(sage_output, 'model.pt')                 
torch.save(ckpt, best_sg)                 
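`SM_MODEL_DIR` is set by SageMaker inside the training container, so the snippet above raises a `KeyError` if the modified train.py is run locally. A possible variant, shown here as a sketch with a hypothetical helper name and fallback folder, reads the variable with a default so that the same script works in both environments.

```python
import os


def resolve_model_output_dir(default_dir: str = "runs/train/sagemaker_output") -> str:
    """Return the SageMaker model directory, or a local fallback when it is not set."""
    # SM_MODEL_DIR exists only inside a SageMaker training container.
    out_dir = os.environ.get("SM_MODEL_DIR", default_dir)
    os.makedirs(out_dir, exist_ok=True)
    return out_dir
```

With this helper, the save step would become `torch.save(ckpt, os.path.join(resolve_model_output_dir(), 'model.pt'))`.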
               

In line 460, find the parse_opt function and modify the default paths, mainly those built on the ROOT path. The modified content is as follows:

parser.add_argument('--weights', type=str, default='../input/data/weights/yolov5s.pt', help='initial weights path')          
parser.add_argument('--cfg', type=str, default='models/yolov5s.yaml', help='model.yaml path')           
parser.add_argument('--data', type=str, default='data/coco.yaml', help='dataset.yaml path')           
parser.add_argument('--hyp', type=str, default='data/hyps/hyp.scratch.yaml', help='hyperparameters path')     

parser.add_argument('--project', default='runs/train', help='save to project/name')          

3. Modify coco.yaml in the yolov5/data directory, and modify line 11 to the following:           
path: ../input/data/datasets/coco  # dataset root dir
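Once the modifications are in place, the source folder needs to exist as the yolov5.tar archive that is passed to the submission call. The sketch below shows one way to create it with the standard library. The helper name is hypothetical, and both the archive layout (a top-level yolov5 folder) and the exclusion of the .git directory are assumptions; check what the training service expects.

```python
import os
import tarfile


def _skip_git(info: tarfile.TarInfo):
    """Drop the .git directory to keep the archive small."""
    return None if ".git" in info.name.split("/") else info


def package_source(src_dir: str = "yolov5", tar_path: str = "yolov5.tar") -> str:
    """Pack the source folder into an uncompressed tar archive."""
    # train.py is the startup file, so it must be present in the folder.
    if not os.path.isfile(os.path.join(src_dir, "train.py")):
        raise FileNotFoundError("train.py not found in " + src_dir)
    top_level = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src_dir, arcname=top_level, filter=_skip_git)
    return tar_path
```

Calling `package_source()` from the notebook's working directory would produce yolov5.tar next to the cloned folder.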

In [None]:
jobid_local = submit_a_task("datasets_simple", "yolov5.tar", "yolov5s.pt", 
                    {"instance_type":"ml.p3.2xlarge", 
                     "instance_count":"1", 
                     "framework_version":"1.8.1", 
                     "hyperparameters":{"imgsz":"640", "epochs":"10"}})

Instead of uploading the dataset from your local machine, you can upload it to an S3 bucket first and then set the dataset value to the bucket address. This saves the time otherwise spent uploading the local dataset.

In [None]:
jobid_s3_url = submit_a_task("s3://code-devcloud/jobs/05ec9d2d-8862-49c1-9d1c-a39125e23fc2/Traindata/datasets_simple/",
                    "yolov5.tar",
                    "yolov5s.pt",
                    {"instance_type":"ml.m4.4xlarge",
                    "instance_count":"1",
                    "framework_version":"1.8.1",
                    "hyperparameters":{"imgsz":"640", "epochs":"10"}})

### Query the status of the training task
Once the query shows that the task is complete, run the next cell to download the model.

In [None]:
# get_training_job_status(jobid_local)
get_task_status(jobid_local)   
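Rather than re-running the status cell by hand, the check can be wrapped in a polling loop. The sketch below is generic: it takes any zero-argument function that returns a status string. The terminal state names follow SageMaker's training job statuses, but what `get_task_status` actually returns is not shown in this notebook, so both the state names and the helper itself are assumptions.

```python
import time
from typing import Callable, Iterable


def wait_for_task(
    get_status: Callable[[], str],
    done_states: Iterable[str] = ("Completed", "Failed", "Stopped"),
    interval_s: float = 60.0,
    timeout_s: float = 6 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal state or the timeout elapses."""
    done = set(done_states)
    waited = 0.0
    while True:
        status = get_status()
        if status in done:
            return status
        if waited >= timeout_s:
            raise TimeoutError("task still in state %r after %.0f s" % (status, waited))
        sleep(interval_s)
        waited += interval_s
```

If `get_task_status` returns such a string, a call might look like `wait_for_task(lambda: get_task_status(jobid_local))`.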

In [None]:
query_task_info(jobid_local)   # input: task id

In [None]:
query_task_log(jobid_local, 2)    # input: task id, number of log entries to print

### Download the trained model from AWS

In [6]:
download_trained_model(jobid_local)

<Response [200]>
Download trained model from: https://code-devcloud.s3.cn-north-1.amazonaws.com.cn/5d4152c3-b893-44d2-9a76-cfb8fe51270c/output/model.tar.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAXD3YERHOJTDWZEMR%2F20221102%2Fcn-north-1%2Fs3%2Faws4_request&X-Amz-Date=20221102T134325Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Security-Token=FwoDYXdzEO7%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDHPlxAUnj8dPHOEHdCKmAtfbgQQnbboU%2BXpo%2BfNty1M5ZBywodOMBWzu1TdeZAs5cLWL6oizHp4fqCR7p4Wg7SNRq4PpIIZV3kK36CvCf7CbFWu4dN1Cvwr5PWEgj9pAhe1Sm5Fh%2B%2By2bK1b6cIdAQJLcTWuiBXuZzo1fr1Pl%2Ffa2kV6S%2B2I6UPjEfYTKyauzfZUQBMJxXUGk6kot8cSEew8ElgLiBLiWzf3%2FBu95v4cIRDZZ8uhYiCbVwa5QCYChBa033rD7LwUu%2BPnBo%2Fw4DqQi%2Fa3FCeQj4PokKkLJCTE%2BwRYHJzoKDr0N5%2BvCcBylIDFoU5x7Ch3%2F3TU%2FCsCbiFJH%2FLEkt2WvYQK1dvWHvuhywy%2FukF9btyTlKVTczHLWEk%2Bav9UH4eRu%2BBtL%2BtCpEpVNujHjyj654mbBjItd1mJRhhyIP8biYgC6s2wSSQsY29iLJOJWN4KTMEuNXyLtoxoQHnkTaOIwlqy&X-Amz-Signature=0264049389da2a585af71b1473a22eac9e4176de247b0e4026d03
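The download link above is a presigned S3 URL, which is valid only for a limited time. As a small sketch (the helper name is hypothetical), the nominal expiry can be read from the `X-Amz-Date` and `X-Amz-Expires` query parameters. A link signed with temporary credentials may stop working earlier, when those credentials expire.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url: str) -> datetime:
    """Return the nominal UTC expiry time of a SigV4 presigned URL."""
    query = parse_qs(urlsplit(url).query)
    # X-Amz-Date is the signing time in ISO 8601 basic format (UTC).
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the validity period in seconds.
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
```

If the link has expired, running the download cell again should produce a fresh one.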

In [9]:
query_account_info(os.environ.get('USER'))  # input: account name

Query completed, account: u80176 info {'account_id': 6.0, 'account_email': 'u80176@intel.com', 'credits': 14.64, 'account_name': 'u80176'}


In [5]:
# cancel_task(jobid_local)  # input: task id

### Convert the downloaded model file to ONNX format, then to an OpenVINO IR file
#### Convert to ONNX format

Extract the YOLO model archive

In [None]:
!tar -zxvf model.tar.gz

In [23]:
# onnx>=1.9.0  # ONNX export
# Quote the requirement so the shell does not treat ">" as an output redirection.
!pip3 install "onnx>=1.9.0"
# onnx-simplifier>=0.3.6  # ONNX simplifier
!python3 yolov5/export.py --weights model.pt --data data/coco.yaml --imgsz 640 --batch-size 1 --include onnx


[notice] A new release of pip available: 22.2.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-mt8j15ik because the default path (/home/build/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
export: data=data/coco.yaml, weights=['model.pt'], imgsz=[640], batch_size=1, device=cpu, half=False, inplace=False, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5 🚀 v6.2-205-geef9057 Python-3.8.2 t

In [None]:
# Create FP32 IR files
!mo \
--input_model model.onnx \
--input_shape [1,3,640,640] \
--data_type FP32 \
--output_dir data

## Build docker image 

In [None]:
!buildah bud --format docker -f ./dockerfile/onnx_yolov5.dockerfile -t $REGISTRY_URL/yolov5:custom .

## Push custom image to Container Library

In [None]:
!buildah push $REGISTRY_URL/yolov5:custom

## Return to My Library in Container Playground to launch this container
Navigate to **My Library** > **Resources** and associate the ``yolov5:custom`` resource with a project. Configure the **Output Mount Point** as ``/mount_folder`` and set **Environment Variables** to include the required runtime DEVICE value. Finally, click the launch button.
