Object Detection with YOLOv5 on iOS
Introduction
YOLO (You Only Look Once) is one of the fastest and most popular object detection models. YOLOv5 is an open-source implementation of the latest version of YOLO (for a quick test of loading YOLOv5 from PyTorch Hub for inference, see here). This Object Detection with YOLOv5 iOS sample app uses the PyTorch scripted YOLOv5 model to detect objects of the 80 classes the model was trained on.
Update 10-07-2021: A new section on using a custom dataset to fine-tune the YOLOv5 model (aka transfer learning), with steps to change the iOS demo app to use the custom model, has been added.
Prerequisites
Quick Start
To test run the Object Detection iOS app, follow the steps below:
1. Prepare the model
If you don't have the PyTorch environment set up to run the script, you can download the model file here to the `ios-demo-app/ObjectDetection/ObjectDetection` folder, then skip the rest of this step and go to step 2 directly.

The Python script `export.py` in the `models` folder of the YOLOv5 repo is used to generate a TorchScript-formatted YOLOv5 model named `yolov5s.torchscript.ptl` for mobile apps. Open a Mac/Linux/Windows Terminal and run the following commands:
Note that the steps below have been tested with the commit `cd35a009ba964331abccd30f6fa0614224105d39`; if there's any issue with running the script or using the model, try `git reset --hard cd35a009ba964331abccd30f6fa0614224105d39`.

Edit `export.py` to make the following two changes:

* After `f = file.with_suffix('.torchscript.pt')`, add a line `fl = file.with_suffix('.torchscript.ptl')`

* After `(optimize_for_mobile(ts) if optimize else ts).save(f)`, add `(optimize_for_mobile(ts) if optimize else ts)._save_for_lite_interpreter(str(fl))`

Finally, run the script below to generate the optimized TorchScript Lite Interpreter model and copy the generated model file `yolov5s.torchscript.ptl` to the `ios-demo-app/ObjectDetection/ObjectDetection` folder (the original full JIT model `yolov5s.torchscript.pt` is also generated for comparison):

Note that the small version of the YOLOv5 model, which runs faster but with less accuracy, is generated by default when running `export.py`. You can also change the value of the `weights` parameter in `export.py` to generate the medium, large, and extra-large versions of the model.

2. Use LibTorch-Lite
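The two `export.py` edits described above can be sketched as follows. This is an illustrative outline, not a drop-in patch: the names `file`, `ts`, and `optimize` are assumed from the YOLOv5 `export.py` at the pinned commit.

```python
from pathlib import Path

# Sketch of the two export.py edits (variable names such as `file`,
# `ts`, and `optimize` are assumed from the YOLOv5 export.py).
file = Path('yolov5s.pt')

f = file.with_suffix('.torchscript.pt')    # existing line: full JIT model path
fl = file.with_suffix('.torchscript.ptl')  # added line: lite interpreter model path

print(f)   # yolov5s.torchscript.pt
print(fl)  # yolov5s.torchscript.ptl

# After the existing save call for the full JIT model:
#   (optimize_for_mobile(ts) if optimize else ts).save(f)
# the added line saves the lite interpreter version:
#   (optimize_for_mobile(ts) if optimize else ts)._save_for_lite_interpreter(str(fl))
```

The `.ptl` extension is just a naming convention for lite interpreter models; what matters is that `_save_for_lite_interpreter` is called instead of (or in addition to) `save`.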
Run the commands below:
3. Run the app
Select an iOS simulator or device in Xcode to run the app. You can go through the included example test images to see the detection results. You can also select a picture from your iOS device's Photos library, take a picture with the device camera, or even use the live camera for object detection - see this video for a screencast of the app running.
Some example images and the detection results are as follows:
Transfer Learning
In this section, you'll see how to use an example dataset called aicook, used to detect ingredients in your fridge, to fine-tune the YOLOv5 model. For more info on the YOLOv5 transfer learning, see here. If you use the default YOLOv5 model to do object detection on what's inside your fridge, you'll likely not get good results. That's why you need to have a custom model trained with a dataset like aicook.
1. Download the custom dataset
Simply go here to download the aicook dataset as a zip file. Unzip the file into your `yolov5` repo directory, then run `cd yolov5; mv train ..; mv valid ..`, as the aicook `data.yaml` specifies the `train` and `val` folders to be one level up.

2. Retrain YOLOv5 with the custom dataset
Run the script below to generate a custom model `best.torchscript.ptl`, located in `runs/train/exp/weights`:

The precision of the model with `--epochs` set to 3 is very low - less than 0.01, actually. With a tool such as Weights & Biases, which can be set up in a few minutes and is integrated with YOLOv5, you can find that with `--epochs` set to 80, the precision reaches 0.95. But on a CPU machine, you can quickly train a custom model using the command above, then test it in the iOS demo app. Below are sample wandb metrics from 3, 30, and 100 epochs of training:

3. Convert the custom model to the lite version
With `export.py` modified as in step 1 (Prepare the model) of the Quick Start section, you can convert the new custom model to its TorchScript lite version:

The resulting `best.torchscript.ptl` is located in `runs/train/exp/weights`; it needs to be added to the iOS ObjectDetection demo app project.

4. Update the demo app
In Xcode, first in `ViewController.swift`, change the line `private let testImages = ["test1.png", "test2.jpg", "test3.png"]` to `private let testImages = ["aicook1.jpg", "aicook2.jpg", "aicook3.jpg", "test1.png", "test2.jpg", "test3.png"]` (the three aicook test images have been added to the repo).
Then change the following lines in `ObjectDetector.swift`:

to:
and
to:
(`aicook.txt` defines the 30 custom class names, copied from `data.yaml` in the custom dataset downloaded in step 1 of this section.)

Finally, in `PrePostProcessor.swift`, change the line `static let outputColumn = 85` to `static let outputColumn = 35`, which is 5 (left, top, right, bottom, score) + 30 (the number of custom classes).

Run the app in Xcode and you should see the custom model working on the first three aicook test images:
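The `outputColumn` arithmetic can be sanity-checked with a small sketch of one YOLOv5 output row; the layout below follows the description above (four box values plus a score, then one score per class), and the names are illustrative, not the app's actual Swift code:

```python
# Sketch of one YOLOv5 output row for the 30-class aicook model
# (dummy values; layout: left, top, right, bottom, score + per-class scores).
NUM_CLASSES = 30
OUTPUT_COLUMN = 5 + NUM_CLASSES  # matches `static let outputColumn = 35`

row = [0.0] * OUTPUT_COLUMN
box = row[:4]            # left, top, right, bottom
score = row[4]           # box confidence score
class_scores = row[5:]   # one score per custom class

assert len(class_scores) == NUM_CLASSES
print(OUTPUT_COLUMN)  # 35
```

The same arithmetic explains the default value of 85: 5 + 80 classes for the stock COCO-trained model.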