YOLO (You Only Look Once) is one of the fastest and most popular object detection models. YOLOv5 is an open-source implementation of the latest version of YOLO (for a quick test of loading YOLOv5 from PyTorch hub for inference, see here). This Object Detection with YOLOv5 iOS sample app uses the PyTorch scripted YOLOv5 model to detect objects from the 80 classes the model was trained on.
A new section on fine-tuning the YOLOv5 model with a custom dataset (aka transfer learning), including steps to change the iOS demo app to use the custom model, has been added.
- PyTorch 1.10 and torchvision 0.11 (Optional)
- Python 3.8 (Optional)
- iOS Cocoapods LibTorch-Lite 1.10.0
- Xcode 12 or later
To Test Run the Object Detection iOS App, follow the steps below:
If you don't have the PyTorch environment set up to run the script, you can download the model file here to the ios-demo-app/ObjectDetection/ObjectDetection folder, then skip the rest of this step and go to step 2 directly.
The Python script export.py in the models folder of the YOLOv5 repo is used to generate a TorchScript-formatted YOLOv5 model named yolov5s.torchscript.ptl for mobile apps.
Open a terminal on Mac, Linux, or Windows and run the following commands:
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt wandb
Note that the steps below have been tested with the commit cd35a009ba964331abccd30f6fa0614224105d39. If there's any issue with running the script or using the model, try git reset --hard cd35a009ba964331abccd30f6fa0614224105d39.
Edit export.py to make the following two changes:
- After f = file.with_suffix('.torchscript.pt'), add the line fl = file.with_suffix('.torchscript.ptl')
- After (optimize_for_mobile(ts) if optimize else ts).save(f), add the line (optimize_for_mobile(ts) if optimize else ts)._save_for_lite_interpreter(str(fl))
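The file names used in this step follow from how pathlib replaces a suffix. The snippet below is a minimal sketch, assuming that file in export.py is a pathlib.Path built from the --weights value; export_targets is a hypothetical helper used only for illustration and is not part of the YOLOv5 repo.

```python
from pathlib import Path
from typing import Tuple


def export_targets(weights: str) -> Tuple[Path, Path]:
    """Return the (full JIT, lite interpreter) output paths for a weights file."""
    file = Path(weights)
    f = file.with_suffix('.torchscript.pt')    # line already present in export.py
    fl = file.with_suffix('.torchscript.ptl')  # the line added in the first change
    return f, fl


f, fl = export_targets('yolov5s.pt')
print(f)   # yolov5s.torchscript.pt
print(fl)  # yolov5s.torchscript.ptl
```

Only the last suffix (.pt) is replaced, so a weights path such as runs/train/exp/weights/best.pt keeps its directory and becomes best.torchscript.ptl in the same folder.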
Finally, run the script below to generate the optimized TorchScript Lite Interpreter model and copy the generated model file yolov5s.torchscript.ptl to the ios-demo-app/ObjectDetection/ObjectDetection folder (the original full JIT model yolov5s.torchscript.pt is also generated for comparison):
python export.py --weights yolov5s.pt --include torchscript
Note that the small version of the YOLOv5 model, which runs faster but is less accurate, is generated by default when running export.py. You can also change the value of the weights parameter in export.py to generate the medium, large, or extra-large version of the model.
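As a rough illustration, the export command for each model size can be built by swapping the checkpoint name passed to --weights. This sketch assumes the standard released checkpoint names (yolov5s.pt, yolov5m.pt, yolov5l.pt, yolov5x.pt); VARIANTS and export_command are hypothetical names introduced here, and the code only builds the command without running it.

```python
import sys
from typing import List

# Assumed checkpoint file names for the four model sizes mentioned above.
VARIANTS = {
    "small": "yolov5s.pt",
    "medium": "yolov5m.pt",
    "large": "yolov5l.pt",
    "extra large": "yolov5x.pt",
}


def export_command(size: str) -> List[str]:
    """Build the export.py command line for one model size (does not run it)."""
    return [sys.executable, "export.py",
            "--weights", VARIANTS[size],
            "--include", "torchscript"]


for size in VARIANTS:
    # Skip the interpreter path so the printed line matches the command shown above.
    print(" ".join(export_command(size)[1:]))
```

To actually run one of these, pass the list to subprocess.run from inside the yolov5 directory. Larger variants are expected to be slower on device, so it is worth comparing them against the small model in the app before switching.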
Run the commands below:
pod install
open ObjectDetection.xcworkspace/
Select an iOS simulator or device in Xcode to run the app. You can go through the included example test images to see the detection results. You can also select a picture from your iOS device's Photos library, take a picture with the device camera, or even use the live camera to do object detection; see this video for a screencast of the app running.
Some example images and the detection results are as follows:
In this section, you'll see how to use an example dataset called aicook, used to detect ingredients in your fridge, to fine-tune the YOLOv5 model. For more info on the YOLOv5 transfer learning, see here. If you use the default YOLOv5 model to do object detection on what's inside your fridge, you'll likely not get good results. That's why you need to have a custom model trained with a dataset like aicook.
Go here to download the aicook dataset as a zip file. Unzip the file into your yolov5 repo directory, then run cd yolov5; mv train ..; mv valid ..; because the aicook data.yaml specifies the train and val folders to be one level up.
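Before training, it can help to confirm that the folders ended up where data.yaml expects them. The check below is a minimal sketch: the two entries in ASSUMED_ENTRIES are an assumption about what the aicook data.yaml contains, so compare them with your own copy of the file; check_layout is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict

# Assumed data.yaml entries; check your own file for the actual values.
ASSUMED_ENTRIES = {"train": "../train/images", "val": "../valid/images"}


def check_layout(repo_dir: str, entries: Dict[str, str]) -> Dict[str, bool]:
    """Map each data.yaml key to whether its path exists relative to repo_dir."""
    repo = Path(repo_dir)
    return {key: (repo / rel).is_dir() for key, rel in entries.items()}


# Run from the directory that contains the yolov5 repo folder.
for key, found in check_layout("yolov5", ASSUMED_ENTRIES).items():
    print(f"{key}: {'found' if found else 'missing'}")
```

A "missing" result usually means the mv commands were run from a different directory than intended, or that the paths in data.yaml differ from the ones assumed here.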
Run the script below to train a custom model; the resulting weights file best.pt is located in runs/train/exp/weights:
python train.py --img 640 --batch 16 --epochs 3 --data data.yaml --weights yolov5s.pt
With --epochs set to 3, the precision of the model is very low, less than 0.01. With a tool such as Weights and Biases, which can be set up in a few minutes and is integrated with YOLOv5, you can see that with --epochs set to 80 the precision reaches 0.95. On a CPU machine, however, you can quickly train a custom model using the command above and then test it in the iOS demo app. Below are sample wandb metrics from 3, 30, and 100 epochs of training:
With export.py modified as in step 1, Prepare the model, of the Quick Start section, you can convert the new custom model to its TorchScript lite version:
python export.py --weights runs/train/exp/weights/best.pt --include torchscript
The resulting best.torchscript.ptl is located in runs/train/exp/weights and needs to be added to the iOS ObjectDetection demo app project.
In Xcode, first in ViewController.swift, change the line
private let testImages = ["test1.png", "test2.jpg", "test3.png"]
to:
private let testImages = ["aicook1.jpg", "aicook2.jpg", "aicook3.jpg", "test1.png", "test2.jpg", "test3.png"]
(The three aicook test images have been added to the repo.)
Then, in ObjectDetector.swift, change the line
if let filePath = Bundle.main.path(forResource: "yolov5s.torchscript", ofType: "ptl"),
to:
if let filePath = Bundle.main.path(forResource: "best.torchscript", ofType: "ptl"),
and the line
if let filePath = Bundle.main.path(forResource: "classes", ofType: "txt"),
to:
if let filePath = Bundle.main.path(forResource: "aicook", ofType: "txt"),
(aicook.txt defines the 30 custom class names, copied from data.yaml in the custom dataset downloaded in step 1 of this section.)
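If you want to produce a class-name file like aicook.txt for your own dataset, the names can be pulled out of data.yaml with a few lines of Python. This is a sketch under two assumptions: that data.yaml writes the list inline as names: ['a', 'b', ...], and that the app expects one class name per line. class_names_from_yaml is a hypothetical helper, and the three class names in the sample are made up for illustration.

```python
import ast
import re
from typing import List


def class_names_from_yaml(yaml_text: str) -> List[str]:
    """Extract the names list, assuming it is written inline as names: ['a', 'b']."""
    match = re.search(r"^names:\s*(\[.*\])\s*$", yaml_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no inline names: [...] entry found")
    return [str(name) for name in ast.literal_eval(match.group(1))]


# Hypothetical three-class excerpt; the real aicook data.yaml lists 30 names.
sample = "nc: 3\nnames: ['apple', 'banana', 'beef']\n"
names = class_names_from_yaml(sample)
print("\n".join(names))  # one class name per line
```

To write the file, save "\n".join(names) to a text file and add it to the Xcode project. The order of the names must stay the same as in data.yaml, since the model reports classes by index.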
Finally, in PrePostProcessor.swift, change the line static let outputColumn = 85 to static let outputColumn = 35, which is 5 (left, top, right, bottom, score) + 30 (number of custom classes).
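The arithmetic behind outputColumn can be sketched as follows. The row layout (five box and score values followed by one score per class) is taken from the description above; output_columns and best_class are hypothetical helpers written in Python for illustration and are not the app's Swift code.

```python
from typing import List, Tuple

BOX_AND_SCORE = 5  # left, top, right, bottom, score


def output_columns(num_classes: int) -> int:
    """Number of values per detection row for a model with num_classes classes."""
    return BOX_AND_SCORE + num_classes


def best_class(row: List[float]) -> Tuple[int, float]:
    """Return (class index, class score) of the highest-scoring class in one row."""
    class_scores = row[BOX_AND_SCORE:]
    index = max(range(len(class_scores)), key=class_scores.__getitem__)
    return index, class_scores[index]


print(output_columns(80))  # 85, the default 80-class model
print(output_columns(30))  # 35, the 30-class aicook model

# Made-up row for a 3-class model: 4 box values, 1 score, 3 class scores.
row = [0.1, 0.2, 0.3, 0.4, 0.9, 0.05, 0.8, 0.15]
print(best_class(row))  # (1, 0.8)
```

If the model output is read as a flat buffer, a row width that does not match the model would shift the box and class values of every row after the first, which is why this constant has to change together with the model file.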
Run the app in Xcode and you should see the custom model working on the first three aicook test images: