Photo by Jon Tyson on Unsplash
In this tutorial we will detect a hand in an image on an iPhone. To do this we will create an object detection CoreML model using the TuriCreate toolkit.
-
Install Python
a) For macOS, install brew and python3:
> /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
> brew install python3
b) For Windows, download and install Python
NOTE: Python 3.7.7 was used to create this tutorial.
-
Install pip and create a Python virtual environment using venv. For Windows, follow the instructions on the page and then continue from step 5) in this guide: Python Environment. For macOS, follow the steps below.
-
On macOS, for convenience, the following lines may be added to /Users/username/.bash_profile:
alias python=python3
alias pip=pip3
export WORKON_HOME=~/.virtualenvs
mkdir -p $WORKON_HOME
. /usr/local/bin/virtualenvwrapper.sh
After that, run the following command to apply the bash profile:
> source /Users/username/.bash_profile
Replace username with your own user name.
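- The next step is to create a virtual environment:
> python3 -m venv ~/.virtualenvs/ml
Here ml is the name of the Python environment where all needed dependencies will be installed. It is created under ~/.virtualenvs so that it matches the path used by the activation command.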
- The final step before using the environment is to activate it:
> source ~/.virtualenvs/ml/bin/activate
- Install TuriCreate and Core ML tools. This step should be done in the active Python environment created previously:
(ml)> pip install -U turicreate
(ml)> pip install -U coremltools
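Before moving on, it can help to confirm that the environment is active and that both packages are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper names `in_virtualenv` and `check_packages` are hypothetical and not part of TuriCreate or Core ML tools.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment and
    # sys.base_prefix at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def check_packages(names):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    for name, found in check_packages(["turicreate", "coremltools"]).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If either package is reported as missing, the most likely cause is that `pip install` was run outside the activated environment.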
-
Download EgoHands and extract it into the root folder of this repository.
-
This step is optional: install VS Code. Any other IDE, or even a plain text editor, can be used instead.
-
The TuriCreate toolkit simplifies the process of creating ML models and makes it almost trivial. TuriCreate supports several neural network model types. For our needs we are going to use the model type called ObjectDetector, which performs labeling and localization tasks.
-
Before a neural network can be used, it must be trained on a so-called training data set, which is prepared beforehand in a special way.
-
Training usually takes a lot of time and requires powerful computing resources. To tackle this problem, TuriCreate uses a trick: it downloads a pre-trained neural network model and uses it as the starting point for the final training on the user's data. This step is also called fine-tuning.
-
In our case TuriCreate will use a pre-trained Darknet-YOLO, which is a fast, mid-size network that should be suitable for a mobile application.
-
As mentioned above, before training starts, the data, which is a list of images, must be prepared in a special way: each image must be annotated. That means we need to define a bounding box around each target object (a hand in our case) in each image in the dataset. Each bounding box should be associated with a label.
-
In our case we have two labels, which correspond to either the left or the right hand.
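To make the annotation format concrete, here is a minimal sketch of how one labeled bounding box could be built. TuriCreate's object detector expects each annotation as a dictionary with a `label` and `coordinates`, where `x` and `y` are the center of the box. The input format (a hand outline given as a list of `(x, y)` points), the helper name `polygon_to_annotation`, and the label strings `"left"` and `"right"` are assumptions made for illustration.

```python
def polygon_to_annotation(points, label):
    """Convert a hand outline into a TuriCreate-style bounding box annotation.

    points: list of (x, y) pixel coordinates outlining the hand (assumed input).
    label:  class name for the box, e.g. "left" or "right" (hypothetical names).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    width = x_max - x_min
    height = y_max - y_min
    return {
        "label": label,
        "type": "rectangle",
        # TuriCreate stores the CENTER of the box, not its top-left corner.
        "coordinates": {
            "x": x_min + width / 2,
            "y": y_min + height / 2,
            "width": width,
            "height": height,
        },
    }


if __name__ == "__main__":
    outline = [(100, 200), (180, 210), (170, 300), (110, 290)]
    print(polygon_to_annotation(outline, "left"))
```

An image with several hands simply gets a list of such dictionaries, one per hand.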
-
In this tutorial we will not prepare a training data set from scratch; instead we will use the one from EgoHands.
-
Briefly describe the repo structure and the structure of the EgoHands data set
-
Prepare the training data and visualize an example of the ground truth data
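One possible way to carry out this step is sketched below. It is not the script from this repository: the function names, the `image_dir` argument, and the `annotations_by_name` dictionary (image file name mapped to a list of TuriCreate-style annotations) are all assumptions. The TuriCreate calls used are `image_analysis.load_images` and `object_detector.util.draw_bounding_boxes`; `turicreate` is imported lazily so the pure helper can be used without it.

```python
import os


def annotations_for(path, annotations_by_name):
    """Look up the annotations of an image by its file name (empty list if none)."""
    return annotations_by_name.get(os.path.basename(path), [])


def build_training_sframe(image_dir, annotations_by_name):
    """Load images, attach annotations, and render the ground truth boxes.

    Assumes `annotations_by_name` was prepared beforehand from the data set.
    """
    import turicreate as tc  # imported here so the helper above has no dependency

    data = tc.image_analysis.load_images(image_dir, with_path=True)
    data["annotations"] = data["path"].apply(
        lambda p: annotations_for(p, annotations_by_name)
    )
    # Extra column with the boxes drawn on top of each image, for inspection.
    data["image_with_ground_truth"] = tc.object_detector.util.draw_bounding_boxes(
        data["image"], data["annotations"]
    )
    return data
```

Calling `data.explore()` on the returned SFrame opens an interactive view in which the `image_with_ground_truth` column shows the annotated hands, and `data.save(...)` stores the prepared data for the training step.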
-
Run a script to create and train a network model
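A training script along these lines might look like the following minimal sketch. It assumes the prepared data was saved as an SFrame with an `image` column and an `annotations` column; the function name, the file names, and the 80/20 split are hypothetical choices, not taken from this repository. `turicreate` is imported inside the function so the file can be loaded without it.

```python
def train_hand_detector(sframe_path, model_path="HandDetector.mlmodel",
                        max_iterations=0, seed=42):
    """Fine-tune an object detector on annotated images and export it to Core ML.

    sframe_path:    saved SFrame with "image" and "annotations" columns (assumed).
    model_path:     where the exported Core ML model is written (hypothetical name).
    max_iterations: 0 lets TuriCreate choose the number of iterations itself.
    """
    import turicreate as tc

    data = tc.SFrame(sframe_path)
    # Hold out part of the data to evaluate the trained model.
    train_data, test_data = data.random_split(0.8, seed=seed)

    model = tc.object_detector.create(
        train_data,
        feature="image",
        annotations="annotations",
        max_iterations=max_iterations,
    )

    metrics = model.evaluate(test_data)
    model.export_coreml(model_path)
    return metrics
```

For a quick smoke test, a small `max_iterations` value keeps the run short, at the cost of a much less accurate model.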
-
To investigate the resulting CoreML model in more depth, we can open it in the Netron viewer: