This MLHub package provides command line tools based on Microsoft's open source Computer Vision Best Practice repository available from https://github.com/microsoft/ComputerVision. Individual command line tools are packaged for common computer vision tasks including image classification and object detection, based on a collection of pre-built computer vision models. These command line tools aim to be used as part of a Linux pipeline of image processing commands. The package also provides an interactive demonstration as an overview of the capabilities of the repository.
Giving computers some semblance of vision (and the implied ability to process the images and identify the content) has become practical with the massive compute and training data that is available today. A number of pre-built machine learning models are freely available for this task and through this package we make them available for command line scripts.
Visit the github repository for for further examples and code: https://github.com/gjwgit/cvbp
ml demo cvbp
ml classify cvbp
ml classify cvbp https://upload.wikimedia.org/wikipedia/commons/2/2d/Koala_in_Australia.JPG
- To install and configure mlhub (Ubuntu 18.04 LTS)
$ pip3 install mlhub
$ ml configure
- To install and configure the package:
$ ml install cvbp
$ ml configure cvbp
$ ml readme cvbp
$ ml commands cvbp
- To run a canned demonstration or an interactive GUI:
$ ml demo cvbp
$ ml gui cvbp
In addition to the demo presented below, the cvbp package provides useful command line tools. Below we demonstrate a number of these. Most commands take an image as a parameter which may be a url or a path to a local file.
classify
The classify command will identify the dominant object in a photo with a level of confidence. The confidence, class, pre-built model used, and filename are printed and can be piped on to other commands within a command line. This can allow us to add the class as a tag to the meta data of an image file.
This example image from the Internet is classified 100% as a koala by the default resnet152 model.
$ ml classify cvbp https://upload.wikimedia.org/wikipedia/commons/2/2d/Koala_in_Australia.JPG
1.00,koala,resnet152,https://upload.wikimedia.org/wikipedia/commons/2/2d/Koala_in_Australia.JPG
As an example of command line processing, we could download this photo (noting its license) and add the appropriate tag to it:
$ wget -O koala.jpg https://upload.wikimedia.org/wikipedia/commons/2/2d/Koala_in_Australia.JPG
$ exiftool koala.jpg | grep -i comment
$ ml classify cvbp koala.jpg |
cut -d, -f2 |
xargs bash -c 'mogrify -comment $0 koala.jpg'
$ exiftool koala.jpg | grep -i comment
Comment : koala
Coffee mugs seem to be fairly standard fare for image classification:
$ ml classify cvbp https://cvbp.blob.core.windows.net/public/images/cvbp_cup.jpg
0.68,coffee_mug,resnet152,https://cvbp.blob.core.windows.net/public/images/cvbp_cup.jpg
Different pre-built image classification models are available. Here we use a more complex model to provide a more confident classification of the coffee mug:
$ ml classify cvbp --model=resnet152 https://cvbp.blob.core.windows.net/public/images/cvbp_cup.jpg
0.85,coffee_mug,resnet152,https://cvbp.blob.core.windows.net/public/images/cvbp_cup.jpg
If no image is supplied on the command line then the computer's webcam will be used to source a live feed and any objects held up to the camera will be classified. The live classification is available within the displayed live video image.
$ ml classify cvbp
If there is more than one camera on the computer then the command may not select the best one. For example, you may have an infrared camera and a webcam on your computer. If the resulting window is all green, then the infrared camera has been chosen. To choose your webcam the use of the --webcam command line option may help:
$ ml classify cvbp --webcam=1
The --webcam command line option is also supported across other commands that use the webcam, including demo.
Multiple images can be classified with one command line:
$ ml classify cvbp images/*.png
0.98,chickadee,resnet152,images/image_01_bw.png
0.99,custard_apple,resnet152,images/image_02_bw.png
1.00,echidna,resnet152,images/image_03_bw.png
0.98,Great_Pyrenees,resnet152,images/image_04_bw.png
0.96,beacon,resnet152,images/image_05_bw.png
0.68,Maltese_dog,resnet152,images/image_06_bw.png
0.91,great_white_shark,resnet152,images/image_07_bw.png
0.13,banana,resnet152,images/image_09_bw.png
0.90,redshank,resnet152,images/image_10_bw.png
We can add a tag to photos which are classified with a confidence greater than 75%. This might allow us to later on search for photos using the photo meta-data tag.
$ ml classify cvbp images/*.png |
awk '$1>0.75{print}' |
cut -d, -f2,4 |
tr ',' ' ' |
xargs -d'\n' -n1 bash -c 'mogrify -comment $0 $1'
The default pre-built model used for classification is resent152 as note above. A collection of alternative pre-built models is available and can be accessed using the --model option. The available models can be listed with --model=list.
$ ml classify cvbp --model=list
densenet201
alexnet
densenet121
densenet161
densenet169
densenet201
resnet101
resnet152
resnet18
resnet34
resnet50
squeezenet1_0
squeezenet1_1
vgg16_bn
vgg19_bn
All models can be selected with --model=all. Note that the first time a pre-built model is utilised it will need to be downloaded from the Internet, which can take a little time.
$ ml classify cvbp --model=all images/*.jpg
0.44,espresso,densenet201,images/coffee_mug.jpg
0.73,coffee_mug,alexnet,images/coffee_mug.jpg
0.44,coffee_mug,densenet121,images/coffee_mug.jpg
0.81,coffee_mug,densenet161,images/coffee_mug.jpg
0.42,coffee_mug,densenet169,images/coffee_mug.jpg
0.44,espresso,densenet201,images/coffee_mug.jpg
0.81,coffee_mug,resnet101,images/coffee_mug.jpg
0.85,coffee_mug,resnet152,images/coffee_mug.jpg
0.68,coffee_mug,resnet18,images/coffee_mug.jpg
0.76,coffee_mug,resnet34,images/coffee_mug.jpg
0.48,coffee_mug,resnet50,images/coffee_mug.jpg
0.71,coffee_mug,squeezenet1_0,images/coffee_mug.jpg
0.68,coffee_mug,squeezenet1_1,images/coffee_mug.jpg
0.54,coffee_mug,vgg16_bn,images/coffee_mug.jpg
0.83,coffee_mug,vgg19_bn,images/coffee_mug.jpg
1.00,kite,densenet201,images/kite.jpg
1.00,kite,alexnet,images/kite.jpg
0.98,kite,densenet121,images/kite.jpg
0.99,kite,densenet161,images/kite.jpg
0.99,kite,densenet169,images/kite.jpg
1.00,kite,densenet201,images/kite.jpg
0.99,kite,resnet101,images/kite.jpg
0.88,kite,resnet152,images/kite.jpg
0.91,kite,resnet18,images/kite.jpg
0.98,kite,resnet34,images/kite.jpg
1.00,kite,resnet50,images/kite.jpg
0.75,kite,squeezenet1_0,images/kite.jpg
0.95,kite,squeezenet1_1,images/kite.jpg
1.00,kite,vgg16_bn,images/kite.jpg
1.00,kite,vgg19_bn,images/kite.jpg
Otherwise individual models can be chosen with --model=densenet201, for example.
$ ml classify cvbp --model=densenet201 images/*.jpg
0.44,espresso,densenet201,images/coffee_mug.jpg
1.00,kite,densenet201,images/kite.jpg
0.66,balance_beam,densenet201,images/yogapose.jpg
detect
The detect command, in addition to identifying objects, returns the bounding box for the objects found in the image, consisting of the left top and right bottom coordinates.
$ ml detect cvbp images/*.{jpg,png}
1.00,cup,61,59,273,244,images/coffee_mug.jpg
1.00,bird,86,56,605,761,images/kite.jpg
0.83,dog,28,100,200,186,images/image_04_bw.png
0.16,sheep,30,100,195,184,images/image_04_bw.png
0.05,cow,241,125,255,204,images/image_04_bw.png
0.74,person,89,50,193,190,images/image_06_bw.png
0.22,horse,86,48,195,193,images/image_06_bw.png
0.17,dog,106,52,193,181,images/image_06_bw.png
0.08,umbrella,6,55,236,238,images/image_07_bw.png
0.99,bird,37,31,251,246,images/image_10_bw.png
As with classify, with no argument the webcam is deployed to obtain images and to detect objects in real time.
$ ml demo cvbp