# OpenCV's DNN module as an inference engine

The ImageNet image database is organized according to the WordNet hierarchy. Each meaningful concept in WordNet, which may be described by multiple words, is called a synonym set, or synset. The 1,000 classes used here are stored in the synset file. If we open this file, we can see the 1,000 categories: each row corresponds to one category and starts with an ID, followed by one or more words describing the category.

In [1]:
import numpy as np
import cv2

In [9]:
# Load the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('Images/typewriter.jpg')
print(type(img))

<class 'numpy.ndarray'>


In [10]:
# Read the synset classes file, strip surrounding whitespace, and split it into rows
all_rows = open('synset_words.txt').read().strip().split('\n')

all_rows[1:10]

['n01443537 goldfish, Carassius auratus',
 'n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
 'n01491361 tiger shark, Galeocerdo cuvieri',
 'n01494475 hammerhead, hammerhead shark',
 'n01496331 electric ray, crampfish, numbfish, torpedo',
 'n01498041 stingray',
 'n01514668 cock',
 'n01514859 hen',
 'n01518878 ostrich, Struthio camelus']

In [14]:
# Keep only the description from each row and drop the ID, using a list comprehension.
# r.find(' ') locates the first space (the end of the ID); +1 starts the slice just after it.
classes = [r[r.find(' ') + 1:] for r in all_rows]

classes[1:10]

['goldfish, Carassius auratus',
 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
 'tiger shark, Galeocerdo cuvieri',
 'hammerhead, hammerhead shark',
 'electric ray, crampfish, numbfish, torpedo',
 'stingray',
 'cock',
 'hen',
 'ostrich, Struthio camelus']
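
The list comprehension above discards the ID. If the ID is also needed, the same row format can be split into both parts. The following is a minimal sketch, assuming every row has the form `<id> <name>, <name>, ...`; `parse_synset_row` is a hypothetical helper, not part of the notebook.

```python
def parse_synset_row(row):
    """Hypothetical helper: split one synset row into (wordnet_id, [names])."""
    # partition splits at the first space only, so names containing spaces survive
    wordnet_id, _, description = row.partition(' ')
    names = [name.strip() for name in description.split(',')]
    return wordnet_id, names


row = 'n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias'
wordnet_id, names = parse_synset_row(row)
print(wordnet_id)  # n01484850
print(names[0])    # great white shark
```

Splitting on the first space only matters because many class names, such as "great white shark", contain spaces themselves.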

In [18]:
# Print the first four classes with their indices
for i, c in enumerate(classes[:4]):
    print(i, c)

0 tench, Tinca tinca
1 goldfish, Carassius auratus
2 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
3 tiger shark, Galeocerdo cuvieri


In [20]:
# Show image
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Ready to pass the image through a pre-trained model
We're now in a position to pass this image through a pre-trained model. Press any key to close the image window; next, we look at passing the image through the model.

------------

## Output of Image Classification Using Caffe Model

So we've loaded the different classes of ImageNet, and we're going to use the OpenCV DNN module as an inference engine. What we're going to do here is pass an image through a pre-trained model that has been trained on the 1,000 classes of ImageNet. The model will then output, for each of the 1,000 classes, the probability that the image belongs to that class.
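
To make "probability for each class" concrete: classification networks of this kind typically end in a softmax layer, which turns raw scores into values between 0 and 1 that sum to 1. The sketch below assumes that behavior and uses made-up scores for three classes rather than real model output.

```python
import numpy as np


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Made-up raw scores for three classes (a real model would produce 1,000)
raw_scores = np.array([2.0, 1.0, 0.1])
probs = softmax(raw_scores)

print(probs.round(3))
print(probs.sum())
```

The largest raw score always maps to the largest probability, which is why ranking the output values is enough to find the most likely classes.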

### readNetFromCaffe

OpenCV supports models from Caffe, TensorFlow, Torch, Darknet, and the ONNX format, so all you need to do is load the model weights and any configuration files for your use case.

`readNetFromCaffe` takes two arguments: the path to the prototxt file, which holds a text description of the network architecture, and the path to the caffemodel file, which holds the trained model.

    cv2.dnn.readNetFromCaffe(prototxt, caffeModel)
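
Because both arguments are file paths, a mistyped path is an easy mistake to make. The following is a minimal sketch of checking the paths up front; `require_files` is a hypothetical helper introduced here for illustration, and the commented-out usage assumes the directory layout used in this notebook.

```python
from pathlib import Path


def require_files(*paths):
    """Hypothetical helper: return the paths as strings, or raise if any is missing."""
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError('Missing model file(s): ' + ', '.join(missing))
    return [str(p) for p in paths]


# Usage with the notebook's paths (commented out so the sketch runs without the model files):
# prototxt, caffemodel = require_files('CaffeModel/bvlc_googlenet.prototxt',
#                                      'CaffeModel/bvlc_googlenet.caffemodel')
# net = cv2.dnn.readNetFromCaffe(prototxt, caffemodel)
```

Checking first gives an error message that names the missing file directly, which is usually easier to act on than an error raised from inside the loader.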
            

In [43]:
# Read the caffe file
net = cv2.dnn.readNetFromCaffe('CaffeModel/bvlc_googlenet.prototxt', 'CaffeModel/bvlc_googlenet.caffemodel')

In [44]:
net

<dnn_Net 000001DB38FEAE30>

Check the model definition to see what input size the network expects. The prototxt file tells us that the model expects images of size 224 by 224.

### Create the blob once the net is loaded

In [45]:
# Create blob
blob = cv2.dnn.blobFromImage(img, 1, (224, 224))
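
In the call above, `1` is the scale factor and `(224, 224)` is the target size; no mean is passed, so nothing is subtracted. The result is a 4-D array in batch, channel, height, width order. The NumPy sketch below illustrates only that layout change and the `(pixel - mean) * scale` arithmetic on an image that is already 224 by 224. `to_blob` is a hypothetical helper; it does no resizing and is not OpenCV's implementation.

```python
import numpy as np


def to_blob(image_hwc, scale=1.0, mean=(0.0, 0.0, 0.0)):
    """Hypothetical helper: mimic the layout produced by blobFromImage (no resizing)."""
    pixels = image_hwc.astype(np.float32)
    # Subtract the per-channel mean, then apply the scale factor
    pixels = (pixels - np.array(mean, dtype=np.float32)) * scale
    # Reorder height-width-channel to channel-height-width
    chw = pixels.transpose(2, 0, 1)
    # Add a leading batch dimension of size 1
    return chw[np.newaxis, ...]


# Synthetic stand-in for an image that has already been resized to 224 x 224
fake_img = np.zeros((224, 224, 3), dtype=np.uint8)
blob_sketch = to_blob(fake_img)
print(blob_sketch.shape)  # (1, 3, 224, 224)
```

Printing `blob.shape` for the real blob in the notebook is a quick way to confirm that it matches the input size the model definition asks for.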

### Set Input

In [46]:
net.setInput(blob)

### Run a forward pass to get the predictions for each of the 1,000 classes

In [47]:
# Run a forward pass to get the predictions for each of the 1,000 classes
output = net.forward()

In [48]:
type(output)

numpy.ndarray

### Get the top 5 classes and their probabilities

In [50]:
# Indices of the top 5 probabilities
idx = np.argsort(output[0])[::-1][:5] # take the first row, sort ascending, reverse, keep the first 5

In [53]:
idx

array([878, 810, 753, 844, 827], dtype=int64)
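
The `argsort`, reverse, slice chain is easier to follow on a small array. The sketch below uses four made-up scores in place of `output[0]` and keeps the top 2 instead of the top 5.

```python
import numpy as np

# Toy scores standing in for output[0]; the values are made up for illustration
scores = np.array([0.10, 0.60, 0.05, 0.25])

ascending = np.argsort(scores)  # indices from lowest to highest score
descending = ascending[::-1]    # reverse so the highest score comes first
top2 = descending[:2]           # keep the two best indices

print(ascending)  # [2 0 3 1]
print(top2)       # [1 3]
```

Note that `argsort` returns positions, not values, which is why the result can be used directly to index both `classes` and `output[0]`.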

In [57]:
# Print the rank, label, class index, and probability of each top-5 prediction
for i, class_id in enumerate(idx):
    print('{}. {} ({}): Probability {:.3}'.format(i+1, classes[class_id], class_id, output[0][class_id]))

1. typewriter keyboard (878): Probability 0.854
2. space bar (810): Probability 0.0545
3. radiator (753): Probability 0.0201
4. switch, electric switch, electrical switch (844): Probability 0.00888
5. stove (827): Probability 0.00873


----------

## Final Python Code

In [58]:
import numpy as np
import cv2

# Load img
img = cv2.imread('Images/typewriter.jpg')

# Read the synset classes file, strip surrounding whitespace, and split it into rows
all_rows = open('synset_words.txt').read().strip().split('\n')

# Keep only the description from each row and drop the ID, using a list comprehension.
# r.find(' ') locates the first space (the end of the ID); +1 starts the slice just after it.
classes = [r[r.find(' ') + 1:] for r in all_rows]

# Read the caffe file
net = cv2.dnn.readNetFromCaffe('CaffeModel/bvlc_googlenet.prototxt', 'CaffeModel/bvlc_googlenet.caffemodel')

# Create blob
blob = cv2.dnn.blobFromImage(img, 1, (224, 224))

# Set Input
net.setInput(blob)

# Run a forward pass to get the predictions for each of the 1,000 classes
output = net.forward()

# Indices of the top 5 probabilities
idx = np.argsort(output[0])[::-1][:5] # take the first row, sort ascending, reverse, keep the first 5

# Print the rank, label, class index, and probability of each top-5 prediction
for i, class_id in enumerate(idx):
    print('{}. {} ({}): Probability {:.3}'.format(i+1, classes[class_id], class_id, output[0][class_id]))

# Show image
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

1. typewriter keyboard (878): Probability 0.854
2. space bar (810): Probability 0.0545
3. radiator (753): Probability 0.0201
4. switch, electric switch, electrical switch (844): Probability 0.00888
5. stove (827): Probability 0.00873
