# https://matplotlib.org/users/image_tutorial.html  Visit this link for more about plotting with matplotlib

Apart from OpenCV, Python also provides the time module, which is helpful for measuring execution time. Another module, profile, produces a detailed report on the code, such as how much time each function took and how many times it was called. If you are using IPython, all these features are integrated in a user-friendly manner.
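
As a minimal sketch of the two standard-library modules mentioned above: time.perf_counter gives a wall-clock measurement, and cProfile (the C implementation of the profile interface) gives a per-function report. The busy_work function and its argument are hypothetical, chosen only to have something to measure.

```python
import cProfile
import io
import pstats
import time


def busy_work(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))


# Wall-clock timing with the time module.
start = time.perf_counter()
result = busy_work(100_000)
elapsed = time.perf_counter() - start
print(f"busy_work took {elapsed:.6f} s")

# Per-function report with cProfile: call counts and time per function.
profiler = cProfile.Profile()
profiler.enable()
busy_work(100_000)
profiler.disable()

buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In IPython the same measurements are available through the %timeit and %prun magics.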

trying the image tutorials

In [6]:
from matplotlib import pyplot as plt
import numpy as np
import cv2


Measuring Performance with OpenCV

The cv2.getTickCount function returns the number of clock cycles from a reference event (such as the moment the machine was switched on) to the moment the function is called. So if you call it before and after a function executes, the difference is the number of clock cycles used to execute that function.

The cv2.getTickFrequency function returns the frequency of clock cycles, that is, the number of clock cycles per second. So to find the execution time in seconds, divide the cycle count by the frequency:

In [7]:
e1 = cv2.getTickCount()
# your code execution
print("saicharan")
e2 = cv2.getTickCount()
elapsed = (e2 - e1) / cv2.getTickFrequency()  # named elapsed so it does not shadow the time module
print(elapsed)

saicharan
0.00022753


In [8]:
img1 = cv2.imread('images/imge1.jpg')

e1 = cv2.getTickCount()
for i in range(5,49,2):
    img1 = cv2.medianBlur(img1,i)
e2 = cv2.getTickCount()
t = (e2 - e1)/cv2.getTickFrequency()
print(t)

0.000181318


Default Optimization in OpenCV

Many OpenCV functions are optimized using SSE2, AVX, etc. The library also contains unoptimized code. So if your system supports these features, you should exploit them (almost all modern processors do). Optimization is enabled by default at compile time, so OpenCV runs the optimized code if it is enabled and the unoptimized code otherwise. You can use cv2.useOptimized() to check whether it is enabled and cv2.setUseOptimized() to enable or disable it. Let’s see a simple example.

In [9]:
cv2.useOptimized()

True

In [10]:
%timeit res = cv2.medianBlur(img1,49)

324 ns ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [11]:
cv2.setUseOptimized(False)

In [12]:
cv2.useOptimized()

False

In [13]:
%timeit res = cv2.medianBlur(img1,49)
# with optimization disabled, this run is marginally slower than the one above

334 ns ± 17.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [14]:
x = 5
%timeit y=x**2

248 ns ± 35.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [15]:
%timeit y=x*x
# x*x is much faster than x**2

48.8 ns ± 7.38 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [16]:
z = np.uint8([5])
%timeit y=z*z

490 ns ± 50.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [17]:
%timeit y=np.square(z)

576 ns ± 53.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Note: Python scalar operations are faster than NumPy scalar operations. So for operations involving one or two elements, a Python scalar is better than a NumPy array. NumPy has the advantage once the array is somewhat larger.

In [19]:
%timeit z = cv2.countNonZero(img1)

270 ns ± 25.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [21]:
%timeit z = np.count_nonzero(img1)

797 ns ± 96.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Normally, OpenCV functions are faster than NumPy functions, so for the same operation OpenCV functions are preferred. There can be exceptions, though, especially when NumPy works with views instead of copies.

#### Performance Optimization Techniques

There are several techniques and coding methods for getting maximum performance out of Python and NumPy. Only the relevant ones are noted here, with links to important sources. The main point is to first implement the algorithm in a simple manner. Once it is working, profile it, find the bottlenecks, and optimize them.

1. Avoid using loops in Python as far as possible, especially double and triple loops. They are inherently slow.
2. Vectorize the algorithm/code to the maximum possible extent, because NumPy and OpenCV are optimized for vector operations.
3. Exploit the cache coherence.
4. Never make copies of an array unless needed. Try to use views instead. Array copying is a costly operation.
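
The loop, vectorization, and view points above can be illustrated with a small sketch. The array contents and variable names are hypothetical; np.shares_memory is used only to show which results reuse the original buffer.

```python
import numpy as np

# Hypothetical data: a small grayscale "image".
img = np.arange(12, dtype=np.uint8).reshape(3, 4)

# Loop version: doubles every pixel one element at a time (slow in Python).
looped = np.empty_like(img)
for r in range(img.shape[0]):
    for c in range(img.shape[1]):
        looped[r, c] = img[r, c] * 2

# Vectorized version: one expression, the loop runs in compiled code.
vectorized = img * 2

# Basic slicing returns a view that shares memory with img ...
view = img[:, 1:3]
# ... whereas .copy() allocates and fills a new buffer.
copy = img[:, 1:3].copy()

print(np.array_equal(looped, vectorized))
print(np.shares_memory(img, view), np.shares_memory(img, copy))
```

On arrays this small the timing difference is negligible; the gap between the loop and the vectorized expression grows with the array size.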

Additional Resources

Python Optimization Techniques : https://wiki.python.org/moin/PythonSpeed/PerformanceTips
Scipy Lecture Notes - Advanced Numpy : http://scipy-lectures.github.io/advanced/advanced_numpy/index.html#advanced-numpy
Timing and Profiling in IPython : http://pynash.org/2013/03/06/timing-and-profiling.html

#### Overview: Optimize what needs optimizing

You can only know what makes your program slow after first getting the program to give correct results, then running it to see if the correct program is slow. When found to be slow, profiling can show what parts of the program are consuming most of the time. A comprehensive but quick-to-run test suite can then ensure that future optimizations don't change the correctness of your program. In short:

1. Get it right.
2. Test it's right.
3. Profile if slow.
4. Optimise.
5. Repeat from 2.
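
One way to follow these steps is sketched below, assuming a hypothetical pair-counting task: a simple reference version is written first, a small check function acts as the quick-to-run test suite, and the optimized version has to pass the same checks before its timing is compared. All function names here are illustrative.

```python
import timeit


def count_pairs_reference(values, target):
    """Straightforward O(n^2) version: easy to verify by inspection."""
    count = 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] + values[j] == target:
                count += 1
    return count


def count_pairs_fast(values, target):
    """Optimized O(n) version using a dictionary of values seen so far."""
    seen = {}
    count = 0
    for v in values:
        count += seen.get(target - v, 0)
        seen[v] = seen.get(v, 0) + 1
    return count


def check(func):
    """Tiny regression suite, rerun after every optimization."""
    assert func([], 5) == 0
    assert func([1, 4, 2, 3], 5) == 2
    assert func([2, 2, 2], 4) == 3
    return True


data = list(range(300))
reference_ok = check(count_pairs_reference)
fast_ok = check(count_pairs_fast)
same_answer = count_pairs_reference(data, 299) == count_pairs_fast(data, 299)

t_ref = timeit.timeit(lambda: count_pairs_reference(data, 299), number=5)
t_fast = timeit.timeit(lambda: count_pairs_fast(data, 299), number=5)
print(f"reference: {t_ref:.4f} s, fast: {t_fast:.4f} s")
```

Keeping the reference version around makes it cheap to confirm that later optimizations still return the same answers.
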
Certain optimizations amount to good programming style and so should be learned as you learn the language. An example would be moving the calculation of values that don't change within a loop, outside of the loop.
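
A minimal sketch of that loop-invariant example, using hypothetical function names: the norm does not depend on the loop variable, so it can be computed once before the loop.

```python
import math


def normalize_slow(values):
    """Recomputes the loop-invariant norm on every iteration."""
    result = []
    for v in values:
        norm = math.sqrt(sum(x * x for x in values))  # does not depend on v
        result.append(v / norm)
    return result


def normalize_fast(values):
    """Computes the norm once, outside the loop."""
    norm = math.sqrt(sum(x * x for x in values))
    result = []
    for v in values:
        result.append(v / norm)
    return result


values = [3.0, 4.0]
print(normalize_fast(values))  # [0.6, 0.8]
```

Both functions return the same list; the first does quadratic work, the second linear.
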
### Choose the Right Data Structure
### Sorting
Sorting lists of basic Python objects is generally pretty efficient. In Python 2, the sort method for lists takes an optional comparison function as an argument that can be used to change the sorting behavior. This is convenient, but it can significantly slow down your sorts, because the comparison function is called many times. From Python 2.4 onward, use the key argument to the built-in sort instead, which should be the fastest way to sort. Python 3 removed the comparison-function argument, so key is the only option there.
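
A short sketch of sorting with the key argument follows. The records list is hypothetical sample data; operator.itemgetter is one common way to build the key function.

```python
from operator import itemgetter

# Hypothetical records: (id, score, name).
records = [(1, 2, "def"), (3, 6, "abc"), (2, -4, "ghi")]

# key is called once per element, not once per comparison.
by_name = sorted(records, key=itemgetter(2))
by_score_desc = sorted(records, key=itemgetter(1), reverse=True)

# list.sort takes the same arguments and sorts in place.
records.sort(key=lambda rec: rec[1])

print(by_name)
print(by_score_desc)
print(records)
```

sorted returns a new list and leaves its input untouched, while list.sort reorders the list it is called on.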

Only if you are using older versions of Python (before 2.4) does the following advice from Guido van Rossum apply:

An alternative way to speed up sorts is to construct a list of tuples whose first element is a sort key that will sort properly using the default comparison, and whose second element is the original list element. This is the so-called Schwartzian Transform, also known as DecorateSortUndecorate (DSU).

In [22]:
def sortBy(somelist,n):
    nlist = [(x[n],x) for x in somelist]
    nlist.sort()
    return [val for (key,val) in nlist]

In [23]:
def sortby_inplace(somelist, n):
    somelist[:] = [(x[n], x) for x in somelist]
    somelist.sort()
    somelist[:] = [val for (key, val) in somelist]
    return

In [27]:
somelist = [(1, 2, 'def'), (3, 6, 'abc'), (2, -4, 'ghi')]

In [28]:
somelist.sort()
somelist

[(1, 2, 'def'), (2, -4, 'ghi'), (3, 6, 'abc')]

In [41]:

e1 = cv2.getTickCount()
nlist = sortBy(somelist, 2)
print(nlist)
e2 = cv2.getTickCount()
elapsed = (e2 - e1) / cv2.getTickFrequency()  # named elapsed so it does not shadow the time module
print(elapsed)

[(3, 6, 'abc'), (1, 2, 'def'), (2, -4, 'ghi')]
0.000143997


In [42]:
e1 = cv2.getTickCount()
sortby_inplace(somelist, 2)  # sorts somelist in place and returns None
print(somelist)
e2 = cv2.getTickCount()
elapsed = (e2 - e1) / cv2.getTickFrequency()
print(elapsed)

[(3, 6, 'abc'), (1, 2, 'def'), (2, -4, 'ghi')]
0.000146918


In [43]:
somelist == nlist

True

Strings in Python are immutable, so you cannot say something like "change all the 'a's to 'b's" in any given string. Instead, you have to create a new string with the desired properties. This continual copying can lead to significant inefficiencies in Python programs.

Avoid building a string like this:


In [45]:
s = ""
list1 = "saicharan"
for substring in list1:
    s += substring
print(s)

saicharan


In [57]:
# prefer str.join, which builds the result in a single pass
s = "".join(list1)
print(s)

saicharan


In [72]:
s = ""
for x in list1:
    s += str(x)
s

'saicharan'

In [77]:
# prefer converting the pieces first, then joining once
slist = [str(elt) for elt in list1]
s = "".join(slist)
s

'saicharan'
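
To see how the two approaches compare on a given machine, a rough timing sketch is shown below. The pieces list and function names are hypothetical. CPython can sometimes extend a string in place for +=, so the measured gap may be smaller than the copying argument suggests.

```python
import timeit

# Hypothetical input: many short pieces to glue together.
pieces = ["saicharan"] * 10_000


def concat_loop(parts):
    s = ""
    for part in parts:
        s += part  # may copy the growing string on each iteration
    return s


def concat_join(parts):
    return "".join(parts)  # sizes the result once, then copies each part once


same_result = concat_loop(pieces) == concat_join(pieces)

t_loop = timeit.timeit(lambda: concat_loop(pieces), number=20)
t_join = timeit.timeit(lambda: concat_join(pieces), number=20)
print(f"+= loop: {t_loop:.4f} s, join: {t_join:.4f} s")
```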

If the body of your loop is simple, the interpreter overhead of the for loop itself can be a substantial part of the total cost. This is where the map function is handy. You can think of map as a for loop moved into C code. The only restriction is that the "loop body" of map must be a function call. Besides their syntactic benefit, list comprehensions are often as fast as or faster than the equivalent use of map.

Here's a straightforward example. Instead of looping over a list of words and converting them to upper case:

In [81]:
newlist = []
for word in list1:
    newlist.append(word.upper())
print("".join(newlist))

SAICHARAN


you can use map to push the loop from the interpreter into compiled C code:

In [84]:
newlist = map(str.upper, list1)
print("".join(newlist))

SAICHARAN


List comprehensions were added to Python in version 2.0 as well. They provide a syntactically more compact and more efficient way of writing the above for loop:

In [88]:
newlist = [s.upper() for s in list1]
print("".join(newlist))

SAICHARAN
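
The three versions can be timed side by side with a sketch like the following. The words list and function names are hypothetical, and the relative ordering of the results can vary between Python versions and machines.

```python
import timeit

# Hypothetical input: a list of lowercase words.
words = ["alpha", "beta", "gamma", "delta"] * 1_000


def upper_loop(items):
    out = []
    for w in items:
        out.append(w.upper())
    return out


def upper_map(items):
    return list(map(str.upper, items))  # map is lazy in Python 3, so materialize it


def upper_comprehension(items):
    return [w.upper() for w in items]


for func in (upper_loop, upper_map, upper_comprehension):
    seconds = timeit.timeit(lambda: func(words), number=200)
    print(f"{func.__name__}: {seconds:.4f} s")
```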


In [89]:
# the comprehension is more compact than the explicit loop; use %timeit to compare execution times

Generator expressions were added to Python in version 2.4. They function more-or-less like list comprehensions or map but avoid the overhead of generating the entire list at once. Instead, they return a generator object which can be iterated over bit-by-bit:

In [90]:
iterator = (s.upper() for s in list1)
print("".join(iterator))  # consume the generator, not the earlier newlist

SAICHARAN
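
The difference between building a whole list and iterating bit by bit can be made visible with a small sketch. The size n is a hypothetical value chosen only for illustration.

```python
import sys

# Hypothetical input size, chosen only to make the difference visible.
n = 100_000

squares_list = [i * i for i in range(n)]   # builds all n results up front
squares_gen = (i * i for i in range(n))    # builds nothing until iterated

# The generator object stays small no matter how large n is.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

total = sum(squares_gen)        # consumes the generator one item at a time
leftover = list(squares_gen)    # a generator can only be iterated once
print(total == sum(squares_list), leftover)
```

Because a generator is exhausted after one pass, keep a list instead when the results are needed more than once.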
