## Profiling in Python
Python provides many excellent modules to measure the statistics of a program. This makes us know where the program is spending too much time and what to do inorder to optimize it. It is better to optimize the code inorder to increase the efficiency of a program. So, perform some standard tests to ensure optimization and we can improve the program inorder to increase the efficiency.

* Python includes a profiler called cProfile. It not only gives the total running time, but also times each function separately, and tells how many times each function was called, making it easy to determine where should make optimizations.


In [1]:
import time
start = time.time()
sq=[]
for x in range(10000):
    sq.append(x**2)
print("Time Consumed: % s seconds" % (time.time() - start))

Time Consumed: 0.02500152587890625 seconds


In [2]:
%%time
sq=[]
for x in range(10000):
    sq.append(x**2)

Wall time: 18 ms


In [3]:
%%timeit -n 1000
sq=[]
for x in range(1000):
    sq.append(x**2)

1.02 ms ± 92.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [4]:
#!pip install line_profiler

In [5]:
from line_profiler import LineProfiler
sq=[]
def fun1(n):
    for x in range(n):
        sq.append(x**2)
        
profile = LineProfiler(fun1(10000))
profile.print_stats()

Timer unit: 6.03742e-07 s



  profile = LineProfiler(fun1(10000))


In [6]:
import cProfile

def fun1(n):
    for x in range(n):
        sq.append(x**2)

num=10000
cProfile.run('fun1(num)')

         10004 function calls in 0.019 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.015    0.015    0.019    0.019 <ipython-input-6-57f7059af460>:3(fun1)
        1    0.000    0.000    0.019    0.019 <string>:1(<module>)
        1    0.000    0.000    0.019    0.019 {built-in method builtins.exec}
    10000    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




In [7]:
def msort(x):
    result = []
    if len(x) < 2:
        return x
    mid = int(len(x)/2)
    y = msort(x[:mid])
    z = msort(x[mid:])
    while (len(y) > 0) or (len(z) > 0):
        if len(y) > 0 and len(z) > 0:
            if y[0] > z[0]:
                result.append(z[0])
                z.pop(0)
            else:
                result.append(y[0])
                y.pop(0)
        elif len(z) > 0:
            for i in z:
                result.append(i)
                z.pop(0)
        else:
            for i in y:
                result.append(i)
                y.pop(0)
    return result

In [8]:
import random
import timeit
start_time = timeit.default_timer()
random.seed(1)
list=random.sample(range(1, 10001), 10000)
msort(list)
elapsed = timeit.default_timer() - start_time
elapsed

0.3023647959999991

In [9]:
%%time
import random
random.seed(1)
list=random.sample(range(1, 10001), 10000)
msort(list)
print('Time for sorting is:')

Time for sorting is:
Wall time: 290 ms


In [10]:
import cProfile

random.seed(1)
list=random.sample(range(1, 10001), 10000)

cProfile.run('msort(list)')

         748061 function calls (728063 primitive calls) in 0.940 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  19999/1    0.604    0.000    0.940    0.940 <ipython-input-7-f241ac15eadc>:1(msort)
        1    0.000    0.000    0.940    0.940 <string>:1(<module>)
        1    0.000    0.000    0.940    0.940 {built-in method builtins.exec}
   460827    0.152    0.000    0.152    0.000 {built-in method builtins.len}
   133616    0.058    0.000    0.058    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   133616    0.126    0.000    0.126    0.000 {method 'pop' of 'list' objects}




In [11]:
%%timeit -n 10
random.seed(1)
list=random.sample(range(1, 10001), 10000)
msort(list)

264 ms ± 24.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [12]:
# importing the required modules
import timeit

# binary search function
def binary_search(mylist, find):
	while len(mylist) > 0:
		mid = (len(mylist))//2
		if mylist[mid] == find:
			return True
		elif mylist[mid] < find:
			mylist = mylist[:mid]
		else:
			mylist = mylist[mid + 1:]
	return False


# linear search function
def linear_search(mylist, find):
	for x in mylist:
		if x == find:
			return True
	return False


# compute binary search time
def binary_time():
	SETUP_CODE = '''
from __main__ import binary_search
from random import randint'''

	TEST_CODE = '''
mylist = [x for x in range(10000)]
find = randint(0, len(mylist))
binary_search(mylist, find)'''
	
	# timeit.repeat statement
	times = timeit.repeat(setup = SETUP_CODE,
						stmt = TEST_CODE,
						repeat = 3,
						number = 10000)

	# printing minimum exec. time
	print('Binary search time: {}'.format(min(times)))	


# compute linear search time
def linear_time():
	SETUP_CODE = '''
from __main__ import linear_search
from random import randint'''
	
	TEST_CODE = '''
mylist = [x for x in range(10000)]
find = randint(0, len(mylist))
linear_search(mylist, find)
	'''
	# timeit.repeat statement
	times = timeit.repeat(setup = SETUP_CODE,
						stmt = TEST_CODE,
						repeat = 3,
						number = 10000)

	# printing minimum exec. time
	print('Linear search time: {}'.format(min(times)))

if __name__ == "__main__":
	linear_time()
	binary_time()


Linear search time: 5.183058761000005
Binary search time: 4.216621375999992


---

***For Python file with Extension .py : ***

* calling from interpreter or cmd: <br>
import cProfile <br>
cProfile.run('func()')

* calling cProfile when running a script:<br>
python -m cProfile myscript.py

* can made a little batch file called 'profile.bat':<br>
python -m cProfile %1

* then run only: <br>
profile euler048.py

---

### Visualize Profiling Results
The profiling is an important form of analysis which can be used to analyze time or space complexity of code. The Python has many profiling libraries like `cProfile`, `profile`, `line_profiler`, etc to analyze time complexity and `memory_profiler`, `memprof`, `guppy/hpy`, etc to analyze space complexity. The results generated by profiling libraries like `cProfile` generally log files with many lines each explaining the usage time of various function calls. If the function is very deep and has many lines of code than analyzing such log files can be a very tedious task.

Data visualization is a process where we can represent a lot of data and the human eye can easily catch patterns as well as understand data better. The Python has a library called **`snakeviz`** which can take profiling files generated by `cProfile` and generate visualization out of it.

In [13]:
#!pip install snakeviz

In [14]:
%load_ext snakeviz

In [15]:
import time
import random

def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)

%snakeviz main_func()

50.4319
50.52205
50.42586
 
*** Profile stats marshalled to file 'C:\\Users\\Ram\\AppData\\Local\\Temp\\tmp_138grn_'. 
Embedding SnakeViz in this document...


In [16]:
%%snakeviz

import time
import random

def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)

def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)

main_func()

50.55572
50.60366
50.67423
 
*** Profile stats marshalled to file 'C:\\Users\\Ram\\AppData\\Local\\Temp\\tmpo6e4gobs'. 
Embedding SnakeViz in this document...


---