# Beginner Python and Math for Data Science
## Lecture 20
### Profiling

__Purpose:__
The purpose of this lecture is to understand how to use profiling in Python. 

__At the end of this lecture you will be able to:__
1. Understand how to use profiling to determine the duration of a function

## 1.1 Profiling Programs:

### 1.1.1 What is Profiling? 

__Overview:__
- __[Profiling](https://en.wikipedia.org/wiki/Profiling_(computer_programming)):__ Profiling is the practice of measuring how much time or memory (space) a program uses while it runs
- Recall that in programming, there is almost always more than one way to do something, and some methods are more efficient (less complex, take less time, etc.) than others 
- For example:
> 1. If you are adding elements to a list and you know ahead of time how large the list has to be, you can pre-allocate the list and assign elements in place rather than "growing" it one element at a time, which can be less efficient
> 2. If you can replace an explicit loop with a "Pythonic" alternative (such as a built-in function or a comprehension), you should usually do so, since explicit loops tend to run more slowly in Python 
- Profiling allows you to:
> 1. Find what parts of your program are causing the entire program to slow down 
> 2. Evaluate multiple methods of programming a task so you can choose the most efficient way 
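
As a rough sketch of the pre-allocation example above, the snippet below builds the same list in two ways and times each with the standard `timeit` module. The function names `grow_list` and `fill_preallocated` are hypothetical and chosen only for illustration. Timings depend on your machine and Python version, so no expected numbers are shown and neither approach is assumed to win.

```python
import timeit

def grow_list(n):
    """Build [0, 1, ..., n-1] by appending one element at a time."""
    result = []
    for i in range(n):
        result.append(i)
    return result

def fill_preallocated(n):
    """Build the same list by creating n slots first, then assigning in place."""
    result = [None] * n
    for i in range(n):
        result[i] = i
    return result

n = 10_000
runs = 20

# Both approaches must produce identical output before a timing comparison is meaningful.
assert grow_list(n) == fill_preallocated(n)

# timeit.timeit returns the TOTAL time for `runs` calls, so divide to get a per-call figure.
grow_seconds = timeit.timeit(lambda: grow_list(n), number=runs)
prealloc_seconds = timeit.timeit(lambda: fill_preallocated(n), number=runs)

print(f"append:       {grow_seconds / runs:.6f} s per call")
print(f"preallocated: {prealloc_seconds / runs:.6f} s per call")
```

Measuring both candidates on your own data, rather than assuming which one is faster, is the point of profiling.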

__Helpful Points:__
1. You may not find yourself worrying much about the time complexity of a program yet, as you are just beginning, but it will become important soon enough 
2. As you become a more active programmer and spend more time on websites such as [www.stackoverflow.com](https://stackoverflow.com/), you will notice that many answers include an explanation of the time complexity of their solution, as well as the time complexity of the candidate solutions they also evaluated 

### 1.1.2 Profiling in Python 

__Overview:__
- We will focus primarily on measuring the time it takes to run a program and in order to do this, Python offers a few solutions: 
> 1. Using the __[`time`](https://docs.python.org/2/library/time.html)__ module: The `time()` function in this module lets you measure the __["wall time"/"wall-clock time"](https://en.wikipedia.org/wiki/Elapsed_real_time)__ of a program (the actual time it takes) by recording the clock before and after the code runs 
> 2. Using the __[`timeit`](https://docs.python.org/2/library/timeit.html)__ module: The `timeit()` function in this module times small code snippets by running them repeatedly
> 3. Using the __[`time`](http://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-time)__ Magic Command: This Magic Command is similar to the `time` module and will show you the execution time of a program in both __[CPU Time](https://en.wikipedia.org/wiki/CPU_time)__ and __Wall Time__. This command can be used in both:
>> a. __Line Mode__: Using the `%time` command in front of a statement, you can time a single-line statement<br>
>> b. __Cell Mode__: Using the `%%time` command on the first line of the cell, you can time the entire cell body. Note, you can't have anything above this statement (not even comments). A bare `%time` on its own line does not time the rest of the cell 
> 4. Using the __[`timeit`](http://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit)__ Magic Command: This Magic Command is similar to the `timeit` module and will show you more information on the execution time of a program. This command can be used in both: 
>> a. __Line Mode__: Using the `%timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o]` statement<br>
>> b. __Cell Mode__: Using the `%%timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o]` statement<br>
>> - In both cases, the parameters refer to the following: 
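>> > 1. `-n<N>` says to execute the given statement `<N>` times in a loop (if omitted, a suitable value for `N` is chosen automatically)
>> > 2. `-r<R>` says to repeat that loop `<R>` times; the output summarizes these runs (the mean and standard deviation in the examples below, while older IPython versions reported the best run) 
>> > 3. `-t` says to use `time.time` to measure the time 
>> > 4. `-c` says to use `time.clock` to measure the time (note that `time.clock` was removed from the standard library in Python 3.8, so the clock behind this flag depends on your IPython and Python versions) 
>> > 5. `-p<P>` says to use a precision of `<P>` digits to display the timing result 
>> > 6. `-q` says to be "quiet" (do not print result) 
>> > 7. `-o` returns a result that can be stored in a variable to inspect the result in more detail 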
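
Outside a notebook, the loop count and repeat count can be approximated with the standard library's `timeit.repeat` function. The sketch below is only an analogy to the magic command's flags, not a description of how IPython implements them. The statement being timed and the variable names are chosen for illustration.

```python
import timeit

n = 10_000

# number=4 plays the role of -n4: execute the statement 4 times per timing loop.
# repeat=5 plays the role of -r5: collect 5 separate loop timings.
loop_totals = timeit.repeat(
    stmt="sum(range(n))",
    globals={"n": n},  # make `n` visible to the timed statement
    number=4,
    repeat=5,
)

# Each entry is the total time, in seconds, for one loop of 4 executions.
per_execution = [total / 4 for total in loop_totals]

print(f"runs collected: {len(loop_totals)}")
print(f"best per-execution time: {min(per_execution):.3e} s")
```

`timeit.repeat` returns the raw list of loop totals, so you decide how to summarize them (minimum, mean, and so on).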

__Helpful Points:__
1. See examples of all four methods below

__Practice:__ Examples of Profiling in Python 

### Part 1 (Using the `time` module):

In [1]:
# import the module
import time

### Example 1.1 (Checking Time for a Nested Loop):

In [4]:
# enter this command before the block of code you want to calculate the time for
start_time = time.time() # this clocks the current time

# block of code
empty_list_i = []
empty_list_j = []
for i in range(10000):
    empty_list_i.append(i)
    for j in range(1000):
        empty_list_j.append(j)
        
# enter this command after the block of code you want to calculate the time for
stop_time = time.time() # this clocks the current time 

elapsed_time = stop_time - start_time
print("The program took {:.5f} seconds".format(elapsed_time))

The program took 1.66015 seconds
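
The start/stop pattern above can be wrapped in a small helper so it does not have to be retyped around every block. This is a minimal sketch: `time_call` and `nested_append` are hypothetical names, and it uses `time.perf_counter()`, a standard-library clock intended for measuring short durations, in place of `time.time()`. The loop sizes are reduced so the example runs quickly.

```python
import time

def time_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()            # clock reading before the call
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start  # clock reading after, minus before
    return result, elapsed

def nested_append(outer, inner):
    """Same shape as the nested loop above, with configurable sizes."""
    list_i = []
    list_j = []
    for i in range(outer):
        list_i.append(i)
        for j in range(inner):
            list_j.append(j)
    return list_i, list_j

(list_i, list_j), seconds = time_call(nested_append, 100, 100)
print("The program took {:.5f} seconds".format(seconds))
```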


### Part 2 (Using the `timeit` module):

In [3]:
# import the module
import timeit

### Example 2.1 (Checking Time for a Nested Loop):

In [4]:
# setup string
setup = """
empty_list_i = []
empty_list_j = []
"""

# statement string that you want to test 
statement = """
for i in range(10000):
    empty_list_i.append(i)
    for j in range(1000):
        empty_list_j.append(j)
"""

In [5]:
# returns the total time, in seconds, to execute the statement `number` times (10 here)
timeit.timeit(stmt = statement, setup = setup, number = 10)

8.4586077999993
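
Keep in mind that the value returned by `timeit.timeit` is the total for all runs, and that the setup code runs only once per call, so in the string-based version the lists keep growing from one run to the next. As an alternative sketch, `timeit.timeit` also accepts a callable. The hypothetical function `build_lists` below creates fresh lists on every run, and the total is divided by the number of runs to get an average. The loop sizes are reduced so the example finishes quickly.

```python
import timeit

def build_lists():
    """Create fresh lists on every call so each run does the same amount of work."""
    list_i = []
    list_j = []
    for i in range(100):
        list_i.append(i)
        for j in range(100):
            list_j.append(j)
    return list_i, list_j

runs = 10
total_seconds = timeit.timeit(build_lists, number=runs)

print(f"total:   {total_seconds:.5f} s for {runs} runs")
print(f"average: {total_seconds / runs:.5f} s per run")
```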

### Part 3 (Using the `time` Magic Command):

### Example 3.1 (Checking Time using `%time` for a Single Statement):

In [8]:
%time 2**300

CPU times: user 8 µs, sys: 1 µs, total: 9 µs
Wall time: 22.9 µs


2037035976334486086268445688409378161051468393665936250636140449354381299763336706183397376

In [10]:
n = 10000
%time sum(range(n))

CPU times: user 162 µs, sys: 1 µs, total: 163 µs
Wall time: 166 µs


49995000

### Example 3.2 (Checking Time using `%time` for Multiple Statements):

In [11]:
n = 10000
%time 2**300; sum(range(n))

CPU times: user 263 µs, sys: 1 µs, total: 264 µs
Wall time: 268 µs


### Example 3.3 (Checking Time using `%time` for Cell Body):

In [7]:
# a bare %time on its own line times an empty statement, NOT the cell body (note the microsecond result below)
%time

empty_list_i = []
empty_list_j = []

for i in range(10000):
    empty_list_i.append(i)
    for j in range(1000):
        empty_list_j.append(j)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 17.9 µs


### Example 3.4 (Checking Time using `%%time` for Cell Body):

In [8]:
%%time

empty_list_i = []
empty_list_j = []

for i in range(10000):
    empty_list_i.append(i)
    for j in range(1000):
        empty_list_j.append(j)

CPU times: user 1.14 s, sys: 136 ms, total: 1.28 s
Wall time: 1.3 s


### Part 4 (Using the `timeit` Magic Command):

### Example 4.1 (Checking Time using `%timeit` for Single Statements):

In [14]:
%timeit -n2 -r4 -t -p4 (2**300)

1.46 µs ± 660.4 ns per loop (mean ± std. dev. of 4 runs, 2 loops each)


This is translated as:
> 1. Execute the statement 2 times per loop (`-n2`, shown as "`2 loops each`")
> 2. Repeat that loop 4 times and report the mean and standard deviation across the runs (`-r4`, shown as "`of 4 runs`")
> 3. Use the `time.time` measure of time (`-t`)
> 4. Use a precision of 4 digits to display the timing result (`-p4`, shown as "`1.46 µs ± 660.4 ns`")

In [15]:
n = 10000
%timeit -n4 -r5 -c -p5 sum(range(n))

394.25 µs ± 182.46 µs per loop (mean ± std. dev. of 5 runs, 4 loops each)


This is translated as:
> 1. Execute the statement 4 times per loop (`-n4`, shown as "`4 loops each`")
> 2. Repeat that loop 5 times and report the mean and standard deviation across the runs (`-r5`, shown as "`of 5 runs`")
> 3. Use the `time.clock` measure of time (`-c`)
> 4. Use a precision of 5 digits to display the timing result (`-p5`, shown as "`394.25 µs ± 182.46 µs`")

### Example 4.2 (Checking Time using `%%timeit`):

In [16]:
%%timeit -o

empty_list_i = []
empty_list_j = []

for i in range(10000):
    empty_list_i.append(i)
    for j in range(1000):
        empty_list_j.append(j)

1.04 s ± 48.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


<TimeitResult : 1.04 s ± 48.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)>

In [18]:
# the most recent output is available in the _ variable; store it in res
res = _
print(res)

1.04 s ± 48.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Problem 1:

Create a list that has all the elements from 0 to 9,999 using two different methods.  Use profiling to determine how long it takes to create each list.

In [None]:
### Your code here




# ANSWERS

### Problem 1:

Create a list that has all the elements from 0 to 9,999 using two different methods.  Use profiling to determine how long it takes to create each list.

In [33]:
%%timeit -n10
my_list = list(range(10000))

215 µs ± 32.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [34]:
%%timeit -n10
my_list = []
for i in range(10000):
    my_list.append(i)

1.17 ms ± 238 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


We observe that the first method is roughly five times faster than the second in this run (215 µs vs. 1.17 ms per loop), although both produce the same list.