Timing and profiling code

In this chapter, you will learn how to gather and compare runtimes between different coding 
approaches. You'll practice using the line_profiler and memory_profiler packages to profile 
your code base and spot bottlenecks. Then, you'll put what you've learned into practice by
replacing these bottlenecks with efficient Python code.

1) Examining runtime

1.1) Using %timeit: your turn!

You'd like to create a list of integers from 0 to 50 using the range() function. However, you
are unsure whether a list comprehension or unpacking the range object into a list is faster.
Let's use %timeit to find the faster implementation.

For your convenience, a reference table of time orders of magnitude is provided below (faster at 
the top).


| symbol      | name        | unit (s)    |
| ----------- | ----------- | ----------- |
| ns          | nanosecond  | 10^-9       |
| µs (us)     | microsecond | 10^-6       |
| ms          | millisecond | 10^-3       |
| s           | second      | 10^0        |

In [1]:
# Create a list of integers (0-50) using list comprehension
nums_list_comp = [num for num in range(51)]
print(nums_list_comp)

# Create a list of integers (0-50) by unpacking range
nums_unpack = [*range(51)]
print(nums_unpack)

# use timeit with the list of integers (0-50) using list comprehension
%timeit nums_list_comp = [num for num in range(51)]

# use timeit with the list of integers (0-50) by unpacking range
%timeit nums_unpack = [*range(51)]

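# OR, outside IPython, use the timeit module. Note that timeit() returns the total
# time in seconds for all executions (number=1000000 by default), not a per-loop time.
# from timeit import timeit
# time_exe1 = timeit("""nums_list_comp = [num for num in range(51)]""")
# time_exe2 = timeit("""nums_unpack = [*range(51)]""")
# print(time_exe1, time_exe2)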

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
2.43 µs ± 22.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.2 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
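
Outside IPython, the same comparison can be approximated with the standard-library timeit module. The sketch below is one possible setup, not the course's solution: the helper name per_loop_seconds and the loop count are assumptions chosen for illustration. It reports the best run rather than the mean ± std. dev. that %timeit prints.

```python
from timeit import repeat

N_LOOPS = 20_000  # arbitrary loop count chosen for this sketch
N_RUNS = 7        # mirrors the 7 runs reported by %timeit above


def per_loop_seconds(stmt):
    """Return the best per-loop time in seconds over N_RUNS runs."""
    # repeat() returns one total time (in seconds) per run of N_LOOPS executions
    totals = repeat(stmt, repeat=N_RUNS, number=N_LOOPS)
    return min(totals) / N_LOOPS


comp_time = per_loop_seconds("[num for num in range(51)]")
unpack_time = per_loop_seconds("[*range(51)]")

print(f"list comprehension: {comp_time * 1e6:.3f} µs per loop")
print(f"unpacking range:    {unpack_time * 1e6:.3f} µs per loop")
```

Exact numbers depend on the machine and Python version, so treat the output as a relative comparison only.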


Question:
Use %timeit within your IPython console (i.e. not within the script.py window) to compare the 
runtimes for creating a list of integers from 0 to 50 using list comprehension vs. unpacking the 
range object. Don't include the print() statements when timing.

Which method was faster?

Possible Answers

- List comprehension was faster than unpacking range().

- Unpacking the range object was faster than list comprehension. (Answer: True)

- Both methods had the same runtime.

1.2) Using %timeit: specifying number of runs and loops

A list of 480 superheroes has been loaded into your session (called heroes). You'd like to analyze
the runtime for converting this heroes list into a set. Instead of relying on the default settings
for %timeit, you'd like to use only 5 runs and 25 loops per run.

What is the correct syntax for using %timeit with only 5 runs and 25 loops per run?

Possible Answers:

- timeit -runs5 -loops25 set(heroes)

- %%timeit -r5 -n25 set(heroes)

- %timeit set(heroes), 5, 25

- %timeit -r5 -n25 set(heroes) (Answer: True)
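
For comparison, the -r and -n flags correspond roughly to the repeat and number arguments of timeit.repeat() in the standard library. The sketch below assumes a made-up heroes list of 480 placeholder names, since the real dataset is not shown here.

```python
from timeit import repeat

# Hypothetical stand-in for the 480-name heroes list from the exercise
heroes = [f"hero_{i}" for i in range(480)]

# repeat=5 plays the role of -r5, number=25 the role of -n25
run_totals = repeat("set(heroes)", repeat=5, number=25, globals={"heroes": heroes})

# Each entry is the total time for 25 loops, so divide to get per-loop times
per_loop = [total / 25 for total in run_totals]
print(len(per_loop), "runs; best per-loop time:", min(per_loop), "seconds")
```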

1.3) Using %timeit: formal name or literal syntax

Python allows you to create data structures using either a formal name or a literal syntax. 
In this exercise, you'll explore how using a literal syntax for creating a data structure 
can speed up runtimes.

| data structure | formal name | literal syntax |
| -------------- | ----------- | -------------- |
| list           | list()      | []             |
| dictionary     | dict()      | {}             |
| tuple          | tuple()     | ()             |

In [2]:
# Create a list using the formal name
formal_list = list()
print(formal_list)

# Create a list using the literal syntax
literal_list = []
print(literal_list)

# Print out the type of formal_list
print(type(formal_list))

# Print out the type of literal_list
print(type(literal_list))

%timeit formal_list = list()
%timeit literal_list = []

[]
[]
<class 'list'>
<class 'list'>
213 ns ± 38.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
95.3 ns ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


Question:

Use %timeit in your IPython console to compare runtimes between creating a list using the formal name (list()) and the literal syntax ([]). Don't include the print() statements when timing.

Which naming convention is faster?

Possible Answers

- Using the formal name (list()) to create a list is faster.

- Using the literal syntax ([]) to create a list is faster. (Answer: True)

- Both naming conventions have the same runtime.
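
One way to see where the difference may come from is to inspect the bytecode. On CPython, list() has to look up the name list and then call it, while [] compiles to a dedicated BUILD_LIST instruction. The sketch below uses the standard-library dis module; the helper name opnames is an assumption for illustration, and the exact instruction names vary between Python versions.

```python
import dis


def opnames(expression):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expression, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]


formal_ops = opnames("list()")
literal_ops = opnames("[]")

# list() needs a name lookup plus a call; [] needs neither
print("list():", formal_ops)
print("[]:    ", literal_ops)
```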

1.4) Using cell magic mode (%%timeit)

From here on out, you'll be working with a superheroes dataset. For this exercise, a list of each hero's weight in kilograms (called wts) is loaded into your session. You'd like to convert these weights into pounds.
```python
import numpy as np

# You could accomplish this using the for loop below:
hero_wts_lbs = []
for wt in wts:
    hero_wts_lbs.append(wt * 2.20462)

# Or you could use a NumPy array to accomplish this task:
wts_np = np.array(wts)
hero_wts_lbs_np = wts_np * 2.20462
```
Use %%timeit in your IPython console to compare runtimes between these two approaches. Make sure to press SHIFT+ENTER after the magic command to add a new line before writing the code you wish to time. After you've finished coding, answer the following question:

Which of the above techniques is faster?

Possible Answers:

- The for loop technique was faster.

- The numpy technique was faster. (Answer: True)

- Both techniques had similar runtimes.
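
The same multi-line comparison can be sketched outside IPython by wrapping each approach in a function and passing it to timeit.timeit(). The wts values, function names, and loop count below are assumptions for illustration; the real superheroes data is not reproduced here, and the function-call overhead is included in both measurements.

```python
from timeit import timeit

import numpy as np

# Hypothetical weights in kilograms standing in for the exercise's wts list
wts = [95.0, 101.0, 74.0] * 160


def loop_convert():
    """Convert kilograms to pounds with a plain for loop."""
    hero_wts_lbs = []
    for wt in wts:
        hero_wts_lbs.append(wt * 2.20462)
    return hero_wts_lbs


def numpy_convert():
    """Convert kilograms to pounds with a NumPy array."""
    wts_np = np.array(wts)
    return wts_np * 2.20462


n_loops = 1_000  # arbitrary loop count for this sketch
loop_time = timeit(loop_convert, number=n_loops) / n_loops
numpy_time = timeit(numpy_convert, number=n_loops) / n_loops

print(f"for loop: {loop_time * 1e6:.2f} µs per loop")
print(f"numpy:    {numpy_time * 1e6:.2f} µs per loop")
```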

2) Code profiling for runtime

2.1) Pop quiz: steps for using %lprun

Below is the convert_units() function, which converts the heights and weights of our favorite superheroes from metric units to Imperial units.

```python
def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

```
Suppose you have a list of superheroes (named heroes) along with each hero's height (in centimeters) and weight (in kilograms) loaded as NumPy arrays (named hts and wts respectively).

In [2]:
import numpy as np
import line_profiler
heroes = ['Batman', 'Superman', 'Wonder Woman']
hts = np.array([188.0, 191.0, 183.0])
wts = np.array([ 95.0, 101.0, 74.0])

def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

%load_ext line_profiler
%lprun -f convert_units convert_units(heroes, hts, wts)

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler


What are the necessary steps you need to take in order to profile the convert_units() function acting on your superheroes data if you'd like to see line-by-line runtimes?

Possible Answers:

- Use %load_ext line_profiler to load the line_profiler within your IPython session.

- Use %lprun -f convert_units convert_units(heroes, hts, wts) to get line-by-line runtimes.

- Use %timeit convert_units(heroes, hts, wts) to gather runtimes.

- The first and second options from above are necessary. (Answer: True)
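
Outside IPython, where the %lprun magic is not available, the line_profiler package also exposes a LineProfiler class. The sketch below shows one way to use it on convert_units(); it uses plain lists instead of NumPy arrays to stay self-contained, and it falls back to a normal call if line_profiler is not installed (that fallback is an addition for this sketch, not part of the exercise).

```python
try:
    from line_profiler import LineProfiler  # third-party package
except ImportError:
    LineProfiler = None  # fall back to an unprofiled call below


def convert_units(heroes, heights, weights):
    new_hts = [ht * 0.39370 for ht in heights]
    new_wts = [wt * 2.20462 for wt in weights]
    hero_data = {}
    for i, hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])
    return hero_data


heroes = ["Batman", "Superman", "Wonder Woman"]
hts = [188.0, 191.0, 183.0]  # plain lists keep this sketch free of NumPy
wts = [95.0, 101.0, 74.0]

if LineProfiler is not None:
    profiler = LineProfiler()
    profiled = profiler(convert_units)   # wrap the function to be profiled
    result = profiled(heroes, hts, wts)  # run it under the profiler
    profiler.print_stats()               # print the line-by-line timing table
else:
    result = convert_units(heroes, hts, wts)
```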

2.2) Using %lprun: spot bottlenecks

Profiling a function allows you to dig deeper into the function's source code and potentially spot bottlenecks. When you see certain lines of code taking up the majority of the function's runtime, it is an indication that you may want to deploy a different, more efficient technique.

Let's dig deeper into the convert_units() function.

```python
def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

```
Load the line_profiler package into your IPython session. Then, use %lprun to profile the convert_units() function acting on your superheroes data. Remember to use the special syntax for working with %lprun (you'll have to provide a -f flag specifying the function you'd like to profile).

The convert_units() function, heroes list, hts array, and wts array have been loaded into your session. After you've finished coding, answer the following question:

What percentage of time is spent on the new_hts list comprehension line of code relative to the total amount of time spent in the convert_units() function?

Possible Answers:

- 0% - 10%
- 11% - 20% (Answer: True)
- 21% - 50%
- 51% - 100%
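
The % Time column in the %lprun output is each line's recorded time divided by the total time recorded in the function. The sketch below imitates that calculation by hand with time.perf_counter() on three stages of convert_units(). It is a rough illustration only: the data is made up, a single run is noisy, and line_profiler measures this far more carefully.

```python
from time import perf_counter

# Hypothetical data standing in for the superheroes dataset
heroes = [f"hero_{i}" for i in range(480)]
hts = [150.0 + (i % 60) for i in range(480)]  # heights in centimeters
wts = [50.0 + (i % 80) for i in range(480)]   # weights in kilograms

timings = {}

start = perf_counter()
new_hts = [ht * 0.39370 for ht in hts]
timings["new_hts"] = perf_counter() - start

start = perf_counter()
new_wts = [wt * 2.20462 for wt in wts]
timings["new_wts"] = perf_counter() - start

start = perf_counter()
hero_data = {}
for i, hero in enumerate(heroes):
    hero_data[hero] = (new_hts[i], new_wts[i])
timings["hero_data loop"] = perf_counter() - start

# Share of the total recorded time spent in each stage
total = sum(timings.values())
shares = {name: 100 * elapsed / total for name, elapsed in timings.items()}
for name, share in shares.items():
    print(f"{name:>15}: {share:5.1f}% of total")
```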

In [4]:
def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

%load_ext line_profiler
%lprun -f convert_units convert_units(heroes, hts, wts)

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler


2.3) Using %lprun: fix the bottleneck

In the previous exercise, you profiled the convert_units() function and saw that the new_hts list comprehension could be a potential bottleneck. Did you notice that the new_wts list comprehension also accounted for a similar percentage of the runtime? This is an indication that you may want to create the new_hts and new_wts objects using a different technique.

Since the heights and weights of the heroes are stored in NumPy arrays, you can use array broadcasting rather than list comprehensions to convert them. This has been implemented in the function below:

```python
def convert_units_broadcast(heroes, heights, weights):

    # Array broadcasting instead of list comprehension
    new_hts = heights * 0.39370
    new_wts = weights * 2.20462

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data
```
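
As a quick sanity check, the sketch below confirms that broadcasting yields the same values as the list comprehensions it replaces, using the three-hero sample arrays from the earlier cell. The variable names ending in _list and _broadcast are introduced here for illustration only.

```python
import numpy as np

hts = np.array([188.0, 191.0, 183.0])  # heights in centimeters
wts = np.array([95.0, 101.0, 74.0])    # weights in kilograms

# List comprehension: one Python-level multiplication per element
hts_list = [ht * 0.39370 for ht in hts]
wts_list = [wt * 2.20462 for wt in wts]

# Broadcasting: the scalar is applied to the whole array in one vectorized step
hts_broadcast = hts * 0.39370
wts_broadcast = wts * 2.20462

# The container types differ, but the converted values agree
print(type(hts_list).__name__, type(hts_broadcast).__name__)
print(np.allclose(hts_list, hts_broadcast), np.allclose(wts_list, wts_broadcast))
```

Because both versions return the same numbers, any difference seen in the %lprun output comes from how the work is done, not from what is computed.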

Load the line_profiler package into your IPython session. Then, use %lprun to profile the convert_units_broadcast() function acting on your superheroes data. The convert_units_broadcast() function, heroes list, hts array, and wts array have been loaded into your session. After you've finished coding, answer the following question:

What percentage of time is spent on the new_hts array broadcasting line of code relative to the total amount of time spent in the convert_units_broadcast() function?

In [5]:
def convert_units_broadcast(heroes, heights, weights):

    # Array broadcasting instead of list comprehension
    new_hts = heights * 0.39370
    new_wts = weights * 2.20462

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

%load_ext line_profiler
%lprun -f convert_units_broadcast convert_units_broadcast(heroes, hts, wts)

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler
