# Using %timeit: your turn!

You'd like to create a list of integers from 0 to 50 using the `range()` function. However, you are unsure whether a list comprehension or unpacking the range object into a list is faster. Let's use `%timeit` to find the faster implementation.

In [2]:
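# Create a list of integers (0-50) using list comprehension
# (names assigned inside %timeit are not kept, so build the list separately before printing)
nums_list_comp = [num for num in range(51)]
%timeit nums_list_comp = [num for num in range(51)]
print(nums_list_comp)

# Create a list of integers (0-50) by unpacking range
nums_unpack = [*range(51)]
%timeit nums_unpack = [*range(51)]
print(nums_unpack)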

3.41 µs ± 652 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
849 ns ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
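
Outside IPython, roughly the same comparison can be sketched with the standard-library `timeit` module. This is an illustrative equivalent rather than part of the exercise; the loop count of 100,000 is an arbitrary choice, and the timings it prints will differ from machine to machine.

```python
import timeit

# Both expressions build the same list of integers 0-50.
nums_list_comp = [num for num in range(51)]
nums_unpack = [*range(51)]
assert nums_list_comp == nums_unpack

# timeit.timeit returns the total seconds taken by `number` executions.
n_loops = 100_000  # arbitrary loop count chosen for this sketch
t_comp = timeit.timeit("[num for num in range(51)]", number=n_loops)
t_unpack = timeit.timeit("[*range(51)]", number=n_loops)

# Convert the totals to nanoseconds per loop for easier reading.
print(f"list comprehension: {t_comp / n_loops * 1e9:.0f} ns per loop")
print(f"unpacking range:    {t_unpack / n_loops * 1e9:.0f} ns per loop")
```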


# Using %timeit: specifying number of runs and loops

What is the correct syntax for running `%timeit` with only 5 runs and 25 loops per run?

In [4]:
# %timeit -r5 -n25 set(heroes)
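
The `-r` and `-n` flags correspond to the `repeat` and `number` arguments of the standard-library `timeit.repeat()` function. The sketch below assumes a small, made-up `heroes` list, since the exercise's data is not shown here.

```python
import timeit

# Hypothetical sample data; the exercise's real `heroes` list is not shown.
heroes = ["Batman", "Superman", "Wonder Woman", "Batman"]

# repeat=5 plays the role of -r5 and number=25 the role of -n25.
run_totals = timeit.repeat(
    "set(heroes)",
    globals={"heroes": heroes},
    repeat=5,
    number=25,
)

# Each entry is the total time for one run of 25 loops.
per_loop = [total / 25 for total in run_totals]
print(len(per_loop), "runs; best per-loop time:", min(per_loop), "s")
```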

# Using %timeit: formal name or literal syntax

Python allows you to create data structures using either a formal name or a literal syntax. In this exercise, you'll explore how using a literal syntax for creating a data structure can speed up runtimes.

In [8]:
# Create a list using the formal name
%timeit formal_list = list()
# print(formal_list)

# Create a list using the literal syntax
%timeit literal_list = []
# print(literal_list)

# Print out the type of formal_list
# print(type(formal_list))

# Print out the type of literal_list
# print(type(literal_list))

231 ns ± 64.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
43.6 ns ± 4.67 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


Using the literal syntax (`[]`) to create a list is faster: 43.6 ns versus 231 ns per loop in the run above.
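
One way to see where the difference likely comes from is to inspect the bytecode with the standard-library `dis` module. In this sketch, `list()` has to look up the name `list` and then call it, while `[]` compiles to a single list-building instruction. The exact opcode names vary between Python versions, so treat the printed output as indicative only.

```python
import dis

# Compile each expression and collect its bytecode instructions.
formal_ops = list(dis.get_instructions(compile("list()", "<formal>", "eval")))
literal_ops = list(dis.get_instructions(compile("[]", "<literal>", "eval")))

# The formal name needs a name lookup plus a call...
print([op.opname for op in formal_ops])
# ...while the literal builds the list directly.
print([op.opname for op in literal_ops])
```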

# Using cell magic mode (%%timeit)

Use `%%timeit` in your IPython console to compare the runtimes of two approaches to converting hero weights from kilograms to pounds: a `for` loop and a NumPy array operation.

In [14]:
%%timeit

import numpy as np
wts= [1,2,3,4,5,6]
hero_wts_lbs = []
for wt in wts:
    hero_wts_lbs.append(wt * 2.20462)


1.52 µs ± 171 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [13]:
%%timeit
wts_np = np.array(wts)
hero_wts_lbs_np = wts_np * 2.20462

4.39 µs ± 567 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


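The NumPy technique is the expected answer, and it should be faster on larger inputs. In the run shown above, however, the loop was actually faster (1.52 µs versus 4.39 µs): with only six weights, the overhead of converting the list with `np.array()` likely outweighs the gain from the vectorized multiplication.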

# Pop quiz: steps for using %lprun

What are the necessary steps you need to take in order to profile the `convert_units()` function acting on your superheroes data if you'd like to see line-by-line runtimes?

In [15]:
def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

- Use `%load_ext line_profiler` to load the `line_profiler` within your IPython session.
- Use `%lprun -f convert_units convert_units(heroes, hts, wts)` to get line-by-line runtimes.

In [16]:
# %load_ext line_profiler
# %lprun -f convert_units convert_units(heroes, hts, wts)

# Using %lprun: spot bottlenecks

Profiling a function allows you to dig deeper into the function's source code and potentially spot bottlenecks. When you see certain lines of code taking up the majority of the function's runtime, it is an indication that you may want to deploy a different, more efficient technique.

Let's dig deeper into the `convert_units()` function.

In [18]:
heroes = ['Batman', 'Superman', 'Wonder Woman']
hts = np.array([188.0, 191.0, 183.0])
wts = np.array([ 95.0, 101.0, 74.0])
def convert_units(heroes, heights, weights):

    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

In [19]:
%load_ext line_profiler
%lprun -f convert_units convert_units(heroes, hts, wts)

Timer unit: 1e-07 s

Total time: 3.58e-05 s
File: C:\Users\88016\AppData\Local\Temp/ipykernel_16292/907188768.py
Function: convert_units at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
     4                                           def convert_units(heroes, heights, weights):
     5                                           
     6         1        209.0    209.0     58.4      new_hts = [ht * 0.39370  for ht in heights]
     7         1         60.0     60.0     16.8      new_wts = [wt * 2.20462  for wt in weights]
     8                                           
     9         1          8.0      8.0      2.2      hero_data = {}
    10                                           
    11         4         42.0     10.5     11.7      for i,hero in enumerate(heroes):
    12         3         33.0     11.0      9.2          hero_data[hero] = (new_hts[i], new_wts[i])
    13                                           
    14         1          6.0      6.0      1.7      return hero_data

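What percentage of time is spent on the `new_hts` list comprehension line of code relative to the total amount of time spent in the `convert_units()` function?
- 11% - 20% (the exercise's answer; in the three-hero run shown above, this line accounts for 58.4% of the runtime)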
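
The `% Time` column is each line's `Time` divided by the sum of all line times. As a quick check, the sketch below recomputes it from the `Time` values in the three-hero output above (timer unit 1e-07 s).

```python
# Time column values (in timer units of 1e-07 s), copied from the output above.
line_times = {6: 209.0, 7: 60.0, 9: 8.0, 11: 42.0, 12: 33.0, 14: 6.0}

# The sum matches the reported total: 358 units * 1e-07 s = 3.58e-05 s.
total = sum(line_times.values())

# Percentage of the total runtime spent on each line, rounded like the report.
pct = {line: round(100 * t / total, 1) for line, t in line_times.items()}

print(total)   # 358.0
print(pct[6])  # 58.4, the new_hts list comprehension
print(pct[7])  # 16.8, the new_wts list comprehension
```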

# Using %lprun: fix the bottleneck

In the previous exercise, you profiled the `convert_units()` function and saw that the `new_hts` list comprehension could be a potential bottleneck. Did you notice that the `new_wts` list comprehension also accounted for a similar percentage of the runtime? This is an indication that you may want to create the `new_hts` and `new_wts` objects using a different technique.

Since the heights and weights of the heroes are stored in `numpy` arrays, you can use array broadcasting rather than list comprehensions to convert them. This has been implemented in the function below:

In [20]:
def convert_units_broadcast(heroes, heights, weights):

    # Array broadcasting instead of list comprehension
    new_hts = heights * 0.39370
    new_wts = weights * 2.20462

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

In [22]:
%reload_ext line_profiler
%lprun -f convert_units convert_units(heroes, hts, wts)

Timer unit: 1e-07 s

Total time: 4.78e-05 s
File: C:\Users\88016\AppData\Local\Temp/ipykernel_16292/907188768.py
Function: convert_units at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
     4                                           def convert_units(heroes, heights, weights):
     5                                           
     6         1        210.0    210.0     43.9      new_hts = [ht * 0.39370  for ht in heights]
     7         1         84.0     84.0     17.6      new_wts = [wt * 2.20462  for wt in weights]
     8                                           
     9         1         17.0     17.0      3.6      hero_data = {}
    10                                           
    11         4         87.0     21.8     18.2      for i,hero in enumerate(heroes):
    12         3         67.0     22.3     14.0          hero_data[hero] = (new_hts[i], new_wts[i])
    13                                           
    14         1         13.0     13.0      2.7      return hero_data

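What percentage of time is spent on the `new_hts` array broadcasting line of code relative to the total amount of time spent in the `convert_units_broadcast()` function?
- 0% - 10%

Note that the cell above profiles `convert_units()` again, so its output still shows the `new_hts` list comprehension (43.9%) rather than the broadcasting line. To profile the broadcasting version, run `%lprun -f convert_units_broadcast convert_units_broadcast(heroes, hts, wts)`.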

# Pop quiz: steps for using %mprun

What are the necessary steps you need to take in order to profile the `convert_units()` function acting on your superheroes data if you'd like to see the line-by-line memory consumption of `convert_units()`?

- Use the command `from hero_funcs import convert_units` to load the function you'd like to profile.
- Use `%load_ext memory_profiler` to load the `memory_profiler` within your IPython session.
- Use `%mprun -f convert_units convert_units(heroes, hts, wts)` to get line-by-line memory allocations.
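
`%mprun` only works inside IPython with `memory_profiler` installed. As a rough stand-in for experimenting in plain Python, the standard-library `tracemalloc` module can report how much memory a call allocates. Note that this is a different tool: it reports totals for the whole call rather than a line-by-line table, and the plain lists below are hypothetical substitutes for the exercise's arrays.

```python
import tracemalloc


def convert_units(heroes, heights, weights):
    # Same conversion logic as the function profiled in this section.
    new_hts = [ht * 0.39370 for ht in heights]
    new_wts = [wt * 2.20462 for wt in weights]

    hero_data = {}
    for i, hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data


# Small sample inputs; plain lists stand in for the NumPy arrays (assumption).
heroes = ["Batman", "Superman", "Wonder Woman"]
hts = [188.0, 191.0, 183.0]
wts = [95.0, 101.0, 74.0]

# Trace allocations made while the function runs.
tracemalloc.start()
hero_data = convert_units(heroes, hts, wts)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"still allocated: {current} bytes, peak: {peak} bytes")
```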

# Using %mprun: Hero BMI

You'd like to calculate the body mass index (BMI) for a selected sample of heroes. A function named `calc_bmi_lists` has been created and saved to a file titled `bmi_lists.py`. How much memory do the list comprehension lines of code consume in the `calc_bmi_lists()` function? (i.e., what is the total sum of the Increment column for these four lines of code?)

In [28]:
from bmi_lists import calc_bmi_lists
%reload_ext memory_profiler

%mprun -f calc_bmi_lists calc_bmi_lists([1], hts, wts)




Filename: c:\Users\88016\Desktop\DataCamp Exercise\Python\Writing Efficient Python Code\bmi_lists.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     1     40.6 MiB     40.6 MiB           1   def calc_bmi_lists(sample_indices, hts, wts):
     2                                         
     3                                             # Gather sample heights and weights as lists
     4     40.6 MiB      0.0 MiB           4       s_hts = [hts[i] for i in sample_indices]
     5     40.6 MiB      0.0 MiB           4       s_wts = [wts[i] for i in sample_indices]
     6                                         
     7                                             # Convert heights from cm to m and square with list comprehension
     8     40.6 MiB      0.0 MiB           4       s_hts_m_sqr = [(ht / 100) ** 2 for ht in s_hts]
     9                                         
    10                                             # Calculate BMIs as a list with list comprehension
 

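- 0.1 MiB - 2.0 MiB (the exercise's answer; the truncated single-sample output above shows increments of 0.0 MiB on the three visible list comprehension lines)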

# Using %mprun: Hero BMI 2.0

Let's see if using a different approach to calculate the BMIs can save some memory. If you remember, the heroes' heights and weights are stored in NumPy arrays. That means you can use NumPy's handy array indexing capabilities and broadcasting to perform your calculations. A function named `calc_bmi_arrays` has been created and saved to a file titled `bmi_arrays.py`.

How much memory do the array indexing and broadcasting lines of code consume in the `calc_bmi_arrays()` function? (i.e., what is the total sum of the Increment column for these four lines of code?)

In [29]:
from bmi_arrays import calc_bmi_arrays
%reload_ext memory_profiler

%mprun -f calc_bmi_arrays calc_bmi_arrays([1], hts, wts)




Filename: c:\Users\88016\Desktop\DataCamp Exercise\Python\Writing Efficient Python Code\bmi_arrays.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     1     35.8 MiB     35.8 MiB           1   def calc_bmi_arrays(sample_indices, hts, wts):
     2                                         
     3                                             # Gather sample heights and weights as arrays
     4     35.9 MiB      0.2 MiB           1       s_hts = hts[sample_indices]
     5     35.9 MiB      0.0 MiB           1       s_wts = wts[sample_indices]
     6                                         
     7                                             # Convert heights from cm to m and square with broadcasting
     8     36.0 MiB      0.1 MiB           1       s_hts_m_sqr = (s_hts / 100) ** 2
     9                                         
    10                                             # Calculate BMIs as an array using broadcasting
    11     36.0 MiB      0.0 MiB           1    
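
For the run shown above, the question can be answered by adding up the Increment column for lines 4, 5, 8 and 11. A minimal sketch of that arithmetic follows; the values are copied from this output and, since `memory_profiler` reports rounded figures that change between runs, the result only describes this particular run.

```python
# Increment values (MiB) for lines 4, 5, 8 and 11, copied from the output above.
increments_mib = {4: 0.2, 5: 0.0, 8: 0.1, 11: 0.0}

# Round to one decimal place to match the precision of the report.
total_mib = round(sum(increments_mib.values()), 1)

print(total_mib)  # 0.3
```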

# Bringing it all together: Star Wars profiling

You'd like to filter the heroes list based on a hero's specific publisher, but are unsure which of the below functions is more efficient.

In [30]:
def get_publisher_heroes(heroes, publishers, desired_publisher):

    desired_heroes = []

    for i,pub in enumerate(publishers):
        if pub == desired_publisher:
            desired_heroes.append(heroes[i])

    return desired_heroes

In [31]:
def get_publisher_heroes_np(heroes, publishers, desired_publisher):

    heroes_np = np.array(heroes)
    pubs_np = np.array(publishers)

    desired_heroes = heroes_np[pubs_np == desired_publisher]

    return desired_heroes

In [32]:
# # Use get_publisher_heroes() to gather Star Wars heroes
# star_wars_heroes = get_publisher_heroes(heroes, publishers, 'George Lucas')

# print(star_wars_heroes)
# print(type(star_wars_heroes))

# # Use get_publisher_heroes_np() to gather Star Wars heroes
# star_wars_heroes_np = get_publisher_heroes_np(heroes, publishers, 'George Lucas')

# print(star_wars_heroes_np)
# print(type(star_wars_heroes_np))
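
Because the `publishers` data is not included here, the commented-out calls above cannot be run as shown. The sketch below uses made-up `heroes` and `publishers` lists (both hypothetical) to illustrate what `get_publisher_heroes()` returns, and compares it with an equivalent `zip`-based list comprehension, which is an alternative spelling and not one of the exercise's functions.

```python
def get_publisher_heroes(heroes, publishers, desired_publisher):
    # Loop version from this section: collect heroes whose publisher matches.
    desired_heroes = []

    for i, pub in enumerate(publishers):
        if pub == desired_publisher:
            desired_heroes.append(heroes[i])

    return desired_heroes


# Hypothetical sample data; the exercise's real lists are much longer.
heroes = ["Luke Skywalker", "Batman", "Yoda", "Spider-Man"]
publishers = ["George Lucas", "DC Comics", "George Lucas", "Marvel Comics"]

star_wars_heroes = get_publisher_heroes(heroes, publishers, "George Lucas")
print(star_wars_heroes)        # ['Luke Skywalker', 'Yoda']
print(type(star_wars_heroes))  # <class 'list'>

# Alternative sketch: pair each hero with its publisher and filter in one pass.
star_wars_zip = [h for h, p in zip(heroes, publishers) if p == "George Lucas"]
print(star_wars_zip == star_wars_heroes)  # True
```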

Load the `line_profiler` extension and use `%lprun` to profile the two functions for line-by-line runtime. When using `%lprun`, use each function to gather the Star Wars heroes as you did in the previous step.

- `get_publisher_heroes_np()` is faster.

The `get_publisher_heroes()` and `get_publisher_heroes_np()` functions have been saved within a file titled `hero_funcs.py` (i.e., you can import both functions from `hero_funcs`). When using `%mprun`, use each function to gather the Star Wars heroes as you did in the previous step.

- Both functions have the same memory consumption.

Based on your runtime profiling and memory allocation profiling, which function would you choose to gather Star Wars heroes?
- I would use `get_publisher_heroes_np()`, because although the memory usage is the same, it executes faster.