# Speedy Python

Some examples of how to measure speed and memory usage.

Comparisons include how to speed up or replace loops with built-in functions or NumPy arrays.

## Timeit
**Line Magic**

In [1]:
# default params
# note: this times only the creation of the lambda object, not the join itself
%timeit lambda: "-".join(map(str, range(10000)))

52.8 ns ± 5.29 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [2]:
# custom params
%timeit -r 10 -n 1000 lambda: "-".join(map(str, range(10000)))

40.5 ns ± 0.143 ns per loop (mean ± std. dev. of 10 runs, 1000 loops each)
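
Because the lambda in the cells above is created but never called, the nanosecond results say little about the join itself. A minimal sketch using the standard-library `timeit` module, which accepts a callable and actually calls it; the function name `join_numbers` and the repeat counts are illustrative choices, not part of the original notebook.

```python
import timeit

def join_numbers():
    # the expression under test: join the numbers 0..9999 with dashes
    return "-".join(map(str, range(10000)))

# timeit.repeat calls the function `number` times per run and
# returns the total time of each run in seconds
runs = timeit.repeat(join_numbers, repeat=5, number=100)
best_per_call = min(runs) / 100
print(f"best of 5 runs: {best_per_call * 1e6:.1f} µs per call")
```

In a notebook, the same effect is achieved by dropping `lambda:` so that `%timeit` receives the expression directly.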


**Cell magic**

Note: the magic command has to be on the first line of the cell, or it won't work.

In [3]:
%%timeit
total = 0
for i in range(100):
    for j in range(100):
        total += i * (-1) ** j

3.43 ms ± 96.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


## Example Setup

In [4]:
import random
import string
import numpy as np

# a function that generates a random alphabetical string (title-cased)
def randStr(chars = string.ascii_lowercase, N=10):
    return ''.join(random.choice(chars) for _ in range(N)).title()

def randNamegen(count=1000, length=10):
    nameList=[]
    for i in range(count):
        nameList.append(randStr(N=length))
    return nameList

nameList = randNamegen(count=100000)
print("List",nameList[:5])

nameList_np = np.array(nameList)
print("NumPy Array",nameList_np[:5])

List ['Axofhxrnjp', 'Cepvntlcjy', 'Upyxcbnexw', 'Wrwmyunqpf', 'Gciqvwewke']
NumPy Array ['Axofhxrnjp' 'Cepvntlcjy' 'Upyxcbnexw' 'Wrwmyunqpf' 'Gciqvwewke']


In [5]:
# a function that generates randomized people's heights in centimeters
def randHTgen(count=1000, lower=150, upper=300):
    HTList=[]
    for i in range(count):
        HTList.append(random.randint(lower, upper))
    return HTList

HTList = randHTgen(count=100000)
print("List",HTList[:5])

HTList_np = np.array(HTList)
print("NumPy Array",HTList_np[:5])

List [263, 209, 156, 218, 295]
NumPy Array [263 209 156 218 295]


In [6]:
# a function that generates randomized people's weights in kilograms
def randWTgen(count=1000, lower=50, upper=150):
    WTList=[]
    for i in range(count):
        WTList.append(random.randint(lower, upper))
    return WTList

WTList = randWTgen(count=100000)
print("List",WTList[:5])

WTList_np = np.array(WTList)
print("NumPy Array",WTList_np[:5])

List [66, 120, 102, 74, 71]
NumPy Array [ 66 120 102  74  71]


**The example function converts the heights from centimeters to inches and the weights from kilograms to pounds.**

In [7]:
# a function for measuring (list comprehension)
def convert_units_list(names, heights, weights):
    new_hts = [ht * 0.39370  for ht in heights]
    new_wts = [wt * 2.20462  for wt in weights]
    people_data = {}
    for i,name in enumerate(names):
        people_data[name] = (new_hts[i], new_wts[i])
    return people_data

In [8]:
# a function for measuring (NumPy array broadcasting)
def convert_units_array(names, heights, weights):
    new_hts = heights * 0.39370
    new_wts = weights * 2.20462
    people_data = {}
    for i,name in enumerate(names):
        people_data[name] = (new_hts[i], new_wts[i])
    return people_data

#convert_units_array(nameList_np, HTList_np, WTList_np)

## Measurement Tools
### Line Profiler

In [9]:
%load_ext line_profiler

%lprun -f convert_units_list convert_units_list(nameList, HTList, WTList)

Timer unit: 1e-07 s

Total time: 0.208038 s
File: <ipython-input-7-b143edde93f5>
Function: convert_units_list at line 2

Line #      Hits         Time  Per Hit   % Time  Line Contents
     2                                           def convert_units_list(names, heights, weights):
     3         1     178665.0 178665.0      8.6      new_hts = [ht * 0.39370  for ht in heights]
     4         1     182745.0 182745.0      8.8      new_wts = [wt * 2.20462  for wt in weights]
     5         1         18.0     18.0      0.0      people_data = {}
     6    100001     856760.0      8.6     41.2      for i,name in enumerate(names):
     7    100000     862181.0      8.6     41.4          people_data[name] = (new_hts[i], new_wts[i])
     8         1          8.0      8.0      0.0      return people_data

In [10]:
%lprun -f convert_units_array convert_units_array(nameList_np, HTList_np, WTList_np)

Timer unit: 1e-07 s

Total time: 0.168481 s
File: <ipython-input-8-5226aaf1be87>
Function: convert_units_array at line 2

Line #      Hits         Time  Per Hit   % Time  Line Contents
     2                                           def convert_units_array(names, heights, weights):
     3         1       6052.0   6052.0      0.4      new_hts = heights * 0.39370
     4         1       6017.0   6017.0      0.4      new_wts = weights * 2.20462
     5         1         17.0     17.0      0.0      people_data = {}
     6    100001     831242.0      8.3     49.3      for i,name in enumerate(names):
     7    100000     841478.0      8.4     49.9          people_data[name] = (new_hts[i], new_wts[i])
     8         1          6.0      6.0      0.0      return people_data

### Memory Profiler

In [11]:
# %mprun needs the function to be defined in a file, so it is imported from conv_list.py
from conv_list import convert_units_list

%load_ext memory_profiler

%mprun -f convert_units_list convert_units_list(nameList, HTList, WTList)




Filename: C:\Users\ChristianV700\Documents\GitHub\Python_coding\speedy_python\conv_list.py

Line #    Mem usage    Increment  Occurences   Line Contents
     1     77.2 MiB     77.2 MiB           1   def convert_units_list(names, heights, weights):
     2     81.3 MiB  -3342.2 MiB      100003       new_hts = [ht * 0.39370  for ht in heights]
     3     85.2 MiB      3.9 MiB      100003       new_wts = [wt * 2.20462  for wt in weights]
     4     85.2 MiB      0.0 MiB           1       people_data = {}
     5     96.9 MiB      6.1 MiB      100001       for i,name in enumerate(names):
     6     96.9 MiB      5.6 MiB      100000           people_data[name] = (new_hts[i], new_wts[i])
     7     96.9 MiB      0.0 MiB           1       return people_data

In [12]:
from conv_array import convert_units_array

%mprun -f convert_units_array convert_units_array(nameList_np, HTList_np, WTList_np)




Filename: C:\Users\ChristianV700\Documents\GitHub\Python_coding\speedy_python\conv_array.py

Line #    Mem usage    Increment  Occurences   Line Contents
     1     77.8 MiB     77.8 MiB           1   def convert_units_array(names, heights, weights):
     2     78.6 MiB      0.8 MiB           1       new_hts = heights * 0.39370
     3     79.4 MiB      0.8 MiB           1       new_wts = weights * 2.20462
     4     79.4 MiB      0.0 MiB           1       people_data = {}
     5    106.5 MiB     21.8 MiB      100001       for i,name in enumerate(names):
     6    106.5 MiB      5.3 MiB      100000           people_data[name] = (new_hts[i], new_wts[i])
     7    106.5 MiB      0.0 MiB           1       return people_data
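
`%mprun` reports process memory line by line and needs the profiled function to live in a file. As a lighter cross-check, the standard-library `tracemalloc` module can report the peak of Python-level allocations for a snippet. This is a different tool with different numbers (it only traces memory blocks allocated by Python), and the small data set below is an illustrative stand-in for `HTList`.

```python
import random
import tracemalloc

# illustrative data, smaller than the 100,000 items used above
heights = [random.randint(150, 300) for _ in range(10_000)]

tracemalloc.start()
new_hts = [ht * 0.39370 for ht in heights]
# both values are in bytes: memory traced right now and the peak since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
```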

## Combining items of two lists (loop vs zip)

Let's use the randomized names list and the randomized heights list.

In [13]:
combined = []
for i, name in enumerate(nameList):
    combined.append((name, HTList[i]))
print(combined[:5])

[('Axofhxrnjp', 263), ('Cepvntlcjy', 209), ('Upyxcbnexw', 156), ('Wrwmyunqpf', 218), ('Gciqvwewke', 295)]


In [14]:
%%timeit
combined = []
for i, name in enumerate(nameList):
    combined.append((name, HTList[i]))

15.7 ms ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [15]:
combined_zip = zip(nameList, HTList)
# unpack zip object to list
unpacked = [*combined_zip]
print(unpacked[:5])

[('Axofhxrnjp', 263), ('Cepvntlcjy', 209), ('Upyxcbnexw', 156), ('Wrwmyunqpf', 218), ('Gciqvwewke', 295)]


In [16]:
%%timeit
# note: zip() is lazy, so this times only the creation of the zip object
combined_zip = zip(nameList, HTList)

198 ns ± 8.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
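
Since `zip()` returns a lazy iterator, a like-for-like comparison should also build the list. A minimal sketch with the standard-library `timeit` module; `names` and `heights` are small stand-ins for `nameList` and `HTList`, and the function names are made up for illustration.

```python
import timeit

names = [f"name{i}" for i in range(10_000)]   # stand-in for nameList
heights = list(range(10_000))                 # stand-in for HTList

def with_loop():
    combined = []
    for i, name in enumerate(names):
        combined.append((name, heights[i]))
    return combined

def with_zip():
    # list() consumes the lazy zip object, so the pairing work is included
    return list(zip(names, heights))

t_loop = timeit.timeit(with_loop, number=20)
t_zip = timeit.timeit(with_zip, number=20)
print(f"loop: {t_loop / 20 * 1e3:.2f} ms, zip: {t_zip / 20 * 1e3:.2f} ms per call")
```

Both functions return the same list of tuples, so the timings compare equal amounts of work.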


## Counting items in a list (loop vs Counter)

In [17]:
height_counts = {}
for height in HTList:
    if height not in height_counts:
        height_counts[height] = 1
    else:
        height_counts[height] += 1
print(height_counts)

{263: 627, 209: 673, 156: 685, 218: 656, 295: 703, 150: 700, 205: 645, 246: 607, 275: 670, 225: 668, 253: 656, 269: 719, 154: 681, 151: 691, 186: 655, 208: 649, 233: 656, 194: 652, 228: 651, 199: 673, 178: 646, 163: 675, 259: 674, 198: 627, 227: 672, 200: 639, 217: 653, 270: 695, 250: 628, 231: 701, 188: 679, 174: 675, 170: 631, 229: 694, 179: 664, 251: 654, 165: 706, 212: 648, 223: 677, 273: 633, 172: 691, 242: 652, 221: 738, 173: 652, 235: 671, 152: 662, 247: 611, 183: 663, 239: 648, 296: 676, 291: 682, 262: 672, 210: 630, 159: 621, 258: 648, 276: 684, 169: 657, 240: 662, 286: 662, 245: 689, 238: 673, 181: 685, 155: 668, 213: 643, 204: 636, 237: 650, 166: 660, 300: 675, 241: 643, 187: 657, 293: 662, 283: 647, 192: 675, 230: 654, 184: 656, 175: 643, 214: 675, 277: 704, 272: 643, 219: 708, 226: 689, 285: 683, 158: 662, 222: 666, 157: 646, 216: 674, 257: 655, 162: 638, 254: 643, 292: 649, 189: 651, 264: 641, 203: 618, 164: 648, 278: 625, 294: 700, 249: 646, 288: 606, 195: 694, 268: 715,

In [18]:
%%timeit
height_counts = {}
for height in HTList:
    if height not in height_counts:
        height_counts[height] = 1
    else:
        height_counts[height] += 1

13.2 ms ± 900 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [19]:
from collections import Counter
height_counts = Counter(HTList)
print(height_counts)

Counter({284: 740, 221: 738, 269: 719, 268: 715, 219: 708, 165: 706, 236: 705, 277: 704, 295: 703, 265: 703, 231: 701, 150: 700, 294: 700, 299: 697, 270: 695, 229: 694, 195: 694, 287: 694, 151: 691, 172: 691, 191: 691, 245: 689, 226: 689, 274: 688, 211: 687, 156: 685, 181: 685, 276: 684, 234: 684, 207: 684, 285: 683, 291: 682, 154: 681, 188: 679, 223: 677, 256: 677, 296: 676, 267: 676, 271: 676, 163: 675, 174: 675, 300: 675, 192: 675, 214: 675, 167: 675, 259: 674, 216: 674, 202: 674, 190: 674, 153: 674, 209: 673, 199: 673, 238: 673, 281: 673, 227: 672, 262: 672, 279: 672, 235: 671, 275: 670, 266: 670, 161: 669, 225: 668, 155: 668, 197: 667, 222: 666, 282: 665, 179: 664, 297: 664, 183: 663, 152: 662, 240: 662, 286: 662, 293: 662, 158: 662, 196: 662, 171: 662, 201: 661, 290: 661, 166: 660, 182: 658, 169: 657, 187: 657, 193: 657, 218: 656, 253: 656, 233: 656, 184: 656, 232: 656, 186: 655, 257: 655, 224: 655, 251: 654, 230: 654, 220: 654, 217: 653, 244: 653, 298: 653, 194: 652, 242: 652, 1

In [20]:
%%timeit
height_counts = Counter(HTList)

5.05 ms ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
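
The two approaches produce the same counts; `Counter` additionally offers helpers such as `most_common()`, which explains why its printed output above is sorted by frequency. A small sketch on a made-up list of heights:

```python
from collections import Counter

heights = [170, 180, 170, 165, 180, 170]  # illustrative data

# manual counting, using dict.get() to avoid the if/else branch
loop_counts = {}
for h in heights:
    loop_counts[h] = loop_counts.get(h, 0) + 1

counts = Counter(heights)
print(dict(counts) == loop_counts)   # True
print(counts.most_common(2))         # [(170, 3), (180, 2)]
```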


## Combinations of items in a list (loop vs itertools.combinations)

In [21]:
pairs = []
for x in nameList[:100]:
    for y in nameList[:100]:
        if x == y:
            continue
        if (x,y) not in pairs and (y,x) not in pairs:
            pairs.append((x,y))
print(pairs[:5])

[('Axofhxrnjp', 'Cepvntlcjy'), ('Axofhxrnjp', 'Upyxcbnexw'), ('Axofhxrnjp', 'Wrwmyunqpf'), ('Axofhxrnjp', 'Gciqvwewke'), ('Axofhxrnjp', 'Nwkhavnduy')]


In [22]:
%%timeit
# note: pairs is not reset here, so the list filled by the previous cell is reused
for x in nameList[:100]:
    for y in nameList[:100]:
        if x == y:
            continue
        if (x,y) not in pairs and (y,x) not in pairs:
            pairs.append((x,y))

2.1 s ± 85.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [23]:
from itertools import combinations

pairs_obj = combinations(nameList[:100],2)

# unpack combinations object to list
pairs = [*pairs_obj]
print(pairs[:5])

[('Axofhxrnjp', 'Cepvntlcjy'), ('Axofhxrnjp', 'Upyxcbnexw'), ('Axofhxrnjp', 'Wrwmyunqpf'), ('Axofhxrnjp', 'Gciqvwewke'), ('Axofhxrnjp', 'Nwkhavnduy')]


In [24]:
%%timeit
# note: combinations() is lazy, so this times only the creation of the iterator
combinations(nameList[:100],2)

782 ns ± 64.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
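
`combinations()` returns a lazy iterator, so the pairs only exist once it is consumed. The number of unordered pairs of n items is n(n-1)/2, which gives 4950 for the 100 names used above. A short sketch on made-up names that materializes the pairs and checks the count with `math.comb` (Python 3.8+):

```python
from itertools import combinations
from math import comb

names = ["Ann", "Bob", "Cy", "Dee"]  # illustrative data

# list() consumes the iterator and builds all pairs
pairs = list(combinations(names, 2))

print(pairs[:2])                         # [('Ann', 'Bob'), ('Ann', 'Cy')]
print(len(pairs), comb(len(names), 2))   # 6 6
print(comb(100, 2))                      # 4950
```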


## Replace loops with built-in functions (zip, map)

In [25]:
loop_output = []
for name,weight in zip(nameList, WTList):
    if weight < 100:
        name_length = len(name)
        name_tuple = (name, name_length)  # avoid shadowing the built-in tuple
        loop_output.append(name_tuple)

loop_output[:5]

[('Axofhxrnjp', 10),
 ('Wrwmyunqpf', 10),
 ('Gciqvwewke', 10),
 ('Yudnkcbtcd', 10),
 ('Hvmzilmbkt', 10)]

In [26]:
%%timeit
loop_output = []
for name,weight in zip(nameList, WTList):
    if weight < 100:
        name_length = len(name)
        name_tuple = (name, name_length)
        loop_output.append(name_tuple)

12.4 ms ± 316 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [27]:
# use list comprehension to generate a filtered new list (same condition as the loop above)
filtered_name_list = [name for name,weight in zip(nameList, WTList) if weight < 100]

# use map() to apply a function to a list
name_lengths_map = map(len, filtered_name_list)

# Combine two lists with zip, then unpack zip
output = [*zip(filtered_name_list, name_lengths_map)]

output[:5]

[('Axofhxrnjp', 10),
 ('Wrwmyunqpf', 10),
 ('Gciqvwewke', 10),
 ('Yudnkcbtcd', 10),
 ('Hvmzilmbkt', 10)]

In [28]:
%%timeit
filtered_name_list = [name for name,weight in zip(nameList, WTList) if weight < 100]
name_lengths_map = map(len, filtered_name_list)
output = [*zip(filtered_name_list, name_lengths_map)]

9.21 ms ± 368 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
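
The filter, the length lookup and the pairing can also be folded into one list comprehension. This is a sketch of an alternative, not a claim that it is faster; the three-person data set is made up, and timing it against the versions above with `%%timeit` is left to the reader.

```python
names = ["Ann", "Robert", "Cy"]   # illustrative stand-in for nameList
weights = [60, 120, 90]           # illustrative stand-in for WTList

# filter on weight and build the (name, length) tuples in a single pass
output = [(name, len(name)) for name, weight in zip(names, weights) if weight < 100]

print(output)   # [('Ann', 3), ('Cy', 2)]
```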


## Arrays instead of lists

In [29]:
%timeit convert_units_list(nameList, HTList, WTList)

36.8 ms ± 1.97 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [30]:
%timeit convert_units_array(nameList_np, HTList_np, WTList_np)

77.9 ms ± 587 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


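Oh boy, the array version takes about twice as long! The line profiler output above was not wrong: the two broadcasting lines really are much faster than the list comprehensions. The time is lost in the loop that builds the dictionary, because iterating over a NumPy array and indexing it element by element is slower than doing the same with a list, and the profiler's per-line overhead most likely hid that difference.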
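
Indexing a NumPy array one element at a time inside a Python loop creates a NumPy scalar object per access, which is a likely cause of the slower timing. One possible way around it, sketched under the assumption that plain Python floats are acceptable in the result: keep the arithmetic on the arrays, convert back with `tolist()`, and build the dictionary with `zip`. The function name `convert_units_array_tolist` and the three-person data set are made up for illustration; whether this beats the list version is worth checking with `%timeit` on the full data.

```python
import numpy as np

names = np.array(["Ann", "Bob", "Cy"])   # illustrative stand-in for nameList_np
heights = np.array([170, 180, 165])      # centimeters
weights = np.array([60, 80, 55])         # kilograms

def convert_units_array_tolist(names, heights, weights):
    # the vectorized arithmetic stays in NumPy ...
    new_hts = (heights * 0.39370).tolist()
    new_wts = (weights * 2.20462).tolist()
    # ... while the dictionary is built from plain Python objects
    return dict(zip(names.tolist(), zip(new_hts, new_wts)))

people = convert_units_array_tolist(names, heights, weights)
print(people["Ann"])
```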