
# Writing Fast Python Code 
### William E Fondrie  
 
Noble Lab Meeting  
2020-03-04

# Outline

- Introduction  
- Base Python and the standard library
- Performant Pandas
- Lightspeed with Numba
- Easy GPU computing with CuPy and Numba

# Why is Python slow?

Python is an interpreted language, in contrast to compiled languages like C/C++.

However, it is often *fast enough* and much easier to write/maintain.

# To optimize or not to optimize?

> “The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.”

<div style="text-align: right">
    - Donald Knuth, <em>Computer Programming as an Art</em>
</div>

Is your Python code *fast enough* already?  
What part of your code is slow?  
Is the problem your algorithm or data structures, not the language?   

# Identify your bottlenecks

# If you suspect a section of code is slow...

Use `time.time()` to time it:

In [19]:
import time

start = time.time()
"-".join(str(n) for n in range(1000000))
finish = time.time()
print(finish - start, "seconds")

0.26831793785095215 seconds
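
For short intervals, `time.perf_counter()` is usually a better clock than `time.time()`, because it is monotonic and uses the highest-resolution timer available. A minimal sketch of the same measurement, wrapped in a small helper (`time_call` is a hypothetical name, not part of the standard library):

```python
import time


def time_call(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Same workload as above: join one million integers with dashes.
joined, elapsed = time_call("-".join, (str(n) for n in range(1_000_000)))
print(f"{elapsed:.3f} seconds")
```

A single measurement like this is noisy, so treat it as a rough first look rather than a benchmark.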


# Use cProfile to find slow code:

In [36]:
import cProfile

def foo():
    x = []
    for i in range(1000000):
        x.append(i)

cProfile.run("foo()")

         1000004 function calls in 0.233 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.144    0.144    0.218    0.218 <ipython-input-36-6a0459c9858c>:3(foo)
        1    0.015    0.015    0.233    0.233 <string>:1(<module>)
        1    0.000    0.000    0.233    0.233 {built-in method builtins.exec}
  1000000    0.074    0.000    0.074    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
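
The default report is ordered by function name, which gets hard to read for larger programs. One possible approach, sketched here with the standard `pstats` module, is to collect the profile into an object and sort it by cumulative time so the most expensive calls come first. The limit of 5 rows is an arbitrary choice for illustration:

```python
import cProfile
import io
import pstats


def foo():
    x = []
    for i in range(1_000_000):
        x.append(i)


# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
foo()
profiler.disable()

# Write the report to a string, sorted by cumulative time, top 5 rows.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```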




# Benchmark your functions

Time your functions with `timeit`:

In [14]:
import timeit

def foo():
    return "-".join(str(n) for n in range(1000))

total = timeit.timeit(foo, number=1000)
print(total / 1000, "seconds/call")

0.00022530062000078032 seconds/call


In [7]:
%timeit foo()

201 µs ± 6.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
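
Outside of IPython, `timeit.repeat` gives something close to what `%timeit` reports: several independent runs, each made of many loops. A minimal sketch, assuming 5 runs of 500 loops (both numbers are arbitrary here); taking the minimum is a common way to reduce noise from other processes:

```python
import timeit


def foo():
    return "-".join(str(n) for n in range(1000))


number = 500  # loops per run
totals = timeit.repeat(foo, number=number, repeat=5)  # one total per run

# The fastest run is the one least disturbed by the rest of the system.
best = min(totals) / number
print(f"{best * 1e6:.1f} µs per call (best of {len(totals)} runs)")
```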


# Base Python and the standard library

# Avoid repeatedly appending to a list

In [47]:
def use_list():
    x = []
    for i in range(10000):
        x.append(i)
    return x
        
%timeit use_list()

611 µs ± 6.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


Use a list comprehension instead:

In [46]:
def use_comp():
    return [i for i in range(10000)]

%timeit use_comp()

300 µs ± 7.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


# Look up values in sets and dictionaries, not lists

In [48]:
data = list(range(10000))
queries = [1000, 2838, 8493, -20]

def lookup_list(data, query):
    return query in data

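def lookup_set(data, query):
    # Note: the set is rebuilt on every call, so the O(n) conversion
    # cost is included in the timing below.
    data = set(data)
    return query in data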

%timeit lookup_list(data, queries[0])
%timeit lookup_set(data, queries[1])

11.3 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
107 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
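
In the timing above, `lookup_set` is slower because it converts the list to a set on every call. The benefit of a set shows up when it is built once and then queried many times. A minimal sketch of that pattern, reusing `data` and `queries` from the cell above (`lookup_prebuilt_set` and `data_set` are names introduced here for illustration):

```python
import timeit

data = list(range(10000))
queries = [1000, 2838, 8493, -20]

# Pay the O(n) conversion cost once, outside the timed code.
data_set = set(data)


def lookup_list(data, query):
    return query in data  # linear scan through the list


def lookup_prebuilt_set(data_set, query):
    return query in data_set  # hash lookup, O(1) on average


# Both approaches must agree on every query.
for query in queries:
    assert lookup_list(data, query) == lookup_prebuilt_set(data_set, query)

# Time all four queries together, 1000 repetitions each.
list_time = timeit.timeit(
    lambda: [lookup_list(data, q) for q in queries], number=1000
)
set_time = timeit.timeit(
    lambda: [lookup_prebuilt_set(data_set, q) for q in queries], number=1000
)
print(f"list: {list_time:.4f} s, prebuilt set: {set_time:.4f} s")
```

If the data is only queried once, the conversion cost can outweigh the faster lookup, so the list may still be the right choice in that case.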
