# Introduction

These are some ideas on how to speed up your Python code.

Before we get into it, here are some facts that should make the rest of this tutorial easier to follow:

## Python Language Interpretation
- Python is an interpreted, dynamically typed language. This means that variable types are not known until run time. This is ~~bad~~ not so good. Compiled languages like C/C++ are statically typed, which means that the compiler can optimize the code at compile time, since it knows what it's handling at every moment.


- Python code is also "compiled", into a `*.pyc` file. This is not a binary file that the hardware can read; it contains bytecode. The Python interpreter takes the `pyc` and executes the bytecode one instruction at a time. This is when the resolution of variable types, the dereferencing of objects, etc. takes place. This is also ~~very bad~~ not that good either.


- The interpreter is a virtual machine, which is also not optimized for the hardware it's running on (similar to Linux "compile from the source code" vs "using the distribution's binaries" (ugh)). A similar virtual-machine layer is one reason Java is often cited as trailing C/C++.


- Everything in Python is an object (an instance of a class). This is good because it offers lots of flexibility and we can pass and manipulate variables as if there were no ~~segfault~~ tomorrow, but it comes at an efficiency price. Because one can add new attributes to an instance basically anywhere in the code, the interpreter needs to check **on every access** whether the object has that attribute and where it is located in memory, and only then use it.
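
A minimal sketch of what the points above look like in practice. The `Point` class and the `add` function are hypothetical names made up for this illustration; `dis` is the standard-library bytecode disassembler.

```python
import dis

class Point:
    """Hypothetical class, used only for this illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
# Attributes live in a per-instance dictionary, so new ones can be
# attached at any time; every `p.name` access is resolved at run time.
p.label = "somewhere"
instance_attributes = sorted(vars(p))

def add(a, b):
    # Nothing here says what `a` and `b` are: the same bytecode
    # has to serve ints, strings, lists and anything else with `+`.
    return a + b

int_sum = add(1, 2)
str_sum = add("py", "thon")

# Print the bytecode the interpreter executes for `add`
# (the exact instruction names vary between Python versions).
dis.dis(add)
```

The interpreter only finds out which kind of addition to perform when `add` is actually called, which is the run-time cost the list above describes.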


## Iterators
There are no C-style `for` loops in Python: every `for` statement walks over an iterator.
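
One way to read that statement (an interpretation, since this section is still unfinished): a Python `for` does not count an index as in C, it repeatedly calls `next()` on an iterator. The sketch below spells out by hand what the loop does; the variable names are only illustrative.

```python
numbers = [10, 20, 30]

# Roughly what `for n in numbers: total += n` does behind the scenes:
total = 0
iterator = iter(numbers)        # ask the container for an iterator
while True:
    try:
        n = next(iterator)      # fetch the next element
    except StopIteration:       # the iterator signals that it is exhausted
        break
    total += n

# The ordinary loop gives the same result.
total_for = 0
for n in numbers:
    total_for += n
```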

TBC


There are ways to speed things up (as much as possible), but the price to pay is that your code will be less pretty, or less readable - well, you can't have everything! So you must document your code. *Cough, cough*

# Intrinsic Operators

Operators in Python (+, -, \*, etc.) are available as functions via the `operator` module. It is better to pass these functions directly than to wrap the operator in your own function (or lambda), which adds an extra call every time it is used. Also, always prefer built-in functions to building your own.


For instance:

In [62]:
import random
import operator
sample_size = 1000000
random_numbers = [random.randint(0,sample_size) for p in range(0,sample_size)]
random_numbers2 = [random.randint(0,sample_size) for p in range(0,sample_size)]



In the following example, the lambda function is called on every iteration (and that is not that good, as we will see later):

In [63]:
%%time
res=list(map(lambda x,y: x+y, random_numbers, random_numbers2))

CPU times: user 181 ms, sys: 31.4 ms, total: 212 ms
Wall time: 220 ms


In [64]:
%%time
res=list(map(operator.add, random_numbers, random_numbers2))

CPU times: user 146 ms, sys: 27.3 ms, total: 173 ms
Wall time: 177 ms


About **25% faster**
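
Outside of IPython, the same comparison can be sketched with the standard `timeit` module. The helper names `with_lambda` and `with_operator` are made up for this illustration, the sample is smaller than in the cells above so the script finishes quickly, and the absolute timings will differ from machine to machine.

```python
import operator
import random
import timeit

sample_size = 100_000
a = [random.randint(0, sample_size) for _ in range(sample_size)]
b = [random.randint(0, sample_size) for _ in range(sample_size)]

def with_lambda():
    # The lambda adds one extra Python-level call per element.
    return list(map(lambda x, y: x + y, a, b))

def with_operator():
    # operator.add is implemented in C and is called directly by map().
    return list(map(operator.add, a, b))

# Both variants must produce the same sums.
same_result = with_lambda() == with_operator()

# Best of 3 repeats, 5 calls each.
t_lambda = min(timeit.repeat(with_lambda, number=5, repeat=3))
t_operator = min(timeit.repeat(with_operator, number=5, repeat=3))
print(f"lambda:       {t_lambda:.3f} s")
print(f"operator.add: {t_operator:.3f} s")
```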

# Function call overhead

As mentioned in the introduction, functions are objects too, and each call has to be resolved at run time, along with its arguments. For that reason, a function should process a whole collection of data rather than being called once per element.

In [69]:
%%timeit -n 100

# A function in a loop

def doubleit(i):
    x = i*2

def f():
    for i in range(100000): 
        doubleit(i)
f()

100 loops, best of 3: 19.4 ms per loop


In [70]:
%%timeit -n 100

# A loop in a function

def doubleit():
    for i in range(100000): 
        x = i*2
def f():
    doubleit()

f()

100 loops, best of 3: 7.33 ms per loop
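
The timed cells above throw their results away. As a small sketch of the same idea with real return values (the names `double_one` and `double_all` are hypothetical), the two styles can be written so that they are interchangeable for the caller:

```python
def double_one(value):
    # Called once per element: one function call for every value.
    return value * 2

def double_all(values):
    # Called once for the whole sequence: the loop lives inside the function.
    return [value * 2 for value in values]

data = list(range(10))

per_element = [double_one(v) for v in data]   # 10 function calls
aggregated = double_all(data)                 # 1 function call
```

Both give the same list, so switching to the aggregated form usually only changes where the loop is written.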


# `in` operator

Dictionaries and sets are implemented using hash tables (fast).
Lists and tuples are not.

Checking for membership in a list or tuple is not as efficient, as each element must be checked in turn.
If you need to check membership very often, use a dict or set as your container rather than searching a list.

In [94]:
letters = 'abcdefghijklmnopqrstuvwxyz'
letters_list = [x+y+z+w for x in letters for y in letters for z in letters for w in letters]
letters_dict = dict([(x,x) for x in letters_list])
letters_set = set(letters_list)

In [100]:
%%timeit -n 100

# Membership test in a list
"zzzz" in letters_list

100 loops, best of 3: 9.84 ms per loop


In [96]:
%%timeit -n 100

# Membership test in a dict
"zzzz" in letters_dict

100 loops, best of 3: 133 ns per loop


In [101]:
%%timeit -n 100

# Membership test in a set
"zzzz" in letters_set

100 loops, best of 3: 85.3 ns per loop
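
When the data arrives as a list, a common pattern is to convert it to a set once and run all the lookups against the set. A minimal sketch with made-up data (`allowed_words` and `incoming` are hypothetical names):

```python
allowed_words = ["apple", "banana", "cherry"]
incoming = ["cherry", "durian", "apple", "apple", "elderberry"]

# Build the set once; each `in` test afterwards is a hash lookup.
allowed_set = set(allowed_words)

accepted = [word for word in incoming if word in allowed_set]
```

Building the set has to visit every element once, so the conversion pays off when there are many lookups, not for a single test.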


# Strings

Strings are immutable. This means every time you modify a string, say with the operator '+', you are actually copying the string data over and over. Strings have built-in (read "optimized") methods, such as `str.join()`, that avoid the copying overhead.

In [103]:
%%timeit -n 100

# hand-built concatenation:
def concat_string(str_list):
    text = ""
    for s in str_list:
        text += " "+s
    return text

res = concat_string(letters_list)

100 loops, best of 3: 122 ms per loop


In [104]:
%%timeit -n 100

# Using the str.join():
res = " ".join(letters_list)

100 loops, best of 3: 8.1 ms per loop
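
When the pieces are produced inside a loop, the same idea still applies: collect them in a list and join once at the end. A minimal sketch; `build_report` and its input are hypothetical.

```python
def build_report(rows):
    parts = []
    for name, score in rows:
        # Appending to a list does not copy the text built so far.
        parts.append(f"{name}: {score}")
    # One join at the end creates the final string in a single pass.
    return "\n".join(parts)

report = build_report([("ada", 3), ("bob", 5)])
```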


# Decorators for caching

Decorator functions can be used to cache results and speed things up. This is useful when intermediate results (or checkpoints) can be stored and reused.

In [106]:
def fib(i):
    if i < 2: return 1
    return fib(i-1) + fib(i-2)

In [115]:
%%timeit -n 100

fib(25)

100 loops, best of 3: 49 ms per loop


In [116]:
# Create the decorator
from functools import wraps

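def cache(f):
    results = {}  # maps argument tuples to already computed values
    @wraps(f)
    def wrap(*arg):
        if arg not in results: results[arg] = f(*arg)
        return results[arg]
    return wrap

@cache
def fib_decorated(i):
    if i < 2: return 1
    # Recurse through the decorated function so intermediate values are cached too
    return fib_decorated(i-1) + fib_decorated(i-2)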

In [118]:
%%timeit -n 100

fib_decorated(25)

The slowest run took 1916.18 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 413 ns per loop
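
The standard library ships a ready-made decorator of this kind, `functools.lru_cache`. The sketch below applies it to the same recursion; the name `fib_lru` is made up here, and `maxsize=None` (an unbounded cache) is a choice for this example, not a general recommendation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # unbounded cache: every distinct argument is kept
def fib_lru(i):
    if i < 2:
        return 1
    # The recursive calls go through the cached function as well,
    # so each value of i is computed only once.
    return fib_lru(i - 1) + fib_lru(i - 2)

value = fib_lru(25)

# cache_info() reports hits, misses and the current cache size.
info = fib_lru.cache_info()
print(value, info)
```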


# Object access

As mentioned in the introduction, accessing object attributes (with the `.` operator) has an efficiency impact. This is because Python needs to figure out whether what is being accessed exists at that time, what it is, and where it is in memory, before it can be executed. Although this may seem like a tiny amount of time, it can add up over millions of operations.

This is problematic in loops, for instance. To speed things up, you can store the function object in a variable outside of the loop, and call it through that variable on every iteration.

In [127]:
lowerlist = ['abcdefghijklmnopqrstuvwxyz'[:random.randint(0,25)] for x in range(10000)]
upperlist = []

In [137]:
%%timeit -n 500

# This uses two "." calls: 
def to_upper_1():
    for word in lowerlist:
        upperlist.append(str.upper(word))
        
to_upper_1()

500 loops, best of 3: 7.42 ms per loop


In [138]:
%%timeit -n 500

# The two "." lookups now happen once, outside the loop:
upper = str.upper
append = upperlist.append

def to_upper_2():
    for word in lowerlist:
        append(upper(word))
        
to_upper_2()

500 loops, best of 3: 5.13 ms per loop
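
A list comprehension goes one step further, because it appends to the new list internally and needs no `.append` lookup at all. A minimal sketch with a made-up word list:

```python
words = ["alpha", "beta", "gamma"]

# Looked up once, outside the loop.
upper = str.upper

# No attribute access is left inside the loop body.
shouted = [upper(word) for word in words]
```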


# Scoped Variables

Local variables (those defined inside a function's scope) are accessed faster than "out-of-scope" (global) variables. In the example above, the function aliases `upper` and `append`, as well as `upperlist`, are defined outside of the function `to_upper_2()`. Move those variables inside the function so they all belong to the same scope.

In [139]:
%%timeit -n 500

def to_upper_3():
    upperlist = []
    upper = str.upper
    append = upperlist.append
    for word in lowerlist:
        append(upper(word))
    return upperlist
        
upperlist = to_upper_3()

500 loops, best of 3: 4.54 ms per loop
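
The effect can be isolated with a smaller sketch that differs only in where one name is looked up. The names `use_global`, `use_local`, `COUNT` and `factor` are hypothetical, and the timings depend on the machine and the Python version.

```python
import timeit

COUNT = 100_000
factor = 3   # module-level (global) name

def use_global():
    total = 0
    for i in range(COUNT):
        total += i * factor        # global lookup on every iteration
    return total

def use_local():
    local_factor = factor          # one global lookup, then a local name
    total = 0
    for i in range(COUNT):
        total += i * local_factor  # local lookup on every iteration
    return total

# Both variants compute the same value.
same_result = use_global() == use_local()

t_global = min(timeit.repeat(use_global, number=10, repeat=3))
t_local = min(timeit.repeat(use_local, number=10, repeat=3))
print(f"global name: {t_global:.3f} s")
print(f"local name:  {t_local:.3f} s")
```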


# References
 - https://nyu-cds.github.io/python-performance-tips/

# License

This work is derived from work that is Copyright © Software Carpentry (http://software-carpentry.org/), under the [Creative Commons License](https://creativecommons.org/licenses/by/4.0/legalcode). Examples have been expanded or reduced. Text was modified.