# Notes on efficient computing

by Martin Hronec


### Contents

0. [Midterm solution](#midterm)
1. [Timing](#timing)
2. [Profiling](#profiling)
3. [Generators](#generators)
4. [Iterators](#iterators)


In [1]:
# system-specific functions
import sys

## Efficient computing

* what is meant by **efficient Python code**?
    * minimal completion time (fast runtime)
    * minimal resource consumption (small memory footprint)


* what is meant by **Pythonic**?
    * focus on readability
    * using Python's constructs as intended
    * the Zen of Python

* efficiency and Pythonic style are sometimes at odds with each other
* optimize only what needs optimizing:
    1. Get it right
    2. Test it's right
    3. Profile if slow
    4. Optimise
    5. Repeat from 2.



### Building with built-ins

* Python comes with "batteries included" (the Python Standard Library)
    * these help in writing efficient code
* built-in types: `list`, `tuple`, `set`, `dict`
* built-in functions: `range()`, `round()`, `enumerate()`, `map()`, `zip()`
* built-in modules: `os`, `sys`, `itertools`, `collections`, `math`, etc.


### Examining runtime (timing and profiling code)

* why time code?
    * allows us to pick the optimal coding implementation
* we have already seen some IPython magic earlier (`%timeit` and `%%time`)
* there is a `time` module for scripts

In [2]:
import time
start = time.time()
a = range(100000)
b = []
for i in a:
    b.append(i*2)
end = time.time()
print(end - start)

0.012430906295776367


In [3]:
time.time()

1606842549.4879186

In [4]:
time.time() / (60*60*24*30*12)

51.66031859239547

* the **timeit** module measures the execution time of small code snippets

* calculate runtime with the IPython magic command `%timeit` (list all available magics with `%lsmagic`)
    * set the number of runs (`-r`) and/or loops (`-n`)
    * use `%%timeit` for cell magic mode (timing a whole cell)
    * save the output to a variable (`-o`)
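Outside of IPython, the same kind of measurement can be sketched with the standard-library `timeit` module. The function `double_all` and the chosen `repeat`/`number` values are only illustrative assumptions, not part of the original notes:

```python
import timeit

def double_all(n):
    """Toy workload (hypothetical): double every number below n."""
    return [i * 2 for i in range(n)]

# repeat=5 runs, each executing the call number=100 times
# (roughly what `%timeit -r5 -n100` does)
runs = timeit.repeat(lambda: double_all(1000), repeat=5, number=100)

# each entry is the total time of one run; divide by `number` for a per-call time
best_per_call = min(runs) / 100
print(f"best of 5 runs: {best_per_call * 1e6:.1f} microseconds per call")
```

Taking the minimum of the runs is a common choice, since slower runs are usually caused by other processes rather than by the code itself.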

In [5]:
# show the documentation of the magic command
%timeit?


In [6]:
import numpy as np
# number of runs: 2, number of loops: 10
%timeit -r2 -n10 rand_nums = np.random.rand(1000)


7.66 µs ± 1.23 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)


In [7]:
# save timeit results into a variable 
times =  %timeit -o rand_nums = np.random.rand(1000)

6.44 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [8]:
times.__dict__

{'loops': 100000,
 'repeat': 7,
 'best': 6.2140485900090425e-06,
 'worst': 6.764381769971805e-06,
 'all_runs': [0.6518236590018205,
  0.651060124997457,
  0.6764381769971806,
  0.6632696870001382,
  0.622884267999325,
  0.6214048590009043,
  0.6226210429995263],
 'compile_time': 4.599999999999049e-05,
 '_precision': 3,
 'timings': [6.518236590018205e-06,
  6.5106012499745704e-06,
  6.764381769971805e-06,
  6.632696870001382e-06,
  6.228842679993249e-06,
  6.2140485900090425e-06,
  6.226210429995262e-06]}

In [9]:
%%timeit
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

217 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


 
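* when repeating an operation is not a good idea (e.g. it is slow or it changes its input), use `%time`, which runs the statement only once
* `%timeit` does some heavy lifting for you, e.g. it disables garbage collection while timing (one of the things that might affect the measurement)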



In [10]:
a = np.random.permutation(1000000)

In [11]:
%time a.sort()

CPU times: user 69.7 ms, sys: 0 ns, total: 69.7 ms
Wall time: 68.3 ms


In [12]:
%timeit a.sort()

7.84 ms ± 79.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
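The gap between the `%time` and `%timeit` results above is most likely not noise: `a.sort()` sorts in place, so every repetition after the first one works on already-sorted data, which is much cheaper to "sort". A rough sketch of the same effect with plain Python lists and the `timeit` module (the sizes and repetition counts are arbitrary assumptions):

```python
import random
import timeit

random.seed(0)
data = [random.random() for _ in range(100_000)]

# sorted() builds a new sorted list each time, so every run sees unsorted input
# (building the new list is included in the measurement)
t_unsorted = min(timeit.repeat(lambda: sorted(data), number=5, repeat=3))

# in-place sort of the same list: after the first call it is already sorted
work = list(data)
work.sort()
t_presorted = min(timeit.repeat(work.sort, number=5, repeat=3))

print(f"unsorted input:  {t_unsorted:.4f} s")
print(f"presorted input: {t_presorted:.4f} s")
```

This is the typical case where a single `%time` run gives the more honest number.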


### Code profiling for runtime
* detailed stats on frequency and duration of function calls
* line-by-line analyses
* package used: [line_profiler](https://github.com/rkern/line_profiler)


In [13]:
%load_ext line_profiler

In [14]:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (i**j) for j in range(N)]
        total += sum(L)
    return total

In [15]:
# show the documentation of the magic command
%lprun?


In [16]:
%lprun -f sum_of_lists sum_of_lists(5000)

Timer unit: 1e-06 s

Total time: 0.064148 s
File: <ipython-input-14-0e483e4cb579>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
     1                                           def sum_of_lists(N):
     2         1          2.0      2.0      0.0      total = 0
     3         6          7.0      1.2      0.0      for i in range(5):
     4         5      61674.0  12334.8     96.1          L = [j ^ (i**j) for j in range(N)]
     5         5       2464.0    492.8      3.8          total += sum(L)
     6         1          1.0      1.0      0.0      return total

* shows where the time bottleneck is
    * the Time column is given in timer units (here 1e-06 s, i.e. microseconds)
* for more info on the available profiling options: `%lprun?`
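For function-level statistics (how often each function is called and how long it takes), the standard library's `cProfile` can be used instead of `line_profiler`. This is a swapped-in tool, not the one used in these notes; the sketch below profiles a `sum_of_lists`-style function and prints the five most expensive entries:

```python
import cProfile
import io
import pstats

def sum_of_lists(n):
    total = 0
    for i in range(5):
        values = [j ^ (j >> i) for j in range(n)]
        total += sum(values)
    return total

profiler = cProfile.Profile()
result = profiler.runcall(sum_of_lists, 5000)   # run the function under the profiler

# collect the report in a string instead of printing directly
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)   # top 5 entries by cumulative time
report = stream.getvalue()
print(report)
```

In IPython, the `%prun` magic offers a similar report without the boilerplate.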


### Code profiling for memory usage
* code's memory footprint (the amount of memory an operation uses)
* using the package *memory_profiler* (also available as an IPython extension)
* contains 2 notably useful magic functions:
    * `%memit`: memory-measuring equivalent of `%timeit`
    * `%mprun`: memory-measuring equivalent of `%lprun`

In [17]:
# package for memory profiling
%load_ext memory_profiler

In [18]:
# sys.getsizeof works, but it reports only the size (in bytes) of an individual object
sys.getsizeof([*range(1000)])

9104

* for a line-by-line description: `%mprun`
    * gives us a summary of the memory use
    * works only for functions defined in separate modules rather than in the notebook itself
* use the `%%file` magic to create a simple module, which will contain the code we want to profile

In [19]:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total

def copies_of_lists(N):
    d = {}
    for i in range(N):
        d[i] = [1,2,3]
    return d


Overwriting mprun_demo.py


In [20]:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000)




Filename: /home/crozier/projects/phd/teaching/DPP_IES/10/mprun_demo.py

Line #    Mem usage    Increment  Occurences   Line Contents
     1     77.7 MiB     77.7 MiB           1   def sum_of_lists(N):
     2     77.7 MiB      0.0 MiB           1       total = 0
     3     77.7 MiB      0.0 MiB           6       for i in range(5):
     4     77.7 MiB      0.0 MiB        5015           L = [j ^ (j >> i) for j in range(N)]
     5     77.7 MiB      0.0 MiB           5           total += sum(L)
     6     77.7 MiB      0.0 MiB           5           del L # remove reference to L
     7     77.7 MiB      0.0 MiB           1       return total

In [21]:
from mprun_demo import copies_of_lists
%mprun -f copies_of_lists copies_of_lists(1000)




Filename: /home/crozier/projects/phd/teaching/DPP_IES/10/mprun_demo.py

Line #    Mem usage    Increment  Occurences   Line Contents
     9     77.7 MiB     77.7 MiB           1   def copies_of_lists(N):
    10     77.7 MiB      0.0 MiB           1       d = {}
    11     77.8 MiB      0.1 MiB        1001       for i in range(N):
    12     77.8 MiB      0.0 MiB        1000           d[i] = [1,2,3]
    13     77.8 MiB      0.0 MiB           1       return d

* the Increment column shows how much each line affects the total memory budget
* for more info on `%memit` and `%mprun`, use the IPython help (`%memit?`, `%mprun?`); an example of `%memit`:

In [22]:
%memit sum_of_lists(10000) 

peak memory: 78.35 MiB, increment: 0.51 MiB


# Gaining efficiencies

## Generators
* functions that can be paused and resumed, producing values on the fly (calling one returns **an iterator**)
* they are lazy!
    * produce items one at a time
    * and only when asked
* much more memory-efficient when working with big data
    * example: the PyTorch `DataLoader`
* define the function as you normally would, only replace the `return` statement with the `yield` statement
* the `yield` statement pauses the function and saves the local state (so that it can be resumed right where it left off)
    * calling the function returns a generator object which is used to control execution
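To make the "big data" point concrete, here is a minimal sketch of a batching generator in the spirit of a data loader. The helper `batches` is hypothetical and is not the PyTorch API; it only illustrates how `yield` hands out one chunk at a time:

```python
def batches(items, batch_size):
    """Yield successive lists of at most `batch_size` items (hypothetical helper)."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # leftover items that did not fill a whole batch
        yield batch

# range() is lazy as well, so the full list of samples never exists in memory
for batch in batches(range(10), 4):
    print(batch)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```

Only one batch is alive at any moment, no matter how long the input is.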

In [23]:
def name_generator():
    yield 'Vitek'
    yield 'Honza'

In [24]:
?next

Docstring:
next(iterator[, default])

Return the next item from the iterator. If default is given and the iterator
is exhausted, it is returned instead of raising StopIteration.
Type:      builtin_function_or_method


In [25]:
names = name_generator()

In [26]:
next(names)

'Vitek'

In [27]:
next(names)

'Honza'

In [28]:
next(names)

StopIteration: 

In [None]:
def countdown(num):
    print('Starting')
    while num > 0:
        yield num
        num -= 1

In [None]:
# calling the function (generator) does not execute it
val = countdown(5)
val

* generator objects execute when `next()` is called
* calling `next()` for the first time: execution begins at the start of the function body and continues until the first `yield` statement, where the value to the right of `yield` is returned
* subsequent calls to `next()` resume right after the `yield` statement, run to the end of the loop body, loop around, and continue until another `yield` is reached
* if no further `yield` is reached (which in our case means we don't enter the `while` loop again because `num <= 0`), a `StopIteration` exception is raised

In [None]:
next(val)

In [None]:
next(val)
next(val)
next(val)
next(val)
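The behaviour described above can be traced step by step. The sketch below repeats the `countdown` definition so that it runs on its own, and uses a shorter countdown to reach exhaustion quickly:

```python
def countdown(num):
    print('Starting')
    while num > 0:
        yield num
        num -= 1

gen = countdown(2)      # nothing is printed yet: the body has not started
first = next(gen)       # prints 'Starting', runs to the first yield
second = next(gen)      # resumes after the yield, decrements, yields again
print(first, second)    # 2 1

try:
    next(gen)           # num is now 0, the loop ends without another yield
except StopIteration:
    print('generator exhausted')

# next() accepts a default that is returned instead of raising
print(next(gen, 'nothing left'))   # nothing left
```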

* generator expressions
    * generators can be written in the same compact form as list comprehensions

In [None]:
my_list = ['a','b','c','d']
gen_obj = (x for x in my_list)
for val in gen_obj:
    print(val)

* the *parentheses* around the expression on the second line denote a generator expression, which, for the most part, does the same thing that a list comprehension does, but does it lazily

In [None]:
# consume the generator; timing only its creation would measure almost nothing
%timeit sum(i * 2 for i in range(1000) if i % 3 == 0 or i % 5 == 0)

In [None]:
%timeit sum([i * 2 for i in range(1000) if i % 3 == 0 or i % 5 == 0])

* don't mix up the syntax of a list comprehension with that of a generator expression (`[]` vs `()`)
    * generator expressions can run *slower* (because of the overhead of resuming the generator for every item)
    * modify the parameter in `range` to see the effect
* however, **generator expressions are drastically more efficient when the size of your data is larger than the available memory** (a list would have to hold every item at once)
* we will look at [dask](https://dask.org/) during the next lecture
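The memory side of this trade-off can be checked directly with `sys.getsizeof`. A small sketch (the size `n` is an arbitrary assumption):

```python
import sys

n = 1_000_000
squares_list = [i * i for i in range(n)]   # all items exist at once
squares_gen = (i * i for i in range(n))    # items are produced on demand

# getsizeof reports the container only (not the integers it refers to),
# which is already enough to see the difference
print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a few hundred bytes at most

# both give the same result when consumed
assert sum(squares_list) == sum(squares_gen)
```

Note that the generator can be consumed only once; after the `sum()` call it is exhausted.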

## Iterators

* considered to be a "milestone" for any serious Pythonista
* consider the humble *for-in loop*
* but how do loop constructs work behind the scenes?
    * Python's iterator protocol:
        * objects that support the `__iter__` and `__next__` dunder methods automatically work with for-in loops
* below is a class with a bare-bones implementation of the iterator protocol

In [None]:
class Repeater:
    def __init__(self, value):
        self.value = value

    def __iter__(self):
        return self  # the object defines __next__ itself, so it is its own iterator
    
    def __next__(self):
        return self.value

* looks like a straightforward Python class, but:
    * the methods `__iter__` and `__next__` are the key to making a Python object iterable

In [None]:
repeater = Repeater('Hello')

# working iterator (infinite, so we break out manually)
for i, item in enumerate(repeater):
    if i < 10:
        print(i, item)
    else:
        break

In [None]:
# a more explicit version of what goes on behind the scenes
repeater = Repeater('Hello')
iterator = repeater.__iter__()
i = 0
while i<10: # (alternatively `while True`)
    item = iterator.__next__()
    print(i,item)
    i = i + 1

* the for-in loop first prepares the iterator object (by calling the `__iter__` method)
* the loop then repeatedly calls the `__next__` method
* because there’s never more than one element “in flight”, this approach is highly memory-efficient
* in abstract terms, iterators provide a common interface that allows you to process every element of a container while being completely isolated from the container’s internal structure
* whether you’re dealing with a list of elements, a dictionary, an infinite sequence like the one provided by our `Repeater` class, or another sequence type, all of that is just an implementation detail: every single one of these objects can be traversed in the same way by the power of iterators
* we can replace the calls to `__iter__` and `__next__` with calls to Python’s built-in functions `iter()` and `next()`
    * other built-in functions serve the same purpose of a clean facade: `len(x)` for `x.__len__()`

* who wants to iterate forever?
    * how to write an iterator that eventually stops generating values?
* the `StopIteration` exception signals that we’ve exhausted all of the available values in the iterator
    * exceptions are used for control flow in iterators



In [None]:
my_list = [1,2,3]
iterator = iter(my_list)
next(iterator)
next(iterator)
next(iterator)


In [None]:
# if I keep requesting more values
next(iterator)

* iterators can't be reset ("once they’re exhausted they’re supposed to raise StopIteration every time next() is called on them")
* we can implement the above notions in our code
    * iteration stops after the number of repetitions defined in the `max_repeats` parameter

In [None]:
class BoundedRepeater:
    def __init__(self, value, max_repeats):
        self.value = value
        self.max_repeats = max_repeats
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max_repeats:
            raise StopIteration
        self.count += 1
        return self.value

In [None]:
# what a for-in loop over the repeater does behind the scenes
repeater = BoundedRepeater('Hello', 3)
iterator = iter(repeater)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break
    print(item)



* iterators in Python 2.x are syntactically different (the method is called `next()` instead of `__next__()`), so check the details if you need the compatibility

## itertools
* a collection of tools for handling iterators
    * iterables are data types that can be used in a `for` loop
    * the most common iterable in Python is the list
* for a detailed explanation of what an iterable is, look at the [Python 3 glossary](https://docs.python.org/3/glossary.html#term-iterable)

* it is probably not enough to know the definitions of the functions it contains
    * the real skill is how to compose these functions to create fast, memory-efficient, and good-looking code

* the functions in itertools "operate" on iterators to produce more complex iterators

* used for creating and using iterators
    * infinite iterators: ``count``, ``cycle``, ``repeat``
    * finite iterators: ``accumulate``, ``chain``, ``zip_longest``
    * combinatoric generators: `product`, `permutations`, `combinations`
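A quick tour of the functions named above may help; the inputs are arbitrary toy values. Infinite iterators are cut off with `itertools.islice`, otherwise `list()` would never finish:

```python
import itertools

# infinite iterators: always cut them off, e.g. with islice()
print(list(itertools.islice(itertools.count(10, 5), 4)))   # [10, 15, 20, 25]
print(list(itertools.islice(itertools.cycle('ab'), 5)))    # ['a', 'b', 'a', 'b', 'a']
print(list(itertools.repeat('x', 3)))                      # ['x', 'x', 'x']

# finite iterators
print(list(itertools.accumulate([1, 2, 3, 4])))            # [1, 3, 6, 10]
print(list(itertools.chain([1, 2], 'ab')))                 # [1, 2, 'a', 'b']

# combinatoric generators
print(list(itertools.product([0, 1], 'ab')))       # [(0, 'a'), (0, 'b'), (1, 'a'), (1, 'b')]
print(list(itertools.permutations([1, 2, 3], 2)))  # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
print(list(itertools.combinations([1, 2, 3], 2)))  # [(1, 2), (1, 3), (2, 3)]
```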

In [None]:
import itertools

In [None]:
# zip example (zip is a built-in, not an itertools function)
list(zip([1, 2, 3], ['a', 'b', 'c']))



* both lists above are iterable
* the zip() function works:
    * by calling iter() on each of its arguments
    * then advancing each iterator returned by iter() with next()
    * and finally aggregating the results into tuples
* the iterator returned by zip() iterates over these tuples
* in general, calling iter() on an iterable returns an iterator object
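The three steps can be written out as a simplified pure-Python stand-in for `zip()`. The name `my_zip` is made up for illustration; the real built-in is implemented in C and differs in details:

```python
def my_zip(*iterables):
    """Simplified pure-Python stand-in for zip() (illustration only)."""
    iterators = [iter(it) for it in iterables]   # step 1: iter() on each argument
    if not iterators:
        return
    while True:
        items = []
        for iterator in iterators:
            try:
                items.append(next(iterator))     # step 2: advance with next()
            except StopIteration:
                return                           # shortest input exhausted: stop
        yield tuple(items)                       # step 3: aggregate into a tuple

print(list(my_zip([1, 2, 3], ['a', 'b', 'c'])))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```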



In [None]:
type(iter([1,2,3,4]))

In [None]:
# map() returns an iterator as well
list(map(len, ['abc', 'de', 'fghi']))

* as an example of iterator algebra:
    * compose zip() and map() to produce an iterator over combinations of elements in more than one iterable

In [None]:
list(map(sum, zip([1, 2, 3], [4, 5, 6])))

* itertools is a collection of building blocks that can be combined to form specialized “data pipelines”
* advantages:
    * improved memory efficiency (via lazy evaluation)
    * faster execution time
* look at the following example



In [None]:
def naive_grouper(inputs, n):
    num_groups = len(inputs) // n
    return [tuple(inputs[i*n:(i+1)*n]) for i in range(num_groups)]

In [None]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
naive_grouper(nums, 2)



* what if we pass a list of a million elements?
    * we will need a whole lot of available memory
* iterators save the day

In [None]:
def better_grouper(inputs, n):
    # creates a list of n references to the same iterator
    iters = [iter(inputs)] * n
    return zip(*iters)



SIDENOTE on asterisk (`*`) use cases:

* multiplication and power (`*`, `**`)
* repeating (extending) list-type containers
* unpacking containers into positional arguments
    * and dictionaries into keyword arguments (`**`)
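Each of these uses in one short sketch (all values are arbitrary examples):

```python
# multiplication and power
print(3 * 4, 3 ** 4)            # 12 81

# repeating a list: the result holds references to the SAME objects
row = [0]
grid = [row] * 3
print(grid[0] is grid[2])       # True

# unpacking a container into positional arguments
pairs = [[1, 2, 3], ['a', 'b', 'c']]
print(list(zip(*pairs)))        # [(1, 'a'), (2, 'b'), (3, 'c')]

# unpacking a dict into keyword arguments
options = {'sep': '-', 'end': '!\n'}
print('a', 'b', **options)      # a-b!
```

The second case is exactly what `[iter(inputs)] * n` relies on: the list contains `n` references to one and the same iterator.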



In [None]:
# `iters` is local to better_grouper, so rebuild it here to inspect it
iters = [iter(nums)] * 2
iters

In [None]:
list(id(itr) for itr in iters)  # the same iterator => IDs are the same



* the first element is taken from the "first" iterator
* the "second" iterator now starts at 2 (the second element), since it is just a reference to the "first" iterator, which has already advanced by one step
    * the first tuple produced by zip() is (1, 2)
* at this point, "both" iterators in iters continue at 3, so when zip() pulls 3 from the "first" iterator, it gets 4 from the "second" to produce the tuple (3, 4)
* this process continues until zip() finally produces (9, 10) and "both" iterators in iters are exhausted



In [None]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# an iterator over pairs of corresponding elements of each iterator in iters
list(better_grouper(nums, 2))



* the main advantages:
    * can take any iterable as an argument (even an infinite iterator)
    * by returning an iterator rather than a list, we can process enormous iterables without trouble and use less memory
* what's missing?
    * what if the value passed to the second argument isn't a factor of the length of the iterable in the first argument?



In [None]:
# 10 is missing from the grouped output because zip stops the moment the shortest iterable passed to it is exhausted
list(better_grouper(nums, 3))

* itertools.zip_longest() to the rescue



In [None]:
x = [1, 2, 3, 4, 5]
y = ['a', 'b', 'c']

print(list(zip(x, y)))
list(itertools.zip_longest(x, y,  fillvalue='Filling the void.'))
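Combining the two ideas gives a grouper that keeps the leftover elements. This follows the well-known `grouper` recipe from the itertools documentation in simplified form; the default `fillvalue=None` is an assumption of this sketch:

```python
import itertools

def grouper(inputs, n, fillvalue=None):
    """Group `inputs` into n-tuples, padding the last one with `fillvalue`."""
    iters = [iter(inputs)] * n          # n references to the same iterator
    return itertools.zip_longest(*iters, fillvalue=fillvalue)

nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(list(grouper(nums, 3)))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, None, None)]
```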


* there are plenty of other functions in itertools, have a look at them [here](https://docs.python.org/3/library/itertools.html)
* look at more [itertools recipes](https://docs.python.org/3.6/library/itertools.html#itertools-recipes)

* there are also plenty of other efficient datatypes in the standard library (`collections` module):
    * `namedtuple`: tuple subclasses with named fields
    * `deque`: list-like container with fast appends and pops on either end
    * `Counter`: dict subclass for counting hashable objects
    * `OrderedDict`: dict that retains the order of entries
    * `defaultdict`: dict that calls a factory function to supply missing values

* if the built-in general-purpose dict, list, set and tuple are not enough
    * use the collections module: specialized container datatypes
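A brief sketch of some of these containers in action (the data is made up for illustration):

```python
from collections import Counter, defaultdict, deque, namedtuple

# namedtuple: fields accessible by name as well as by position
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)
print(p.x + p.y)                      # 3

# deque: cheap appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
print(queue.popleft(), queue.pop())   # 0 4

# Counter: count hashable objects
counts = Counter('banana')
print(counts.most_common(2))          # [('a', 3), ('n', 2)]

# defaultdict: missing keys are created by the factory (here: list)
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)
print(dict(groups))                   # {'a': ['apple', 'avocado'], 'b': ['banana']}
```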

* built-in `set` datatype
* `set` methods:
    * `intersection()`
    * `difference()`
    * `symmetric_difference()`
    * `union()`

* fast membership testing:
    * check whether a value exists in a collection (or not) using the `in` operator; for a `set` this lookup is fast
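The set methods and the membership test can be tried out as follows. The timing part is only a rough sketch with arbitrary sizes; no particular numbers are claimed, but the set lookup should come out far ahead of the list scan:

```python
import timeit

a = {1, 2, 3, 4}
b = {3, 4, 5}

# sorted() is used only to get a stable order for printing
print(sorted(a.intersection(b)))           # [3, 4]
print(sorted(a.difference(b)))             # [1, 2]
print(sorted(a.symmetric_difference(b)))   # [1, 2, 5]
print(sorted(a.union(b)))                  # [1, 2, 3, 4, 5]

# membership testing: a list is scanned item by item, a set uses hashing
values_list = list(range(100_000))
values_set = set(values_list)
t_list = timeit.timeit(lambda: 99_999 in values_list, number=100)
t_set = timeit.timeit(lambda: 99_999 in values_set, number=100)
print(f"list: {t_list:.5f} s, set: {t_set:.5f} s")
```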

## Eliminating loops
* using loops is not *necessarily* a bad practice
* looping patterns:
    * `for` loop: iterate over a sequence piece by piece
    * `while` loop: repeat the loop as long as a condition is met
    * nested loops: use one loop inside another loop
        * costly!
* benefits of eliminating loops:
    * fewer lines of code
    * easier to interpret (code readability)
    * "Flat is better than nested."
    
### Eliminating loops with built-ins
* list comprehension
* the `map()` function
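A small sketch of both techniques next to the explicit loop they replace (toy data, chosen only for illustration):

```python
values = [1, 2, 3, 4]

# explicit loop
squares_loop = []
for v in values:
    squares_loop.append(v ** 2)

# list comprehension
squares_comp = [v ** 2 for v in values]

# map() with a built-in function; map() is lazy, so wrap it in list()
rounded = list(map(round, [1.2, 2.7, 3.6]))

# row totals without writing the loop ourselves
rows = [[1, 2, 3], [4, 5, 6]]
row_totals = list(map(sum, rows))

print(squares_comp)   # [1, 4, 9, 16]
print(rounded)        # [1, 3, 4]
print(row_totals)     # [6, 15]
```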

## Writing better loops
* sometimes you can't eliminate the loop
* how to write it better:
    * understand what is being done in each loop iteration (to be sure we are not doing anything unnecessary)
    * anything that can be done once should be moved outside the loop
        * move one-time calculations outside (above) the loop
        * use holistic conversions outside (below) the loop, i.e. convert the whole result at once instead of item by item
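Both pieces of advice in one sketch. The data and variable names are invented for illustration:

```python
heights = [1.82, 1.65, 1.77, 1.90]
names = ['a', 'b', 'c', 'd']

# wasteful: the mean is recomputed in every iteration
above_slow = []
for h in heights:
    mean = sum(heights) / len(heights)
    if h > mean:
        above_slow.append(h)

# better: compute the mean once, above the loop
mean = sum(heights) / len(heights)
above_fast = []
for h in heights:
    if h > mean:
        above_fast.append(h)

# holistic conversion: collect tuples inside the loop,
# convert all of them to lists in one step below the loop
pairs = []
for pair in zip(names, heights):
    pairs.append(pair)
pairs_as_lists = list(map(list, pairs))

print(above_fast)          # [1.82, 1.9]
print(pairs_as_lists[0])   # ['a', 1.82]
```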
    
