# Lecure 10 

by Martin Hronec

Apr 24-25



### Table of contents

0. [Efficient computing](#efficient)
1. [Timing](#timing)
2. [Profiling](#profiling)
    * [Timing](#timing)
    * [Memory usage](#memory)
3. [Generators]()
4. [Iterators]()
5. [itertools]()
6. [Loops]()
7. [Pandas optimization]()

Donald Knuth: "Premature optimization is the root of all evil."

![](https://imgs.xkcd.com/comics/optimization.png)

# Efficient computing

* what is meant by an **efficient Python code**
    * minimal completion time (fast runtime)
    * minimal resource consumption (small memory footprint)


* what is meant by **Pythonic**?
    * focus on readability
    * using Python's constructs as intended
    * the Zen of Python (by Tim Peters)
    
* sometimes at odds with eachother    

* **optimize only what needs optimizing**
    1. Get it right
    2. Test it's right
    3. Profile if slow
    4. Optimise
    5. Repeat from 2.


### Building with build-ins
* Python comes with "batteries included" (the Python Standard Library)
    * they help in building an efficient code
* built-in types: `list,tuple,set,dict`
* built-in functions: `range(), round(), enumerate(), map(), zip()`
* built-in modules: `os, sys, itertools, collections, sets`, etc. 

* let's look at `map()` with lambda (anonymous function) as an example
    * applies a function over an object
    
    
**Lambda expression**
* we use `def` keyword when creating functions in Python, binding them to a name
* sometimes,  we may want to declare a function anonymously (or only use it just once)
    * defining function is a bit extra 
* Lambda expresion is not a statement, i.e. it returns a value

In [54]:
# using lambda keyword to declare it 
lambda x:x-2

<function __main__.<lambda>(e)>

* `x` is the argument and `x-2`is the expression
* 

In [56]:
# pass an argument along with the declaration.
(lambda x:x-2)(1)

-1

In [49]:
l = ['abc', 'de', 'fghi']

# using in-built function len and map
list(map(len, l))

[3, 2, 4]

In [57]:
list(map(lambda x: len(x), l))

[3, 2, 4]

In [52]:
list(map(lambda x: x.__len__(), l))

[3, 2, 4]

### Examining runtime (timing and profiling code)
* why time code?
    * allows us to pick the optimal coding implementation
* we have already seen some ipython magic earlier (`%timeit` and `%%time`)
* there is a [time](https://docs.python.org/3/library/time.html|) module for scripts (not interactive)

In [70]:
# the simplest example
import time
import random

start = time.time()
a = range(100000)
b = []
for i in a:
    b.append(i + i*random.random())
end = time.time()
print(end - start)

0.025999784469604492


* **timeit** module measures an execution time of small code snippets
* calculate runtime with IPython magic command `%timeit` (magic command, use `%lsmagic` to see other options)
    * set the number of runs (-r) and/or loops (-n)
    * use `%%` in front of timeing for cell magic mode
    * save the output to a variable (`-o`)

In [82]:
# more interesting
import numpy as np

# number of runs: 5, number of loops: 10
%timeit -r5 -n10 rand_nums = np.random.permutation(1000000)

23.6 ms ± 1.07 ms per loop (mean ± std. dev. of 2 runs, 10 loops each)


In [83]:
# save timeit results into a variable 
timing_results =  %timeit -o rand_nums = np.random.permutation(1000000)

22.1 ms ± 191 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [89]:
timing_results.__dict__

{'loops': 10,
 'repeat': 7,
 'best': 0.021907293199910784,
 'worst': 0.022547730899896125,
 'all_runs': [0.22199343600004795,
  0.21907293199910782,
  0.2197177259986347,
  0.22148625399859156,
  0.2210496810002951,
  0.22103195400268305,
  0.22547730899896123],
 'compile_time': 4.116399941267446e-05,
 '_precision': 3,
 'timings': [0.022199343600004796,
  0.021907293199910784,
  0.021971772599863472,
  0.022148625399859158,
  0.022104968100029508,
  0.022103195400268304,
  0.022547730899896125]}

In [90]:
%%timeit
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

421 ms ± 3.12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


* when repeating operation is not a good idea, use `%time`
* `%timeit` does heavy lifting for you, i.e. no garbage collection, etc. (things that might affect the timing)

In [98]:
a = np.random.permutation(1000000)

In [99]:
%time a.sort()

Wall time: 65 ms


In [100]:
%timeit a.sort()

6.41 ms ± 123 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


## Code profiling for runtime
* detailed stats on frequency and duration of function calls
* line-by-line analyses
* we will used standar package [line_profiler](https://github.com/rkern/line_profiler)

In [104]:
%load_ext line_profiler

In [105]:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

In [106]:
%lprun -f sum_of_lists sum_of_lists(5000)

Timer unit: 3.00463e-07 s

Total time: 0.00543148 s
File: <ipython-input-105-f105717832a2>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
     1                                           def sum_of_lists(N):
     2         1         10.0     10.0      0.1      total = 0
     3         6         16.0      2.7      0.1      for i in range(5):
     4         5      17440.0   3488.0     96.5          L = [j ^ (j >> i) for j in range(N)]
     5         5        609.0    121.8      3.4          total += sum(L)
     6         1          2.0      2.0      0.0      return total

* shows where the is the time bottle-neck
    * *Time* is in microseconds
* for more info on available timing options: `%lprun?`

## Code profiling for memory usage
* code's memory footprint (the amount of memory an operation uses)
* using package *memory_profiler* (also an IPyhton extension)
* contains 2 notably useful magic functions:
    * `%memit`: memory-measuring equivalent of `%timeit`
    * `%mprun`: memory-measuring equivalent of `%lprun` 

In [112]:
# package for memory profiling
%load_ext memory_profiler

In [113]:
# some in-built option: only a size of an individual object
sys.getsizeof([*range(1000)])

9112

* for line-by-line description: `%mprun`
    * gives us a summary of the memory usage 
    * CON: works only for functions defined separate modules rather than the nb itself
* use `%%file` magic to create a simple module, which will contain code we want to profile

In [195]:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ** (j+i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total


Overwriting mprun_demo.py


In [197]:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)




Filename: C:\Users\Analyst1\PyProjects\phd\DPP_IES\10\mprun_demo.py

Line #    Mem usage    Increment   Line Contents
     1     79.0 MiB     79.0 MiB   def sum_of_lists(N):
     2     79.0 MiB      0.0 MiB       total = 0
     3     79.9 MiB      0.0 MiB       for i in range(5):
     4    118.0 MiB      0.8 MiB           L = [j ** (j+i) for j in range(N)]
     5    118.0 MiB      0.0 MiB           total += sum(L)
     6     79.9 MiB      0.0 MiB           del L # remove reference to L
     7     79.0 MiB      0.0 MiB       return total

* the *Increment* column gives us how much each line affects the total memory budget
* for more info: `%memit?` and `%mprun?`:

# Gaining efficiencies

## Generators
* functions that can be paused and return on the fly (returning **an iterator**)
* they are lazy!
    * produce items one at time
    * and only when asked
* much more efficient when working with big data
    * exampe: Torch DataLoader
* define function as you normally would, only replace `return` statement with the `yield` statement
* the `yield` statement pauses the function and saves the local state (so that it can be resumed right where it left off)
    * it returns a generator object which is used to control execution

* use `next()` function to access the elements in the generator (until there are no more values in the generator)

In [138]:
def name_generator():
    yield('Vitek')
    yield('Lenka')

names = name_generator()

In [148]:
?next

[1;31mDocstring:[0m
next(iterator[, default])

Return the next item from the iterator. If default is given and the iterator
is exhausted, it is returned instead of raising StopIteration.
[1;31mType:[0m      builtin_function_or_method


In [139]:
next(names)

'Vitek'

In [140]:
next(names)

'Lenka'

In [141]:
next(names)

StopIteration: 

In [118]:
def countdown(num):
    print('Starting')
    while num > 0:
        yield num
        num -= 1

In [124]:
# calling the function (generator) does not execute it
val = countdown(5)
val

<generator object countdown at 0x0000000007ACFA20>

* generators objects execute when `next()` is called
* `next()` for the first time: execution begins at the start of the function body and continues until the next yield statement where the value to the right of the statement is returned
* subsequent calls to next() continue from the yield statement to the end of the function, and loop around and continue from the start of the function body until another yield is called
* if yield is not called (which in our case means we don’t go into the if function because num <= 0) a StopIteration exception is raised

In [142]:
next(val)

TypeError: 'str' object is not an iterator

In [143]:
print(next(val))
print(next(val))
print(next(val))
print(next(val))

TypeError: 'str' object is not an iterator

In [144]:
next(val)

TypeError: 'str' object is not an iterator

* generator expressions
    * just like list comprehension, generators can be used in the same manner
* the *parens* on either side of the second line denoting a generator expression, which, for the most part, does the same thing that a list comprehension does, but does it lazily

In [145]:
my_list = ['a','b','c','d']

gen_obj = (x for x in my_list)
for val in gen_obj:
    print(val)

a
b
c
d


In [146]:
%timeit g = (i * 2 for i in range(100) if i % 3 == 0 or i % 5 == 0)
print(sys.getsizeof(g))

432 ns ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
120


In [147]:
%timeit l = [i * 2 for i in range(100) if i % 3 == 0 or i % 5 == 0]
print(sys.getsizeof(l))

8.32 µs ± 171 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
38216


* don't mix up the syntax of a list comprehension with a generator expression (`[]` vs `()`)
    * generator expressions can run *slower* (because of the overhead of function calls)
    * modify parameter in range to see the effect
* however, **generator expressions are drastically faster when the size of your data is larger than the available memory**
    * if possible we will look at the advanced parallelism solutions using [dask](https://dask.org/), 

## Iterators

* considered to be a "milestone" for any serious Pythonista
* but how does loop constructs work behind the scenes?
    * so called *Python's iterator* protocol:
        * objects that support the `__iter__` and `__next__` or `__getitem__` dunder (alias for double underscore) methods automatically work with for-in loops.
* for detailed explanation of what is an iterable, look at [Python 3 glossary](https://docs.python.org/3/glossary.html#term-iterable)
* also every generator is an iterator, but not vice versa
* below is a class with bare-bones iterator protocol in Python 

In [158]:
for container in [list, tuple, dict]:
    container_attributes = dir(container)
    print(
        '__iter__' in list_attributes and
        ('__getitem__'  in list_attributes or '__next__' in list_attributes)
    )

True
True
True


In [159]:
# includes __iter__ and __next__ method
class Repeater:
    def __init__(self, value):
        self.value = value

    def __iter__(self):
        return self

    def __next__(self):
        return self.value

* methods `__iter__` and `__next__` are key to make a Python object iterable
* cell below prints 'Hello'indefinitly
    * just calmly interupt the kernel, everything is fine

In [160]:
repeater = Repeater('Hello')
# working iterator! will run forever :)
for item in repeater:
    print(item);

Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hell

KeyboardInterrupt: 

In [4]:
# more explicit of what goes on behind the scene
repeater = Repeater('Hello')
iterator = repeater.__iter__()
while True:
    item = iterator.__next__()
    print(item)

Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hell

KeyboardInterrupt: 

* prepares the iterator object (by calling an `__iter__` method)
* loop repeatedly calls the `__next__` method
* because there’s never more than one element in flight", this approach is highly memory-efficient
* our Repeater class provides an infinite sequence of elements and we can iterate over it just fine
    * not possible using lists!
* iterators provide an interface for processing every element of a container
    * while being completely isolated from the container’s internal structure! 
* we can replace the calls to `__iter__` and  `__next__` with calls to Python's built-in functions `iter()` and `next()`
    * other Python's built-in functions with the same purpose of a clean syntax, e.g. `len(x)` for `x.__len__` 

* who want's to iterate forever?
    * how to write an iterator that eventually stops generating? 
* *StopIteration exception* to signal we’ve exhausted all of the available values in the iterator
    * use exceptions for control of iterators flow

In [161]:
my_list = [1,2,3]
iterator = iter(my_list)
next(iterator)
next(iterator)
next(iterator)

3

In [162]:
next(iterator)

StopIteration: 

In [163]:
# if I keep requesting more values
next(iterator)

StopIteration: 

* iterators can't be reset ("once they’re exhausted they’re supposed to raise StopIteration every time  next() is called on them")
* we can implement the above notions into our code
    *  Iteration stops after the number of repetitions defined in the max_repeats parameter

In [165]:
class BoundedRepeater:
    def __init__(self, value, max_repeats):
        self.value = value
        self.max_repeats = max_repeats
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max_repeats:
            raise StopIteration
        self.count += 1
        return self.value

In [166]:
# for-in loop
repeater = BoundedRepeater('Hello', 3)
iterator = iter(repeater)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break
    print(item)

Hello
Hello
Hello


* iterators in Python 2.x are syntactically different, so if you need the compatibility, look at it

## itertools

* a collection of tools for handling iterators
    * iterators are data types that can be used in a `for` loop
    * the most common iterator in Python is the list
* so called 'iterator algebra'
* the functions in itertools "operate" on iterators to produce more complex iterators, e.g.
    * infinite iterators: count, cycle, repeat
    * finite iterators: accumulate, chain, zip_longest
    * combination generators: product, permutations, combination 

In [167]:
import itertools

In [169]:
# zip example (zip is in-built, not an itertool method)
list(zip([1, 2, 3], ['a', 'b', 'c']))

[(1, 'a'), (2, 'b'), (3, 'c')]

* both lists above are iterable
* `zip()` function works:
    * by calling `iter()` on each of its arguments
    * then advancing each iterator returned by `iter()` with `next()`
    * and finally aggregating the results into tuples
* the iterator returned by zip() iterates over these tuples
* in general, when `iter()` called on an iterable, returns an iterator object

In [170]:
type(iter([1,2,3,4]))

list_iterator

* we have already seen another iterator operator above

In [11]:
list(map(len, ['abc', 'de', 'fghi']))

[3, 2, 4]

* as an example of an iterator algebra
    * compose `zip()` and `map()` to produce an iterator over combinations of elements in more than one iterable

In [175]:
list(map(sum, zip([1, 2, 3], [4, 5, 6])))

[5, 7, 9]

* a collection of building blocks that can be combined to form specialized “data pipelines”
* advantages: 
    * improved memory efficiency (via **lazy evaluation**)
    * faster execution time
* look at the following example

In [177]:
def naive_grouper(inputs, n):
    num_groups = len(inputs) // n
    return [tuple(inputs[i*n:(i+1)*n]) for i in range(num_groups)]

In [176]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
naive_grouper(nums, 2)

[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

* what if we pass a list of million elements?
    * we will need a whole lot of available memory
* **iterators** to save the day

In [174]:
def better_grouper(inputs, n):
    # creates a list of n references to the same iterator
    iters = [iter(inputs)] * n
    return zip(*iters)

SIDENOTE on *asterisk* (`*`) usecases: 
* multiplication and power
* extending the list-type containers
* unpacking the containers
    * keyword arguments

In [178]:
iters = [iter(nums)] * 2
list(id(itr) for itr in iters)  # the same iterator => IDs are the same

[132172152, 132172152]

* the first element  is taken from the "first" iterator
* the "second iterator now starts at 2"(the second element), since it is just a reference to the "first" iterator, i.e. advanced by one step
    * the first tuple produced by `zip()` is `(1, 2)`
* "both" iterators in iters start at 3, so when `zip()` pulls 3 from the "first" iterator, it gets 4 from the "second" to produce the tuple `(3, 4)`
* this process continues until zip() finally produces (9, 10) and "both" iterators in iters are exhausted

In [198]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# an iterator over pairs of corresponding elements of each iterator in iters
list(better_grouper(nums, 2))

[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]

* the main advantages:
    * can take an iterable as in argument (even infinite iterator)
    * by returning an iterator rather than a list, we can process enormous iterables without trouble and use less memory
* what's missing?
    * what if the value passed to the second argument isn't a factor of the length of the iterable in the first argument?

In [188]:
# 10 is missing from the grouped output because zip stops the moment the shortest iterable passed to it is exhausted
list(better_grouper(nums, 3))

[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

* `itertools.zip_longest()` to the rescue

In [199]:
x = [1, 2, 3, 4, 5]
y = ['a', 'b', 'c']

print(list(zip(x, y)))
list(itertools.zip_longest(x, y,  fillvalue='Filling the void.'))

[(1, 'a'), (2, 'b'), (3, 'c')]


[(1, 'a'),
 (2, 'b'),
 (3, 'c'),
 (4, 'Filling the void.'),
 (5, 'Filling the void.')]

* there are plenty of other methods in itertools, have a look at them [here](https://docs.python.org/3/library/itertools.html)
* look at more [itertools recipes](https://docs.python.org/3.6/library/itertools.html#itertools-recipes)

* if the built-in general purpose dict, list, set and tuple are not enough
    * use [collections module](https://docs.python.org/3/library/collections.html): specialized container datatypes
* notable:
    * namedtuple: tuple subclasses with named fields
    * deque: list-like container with fast appends and pop
    * Counter: dict for counting hashable objects
    * OrderDict: dict that retains order of entries
    * defaultdict: dict that calls a factory function to supply missing values

## Eliminating loops
* using loops is not *necessary* a bad practice
* looping patterns:
    * `for` loop: iterate over sequence piece-by-piece
    * `while` loop: repeat loop as long as condition is met
    * "nested loops: use one loop inside another loop
* Costly!
* benefits of eliminating loops:
    * fewer lines of code
    * easier to interpret (code readability)
    * "Flat is better than nested."
    
## Writing better loops
* sometimes you can't eliminate the loop
* how to do it better
    * understand what is being done with each loop iteration (to be sure we are not doing anything unnecessary)
    * anything that can be done once, move it outside the loop
        * move one-time calculation outside (above) the loop
        * use all-encompassing conversions outside (below) the loop
    
# Basic Pandas optimization

* unjustified reputation of Pandas for being slow 
    * what may look like a Pythonic code, can be suboptimal in Pandas (with regards to efficiency)
* Pandas alone will never reach the calculation speed of fully optimized raw C code
    * more on Pandas performance enhancement [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html)
        * Cython
        * Numba
* Pandas is designed for vectorized operations (like NumPy)
    * operation on entire columns or datasets in one sweep
* from slowest to fastest:
    * crude looping over DataFrame rows using indices
        * does not take advantage of any built-in optimizations
    * looping with `iterrows()`
        * a generator that iterates over the rows of the dataframe returning the index of each row
    * looping with `apply()`
        * still inheretly looping but taking advantage of a number of internal optimizations, e.g. iterators in Cython
    * vectorization with pd series
        * Pandas includes a generous collection of vectorized functions
        * the built-in functions are optimized to operate specifically on Pandas series and DataFrames
            * using vectorized Pandas functions is almost always preferable to accomplishing similar ends with custom-written looping
    * vectorization with np arrays
        * Numpy leaves out a lot of overhead present in Pandas series, e.g. indexing, data type checking, etc.
        * operations on NumPy arrays generally faster than Pandas