# Intermediate Python

## Iterators

An iterator is an object that enables a programmer to traverse a data container (similar to a database cursor).
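
As a minimal sketch of what "traversing" means in Python, the built-in functions `iter()` and `next()` can be called by hand on a list; a `for` loop performs the same steps automatically. The variable names are only illustrative.

```python
# Sketch of the iterator protocol that a `for` loop drives for us.
values = [1, 3, 2]
it = iter(values)          # ask the container for an iterator

first = next(it)           # 1
second = next(it)          # 3
third = next(it)           # 2

try:
    next(it)               # nothing left to return
    exhausted = False
except StopIteration:
    exhausted = True       # this is how a `for` loop knows when to stop

print(first, second, third, exhausted)
```

The iterator keeps track of the current position, much as a database cursor does, so the container itself does not need to.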


In [1]:
for value in [1,3,2]:
    print(value)

1
3
2


Iterators are useful in several situations:

* To process data structures with no or slow random access, such as trees or on-disk data.

* To provide a consistent way to iterate over data structures of all kinds, which makes code more readable, more reusable, and less sensitive to changes in the data structure.
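
Both points can be illustrated with a small sketch. The `Node` class and the `total` function are hypothetical names invented for this example; `__iter__` is written with `yield`, which the Generators section covers.

```python
class Node:
    """A binary tree node (hypothetical example class)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __iter__(self):
        # In-order traversal: left subtree, this node, right subtree.
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right


def total(container):
    """Sum the items of any iterable, whatever its internal layout."""
    result = 0
    for item in container:
        result += item
    return result


tree = Node(3, left=Node(1, right=Node(2)), right=Node(5))
in_order = list(tree)

print(in_order)
print(total(tree), total([1, 2, 3, 5]))
```

The same `total` function works on the tree and on a plain list, because it relies only on iteration and not on indexing.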


A typical use of an iterator is to create lists with a list comprehension:

In [2]:
[i for i in range(10) if i % 3]

[1, 2, 4, 5, 7, 8]
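
For readers new to the syntax, the comprehension above is roughly equivalent to the explicit loop sketched here. The filter `if i % 3` keeps a value whenever the remainder is non-zero, since `0` is falsy in Python.

```python
result = []
for i in range(10):
    if i % 3:              # remainder 1 or 2 is truthy; remainder 0 is falsy
        result.append(i)

print(result)
```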

In [3]:
import numpy as np
np.array([i for i in range(10) if i % 3], dtype='int32')

array([1, 2, 4, 5, 7, 8], dtype=int32)

## Generators

Iterators can be used to create other structures without an intermediate container.  Let's see how we can use a generator (one implementation of an iterator) to do this:

In [4]:
(i for i in range(10) if i % 3)

<generator object <genexpr> at 0x7f9d9f8e38e0>

In [5]:
np.fromiter((i for i in range(10) if i % 3), dtype='int32')

array([1, 2, 4, 5, 7, 8], dtype=int32)

Note that in this case no intermediate list is needed to create a NumPy array from the iterator.
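
The difference between the two forms can be sketched with the standard library alone. The size `n` is an arbitrary choice for illustration, and `sys.getsizeof` reports only the size of the container object itself, not of the integers it refers to.

```python
import sys

n = 100_000
as_list = [i for i in range(n) if i % 3]   # all values stored at once
as_gen = (i for i in range(n) if i % 3)    # values produced on demand

list_size = sys.getsizeof(as_list)   # grows with n
gen_size = sys.getsizeof(as_gen)     # small, and independent of n

# Nothing has been computed yet; each next() call produces one value.
first_two = [next(as_gen), next(as_gen)]

print(first_two)
print(gen_size < list_size)
```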

### Example

In [6]:
# Generate a range of floating point numbers 
def frange(fmin,fmax,divisions):
    delta = (fmax - fmin) / divisions
    x = fmin
    for i in range(divisions):
        yield x
        x += delta

In [7]:
[f for f in frange(0,5,10)]   # why the result?

[0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]

In [8]:
[f for f in frange(0.,5,10)]  # the intended outcome

[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
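
In the first call, `x` starts out as the integer `fmin` and only becomes a float after the first `x += delta`, which is why the first element is `0` rather than `0.0`. One possible variant, sketched here under the hypothetical name `frange_stable`, converts its bounds to floats and computes each value from the loop index instead of accumulating it.

```python
def frange_stable(fmin, fmax, divisions):
    """Like frange, but always yields floats (hypothetical variant)."""
    fmin = float(fmin)
    delta = (float(fmax) - fmin) / divisions
    for i in range(divisions):
        # Compute from the index instead of accumulating x += delta,
        # so rounding errors do not build up across iterations.
        yield fmin + i * delta


values = list(frange_stable(0, 5, 10))
print(values)
```

With this version, integer and float arguments give the same list of floats.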

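Generators typically need less memory than list comprehensions, because no intermediate list is built.  They are not necessarily faster, though; in the timings below both versions take about the same time: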

In [9]:
%time arr2 = np.fromiter((f for f in frange(0.,5.,int(1e6))), dtype='f8')

CPU times: user 276 ms, sys: 0 ns, total: 276 ms
Wall time: 277 ms


In [10]:
%time arr1 = np.fromiter([f for f in frange(0.,5.,int(1e6))], dtype='f8')

CPU times: user 252 ms, sys: 12 ms, total: 264 ms
Wall time: 266 ms


Still, the fastest option is usually to rely on compiled C code (if you can!), here through NumPy:

In [11]:
%time arr3 = np.linspace(0, 5, int(1e6), endpoint=False)

CPU times: user 8 ms, sys: 12 ms, total: 20 ms
Wall time: 19.5 ms


In [12]:
np.allclose(arr1, arr2), np.allclose(arr1, arr3)

(True, True)

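Generators are also useful in reductions, where they avoid building a temporary list (the gain is in memory rather than in speed, as the timings show):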

In [13]:
%timeit sum([f for f in frange(0.,5.,int(1e6))])

1 loop, best of 3: 137 ms per loop


In [14]:
%timeit sum((f for f in frange(0.,5.,int(1e6))))

1 loop, best of 3: 156 ms per loop
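
Reductions that can stop early benefit even more, because a generator only produces the values that are actually requested. The sketch below uses a hypothetical `numbers` generator that records every value it hands out, to show how few items `any()` consumes.

```python
consumed = []

def numbers():
    """Yield 0, 1, 2, ... while recording each value that is requested."""
    for i in range(1_000_000):
        consumed.append(i)
        yield i

# any() stops at the first true element, so only 0..4 are ever generated.
found = any(n > 3 for n in numbers())

print(found, len(consumed))
```

With a list comprehension in place of the generator expression, all one million values would be produced before `any()` looked at the first one.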


### Exercise

Study the generator below and suggest an efficient way to render the image with NumPy/matplotlib (try to avoid reading the solution!)

In [45]:
# Test a given x,y coordinate to see if it's a member of the set
def in_mandelbrot(x0, y0, n):
    x = 0.
    y = 0.
    while n > 0:
        xtemp = x * x - y * y + x0
        y = 2 * x * y + y0
        x = xtemp
        n -= 1
        if x * x + y * y > 4:
            return False
    return True

# Generate a range of floating point numbers 
def frange(fmin, fmax, divisions):
    delta = (fmax - fmin) / divisions
    x = fmin
    for i in range(divisions):
        yield x
        x += delta

# Generate all of the pixels of the mandelbrot set.  The output of
# this function is a flat sequence of True/False values, one per
# pixel in row-major order, indicating whether or not each point is
# a member of the set. Note: This is using generators to produce all
# of the pixels without ever allocating a huge array of pixels in
# memory.
def generate_mandel(xmin=-2.0, ymin=-1.5, width=3.0, height=3.0, pixels=128, n=400):
    for y in frange(ymin, ymin + height, pixels):
        for x in frange(xmin, xmin + width, pixels):
            yield in_mandelbrot(x, y, n)


### Solution

## Context managers

Python’s `with` statement was introduced in Python 2.5 (2006). It’s handy when you have two related operations which you’d like to execute as a pair, with a block of code in between. The classic example is opening a file, manipulating the file, then closing it:

In [52]:
# The context manager is here:
with open('output.txt', 'w') as f:
    f.write('Hi there!')
! cat 'output.txt'

Hi there!

Note how the `f` handle is open only within the scope of the context manager.  Once the block ends, the file is closed, although the name `f` still refers to the (now closed) file object:

In [53]:
f

<_io.TextIOWrapper name='output.txt' mode='w' encoding='UTF-8'>
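
The `closed` attribute of the file object makes this visible. The sketch below also shows the `try`/`finally` pattern that the `with` statement roughly replaces; it writes into a temporary directory (an assumption made here so that no files are left behind) instead of `output.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'output.txt')

    with open(path, 'w') as f:
        f.write('Hi there!')
        open_inside = not f.closed    # True: still open inside the block
    closed_after = f.closed           # True: closed on leaving the block

    # Roughly what the `with` statement does for us:
    g = open(path, 'a')
    try:
        g.write(' Bye!')
    finally:
        g.close()                     # runs even if write() raises

    with open(path) as h:
        content = h.read()

print(open_inside, closed_after, g.closed)
print(content)
```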

I will not go into the details of how to implement them, but I don't want you to be scared when you see one in the next lectures.

## Decorators

*Using* decorators is easy! ...but writing them can be complicated.  I'll concentrate on usage, but the concept is powerful, so you may want to get more info about writing them.
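
As a small sketch of the idea, a decorator is a callable that takes a function and returns a replacement for it. The `logged` decorator and the `add` function are hypothetical names made up for this illustration.

```python
import functools

def logged(func):
    """Hypothetical decorator: record each call's arguments, then call func."""
    @functools.wraps(func)            # keep func's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls.append(args)
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@logged
def add(a, b):
    "Return a + b."
    return a + b

# The @logged line above is shorthand for: add = logged(add)
print(add(1, 2), add(3, 4))
print(add.calls)
print(add.__name__)
```

From the caller's point of view `add` behaves as before; the extra behaviour is attached by the single `@logged` line.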

### Example

In [54]:
import functools

class memoized(object):
    '''Decorator. Caches a function's return value each time it is called.

    If called later with the same arguments, the cached value is returned
    (not reevaluated).
    '''

    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        try:
            # Looking up args hashes them; this raises TypeError if any
            # argument is unhashable (a list, for instance).
            cached = args in self.cache
        except TypeError:
            # uncacheable. better to not cache than blow up.
            return self.func(*args)
        if cached:
            return self.cache[args]
        value = self.func(*args)
        self.cache[args] = value
        return value

    def __repr__(self):
        '''Return the function's docstring (or its repr if it has none).'''
        return self.func.__doc__ or repr(self.func)

    def __get__(self, obj, objtype=None):
        '''Support instance methods.'''
        return functools.partial(self.__call__, obj)

@memoized
def fibonacci(n):
    "Return the nth fibonacci number."
    if n in (0, 1):
        return n
    return fibonacci(n-1) + fibonacci(n-2)

In [55]:
print(fibonacci(12))

144


In [56]:
%time fibonacci(130)

CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 1.5 ms


659034621587630041982498215

In [57]:
%time fibonacci(140)

CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 167 µs


81055900096023504197206408605
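
For comparison, the standard library ships a ready-made memoizing decorator, `functools.lru_cache`. The sketch below applies it to the same recursive definition; the name `fib` is chosen here only to avoid clashing with `fibonacci` above.

```python
import functools

@functools.lru_cache(maxsize=None)    # maxsize=None means an unbounded cache
def fib(n):
    "Return the nth Fibonacci number."
    if n in (0, 1):
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(12))
print(fib.cache_info().currsize)      # number of distinct arguments cached
```

After the call, the cache holds one entry for each `n` from 0 to 12, so every value is computed only once.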

## Packaging for distribution

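Packaging is normally done through [distutils](http://docs.python.org/2/distutils/introduction.html) and is a matter of writing a 'setup.py' file.  (distutils was removed from the standard library in Python 3.12; on recent versions, `setuptools` provides a compatible `setup` function.)  For example, let's suppose that we want to package our new mandelbrot module to send it to a friend: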

In [58]:
%%file mandelbrot.py

# Make a plot of the mandelbrot set
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import cm

# Test a given x,y coordinate to see if it's a member of the set
def in_mandelbrot(x0, y0, n):
    x = 0
    y = 0
    while n > 0:
        xtemp = x * x - y * y + x0
        y = 2 * x * y + y0
        x = xtemp
        n -= 1
        if x * x + y * y > 4:
            return False
    return True

# Generate a range of floating point numbers 
def frange(fmin,fmax,divisions):
    delta = (fmax - fmin)/divisions
    x = fmin
    for i in range(divisions):
        yield x
        x += delta

# Generate all of the pixels of the mandelbrot set.  The output of
# this function is a flat sequence of True/False values, one per
# pixel in row-major order, indicating whether or not each point is
# a member of the set. Note: This is using generators to produce all
# of the pixels without ever allocating a huge array of pixels in
# memory.
def generate_mandel(xmin=-2.0, ymin=-1.5, width=3.0, height=3.0, pixels=128, n=400):
    for y in frange(ymin, ymin + height, pixels):
        for x in frange(xmin, xmin + width, pixels):
            yield in_mandelbrot(x, y, n)

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        npixels = int(sys.argv[1])
    else:
        npixels = 128
    img = generate_mandel(pixels=npixels)
    im = np.fromiter(img, dtype=bool).reshape(npixels, npixels)
    plt.imshow(im, cmap=cm.gray_r)
    plt.show()

Overwriting mandelbrot.py


Now, we can build a minimalistic setup.py for packaging it:

In [59]:
%%file setup.py

from distutils.core import setup
setup(name='mymandel',
      version='1.0',
      py_modules=['mandelbrot'],
      )

Overwriting setup.py


And we can create a tarball easily with:

In [60]:
!python setup.py sdist

running sdist
running check




writing manifest file 'MANIFEST'
creating mymandel-1.0
making hard links in mymandel-1.0...
hard linking mandelbrot.py -> mymandel-1.0
hard linking setup.py -> mymandel-1.0
Creating tar archive
removing 'mymandel-1.0' (and everything under it)


In [61]:
!ls -l dist  # this is put in the 'dist/' directory

total 4
-rw-rw-r-- 1 faltet faltet 1099 oct 16 19:20 mymandel-1.0.tar.gz


Then, your friend has to unpack the tarball and install it on their own system with:

In [50]:
# Don't run the line below, so as not to mess up your environment...
#!python setup.py install

### Exercise

Remember the warnings during the `sdist` task above?  Try to get rid of them.

Hint: have a look at the [manual of distutils](http://docs.python.org/2/distutils/introduction.html).