# Learning Objectives

* Immutability in Python
* List Comprehension
* Generators and Iterators
* Randomness
* enumerate
* zip and argument unpacking
* args and kwargs

## Pre-req

Awesome if you already know everything in: https://www.datacamp.com/courses/intro-to-python-for-data-science

## Python 2 to 3

* More Generators and iterators
* More Laziness

# Immutability in Python

In [5]:
x = [5, 3, 1, 6, 8, 4]

y = sorted(x)

In [6]:
y

[1, 3, 4, 5, 6, 8]

In [7]:
x

[5, 3, 1, 6, 8, 4]

In [8]:
x.sort()

In [9]:
x

[1, 3, 4, 5, 6, 8]

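Member functions vs. provided functions: sorted(x) is a built-in function that returns a new sorted list and leaves x unchanged, while x.sort() is a list method that sorts x in place.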
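The difference between the two calls above leads to a common pitfall: x.sort() returns None, so assigning its result loses the list. A minimal sketch (the names result and t are illustrative, not from the notebook):

```python
x = [5, 3, 1, 6, 8, 4]

# sorted() builds and returns a new list; x is left untouched
y = sorted(x)
print(y)  # [1, 3, 4, 5, 6, 8]
print(x)  # [5, 3, 1, 6, 8, 4]

# list.sort() reorders x in place and returns None
result = x.sort()
print(result)  # None
print(x)       # [1, 3, 4, 5, 6, 8]

# Tuples are immutable, so they have no sort() method; sorted() still works
t = (3, 1, 2)
print(sorted(t))  # [1, 2, 3]
```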

In [10]:
word_counts = {"a": 12, "an": 3, "the": 14, "hello": 2, "big": 3, "data": 7}
word_counts

{'a': 12, 'an': 3, 'big': 3, 'data': 7, 'hello': 2, 'the': 14}

In [16]:
word_counts.items()

dict_items([('hello', 2), ('a', 12), ('an', 3), ('big', 3), ('data', 7), ('the', 14)])

In [35]:
from operator import itemgetter

wc = sorted(word_counts.items(),
           key = itemgetter(1),
           reverse = True)
wc

[('the', 14), ('a', 12), ('data', 7), ('an', 3), ('big', 3), ('hello', 2)]

# List Comprehension

Frequently, you’ll want to transform a list into another list, by choosing only certain elements, or by transforming elements, or both. The Pythonic way of doing this is list comprehensions.

In [23]:
# Select even numbers
evens = [x for x in range(21) if x % 2 == 0]

In [24]:
evens

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

In [26]:
[0 for _ in evens]

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
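The cells above select elements or replace them with a constant. As a small sketch of the other cases mentioned in the introduction (transforming, and filtering plus transforming), with illustrative variable names:

```python
# Transform every element: square each number
squares = [x * x for x in range(5)]
print(squares)  # [0, 1, 4, 9, 16]

# Filter and transform together: squares of the even numbers only
even_squares = [x * x for x in range(5) if x % 2 == 0]
print(even_squares)  # [0, 4, 16]
```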

In [29]:
[(x, y)
  for x in range(3)
  for y in range(3)]

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

Later *for*s can use the result of earlier ones

In [30]:
[(x, y)
  for x in range(3)
  for y in range(x+1, 6)]

[(0, 1),
 (0, 2),
 (0, 3),
 (0, 4),
 (0, 5),
 (1, 2),
 (1, 3),
 (1, 4),
 (1, 5),
 (2, 3),
 (2, 4),
 (2, 5)]

# Generators and Iterators

A problem with lists is that they can easily grow very big. In Python 2, range(1000000) creates an
actual list of 1 million elements (in Python 3, list(range(1000000)) does the same). If you only need
to deal with them one at a time, this can be a huge source of inefficiency (or of running out of memory).
If you potentially only need the first few values, then calculating them all is a waste.

A generator is something that you can iterate over (for us, usually using for) but
whose values are produced only as needed (lazily).
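One rough way to see the cost in Python 3 is to compare the size of a materialized list with the size of a lazy range object. This is only a sketch: sys.getsizeof reports the container's own size (not the integers it refers to), and the exact numbers vary between Python builds, so only the comparison is shown.

```python
import sys

n = 1_000_000
as_list = list(range(n))   # materializes every element up front
as_range = range(n)        # lazy in Python 3: stores only start, stop, step

print(sys.getsizeof(as_list) > sys.getsizeof(as_range))  # True

# Only the first few values are ever produced here
first_three = []
for value in as_range:
    if value >= 3:
        break
    first_three.append(value)
print(first_three)  # [0, 1, 2]
```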

## Creating generators with *yield*

In [36]:
def lazy_range(n):
 """a lazy version of range"""
 i = 0
 while i < n:
   yield i
   i += 1

In [37]:
for i in lazy_range(10):
 print(i)

0
1
2
3
4
5
6
7
8
9


Python 2 comes with a lazy version of range called xrange; in Python 3,
range itself is lazy and xrange no longer exists.

The flip side of laziness is that you can only iterate through a generator once. If you need to iterate through something multiple times, you’ll need to either recreate the generator each time or use a
list.
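A short sketch of what "only once" means in practice, repeating the lazy_range definition so the cell runs on its own (the names first_pass, second_pass, and values are illustrative):

```python
def lazy_range(n):
    """a lazy version of range (same definition as above)"""
    i = 0
    while i < n:
        yield i
        i += 1

gen = lazy_range(3)
first_pass = list(gen)    # consumes the generator
second_pass = list(gen)   # nothing left to yield
print(first_pass)   # [0, 1, 2]
print(second_pass)  # []

# Option 1: recreate the generator for each pass
print(sum(lazy_range(3)), max(lazy_range(3)))  # 3 2

# Option 2: store the values in a list and reuse it
values = list(lazy_range(3))
print(sum(values), max(values))  # 3 2
```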

## Create generators with for comprehensions

A second way to create generators is by using for comprehensions wrapped in parentheses
(a generator expression).

In [39]:
lazy_evens_below_20 = (i for i in lazy_range(20) if i % 2 == 0)

In [40]:
lazy_evens_below_20

<generator object <genexpr> at 0x1042a8468>

* every dict has an items() method; in Python 3 it returns a dict_items view of the key-value pairs (as in the output above) rather than a list
* the view yields the key-value pairs lazily, one at a time, as we iterate over it, so the Python 2 iteritems() method is no longer needed (or available)
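As a sketch of how this looks in Python 3, where items() returns a view rather than a list and there is no iteritems(), using a shortened copy of word_counts (the printed order assumes Python 3.7+, where dicts keep insertion order):

```python
word_counts = {"a": 12, "an": 3, "the": 14}

items = word_counts.items()   # a view object, not a list
print(type(items).__name__)   # dict_items

# Iterating yields the key-value pairs one at a time
for word, count in word_counts.items():
    print(word, count)

# The view reflects later changes to the dict
word_counts["data"] = 7
print(len(items))  # 4

# Build a real list only when you need one (e.g. for indexing)
pairs = list(word_counts.items())
print(pairs[0])  # ('a', 12)
```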

# Randomness

* The importance of a random seed
* We'll use NumPy for this
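To illustrate why the seed matters, here is a minimal sketch using the standard library's random module as a stand-in, so that it runs without extra packages. This is an assumption made for the example; NumPy offers the analogous numpy.random.seed. Seeding with the same value reproduces the same sequence of "random" numbers.

```python
import random

random.seed(10)                       # fix the starting state
first = [random.random() for _ in range(3)]

random.seed(10)                       # reset to the same state
second = [random.random() for _ in range(3)]

print(first == second)  # True
```

Without the second call to seed, the generator would simply continue from where it left off and produce different values.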

# enumerate

Not infrequently, you’ll want to iterate over a list and use both its elements and their indexes.

In [None]:
# not Pythonic
for i in range(len(documents)):
 document = documents[i]
 do_something(i, document)

In [41]:
# also not Pythonic
i = 0
for document in documents:
 do_something(i, document)
 i += 1

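(Run as-is, this cell raises `NameError: name 'documents' is not defined`, because documents and do_something are placeholders that are never defined in this notebook.)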

The Pythonic solution is enumerate, which produces tuples (index, element)

In [None]:
# Pythonic
for i, document in enumerate(documents):
 do_something(i, document)

Similarly, if we just want the indexes

In [None]:
for i in range(len(documents)): do_something(i) # not Pythonic

In [None]:
for i, _ in enumerate(documents): do_something(i) # Pythonic
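A runnable version of the Pythonic pattern, with a hypothetical documents list and a hypothetical do_something that just prints its arguments (both are assumptions for the sake of the example):

```python
documents = ["first doc", "second doc", "third doc"]  # hypothetical data

def do_something(i, document=None):
    """hypothetical placeholder: just report what it was given"""
    print(i, document)

for i, document in enumerate(documents):
    do_something(i, document)

# enumerate accepts a start value if counting should begin elsewhere
numbered = list(enumerate(documents, start=1))
print(numbered)  # [(1, 'first doc'), (2, 'second doc'), (3, 'third doc')]
```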

# zip and Argument Unpacking

* Often we will need to zip two or more lists together
* zip pairs up the corresponding elements of multiple lists into tuples; wrapping the result in list() gives a single list of tuples

In [47]:
list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]
zip(list1, list2)

<zip at 0x1042ad288>

In [48]:
list(zip(list1, list2))

[('a', 1), ('b', 2), ('c', 3)]

* In Python 3, zip() returns a lazy *iterator* that aggregates elements from each of the iterables
* The iterator stops when the shortest input iterable is exhausted
* If you would like to match the length of the longer input, use itertools.zip_longest
    - https://docs.python.org/3/library/itertools.html#itertools.zip_longest
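A small sketch of the difference, using two illustrative lists of unequal length:

```python
from itertools import zip_longest

letters = ['a', 'b', 'c']
numbers = [1, 2]

# zip stops at the shortest input
print(list(zip(letters, numbers)))
# [('a', 1), ('b', 2)]

# zip_longest pads the shorter input with None by default
print(list(zip_longest(letters, numbers)))
# [('a', 1), ('b', 2), ('c', None)]

# fillvalue chooses a different padding value
print(list(zip_longest(letters, numbers, fillvalue=0)))
# [('a', 1), ('b', 2), ('c', 0)]
```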

## Unzip

In [51]:
pairs = [('a', 1), ('b', 2), ('c', 3)]
letters, numbers = zip(*pairs)
print(letters, numbers)

('a', 'b', 'c') (1, 2, 3)


The asterisk performs argument unpacking, which uses the elements of pairs as individual arguments to zip. 
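In other words, zip(*pairs) should behave the same as writing out each pair as a separate argument by hand. A quick check (the names unpacked and explicit are illustrative):

```python
pairs = [('a', 1), ('b', 2), ('c', 3)]

unpacked = list(zip(*pairs))
explicit = list(zip(('a', 1), ('b', 2), ('c', 3)))

print(unpacked == explicit)  # True
print(unpacked)  # [('a', 'b', 'c'), (1, 2, 3)]
```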

In [52]:
def add(a, b): return a + b

In [53]:
add(1, 2)

3

In [54]:
add([1, 2]) 

TypeError: add() missing 1 required positional argument: 'b'

In [56]:
add(*[1, 2])

3

# \*args and \*\*kwargs

Let’s say we want to create a higher-order function that 
* takes as input some function f, and 
* returns a new function that for any input returns twice the value of f

In [59]:
def doubler(f):
    def g(x):
        return 2 * f(x)
    return g

In [61]:
def f1(x):
 return x + 1

g = doubler(f1)
print(g(3))
print(g(-1))

8
0


However, it breaks down with functions that take more than a single argument, because g only accepts one:

In [62]:
def f2(x, y):
 return x + y

g = doubler(f2)
print(g(1, 2))

TypeError: g() takes 1 positional argument but 2 were given

* we need a way to specify a function that takes arbitrary arguments
* we can do this with argument unpacking

In [64]:
def magic(*args, **kwargs):
   print("unnamed args:", args)
   print("keyword args:", kwargs)
    
magic(1, 2, key="word", key2="word2")

unnamed args: (1, 2)
keyword args: {'key2': 'word2', 'key': 'word'}


* args is a tuple of its unnamed arguments
* kwargs is a dict of its named arguments
* It works the other way too, if you want to use a list (or tuple) and dict to supply arguments to a function

In [66]:
def other_way_magic(x, y, z):
    return x + y + z

x_y_list = [1, 2]
z_dict = { "z" : 3 }

print(other_way_magic(*x_y_list, **z_dict))

6


In [68]:
def doubler_correct(f):
   """works no matter what kind of inputs f expects"""
   def g(*args, **kwargs):
     """whatever arguments g is supplied, pass them through to f"""
     return 2 * f(*args, **kwargs)
   return g

g = doubler_correct(f2)
print(g(1, 2))

6
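Because g forwards both \*args and \*\*kwargs, keyword arguments pass through to f as well. A brief sketch, repeating the definitions so the cell runs on its own:

```python
def doubler_correct(f):
    """works no matter what kind of inputs f expects"""
    def g(*args, **kwargs):
        # whatever arguments g is supplied, pass them through to f
        return 2 * f(*args, **kwargs)
    return g

def f2(x, y):
    return x + y

g = doubler_correct(f2)
print(g(1, y=2))    # 6
print(g(x=3, y=4))  # 14
```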
