# Control Flow

*Control flow* is where the rubber really meets the road in programming.
Without it, a program is simply a list of statements that are sequentially executed.
With control flow, you can execute certain code blocks conditionally and/or repeatedly: these basic building blocks can be combined to create surprisingly sophisticated programs!

Here we'll cover *conditional statements* (including "``if``", "``elif``", and "``else``") and *loop statements* (including "``for``" and "``while``" and the accompanying "``break``", "``continue``", and "``pass``").

## Conditional Statements: ``if``-``elif``-``else``
Conditional statements, often referred to as *if-then* statements, allow the programmer to execute certain pieces of code depending on some Boolean condition.
A basic example of a Python conditional statement is this:

In [None]:
x = -15

if x == 0:
    print(x, "is zero")
elif x > 0:
    print(x, "is positive")
elif x < 0:
    print(x, "is negative")
else:
    print(x, "is unlike anything I've ever seen...")

Note especially the use of colons (``:``) and whitespace to denote separate blocks of code.

Python adopts the ``if`` and ``else`` often used in other languages; its more distinctive keyword is ``elif``, a contraction of "else if".

In these conditional clauses, ``elif`` and ``else`` blocks are optional; additionally, you can include as few or as many ``elif`` statements as you would like.

## ``for`` loops
Loops in Python are a way to repeatedly execute some code statement.
So, for example, if we'd like to print each of the items in a list, we can use a ``for`` loop:

In [3]:
for N in [2, 3, 5, 7]:
    print(N, end=' ') # print all on same line

2 3 5 7 

Notice the simplicity of the `for` loop: we specify the variable we want to use, the sequence we want to loop over, and use the "`in`" operator to link them together in an intuitive and readable way. We do not need to deal with a looping index variable (unless we want to!).

* The Python for loop is considered a "foreach" loop in other languages. You can read the above as "for each N in [2,3,5,7]".

The object to the right of the "``in``" can be any Python *iterator*, not just a list.
An iterator can be thought of as a generalized sequence, and we'll discuss them in [Iterators and List Comprehensions](13-Iterators-and-List-Comprehensions.ipynb).

For example, one of the most commonly-used iterators in Python is the ``range`` object, which generates a sequence of numbers:

In [4]:
for i in range(10):
    print(i, end=' ')

0 1 2 3 4 5 6 7 8 9 

Note that the range starts at zero by default, and that by convention the top of the range is not included in the output.
Range objects can also have more complicated values:

In [5]:
# range from 5 to 10
list(range(5, 10))

[5, 6, 7, 8, 9]

In [6]:
# range from 0 to 10 by 2
list(range(0, 10, 2))

[0, 2, 4, 6, 8]

You might notice that the meaning of ``range`` arguments is very similar to the slicing syntax that we covered in [Lists](07-Built-in-Data-Structures.ipynb#Lists).

Note that the behavior of ``range()`` is one of the differences between Python 2 and Python 3: in Python 2, ``range()`` produces a list, while in Python 3, ``range()`` produces an iterable object.
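
As a small illustration of the similarity between ``range`` arguments and slicing (the variable names below are chosen only for this sketch), the same start, stop, and step values pick out the same numbers either way:

```python
# start, stop, and step mean the same thing in range() and in slice notation
numbers = list(range(10))          # [0, 1, ..., 9]

from_range = list(range(2, 9, 3))  # start=2, stop=9 (excluded), step=3
from_slice = numbers[2:9:3]        # the same three values used as a slice

print(from_range)  # [2, 5, 8]
print(from_slice)  # [2, 5, 8]
```

In both cases the stop value itself is left out of the result.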

## ``while`` loops
The other type of loop in Python is a ``while`` loop, which iterates until some condition is met:

In [None]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1

The argument of the ``while`` loop is evaluated as a boolean statement, and the loop is executed until the statement evaluates to False.

## ``break`` and ``continue``: Fine-Tuning Your Loops
There are two useful statements that can be used within loops to fine-tune how they are executed:

- The ``break`` statement breaks out of the loop entirely
- The ``continue`` statement skips the remainder of the current iteration and goes on to the next one

These can be used in both ``for`` and ``while`` loops.

Here is an example of using ``continue`` to print a string of odd numbers.
In this case, the result could be accomplished just as well with an ``if-else`` statement, but sometimes the ``continue`` statement can be a more convenient way to express the idea you have in mind:

In [None]:
for n in range(20):
    # if n is even, skip the rest of this iteration
    if n % 2 == 0:
        continue
    print(n, end=' ')

Here is an example of a ``break`` statement used for a less trivial task.
This loop will fill a list with all Fibonacci numbers up to a certain value:

In [None]:
a, b = 0, 1
amax = 100
L = []

while True:
    (a, b) = (b, a + b)
    if a > amax:
        break
    L.append(a)

print(L)

Notice that we use a ``while True`` loop, which will loop forever unless we have a break statement!

# Defining and Using Functions

So far, our scripts have been simple, single-use code blocks.
One way to organize our Python code and to make it more readable and reusable is to **factor out useful pieces into reusable functions**.
Here we'll cover two ways of creating functions: the ``def`` statement, useful for any type of function, and the ``lambda`` statement, useful for creating short anonymous functions.

## Using Functions

Functions are groups of code that have a name, and can be called using parentheses.
We've seen functions before. For example, ``print`` in Python 3 is a function:

In [None]:
print('abc')

Here ``print`` is the function name, and ``'abc'`` is the function's *argument*.

In addition to arguments, functions can be designed to accept *keyword arguments* that are optional and specified by name.
One available keyword argument for the ``print()`` function (in Python 3) is ``sep``, which tells what character or characters should be used to separate multiple items:

In [None]:
print(1, 2, 3, 4, 5)

In [17]:
print(1, 2, 3, sep='--')

1--2--3


When non-keyword arguments are used together with keyword arguments, the keyword arguments must come at the end.
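
A short sketch of that ordering rule follows. It uses ``print``'s ``end`` and ``file`` keyword arguments in addition to ``sep``; the ``buffer`` variable is only there so the printed text can be inspected afterwards.

```python
import io

# Positional arguments come first; keyword arguments (sep, end, file) come last.
buffer = io.StringIO()             # capture the text so we can look at it
print(1, 2, 3, sep='--', end='!\n', file=buffer)

line = buffer.getvalue()
print(line, end='')                # 1--2--3!

# Placing a positional argument after a keyword argument is a syntax error:
#   print(sep='--', 1, 2, 3)
```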

## Defining Functions
Functions become even more useful when we begin to define our own, organizing functionality to be used in multiple places.
In Python, functions are defined with the ``def`` statement.
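For example, we can define a simple two-argument ``add`` function, a recursive ``myadd`` function that sums the integers from 1 to ``n``, and a function that encapsulates a version of the Fibonacci sequence code from the previous section: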

In [16]:
def add(x, y):
    return x + y

print(add(1, 2))

3


In [18]:
def myadd(n):
    # recursively sum the integers from 1 to n
    if n <= 0:
        return 0
    else:
        return n + myadd(n - 1)

myadd(10)

55

In [19]:
def fibonacci(N):
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

fibonacci(100)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584,
 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811,
 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352,
 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437,
 701408733, 1134903170, 1836311903, 2971215073, 4807526976, 7778742049,
 12586269025, 20365011074, 32951280099, 53316291173, 86267571272,
 139583862445, 225851433717, 365435296162, 591286729879, 956722026041,
 1548008755920, 2504730781961, 4052739537881, 6557470319842,
 10610209857723, 17167680177565, 27777890035288, 44945570212853,
 72723460248141, 117669030460994, 190392490709135, 308061521170129,
 498454011879264, 806515533049393, 1304969544928657, 2111485077978050,
 3416454622906707, 5527939700884757, 8944394323791464, 14472334024676221,
 23416728348467685, 37889062373143906, 61305790721611591,
 99194853094755497, 160500643...

Now we have a function named ``fibonacci`` which takes a single argument ``N``, does something with this argument, and ``return``s a value; in this case, a list of the first ``N`` Fibonacci numbers:

In [20]:
fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

If you're familiar with statically typed languages like ``C``, you'll immediately notice that there is no type information associated with the function inputs or outputs.
Of course, if a function is designed around an integer argument, it is likely an error will occur if it receives a string.
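
To see what such an error looks like, here is a minimal sketch that repeats the ``fibonacci`` definition from above and calls it with a string. The ``try``/``except`` wrapper (a standard Python construct for catching errors) is used here only so that the message can be printed instead of stopping the program.

```python
def fibonacci(N):
    """Return the first N Fibonacci numbers (same logic as the cell above)."""
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

try:
    fibonacci("10")        # a string where an integer is expected
except TypeError as err:
    message = str(err)
    print("TypeError:", message)
```

The failure comes from the comparison ``len(L) < N``, which is not defined between an integer and a string.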

Python functions can return any Python object, simple or compound, which means constructs that may be difficult in other languages are straightforward in Python.
For example, multiple return values are simply put in a tuple, which is indicated by commas:

In [21]:
def real_imag_conj(val):
    return val.real, val.imag, val.conjugate()

r, i, c = real_imag_conj(3 + 4j)
print(r, i, c)

3.0 4.0 (3-4j)


## Default Argument Values

Often when defining a function, there are certain values that we want the function to use *most* of the time, but we'd also like to give the user some flexibility.
In this case, we can use *default values* for arguments.
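Consider the ``fibonacci`` function from before.
What if we would like the user to be able to play with the starting values, and to choose a function ``fun`` (by default the built-in ``max``) that is applied to the list before it is returned?
We could do that as follows: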

In [29]:
def fibonacci(N, a=0, b=1, fun=max):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return fun(L)

def xy(L):
    return L[-1]

fibonacci(10, 0, 1, fun=xy)

55

In [23]:
x = myadd

In [24]:
x(10)

55

With a single argument, the default values are used: the sequence starts from the same values as before, and ``max`` returns the largest of the first ten numbers:

In [None]:
fibonacci(10)

But now we can use the function to explore new things, such as the effect of new starting values:

In [None]:
fibonacci(10, 0, 2, fun=max)

The values can also be specified by name if desired, in which case the order of the named values does not matter:

In [30]:
fibonacci(10, b=3, a=1)

199
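
The same calling patterns can be seen on a smaller function. The ``describe`` helper below is hypothetical and exists only to show defaults, positional overrides, and named arguments side by side:

```python
def describe(name, greeting="Hello", punctuation="!"):
    """Hypothetical helper used only to illustrate default values."""
    return "{}, {}{}".format(greeting, name, punctuation)

default_call = describe("Ada")                # both defaults are used
positional = describe("Ada", "Hi")            # greeting overridden by position
by_name = describe("Ada", punctuation="?", greeting="Hey")  # any order by name

print(default_call)  # Hello, Ada!
print(positional)    # Hi, Ada!
print(by_name)       # Hey, Ada?
```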

## ``*args`` and ``**kwargs``: Flexible Arguments
Sometimes you might wish to write a function in which you don't initially know how many arguments the user will pass.
In this case, you can use the special form ``*args`` and ``**kwargs`` to catch all arguments that are passed.
Here is an example:

In [33]:
def catch_all(*args, **kwargs):
    # plot(x,y,**kwargs)
    print("args =", args)
    print("kwargs = ", kwargs)

In [34]:
catch_all(1, 2, 3, 4, 5, 6, a=4, b=5, c=7, d=10, lw=67)

args = (1, 2, 3, 4, 5, 6)
kwargs =  {'a': 4, 'b': 5, 'c': 7, 'd': 10, 'lw': 67}


In [36]:
catch_all('a', keyword=2)

args = ('a',)
kwargs =  {'keyword': 2}


Here it is not the names ``args`` and ``kwargs`` that are important, but the ``*`` characters preceding them.
``args`` and ``kwargs`` are just the variable names often used by convention, short for "arguments" and "keyword arguments".

The operative difference is the asterisk characters: a single ``*`` before a variable means "expand this as a sequence", while a double ``**`` before a variable means "expand this as a dictionary".
In fact, this syntax can be used not only with the function definition, but with the function call as well!

In [38]:
inputs = (1, 2, 3)
keywords = {'pi': 3.14}

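catch_all(*inputs, **keywords)   # unpacked: three positional arguments and one keyword argument
catch_all(inputs, keywords)      # not unpacked: the tuple and the dict are two positional arguments

args = (1, 2, 3)
kwargs =  {'pi': 3.14}
args = ((1, 2, 3), {'pi': 3.14})
kwargs =  {}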


It is common in many large Python packages, especially matplotlib, to have many functions that take ``**kwargs``.
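
One common reason for accepting ``**kwargs`` is to pass options through to another function without listing them all. The sketch below uses two hypothetical functions, ``draw_line`` and ``draw_thick_line``, that stand in for a plotting routine and a thin wrapper around it; they are not part of matplotlib or any other library.

```python
def draw_line(x, y, color="black", lw=1):
    """Hypothetical stand-in for a plotting routine; it only reports its inputs."""
    return "line through {} and {} (color={}, lw={})".format(x, y, color, lw)

def draw_thick_line(x, y, **kwargs):
    """Forward any extra keyword arguments unchanged to draw_line."""
    kwargs.setdefault("lw", 5)      # supply a default only if the caller gave none
    return draw_line(x, y, **kwargs)

print(draw_thick_line(0, 1))
# line through 0 and 1 (color=black, lw=5)
print(draw_thick_line(0, 1, color="red", lw=2))
# line through 0 and 1 (color=red, lw=2)
```

The wrapper never needs to know which options ``draw_line`` supports; it simply hands the dictionary back with ``**``.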

## Anonymous (``lambda``) Functions
Earlier we quickly covered the most common way of defining functions, the ``def`` statement.
You'll likely come across another way of defining short, one-off functions with the ``lambda`` statement.
It looks something like this:

In [52]:
# def add(x, y):
#     return x + y

add = lambda x, y: x + y    # parameters go before the colon; the returned value goes after it

# add(1, 2)

sorted([(5, 1), (9, 0), (8, 2)], key=lambda x: x[1])

[(9, 0), (5, 1), (8, 2)]

The ``add`` lambda function above is roughly equivalent to

In [53]:
def add(x, y):
    return x + y

add(2,1)

3

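The compact syntax is most useful when a short function is passed as an argument, such as the ``key`` used by ``sorted``:

In [None]:
# sort by year of birth, most recent first
# (data is not defined earlier in this section; these two records are made up for illustration)
data = [{'name': 'A', 'YOB': 1912}, {'name': 'B', 'YOB': 1906}]
sorted(data, key=lambda item: item['YOB'], reverse=True)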

While these key functions could certainly be created by the normal, ``def`` syntax, the compact ``lambda`` syntax is convenient for such short one-off functions like these.
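
To make the comparison concrete, here is a small sketch that sorts the same list once with a ``def`` key function and once with a ``lambda``. The ``records`` list and the ``year_of_birth`` function are made up for this example.

```python
# Illustrative records; the names and years are made up for this sketch.
records = [
    {'name': 'pat', 'YOB': 1990},
    {'name': 'lee', 'YOB': 1975},
    {'name': 'sam', 'YOB': 1983},
]

def year_of_birth(item):
    return item['YOB']

by_def = sorted(records, key=year_of_birth)                 # named key function
by_lambda = sorted(records, key=lambda item: item['YOB'])   # same key, inline

print([r['name'] for r in by_lambda])   # ['lee', 'sam', 'pat']
```

Both calls produce the same ordering; the ``lambda`` version just avoids naming a function that is used once.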

# Classes and Methods

As mentioned before, everything in Python is an object. That means Python code can be organized into classes, allowing objects to neatly bundle up data and associated functionality. Classes allow for more modular and reusable code. Functions attached to objects are referred to as "methods".

Even if you are not writing classes yourself, since everything is an object, it is important to understand the basics if you encounter errors related to classes or class functionality.

## Defining classes and methods

Here is a simple example of a class with an initialization function (all Python initialization functions are named `__init__`) and several methods:

In [None]:
class Dog():
    def __init__(self, name, age, weight):
        self.name = name
        self.age = age
        self.weight = weight
        
        self.location = (0,0)
        self.trajectory = [self.location]

    def howl(self):
        print("I am {}. Hear me roar!!!".format(self.name))

    def walk(self, step=0.1):
        """To update the position of the dog..."""
        x,y = self.location
        self.location = x+step, y+step
        self.trajectory.append(self.location)

    def save(self, prefix):
        filename = "{}_n{name}_a{age}_w{weight}_traj.txt".format(prefix,name=self.name, age=self.age, weight=self.weight)
        print(filename)
        # write to disk in some way

Here we defined a class called `Dog` and gave it several attributes (`name`, `age`, ...) and methods (`howl`, `walk`, ...). The syntax is very simple: methods are defined just like functions, except they are indented within the `class` block.

Each method of the class should begin with an argument called `self`. You can think of this argument as a placeholder for the object that will be created. 


In [None]:
fido = Dog("fido", 2.5, 20)

fido.howl()

Here we create a `Dog` object, giving it a name (`fido`), age (`2.5`), and weight (`20`), which the `__init__` method stores as attributes. We then call the `howl` method. Note that the definition of the method has an argument called `self`, but this is only a placeholder for the object. Now that the object (`fido`) exists, `fido.howl()` is equivalent to `Dog.howl(fido)`, and we do not need to pass `self` ourselves.
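
The role of `self` can be checked directly. The sketch below uses a small hypothetical class, `Tally`, rather than `Dog`, so that the method returns a value we can compare:

```python
class Tally:
    """Hypothetical class, used only to show how self is filled in."""
    def __init__(self, start=0):
        self.value = start

    def increment(self, step=1):
        self.value += step
        return self.value

t = Tally()
first = t.increment()          # usual method call; Python passes t as self
second = Tally.increment(t)    # the same call with self written out explicitly
print(first, second)           # 1 2
```

Both forms act on the same object, which is why the second call continues counting from where the first one stopped.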

# Errors and Exceptions

No matter your skill as a programmer, you will eventually make a coding mistake.
Such mistakes come in three basic flavors:

- *Syntax errors:* Errors where the code is not valid Python (generally easy to fix)
- *Runtime errors:* Errors where syntactically valid code fails to execute, perhaps due to invalid user input (sometimes easy to fix)
- *Semantic errors:* Errors in logic: code executes without a problem, but the result is not what you expect (often very difficult to track-down and fix)

Here we're going to focus on how to deal cleanly with *runtime errors*.
As we'll see, Python handles runtime errors via its *exception handling* framework.

## Tracebacks

Python reports error messages called "tracebacks" (sometimes known as "stack traces"). These messages highlight the portion of code where the error occurred and any code that may have called the erroneous code. This "tracing" backwards through the sequence of code allows you to pinpoint the location of the error and helps guide you to identifying the root cause of the error.

This example (`example_traceback.py`) contains an error:
```Python
# example_traceback.py

def loader(filename):
    fin = open(filenam)

loader("data/result_ab.txt")
```

Here is the traceback that is emitted when running this code. I have added color-coding and annotations to help show how to read tracebacks.

In [58]:

def loader(filename):
    fin = open(filename)

loader("./test_text.txt")

FileNotFoundError: [Errno 2] No such file or directory: './test_text.txt'

![Example Traceback Figure](fig/example_traceback.png)

Googling for parts of tracebacks, such as error messages (after removing any information such as filenames that are unique to your code), is a great way to solve problems.
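
If you want the traceback as text, for example to search for its last line, the standard-library `traceback` module can format it. This is a minimal sketch that reuses the `loader` function and its deliberate typo from `example_traceback.py`; the `report` and `last_line` names are just for illustration.

```python
import traceback

def loader(filename):
    return open(filenam)      # deliberate typo, as in example_traceback.py

try:
    loader("data/result_ab.txt")
except NameError:
    report = traceback.format_exc()   # the full traceback as a string

# The final line names the exception type and carries the error message
last_line = report.strip().splitlines()[-1]
print(last_line)
```

The printed line begins with `NameError` and mentions `filenam`; the earlier lines of `report` list the calls that led there, including the call to `loader`.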


## Runtime Errors

If you've done any coding in Python, you've likely come across runtime errors.
They can happen in a lot of ways.

For example, if you try to reference an undefined variable:

In [59]:
print(Q)

NameError: name 'Q' is not defined

Or if you try an operation that's not defined:

In [60]:
1 + 'abc'

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Or you might be trying to compute a mathematically ill-defined result:

In [61]:
2 / 0

ZeroDivisionError: division by zero

Or maybe you're trying to access a sequence element that doesn't exist:

In [62]:
L = [1, 2, 3]
L[1000]

IndexError: list index out of range

Note that in each case, Python is kind enough to not simply indicate that an error happened, but to spit out a *meaningful* exception that includes information about what exactly went wrong, along with the exact line of code where the error happened.
Although it may not always be as clear as these examples, having access to meaningful errors like this is immensely useful when trying to trace the root of problems in your code.

## Catching Exceptions: ``try`` and ``except``
The main tool Python gives you for handling runtime exceptions is the ``try``...``except`` clause.
Its basic structure is this:

In [1]:
try:
    print("this gets executed first")
except:
    print("this gets executed only if there is an error")

this gets executed first


Note that the second block here did not get executed: this is because the first block did not raise an error.
When the ``try`` block does raise an error, the ``except`` block runs instead:

In [5]:
try:
    a = 10
    b = 0
    print(a / b)   # ZeroDivisionError
except:
    print("Error!")

Error!


Let's put a problematic statement after a ``print`` in the ``try`` block and see what happens:

In [2]:
try:
    print("let's try something:")
    x = 1 / 0 # ZeroDivisionError
except:
    print("something bad happened!")

let's try something:
something bad happened!


Here we see that when the error was raised in the ``try`` statement (in this case, a ``ZeroDivisionError``), the error was caught, and the ``except`` statement was executed.

One way this is often used is to check user input within a function or another piece of code.
For example, we might wish to have a function that catches zero-division and returns some other value, perhaps a suitably large number like $10^{100}$:

In [None]:
def safe_divide(a, b):
    try:
        return a / b
    except:
        return 1E100

In [7]:
safe_divide(1, 2)

0.5

In [8]:
safe_divide(2, 0)

1e+100

In [10]:
safe_divide

<function __main__.safe_divide(a, b)>

There is a subtle problem with this code, though: what happens when another type of exception comes up? For example, this is probably not what we intended:

In [11]:
safe_divide(1, '2')

1e+100

Dividing an integer and a string raises a ``TypeError``, which our code caught and assumed was a ``ZeroDivisionError``!
For this reason, it's nearly always a better idea to catch exceptions *explicitly*:

In [None]:
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return 1E100

In [12]:
safe_divide(1, 0)

1e+100

In [13]:
safe_divide(1, '2')

TypeError: unsupported operand type(s) for /: 'int' and 'str'

We're now catching zero-division errors only, and letting all other errors pass through un-modified.

Try-except blocks are powerful and efficient in Python, and being explicit about the errors you catch helps ensure that other errors are not being skipped over without your knowledge.
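
Being explicit also lets you respond differently to each kind of error. The sketch below is one possible variation, not the only reasonable design: the hypothetical ``checked_divide`` keeps the large-number fallback for division by zero, but reports a ``TypeError`` and returns ``None`` rather than inventing a number.

```python
def checked_divide(a, b):
    """Hypothetical variant of safe_divide with one handler per expected error."""
    try:
        return a / b
    except ZeroDivisionError:
        return 1E100
    except TypeError as err:
        # Report the problem instead of silently returning a number
        print("cannot divide:", err)
        return None

print(checked_divide(1, 2))    # 0.5
print(checked_divide(1, 0))    # 1e+100
print(checked_divide(1, '2'))  # prints the error message, then None
```

Any exception type that is not listed, such as a ``NameError``, still propagates to the caller as usual.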

# Iterators and List Comprehensions

Often an important piece of data analysis is repeating a similar calculation, over and over, in an automated fashion.
For example, you may have a table of names that you'd like to split into first and last, or perhaps of dates that you'd like to convert to some standard format.
One of Python's answers to this is the *iterator* syntax.
We've seen this already with the ``range`` iterator:

In [14]:
for i in range(10):
    print(i, end=' ')

0 1 2 3 4 5 6 7 8 9 

Here we're going to dig a bit deeper.
It turns out that in Python 3, ``range`` is not a list, but is something called an *iterator*, and learning how it works is key to understanding a wide class of very useful Python functionality.

## Iterating over lists
Iterators are perhaps most easily understood in the concrete case of iterating through a list.
Consider the following:

In [None]:
for value in [2, 4, 6, 8, 10]:
    # do some operation
    print(value + 1, end=' ')

The familiar "``for x in y``" syntax allows us to repeat some operation for each value in the list.
The fact that the syntax of the code is so close to its English description ("*for [each] value in [the] list*") is just one of the syntactic choices that makes Python such an intuitive language to learn and use.

But the face-value behavior is not what's *really* happening.
When you write something like "``for val in L``", the Python interpreter checks whether it has an *iterator* interface, which you can check yourself with the built-in ``iter`` function:

In [20]:
iter([2, 4, 6, 8, 10])

<list_iterator at 0x1c12406ba90>

It is this iterator object that provides the functionality required by the ``for`` loop.
The ``iter`` object is a container that gives you access to the next object for as long as there **is** a next object, which can be seen with the built-in function ``next``:

In [16]:
I = iter([2, 4, 6, 8, 10])

In [17]:
print(next(I))

2


In [18]:
print(next(I))

4


In [19]:
print(next(I))

6


What is the purpose of this level of indirection?
Well, it turns out this is incredibly useful, because it allows Python to treat things as lists that are *not actually lists*.
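
To connect ``iter`` and ``next`` back to the ``for`` loop, here is a rough sketch of what a loop over a list amounts to. When no values are left, ``next`` raises the built-in ``StopIteration`` exception, which signals the end of the loop; the variable names are chosen only for this example.

```python
values = [2, 4, 6]
iterator = iter(values)
seen = []

while True:
    try:
        item = next(iterator)   # ask for the next value
    except StopIteration:       # raised when nothing is left
        break
    seen.append(item)           # the "body" of the loop

print(seen)   # [2, 4, 6]
```

A ``for item in values:`` loop does this bookkeeping for you.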

## ``range()``: A List Is Not Always a List
Perhaps the most common example of this indirect iteration is the ``range()`` function in Python 3 (named ``xrange()`` in Python 2), which returns not a list, but a special ``range()`` object:

In [21]:
range(10000)

range(0, 10000)

``range``, like a list, exposes an iterator:

In [22]:
iter(range(10))

<range_iterator at 0x1c125a8be90>

So Python knows to treat it *as if* it's a list:

In [23]:
for i in range(10):
    print(i, end=' ')

0 1 2 3 4 5 6 7 8 9 

The benefit of the iterator indirection is that **the full list is never explicitly created!**
We can see this by doing a range calculation that would overwhelm our system memory if we actually instantiated it (note that in Python 2, ``range`` creates a list, so running the following will not lead to good things!):

In [25]:
N = 10 ** 12
for i in range(N):
    if i >= 10: break
    print(i, end=', ')

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 

If ``range`` were to actually create that list of one trillion values, it would occupy tens of terabytes of machine memory: a waste, given the fact that we're ignoring all but the first 10 values!

In fact, there's no reason that iterators ever have to end at all!
Python's ``itertools`` library contains a ``count`` function that acts as an infinite range:

In [24]:
from itertools import count

for i in count():
    if i >= 10:
        break
    print(i, end=', ')

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 

Had we not thrown in a loop break here, it would go on happily counting until the process is manually interrupted or killed (using, for example, ``ctrl-C``).
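
An alternative to an explicit ``break`` is to take a finite slice of the infinite iterator. This sketch uses ``itertools.islice``, which is part of the same module but is not otherwise discussed in this section:

```python
from itertools import count, islice

# Take only the first ten values from an endless counter
first_ten = list(islice(count(), 10))
print(first_ten)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# count() also accepts a start and a step
evens = list(islice(count(0, 2), 5))
print(evens)       # [0, 2, 4, 6, 8]
```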

## Useful Iterators
This iterator syntax is used nearly universally in Python built-in types as well as the more data science-specific objects we'll explore in later sections.
Here we'll cover some of the more useful iterators in the Python language:

### ``enumerate``
Often you need not only to iterate over the values in a list, but also to keep track of the index.
You might be **tempted** to do things this way:

In [26]:
L = [2, 4, 6, 8, 10]
for i in range(len(L)):
    print(i, L[i])

0 2
1 4
2 6
3 8
4 10


Although this does work, Python provides a cleaner syntax using the ``enumerate`` iterator:

In [27]:
for i, val in enumerate(L):
    print(i, val)

0 2
1 4
2 6
3 8
4 10


This is the more "Pythonic" way to enumerate the indices and values in a list.

### ``zip``
Other times, you may have multiple lists that you want to iterate over simultaneously.
You could certainly iterate over the index as in the non-Pythonic example we looked at previously, but it is better to use the ``zip`` iterator, which zips together iterables:

In [30]:
L = [2, 4, 6, 8, 10]
R = [3, 6, 9, 12, 15]
for lval, rval in zip(L, R):
    print(lval, rval)

2 3
4 6
6 9
8 12
10 15


In [31]:
X = list(zip(L, R))
X

[(2, 3), (4, 6), (6, 9), (8, 12), (10, 15)]

In [34]:
x, y = zip(*X)   # unpack X so that zip regroups the pairs
print(x)
print(y)

(2, 4, 6, 8, 10)
(3, 6, 9, 12, 15)


In [32]:
list(zip(*X))

[(2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]

Any number of iterables can be zipped together, and if they are different lengths, the shortest will determine the length of the ``zip``.
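
A quick sketch of both points, using three made-up sequences of different lengths:

```python
letters = ['a', 'b', 'c', 'd']
numbers = [1, 2]                 # the shortest input
flags = (True, False, True)

pairs = list(zip(letters, numbers, flags))
print(pairs)   # [('a', 1, True), ('b', 2, False)]
```

The result stops after two tuples because ``numbers`` runs out first; the remaining items of the longer inputs are ignored.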

### ``map`` and ``filter``
The ``map`` iterator takes a function and applies it to the values in an iterator:

In [35]:
# find the first 10 square numbers
square = lambda x: x ** 2
for val in map(square, range(10)):
    print(val, end=' ')

0 1 4 9 16 25 36 49 64 81 

The ``filter`` iterator looks similar, except it only passes-through values for which the filter function evaluates to True:

In [36]:
# find values up to 10 for which x % 2 is zero
is_even = lambda x: x % 2 == 0
for val in filter(is_even, range(10)):
    print(val, end=' ')

0 2 4 6 8 
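
Because both ``map`` and ``filter`` return iterators, they can be chained. As a brief sketch (the variable names are illustrative), here the even numbers are selected first and then squared; nothing is computed until the result is consumed by ``list``:

```python
square = lambda x: x ** 2
is_even = lambda x: x % 2 == 0

lazy = map(square, filter(is_even, range(10)))   # no values computed yet
even_squares = list(lazy)                        # consuming it does the work
print(even_squares)   # [0, 4, 16, 36, 64]
```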

### Iterators as function arguments

We saw in the section on ``*args`` and ``**kwargs`` that ``*args`` and ``**kwargs`` can be used to pass sequences and dictionaries to functions.
It turns out that the ``*args`` syntax works not just with sequences, but with any iterator:

In [37]:
print(*range(10))

0 1 2 3 4 5 6 7 8 9


So, for example, we can get tricky and compress the ``map`` example from before into the following:

In [38]:
print(*map(lambda x: x ** 2, range(10)))

#map(lambda x: x ** 2, range(10))

0 1 4 9 16 25 36 49 64 81


Using this trick lets us answer the age-old question that comes up in Python learners' forums: why is there no ``unzip()`` function which does the opposite of ``zip()``?
If you lock yourself in a dark closet and think about it for a while, you might realize that the opposite of ``zip()`` is... ``zip()``! The key is that ``zip()`` can zip-together any number of iterators or sequences. Observe:

In [40]:
L1 = (1, 2, 3, 4)
L2 = ('a', 'b', 'c', 'd')

In [49]:
z = zip(L1, L2)
print(*z)

(1, 'a') (2, 'b') (3, 'c') (4, 'd')


In [42]:
print(*zip(L1,L2,L1))

(1, 'a', 1) (2, 'b', 2) (3, 'c', 3) (4, 'd', 4)


In [43]:
z = zip(L1, L2)
new_L1, new_L2 = zip(*z)
print(new_L1, "  &  ", new_L2)

(1, 2, 3, 4)   &   ('a', 'b', 'c', 'd')


Ponder this for a while. If you understand why it works, you'll have come a long way in understanding Python iterators!

## Specialized Iterators: ``itertools``

We briefly looked at the infinite ``range`` iterator, ``itertools.count``.
The ``itertools`` module contains a whole host of useful iterators; it's well worth your while to explore the module to see what's available.
As an example, consider the ``itertools.permutations`` function, which iterates over all permutations of a sequence:

In [47]:
from itertools import permutations
p = permutations(range(3))
print(*p)

(0, 1, 2) (0, 2, 1) (1, 0, 2) (1, 2, 0) (2, 0, 1) (2, 1, 0)


Similarly, the ``itertools.combinations`` function iterates over all unique combinations of ``N`` values within a list:

In [51]:
from itertools import combinations
c = combinations(range(4), 2)
print(*c)

(0, 1) (0, 2) (0, 3) (1, 2) (1, 3) (2, 3)


Somewhat related is the ``product`` iterator, which iterates over all sets of pairs between two or more iterables:

In [52]:
from itertools import product
p = product('ab', range(3))
print(*p)

('a', 0) ('a', 1) ('a', 2) ('b', 0) ('b', 1) ('b', 2)


Many more useful iterators exist in ``itertools``: the full list can be found, along with some examples, in Python's [online documentation](https://docs.python.org/3.5/library/itertools.html).

## List Comprehensions
If you read enough Python code, you'll eventually come across the efficient construction known as a *list comprehension*.
This is one feature of Python I expect **you will fall in love with** if you've not used it before; it looks something like this:

In [46]:
x = [i for i in range(20) if i % 3 > 0]
{i: str(i) for i in x}


{1: '1',
 2: '2',
 4: '4',
 5: '5',
 7: '7',
 8: '8',
 10: '10',
 11: '11',
 13: '13',
 14: '14',
 16: '16',
 17: '17',
 19: '19'}

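The first line builds a list of the numbers below 20, excluding multiples of 3; the second line is a dict comprehension that maps each of those numbers to its string form, which is the result displayed above.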
While this example may seem a bit dense and confusing at first, as familiarity with Python grows, reading and writing list comprehensions will become second nature.

### Basic List Comprehensions

List comprehensions are simply a way to compress a list-building for-loop into a single short, readable line.
For example, here is a loop that constructs a list of the first 12 square integers:

In [53]:
L = []
for n in range(12):
    L.append(n ** 2)
L

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

The list comprehension equivalent of this is the following:

In [54]:
[n ** 2 for n in range(12)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

As with many Python statements, you can almost read off the meaning of this statement in plain English: "construct a list consisting of the square of ``n`` for each ``n`` from zero up to (but not including) 12".

This basic syntax, then, is ``[``*``expr``* ``for`` *``var``* ``in`` *``iterable``*``]``, where *``expr``* is any valid expression, *``var``* is a variable name, and *``iterable``* is any iterable Python object.


### Multiple Iteration
Sometimes you want to build a list not just from one value, but from two. To do this, simply add another ``for`` expression in the comprehension:

In [56]:
[(i, j) for i in range(2) for j in range(3)]

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

Notice that the second ``for`` expression acts as the interior index, varying the fastest in the resulting list.
This type of construction can be extended to three, four, or more iterators within the comprehension, though at some point code readability will suffer!
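
To check the ordering claim, the comprehension can be compared with the nested loops it stands for. In this sketch the first ``for`` becomes the outer loop and the second becomes the inner one:

```python
pairs_loop = []
for i in range(2):          # first "for" in the comprehension: outer loop
    for j in range(3):      # second "for": inner loop, varies fastest
        pairs_loop.append((i, j))

pairs_comp = [(i, j) for i in range(2) for j in range(3)]

print(pairs_loop == pairs_comp)   # True
```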

### Conditionals on the Iterator
You can further control the iteration by adding a conditional to the end of the expression.
In the first example of the section, we iterated over all numbers from 0 to 19, but left out multiples of 3.
Look at this again, and notice the construction:

In [60]:
[val for val in range(20) if val % 3 > 0]

[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

The expression ``(val % 3 > 0)`` evaluates to ``True`` unless ``val`` is divisible by 3.
Again, the English language meaning can be immediately read off: "Construct a list of values for each value up to 20, but only if the value is not divisible by 3".
Once you are comfortable with it, this is much easier to write – and to understand at a glance – than the equivalent loop syntax:

In [59]:
L = []
for val in range(20):
    if val % 3:
        L.append(val)
L

[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

### Conditionals on the Value
If you've programmed in C, you might be familiar with the single-line conditional enabled by the ``?`` operator:
``` C
int absval = (val < 0) ? -val : val;
```
Python has something very similar to this, which is most often used within list comprehensions, ``lambda`` functions, and other places where a simple expression is desired:

In [None]:
val = -10
val if val >= 0 else -val

We see that this simply duplicates the functionality of the built-in ``abs()`` function, but the construction lets you do some really interesting things within list comprehensions.
This is getting pretty complicated now, but you could do something like this:

In [None]:
[val if val % 2 else -val
 # C equivalent of the expression above: (val % 2) ? val : -val
 for val in range(20) if val % 3]

Note the line break within the list comprehension before the ``for`` expression: this is valid in Python, and is often a nice way to break up long list comprehensions for greater readability.
Look this over: what we're doing is constructing a list, leaving out multiples of 3, and negating all multiples of 2.
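
If the combined form is hard to read, it may help to see it unrolled. The sketch below builds the same list with an explicit loop, keeping the filter (``if val % 3``) and the conditional value (``val if val % 2 else -val``) on separate lines:

```python
result = []
for val in range(20):
    if val % 3:                                   # filter: skip multiples of 3
        result.append(val if val % 2 else -val)   # value: negate even numbers

print(result)
# [1, -2, -4, 5, 7, -8, -10, 11, 13, -14, -16, 17, 19]
```

The condition after the ``for`` decides *whether* a value is included; the conditional expression before the ``for`` decides *what* is stored.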

Once you understand the dynamics of list comprehensions, it's straightforward to move on to other types of comprehensions. The syntax is largely the same; the only difference is the type of bracket you use.

For example, with curly braces you can create a ``set`` with a *set comprehension*:

In [61]:
{n**2 for n in range(12)}

{0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121}

Recall that a ``set`` is a collection that contains no duplicates.
The set comprehension respects this rule, and eliminates any duplicate entries:

In [62]:
{a % 3 for a in range(1000)}

{0, 1, 2}

With a slight tweak, you can add a colon (``:``) to create a *dict comprehension*:

In [63]:
{n:n**2 for n in range(6)}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

### Generators

Finally, if you use parentheses rather than square brackets, you get what's called a **generator expression**:

In [64]:
(n**2 for n in range(12))

<generator object <genexpr> at 0x000001C125BF9350>

A generator expression is essentially a list comprehension in which **elements are generated as-needed rather than all at-once**, and the simplicity here belies the power of this language feature.
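
As a small sketch of what "as-needed" means in practice, the values of a generator expression can be pulled one at a time with ``next``, or consumed in bulk by a function such as ``sum``. Once consumed, they are gone:

```python
squares = (n ** 2 for n in range(12))

first = next(squares)       # 0: computed only when asked for
second = next(squares)      # 1
rest_total = sum(squares)   # 4 + 9 + ... + 121, the values not yet consumed
leftover = list(squares)    # []: the generator is now exhausted

print(first, second, rest_total, leftover)   # 0 1 505 []
```

Unlike a list, a generator expression can only be iterated over once.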