![](../docs/banner.png)

# Chapter 2: Loops & Functions

## Unpacking tuples
If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the right-hand side of the equals sign:

In [None]:
tup = (4, 5, 6)

In [None]:
a, b, c = tup

In [None]:
c

Even sequences with nested tuples can be unpacked:

In [None]:
tup = 4, 5, (8, 9)

In [None]:
type(tup)

In [None]:
a, b, (c, d) = tup

In [None]:
c

There are some situations where you may want to "pluck" a few elements from the beginning of a tuple. There is a special syntax that can do this, `*rest`, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [None]:
values = 1, 2, 3, 4, 5
a, b, *rest = values
print(a)
print(b)
print(rest)

Sometimes this `rest` portion is something you want to discard; there is nothing special about the name `rest`. As a matter of convention, many Python programmers use the underscore (`_`) for unwanted variables:

In [None]:
a, b, *_ = values
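The starred name does not have to come last. As a small sketch (the names `first`, `middle`, `last`, `head`, `nothing`, and `tail` are illustrative and not from the text above), a starred target in the middle collects whatever the other names leave over, always as a list:

```python
values = 1, 2, 3, 4, 5

# The starred target absorbs the items not claimed by the other names.
first, *middle, last = values
print(first)   # 1
print(middle)  # [2, 3, 4]
print(last)    # 5

# With exactly two items, nothing is left over, so the starred target is empty.
head, *nothing, tail = (10, 20)
print(nothing)  # []
```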

In many contexts, the parentheses around a tuple can be omitted, so here we could also have written:

In [None]:
tup = 5,6,7

In [None]:
type(tup)

## 1. `for` Loops
<hr>

For loops allow us to execute code a specific number of times.

In [None]:
for n in [2, 7, -1, 5]:
    print(f"The number is {n} and its square is {n**2}")
print("I'm outside the loop!")

The main points to notice:

* Keyword `for` begins the loop. Colon `:` ends the first line of the loop.
* The indented block of code is executed for each value in the list (hence the name "for" loops)
* The loop ends after the variable `n` has taken all the values in the list
* We can iterate over any kind of "iterable": `list`, `tuple`, `range`, `set`, `string`.
* An iterable is really just any object with a sequence of values that can be looped over. In this case, we are iterating over the values in a list.

In [None]:
word = "Python"
for letter in word:
    print("Gimme a " + letter + "!")

print(f"What's that spell?!! {word}!")

A very common pattern is to use `for` with `range()`. The `range` function generates a sequence of evenly spaced integers up to, but not including, the end value:

In [None]:
range(10)

In [None]:
list(range(10))

In [None]:
for i in range(10):
    print(i)

As you can see, `range` produces integers up to but not including the endpoint. A common use of range is for iterating through sequences by index:

In [None]:
seq = [1, 2, 3, 4]

for i in range(len(seq)):
    print(f"element {i}: {seq[i]}")

We can also specify a start value and a skip-by value with `range`:

In [None]:
for i in range(1,101,10):
    print(i)
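To see how the start, stop, and step arguments interact without printing every value, one option is to wrap `range` in `list()`. The sketch below also shows a negative step, which counts downward; the variable names are illustrative.

```python
# range(start, stop, step): the stop value is never included.
by_tens = list(range(1, 101, 10))
print(by_tens)  # [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]

# A negative step counts down; the stop value 0 is still excluded.
countdown = list(range(10, 0, -2))
print(countdown)  # [10, 8, 6, 4, 2]
```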

`reversed` iterates over the elements of a sequence in reverse order:

In [None]:
list(reversed(range(10)))

In [None]:
for i in reversed(range(10)):
    print(i)

We can write a loop inside another loop to iterate over multiple dimensions of data:

In [None]:
for x in [1, 2, 3]:
    for y in ["a", "b", "c"]:
        print((x, y))

In [None]:
list_1 = [0, 1, 2]
list_2 = ["a", "b", "c"]
for i in range(3):
    print(list_1[i], list_2[i])

`zip` “pairs” up the elements of a number of lists, tuples, or other sequences, producing an iterator of tuples that can be converted to a list:

In [None]:
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]

In [None]:
zipped = zip(seq1, seq2)

In [None]:
list(zipped)
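Two details of `zip` are easy to miss, and the sketch below illustrates both using the same `seq1` and `seq2` (the names `first_pass`, `second_pass`, and `shorter` are made up for the example): a zip object is an iterator, so it is exhausted after one pass, and zipping stops at the shortest input.

```python
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]

zipped = zip(seq1, seq2)
first_pass = list(zipped)   # consumes the iterator
second_pass = list(zipped)  # nothing is left to consume
print(first_pass)   # [('foo', 'one'), ('bar', 'two'), ('baz', 'three')]
print(second_pass)  # []

# With inputs of unequal length, zip stops at the shortest one.
shorter = list(zip(seq1, ["one", "two"]))
print(shorter)  # [('foo', 'one'), ('bar', 'two')]
```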

There are many clever ways of doing these kinds of things in Python. When looping over objects, I tend to use `zip()` and `enumerate()` quite a lot in my work. `zip()` returns a zip object which is an iterable of tuples.

In [None]:
for i in zip(list_1, list_2):
    print(i)
    print(type(i))

We can even "unpack" these tuples directly in the `for` loop:

In [None]:
for i, j in zip(list_1, list_2):
    print(i, j)

`enumerate()` adds a counter to an iterable which we can use within the loop.

In [None]:
for i in enumerate(list_2):
    print(i)
    print(type(i))

In [None]:
for n, i in enumerate(list_2):
    print(f"index {n}, value {i}")
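By default the counter from `enumerate()` starts at 0. If a listing that starts at 1 is wanted, the built-in accepts a `start` argument. A brief sketch, with `numbered` as an illustrative name:

```python
list_2 = ["a", "b", "c"]

numbered = []
for position, letter in enumerate(list_2, start=1):
    # position counts 1, 2, 3 instead of 0, 1, 2
    numbered.append(f"{position}. {letter}")

print(numbered)  # ['1. a', '2. b', '3. c']
```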

We can loop through key-value pairs of a dictionary using `.items()`. The general syntax is `for key, value in dictionary.items()`.

In [None]:
courses = {521 : "awesome",
           551 : "riveting",
           511 : "naptime!"}

for course_num, description in courses.items():
    print(f"DSCI {course_num}, is {description}")

We can even use `enumerate()` to do more complex un-packing:

In [None]:
for n, (course_num, description) in enumerate(courses.items()):
    print(f"Item {n}: DSCI {course_num}, is {description}")

You can advance a for loop to the next iteration, skipping the remainder of the block, using the **continue** keyword. Consider this code, which sums up integers in a list and skips None values:

In [None]:
sequence = [1, 2, None, 4, None, 5]
total = 0
for value in sequence:
    if value is None:
        print("None, continue...")
        continue
    total += value
    print(total)

A for loop can be exited altogether with the **break** keyword. This code sums elements of the list until a 5 is reached:

In [None]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        print("5")
        break
    total_until_5 += value
    print(f"value {value}, total_until_5 {total_until_5}")
print("I'm out of loop")

The break keyword only terminates the innermost for loop; any outer for loops will continue to run:

In [None]:
for i in range(4):
    for j in range(4):
        if j > i:
            break
        print((i, j))


## 2. `while` loops
<hr>

We can also use a [`while` loop](https://docs.python.org/3/reference/compound_stmts.html#while) to execute a block of code several times. But beware! If the conditional expression is always `True`, then you've got an infinite loop!

In [None]:
n = 10
while n > 0:
    print(n)
    n = n - 1

print("Blast off!")

Let's read the `while` statement above as if it were in English. It means, “While `n` is greater than 0, display the value of `n` and then decrement `n` by 1. When you get to 0, display the word Blast off!”

For some loops, it's hard to tell when, or if, they will stop! Take a look at the [Collatz conjecture](https://en.wikipedia.org/wiki/Collatz_conjecture). The conjecture states that no matter what positive integer `n` we start with, the sequence will always eventually reach 1 - we just don't know how many iterations it will take.

In [None]:
n = 11
while n != 1:
    print(int(n))
    if n % 2 == 0: # n is even
        n = n // 2  # integer division keeps n an int
    else: # n is odd
        n = n * 3 + 1
print("end",int(n))

Hence, in some cases, you may want to force a `while` loop to stop based on some criteria, using the `break` keyword.

In [None]:
n = 123
i = 0
while n != 1:
    print(int(n))
    if n % 2 == 0: # n is even
        n = n // 2  # integer division keeps n an int
    else: # n is odd
        n = n * 3 + 1
    i += 1
    if i == 10:
        print("Ugh, too many iterations!")
        break

The `continue` keyword is similar to `break` but won't stop the loop. Instead, it skips the rest of the current iteration and jumps back to the top of the loop, where the condition is checked again.

In [None]:
n = 10
while n > 0:
    if n % 2 != 0: # n is odd
        n = n - 1
        continue
        break  # this line is never executed because continue restarts the loop from the top
    print(n)
    n = n - 1

print("Blast off!")

## 3. Comprehensions
<hr>

Comprehensions allow us to build lists/tuples/sets/dictionaries in one convenient, compact line of code. I use these quite a bit! Below is a standard `for` loop you might use to iterate over an iterable and create a list:

In [None]:
subliminal = ['Tom', 'ingests', 'many', 'eggs', 'to', 'outrun', 'large', 'eagles', 'after', 'running', 'near', '!']
first_letters = []
for word in subliminal:
    first_letters.append(word[0])
print(first_letters)

List comprehension allows us to do this in one compact line:

In [None]:
letters = [word[0] for word in subliminal]  # list comprehension
letters

We can make things more complicated by doing multiple iteration or conditional iteration:

In [None]:
[(i, j) for i in range(3) for j in range(4)]

In [None]:
my_list = []
for i in range(3):
    for j in range(4):
        my_list.append((i,j))

In [None]:
my_list

In [None]:
[i for i in range(11) if i % 2 == 0]  # condition the iterator, select only even numbers

In [None]:
[-i if i % 2 else i for i in range(11)]  # condition the value, -ve odd and +ve even numbers
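The position of the `if` changes its meaning, which can be easier to see when the two comprehensions above are written out as ordinary loops. This is only an equivalent sketch; the names `evens` and `signed` are illustrative.

```python
# Filter form: a trailing `if` decides whether an item is kept at all.
evens = []
for i in range(11):
    if i % 2 == 0:
        evens.append(i)

# Value form: `x if condition else y` decides what is stored for every item.
signed = []
for i in range(11):
    signed.append(-i if i % 2 else i)

print(evens)   # [0, 2, 4, 6, 8, 10]
print(signed)  # [0, -1, 2, -3, 4, -5, 6, -7, 8, -9, 10]
```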

In [None]:
bool(None)


There is also set comprehension:

In [None]:
words = ['hello', 'goodbye', 'the', 'antidisestablishmentarianism']
y = {word[-1] for word in words}  # set comprehension
y  # only has 3 elements because a set contains only unique items and there would have been two e's

Dictionary comprehension:

In [None]:
word_lengths = {word:len(word) for word in words} # dictionary comprehension
word_lengths

Tuple comprehension doesn't work as you might expect... We get a "generator" instead (more on that later).

In [None]:
y = (word[-1] for word in words)  # this is NOT a tuple comprehension - more on generators later
print(y)
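If a tuple really is the goal, one common approach is to pass the generator expression to the `tuple()` constructor. A minimal sketch, with `last_letters` as an illustrative name:

```python
words = ['hello', 'goodbye', 'the', 'antidisestablishmentarianism']

# tuple() consumes the generator expression and stores every value.
last_letters = tuple(word[-1] for word in words)
print(last_letters)        # ('o', 'e', 'e', 'm')
print(type(last_letters))  # <class 'tuple'>
```

Unlike the set comprehension earlier, the tuple keeps duplicates and preserves order.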

## 4. `try` / `except`

Handling Python errors or exceptions gracefully is an important part of building robust programs. In data analysis applications, many functions work only on certain kinds of input. As an example, Python’s float function is capable of casting a string to a floating-point number, but it fails with ValueError on improper inputs:

In [None]:
float("something")

If something goes wrong, we don't want our code to crash - we want it to **fail gracefully**. In Python, this can be accomplished using `try`/`except`. Here is a basic example:

In [None]:
this_variable_does_not_exist
print("Another line")  # code fails before getting to this line

In [None]:
try:
    this_variable_does_not_exist
except:
    print("You did something bad! But I won't raise an error.") # print something
print("Another line")

Python tries to execute the code in the `try` block. If an error is encountered, we "catch" this in the `except` block (also called `try`/`catch` in other languages). There are many different error types, or **exceptions** - we saw `NameError` above. 

In [None]:
5/0  # ZeroDivisionError

In [None]:
my_list = [1,2,3]
my_list[5]  # IndexError

In [None]:
my_tuple = (1,2,3)
my_tuple[0] = 0  # TypeError

Ok, so there are apparently a bunch of different errors one could run into. With `try`/`except` you can also catch the exception itself:

In [None]:
this_variable_does_not_exist

In [None]:
try:
    this_variable_does_not_exist
except Exception as ex:
    print("You did something bad!")
    print(ex)
    print(type(ex))

In the above, we caught the exception and assigned it to the variable `ex` so that we could print it out. This is useful because you can see what the error message would have been, without crashing your program. You can also catch specific exception types. This is typically the recommended way to catch errors: being specific tells you exactly where and why your code failed.

In [None]:
try:
    # this_variable_does_not_exist  # NameError
    # (1, 2, 3)[0] = 1  # TypeError
    5/0  # ZeroDivisionError
except TypeError:
    print("You made a type error!")
except NameError:
    print("You made a name error!")
except:
    print("You made some other sort of error")
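Returning to the `float` example from the start of this section, the sketch below catches only `ValueError` so that one bad input does not stop the whole loop. The list `raw_values` and the name `parsed` are made up for illustration.

```python
raw_values = ["1.5", "something", "42"]

parsed = []
for text in raw_values:
    try:
        parsed.append(float(text))
    except ValueError as ex:
        # Only conversion failures are handled here; any other error still surfaces.
        print(f"Skipping {text!r}: {ex}")

print(parsed)  # [1.5, 42.0]
```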

## 5. Functions
<hr>

A [function](https://docs.python.org/3/tutorial/controlflow.html#defining-functions) is a **reusable piece of code** that can **accept input parameters**, also known as **"arguments"**. As a rule of thumb, if you anticipate needing to repeat the same or very similar code more than once, it may be worth writing a reusable function. Functions can also help make your code more readable by giving a name to a group of Python statements. For example, let's define a function called `square` which takes one input parameter `n` and returns the square `n**2`:

In [None]:
def square(n):
    n_squared = n**2
    return n_squared

In [None]:
output = square(2452542)

In [None]:
print(output)

In [None]:
square(100)

In [None]:
square(12345)

Functions begin with the `def` keyword, then the function name, arguments in parentheses, and then a colon (`:`). The code executed by the function is defined by indentation. The output or "return" value of the function is specified using the `return` keyword.

### Null Return Type


If Python reaches the end of a function without encountering a return statement, `None` is returned automatically. For example:

In [None]:
def function_without_return(x):
    print(x)

result = function_without_return("hello!")

In [None]:
print(result)

### Namespaces, Scope, and Local Functions

Functions can access variables created inside the function as well as those outside the function in higher (or even global) scopes. An alternative and more descriptive name describing a variable scope in Python is a namespace. Any variables that are assigned within a function by default are assigned to the local namespace. The local namespace is created when the function is called and is immediately populated by the function’s arguments. After the function is finished, the local namespace is destroyed (with some exceptions that are outside the purview of this chapter). Consider the following function:

In [None]:
def func():
    a = []
    for i in range(5):
        a.append(i)
    print(a)


func()


When func() is called, the empty list a is created, five elements are appended, and then a is destroyed when the function exits. Suppose instead we had declared a as follows:

In [None]:
a = []

In [None]:
def func():
    for i in range(5):
        a.append(i)
        

In [None]:
func()

In [None]:
a

Each call to func will modify list a:

In [None]:
func()

a

In [None]:
func()

a

Assigning variables outside of the function's scope is possible, but those variables must be declared explicitly using either the global or nonlocal keywords:

In [None]:
a = None

In [None]:
def bind_a_variable():
    global a
    a = []
bind_a_variable()

In [None]:
print(a)
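The `nonlocal` keyword plays the same role for a variable in an enclosing function rather than at the global level. The following is a minimal sketch; `make_counter`, `increment`, and `count` are hypothetical names chosen for the example.

```python
def make_counter():
    count = 0  # lives in make_counter's local namespace

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment


counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```

Without the `nonlocal` line, `count += 1` would be treated as an assignment to a new local variable and raise `UnboundLocalError`.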

### Optional (Keyword) & Required (Positional) Arguments

Each function can have positional arguments and keyword arguments. Keyword arguments are most commonly used to specify default values or optional arguments. Here we will define a function with an optional `z` argument with the default value 2:

In [None]:
def my_function2(x, y, z=2):
    print(f"x: {x}, y: {y}, z: {z}")
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

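While keyword arguments are optional, all positional arguments must be specified when calling a function. Calling `my_function2` with no arguments raises a `TypeError`:

In [None]:
my_function2()  # TypeError: missing 2 required positional arguments: 'x' and 'y'

You can pass values to the `z` argument with or without the keyword provided, though using the keyword is encouraged: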

In [None]:
my_function2(3.14, 7, 3.5)

In [None]:
my_function2(10, 20)

The main restriction on function arguments is that the keyword arguments must follow the positional arguments (if any). You can specify keyword arguments in any order. This frees you from having to remember the order in which the function arguments were specified. You need to remember only what their names are.

You can have any number of required arguments and any number of optional arguments. All the optional arguments must come after the required arguments. The required arguments are mapped by the order they appear. The optional arguments can be specified out of order when using the function.

In [None]:
def example(a, b, c="DEFAULT", d="DEFAULT"):
    print(a, b, c, d)

Using the defaults for `c` and `d`:

In [None]:
example(1, 2)

Specifying `c` and `d` as **keyword arguments** (i.e. by name):

In [None]:
example(1, 2, c=3, d=4)

Specifying only one of the optional arguments, by keyword:

In [None]:
example(1, 2, c=3)

Specifying all the arguments as keyword arguments, even though only `c` and `d` are optional:

In [None]:
example(a=1, b=2, c=3, d=4)

Specifying `c` by the fact that it comes 3rd (I do not recommend this because I find it is confusing):

In [None]:
example(1, 2, 3)

Specifying the optional arguments by keyword, but in the wrong order (this can also be confusing, but not so terrible - I am fine with it):

In [None]:
example(1, 2, d=4, c=3)

Specifying the non-optional arguments by keyword (I am fine with this):

In [None]:
example(a=1, b=2)

Specifying the non-optional arguments by keyword, but in the wrong order (not recommended, I find it confusing):

In [None]:
example(b=2, a=1)

Specifying keyword arguments before non-keyword arguments (this throws an error):

In [None]:
example(a=2, 1)  # SyntaxError: positional argument follows keyword argument

### Multiple Return Values

In many programming languages, functions can only return one object. That is technically true in Python too, but there is a "workaround", which is to return a tuple.

In [None]:
def sum_and_product(x, y):
    return (x + y, x * y)

In [None]:
sum_and_product(5, 6)

The parentheses can be omitted (and often are), and a `tuple` is implicitly returned as defined by the use of the comma: 

In [None]:
def sum_and_product(x, y):
    return x + y, x * y

In [None]:
sum_and_product(5, 6)

It is common to immediately unpack a returned tuple into separate variables, so it really feels like the function is returning multiple values:

In [None]:
s, p = sum_and_product(5, 6)

In [None]:
s

In [None]:
p

As an aside, it is conventional in Python to use `_` for values you don't want:

In [None]:
s, _ = sum_and_product(5, 6)

In [None]:
s

In [None]:
_

### Functions with Arbitrary Number of Arguments

You can also call/define functions that accept an arbitrary number of positional or keyword arguments using `*args` and `**kwargs`.

In [None]:
def add(*args):
    print(args)
    return sum(args)

In [None]:
add(1, 2, 3, 4, 5, 6,4,6)

In [None]:
def add(**kwargs):
    print(kwargs)
    return sum(kwargs.values())

In [None]:
add(a=3, b=4, c=5)
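Both forms can appear in the same signature, and the same stars can be used at the call site to unpack a sequence or dictionary into arguments. The sketch below assumes a hypothetical function `describe_call` that simply returns what it received.

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional arguments; kwargs is a dict of keyword arguments.
    return args, kwargs


print(describe_call(1, 2, unit="cm"))  # ((1, 2), {'unit': 'cm'})

# At the call site, * unpacks a sequence and ** unpacks a dictionary.
numbers = [1, 2]
options = {"unit": "cm"}
print(describe_call(*numbers, **options))  # ((1, 2), {'unit': 'cm'})
```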

## 6. Functions as a Data Type

In Python, functions are actually a data type:

In [None]:
def do_nothing(x):
    return x

In [None]:
type(do_nothing)

In [None]:
print(do_nothing)

This means you can pass functions as arguments into other functions.

In [None]:
def square(y):
    return y**2

def evaluate_function_on_x_plus_1(fun, x):
    return fun(x+1)

In [None]:
type(square)

In [None]:
evaluate_function_on_x_plus_1(square, 5)

So what happened above?
- `fun(x+1)` becomes `square(5+1)`
- `square(6)` becomes `36`
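Because functions are ordinary values, they can also be stored in a list and applied one after another. This is a sketch under assumed names: `apply_all` and `pipeline` are hypothetical, and `add_one` is defined here only to keep the block self-contained.

```python
def square(y):
    return y**2


def add_one(x):
    return x + 1


# Functions can sit in a list like any other value.
pipeline = [add_one, square]


def apply_all(functions, value):
    # Feed the result of each function into the next one.
    for fun in functions:
        value = fun(value)
    return value


print(apply_all(pipeline, 5))  # 36
```

With `add_one` first and `square` second, this reproduces `evaluate_function_on_x_plus_1(square, 5)`.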

## 7. Anonymous Functions
<hr>

There are two ways to define functions in Python. The way we've been using up until now:

In [None]:
def add_one(x):
    return x+1

In [None]:
add_one(7.2)

Or by using the `lambda` keyword:

In [None]:
add_one = lambda x: x+1 

In [None]:
type(add_one)

In [None]:
add_one(7.2)

I usually refer to these as lambda functions in the rest of the book. They are especially convenient in data analysis because, as you’ll see, there are many cases where data transformation functions will take functions as arguments. It’s often less typing (and clearer) to pass a lambda function as opposed to writing a full-out function declaration or even assigning the lambda function to a local variable. Consider this example:

In [None]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]

In [None]:
ints = [4, 0, 1, 5, 6]

In [None]:
apply_to_list(ints, lambda x: x//2)

You could also have written `[x // 2 for x in ints]`, but here we were able to succinctly pass a custom operator to the `apply_to_list` function.
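Built-in functions accept lambdas in the same way. As one illustration, `sorted()` takes a `key` function that computes the value to sort by; the list `strings` and the name `by_distinct` below are made up for the example.

```python
strings = ["foo", "card", "bar", "aaaa", "abab"]

# Sort by the number of distinct letters in each string.
by_distinct = sorted(strings, key=lambda s: len(set(s)))
print(by_distinct)  # ['aaaa', 'foo', 'abab', 'bar', 'card']
```

Strings with the same number of distinct letters (`'foo'` and `'abab'`) keep their original relative order because Python's sort is stable.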

## 8. Generators
<hr>

In [None]:
def square_numbers(nums):
    results = []
    for i in nums:
        results.append(i*i)
    return results

my_nums = square_numbers([1,2,3,4,5])



In [None]:
type(my_nums)

In [None]:
for num in my_nums:
    print(num)

We can create a generator using a function and the `yield` keyword (instead of the `return` keyword):

In [None]:
def square_numbers(nums):
    for i in nums:
        yield i*i

In [None]:
type(square_numbers)

In [None]:
my_nums = square_numbers([1,2,3,4,5])


In [None]:
type(my_nums)

In [None]:
next(my_nums)

In [None]:
print("2:",next(my_nums))
print("3:",next(my_nums))
print("4:",next(my_nums))
print("5:",next(my_nums))


Once the generator is exhausted, it has no more values to return, and calling `next()` on it raises a `StopIteration` exception:

In [None]:
print("6:",next(my_nums))
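There are two common ways to deal with an exhausted generator without crashing, sketched below with a small squaring generator redefined here so the block is self-contained: catch `StopIteration`, or pass a default value as the second argument to `next()`.

```python
def square_numbers(nums):
    for i in nums:
        yield i * i


gen = square_numbers([1, 2])
print(next(gen))  # 1
print(next(gen))  # 4

# Option 1: catch the StopIteration raised by an exhausted generator.
try:
    next(gen)
except StopIteration:
    print("generator is exhausted")

# Option 2: give next() a default to return instead of raising.
print(next(gen, None))  # None
```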

Finally, we can loop over generator objects too:

In [None]:
my_nums = square_numbers([1,2,3,4,5])
for num in my_nums:
    print(num)

In [None]:
my_nums = (x*x for x in [1,2,3,4,5])

for num in my_nums:
    print(num)


Recall list comprehension from earlier in the chapter:

In [None]:
[n for n in range(10)]

A list comprehension evaluates the entire expression at once and then returns the full data product. Sometimes, we want to work with just one part of our data at a time, for example, when we can't fit all of our data in memory. For this, we can use *generators*.

In [None]:
(n for n in range(10))

Notice that we just created a `generator object`. Generator objects are like a "recipe" for generating values. They don't actually do any computation until they are asked to. We can get values from a generator in three main ways:
- Using `next()`
- Using `list()`
- Looping

In [None]:
gen = (n for n in range(10))

In [None]:
next(gen)

In [None]:
next(gen)

We can see all the values of a generator using `list()` but this defeats the purpose of using a generator in the first place:

In [None]:
gen = (n for n in range(10))
list(gen)
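The memory argument can be checked roughly with `sys.getsizeof`, which reports the size of the container object itself (not of the integers it refers to). The sketch below uses illustrative names and avoids quoting exact byte counts, since those vary between Python versions.

```python
import sys

squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))

# The list stores a reference to every element; the generator stores only its current state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Aggregations such as sum() can consume the generator one value at a time.
total = sum(squares_gen)
print(total == sum(squares_list))  # True
```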