$$
\def\CC{\bf C}
\def\QQ{\bf Q}
\def\RR{\bf R}
\def\ZZ{\bf Z}
\def\NN{\bf N}
$$
# 1. Collections

We have already encountered some simple Python types like numbers,
strings and booleans. Now we will see how we can group multiple values
together in a *collection* -- like a *list* of numbers, or a
*dictionary* which we can use to store and retrieve key-value pairs.
Many useful collections are built-in types in Python, and we will
encounter them quite often.

## 1.1. Lists

The Python list type is called `list`. It is a type of sequence -- we
can use it to store multiple values, and access them sequentially, by
their position, or *index*, in the list. We define a list *literal* by
putting a comma-separated list of values inside square brackets (`[` and
`]`):

In [None]:
# a list of strings
animals = ['cat', 'dog', 'fish', 'bison']

# a list of integers
numbers = [1, 7, 34, 20, 12]

# an empty list
my_list = []

# a list of lists we defined previously
things = [
    animals,
    numbers,
    my_list, # this trailing comma is legal in Python
]

As you can see, we have used plural nouns to name most of our list
variables. This is a common convention, and it's useful to follow it in
most cases.

To refer to an element in the list, we use the list identifier followed
by the index inside square brackets. Indices are integers which *start
from zero*:

In [None]:
print(animals[0]) # cat
print(numbers[1]) # 7

# This will give us an error, because the list only has four elements
print(animals[6])

cat
7


IndexError: list index out of range

We can also count from the end:

In [None]:
print(animals[-1]) # the last element -- bison
print(numbers[-2]) # the second-last element -- 20

bison
20


We can extract a subset of a list, which will itself be a list, using a
*slice*. This uses almost the same syntax as accessing a single element,
but instead of specifying a single index between the square brackets we
need to specify an upper and lower bound.

💡 The sublist will *include* the element at the lower bound, but *exclude* the element at the upper bound:

In [None]:
print(animals[1:3]) # ['dog', 'fish']
print(animals[1:-1]) # ['dog', 'fish']

['dog', 'fish']
['dog', 'fish']


If one of the bounds is one of the ends of the list, we can leave it
out. A slice with neither bound specified gives us a copy of the list:

In [None]:
print(animals[2:]) # ['fish', 'bison']
print(animals[:2]) # ['cat', 'dog']
print(animals[:]) # a copy of the whole list

['fish', 'bison']
['cat', 'dog']
['cat', 'dog', 'fish', 'bison']


We can even include a third parameter to specify the step size:

In [None]:
print(animals[::2]) # ['cat', 'fish']

['cat', 'fish']
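
As a small additional sketch (the `letters` list is a made-up example, not part of the lists defined above), the step can be combined with bounds, and a negative step walks through the list backwards:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# every second element, starting at index 1 and stopping before index 5
middle = letters[1:5:2]
print(middle)  # ['b', 'd']

# a negative step walks backwards, so this gives a reversed copy
backwards = letters[::-1]
print(backwards)  # ['f', 'e', 'd', 'c', 'b', 'a']
```

Like every slice, both results are new lists; `letters` itself is left unchanged.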


Lists are mutable -- we can modify elements, add elements to them or
remove elements from them. A list will change size dynamically when we
add or remove elements -- we do not have to manage this ourselves:

In [None]:
# assign a new value to an existing element
animals[3] = "hamster"

# add a new element to the end of the list
animals.append("squirrel")

# remove an element by its index
del animals[2]
print(animals)

['cat', 'dog', 'hamster', 'squirrel']


Because lists are mutable, we can *modify* a list variable without
assigning the variable a completely new value. Remember that if we
assign the same `list` value to two variables, any in-place changes that
we make while referring to the list by one variable name will also be
reflected when we access the list through the other variable name:

In [None]:
animals = ['cat', 'dog', 'goldfish', 'canary']
pets = animals # now both variables refer to the same list object

animals.append('aardvark')
print(pets) # pets is still the same list as animals

animals = ['rat', 'gerbil', 'hamster'] # now we assign a new list value to animals
print(pets) # pets still refers to the old list

pets = animals[:] # assign a *copy* of animals to pets
animals.append('aardvark')
print(pets) # pets remains unchanged, because it refers to a copy, not the original list

['cat', 'dog', 'goldfish', 'canary', 'aardvark']
['cat', 'dog', 'goldfish', 'canary', 'aardvark']
['rat', 'gerbil', 'hamster']
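
If we are unsure whether two variables refer to the same list object or to two equal copies, the identity operator `is` can tell us. A minimal sketch, using hypothetical variable names:

```python
original = ['cat', 'dog']
alias = original        # a second name for the same list object
duplicate = original[:] # a new list with the same contents

print(alias is original)      # True -- the same object
print(duplicate is original)  # False -- a different object...
print(duplicate == original)  # True -- ...with equal contents

original.append('fish')
print(alias)      # ['cat', 'dog', 'fish'] -- the alias sees the change
print(duplicate)  # ['cat', 'dog'] -- the copy does not
```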


We can mix the types of values that we store in a list:

In [None]:
my_list = ['cat', 12, 35.8]

How do we check whether a list contains a particular value? We use `in`
or `not in`, the membership operators:

In [None]:
numbers = [34, 67, 12, 29]
number = 67

if number in numbers:
    print("%d is in the list!" % number)

number = 90
if number not in numbers:
    print("%d is not in the list!" % number)

67 is in the list!
90 is not in the list!


💡 `in` and `not in` have the same precedence as the comparison operators
(such as `==` and `<`) and the identity operators (`is` and `is not`). All
of these bind more tightly than the logical operators (`not`, `and` and
`or`).

### 1.1.1. List Methods & Functions

There are many built-in functions which we can use on lists and other
sequences:

In [None]:
# the length of a list
len(animals)

# the sum of a list of numbers
sum(numbers)

# are any of these values true?
any([1,0,1,0,1])

# are all of these values true?
all([1,0,1,0,1])

False

List objects also have useful methods which we can call:

In [None]:
numbers = [1, 2, 3, 4, 5]

# we already saw how to add an element to the end
numbers.append(5)

# count how many times a value appears in the list
numbers.count(5)

# append several values at once to the end
numbers.extend([56, 2, 12])

# find the index of a value
numbers.index(3)
# if the value appears more than once, we will get the index of the first one
numbers.index(2)
# if the value is not in the list, we will get a ValueError!
# (commented out so that the rest of this cell can run)
# numbers.index(42)

# insert a value at a particular index
numbers.insert(0, 45) # insert 45 at the beginning of the list

# remove an element by its index and assign it to a variable
my_number = numbers.pop(0)

# remove an element by its value
numbers.remove(12)
# if the value appears more than once, only the first one will be removed
numbers.remove(5)

If we want to sort or reverse a list, we can either call a method on the
list to modify it *in-place*, or use a function to return a modified
copy of the list while leaving the original list untouched:

In [None]:
numbers = [3, 2, 4, 1]

# these return a modified copy, which we can print
print(sorted(numbers))
print(list(reversed(numbers)))

# the original list is unmodified
print(numbers)

# now we can modify it in place
numbers.sort()
numbers.reverse()

print(numbers)

[1, 2, 3, 4]
[1, 4, 2, 3]
[3, 2, 4, 1]
[4, 3, 2, 1]


The `reversed` function actually returns an iterator, not a list (like
the generators we will look at in the next chapter, it produces its
values one at a time), so we have to convert it to a list before we can
print the contents. To do this, we call the
`list` type like a function, just like we would call `int` or `float` to
convert numbers. We can also use `list` as another way to make a copy of
a list:

In [None]:
animals = ['cat', 'dog', 'goldfish', 'canary']
pets = list(animals)

animals.sort()
pets.append('gerbil')

print(animals)
print(pets)

['canary', 'cat', 'dog', 'goldfish']
['cat', 'dog', 'goldfish', 'canary', 'gerbil']


### 1.1.2. Arithmetic Operators for Lists

Some of the arithmetic operators we have used on numbers before can also
be used on lists, but the effect may not always be what we expect:

In [None]:
# we can concatenate two lists by adding them
print([1, 2, 3] + [4, 5, 6])

# we can concatenate a list with itself by multiplying it by an integer
print([1, 2, 3] * 3)

# not all arithmetic operators can be used on lists -- this will give us an error!
print([1, 2, 3] - [2, 3])

[1, 2, 3, 4, 5, 6]
[1, 2, 3, 1, 2, 3, 1, 2, 3]


TypeError: unsupported operand type(s) for -: 'list' and 'list'

### Example 1

1.  Create a list `a` which contains the first three odd positive
    integers and a list `b` which contains the first three even positive
    integers.
2.  Create a new list `c` which combines the numbers from both lists
    (order is unimportant).
3.  Create a new list `d` which is a sorted copy of `c`, leaving `c`
    unchanged.
4.  Reverse `d` in-place.
5.  Set the fourth element of `c` to `42`.
6.  Append `10` to the end of `d`.
7.  Append `7`, `8` and `9` to the end of `c`.
8.  Print the first three elements of `c`.
9.  Print the last element of `d` without using its length.
10. Print the length of `d`.

#### Answers to Example 1

In [None]:
a = [1, 3, 5]
b = [2, 4, 6]

c = a + b

d = sorted(c)
d.reverse()

c[3] = 42
d.append(10)
c.extend([7, 8, 9])

print(c[:3])
print(d[-1])
print(len(d))

[1, 3, 5]
10
7


## 1.2. Tuples

Python has another sequence type which is called `tuple`. Tuples are
similar to lists in many ways, but they are immutable. We define a tuple
*literal* by putting a comma-separated list of values inside round
brackets (`(` and `)`):

In [None]:
WEEKDAYS = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')

We can use tuples in much the same way as we use lists, except that we
cannot modify them:

In [None]:
animals = ('cat', 'dog', 'fish')

# an empty tuple
my_tuple = ()

# we can access a single element
print(animals[0])

# we can get a slice
print(animals[1:]) # note that our slice will be a new tuple, not a list

# we can count values or look up an index
animals.count('cat')
animals.index('cat')

# ... but this is not allowed:
animals.append('canary')
animals[1] = 'gerbil'

cat
('dog', 'fish')


AttributeError: 'tuple' object has no attribute 'append'

We have already been using tuples when inserting multiple values into a
formatted string:

In [None]:
print("%d %d %d" % (1, 2, 3))

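To define a tuple with a single element, we have to put a trailing comma after the element. Round brackets on their own only group an expression, so `(3)` is just the number `3`, while `(3,)` is a tuple: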

In [None]:
#good
print((3,))

(3,)


In [None]:
#bad
print(3)
print((3)) # this is still just 3

3
3


### Example 2

1.  Create a tuple `a` which contains the first four positive integers
    and a tuple `b` which contains the next four positive integers.
2.  Create a tuple `c` which combines all the numbers from `a` and `b`
    in any order.
3.  Create a tuple `d` which is a sorted copy of `c`.
4.  Print the third element of `d`.
5.  Print the last three elements of `d` without using its length.
6.  Print the length of `d`.

#### Answers to Example 2

In [None]:
a = (1, 2, 3, 4)
b = (5, 6, 7, 8)

c = a + b
d = tuple(sorted(c))

print(d[2])
print(d[-3:])
print(len(d))

3
(6, 7, 8)
8


## 1.3. Sets

The Python set type is called `set`. A set is a collection of *unique
elements*. If we add multiple copies of the same element to a set, the
duplicates will be eliminated, and we will be left with one of each
element. To define a set literal, we put a comma-separated list of
values inside curly brackets (`{` and `}`):

In [None]:
animals = {'cat', 'dog', 'goldfish', 'canary', 'cat'}
print(animals) # the set will only contain one cat

{'canary', 'dog', 'goldfish', 'cat'}


We can perform various set operations on sets:

In [None]:
even_numbers = {2, 4, 6, 8, 10}
big_numbers = {6, 7, 8, 9, 10}

# subtraction: big numbers which are not even
print(big_numbers - even_numbers)

# union: numbers which are big or even
print(big_numbers | even_numbers)

# intersection: numbers which are big and even
print(big_numbers & even_numbers)

# numbers which are big or even but not both
print(big_numbers ^ even_numbers)

{9, 7}
{2, 4, 6, 7, 8, 9, 10}
{8, 10, 6}
{2, 4, 7, 9}


It is important to note that, unlike lists and tuples, sets are *not
ordered*. When we print a set, the order of the elements is arbitrary,
and it may differ from the order in which we wrote them.
If we want to process the contents of a set in a particular order, we
will first need to convert it to a list or tuple and sort it:

In [None]:
print(animals)
print(sorted(animals))

{'canary', 'dog', 'goldfish', 'cat'}
['canary', 'cat', 'dog', 'goldfish']


The `sorted` function returns a `list` object.

How do we make an empty set? We have to use the `set` function.
Dictionaries, which we will discuss in the next section, used curly
brackets before sets adopted them, so an empty set of curly brackets is
actually an empty dictionary:

In [None]:
# this is an empty dictionary
a = {}

# this is how we make an empty set
b = set()

We can use the `list`, `tuple`, `dict` and even `int`, `float` or `str`
functions in the same way -- they all have sensible defaults -- but we
will probably seldom find a reason to do so.
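
To make those defaults concrete, here is a short sketch that calls each type with no arguments (the `defaults` list is only there to collect the results for printing):

```python
# calling a type with no arguments gives its "empty" or zero value
defaults = [list(), tuple(), dict(), set(), int(), float(), str()]
print(defaults)  # [[], (), {}, set(), 0, 0.0, '']
```

Note that the empty set is displayed as `set()`, because `{}` is already taken by the empty dictionary.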

### Example 3

1.  Create a set `a` which contains the first four positive integers and
    a set `b` which contains the first four odd positive integers.
2.  Create a set `c` which combines all the numbers which are in `a` or
    `b` (or both).
3.  Create a set `d` which contains all the numbers in `a` but not in
    `b`.
4.  Create a set `e` which contains all the numbers in `b` but not in
    `a`.
5.  Create a set `f` which contains all the numbers which are both in
    `a` and in `b`.
6.  Create a set `g` which contains all the numbers which are either in
    `a` or in `b` but not in both.
7.  Print the number of elements in `c`.



#### Answers to Example 3

In [None]:
a = {1, 2, 3, 4}
b = {1, 3, 5, 7}

c = a | b
d = a - b
e = b - a
f = a & b
g = a ^ b

print(len(c))

6


## 1.4. Ranges

`range` is another kind of immutable sequence type. It is very
specialised -- we use it to create ranges of integers. Ranges are
*lazy*: like the generators we will meet in the next chapter, they do
not store all their numbers at once, but compute each number as it is
needed. In the examples below, we convert each range to a list so that
all the numbers are generated and we can print them out:

In [None]:
# print the integers from 0 to 9
print(list(range(10)))

# print the integers from 1 to 10
print(list(range(1, 11)))

# print the odd integers from 1 to 10
print(list(range(1, 11, 2)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 3, 5, 7, 9]
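
Converting to a list is only needed for display. As a brief sketch (with a hypothetical range named `evens`), a range object already supports the usual sequence operations directly:

```python
evens = range(0, 20, 2)

print(len(evens))    # 10
print(evens[3])      # 6
print(evens[-1])     # 18
print(14 in evens)   # True
print(evens[2:5])    # range(4, 10, 2) -- a slice of a range is another range
```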


## 1.5. Dictionaries

The Python dictionary type is called `dict`. We can use a dictionary to
store key-value pairs. To define a dictionary literal, we put a
comma-separated list of key-value pairs between curly brackets. We use a
colon to separate each key from its value. We access values in the
dictionary in much the same way as list or tuple elements, but we use
keys instead of indices:

In [None]:
marbles = {"red": 34, "green": 30, "brown": 31, "yellow": 29 }

personal_details = {
    "name": "Jane Doe",
    "age": 38, # trailing comma is legal
}

print(marbles["green"])
print(personal_details["name"])

# This will give us an error, because there is no such key in the dictionary
# print(marbles["blue"])

# modify a value
marbles["red"] += 3
personal_details["name"] = "Jane Q. Doe"

print(marbles)
print(personal_details)

30
Jane Doe
{'red': 37, 'green': 30, 'brown': 31, 'yellow': 29}
{'name': 'Jane Q. Doe', 'age': 38}


The keys of a dictionary do not have to be strings -- they can be *any
immutable type*, including numbers and even tuples. We can mix different
types of keys and different types of values in one dictionary. Keys are
unique -- if we repeat a key, we will overwrite the old value with the
new value. When we store a value in a dictionary, the key does not have
to exist -- it will be created automatically:

In [None]:
battleship_guesses = {
    (3, 4): False,
    (2, 6): True,
    (2, 5): True,
}

surnames = {} # this is an empty dictionary
surnames["John"] = "Smith"
surnames["John"] = "Doe"
print(surnames) # we overwrote the older surname

marbles = {"red": 34, "green": 30, "brown": 31, "yellow": 29 }
marbles["blue"] = 30 # this will work
marbles["purple"] += 2 # this will fail -- the increment operator needs an existing value to modify!

{'John': 'Doe'}


KeyError: 'purple'

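Dictionaries are not sequences, and their items have no numeric index.
Since Python 3.7 a dictionary remembers the order in which its keys were
inserted, and that is the order we see when we print it -- in older
versions of Python the order was arbitrary.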

Here are some commonly used methods of dictionary objects:

In [None]:
marbles = {"red": 34, "green": 30, "brown": 31, "yellow": 29 }

# Get a value by its key, or None if it does not exist
marbles.get("orange")
# We can specify a different default
marbles.get("orange", 0)

# Add several items to the dictionary at once
marbles.update({"orange": 34, "blue": 23, "purple": 36})

# All the keys in the dictionary
marbles.keys()
# All the values in the dictionary
marbles.values()
# All the items in the dictionary
marbles.items()

dict_items([('red', 34), ('green', 30), ('brown', 31), ('yellow', 29), ('orange', 34), ('blue', 23), ('purple', 36)])

The last three methods return special sequence types which are read-only
*views* of various properties of the dictionary. We cannot edit them
directly, but they will be updated when we modify the dictionary. We
most often access these properties because we want to iterate over them
(something we will discuss in the next chapter), but we can also convert
them to other sequence types if we need to.
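
The following sketch illustrates the difference between a live view and a converted copy. The `stock` dictionary is made-up example data:

```python
stock = {"apples": 3, "pears": 5}
names = stock.keys()  # a view, not a copy

print(list(names))    # ['apples', 'pears']

stock["plums"] = 7    # modify the dictionary after taking the view
print(list(names))    # ['apples', 'pears', 'plums']

# converting a view to a list gives an independent snapshot
snapshot = list(stock.values())
stock["apples"] = 0
print(snapshot)       # [3, 5, 7]
```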

We can check if a key is in the dictionary using `in` and `not in`:

In [None]:
print("purple" in marbles)
print("white" not in marbles)

True
True


We can also check if a value is in the dictionary using `in` in
conjunction with the `values` method:

In [None]:
print(34 in marbles.values())

True


💡 In Python 2, `keys`, `values` and `items` return list copies of these
sequences, `iterkeys`, `itervalues` and `iteritems` return iterator
objects, and `viewkeys`, `viewvalues` and `viewitems` return the view
objects which are the default in Python 3 (but these are only available
in Python 2.7 and above). In Python 2 you should *really* not use
`mykey in mydict.keys()` to check for key membership -- if you do, you
will be searching the entire list of keys sequentially, which is much
slower than a direct dictionary lookup.

### Example 4

1.  Create a dict `directory` which stores telephone numbers (as string
    values), and populate it with these key-value pairs:


    | Name       | Telephone number |
    |------------|------------------|
    | Jane Doe   | +27 555 5367     |
    | John Smith | +27 555 6254     |
    | Bob Stone  | +27 555 5689     |


2.  Change Jane's number to *+27 555 1024*

3.  Add a new entry for a person called *Anna Cooper* with the phone
    number *+27 555 3237*

4.  Print Bob's number.

5.  Print Bob's number in such a way that `None` would be printed if
    Bob's name were not in the dictionary.

6.  Print all the keys. The format is unimportant, as long as they're
    all visible.

7.  Print all the values.



#### Answers to Example 4

In [None]:
directory = {
    "Jane Doe": "+27 555 5367",
    "John Smith": "+27 555 6254",
    "Bob Stone": "+27 555 5689",
}

directory["Jane Doe"] = "+27 555 1024"
directory["Anna Cooper"] = "+27 555 3237"

print(directory["Bob Stone"])
print(directory.get("Bob Stone", None))

print(directory.keys())
print(directory.values())

+27 555 5689
+27 555 5689
dict_keys(['Jane Doe', 'John Smith', 'Bob Stone', 'Anna Cooper'])
dict_values(['+27 555 1024', '+27 555 6254', '+27 555 5689', '+27 555 3237'])


## 1.6. Converting between Collection Types

### 1.6.1. Implicit Conversions

If we try to iterate over a collection in a `for` loop, Python will try to convert it into
something that we can iterate over if it knows how to. For example, the
dictionary views we saw above are not actually iterators, but Python
knows how to make them into iterators -- so we can use them in a `for`
loop without having to convert them ourselves.

Sometimes the iterator we get by default may not be what we expected --
if we iterate over a dictionary in a `for` loop, we will iterate over
the *keys*. If what we actually want to do is iterate over the values,
or key and value pairs, we will have to specify that ourselves by using
the dictionary's `values` or `items` view instead.
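
A short sketch of the three ways of iterating over a dictionary, using a cut-down version of the `marbles` dictionary:

```python
marbles = {"red": 34, "green": 30}

# iterating over the dictionary itself yields the keys
for colour in marbles:
    print(colour)

# use the values view to iterate over the values
for count in marbles.values():
    print(count)

# use the items view to get each key together with its value
for colour, count in marbles.items():
    print(colour, count)
```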

### 1.6.2. Explicit Conversions

We can convert between the different sequence types quite easily by
using the type functions to *cast* sequences to the desired types --
just like we would use `float` and `int` to convert numbers:

In [None]:
animals = ['cat', 'dog', 'goldfish', 'canary', 'cat']

animals_set = set(animals)
animals_unique_list = list(animals_set)
animals_unique_tuple = tuple(animals_unique_list)

We have to be more careful when converting a dictionary to a sequence:
do we want to use the keys, the values or pairs of keys and values?

In [None]:
marbles = {"red": 34, "green": 30, "brown": 31, "yellow": 29 }

colours = list(marbles) # the keys will be used by default
counts = tuple(marbles.values()) # but we can use a view to get the values
marbles_set = set(marbles.items()) # or the key-value pairs

If we convert the key-value pairs of a dictionary to a sequence, each
pair will be converted to a tuple containing the key followed by the
value.

We can also convert a sequence to a dictionary, but only if it is a
sequence of *pairs* -- each pair must itself be a sequence with two
values:

In [None]:
# Python does not know how to convert this into a dictionary
dict([1, 2, 3, 4])

# but this will work
dict([(1, 2), (3, 4)])

TypeError: cannot convert dictionary update sequence element #0 to a sequence

We will revisit conversions in the next chapter, when we learn about
*comprehensions* -- an efficient syntax for filtering sequences or
dictionaries. By using the right kind of comprehension, we can filter a
collection and convert it to a different type of collection at the same
time.

### Example 5

1.  Convert a list which contains the numbers `1`, `1`, `2`, `3` and
    `3` into a tuple `a`.
2.  Convert `a` to a list `b`. Print its length.
3.  Convert `b` to a set `c`. Print its length.
4.  Convert `c` to a list `d`. Print its length.
5.  Create a range which starts at `1` and ends at `10`. Convert it to a
    list `e`.
6.  Create the `directory` dict from the previous example. Create a list
    `t` which contains all the key-value pairs from the dictionary as
    tuples.
7.  Create a list `v` of all the values in the dictionary.
8.  Create a list `k` of all the keys in the dictionary.
9.  Create a string `s` which contains the word
    `"antidisestablishmentarianism"`. Use the `sorted` function on it.
    What is the output type? Concatenate the letters in the output to a
    string `s2`.
10. Split the string `"the quick brown fox jumped over the lazy dog"`
    into a list `w` of individual words.

#### Answers to Example 5

In [None]:
a = tuple([1, 1, 2, 3, 3])

b = list(a)
print(len(b))

c = set(b)
print(len(c))

d = list(c)
print(len(d))

e = list(range(1, 11))

directory = {
    "Jane Doe": "+27 555 5367",
    "John Smith": "+27 555 6254",
    "Bob Stone": "+27 555 5689",
}

t = list(directory.items())
v = list(directory.values())
k = list(directory)

s = "antidisestablishmentarianism"
s2 = "".join(sorted(s))

w = "the quick brown fox jumped over the lazy dog".split()

5
3
3


## 1.7. Two-Dimensional Sequences

Most of the sequences we have seen so far have been one-dimensional:
each sequence is a row of elements. What if we want to use a sequence to
represent a two-dimensional data structure, which has both rows and
columns? The easiest way to do this is to make a sequence in which each
element is also a sequence. For example, we can create a list of lists:

In [None]:
my_table = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]

The outer list has four elements, and each of these elements is a list
with three elements (which are numbers). To access one of these numbers,
we need to use two indices -- one for the outer list, and one for the
inner list:

In [None]:
print(my_table[0][0])

# lists are mutable, so we can do this
my_table[0][0] = 42

1


We have already seen an example of this earlier in this chapter, when we
created a list of tuples to convert into a dict.

When we use a two-dimensional sequence to represent tabular data, each
inner sequence will have the same length, because a table is rectangular
-- but nothing is stopping us from constructing two-dimensional
sequences which do not have this property:

In [None]:
my_2d_list = [
    [0],
    [1, 2, 3, 4],
    [5, 6],
]

We can also make a three-dimensional sequence by making a list of lists
of lists:

In [None]:
my_3d_list = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

print(my_3d_list[0][0][0])

1


Of course we can also make a list of lists of lists of lists and so
forth -- we can nest lists as many times as we like.

If we wanted to make a two-dimensional list to represent a weekly
timetable, we could either have days as the outer list and time slots as
the inner list or the other way around -- we would have to remember
which range we picked to be the rows and which the columns.
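
As an illustration, here is a minimal sketch of such a timetable with days as the outer list and time slots as the inner list. The day names, slot times and subjects are all invented for the example:

```python
# outer list = days (rows), inner list = time slots (columns)
DAYS = ["Mon", "Tue", "Wed"]
SLOTS = ["09:00", "10:00"]

timetable = [
    ["Maths", "Physics"],    # Mon
    ["Chemistry", "Maths"],  # Tue
    ["Free", "Biology"],     # Wed
]

# the first index picks the day, the second picks the time slot
day = DAYS.index("Tue")
slot = SLOTS.index("10:00")
print(timetable[day][slot])  # Maths
```

Keeping the row and column labels in their own sequences is one way to avoid having to remember which index means what.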

### Example 6

1.  Create a list `a` which contains three tuples. The first tuple
    should contain a single element, the second two elements and the
    third three elements.
2.  Print the second element of the second element of `a`.
3.  Create a list `b` which contains four lists, each of which contains
    four elements.
4.  Print the last two elements of the first element of `b`.

#### Answers to Example 6

In [None]:
a = [
    (1,),
    (2, 2),
    (3, 3, 3),
]

print(a[1][1])

b = [
    list(range(4)),
    list(range(4, 8)),
    list(range(8, 12)),
    list(range(12, 16)),
]

print(b[0][-2:])

2
[2, 3]


# 2. Functions

A function is a sequence of statements which performs some kind of task.
We use functions to eliminate code duplication -- instead of writing all
the statements at every place in our code where we want to perform the
same task, we define them in one place and refer to them by the function
name. If we want to change how that task is performed, we will now
mostly only need to change code in one place.

Here is a definition of a simple function which takes no parameters and
does not return any values:

In [None]:
def print_a_message():
    print("Hello, world!")

We use the `def` statement to indicate the start of a function
definition. The next part of the definition is the function name, in
this case `print_a_message`, followed by round brackets (the definitions
of any parameters that the function takes will go in between them) and a
colon. Thereafter, everything that is indented by one level is the body
of the function.

Functions *do things*, so you should always choose a function name which
explains as simply and as accurately as possible *what the function does*.
This will usually be a verb or some phrase containing a verb. If you
change a function so much that the name no longer accurately reflects
what it does, you should consider updating the name -- although this may
sometimes be inconvenient.

This particular function always does exactly the same thing: it prints
the message `"Hello, world!"`.

Defining a function does not make it run -- when the flow of control
reaches the function definition and executes it, Python just learns
about the function and what it will do when we run it. To run a
function, we have to *call* it. To call the function we use its name
followed by round brackets (with any parameters that the function takes
in between them):

In [None]:
print_a_message()

Hello, world!


Of course we have already used many of Python's built-in functions, such
as `print` and `len`:

In [None]:
print("Hello")
len([1, 2, 3])

Hello


3

Many objects in Python are *callable*, which means that you can call
them like functions -- a callable object has a special method defined
which is executed when the object is called. For example, types such as
`str`, `int` or `list` can be used as functions, to create new objects
of that type (sometimes by converting an existing object):

In [None]:
num_str = str(3)
num = int("3")

people = list() # make a new (empty) list
people = list((1, 2, 3)) # convert a tuple to a new list
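
If we want to check whether an object can be called, Python's built-in `callable` function will tell us. A quick sketch:

```python
print(callable(len))    # True -- a built-in function
print(callable(str))    # True -- a type
print(callable("abc"))  # False -- a string object cannot be called
print(callable(42))     # False -- neither can a number
```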

In general, classes (of which types are a subset) are callable -- when
we call a class we call its *constructor* method, which is used to
create a new object of that class. We will learn more about classes in
the next chapter, but you may recall that we already called some classes
to make new objects when we raised exceptions:

In [None]:
raise ValueError("There's something wrong with your number!")

ValueError: There's something wrong with your number!

Because functions are objects in Python, we can treat them just like any
other object -- we can assign a function as the value of a variable. To
refer to a function without calling it, we just use the function name
without round brackets:

In [None]:
my_function = print_a_message

# later we can call the function using the variable name
my_function()
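
Since function objects are ordinary values, we can also store them in a collection. The sketch below uses two hypothetical functions, `say_hello` and `say_goodbye`, and calls each one from a list:

```python
def say_hello():
    print("Hello!")

def say_goodbye():
    print("Goodbye!")

# the list holds the function objects themselves -- note: no round brackets
greetings = [say_hello, say_goodbye]

# call each stored function in turn
for greeting in greetings:
    greeting()
```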

Because defining a function does not cause it to execute, we can use an
identifier inside a function even if it hasn't been defined yet -- as
long as it becomes defined by the time we run the function. For example,
if we define several functions which all call each other, the order in
which we define them doesn't matter as long as they are all defined
before we start using them:

In [None]:
def my_function():
    my_other_function()

def my_other_function():
    print("Hello!")

# this is fine, because my_other_function is now defined
my_function()

Hello!


If we were to move that function call up, we would get an error:

In [None]:
def my_function2():
    my_other_function2()

# this is not fine, because my_other_function2 is not defined yet!
my_function2()

def my_other_function2():
    print("Hello!")

NameError: name 'my_other_function2' is not defined

Because of this, it is a good idea to put all function definitions near
the top of your program, so that they are executed before any of your
other statements.

### Example 1

1.  Create a function called `func_a`, which prints a message.
2.  Call the function.
3.  Assign the function object as a value to the variable `b`, without
    calling the function.
4.  Now call the function using the variable `b`.



#### Answers to Example 1

In [None]:
def func_a():
    print("This is my awesome function.")

func_a()

b = func_a

b()

This is my awesome function.
This is my awesome function.


## 2.1. Input Parameters

It is very seldom the case that the task that we want to perform with a
function is always exactly the same. There are usually minor differences
to what we need to do under different circumstances. We do not want to
write a slightly different function for each of these slightly different
cases -- that would defeat the object of the exercise! Instead, we want
to pass information into the function and use it inside the function to
tailor the function's behaviour to our exact needs. We express this
information as a series of *input parameters*.

For example, we can make the function we defined above more useful if we
make the message customisable:

In [None]:
def print_a_message(message):
    print(message)

More usefully, we can pass in two numbers and add them together:

In [None]:
def print_sum(a, b):
    print(a + b)

`a` and `b` are parameters. When we call this function, we have to pass
two parameters in, or we will get an error:

In [None]:
print_sum() # this won't work

print_sum(2, 3) # this is correct

TypeError: print_sum() missing 2 required positional arguments: 'a' and 'b'

In the example above, we are passing `2` and `3` as parameters to the
function when we call it. That means that when the function is executed,
the variable `a` will be given the value `2` and the variable `b` will
be given the value `3`. You will then be able to refer to these values
using the variable names `a` and `b` inside the function.

In languages which are statically typed, we have to declare the types of
parameters when we define the function, and we can only use variables of
those types when we call the function. If we want to perform a similar
task with variables of different types, we must define a separate
function which accepts those types.

In Python, parameters have no declared types. We can pass any kind of
variable to the `print_a_message` function above, not just a string. We
can use the `print_sum` function to add any two things which can be
added: two integers, two floats, an integer and a float, or even two
strings. We can also pass in an integer and a string, but although these
are permitted as parameters, they cannot be added together, so we will
get an error when we actually try to add them inside the function.

The advantage of this is that we do not have to write a lot of different
`print_sum` functions, one for each different pair of types, when they
would all be identical otherwise. The disadvantage is that since Python
does not check parameter types against the function definition when a
function is called, we may not immediately notice if the wrong type of
parameter is passed in -- if, for example, another person interacting
with code that we have written uses parameter types that we did not
anticipate, or if we accidentally get the parameters out of order.
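
As a minimal sketch of both sides of this trade-off, the block below repeats the `print_sum` function from above and calls it with several pairs of types. The `try` block is only there so that the example runs to the end; the exact wording of the error message depends on the Python version.

```python
def print_sum(a, b):
    print(a + b)

print_sum(2, 3)        # 5
print_sum(2.5, 0.5)    # 3.0
print_sum(2, 0.5)      # 2.5
print_sum("ab", "cd")  # abcd

# An integer and a string are accepted as parameters; the problem only
# shows up when the addition is attempted inside the function
try:
    print_sum(2, "3")
except TypeError as error:
    print("TypeError:", error)
```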

This is why it is important for us to test our code thoroughly --
something we will look at in a later chapter. If we intend to write code
which is robust, especially if it is also going to be used by other
people, it is also often a good idea to check function parameters early
in the function and give the user feedback (by raising exceptions) if
they are incorrect.
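
One possible way to do this is sketched below. The function name `print_number_sum` and the choice of `numbers.Number` as the accepted type are assumptions made for this illustration; the right check will depend on what a particular function is meant to accept.

```python
import numbers

def print_number_sum(a, b):
    # Hypothetical variant of print_sum which checks its parameters
    # before doing any work
    for value in (a, b):
        if not isinstance(value, numbers.Number):
            raise TypeError("Expected a number, got %r" % (value,))

    print(a + b)

print_number_sum(2, 3)  # 5

try:
    print_number_sum(2, "3")
except TypeError as error:
    print("Invalid parameter:", error)
```

Here the caller is told exactly which value was rejected, rather than seeing an error from somewhere deeper inside the function.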

### Example 2

1.  Create a function called `hypotenuse`, which takes two numbers as
    parameters and prints the square root of the sum of their squares.
2.  Call this function with two floats.
3.  Call this function with two integers.
4.  Call this function with one integer and one float.



#### Answers to Example 2

In [None]:
import math

def hypotenuse(x, y):
    print(math.sqrt(x**2 + y**2))

hypotenuse(12.3, 45.6)
hypotenuse(12, 34)
hypotenuse(12, 34.5)

47.22975756871932
36.05551275463989
36.52738698565776


## 2.2. Return Values

The function examples we have seen above do not return any values -- they
just result in a message being printed. We often want to use a function
to calculate some kind of value and then *return* it to us, so that we
can store it in a variable and use it later. Output which is returned
from a function is called a *return value*. We can rewrite the
`print_sum` function to return the result of its addition instead of
printing it:

In [None]:
def add(a, b):
    return a + b

We use the `return` keyword to define a return value. To access this
value when we call the function, we have to *assign* the result of the
function to a variable:

In [None]:
c = add(3, 5)
print(c)

8


Here the return value of the function will be assigned to `c` when the
function is executed.

A function can only have a single return value, but that value can be a
list or tuple, so in practice you can return as many different values
from a function as you like. It usually only makes sense to return
multiple values if they are tied to each other in some way. If you place
several values after the `return` statement, separated by commas, they
will automatically be converted to a tuple. Conversely, you can assign a
tuple to multiple variables separated by commas at the same time, so you
can *unpack* a tuple returned by a function into multiple variables:

In [None]:
def divide(dividend, divisor):
    quotient = dividend // divisor
    remainder = dividend % divisor
    return quotient, remainder

# you can do this
q, r = divide(35, 4)

# but you can also do this
result = divide(67, 9)
q1 = result[0]
r1 = result[1]

# by the way, you can also do this
a, b = (1, 2)
# or this
c, d = [5, 6]

What happens if you try to assign the result of one of our first
functions, which do not have a return value, to a variable?

In [None]:
mystery_output = print_a_message("Boo!")
print(mystery_output)

Boo!
None

All functions do actually return *something*, even if we do not define a
return value -- the default return value is `None`, which is what our
mystery output is set to.
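
The following short sketch shows the same behaviour with a hypothetical function called `shout`, which is not part of the examples above:

```python
def shout(message):
    # There is no return statement here: the function only prints
    print(message.upper())

result = shout("boo")  # prints BOO
print(result)          # None
print(result is None)  # True
```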

When a `return` statement is reached, the flow of control immediately
exits the function -- any further statements in the function body will
be skipped. We can sometimes use this to our advantage to reduce the
number of conditional statements we need to use inside a function:

In [None]:
def divide(dividend, divisor):
    if not divisor:
        return None, None # instead of dividing by zero

    quotient = dividend // divisor
    remainder = dividend % divisor
    return quotient, remainder

If the `if` clause is executed, the first `return` will cause the
function to exit -- so whatever comes after the `if` clause does not need
to be inside an `else`. The remaining statements can simply be in the
main body of the function, since they can only be reached if the `if`
clause is not executed.

This technique can be useful whenever we want to check parameters at the
beginning of a function -- it means that we do not have to indent the
main part of the function inside an `else` block. Sometimes it is more
appropriate to raise an exception instead of returning a value like
`None` if there is something wrong with one of the parameters:

In [None]:
def divide(dividend, divisor):
    if not divisor:
        raise ValueError("The divisor cannot be zero!")

    quotient = dividend // divisor
    remainder = dividend % divisor
    return quotient, remainder

Having multiple exit points scattered throughout your function can make
your code difficult to read -- most people expect a single `return`
right at the end of a function. You should use this technique sparingly.

### Example 3

1.  Rewrite the `hypotenuse` function from Example 2 so that it returns
    a value instead of printing it. Add exception handling so that the
    function returns `None` if it is called with parameters of the wrong
    type.
2.  Call the function with two numbers, and print the result.
3.  Call the function with two strings, and print the result.
4.  Call the function with a number and a string, and print the result.



#### Answers to Example 3

In [None]:
import math

def hypotenuse(x, y):
    try:
        return math.sqrt(x**2 + y**2)
    except TypeError:
        return None

print(hypotenuse(12, 34))
print(hypotenuse("12", "34"))
print(hypotenuse(12, "34"))

36.05551275463989
None
None


## 2.3. Default Parameters

The combination of the function name and the number of parameters that
it takes is called the *function signature*. In statically typed
languages, there can be multiple functions with the same name in the
same scope as long as they have different numbers or types of parameters
(in these languages, parameter types and return types are also part of
the signature).

In Python, there can only be one function with a particular name defined
in the scope -- if you define another function with the same name, you
will overwrite the first function. You must call this function with the
correct number of parameters, otherwise you will get an error.
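
A small sketch of this behaviour, using a hypothetical `greet` function which we define twice:

```python
def greet(name):
    return "Hello, %s!" % name

# A second definition with the same name replaces the first one
def greet(title, surname):
    return "Hello, %s %s!" % (title, surname)

print(greet("Dr", "Jones"))  # Hello, Dr Jones!

# The one-parameter version no longer exists, so this call fails
try:
    greet("Sam")
except TypeError as error:
    print("TypeError:", error)
```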

Sometimes there is a good reason to want to have two versions of the
same function with different sets of parameters. You can achieve
something similar to this by making some parameters *optional*. To make
a parameter optional, we need to supply a default value for it. Optional
parameters must come after all the required parameters in the function
definition:

In [None]:
def make_greeting(title, name, surname, formal=True):
    if formal:
        return "Hello, %s %s!" % (title, surname)

    return "Hello, %s!" % name

print(make_greeting("Mr", "John", "Smith"))
print(make_greeting("Mr", "John", "Smith", False))

Hello, Mr Smith!
Hello, John!


When we call the function, we can leave the optional parameter out -- if
we do, the default value will be used. If we include the parameter, our
value will override the default value.

We can define multiple optional parameters:

In [None]:
def make_greeting(title, name, surname, formal=True, time=None):
    if formal:
        fullname =  "%s %s" % (title, surname)
    else:
        fullname = name

    if time is None:
        greeting = "Hello"
    else:
        greeting = "Good %s" % time

    return "%s, %s!" % (greeting, fullname)

print(make_greeting("Mr", "John", "Smith"))
print(make_greeting("Mr", "John", "Smith", False))
print(make_greeting("Mr", "John", "Smith", False, "evening"))

Hello, Mr Smith!
Hello, John!
Good evening, John!


What if we want to pass in the *second* optional parameter, but not the
*first*? So far we have been passing *positional* parameters to all
these functions -- a tuple of values which are matched up with
parameters in the function signature based on their *positions*. We can
also, however, pass these values in as *keyword* parameters -- we can
explicitly specify the parameter names along with the values:

In [None]:
print(make_greeting(title="Mr", name="John", surname="Smith"))
print(make_greeting(title="Mr", name="John", surname="Smith", formal=False, time="evening"))

Hello, Mr Smith!
Good evening, John!


We can mix positional and keyword parameters, but the keyword parameters
must come *after* any positional parameters:

In [None]:
# this is OK
print(make_greeting("Mr", "John", surname="Smith"))
# this will give you an error
print(make_greeting(title="Mr", "John", "Smith"))

SyntaxError: positional argument follows keyword argument (<ipython-input-78-d1c79e819da0>, line 4)

We can specify keyword parameters in any order -- they do not have to
match the order in the function definition:

In [None]:
print(make_greeting(surname="Smith", name="John", title="Mr"))

Hello, Mr Smith!


Now we can easily pass in the second optional parameter and not the
first:

In [None]:
print(make_greeting("Mr", "John", "Smith", time="evening"))

Good evening, Mr Smith!


### 2.3.1. Mutable Types & Default Parameters

We should be careful when using mutable types as default parameter
values in function definitions if we intend to modify them in-place:

In [None]:
def add_pet_to_list(pet, pets=[]):
    pets.append(pet)
    return pets

list_with_cat = add_pet_to_list("cat")
list_with_dog = add_pet_to_list("dog")

print(list_with_cat)
print(list_with_dog) # oops

['cat', 'dog']
['cat', 'dog']


Remember that although we can execute a function *body* many times, a
function *definition* is executed only once. The default value is
created when the definition is executed, so the empty list in this
function definition is the same list object for every call to the
function. What we really want to do in this case is to create an empty
list inside the function body:

In [None]:
def add_pet_to_list(pet, pets=None):
    if pets is None:
        pets = []
    pets.append(pet)
    return pets
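
As a quick check, we can repeat the earlier experiment with this version of the function. The variable `my_pets` is introduced here only for illustration:

```python
def add_pet_to_list(pet, pets=None):
    if pets is None:
        pets = []  # a fresh list is created on every call
    pets.append(pet)
    return pets

list_with_cat = add_pet_to_list("cat")
list_with_dog = add_pet_to_list("dog")

print(list_with_cat)  # ['cat']
print(list_with_dog)  # ['dog']

# A list which we pass in explicitly is still modified in place
my_pets = ["fish"]
add_pet_to_list("bison", my_pets)
print(my_pets)  # ['fish', 'bison']
```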

### Example 4

1.  Write a function called `calculator`. It should take the following
    parameters: two numbers, an arithmetic operation (which can be
    addition, subtraction, multiplication or division and is addition by
    default), and an output format (which can be integer or floating
    point, and is floating point by default). Division should be
    floating-point division.


    The function should perform the requested operation on the two input
    numbers, and return a result in the requested format (if the format
    is integer, the result should be rounded and not just truncated).
    Raise exceptions as appropriate if any of the parameters passed to
    the function are invalid.


2.  Call the function with the following sets of parameters, and check
    that the answer is what you expect:


    > 1.  `2`, `3.0`
    > 2.  `2`, `3.0`, output format is integer
    > 3.  `2`, `3.0`, operation is division
    > 4.  `2`, `3.0`, operation is division, output format is integer


#### Answers to Example 4

1.  Here is an example program:

In [None]:
    ADD, SUB, MUL, DIV = range(4)

    def calculator(a, b, operation=ADD, output_format=float):
        if operation == ADD:
            result = a + b
        elif operation == SUB:
            result = a - b
        elif operation == MUL:
            result = a * b
        elif operation == DIV:
            result = a / b
        else:
            raise ValueError("Operation must be ADD, SUB, MUL or DIV.")

        if output_format == float:
            result = float(result)
        elif output_format == int:
            # round is a built-in function, so no import is needed
            result = round(result)
        else:
            raise ValueError("Format must be float or int.")

        return result

2.  You should get the following results:


    > 1.  `5.0`
    > 2.  `5`
    > 3.  `0.6666666666666666`
    > 4.  `1`


## 2.6. Lambda Functions

We have already seen that when we want to use a number or a string in
our program we can either write it as a *literal* in the place where we
want to use it or use a *variable* that we have already defined in our
code. For example, `print("Hello!")` prints the literal string
`"Hello!"`, which we have not stored in a variable anywhere, but
`print(message)` prints whatever string is stored in the variable
`message`.

We have also seen that we can store a function in a variable, just like
any other object, by referring to it by its name (but not calling it).
Is there such a thing as a function literal? Can we define a function on
the fly when we want to pass it as a parameter or assign it to a
variable, just like we did with the string `"Hello!"`?

The answer is *yes*, but only for very simple functions. We can use the
`lambda` keyword to define anonymous, one-line functions *inline* in our
code:

In [None]:
a = lambda: 3

# is the same as

def a():
    return 3

Lambdas can take parameters -- they are written between the `lambda`
keyword and the colon, without brackets. A lambda function may only
contain a single expression, and the result of evaluating this
expression is implicitly returned from the function (we don't use the
`return` keyword):

In [None]:
b = lambda x, y: x + y

# is the same as

def b(x, y):
    return x + y

Lambdas should only be used for very simple functions. If your lambda
starts looking too complicated to be readable, you should rather write
it out in full as a normal, named function.
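
Lambdas are most useful when we pass a small function directly as a parameter. The sketch below assumes the built-in `sorted` function and its `key` parameter, which may not have been introduced yet; `last_letter` is a hypothetical named function with the same behaviour, shown for comparison.

```python
animals = ['cat', 'dog', 'fish', 'bison']

# The lambda is defined inline, exactly where it is needed
by_last_letter = sorted(animals, key=lambda word: word[-1])
print(by_last_letter)  # ['dog', 'fish', 'bison', 'cat']

# The same thing with a normal, named function
def last_letter(word):
    return word[-1]

print(sorted(animals, key=last_letter))  # ['dog', 'fish', 'bison', 'cat']
```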

### Example 5

1.  Define the following functions as lambdas, and assign them to
    variables:


    > 1.  Take one parameter; return its square
    > 2.  Take two parameters; return the square root of the sum of
    >     their squares
    > 3.  Take any number of parameters; return their average
    > 4.  Take a string parameter; return a string which contains the
    >     unique letters in the input string (in any order)


2. Do the same with standard function syntax.

#### Answers to Example 5

1.  Here is an example program:

In [None]:
    import math

    a = lambda x: x**2
    b = lambda x, y: math.sqrt(x**2 + y**2)
    c = lambda *args: sum(args)/len(args)
    d = lambda s: "".join(set(s))

2.  Here is an example program:

In [None]:
    import math

    def a(x):
        return x**2

    def b(x, y):
        return math.sqrt(x**2 + y**2)

    def c(*args):
        return sum(args)/len(args)

    def d(s):
        return "".join(set(s))

    d("avril")

'lriav'

© Copyright 2013, 2014, Confluence (https://github.com/confluence) and individual contributors. This work is released under the CC BY-SA 4.0 licence.