# Tutorial: Functions

In this tutorial, you will learn how to create your own reusable Python functions.

Learning objectives:
1. Create custom functions
2. Create anonymous (lambda) functions
3. Manipulate sequence-like objects through built-in sequence functions and generators
4. Handle exceptions

## 1. Creating custom functions

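Functions are declared with the *def* keyword and hand a value back to the caller with the *return* keyword. In the example below, *z* has a default value of 1.5, so it may be omitted when the function is called: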

In [1]:
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

In [2]:
my_function(5, 5, 5)

50
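
As a brief sketch of how the default value of *z* comes into play, the calls below repeat the definition of my_function from above. The argument values are illustrative only.

```python
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

# z is omitted, so the default 1.5 is used: 1.5 * (5 + 5)
print(my_function(5, 5))              # 15.0

# Arguments can also be passed by keyword, in any order: 0.5 / (1 + 1)
print(my_function(y=1, x=1, z=0.5))   # 0.25
```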

### Returning Multiple Values

In data analysis and other scientific applications, you may often find yourself returning multiple values from a function. What’s happening in the example below is that the function actually returns just one object, namely a tuple, which is then unpacked into the result variables.

In [3]:
def f():
    a = 5
    b = 6
    c = 7

    return a, b, c

a, b, c = f()

print(f'{a} {b} {c}')

5 6 7
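
To make the tuple visible, the sketch below keeps the return value whole instead of unpacking it. It repeats the definition of f from above; the name result is only illustrative.

```python
def f():
    a = 5
    b = 6
    c = 7
    return a, b, c

result = f()          # no unpacking: the tuple is kept as a single object
print(type(result))   # <class 'tuple'>
print(result)         # (5, 6, 7)
print(result[1])      # 6
```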


### Functions are objects too

You can use functions as arguments to other functions:

In [4]:
import re

def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = set()
    for value in strings:
        for function in ops:
            value = function(value)
        result.add(value)
    
    return result

In [5]:
states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda', 'south carolina##', 'West virginia?']

clean_strings(states, clean_ops)

{'Alabama', 'Florida', 'Georgia', 'South Carolina', 'West Virginia'}
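
Another way to see a function being passed as an argument is the built-in map, which applies a function to every element of an iterable. This is a minimal sketch that reuses remove_punctuation from above with a shortened, illustrative list of states.

```python
import re

def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

states = ['Georgia!', 'south carolina##', 'West virginia?']

# map receives the function object itself (no parentheses) and returns an iterator
cleaned = list(map(remove_punctuation, states))
print(cleaned)  # ['Georgia', 'south carolina', 'West virginia']
```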

## 2. Anonymous (lambda) functions

Python has support for anonymous or lambda functions, which are a way of writing functions consisting of a single expression, the result of which is the return value. They are defined with the lambda keyword, which has no meaning other than "we are declaring an anonymous function."

One reason lambda functions are called anonymous functions is that, unlike functions declared with the def keyword, the function object is never given a meaningful name: its \__name__ attribute is always the generic string `'<lambda>'`.

In [6]:
def short_function(x):
    return x*2

equiv_anon = lambda x: x * 2

> Lambda functions are convenient in data analysis because, as you’ll see, there are many cases where data transformation functions will take functions as arguments. It’s often less typing (and clearer) to pass a lambda function as opposed to writing a full-out function declaration or even assigning the lambda function to a local variable.
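
The sketch below illustrates both points: a lambda passed directly as the *key* argument of sorted, and the \__name__ attribute of a lambda compared with that of a def function. The list strings is made up for the example.

```python
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

# Sort by the number of distinct letters in each string
by_distinct = sorted(strings, key=lambda s: len(set(s)))
print(by_distinct)  # ['aaaa', 'foo', 'abab', 'bar', 'card']

def short_function(x):
    return x * 2

equiv_anon = lambda x: x * 2

print(short_function.__name__)  # short_function
print(equiv_anon.__name__)      # <lambda>
```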

## 3. Built-in sequence functions

Python has a handful of useful functions to work with iterable objects that you should familiarize yourself with and use at any opportunity.

### len()

Return the length (the number of items) of an object. The argument may be a sequence or a collection.

In [7]:
a_list = [5, 6, 1, 2, 10]
len(a_list) # number of items in the list

5

In [8]:
dictionary = {'a': 1, 'b': 2}
len(dictionary) # number of items in the dictionary

2

### enumerate()

It’s common when iterating over a sequence to want to keep track of the index of the current item. 

In [9]:
tup = ('a', 'b', 'c', 'd')

index = 0
for value in tup:
    # do something with index and value
    print(index, "->", value)
    
    # increase index to retrieve next item in collection
    index += 1

0 -> a
1 -> b
2 -> c
3 -> d


Since this is so common, Python has a built-in function, enumerate, which returns an iterator of (i, value) tuples, so the index no longer has to be tracked by hand:

In [10]:
for index, value in enumerate(tup):
    print(index, "->", value)

0 -> a
1 -> b
2 -> c
3 -> d
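
A common pattern, sketched here with an illustrative list, is to use enumerate to build a dict that maps each value to its position. The optional *start* argument changes the first index.

```python
some_list = ['foo', 'bar', 'baz']

# Map each value to its position in the list
mapping = {}
for index, value in enumerate(some_list):
    mapping[value] = index
print(mapping)  # {'foo': 0, 'bar': 1, 'baz': 2}

# Count from 1 instead of 0
numbered = list(enumerate(some_list, start=1))
print(numbered)  # [(1, 'foo'), (2, 'bar'), (3, 'baz')]
```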


### sorted()

The sorted function returns a new sorted list from the elements of any iterable, leaving the original object unchanged.

The sorted function accepts the same arguments (*key* and *reverse*) as the sort method on lists.

In [11]:
a_list = [5, 6, 1, 2, 10]

sorted(a_list)

[1, 2, 5, 6, 10]

In [12]:
a_list

[5, 6, 1, 2, 10]
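
The *key* and *reverse* arguments can be sketched as follows; the list words is invented for the example.

```python
words = ['banana', 'Cherry', 'apple', 'fig']

# Default ordering compares strings character by character,
# so uppercase letters sort before lowercase ones
print(sorted(words))                         # ['Cherry', 'apple', 'banana', 'fig']

# key is called on each element to compute the value to sort by
print(sorted(words, key=str.lower))          # ['apple', 'banana', 'Cherry', 'fig']

# reverse=True sorts in descending order (here: longest first)
print(sorted(words, key=len, reverse=True))  # ['banana', 'Cherry', 'apple', 'fig']
```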

### range()

The range function returns a range object, a lazy sequence of evenly spaced integers:

In [13]:
r = range(10)
r

range(0, 10)

The range function does not directly return all the elements in the range. If we need to, we can use *list()* to generate all the elements described by the range object at once:

In [14]:
list(r) # We use the *list* function to materialize the objects specified in the range object

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

> Range objects and generators (see below) consume less memory than a list or tuple holding the same elements. Irrespective of the range it represents, a range object stores only its start, stop, and step values and computes each element on demand.
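
The memory difference can be checked with sys.getsizeof, as in this rough sketch. Exact byte counts depend on the Python version and platform, so only the comparison is shown.

```python
import sys

r = range(1_000_000)
as_list = list(r)

# The range object is far smaller than the materialized list
print(sys.getsizeof(r) < sys.getsizeof(as_list))  # True

# A range still supports len(), indexing, and membership tests without being expanded
print(len(r))        # 1000000
print(r[10])         # 10
print(999_999 in r)  # True
```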

We can specify a start and end:

In [15]:
list(range(10, 20))

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Besides a start and end, a step (which may be negative) can be given:

In [16]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [17]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

A common use of range is for iterating through sequences by index:

In [18]:
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]
    print(val)

1
2
3
4


While you can use functions like *list* to store all the integers generated by range in some other data structure, the lazy range object itself will often be what you want to use. This snippet sums all numbers from 0 to 99,999 that are multiples of 3 or 5:

In [19]:
total = 0
for i in range(100000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i
        
print(total)

2333316668


### reversed()

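reversed returns an iterator over the elements of a sequence in reverse order. The reversed values are not materialized until the iterator is consumed, for example with list or a for loop: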

In [20]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

### zip()

zip “pairs” up the elements of a number of lists, tuples, or other sequences and returns an iterator of tuples, which can be passed to list to materialize it. zip can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence:

In [21]:
seq_1 = ['foo', 'bar', 'baz']
seq_2 = ['one', 'two', 'three']

list(zip(seq_1, seq_2))

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

In [22]:
seq_3 = [False, True]

list(zip(seq_1, seq_2, seq_3))

[('foo', 'one', False), ('bar', 'two', True)]
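
Two common uses of zip are sketched below: iterating over several sequences at once (here combined with enumerate), and "unzipping" a list of pairs with the `*` operator. The names pairs, first, and second are illustrative.

```python
seq_1 = ['foo', 'bar', 'baz']
seq_2 = ['one', 'two', 'three']

# Walk both sequences in step, with a running index
for index, (a, b) in enumerate(zip(seq_1, seq_2)):
    print(f'{index}: {a}, {b}')

# Convert a list of pairs back into one tuple per position
pairs = [('foo', 'one'), ('bar', 'two'), ('baz', 'three')]
first, second = zip(*pairs)
print(first)   # ('foo', 'bar', 'baz')
print(second)  # ('one', 'two', 'three')
```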

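### Generator functions

A generator function looks like a regular function but uses the *yield* keyword instead of *return*. Calling it returns a generator, which produces its values one at a time as it is iterated over. We will use a generator to list the factors of a given integer: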

In [25]:
def factors(x):
    for i in range(1, x + 1):
        if x % i == 0:
            yield i

In [26]:
for n in factors(21):
    print(n)

1
3
7
21
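
The lazy behavior can be observed by pulling values out one at a time with the built-in next. This sketch repeats the definition of factors from above.

```python
def factors(x):
    for i in range(1, x + 1):
        if x % i == 0:
            yield i

gen = factors(21)         # nothing in the function body has run yet
first = next(gen)         # runs until the first yield
second = next(gen)        # resumes and runs until the next yield
remaining = list(gen)     # consumes whatever is left
after = list(gen)         # the generator is now exhausted

print(first, second)      # 1 3
print(remaining)          # [7, 21]
print(after)              # []
```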


### Generator expressions

An even more concise way to make a generator is to use a generator expression. This is a generator analogue to list, dict, and set comprehensions; to create one, enclose what would otherwise be a list comprehension within parentheses instead of brackets:

In [27]:
gen = (x ** 2 for x in range(100))
gen

<generator object <genexpr> at 0x10f4d61d0>

Generator expressions can be used instead of list comprehensions as function arguments in many cases:

In [28]:
sum(x ** 2 for x in range(100))

328350

In [29]:
max(x ** 2 for x in range(100))

9801
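
For comparison, this short sketch puts a list comprehension next to the equivalent generator expression, and passes a generator expression of pairs to dict. The variable names are illustrative.

```python
squares_list = [x ** 2 for x in range(5)]   # list built eagerly
squares_gen = (x ** 2 for x in range(5))    # values produced on demand

print(squares_list)             # [0, 1, 4, 9, 16]
materialized = list(squares_gen)
print(materialized)             # [0, 1, 4, 9, 16]

# A generator expression can feed any function that accepts an iterable
square_map = dict((i, i ** 2) for i in range(5))
print(square_map)               # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```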

## 4. Exception handling

Handling Python errors or exceptions gracefully is an important part of building robust programs. In data analysis applications, many functions only work on certain kinds of input.

As an example, Python’s float function is capable of casting a string to a floating-point number, but fails with ValueError on improper inputs. Suppose we wanted a version of float that fails gracefully, returning the input argument. We can do this by writing a function that encloses the call to float in a try/except block:

In [30]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x

In [31]:
attempt_float('1.2345')

1.2345

In [32]:
attempt_float('something')

'something'

You might notice that float can raise exceptions other than ValueError. You might want to only suppress ValueError, since a TypeError (the input was not a string or numeric value) might indicate a legitimate bug in your program.

To do that, write the exception type after except:

In [33]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x
    
attempt_float((1, 2))

TypeError: float() argument must be a string or a number, not 'tuple'

You can catch multiple exception types by writing a tuple of exception types instead (the parentheses are required):

In [35]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

In [36]:
attempt_float((1, 2))

(1, 2)
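
As a usage sketch, a function like this can be applied to a whole list of mixed inputs without any single bad value stopping the conversion. The list raw_values is made up for the example.

```python
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

raw_values = ['1.5', 'n/a', '42', None, '  7 ']

# Values that cannot be converted are passed through unchanged
cleaned = [attempt_float(v) for v in raw_values]
print(cleaned)  # [1.5, 'n/a', 42.0, None, 7.0]
```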

### The *finally* keyword

In some cases, you may not want to suppress an exception, but you want some code to be executed regardless of whether the code in the try block succeeds or not. To do this, use finally:

In [37]:
def attempt_float(x):
    try:
        return float(x)    
    finally:
        print("FINALLY")
        
attempt_float("string")

FINALLY


ValueError: could not convert string to float: 'string'

You can also combine except and finally. In that case the exception is handled by the except block, and the finally block still runs before the function returns:

In [38]:
def attempt_float(x):
    try:
        return float(x)    
    except ValueError:
        return x
    finally:
        print("FINALLY")
        
attempt_float("string")

FINALLY


'string'
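
A try statement can also carry an *else* clause, which runs only if the try block raises no exception. The sketch below uses a hypothetical helper, describe_float, that records which clauses ran. Note that the try block does not return directly, because a return inside try would skip the else clause.

```python
def describe_float(x):
    """Hypothetical helper: convert x and record which clauses executed."""
    steps = []
    try:
        value = float(x)
    except ValueError:
        steps.append("except")   # runs only when float(x) fails
        value = None
    else:
        steps.append("else")     # runs only when float(x) succeeds
    finally:
        steps.append("finally")  # runs in both cases
    return value, steps

print(describe_float("1.5"))  # (1.5, ['else', 'finally'])
print(describe_float("abc"))  # (None, ['except', 'finally'])
```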