<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="fig/cover-small.jpg">
*This notebook contains an excerpt from the [Whirlwind Tour of Python](http://www.oreilly.com/programming/free/a-whirlwind-tour-of-python.csp) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/WhirlwindTourOfPython).*

*The text and code are released under the [CC0](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/LICENSE) license; see also the companion project, the [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook).*



# Defining and Using Functions

So far, our scripts have been simple, single-use code blocks.
One way to organize our Python code and make it more readable and reusable is to factor out useful pieces into *functions*.
Here we'll cover two ways of creating functions: the ``def`` statement, useful for any type of function, and the ``lambda`` expression, useful for creating short anonymous functions.

## Using Functions

Functions are groups of code that have a name, and can be called using parentheses.
We've seen functions before. For example, ``print`` in Python 3 is a function:

In [1]:
print('abc')

abc


Here ``print`` is the function name, and ``'abc'`` is the function's *argument*.

In addition to arguments, there are *keyword arguments* that are specified by name.
One available keyword argument for the ``print()`` function (in Python 3) is ``sep``, which tells what character or characters should be used to separate multiple items:

In [2]:
print(1, 2, 3)

1 2 3


In [3]:
print(1, 2, 3, sep='--')

1--2--3


When non-keyword arguments are used together with keyword arguments, the keyword arguments must come at the end.
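The same ordering rule applies to other built-in functions. As a small sketch (the variable names are chosen only for this illustration), ``sorted`` accepts a ``reverse`` keyword and ``int`` accepts a ``base`` keyword, each given after the positional argument:

```python
# Positional argument first, keyword argument after it
descending = sorted([3, 1, 2], reverse=True)
value = int('ff', base=16)

print(descending)  # [3, 2, 1]
print(value)       # 255
```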

## Defining Functions
Functions become even more useful when we begin to define our own, organizing functionality to be used in multiple places.
In Python, functions are defined with the ``def`` statement.
For example, we can encapsulate a version of our Fibonacci sequence code from the previous section as follows:

In [4]:
def fibonacci(N):
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

Now we have a function named ``fibonacci`` which takes a single argument ``N``, does something with this argument, and ``return``s a value; in this case, a list of the first ``N`` Fibonacci numbers:

In [5]:
fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

If you're familiar with statically typed languages like ``C``, you'll immediately notice that there is no type information associated with the function inputs or outputs.
Python functions can return any Python object, simple or compound, which means constructs that may be difficult in other languages are straightforward in Python.

For example, multiple return values are simply put in a tuple, which is indicated by commas:

In [6]:
def real_imag_conj(val):
    return val.real, val.imag, val.conjugate()

r, i, c = real_imag_conj(3 + 4j)
print(r, i, c)

3.0 4.0 (3-4j)
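One consequence of having no declared types is that a single function can often serve several kinds of input. The following sketch uses a hypothetical ``double`` function (not part of the original text) that works for any object supporting ``+`` with itself:

```python
def double(x):
    # No type declaration: any object that supports "+" with itself works
    return x + x

doubled_int = double(21)
doubled_str = double('ab')
doubled_list = double([1, 2])

print(doubled_int, doubled_str, doubled_list)
```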


## Default Argument Values

Often when defining a function, there are certain values that we want the function to use *most* of the time, but we'd also like to give the user some flexibility.
In this case, we can use *default values* for arguments.
Consider the ``fibonacci`` function from before.
What if we would like the user to be able to play with the starting values?
We could do that as follows:

In [7]:
def fibonacci(N, a=0, b=1):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

With a single argument, the result of the function call is identical to before:

In [8]:
fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

But now we can use the function to explore new things, such as the effect of new starting values:

In [9]:
fibonacci(10, 0, 2)

[2, 2, 4, 6, 10, 16, 26, 42, 68, 110]

The values can also be specified by name if desired, in which case the order of the named values does not matter:

In [10]:
fibonacci(10, b=3, a=1)

[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]
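Naming an argument is also the simplest way to override a later default while leaving an earlier one untouched. Here is a minimal sketch using a hypothetical ``greet`` function, invented only for this illustration:

```python
def greet(name, greeting='Hello', punctuation='!'):
    return greeting + ', ' + name + punctuation

# All defaults used
default_call = greet('Ada')
# Second argument given by position
positional_call = greet('Ada', 'Hi')
# Only the last default overridden, by name
keyword_call = greet('Ada', punctuation='?')

print(default_call)
print(positional_call)
print(keyword_call)
```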

## ``*args`` and ``**kwargs``: Flexible Arguments
Sometimes you might wish to write a function in which you don't initially know how many arguments the user will pass.
In this case, you can use the special forms ``*args`` and ``**kwargs`` to catch all arguments that are passed.
Here is an example:

In [11]:
def catch_all(*args, **kwargs):
    print("args =", args)
    print("kwargs = ", kwargs)

In [12]:
catch_all(1, 2, 3, a=4, b=5)

args = (1, 2, 3)
kwargs =  {'a': 4, 'b': 5}


In [13]:
catch_all('a', keyword=2)

args = ('a',)
kwargs =  {'keyword': 2}


Here it is not the names ``args`` and ``kwargs`` that are important, but the ``*`` characters preceding them.
``args`` and ``kwargs`` are just the variable names often used by convention, short for "arguments" and "keyword arguments".
The operative difference is the asterisk characters: a single ``*`` before a variable means "expand this as a sequence", while a double ``**`` before a variable means "expand this as a dictionary".
In fact, this syntax can be used not only with the function definition, but with the function call as well!

In [14]:
inputs = (1, 2, 3)
keywords = {'pi': 3.14}

catch_all(*inputs, **keywords)

args = (1, 2, 3)
kwargs =  {'pi': 3.14}
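A common use of the two forms together is to collect arguments in a definition and pass them straight on in a call. The sketch below assumes two hypothetical functions, ``call_and_report`` and ``area``, written only to illustrate the idea:

```python
def call_and_report(func, *args, **kwargs):
    # Collect whatever was passed, then unpack it again in the call to func
    result = func(*args, **kwargs)
    print(func.__name__, "returned", result)
    return result

def area(width, height=1):
    return width * height

a1 = call_and_report(area, 3)
a2 = call_and_report(area, 3, height=4)
```

Because ``call_and_report`` never names the arguments of ``func``, it should work unchanged with functions that take different arguments.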


## Anonymous (``lambda``) Functions
Earlier we quickly covered the most common way of defining functions, the ``def`` statement.
You'll likely come across another way of defining short, one-off functions: the ``lambda`` expression.
It looks something like this:

In [15]:
add = lambda x, y: x + y
add(1, 2)

3

This lambda function is roughly equivalent to

In [16]:
def add(x, y):
    return x + y

A ``lambda`` differs from a function defined with ``def`` in that its body must be a single expression and cannot contain statements. The ``lambda`` expression itself evaluates to a function object: the assignment ``add = lambda x, y: x + y`` does not compute ``x + y``, but stores a function that computes ``x + y`` when it is called.
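To make the distinction between the function object and its result concrete, here is a short sketch; ``square`` and ``square_def`` are hypothetical names used only for this comparison:

```python
square = lambda x: x * x

def square_def(x):
    return x * x

# The lambda expression evaluates to an ordinary function object...
same_type = type(square) is type(square_def)
# ...and calling that object evaluates the expression
called = square(4)
# Unlike a def function, a lambda has no name of its own
lambda_name = square.__name__

print(same_type, called, lambda_name)
```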

So why would you ever want to use such a thing?
Primarily, it comes down to the fact that *everything is an object* in Python, even functions themselves!
That means that functions can be passed as arguments to functions.

As an example of this, suppose we have some data stored in a list of dictionaries:

In [17]:
data = [{'first':'Guido', 'last':'Van Rossum', 'YOB':1956},
        {'first':'Grace', 'last':'Hopper',     'YOB':1906},
        {'first':'Alan',  'last':'Turing',     'YOB':1912}]

Now suppose we want to sort this data.
Python has a ``sorted`` function that does this:

In [18]:
sorted([2,4,3,5,1,6])

[1, 2, 3, 4, 5, 6]

But dictionaries are not orderable: we need a way to tell the function *how* to sort our data.
We can do this by specifying the ``key`` function, a function which given an item returns the sorting key for that item:

In [19]:
# sort alphabetically by first name
sorted(data, key=lambda item: item['first'])

[{'YOB': 1912, 'first': 'Alan', 'last': 'Turing'},
 {'YOB': 1906, 'first': 'Grace', 'last': 'Hopper'},
 {'YOB': 1956, 'first': 'Guido', 'last': 'Van Rossum'}]

In [20]:
# sort by year of birth
sorted(data, key=lambda item: item['YOB'])

[{'YOB': 1906, 'first': 'Grace', 'last': 'Hopper'},
 {'YOB': 1912, 'first': 'Alan', 'last': 'Turing'},
 {'YOB': 1956, 'first': 'Guido', 'last': 'Van Rossum'}]

While these key functions could certainly be created with the normal ``def`` syntax, the ``lambda`` syntax is convenient for short one-off functions like these.
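For comparison, here is a sketch of the ``def`` version of a key function. The name ``by_last_name`` is an assumption made for this example, and the data is repeated so the block runs on its own:

```python
data = [{'first':'Guido', 'last':'Van Rossum', 'YOB':1956},
        {'first':'Grace', 'last':'Hopper',     'YOB':1906},
        {'first':'Alan',  'last':'Turing',     'YOB':1912}]

def by_last_name(item):
    # Return the value that sorted() should compare
    return item['last']

with_def = sorted(data, key=by_last_name)
with_lambda = sorted(data, key=lambda item: item['last'])

last_names = [item['last'] for item in with_def]
print(last_names)
```

Both calls should produce the same ordering; the ``lambda`` form simply avoids giving a name to a function that is used once.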

Lambda functions are frequently used with higher-order functions, which take one or more functions as arguments or return one or more functions.

A lambda function can be a higher-order function by taking a function (normal or lambda) as an argument like in the following contrived example:

In [3]:
high_ord_func = lambda x, func: x + func(x)



In [4]:
high_ord_func(2, lambda x: x * x)


6

In [5]:
high_ord_func(2, lambda x: x + 3)

7
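The other kind of higher-order function mentioned above returns a function rather than taking one. A minimal sketch, using a hypothetical ``make_adder`` helper:

```python
def make_adder(n):
    # Return a new function that remembers the value of n
    return lambda x: x + n

add_three = make_adder(3)
result = add_three(2)
chained = make_adder(10)(1)

print(result, chained)
```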

# Map and Reduce

# The map() Function
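The ``map()`` function applies the function we pass as its first argument to every item of the given iterable. In Python 3 it returns a lazy iterator rather than a list, which is why the example below wraps the result in ``list()``.

The syntax is ``map(function, iterable(s))``.

We can pass as many iterables as we want after the function, as long as the function accepts one argument per iterable. The following example uses a single iterable: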



In [6]:
def starts_with_A(s):
    return s[0] == "A"

fruit = ["Apple", "Banana", "Pear", "Apricot", "Orange"]
map_object = map(starts_with_A, fruit)

print(list(map_object))

[True, False, False, True, False]
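When several iterables are passed, ``map()`` takes one item from each per call. The following sketch uses made-up ``widths`` and ``heights`` lists, and also shows that iteration stops when the shortest iterable runs out:

```python
widths = [1, 2, 3]
heights = [10, 20, 30]

# The function receives one item from each iterable per call
areas = list(map(lambda w, h: w * h, widths, heights))

# With iterables of unequal length, map() stops at the shortest one
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20]))

print(areas)
print(sums)
```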


# The reduce() Function
``reduce()`` works differently from ``map()`` and ``filter()``: instead of producing one result per item of the iterable, it combines all the items into a single value.

Also, in Python 3 reduce() isn't a built-in function anymore, and it can be found in the functools module.

The syntax is ``reduce(function, sequence[, initial])``.

``reduce()`` works by calling the function we passed on the first two items in the sequence. The result is then passed to another call of the function along with the next (third) element.


This process repeats until we've gone through all the elements in the sequence.

The optional argument ``initial``, when present, is used as the first argument in the first call to the function, together with the first element of the sequence. In effect, the initial value acts as a 0th element placed before the first one.

``reduce()`` is a bit harder to understand than ``map()`` and ``filter()``, so let's look at a step-by-step example:

1. We start with a list ``[2, 4, 7, 3]`` and pass the ``add(x, y)`` function to ``reduce()`` alongside this list, without an initial value.
2. ``reduce()`` calls ``add(2, 4)``, and ``add()`` returns 6.
3. ``reduce()`` calls ``add(6, 7)`` (the result of the previous call to ``add()`` and the next element in the list), and ``add()`` returns 13.
4. ``reduce()`` calls ``add(13, 3)``, and ``add()`` returns 16.
5. Since no more elements are left in the sequence, ``reduce()`` returns 16.

Had we given an initial value, the only difference would have been an additional step 1.5, in which ``reduce()`` calls ``add(initial, 2)`` and uses that return value in place of 2 in step 2.

Let's go ahead and use the reduce() function:

In [7]:
from functools import reduce

def add(x, y):
    return x + y

# named "numbers" to avoid shadowing the built-in list()
numbers = [2, 4, 7, 3]
print(reduce(add, numbers))

16
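The step-by-step description of ``reduce()`` given earlier can be sketched as a plain loop. The helper ``reduce_by_hand`` below is hypothetical and is not how ``functools.reduce`` is actually implemented; as a simplification, it treats ``None`` as "no initial value given":

```python
from functools import reduce

def add(x, y):
    return x + y

def reduce_by_hand(function, sequence, initial=None):
    # Hypothetical helper for illustration only
    items = iter(sequence)
    # Without an initial value, the first element seeds the running result
    result = next(items) if initial is None else initial
    for item in items:
        result = function(result, item)
    return result

numbers = [2, 4, 7, 3]
by_hand = reduce_by_hand(add, numbers)
by_hand_initial = reduce_by_hand(add, numbers, 100)
with_initial = reduce(add, numbers, 100)

print(by_hand, by_hand_initial, with_initial)
```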


# Revise

In [1]:
from functools import reduce

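nums = [3, 2, 6, 8, 4, 6, 2, 9]

# keep only the even numbers
evens = list(filter(lambda n: n % 2 == 0, nums))

# double each remaining number
doubles = list(map(lambda n: n * 2, evens))
print(doubles)

# add everything up; named "total" to avoid shadowing the built-in sum()
total = reduce(lambda a, b: a + b, doubles)

print(total)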

[4, 12, 16, 8, 12, 4]
56
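The ``filter()`` call used above keeps only the items for which the given function returns a true value. As a sketch of that step on its own, here it is next to a list comprehension that should give the same result (the name ``evens_comprehension`` is chosen only for this comparison):

```python
nums = [3, 2, 6, 8, 4, 6, 2, 9]

# filter() keeps the items for which the function returns a true value
evens = list(filter(lambda n: n % 2 == 0, nums))

# The same selection written as a list comprehension
evens_comprehension = [n for n in nums if n % 2 == 0]

print(evens)
```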


# Applying a function to a pandas Series or DataFrame

In [1]:
import pandas as pd

In [2]:
url = 'http://bit.ly/kaggletrain'
train = pd.read_csv(url)
train.head(3)

| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |


# map() function as a Series method
Mostly used for mapping categorical data to numerical data

In [3]:
# create a new column by mapping each category to a number
train['Sex_num'] = train.Sex.map({'female':0, 'male':1})
# compare the Sex and Sex_num columns:
# male is mapped to 1 and female to 0
train.loc[0:4, ['Sex', 'Sex_num']]

| | Sex | Sex_num |
|---|---|---|
| 0 | male | 1 |
| 1 | female | 0 |
| 2 | female | 0 |
| 3 | female | 0 |
| 4 | male | 1 |


# apply() function as a Series method
Applies a function to each element in the Series

In [4]:
# compute the length of each string in the "Name" column

# create a new column by applying Python's built-in len function
train['Name_length'] = train.Name.apply(len)
# apply() calls the function on each element of the Series
train.loc[0:4, ['Name', 'Name_length']]

| | Name | Name_length |
|---|---|---|
| 0 | Braund, Mr. Owen Harris | 23 |
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | 51 |
| 2 | Heikkinen, Miss. Laina | 22 |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 44 |
| 4 | Allen, Mr. William Henry | 24 |
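The two Series methods can also be tried without downloading the dataset. The sketch below assumes a small hand-built Series (the values are made up for illustration) and shows one detail worth knowing: with a dictionary, ``map()`` produces ``NaN`` for any value that has no matching key.

```python
import pandas as pd

# A tiny stand-in for the "Sex" column; 'unknown' has no entry in the mapping
sex = pd.Series(['male', 'female', 'unknown'])

# map() with a dictionary: values without a matching key become NaN
sex_num = sex.map({'female': 0, 'male': 1})

# apply() with a function: the function is called on each element
lengths = sex.apply(len)

print(sex_num)
print(lengths)
```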
