# Functions

Functions are packaged-up code. There are three places you can access functions from: built in to Python, inside modules, and inside objects.

## Built-in Functions

### Function Objects

Python has a number of built-in functions for our convenience.

In [6]:
len

<function len(obj, /)>

In [13]:
sum

<function sum(iterable, /, start=0)>

In [14]:
print

<function print>

These functions are already defined in the background for us. There is some code hidden inside each of these functions. 

### Calling a Function

If we want to run a function, we "call" it. You call it by putting round brackets after it. These round brackets allow you to pass objects into the function if you want or need to.

Some functions can be called with nothing passed in. For instance, `print()` will print out an empty line; `list()` will create an empty list.

In [18]:
print()




In [17]:
list()

[]

Inside those round brackets, we can pass objects into our functions if we want. This depends on the function itself and what it will accept.

Passing a string into `print()` prints it out.

In [25]:
print("Hello World!")

Hello World!


Passing a list into `sum()` sums up all the elements.

In [24]:
sum([7,3])

10

Passing an object into `type()` returns the object type.

In [26]:
type("Hello")

str

Once we call a function, the expression changes entirely. It no longer refers to the function object; it evaluates to whatever the function returns.

So while `sum` is a function, `sum([7,3])` is an integer (in this case `10`). This is done by "returning" the sum of `7 + 3` (more on this later). We can check this by passing it into `type()`. 

In [28]:
type(sum)

builtin_function_or_method

In [27]:
type(sum([7,3]))

int
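The difference between a function object and the value it returns can be checked directly. The snippet below is a minimal sketch using only the built-ins shown above; the variable names `func` and `result` are just illustrative.

```python
# A function name on its own refers to the function object.
func = sum
print(callable(func))          # a function object is something we can call

# Adding round brackets calls it; the expression becomes the returned value.
result = sum([7, 3])
print(result)
print(type(result).__name__)

# The function object can still be called through the other name.
print(func([7, 3]) == result)
```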

## Functions in Modules

Not all the functions we want to use are built in to Python. Some are stored in separate files. These files are called modules. They are just files that contain a load of functions, one after the other.

### Importing Functions

We can import these modules (files of functions) by using the `import <module>` statement. Once a module is imported, you can access functions inside of it using the `.` notation. So: 

`<module>.<function>`

For instance, numpy is a module that contains loads of handy numerical functions. One of them is called `mean()`, which will return the average of a list. So we can import the module and call the function through it:

In [29]:
import numpy

In [31]:
numpy.mean([1,2,3,4,5])

3.0

### Module Alias

Since we will be going inside these modules quite often, it is common to give them a shorthand alias. This is done by adding `as` to the import statement. So:

`import <module> as <alias>`

NumPy, for instance, is almost always abbreviated to `np`.

In [32]:
import numpy as np

In [33]:
np.mean([1,2,3,4,5])

3.0

### Direct Import

You can even go so far as to import the function directly. This will take it out of that module and make it like a built-in function. In this case you would no longer need to use the `.` notation to access it. 

This can be done by using `from <module> import <function>`.

For instance, `mean()` is not one of Python's built-in functions (see below: `mean([1,2,3,4,5])` causes an error because it is not a recognised name). It can only be accessed through the numpy package, as `np.mean()`. However, we can import that function directly by stating `from numpy import mean`.

In [36]:
mean([1,2,3,4,5])

NameError: name 'mean' is not defined

In [37]:
from numpy import mean

In [39]:
mean([1,2,3,4,5])

3.0
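The three import styles can be compared side by side. This sketch assumes the standard-library `statistics` module in place of numpy, purely so that it runs without installing anything; the alias `st` is an arbitrary choice, not a convention.

```python
# Style 1: import the whole module and use the dot notation.
import statistics
print(statistics.mean([1, 2, 3, 4, 5]))

# Style 2: import the module under a shorter alias.
import statistics as st
print(st.mean([1, 2, 3, 4, 5]))

# Style 3: import just the function, so no dot notation is needed.
from statistics import mean
print(mean([1, 2, 3, 4, 5]))

# All three names point at the very same function object.
print(statistics.mean is st.mean is mean)
```

Whichever style is used, the function being called is the same; only the name we reach it by changes.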

## Functions in Objects

Given that we are working in the object-oriented programming (OOP) paradigm, we can also package these functions up *inside* objects. These are also accessed using the `.` notation. So, `<object>.<function>`. These functions operate a little bit differently. Instead of us passing the object into the round brackets (e.g. `mean([1,2,3])`), we go *inside* the object and call the function that lives within it, on the object itself! A bit confusing, but examples will help.

Let's start with a sentence:

In [141]:
sentence = "This is a sentence"

So far, we have been passing objects into functions:

In [142]:
len(sentence)

18

But we could also go inside the string and call a function from within.

In [143]:
sentence.upper()

'THIS IS A SENTENCE'

Calling the function from within the string effectively passes the string into the function. Notice we still include the round brackets. These are empty as the string is automatically being passed in.

The functions that are available for you to use will depend on the object type.

Compare this with how we access functions from modules. We will make a numpy array and look at the functions inside it.

In [48]:
my_array = np.array([1,2,3,4,5,6])

First, we access the `mean()` function from within the numpy module and pass the array into it.

In [49]:
np.mean(my_array)

3.5

Next, we go inside the array object, access its `mean()` function, and call it on the object itself.

In [50]:
my_array.mean()

3.5

Both these pieces of code have the exact same outcome.

Because these functions operate differently to regular functions, they get their own name: they are called *methods*.
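To see that a method really is a function with the object passed in automatically, the call can be written both ways. This is a small sketch on the `sentence` string from above; `from_inside` and `from_outside` are illustrative names only.

```python
sentence = "This is a sentence"

# Calling the method from inside the object: the string is passed in automatically.
from_inside = sentence.upper()

# Reaching the same function through the str type and passing the string in ourselves.
from_outside = str.upper(sentence)

print(from_inside)
print(from_inside == from_outside)

# dir() lists the names available inside an object; which ones exist depends on the type.
print("upper" in dir(sentence))   # checks whether strings have upper()
print("upper" in dir([1, 2, 3]))  # checks whether lists have upper()
```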

## Defining our own Function

You define a function by using `def`. Remember: a function must have opening and closing round brackets after its name, because functions have the ability to take in arguments. For instance, `len(word)` calls a function named `len` that takes in a word and returns its length.

In [106]:
def greeting():
    print('Hello and Welcome!')

### Calling a Function

Defining a function does just that: it defines it. It doesn't execute it. It is only executed when you "call" it. 

In [108]:
greeting()

Hello and Welcome!


### Taking an Argument

Our `greeting()` function doesn't take anything in. It just greets you regardless. But some functions can take inputs. These are called arguments. If we wanted to get the length of our name, and used `len(name)`, the argument here would be the string called `name`.

In [121]:
def says_sum(a, b):  # Here we take in two numbers, represented by "a" and "b"
    print(a + b)

In [123]:
says_sum(10, 20)

30


Looking good! We have made a function that takes in two numbers and prints out the result. 

### Returning

But what if we want to use this result? What if I wanted to add `10` to the result?

In [126]:
says_sum(10, 20) + 10

30


TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

That didn't work at all! It says it can't add `NoneType` and `int`. That's because our function just *printed out* the number `30`, it didn't actually *return* it. We can check this by using the `type()` function.

In [127]:
type(says_sum(10, 20))

30


NoneType

It prints out the result, `30`, and tells us the type is `NoneType`.

So, if we want it to actually turn into a value, we will need to return it as something. This is done using `return`.

In [139]:
def returns_sum(a, b):  # Here we take in two numbers, represented by "a" and "b"
    answer = a + b      # We assign the sum of these two numbers to "answer"
    return answer       # And now we return the value stored in "answer"

In [129]:
returns_sum(10, 20)

30

Looks very similar to before, but with one crucial difference.

In [130]:
type(returns_sum(10, 20))

int

It actually *is* a number now. It is something we can use:

In [132]:
returns_sum(10, 20) * 2

60
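The contrast between printing and returning can be made explicit by storing what each call gives back. This is a minimal sketch that restates the two functions from above; the names `printed` and `returned` are only for illustration.

```python
def says_sum(a, b):
    print(a + b)        # shows the result on screen, hands nothing back

def returns_sum(a, b):
    answer = a + b
    return answer       # hands the result back to the caller

printed = says_sum(10, 20)      # the sum appears on screen, but the call evaluates to None
returned = returns_sum(10, 20)  # nothing appears on screen, the call evaluates to the sum

print(printed is None)
print(returned + 10)
```

A function with no `return` statement still returns something: the special value `None`, which is why the earlier addition failed.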

Just to note, you can return an expression directly if you want. In this case we could have saved ourselves some code:

In [133]:
def returns_sum(a, b):
    return a + b

You can also return more than one value. How about a function that squares two numbers, and returns both those squared numbers?

In [134]:
def squares_two(a, b):
    return a**2, b**2

In [135]:
squares_two(2, 3)

(4, 9)

Here it has returned us a tuple with the two squared numbers. You can always unpack those if you want, using our unpacking trick from before:

In [136]:
two_sqr, three_sqr = squares_two(2, 3)

In [137]:
two_sqr

4

In [138]:
three_sqr

9
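The same pattern works for any function that hands back several values at once. The sketch below uses a hypothetical helper, `min_and_max`, which is not part of the original material.

```python
def min_and_max(numbers):
    # Two values separated by a comma are packed into a single tuple.
    return min(numbers), max(numbers)

result = min_and_max([4, 9, 2, 7])
print(type(result).__name__)
print(result)

# Unpacking assigns each element of the tuple to its own name.
smallest, largest = min_and_max([4, 9, 2, 7])
print(smallest, largest)
```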

## Anonymous Functions

If we are just going to use our function once, it's hardly worth going to the bother of giving it a name. For these once-off functions, we can define and run them all in one line. These will be handy later, bear with me!

These types of functions are called `lambda` functions (the name comes from lambda calculus). A lambda simplifies the function definition into a single expression.

From:

In [None]:
def <function_name>(<arguments>):
    <body>
    return <returns>

To:

In [None]:
lambda <arguments> : <expression>

So instead of writing a function the way we did earlier:

In [56]:
def sums_two(a, b):
    total = a + b
    return total

We can simply write

In [55]:
lambda a, b: a + b

<function __main__.<lambda>(a, b)>

You will notice there is no function name here. Because of that, it just vanishes after we're finished with it. So how could we use this? Well, we need to call it immediately. One way we could do this is to pass the arguments in straight away.

In [58]:
(lambda a, b: a + b) (5, 10)

15
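As a quick check, the named and anonymous versions can be compared directly. This is only a sketch; `immediate` is an illustrative variable name.

```python
def sums_two(a, b):
    total = a + b
    return total

# The lambda form of the same function, called immediately with two arguments.
immediate = (lambda a, b: a + b)(5, 10)

# Both approaches give the same result.
print(sums_two(5, 10) == immediate)

# A def-function carries its name around; a lambda has only a placeholder.
print(sums_two.__name__)
print((lambda a, b: a + b).__name__)
```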

But this isn't the most popular use case. To show when they are most useful, let's take a look at a handy method associated with DataFrames.

## Applying Functions to DataFrames

In [65]:
import pandas as pd
import seaborn as sns

In [175]:
tips = sns.load_dataset('tips')
tips.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


Before we jump into applying a function to our data, we should first think how we would like it to be applied: cell-wise, column-wise, or row-wise. 

### Cell-wise

We can use `apply()` on a single column to apply a function to each cell in it. Here we will use NumPy's `sqrt` to get the square root of each value in the `tip` column.

In [198]:
tips['tip'].apply(np.sqrt)

0      1.004988
1      1.288410
2      1.870829
3      1.819341
4      1.900000
         ...   
239    2.433105
240    1.414214
241    1.414214
242    1.322876
243    1.732051
Name: tip, Length: 244, dtype: float64

### Column-wise

We will select the columns we want to operate over. In this case, we will select the `tip` and `total_bill` columns, and `np.sum` will add up each column from top to bottom.

In [78]:
tips[['tip', 'total_bill']].apply(np.sum)

tip            731.58
total_bill    4827.77
dtype: float64

### Row-wise

This is going to operate horizontally, across the columns passed in, once for each row.

In this example, we will pass in the `tip` and `total_bill` columns and the `np.sum` function, with `axis=1`. This will return a new Series that is the sum of these two columns for each row.

In [80]:
tips[['tip', 'total_bill']].apply(np.sum, axis=1)

0      18.00
1      12.00
2      24.51
3      26.99
4      28.20
       ...  
239    34.95
240    29.18
241    24.67
242    19.57
243    21.78
Length: 244, dtype: float64

We could even save this as a new column if we want.

In [195]:
tips['bill_with_tip'] = tips[['tip', 'total_bill']].apply(np.sum, axis=1)

Adding the two columns together directly gives the same result:

In [197]:
tips['tip'] + tips['total_bill']

0      18.00
1      12.00
2      24.51
3      26.99
4      28.20
       ...  
239    34.95
240    29.18
241    24.67
242    19.57
243    21.78
Length: 244, dtype: float64

In [87]:
tips.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,bill_with_tip
0,16.99,1.01,Female,No,Sun,Dinner,2,18.0
1,10.34,1.66,Male,No,Sun,Dinner,3,12.0
2,21.01,3.5,Male,No,Sun,Dinner,3,24.51
3,23.68,3.31,Male,No,Sun,Dinner,2,26.99
4,24.59,3.61,Female,No,Sun,Dinner,4,28.2
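The difference between column-wise and row-wise application can be pictured without pandas. The sketch below is a plain-Python analogy, not how pandas is implemented: it assumes the table is a list of dictionaries (`rows` is a hypothetical name) holding the first three rows shown above.

```python
# First three rows of the tips data shown above, as a plain list of dictionaries.
rows = [
    {"total_bill": 16.99, "tip": 1.01},
    {"total_bill": 10.34, "tip": 1.66},
    {"total_bill": 21.01, "tip": 3.50},
]

# Column-wise: the function is called once per column, on all of that column's values.
column_sums = {
    name: round(sum(row[name] for row in rows), 2)
    for name in ("tip", "total_bill")
}

# Row-wise: the function is called once per row, on all of that row's values.
row_sums = [round(sum(row.values()), 2) for row in rows]

print(column_sums)
print(row_sums)
```

Column-wise gives one answer per column, while row-wise gives one answer per row, which is why the row-wise result is the one that fits into a new column.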


### Applying Custom Functions

Instead of using these predefined functions, we can also pass our own functions in to operate on our DataFrame.

In [176]:
def bill_with_tip(df):
    # With axis=1, "df" is a single row; add the tip (not the party size) to the bill
    full_bill = df['total_bill'] + df['tip']
    return round(full_bill, 2)

In [177]:
tips.apply(bill_with_tip, axis=1)

0      18.00
1      12.00
2      24.51
3      26.99
4      28.20
       ...  
239    34.95
240    29.18
241    24.67
242    19.57
243    21.78
Length: 244, dtype: float64

Great! Now we can save this as a new column, `bill_with_tip`.

In [178]:
tips['bill_with_tip'] = tips.apply(bill_with_tip, axis=1)
tips

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,bill_with_tip
0,16.99,1.01,Female,No,Sun,Dinner,2,18.00
1,10.34,1.66,Male,No,Sun,Dinner,3,12.00
2,21.01,3.50,Male,No,Sun,Dinner,3,24.51
3,23.68,3.31,Male,No,Sun,Dinner,2,26.99
4,24.59,3.61,Female,No,Sun,Dinner,4,28.20
...,...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3,34.95
240,27.18,2.00,Female,Yes,Sat,Dinner,2,29.18
241,22.67,2.00,Male,Yes,Sat,Dinner,2,24.67
242,17.82,1.75,Male,No,Sat,Dinner,2,19.57


### Applying Anonymous Functions

This is a very useful ability we now have. However, it is a bit clunky to have to define our function beforehand and then pass it into `apply()`. Sometimes it would be handier to define and run it all in one go! Fortunately, this is exactly what the anonymous (lambda) functions we met earlier are for: they do not need to be defined beforehand, and they get their name from their lack of one.

I want to divide the bill by the number of people at the table. So it will be `tips['total_bill'] / tips['size']`. Remember, when defining a function we name the variable as we take it in. Here we will just call it `x` for simplicity; with `axis=1`, `x` will be one row at a time. I will also round it off to the nearest cent with `round()`. As a lambda function, this would look like:

In [179]:
lambda x: round(x['total_bill'] / x['size'],2)

<function __main__.<lambda>(x)>

Passing this into our `apply()` method looks like:

In [180]:
tips['bill_per_person'] = tips.apply(lambda x: round(x['total_bill'] / x['size'],2), axis=1)
tips.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,bill_with_tip,bill_per_person
0,16.99,1.01,Female,No,Sun,Dinner,2,18.0,8.49
1,10.34,1.66,Male,No,Sun,Dinner,3,12.0,3.45
2,21.01,3.5,Male,No,Sun,Dinner,3,24.51,7.0
3,23.68,3.31,Male,No,Sun,Dinner,2,26.99,11.84
4,24.59,3.61,Female,No,Sun,Dinner,4,28.2,6.15
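What `apply(..., axis=1)` does with a lambda can be imitated in plain Python: the lambda is called once per row. This is an analogy under stated assumptions rather than pandas itself; `rows` is a hypothetical list of dictionaries standing in for the first three rows of the DataFrame.

```python
rows = [
    {"total_bill": 16.99, "size": 2},
    {"total_bill": 10.34, "size": 3},
    {"total_bill": 21.01, "size": 3},
]

# The lambda passed to apply() above: x stands for one row at a time.
per_person = lambda x: round(x["total_bill"] / x["size"], 2)

# map() calls the lambda once for every row, much like apply(..., axis=1).
results = list(map(per_person, rows))
print(results)
```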


### Lambda Conditional

So now, how could we go about adding a conditional statement to our lambda function? This would take the form:

`lambda <argument>: <true return> if <truth statement> else <false return>`

That's hard to make sense of at the moment, so let's use it in an example. Say we want to make a new column called `party` that will say whether it is a large or small party. I'm going to say a dinner with 4 or more people is a large party.

In [192]:
lambda x: 'large' if x['size'] > 3 else 'small'

<function __main__.<lambda>(x)>

Let's use our `apply(axis=1)` method to run this over each row and save it to a new column called `party`.

In [193]:
tips['party'] = tips.apply(lambda x: 'large' if x['size'] > 3 else 'small', axis=1)
tips

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,bill_with_tip,bill_per_person,party
0,16.99,1.01,Female,No,Sun,Dinner,2,18.00,8.49,small
1,10.34,1.66,Male,No,Sun,Dinner,3,12.00,3.45,small
2,21.01,3.50,Male,No,Sun,Dinner,3,24.51,7.00,small
3,23.68,3.31,Male,No,Sun,Dinner,2,26.99,11.84,small
4,24.59,3.61,Female,No,Sun,Dinner,4,28.20,6.15,large
...,...,...,...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3,34.95,9.68,small
240,27.18,2.00,Female,Yes,Sat,Dinner,2,29.18,13.59,small
241,22.67,2.00,Male,Yes,Sat,Dinner,2,24.67,11.34,small
242,17.82,1.75,Male,No,Sat,Dinner,2,19.57,8.91,small


We can also try this with strings! Here we will add in an extra logical step `or` to determine if it is a weekend or a weekday.

In [194]:
tips['Weekday'] = tips.apply(lambda x: 'Weekend' if (x['day']=='Sun' or x['day']=='Sat') else 'Weekday', axis=1)
tips

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,bill_with_tip,bill_per_person,party,Weekday
0,16.99,1.01,Female,No,Sun,Dinner,2,18.00,8.49,small,Weekend
1,10.34,1.66,Male,No,Sun,Dinner,3,12.00,3.45,small,Weekend
2,21.01,3.50,Male,No,Sun,Dinner,3,24.51,7.00,small,Weekend
3,23.68,3.31,Male,No,Sun,Dinner,2,26.99,11.84,small,Weekend
4,24.59,3.61,Female,No,Sun,Dinner,4,28.20,6.15,large,Weekend
...,...,...,...,...,...,...,...,...,...,...,...
239,29.03,5.92,Male,No,Sat,Dinner,3,34.95,9.68,small,Weekend
240,27.18,2.00,Female,Yes,Sat,Dinner,2,29.18,13.59,small,Weekend
241,22.67,2.00,Male,Yes,Sat,Dinner,2,24.67,11.34,small,Weekend
242,17.82,1.75,Male,No,Sat,Dinner,2,19.57,8.91,small,Weekend
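The conditional lambdas above can also be tried on plain values, which makes the boundary cases easy to see. This is a small sketch; the names `party_size`, `day_type` and `day_type_in` are hypothetical, and the `in` version is an alternative spelling that the text above does not use.

```python
# Conditional lambda: <true return> if <truth statement> else <false return>
party_size = lambda size: "large" if size > 3 else "small"

# Two comparisons joined with "or" ...
day_type = lambda day: "Weekend" if (day == "Sun" or day == "Sat") else "Weekday"

# ... and an equivalent membership test using "in".
day_type_in = lambda day: "Weekend" if day in ("Sat", "Sun") else "Weekday"

for size in (2, 3, 4):
    print(size, party_size(size))

for day in ("Thur", "Fri", "Sat", "Sun"):
    print(day, day_type(day), day_type(day) == day_type_in(day))
```

Because the condition is `size > 3`, a table of exactly 3 counts as small and a table of 4 as large.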
