# Functions and Control Statements
We have used many functions to do quick calculations. This week, we are going to learn how to define, or create, our own functions.

In [None]:
from datascience import *
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

As you know, a function is a block of code that runs when it is called, usually with one or more arguments. The function reads the argument(s), does something with them, and `return`s an object that can be used later.

In [None]:
np.add(4,6)

So far in this course, you've seen some custom functions, or user-defined functions, such as `plot_vowel_space()` and `edit_distance()`. A user-defined function typically has the following parts:

- `def` ("define" the function)
- a name (this will turn blue in your cell)
- argument(s), which go inside of the parentheses
- a colon (just like in a for loop!); everything inside the body of the function goes after the colon and is indented
- a docstring: text inside triple quotation marks (usually `'''`) that describes your function
- `return` (what your function returns that can be used outside the body of the function)

Here's an example:

In [None]:
def multiply_by_two(x):
    '''Multiplies x by 2'''
    return x * 2

In [None]:
multiply_by_two(5)

The code that defines custom functions will run even if your function doesn't do what you want it to:

In [None]:
def multiply_by_oops(x):
    '''Multiplies x by 2, but won't work correctly'''
    x * 2

In [None]:
multiply_by_oops(5)

The function never returned anything, so there is no output!
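To see what is going on, it may help to capture the result in a variable. A function that has no `return` hands back Python's special value `None`. The sketch below uses a made-up function name, `no_return`, purely for illustration:

```python
def no_return(x):
    '''Computes x * 2 but never returns it (hypothetical example).'''
    x * 2  # the product is calculated and then thrown away

result = no_return(5)
print(result)          # prints None
print(result is None)  # prints True
```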

But the function cannot be defined if your code contains a syntax error (here, `2x` is not valid Python):

In [None]:
def multiply_by_nope(x):
    '''Multiplies x by 2, but also won't work correctly'''
    return 2x

Your functions can also create new variables, which you might want to do for various reasons, such as keeping track of many different values.

In [None]:
def longer_multiply(x):
    '''Multiplies x by 2, returns y'''
    y = x * 2
    return y

In [None]:
longer_multiply(5)

Functions can, of course, have multiple arguments.

In [None]:
def add(x, y):
    '''Adds two numbers (exactly as np.add() does!)'''
    return x + y

In [None]:
add(10,20)

In [None]:
def formula(x, y, z):
    '''Multiplies the first two arguments and adds the third'''
    return (x * y) + z

In [None]:
print(formula(1, 2, 3))
print(formula(3, 2, 1))

In [None]:
def story(part1, part2, part3):
    '''Formats the three arguments into one string'''
    x = "{}, {}. {}."
    return x.format(part1, part2, part3)

In [None]:
story("Once upon a time", "they lived happily ever after","The end")

## Comparisons in functions
Comparisons evaluate to `bool` (Boolean) values: `True` or `False`. You can compare two items using:

- `>`: greater than
- `<`: less than
- `==`: equal to (note the two equal signs!)
- `!=`: not equal to
- `>=`: greater than or equal to
- `<=`: less than or equal to

Comparisons often occur inside the body of a function.

In [None]:
2 > 5

In [None]:
2 < 5

In [None]:
type(2 > 5)

In [None]:
3 == 3

In [None]:
3 != 3

In [None]:
name = 'Klaus'
name == 'Diego'
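As noted above, comparisons often occur inside the body of a function. Here is a minimal sketch of what that can look like; the function name `is_long_name` is an assumption made up for this example:

```python
def is_long_name(name):
    '''Returns True if name has more than 5 characters.'''
    return len(name) > 5

print(is_long_name('Klaus'))    # 5 characters, so prints False
print(is_long_name('Allison'))  # 7 characters, so prints True
```

Because the comparison itself evaluates to `True` or `False`, the function can return it directly.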

Comparisons can be combined using the `or` and `and` operators. With `or`, the result is `True` if at least one comparison is true; with `and`, both comparisons must be true.

In [None]:
name == 'Vanya' or name == 'Klaus'

In [None]:
name == 'Klaus' or name == 'Vanya'

In [None]:
(name == 'Klaus') or (name == 'Vanya')

In [None]:
y = 7
y > 5

In [None]:
y > 5 and y < 9

In [None]:
z = 4

In [None]:
z > 5 and z < 9
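Combined comparisons can also be wrapped in functions. The following is only a sketch, and both function names are hypothetical; it shows `and` requiring both conditions and `or` requiring at least one:

```python
def between_5_and_9(n):
    '''Returns True if n is greater than 5 and less than 9.'''
    return n > 5 and n < 9

def outside_5_and_9(n):
    '''Returns True if n is 5 or less, or 9 or more.'''
    return n <= 5 or n >= 9

print(between_5_and_9(7))  # both conditions hold: True
print(between_5_and_9(4))  # first condition fails: False
print(outside_5_and_9(4))  # first condition holds: True
print(outside_5_and_9(7))  # neither condition holds: False
```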

## Comparisons within arrays
You can also apply these comparisons to arrays. The comparison is made item by item and returns an array of Booleans (True/False).

In [None]:
tua = make_array('Luther', 'Diego', 'Allison', 'Klaus', 'Number Five', 'Ben', 'Vanya')
tua

In [None]:
tua == 'Luther'

There are two ways of finding the number of items in an array that match a certain value. The first is the function `sum()`. The second is the function `np.count_nonzero()`, which you saw in last week's homework. Both of these functions take a Boolean array as an argument (e.g., `tua == 'Luther'`) and return an integer: a count of the number of `True`s.

In [None]:
sum(tua == 'Luther')

In [None]:
np.count_nonzero(tua == 'Luther')

You can also assign the Boolean array to another variable. Remember the crucial difference between `=` and `==` here!

In [None]:
n = tua == 'Luther'
n
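The same distinction between `=` and `==` holds for single values. In this small sketch (the variable name `is_klaus` is made up for illustration), the comparison on the right is evaluated first, and its Boolean result is then assigned to the variable on the left:

```python
name = 'Klaus'              # one equals sign: assignment
is_klaus = name == 'Klaus'  # two equals signs compare; the result is then assigned

print(is_klaus)        # prints True
print(type(is_klaus))  # prints <class 'bool'>
```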

Arrays also have a `.sum()` method, which gives the same count as the `sum()` function.

In [None]:
n.sum()

In [None]:
(tua == 'Luther').sum()

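But you cannot call `.sum()` on a single Boolean object. A plain `True` or `False` is not an array and has no `.sum()` method, so the last cell below raises an error.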

In [None]:
name

In [None]:
(name == 'Klaus').sum()
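The error can be inspected without stopping the notebook by wrapping the call in `try`/`except`. This is only an illustrative sketch; the variable name `match` is an assumption, and `int()` is shown as one way to turn a single Boolean into 1 or 0:

```python
name = 'Klaus'
match = name == 'Klaus'  # a single Python bool, not an array

try:
    match.sum()
except AttributeError as err:
    # a plain bool has no .sum() method
    print('Error:', err)

# A single Boolean can still be converted to 1 (True) or 0 (False)
print(int(match))  # prints 1
```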

## Conditional statements
Conditional statements are used to control the flow of a computation. For example, suppose we want to run some kind of string splitting on the names in `tua`, but only if the names are of a certain length. You'll want to create a conditional statement: if the condition is met, perform the string splitting. If not, don't perform the string splitting.

A conditional statement is laid out much like a `for` loop. It requires the `if` keyword, the condition itself, a colon (`:`), and then code, indented on the next line, that runs only if the condition is met.

In [None]:
if True:
    print('Pogo')

In [None]:
x = 5
if x == 5:
    print('Pogo')

In [None]:
if x != 5:
    print('Pogo')

Conditional statements often pair an `if` statement with an `else` statement. Think of it as saying, "Otherwise..." Note that the `if` and `else` statements are on the same "level" (of indentation).

In [None]:
if x > 5:
    print('Hazel')
else:
    print('Cha-cha')

Now, let's put this inside of a function.

In [None]:
def h_or_c(x):
    '''Tells me who to root for.'''
    if x > 5:
        print('Hazel')
    else:
        print('Cha-cha')

In [None]:
#h_or_c(5)
h_or_c(10)

You can have multiple conditions using `if`, `elif`, and `else`. `elif` stands for "else if". Again, think of "else" as "Otherwise..."

In [None]:
def hcp(x):
    '''Tells me who to root for.'''
    if x < 5:
        print('Hazel')
    elif (x >= 5) and (x <= 10):
        print('Cha-cha')
    else:
        print('Pogo')

In [None]:
hcp(4)
hcp(5)
hcp(11)

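Remember that printing is not the same as returning. This version of `hcp()` prints a name but returns nothing, so the variable below ends up holding `None` and shows no output.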

In [None]:
character = hcp(4)

In [None]:
character

It is better to store the result in a variable inside your function and `return` it, so that the value can be used outside of the function. Remember to have your function `return` something if you want to see its output.

In [None]:
def hcp(x):
    '''Tells me who to root for.'''
    if x < 5:
        char = 'Hazel'
    elif (x >= 5) and (x <= 10):
        char = 'Cha-cha'
    else:
        char = 'Pogo'
    return char

In [None]:
z = hcp(4)

In [None]:
z
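Python checks the branches of an `if`/`elif`/`else` chain in order and runs only the first one whose condition is true. That means the `elif` in `hcp()` is reached only when `x < 5` is false, so the `x >= 5` part of its condition is already guaranteed. A shorter variant, sketched here under the hypothetical name `hcp_short`, gives the same answers:

```python
def hcp_short(x):
    '''Same choices as hcp(), with a simpler elif (hypothetical variant).'''
    if x < 5:
        return 'Hazel'
    elif x <= 10:  # only reached when x >= 5
        return 'Cha-cha'
    else:
        return 'Pogo'

for value in [4, 5, 10, 11]:
    print(value, hcp_short(value))
```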

Finally, back to our original intention of creating a function that splits the names in `tua`, but only if they are a certain length (which is a truly random thing to want to do, but... you never know!).

In [None]:
def split(s):
    '''Splits name s into its first letter and the rest.'''
    return s[:1], s[1:]

Let's test to see if the function works, first:

In [None]:
split('Hargreeves')

Now, only split **if** the name is longer than, say, 5 characters:

In [None]:
def split(s):
    '''Splits name s by the first letter if name is longer than 5 characters.
    If not, returns s.'''
    if len(s) > 5:
        return s[:1], s[1:]
    else:
        return s

In [None]:
print(split('Hargreeves'))
print(split('Pogo'))

Finally, use a `for` loop to run `split()` on each item in our array `tua`. At each iteration, we append the output to the array `splitnames`, which starts out empty.

In [None]:
splitnames = make_array()
for name in tua:
    sn = split(name)
    splitnames = np.append(splitnames, sn)
splitnames
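One thing to notice in the result: `np.append()` flattens each tuple, so the two pieces of a split name are stored as separate items. If the pieces should stay paired, one option is to collect the results in a plain Python list instead. This is a sketch under that assumption; `names` and `split_list` are names chosen for this illustration, and the list `names` stands in for the `tua` array:

```python
def split(s):
    '''Splits name s after the first letter if s is longer than 5 characters.
    If not, returns s.'''
    if len(s) > 5:
        return s[:1], s[1:]
    else:
        return s

names = ['Luther', 'Diego', 'Allison', 'Klaus', 'Number Five', 'Ben', 'Vanya']

split_list = []
for name in names:
    # each result is kept whole: a tuple for long names, a string for short ones
    split_list.append(split(name))

print(split_list)
```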

That's it! In the next lecture, we will go over some ways in which user-defined functions can be helpful in speeding up our data visualization and analysis. We will do so in the context of a simple study in **sociolinguistics** (my favorite!).