In [1]:
from __future__ import division, print_function

# Python functions 
________

Sometimes a portion of code is reused over and over throughout a script. To avoid repeating ourselves, we can define our own functions using the `def` keyword. When invoked, a function instructs the computer to carry out a fixed list of instructions, possibly returning an output at the end. Functions can also be defined and saved in another file (with extension `.py`) so that they can be reused in another project. 

You will have encountered many pre-defined functions in Python by now, such as `print` and `len`. There are many other built-in functions in Python which are extremely useful and make code more transparent and readable. Some, like `map`, are written and optimized for speed so that scripts run faster. 

We will also encounter what are known as anonymous functions, which are defined using the `lambda` keyword. These are short functions whose body is a single expression. While technically `def` can do everything `lambda` does, `lambda` allows for a less pedantic and more natural style of coding. You will encounter anonymous functions a lot when working with `pandas`, especially when cleaning up data frames. 

## Learning objectives 

The objectives of this unit are:

1. To use the `def` keyword to define functions. 

1. To recall the terms argument, keyword argument and function signature. 

1. To use `lambda` to define anonymous functions. 

1. To use the built-in functions `map`, `zip` and `enumerate`.  

## Let's build our own functions

All functions are defined using the `def` keyword. In general, the format of function definition looks like this: 

    def my_function_name(arg_1, arg_2, ..., arg_n) : 
    
        code
        
        return <something, or None> 
        
First, we tell Python that we are going to define a function by typing out `def`. Then we proceed by naming our function. The normal rules for naming variables apply to naming functions as well: a name cannot start with a number, cannot contain special characters (other than the underscore), and cannot be a word reserved by Python. 

After the name, we describe the *signature* of the function by writing down every argument to the function, separated by commas and enclosed in round brackets `( )`. An *argument* to a function is an input to the code which will be executed when the function is *called*. It is not mandatory that a function receive inputs. Sometimes a function just needs to run a set of instructions without any input. We end the `def` statement with a `:`. The line below it marks the beginning of the function code. 

Every line of code meant for the function *must* be indented. There are no enclosing `{ }` to mark the "body" of the function. In Python, the "body" of the function is denoted by indentation only. Thus, every line of the body must be indented relative to the `def` line, and lines belonging to the same block must share the same indentation level. Finally, at the end of the function we `return` an output or `None`. The `return` statement is not mandatory (a function without one returns `None`), but it is good practice to end a function definition with an explicit `return`.  
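
As a small illustration of the point about `return`, the sketch below compares a function that has no `return` statement with one that returns `None` explicitly. The names `no_return` and `explicit_return` are made up for this example.

```python
def no_return():
    x = 1 + 1  # does some work, but the body ends without a return statement


def explicit_return():
    x = 1 + 1
    return None  # states the intention openly


# Calling a function that never returns a value still hands back None
result = no_return()
print(result is None)

# Both styles produce the same result; the second is just more explicit
print(no_return() == explicit_return())
```

Both `print` calls should display `True`, since Python returns `None` whenever the body finishes without reaching a `return` statement.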

In [2]:
# Our first function

def my_first_function():
    pass

For our first function, we see above that `my_first_function` does not take in any input and does nothing. The `pass` keyword is a kind of temporary placeholder and basically does nothing. We use `pass` because one cannot leave a function "body" without any code at all. 

Now let's code something into `my_first_function` so that it does something useful. 

In [3]:
def my_first_function():
    print("Hello world!")

`my_first_function` will `print` the string `"Hello world!"` whenever it is *called*. Calling a function basically means instructing Python to run the code contained in the function. Notice that after defining a function and running the cell, there is no output. But that doesn't mean nothing has happened. In fact, Python has populated the global namespace with a new name, `my_first_function`, and is ready to do whatever has been coded into this function when it is called. 

In [4]:
my_first_function

<function __main__.my_first_function>

It is good to understand what happens when we type `my_first_function` and execute the cell. Notice that the output says `<function ...`. This means that the variable `my_first_function` refers to an object of type `function`. The rest of the output indicates that this function goes by the name `my_first_function` in the module `__main__`. We will not describe what modules are in this course, but it suffices for our purposes to think of `__main__` as a file containing all the functions that we define in this Jupyter Notebook session. 
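
Since a function name is just a variable referring to a function object, we can inspect that object or give it a second name. The following is a minimal sketch; `greet` and `say_hello` are hypothetical names chosen for this example only.

```python
def greet():
    return "Hello world!"


# The name refers to an object whose type is `function`
print(type(greet).__name__)

# The function object remembers the name it was defined with
print(greet.__name__)

# Without parentheses we copy the reference; nothing is called here
say_hello = greet
print(say_hello is greet)

# With parentheses the function body actually runs
print(say_hello())
```

The difference between `greet` (the object) and `greet()` (a call) is the same difference seen in the two notebook cells around this point.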

To actually execute the instructions in `my_first_function`, we must type `my_first_function()`. 

In [5]:
my_first_function()

Hello world!


## Functions with arguments

Functions would not be very useful if we were unable to pass input into them. Most of the time, the instructions act on the input we supply to the function and produce some output, which is then assigned to a variable to be stored. Let's modify `my_first_function` to print out a name supplied as input to it. 

In [6]:
def my_first_function(name):
    print("Hello %s" % (name))
    return None

my_first_function("Tang U-Liang")

Hello Tang U-Liang


In [15]:
# Passing two arguments

def special_product(x,y):
    prod = x-y+x*y
    return prod 

When defining functions with arguments, the same variable name used in the signature must be used in the body of the function. There is nothing inherently special about using `name` to represent the argument of `my_first_function`. After all, the computer doesn't "understand" that we intend to print out a name when calling `my_first_function`. However, we should use recognizable variable names to improve the readability of our code and to make our intentions transparent. 

The function `special_product` takes two arguments named `x` and `y`. Inside the function, it performs the operation and assigns the result to a variable named `prod`. Then the function uses the keyword `return` to send the answer out from the function environment to the global environment. 

In [16]:
answer = special_product(1,3)
print(answer)

1


What happened is that the function `special_product` performed the said operation on the inputs `1` and `3`. It then output the answer, in this case `1` (since 1 - 3 + 1*3 = 1). We assigned the output `1` to a variable named `answer` and printed it. 

Note that we do not need to explicitly declare a variable to "capture" the answer. The following works too. 

In [17]:
special_product(1,3)

1

Passing arguments in the correct sequence matters. Python assigns values to arguments in the order in which they were declared in the signature. 

In [20]:
# x = 1, and y =3
print(special_product(1,3))

# x =3 and y = 1
print(special_product(3,1))

1
5


What will happen if we try to display the variable `prod` directly? 

In [18]:
print(prod)

NameError: name 'prod' is not defined

### Function scope

But isn't the variable `prod` already defined when we defined the function `special_product`? The error occurs because the variable `prod` is only available in the *scope* of the function. The global environment is another scope. In general, variables from one scope are not accessible in another scope, with exceptions given by scoping rules (which are programming language dependent). For this course, it suffices to know that variables defined in the function scope will *NOT* be accessible from the global scope. 

(This can be overridden using the `global` keyword. But this is not encouraged.) 
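
To make the separation between the two scopes concrete, here is a minimal sketch. It repeats the two-argument `special_product` from earlier and additionally creates a *global* variable that happens to be called `prod` too; the string `"untouched"` is just an illustrative value.

```python
def special_product(x, y):
    prod = x - y + x * y  # this prod lives only inside the function call
    return prod


prod = "untouched"  # a separate, global variable that shares the name

answer = special_product(1, 3)

print(answer)  # the value returned from the function scope
print(prod)    # the global prod, unaffected by the call
```

The call returns `1`, while the global `prod` still holds `"untouched"`: assigning to `prod` inside the function created a local variable rather than modifying the global one.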

## Function defaults

When we call a function, every argument declared in its signature must be given a value. We cannot leave any out. 

In [21]:
special_product(1,)

TypeError: special_product() takes exactly 2 arguments (1 given)

Therefore, it becomes quite a hassle if we have to call the function in various places in our code with the same input for one of the arguments. To avoid this, we can assign default values to particular arguments in the following manner.

    def my_function(arg_1, arg_2 = default_value, ...):
    
        code
        
Note that arguments with default values must come after arguments without default values. Defaults do not lock you in: you can still override a default value whenever you need to. 

In [24]:
def special_product(x, y=1): # default value of y is 1
    return x-y+x*y

# We don't have to pass any value to arguments with default values
print(special_product(2))

# Default values can be overridden
print(special_product(2,9))

3
11


## Passing arguments to functions by keyword 

The arguments to a function have names, just as variables have names. The names of arguments to a function are called keywords. We can pass arguments to functions by assigning values explicitly to keywords like so:

    my_function(keyword_1 = value_1, keyword_2 = value_2,...)
    
This gives enormous flexibility when using Python interactively. Most functions in the `matplotlib` and `seaborn` libraries have many arguments, almost all of which have default values. However, we often use only a few of these keywords, and it is quite a pain to remember the exact sequence of arguments in the function signature. Passing arguments by keyword allows us to pass them in any order convenient to us. 

In [25]:
special_product(1,3) == special_product(3,1)

False

In [27]:
special_product(x=1, y=3) == special_product( y=3, x=1)

True
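
Positional and keyword arguments can also be mixed in one call, as long as the positional ones come first. The sketch below reuses the version of `special_product` with the default `y=1`; the particular input values are arbitrary.

```python
def special_product(x, y=1):
    return x - y + x * y


# Positional first, then keyword
print(special_product(2, y=9))

# Keywords only, in any order
print(special_product(y=9, x=2))

# Keywords combine naturally with defaults: y falls back to 1
print(special_product(x=2))

# Supplying the same argument twice (once by position, once by keyword) fails
try:
    special_product(2, x=5)
except TypeError as err:
    print("TypeError: {}".format(err))
```

The first two calls give `11` and the third gives `3`. The last call raises a `TypeError` because `x` receives a value both by position and by keyword; the exact wording of the message depends on the Python version.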

## An application: A function to implement the Euclidean algorithm

To end this section, below is code to find the greatest common divisor (gcd) of two positive integers. The function also prints out each stage of the algorithm as a line of 3 numbers: (m, n, remainder). 

In [34]:
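def gcd(m=1,n=1):
    """
    This function is an implementation of the Euclidean algorithm. 
    
    Args:
        m, n: positive integers (both default to 1)
    
    Returns:
        int, greatest common divisor of m and n 
    """
    
    # r holds [larger number, smaller number, remainder]
    if m < n:
        r = [n, m, n%m]
    else:
        r = [m, n, m%n]
        
    # The remainder strictly decreases, so at most r[1] steps are needed
    for _ in range(0, r[1]):
        
        print("{} {} {}".format(r[0], r[1], r[2]))
        
        # A zero remainder means r[1] divides r[0], so r[1] is the gcd
        if r[2] == 0:
            break
        else:
            r = [r[1], r[2], r[1]%r[2]]
        
    print("Greatest common divisor of {} and {} is {}".format(m, n, r[1]))
    return r[1]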

In [36]:
gcd(151,1223)

1223 151 15
151 15 1
15 1 0
Greatest common divisor of 151 and 1223 is 1


1
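
For comparison, the same algorithm can be written more compactly when the intermediate stages do not need to be printed. This is only a sketch of an alternative, not a replacement for the function above; the name `gcd_quiet` is made up here, and it assumes both inputs are positive integers.

```python
def gcd_quiet(m, n):
    """Return the greatest common divisor of two positive integers."""
    while n != 0:
        # Replace the pair (m, n) with (n, remainder of m divided by n)
        m, n = n, m % n
    return m


print(gcd_quiet(151, 1223))  # same inputs as the call above
print(gcd_quiet(48, 18))
```

The first call returns `1`, agreeing with the output of `gcd(151, 1223)`, and the second returns `6`. The `while` loop stops as soon as the remainder reaches zero, so no upper bound on the number of steps has to be supplied.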