# Python Functions Tutorial

Functions are important in all programming languages.  They let us write a block of code once and then run it whenever we need it, just by calling the function.  Here is an example that prints out a message:

In [1]:
def print_message():
    print('this is a function that prints out a long message')
    print('kind of silly, but maybe you are getting the point?')

To 'call' the function, we just type the function name followed by parentheses:

In [2]:
print_message()

this is a function that prints out a long message
kind of silly, but maybe you are getting the point?


Function definitions in Python start with 'def', then a space, then the name of the function, then parentheses, then a colon.  The code for the function comes indented below, just like we saw with the simple example:

In [3]:
def print_message():
    print('this is a function that prints out a long message')
    print('kind of silly, but maybe you are getting the point?')

# Functions are arbitrary
Functions can take any input and produce any output; what goes in and what comes out is entirely up to us.  We could have an image, numbers, a list, a string, etc. as an input.  Any of the same datatypes could be outputs as well.

The way we get an output from a function is with a 'return' statement.  The way we supply inputs is with 'arguments', which are words in the parentheses of a function.
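
As a small illustration of how flexible inputs and outputs can be, here is a sketch of a function that takes a list as input and returns a dictionary as output.  The name `summarize_numbers` and the dictionary keys are made up for this example.

```python
def summarize_numbers(numbers):
    # input: a list of numbers; output: a dictionary of summary values
    return {'count': len(numbers), 'total': sum(numbers)}

summary = summarize_numbers([1, 2, 3])
print(summary)
```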

Here is a function that takes two numbers and outputs another number (their sum plus 12).  Notice the 'return' statement, which is the word return, followed by what we want the function to give back to us.  The arguments are two numbers, which we are calling `a_number` and `second_number`.  We can supply each argument either by position or by keyword.  'Keyword' means we are giving the argument by name, like `a_number=30`.  'Position' means we are matching each value to its position in the list of arguments.

In [4]:
def add_twelve(a_number, second_number):
    return a_number + second_number + 12

In [5]:
# specify the arguments positionally
add_twelve(20, 10)

42

In [6]:
# supply as a keyword argument
add_twelve(a_number=30, second_number=10)

52

In [7]:
# mix keyword and positional arguments -- positional arguments always have to come first
add_twelve(30, second_number=10)

52
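
One more detail that may be worth seeing: because keyword arguments are matched by name, they can be given in any order.  The sketch below redefines `add_twelve` so it runs on its own.  Putting a positional argument after a keyword argument, as in `add_twelve(a_number=30, 10)`, is a `SyntaxError`.

```python
def add_twelve(a_number, second_number):
    return a_number + second_number + 12

# keyword arguments are matched by name, so the order does not matter
result = add_twelve(second_number=10, a_number=30)
print(result)  # 52
```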

# Default Values for arguments
In functions, we often want default values for arguments.  This means we will have arguments specified like `a_number=30` in our argument list.  Here is an example of using default values for arguments in a function.  The arguments with default values must come last in the list of arguments.

In [8]:
def add_twelve(a_number=30, second_number=10):
    return a_number + second_number + 12

In [9]:
# With the default values, we don't have to supply values for those arguments.
add_twelve()

52

In [10]:
# We can't do it this way.  The error message should be pretty self-explanatory,
# it's saying we must have our default arguments come last.
def add_twelve(a_number=30, second_number):
    return a_number + second_number + 12

SyntaxError: non-default argument follows default argument (<ipython-input-10-a79f4a18dd4c>, line 3)

In [11]:
def add_twelve(a_number, second_number=10):
    return a_number + second_number + 12

In [12]:
add_twelve(a_number=30)

52

In [13]:
# We can still provide an argument to supersede the default values
add_twelve(a_number=30, second_number=12)

54

In [14]:
# we can also supply the arguments positionally, of course
add_twelve(30, 12)

54
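
When every argument has a default, keywords also let us override just one of them and leave the rest alone.  This is a minimal sketch that reuses the two-default version of `add_twelve` from earlier in this section:

```python
def add_twelve(a_number=30, second_number=10):
    return a_number + second_number + 12

# override only the second default; a_number keeps its default of 30
result = add_twelve(second_number=0)
print(result)  # 42
```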

# List of strings as inputs
In the assignment, we want to take a list of strings as input.  This might look like this:

['heres the first doc', 'heres the second', 'and a third document']

Each string in the list is a separate 'document', such as a separate news story.

First let's print out the documents with a for loop:

In [15]:
def print_docs(docs):
    for d in docs:
        print(d)

In [16]:
print_docs(['heres the first doc', 'heres the second', 'and a third document'])

heres the first doc
heres the second
and a third document


# Boolean arguments
In the assignment, we want boolean arguments for our function.  A boolean is a variable type that can be `True` or `False`.  Here is an example of using a boolean argument for lowercasing.  I've given it the default value of `True`.

In [17]:
def clean_text(docs, lower=True):
    for d in docs:
        if lower:
            print(d.lower())
        else:
            print(d)

In [18]:
document_list = ['Heres the First Doc', 'Heres the second', 'and a third document']
clean_text(docs=document_list)

heres the first doc
heres the second
and a third document


In [19]:
document_list = ['Heres the First Doc', 'Heres the second', 'and a third document']
clean_text(docs=document_list, lower=False)

Heres the First Doc
Heres the second
and a third document


Now you know how to lowercase documents!  Let's take the list of documents as an input, and return the converted list as an output -- this is the first part of the assignment.

In [20]:
def clean_text(docs, lower=True):
    clean_docs = []
    for d in docs:
        if lower:
            clean_docs.append(d.lower())
    
    if lower:
        # overwrite our 'docs' variable
        docs = clean_docs
        
    return docs

In [21]:
document_list = ['Heres the First Doc', 'Heres the second', 'and a third document']
clean_documents = clean_text(docs=document_list)

In [22]:
clean_documents

['heres the first doc', 'heres the second', 'and a third document']
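
The same idea can be written more compactly with a list comprehension.  This is only an alternative sketch, not a required form for the assignment, and the name `clean_text_compact` is made up here.  Note that lowercasing builds a new list, so the original `document_list` is left unchanged.

```python
def clean_text_compact(docs, lower=True):
    # hypothetical compact alternative to clean_text
    if lower:
        # build a new list with each document lowercased
        return [d.lower() for d in docs]
    # nothing to change, so hand back the list we were given
    return docs

document_list = ['Heres the First Doc', 'Heres the second', 'and a third document']
clean_documents = clean_text_compact(docs=document_list)
print(clean_documents)
print(document_list)  # the original list still has its capital letters
```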

# Error catching
Sometimes it's nice to check if a variable is the 'type' we are expecting so the function will work.  We can check at the beginning of the function if a variable is the type we expect.  Here is an example of checking if a variable is a list of strings:

In [23]:
type(document_list)

list

In [24]:
# check if it's a list like this
type(document_list) is list

True

In [25]:
type(document_list[0])

str

In [26]:
type(document_list[0]) is str

True

In [27]:
def clean_text(docs, lower=True):
    if type(docs) is not list:
        print('the docs argument should be a list')
        return None
    for d in docs:
        if type(d) is not str:
            print('each document must be a string')
            return None
    
    clean_docs = []
    for d in docs:
        if lower:
            clean_docs.append(d.lower())
    
    if lower:
        # overwrite our 'docs' variable
        docs = clean_docs
        
    return docs

# How we just checked for the wrong data type
If the 'docs' variable is not a list, we print out a message and return nothing (`None`).  `None` in Python plays a role similar to NULL in R.  It is the built-in version of nothing.  So when we return it, we get `None` back.  This means if we assign the result to a variable, that variable will be `None`.  A notebook cell whose value is `None` displays nothing, which is why In [29] shows no output:

In [28]:
clean_docs = clean_text(docs=2)

the docs argument should be a list


In [29]:
clean_docs

In [30]:
clean_docs is None

True
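
Printing a message and returning `None` is one way to handle a wrong data type.  Another common approach, sketched here as an alternative rather than as the assignment's method, is to raise a `TypeError` and let the caller catch it with `try`/`except`.  The name `clean_text_strict` is made up for this example.  It uses the built-in `isinstance`, which, unlike `type(x) is list`, also accepts subclasses of the type.

```python
def clean_text_strict(docs, lower=True):
    # hypothetical variant: raise an error instead of printing and returning None
    if not isinstance(docs, list):
        raise TypeError('the docs argument should be a list')
    for d in docs:
        if not isinstance(d, str):
            raise TypeError('each document must be a string')
    if lower:
        return [d.lower() for d in docs]
    return docs

try:
    clean_text_strict(docs=2)
except TypeError as e:
    # the error carries the message we passed to TypeError
    message = str(e)
    print(message)
```

With this version a bad input cannot silently turn into `None`; the caller has to handle the error or the program stops with a traceback.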