# Python Basics - bonus on functions

## What will we cover?

As we will see later in the course, functions are a crucial element of programming. They allow you to automate repeated tasks. Especially in the Data Understanding and Data Preparation stages, we will often need to transform our data and create features for our models. In this tutorial, we will see how to apply a function to different types of data and what happens when the function fails.

Let's start with a simple function that categorizes grades as passes and fails.

In [30]:
gradelist = [6, 7, 4, 7.6, 8, "NA"]

In [31]:
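def catgrades(grade):
    # Grades below 5.5 are a fail.
    if grade < 5.5:
        return "fail"
    # Grades from 5.5 up to and including 10 are a pass.
    if grade >= 5.5 and grade <= 10:
        return "pass"
    # Anything above 10 is outside the grading scale.
    if grade > 10:
        return "error"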

### Testing a function

Before we apply this function to the entire list (or a dataset), it is good to make sure it works as expected. We can do this by calling it with different arguments.

In [32]:
catgrades(4)

'fail'

In [33]:
catgrades(8)

'pass'

In [34]:
catgrades(11)

'error'
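
The function changes its answer at 5.5 and at 10, so those boundary values are worth checking as well. Below is a minimal sketch. It repeats the definition of `catgrades` so that it runs on its own, and it uses `assert` statements, which are an addition to this tutorial and not something the cells above rely on.

```python
def catgrades(grade):
    if grade < 5.5:
        return "fail"
    if grade >= 5.5 and grade <= 10:
        return "pass"
    if grade > 10:
        return "error"

# An assert passes silently when the condition holds
# and raises AssertionError when it does not.
assert catgrades(5.4) == "fail"
assert catgrades(5.5) == "pass"   # lower boundary of "pass"
assert catgrades(10) == "pass"    # upper boundary of "pass"
assert catgrades(10.1) == "error"
print("all checks passed")
```

If one of the checks failed, the cell would stop at that line, which points directly at the case that needs attention.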

### Applying the function to the data (list)

To apply the function to each grade in the list, we can use a `for` loop.

In [35]:
for grade in gradelist:
    print(catgrades(grade))

pass
pass
fail
pass
pass


TypeError: '<' not supported between instances of 'str' and 'float'

### Troubleshooting a function

Let's see what is going on. The error tells us that the comparison in the function's first condition cannot be applied to a string. Indeed, `'NA'` is included in our list, and it is neither an integer nor a float. Let's double-check by calling the function with a string.

In [36]:
catgrades('string')

TypeError: '<' not supported between instances of 'str' and 'float'
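
Another way to find the offending entry is to look at the type of every element in the list. The sketch below is one possible check, not part of the original cells. The name `non_numeric` is made up for this illustration.

```python
gradelist = [6, 7, 4, 7.6, 8, "NA"]

# Print each value next to the name of its type.
for grade in gradelist:
    print(repr(grade), type(grade).__name__)

# Keep only the entries that are neither int nor float.
non_numeric = [g for g in gradelist if not isinstance(g, (int, float))]
print(non_numeric)
```

The only entry left in `non_numeric` is the string `'NA'`, which matches what the error message suggested.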

There are many different ways to solve such an issue. One of them is to use `try`/`except`, which lets you test a block of code for errors and move on if the block raises one. We can build it into functions to avoid, for example, errors due to data type.
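
Before building it into a function, the behaviour can be sketched on a single comparison. The variable names `value` and `result` are chosen only for this illustration, and the sketch catches `TypeError` specifically, since that is the error we saw above.

```python
value = "NA"

try:
    # Comparing a string with a float raises TypeError in Python 3.
    result = value < 5.5
except TypeError as err:
    # Execution continues here instead of stopping with a traceback.
    result = None
    print("comparison failed:", err)

print(result)
```

The code in the `except` block only runs when the `try` block raises the named error. With a numeric `value`, the comparison would succeed and the `except` block would be skipped.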

In [37]:
def catgrades(grade):
    try:
        if grade < 5.5:
            return "fail"
        if grade >= 5.5 and grade <= 10:
            return "pass"
        if grade > 10:
            return "error"
    except TypeError:
        # Non-numeric input such as "NA" cannot be compared with a number.
        return "not categorized"

In [38]:
catgrades(4)

'fail'

In [39]:
catgrades('NA')

'not categorized'

In [40]:
for grade in gradelist:
    print(catgrades(grade))

pass
pass
fail
pass
pass
not categorized
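
Printing is useful for inspection, but when creating features we usually want to keep the results. A minimal sketch, assuming we want the categories in a new list: it repeats the data and the function so that it runs on its own, catches only `TypeError`, and uses `collections.Counter` from the standard library for a quick summary. The name `categories` is made up for this illustration.

```python
from collections import Counter

gradelist = [6, 7, 4, 7.6, 8, "NA"]

def catgrades(grade):
    try:
        if grade < 5.5:
            return "fail"
        if grade >= 5.5 and grade <= 10:
            return "pass"
        if grade > 10:
            return "error"
    except TypeError:
        # Non-numeric input such as "NA" ends up here.
        return "not categorized"

# Build a new list with one category per grade.
categories = [catgrades(grade) for grade in gradelist]
print(categories)

# Count how often each category occurs.
print(Counter(categories))
```

The list comprehension does the same work as the `for` loop above, but stores each result instead of printing it.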


This way, we can test our functions, discover potential errors, and account for them, for example by using `try`/`except` statements. We will see how useful this is once we start applying functions to large datasets.