## Avoiding Common Errors with Functions
### _or_ Understanding Local and Global Variables _and_ Therefore Not Getting This Wrong on the Test

---

Programming functions in Python can be a bit of a double-edged sword. On the one hand, it's pretty awesome when you're able to put together an effective and robust function and immediately put it to use. On the other hand, functions can be a bit of a nightmare to debug, since they're sometimes stingy about throwing errors. Part of the problem is Python's own occasional leniency when it comes to local and global variables: often, this makes us _think_ our function is doing just fine when in reality it needs substantially more tweaking before being appropriate for general use (and also appropriate as a test answer).

Let's distinguish local and global variables first. As mentioned in Lab 7, _global_ variables are variables that exist in your general Python workspace:

In [1]:
x = 5
print(x)

5


Here, `x` is a global variable, in that we can perform operations on it right in a regular code cell and whatnot. It's not defined in relation to anything else, like a loop or function. 

A _local_ variable, on the other hand, is only meant to be used inside a particular Python structure, such as a loop or a function.

In [2]:
for number in [1,2,3]:
    print(number)

1
2
3


`number` is an example of a local variable, in that it shouldn't exist outside of the loop. In this example, we're using it as a placeholder variable while looping through that list. 

However, globally defined variables can obviously be used within loops:

In [3]:
x = 5 # global

for number in [1,2,3]:
    print(x * number)

5
10
15


Here, `x` is globally defined, and yet we're multiplying it by a local variable on every pass through the loop. This is similar to when we define a list outside of a loop and then use `list.append` inside it - since the list is globally defined, we can still get at its newly appended contents after the loop is done.
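As a quick sketch of that `list.append` pattern (the list name `results` is just one I've picked for illustration), it might look something like this:

```python
x = 5          # global
results = []   # global list, defined before the loop

for number in [1, 2, 3]:
    # append each product to the globally defined list
    results.append(x * number)

# the list is global, so its contents are available after the loop
print(results)  # [5, 10, 15]
```

Note that we only use `results` after the loop, never `number`.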

Where things fall apart is Python's relative leniency with this sort of thing. For instance:

In [4]:
for number in [1,2,3]:
    print(number)
    
print("This shouldn't print below -", number) # shouldn't exist - it's not global

1
2
3
This shouldn't print below - 3


We might expect that last line to fail, and yet it prints. This is because the loop reassigns `number` on every pass, and when the loop ends, the last thing assigned to `number` was the number 3. Unlike a function, a `for` loop in Python doesn't get a scope of its own, so `number` simply stays defined once the loop is finished. Instead of saying 'hey, that's not right' when our code calls on this loop variable as if it were global, Python goes 'oh, I think I know what you mean' and meekly offers you a 3.

**This is bad.** We should never, ever try to work with local variables like this - it makes for error-prone, buggy code, especially when things get a little more complicated than looping through a list of three consecutive numbers. You want all your global variables to be very explicitly defined, and all your local variables hidden away in whatever structure (loop/function) they belong to.
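If you really do need something from inside a loop afterwards, one way to keep things explicit is to store it in a global variable that you define yourself before the loop starts. This is only a minimal sketch, and the name `last_seen` is made up for the example:

```python
numbers = [1, 2, 3]

last_seen = None  # explicitly defined global, set before the loop

for number in numbers:
    last_seen = number  # deliberately record the value we want to keep

# we use our explicit global here, not the loop variable
print("Last number seen:", last_seen)  # Last number seen: 3
```

A side benefit is that `last_seen` still exists (as `None`) even if `numbers` happens to be empty, whereas a loop variable would never get assigned in that case.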

That being said, if you think this is bad, Python's leniency is slightly different and yet somehow even worse when it comes to functions. First, have a look at this friendly and perfectly correct function:

In [5]:
def add_numbers(num1, num2):
    '''(int, int) -> int'''
    out = num1 + num2
    return out

add_numbers(2, 2)

4

In this function, we have a total of three local variables - two inputs (`num1, num2`) and an output variable (`out`). _None of these should exist outside of our function_. They are only defined _relative_ to said function. Python is slightly better about catching this than it is with loops:

In [6]:
add_numbers(2, 2)

print(num1)

NameError: name 'num1' is not defined

Since `num1` is a local variable, we can't print it like we would a global variable such as `x` above. However, we could do that in the body of the function, since the existence of `num1` makes sense to Python there:

In [7]:
def add_numbers_verbose(num1, num2):
    '''(int, int) -> int'''
    out = num1 + num2
    print(num1) # this is now happening locally
    return out

add_numbers_verbose(2,2)

2


4

Notice how `print(num1)` doesn't throw an error this time. This is because we're referring to it, a local variable, within the function in which it exists. 

So far, it seems like functions are comparatively robust. So what's the issue? Well, let's say we're having a bad day, and in a coffee-deprived haze, write a function like this:

In [8]:
def add_numbers(num1, num2):
    out = a + b
    return out

add_numbers(2,2)

NameError: name 'a' is not defined

What's going wrong here? Well, we're referring to two local variables, `a` and `b`, which don't exist. We haven't defined them relative to the function, and so Python is very confused. Still robust though - it caught the error, didn't it? 

The problem - and this is a big, big problem - arises when our mistyped local variables share names with actual global variables. Have a look:

In [9]:
a = 3
b = 5

def add_numbers(num1, num2):
    out = a + b
    return out

add_numbers(2,2)

8

So either we just broke all of math with 2 + 2 now equal to 8, or something's going wrong with our global/local variables here. This is the big issue with Python - if a function refers to a name that isn't defined locally, Python tries to be friendly (it just wants to be liked, after all) and looks for a _global_ variable with that name instead. It only throws a `NameError` if that search comes up empty too.

**This is really, really bad.** If you gave this code cell and its output to a kindergartener who was (bear with me) somehow a Python wiz but struggled a bit with arithmetic, they'd totally think 2 + 2 is indeed 8, and the kicker is _they'd be justified in doing so_ - Python isn't throwing any errors, and computers totally don't lie, remember?
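To convince yourself that the inputs really are being ignored, here's a small sketch (I've renamed the function to `add_numbers_buggy` so it doesn't overwrite the correct one) showing that the result follows the globals rather than the arguments:

```python
a = 3
b = 5

def add_numbers_buggy(num1, num2):
    out = a + b  # bug: uses the globals a and b, not num1 and num2
    return out

first = add_numbers_buggy(2, 2)
print(first)   # 8

a = 100        # change the global, keep the arguments the same
second = add_numbers_buggy(2, 2)
print(second)  # 105
```

Same inputs, different outputs - a pretty reliable sign that a function is leaning on something outside of itself.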

Now consider this in light of bioinformatics. Imagine you had a function that looked something like this:

    def get_sequence(SeqRecord):
        seq = seqrecord.seq
        return seq
        
Notice how our function's input is saved to a local variable called `SeqRecord`, but the body of the function works on something called `seqrecord`. Python names are case-sensitive, so these are two completely different variables. This would throw an error... unless in a previous question, we actually globally defined something as `seqrecord`. Then, no matter what input we give to this function, it will _always_ return the same sequence out of whatever was saved in `seqrecord`! All because Python wants to be a little too friendly at the expense of good coding practices.
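Here's a rough, runnable sketch of that scenario. To keep it self-contained, I'm using a tiny made-up class called `FakeRecord` as a stand-in for a real sequence record (it isn't part of any library, and all it has is a `.seq` attribute), and `get_sequence_fixed` is just my name for a corrected version:

```python
class FakeRecord:
    '''Hypothetical stand-in for a sequence record; it only has a .seq attribute.'''
    def __init__(self, seq):
        self.seq = seq

seqrecord = FakeRecord("ATGC")  # global left over from a 'previous question'

def get_sequence(SeqRecord):
    seq = seqrecord.seq  # bug: lowercase name is the global, not the input
    return seq

def get_sequence_fixed(seqrecord_in):
    seq = seqrecord_in.seq  # uses the function's own input
    return seq

other = FakeRecord("GGGTTT")

print(get_sequence(other))        # ATGC
print(get_sequence_fixed(other))  # GGGTTT
```

The buggy version hands back the leftover global's sequence even though we passed in `other`, and it does so without a single error.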

Be extremely careful with phantom bugs like this - they can completely sink a test answer (and perhaps later, your work on a research project) precisely because they're so silent and can sneak right by. A good way to make absolutely sure that this isn't going on is to look at the local variables you're defining as inputs (i.e. `num1` and `num2` in `add_numbers` above) and **make sure your function makes use of _all_ your input local variables**. Otherwise, they shouldn't be there - and given that for the purposes of this course, we give you the local variables for a function as part of the question prompt, they very much should be there as far as the test is concerned. If they aren't in the body of your function, you're probably missing something that you should be accounting for.
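On top of reading through the function body, a cheap extra check is to call the function with a few different inputs whose answers you already know. This is only a sketch of the idea (it won't catch everything), but a function that ignores its inputs will usually trip at least one of these:

```python
def add_numbers(num1, num2):
    '''(int, int) -> int'''
    out = num1 + num2
    return out

# each assert raises an AssertionError if the result is wrong
assert add_numbers(2, 2) == 4
assert add_numbers(10, -3) == 7
assert add_numbers(0, 0) == 0

print("all checks passed")
```

If the body had accidentally used global variables instead of `num1` and `num2`, it would return the same value for all three calls, so at least two of those checks would fail.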

Of course, it's possible to define new local variables within functions as well, like I did with `out` above. These are also fine, but just make absolutely sure they never clash with your global variables. Ideally, a function should stand on its own in your code, using only local variables as much as humanly possible. This is the best way to make sure your functions approximate the 'general blueprints' of whatever tasks they're supposed to complete, which is largely the idea behind writing functions in the first place.
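As one possible illustration of a function standing on its own, here's a sketch where a global value gets passed in as an input instead of being used directly inside the function. The names `scale_number`, `number` and `factor` are made up for this example:

```python
def scale_number(number, factor):
    '''(int, int) -> int'''
    scaled = number * factor  # only local variables are used here
    return scaled

x = 5  # global

# the global is handed to the function explicitly as an input
print(scale_number(3, x))  # 15
```

Since everything the function needs comes in through its inputs, renaming or changing `x` elsewhere can't silently change what `scale_number` does.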

Hope this helps and best of luck on the test!

Ahmed