# Functions


In [None]:
# run this cell to show a video, use slider to resize it, type Esc-o to hide it
from IPython.display import Video, clear_output; from ipywidgets import interactive, IntSlider
def _play(resize): display(Video(filename="media/ECS780P_Functions_topicSummary.mp4",data="",width=resize))
interactive(_play, resize=IntSlider(min=150, max=900, step=50, value=600, continuous_update=False, readout=False))

The scripts we have seen so far have mostly been very short. However, when programs grow longer (above, say, 100 lines of code), it becomes important to split them into separate units. This improves readability, makes the code easier to debug, and allows code to be reused within the same script or across different applications.

Functions are a standard mechanism provided by most programming languages to support modularisation of the code.

Functions normally take *arguments* (the values passed in for their parameters), and may perform some action, return some value, or both. We have already seen several examples:

In [None]:
# sum() is a built-in Python function that takes a list (or any other iterable) of numbers
# as an argument and returns the sum of its elements.
# Type help(sum) in a cell for more information about this function.
total = sum([1, 3, 5, 7])
print(total)

# print() takes a string (or any other object) and prints it. It does not return
# a useful value (in Python it returns None). Functions without return values are
# sometimes called procedures.
print("Hello")

## User-defined functions

We can define functions to perform tasks in our code. The way to do that is by using the **def** statement:

```
def functionName(argument1, argument2, ...):
    """ Optional description (DocString) """
    ...BLOCK of code...
    return DATA # optional return statement
```
    

Here is an example of a user-defined function:

In [None]:
# our own function
def mulTable(n):
    """ Print the multiplication table for number n """
    for i in range(1,11):
        print(n*i, end="\t") # end="\t" keeps the values on one line, separated by tabs
    print() # add newline at the end

# program control starts from here
mulTable(3) # we "call" the function
print("---")
mulTable(6) # ...and again

The above function does not return any value. We can modify it so that it returns the multiplication table instead of printing it:

In [None]:
# our own function
def mulTable(n):
    """ Compute the multiplication table for number 'n' and return it as a list. """
    resultsList = [n*i for i in range(1,11)]
    return resultsList

# program control starts from here
list1 = mulTable(3)
list2 = mulTable(6)
s = [(x+y) for (x,y) in zip(list1, list2)]
print("The sum of the two tables for 3 and 6 is:", s, "\n")
print("Reassuringly, this is the same as the multiplication table for number 9:", mulTable(9))

Note that when the interpreter reaches the ```def``` statement it defines the function, but does **not** run its body. The body runs only when the function is called. Note also that a function has to be defined *before* it is called, so we could not put the definition below the code that calls it.
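The ordering rule can be checked with a small sketch. The function `square` is a hypothetical name used only for this illustration, and the sketch assumes a fresh session (if `square` has already been defined by an earlier run, the first call succeeds instead of failing).

```python
# Calling a function before its def statement has been executed raises a NameError.
# 'square' is a hypothetical function, used only for this illustration.
try:
    print(square(4))      # the name 'square' does not exist yet
except NameError as err:
    print("Error:", err)

def square(x):
    """ Return x multiplied by itself. """
    return x * x

print(square(4))          # works now that the def statement has been executed
```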

Also note that the **docstring** comment (i.e. the text enclosed in **"""** - triple double-quotes) has a special role in the code; try running the cell below:

In [None]:
help(mulTable)
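The text shown by `help()` comes from the function's `__doc__` attribute, which can also be read directly. A minimal sketch, repeating the definition of `mulTable` from above so that the cell runs on its own:

```python
def mulTable(n):
    """ Compute the multiplication table for number 'n' and return it as a list. """
    return [n*i for i in range(1, 11)]

# the docstring is stored on the function object itself
print(mulTable.__doc__)
```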

Here is another example from the book "Python for Bioinformatics" by S. Bassi (section 6.1):

In [None]:
def protcharge(AAseq):
    """ Returns the net charge of a protein sequence. """
    protseq = AAseq.upper() # make sure it's uppercase
    charge = -0.002
    AACharge = {'C': -0.45, 'D': -.999, 'E': -.998, 'H': -0.91, 'K': 1,
                'R': 1, 'Y': -0.001}
    for aa in protseq:
        charge += AACharge.get(aa, 0) # residues not in the table contribute 0
    return charge

# using the function:
seq = "qtallvvlvllavalqateagpyga"
print(protcharge(seq))
print(protcharge("EEARGPL"))

A function can take more than one parameter, and return any type of result. Here is an example illustrating that:

In [None]:
def isMultiple(x, y):
    """ Returns True if 'x' is a multiple of 'y'; returns False otherwise. """
    if (x%y) == 0:
        return True
    else:
        return False
    
# let's test it
print(isMultiple(9, 3))
print(isMultiple(10, 6))
if isMultiple(5, 1):
    print("This is underwhelming...")
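Since the comparison `(x%y) == 0` already evaluates to `True` or `False`, the `if`/`else` is not strictly needed. A shorter sketch of the same function, offered as an alternative style rather than a correction:

```python
def isMultiple(x, y):
    """ Returns True if 'x' is a multiple of 'y'; returns False otherwise. """
    return (x % y) == 0  # the comparison itself is the boolean result

print(isMultiple(9, 3))
print(isMultiple(10, 6))
```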

In particular, we can use *tuples* or *lists* to have a function return more than one value. Here is an example of that, again from the book "Python for Bioinformatics" by S. Bassi (section 6.2):

In [None]:
def chargeandprop(AAseq):
    """ Returns the net charge of a protein sequence
    and the proportion of charged amino acids. """
    protseq = AAseq.upper() # make sure it's uppercase
    charge = -0.002
    count = 0
    AACharge = {'C': -0.45, 'D': -.999, 'E': -.998, 'H': -0.91, 'K': 1,
                'R': 1, 'Y': -0.001}
    for aa in protseq:
        charge += AACharge.get(aa, 0)
        if aa in AACharge:
            count += 1
    proportion = 100.0*count/len(AAseq) # as a percentage
    return (charge, proportion) # return a tuple

# using the function:
seq = "qtallvvlvllavalqateagpyga"
(c, p) = chargeandprop(seq) # storing the tuple returned by the chargeandprop() function
print("Charge    : ", c)
print("Proportion: ", p)
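The returned tuple does not have to be unpacked straight away; it can also be kept as a single object and indexed later. A small sketch of both options, using a hypothetical helper `minAndMax` that is not part of the book example:

```python
def minAndMax(values):
    """ Return the smallest and largest element of a non-empty list as a tuple. """
    return (min(values), max(values))

# option 1: keep the tuple as one object and index into it
result = minAndMax([4, 1, 7])
print(result[0], result[1])

# option 2: unpack the tuple into separate variables
smallest, largest = minAndMax([4, 1, 7])
print(smallest, largest)
```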

In [None]:
# run this cell to show a video, use slider to resize it, type Esc-o to hide it
from IPython.display import Video, clear_output; from ipywidgets import interactive, IntSlider
def _play(resize): display(Video(filename="media/ECS780P_Functions_UserDefined.mp4",data="",width=resize))
interactive(_play, resize=IntSlider(min=150, max=900, step=50, value=600, continuous_update=False, readout=False))

## Function scope

Be aware that a function defines a *scope* for variables. In general, this means that variables that you assign within a function are local to that function. This in turn means that you **cannot** access a variable that is local to a function from outside the function.

In [None]:
# mult=1 # try uncommenting this

def test(x):
    mult = 2*x
    return mult
    
print(test(3))
print(mult) # NameError (unless the first line is uncommented): mult is local to test()

A variable of the same name outside the function is shadowed by the variable assigned within the function, and is not affected by operations done within the function (try uncommenting the first line of the example above: the global **mult** keeps the value 1). Modifying a *global* variable from within a function is potentially messy, so by default Python treats any name that is assigned within a function as local. If you do want to access a global variable within a function *for writing*, you have to declare it explicitly using the ```global``` keyword:

In [None]:
PI = 3.14 # meant to be a constant
people = 0 # meant to be a global counter

def circperimeter(r):
    return 2*PI*r # this works (it's a read)

def greet():
    # global people
    print("Hello!")
    people += 1 # this does not work, because we're trying to modify its value;
                # need to declare variable as global before this statement
    
print(circperimeter(1))
greet()
greet()
print(people)
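As written, the cell above stops with an `UnboundLocalError` at the first call to `greet()`. For comparison, here is a minimal sketch of a working version with the `global` declaration in place. The function `shadow` is a hypothetical addition, included only to contrast an ordinary local assignment with a `global` one.

```python
people = 0 # meant to be a global counter

def greet():
    global people  # 'people' now refers to the global variable
    print("Hello!")
    people += 1    # updates the global counter

def shadow():
    people = 100   # no 'global' declaration: this creates a new local variable
    return people  # the global 'people' is left untouched

greet()
greet()
print(shadow())  # the local value
print(people)    # the global counter, increased once per call to greet()
```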

In [None]:
# run this cell to show a video, use slider to resize it, type Esc-o to hide it
from IPython.display import Video, clear_output; from ipywidgets import interactive, IntSlider
def _play(resize): display(Video(filename="media/ECS780P_Functions_VariableScope.mp4",data="",width=resize))
interactive(_play, resize=IntSlider(min=150, max=900, step=50, value=600, continuous_update=False, readout=False))

### Putting it all together ...

As a last example, let us try to 'package' the code below (which reads a protein from a FASTA file), into a function.

**Note**: You may remember this from the topic **Files**.

In [None]:
FASTA = open("P04637.fas", "r")
header = FASTA.readline()
protein = ""
for line in FASTA: # couldn't be easier!
    protein += line.rstrip()
FASTA.close()

# Done. This is just pretty-printing.
(code, name)= header.split('|')
print("Accession code:", code)
print("\nName:", name)
print("Protein:", protein)
print("\nNumber of residues:", len(protein))