# Writing functions and classes

This section will cover:
* A) Writing functions
* B) Writing Classes

# Functions

The scripts we have seen so far were very short. However, once a program grows longer (above, say, 100 lines of code), it becomes important to split it into separate units. This improves readability, makes the code easier to debug, and allows code to be reused within the same script or across different applications.

Functions allow for greater abstraction and reproducibility too.

# Definition of a Python function

A function in Python is defined using the keyword `def`, followed by a function name, a parameter list within parentheses `()`, and a colon `:`. The code that follows, indented by one additional level, is the function body. If the function returns a value, the body contains a `return` statement followed by that value; a function with no `return` statement returns `None`.

In [26]:
# Example Python function that returns the sign of a value

def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

# usage
for x in [-1, 0, 1]:
    print(sign(x))

negative
zero
positive


Optionally, but highly recommended, we can define a so-called "docstring", which is a description of the function's purpose and behavior. The docstring should follow directly after the function definition, before the code in the function body. You should write it using `"""Text"""`. See below:

In [27]:
# Example Python function that returns the sign of a value

def sign(x):
    """
    Function to check the sign of a value.
    
    Parameters
    ----------
    x: The value to check.
    
    Returns
    -------
    outcome: string indicating the sign of the value
    """
    
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

# usage
for x in [-1, 0, 1]:
    print(sign(x))

# Notice that the docstring improves the understanding of the function.

negative
zero
positive
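
A docstring is more than a comment: Python stores it on the function object, so it can be read back at any time. The short sketch below uses a compact one-line docstring for brevity and prints it through the `__doc__` attribute. In an interactive session, `help(sign)` shows the same text together with the function signature.

```python
def sign(x):
    """Return 'positive', 'negative' or 'zero' depending on the sign of x."""
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    return 'zero'

# The docstring is stored on the function object itself
print(sign.__doc__)
```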


# Default and keyword arguments

In a function definition, we can give default values to the arguments the function takes. Arguments with defaults are optional when the function is called, like this:

In [6]:
def hello(name, loud=False):
    if loud:
        print('HELLO {}'.format(name.upper()))
    else:
        print('Hello {}'.format(name))

# Usage
hello('Bob')  # Prints "Hello Bob"
hello('Fred', loud=True)  # Prints "HELLO FRED"

Hello Bob
HELLO FRED


If we don't provide a value for the `loud` argument when calling the function `hello`, it defaults to the value given in the function definition (`False`).

If we explicitly name the arguments in a function call, they do not need to come in the same order as in the function definition. These are called keyword arguments, and they are often very useful in functions with many optional arguments.

Consider the below function:

In [29]:
def myfunc(x, p=2, debug=False):
    """
    Function to raise a value to a power
    
    Parameters
    ----------
    x: The value to raise.
    p: Exponent to use.
    debug: Flag to print debugging information or not.
    
    Returns
    -------
    power: x raised to the power p.
    """

    if debug:
        print("Evaluating myfunc for x = " + str(x) + " using exponent p = " + str(p))
    return x**p

In [30]:
# We can pass the arguments in any order provided we name them.
myfunc(p=3, debug=True, x=7)

Evaluating myfunc for x = 7 using exponent p = 3


343
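
To see how defaults and keywords interact, here is a minimal sketch with a function that has the same signature as `myfunc`. The name `power` is only an illustration. It shows four common ways of calling such a function.

```python
def power(x, p=2, debug=False):
    """Raise x to the power p (same signature as myfunc above)."""
    if debug:
        print("x = {}, p = {}".format(x, p))
    return x ** p

print(power(5))              # p falls back to its default, 2
print(power(5, 3))           # positional: the second value is bound to p
print(power(5, debug=True))  # skip p and set only debug by keyword
print(power(p=3, x=2))       # keywords may come in any order
```

Positional arguments must always come before keyword arguments in a call.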

# Return multiple values

We can also return multiple values. You just list them separated by commas.
See the example below:

In [15]:
def powers(x):
    """
    Return a few powers of x.
    """
    return x ** 2, x ** 3, x ** 4

We use a function that returns multiple values like any other function, but we have an additional option for storing the outputs: unpacking them into separate variables.
Consider the examples below:

In [16]:
# return on screen
powers(3)

(9, 27, 81)

In [18]:
powered = powers(3)
powered

(9, 27, 81)

In [20]:
two, three, four = powers(3)
print(two, three, four)

9 27 81
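
Under the hood, the comma-separated values are packed into a single tuple, which explains the parentheses in the output above. The sketch below makes this visible and also shows how to keep only some of the returned values; using `_` for the unused one is a naming convention, not special syntax.

```python
def powers(x):
    """Return the square, cube and fourth power of x."""
    return x ** 2, x ** 3, x ** 4

result = powers(2)
print(type(result))   # the three values travel together as one tuple
print(result[1])      # individual items can be picked out by index

# Unpack only what is needed; "_" is a conventional name for unused values
square, _, fourth = powers(2)
print(square, fourth)
```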


# Function Scope

Be aware that a function defines a scope for variables. In general, this means that variables that you assign within a function are local to that function. You cannot access a variable that's local to a function from outside the function. Consider the code below:

In [24]:
y = 1

def test(x):
    y = 2 * x   # this y is local to test() and separate from the y above
    return y
    
print(test(3))
print(y)

6
1


A variable defined inside a function shadows any variable of the same name outside it: inside the function the name refers to the local variable, and the outer variable is not affected by operations done within the function.
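
The other half of the rule, that local variables cannot be reached from outside, can be checked with a small sketch. The names `double` and `doubled` are made up for this illustration.

```python
def double(x):
    doubled = 2 * x   # "doubled" exists only while the function runs
    return doubled

print(double(3))

try:
    print(doubled)    # the name is not defined outside the function
except NameError as err:
    message = str(err)
    print("NameError:", message)
```

The only way to get the value out of the function is through its return value.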

# More function examples

In [25]:
def protcharge(AAseq):
    """ Returns the net charge of a protein sequence """
    protseq = AAseq.upper()  # make sure it's uppercase
    charge = -0.002  # again, an "accumulator" variable
    AACharge = {'C': -0.45, 'D': -.999, 'E': -.998, 'H': -0.91, 'K': 1,
                'R': 1, 'Y': -0.001}
    for aa in protseq:
        charge += AACharge.get(aa, 0)
    return charge

# using the function:
seq="qtallvvlvllavalqateagpyga"
print(protcharge(seq))
print(protcharge("EEARGPL"))

-1.001
-0.998


In [31]:
def fib2(n=20):
    """Return the Fibonacci numbers smaller than n."""

    # Create an empty list to hold the output
    result = []

    # Define the initialisation variables
    a, b = 0, 1

    # Keep going while the current number is below n
    while a < n:
        # Add the current number of the sequence
        result.append(a)

        # Update the variables to the next number
        a, b = b, a+b

    # Return the results (note the indentation: outside the loop)
    return result

# Call the function for n=50
f50 = fib2(50)

# Show the output
f50

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

In [22]:
def check_status(prompt, tries=3, reminder="Please try again!"):
    """Ask the user for a status. Notice the two arguments with defaults."""
    
    # Loop until a return statement or an exception ends the function
    while True:
        # Get input from the user
        val = input(prompt)
        
        # Check if input is in the on status
        if val in ("o", "on", "live"):
            return True
        
        # Check if input is off status
        if val in ("of", "off", "dead"):
            return False
        tries = tries - 1
        
        # Check the tries
        if tries < 0:
            raise ValueError("invalid user response")
        print(reminder)

# Using the function
check_status("What is the status of the bulb?")

What is the status of the bulb?on


True

In [23]:
def find_dna_motif(dna, motif = "CCT", step = 1):
    """
    Function to find DNA motifs in a sequence

    Parameters
    ----------
    dna: The input DNA to search
    motif: The motif to find in the sequence
    step: The window steps to use for the search

    Returns
    -------
    count: The number of windows that match the motif
    """
    # All the text above is called a docstring; it gives clear details about the function
    
    # Get the length of the motif for window creation
    mlen = len(motif)
    
    # Calculate the total number of windows given the motif length
    numofSub = int(((len(dna) - mlen) / step ) + 1)
    
    # Set the counter
    count = 0
    
    # Create windows (sub-strings)
    for i in range(0, numofSub * step, step):
        nsub = dna[i:i + mlen]
        
        # Check for a match with the target motif
        if nsub == motif:
            # Update the count
            count += 1
            
    # Return the final count 
    return count

# Use the function
DNA = "CTGCCTCAGCCCTGCCTGTCTCCCAGATCACTGTCCTTCTGCCATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGAC"
find_dna_motif(DNA)

9
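
The window arithmetic in `find_dna_motif` is easier to follow on a tiny input. The sketch below uses a hypothetical helper, `windows`, which is not part of the function above. It applies the same formula for the number of windows and returns the sub-strings themselves for a six-letter sequence.

```python
def windows(dna, mlen, step=1):
    """Return every window of length mlen, moving step letters at a time."""
    # Same count as in find_dna_motif, written with integer division
    n_windows = (len(dna) - mlen) // step + 1
    return [dna[i:i + mlen] for i in range(0, n_windows * step, step)]

print(windows("CTGCCT", 3))          # 6 - 3 + 1 = 4 windows
print(windows("CTGCCT", 3, step=2))  # a larger step skips start positions
```

With `step=2` the window starting at position 3 (`CCT`) is never examined, so a larger step can miss matches.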

# Quiz
Try adding a `print` call to the code above to show the motifs that were actually found.

# OOP

The programming paradigm we have been working with so far is called procedural programming. Essentially, this views a program as composed of two entities:

* Data structures (variables, lists, etc) that hold the data.
* Functions that operate on the data

This is adequate for many purposes, and indeed you will find yourself writing procedural code in Python time and again. However, sometimes we would like the data to be more closely tied to the functions that handle them. For instance, we might want DNA sequence data to know that it should save itself to the disk in FASTA format, while RNA expression should write itself in txt format. 

In procedural style, we could write:
dna = ...  # sequence data
rna = ...  # expression data
writeFasta(dna, "DNA")
writeRNA(rna, "RNA")

However, it would be more convenient to write:
dna.write("DNA")
rna.write("RNA")

This approach is simpler because there is no need to remember two different function names. Object-oriented programming makes it possible: objects bundle data together with the methods that operate on it.
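
As a rough sketch of the idea, the two hypothetical classes below (`DNASequence` and `RNAExpression`, invented for this illustration) each provide their own `write` method. To keep the example self-contained, `write` returns the formatted text instead of saving it to disk, and the tab-separated layout for expression data is an assumption.

```python
class DNASequence:
    """Hypothetical container for a DNA sequence."""

    def __init__(self, sequence):
        self.sequence = sequence

    def write(self, name):
        # Format the sequence as a FASTA record: header line, then sequence
        return ">{}\n{}\n".format(name, self.sequence)


class RNAExpression:
    """Hypothetical container for per-gene expression values."""

    def __init__(self, values):
        self.values = values  # dict mapping gene name -> expression level

    def write(self, name):
        # Format as plain tab-separated text, one gene per line
        lines = ["# {}".format(name)]
        for gene, level in self.values.items():
            lines.append("{}\t{}".format(gene, level))
        return "\n".join(lines) + "\n"


dna = DNASequence("CTGCCTCAG")
rna = RNAExpression({"TP53": 12.5, "THBS1": 3.0})

# The same call works on both objects; each one knows its own format
for obj, label in [(dna, "DNA"), (rna, "RNA")]:
    print(obj.write(label))
```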

# Class

Classes are the key feature of object-oriented programming. A class is a structure for representing an object and the operations that can be performed on the object. In Python a class can contain attributes (variables) and methods (functions).

A class is defined almost like a function, but using the `class` keyword, and the class definition usually contains a number of method definitions (a method is a function defined in a class).
Each method should have `self` as its first argument. This argument is a reference to the object the method is called on.

Some class method names have special meaning, for example:
* `__init__`: The name of the method that is invoked when the object is first created.
* `__str__`: A method that is invoked when a simple string representation of the class is needed, as for example when printed.

There are many more, see https://docs.python.org/3/reference/datamodel.html#special-method-names
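
To illustrate how special methods are triggered, here is a small sketch with a hypothetical `DnaString` class. Besides `__init__` and `__str__`, it defines `__len__` and `__eq__`, two further special methods that Python calls for `len()` and `==`.

```python
class DnaString:
    """Hypothetical wrapper around a DNA string."""

    def __init__(self, bases):      # runs when DnaString(...) is called
        self.bases = bases

    def __str__(self):              # used by print() and str()
        return "DnaString of {} bases".format(len(self.bases))

    def __len__(self):              # used by len()
        return len(self.bases)

    def __eq__(self, other):        # used by the == operator
        return isinstance(other, DnaString) and self.bases == other.bases


s = DnaString("CCTGA")
print(s)
print(len(s))
print(s == DnaString("CCTGA"))
```

None of these methods is called by name; Python invokes them when the matching built-in function or operator is used.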

# Definition of a Python class

A class in Python is defined using the keyword `class`, followed by a class name, an optional list of parent classes within parentheses `()`, and a colon `:`. The code that follows, indented by one additional level, is the class body.

In [32]:
# The syntax for defining classes in Python is straightforward:

class Point:
    """
    Simple class for representing a point in a Cartesian coordinate system.
    """
    
    # Constructor
    def __init__(self, x, y):
        """
        Create a new Point at x, y.
        """
        self.x = x
        self.y = y
    
    # Instance method
    def translate(self, dx, dy):
        """
        Translate the point by dx and dy in the x and y direction.
        """
        self.x += dx
        self.y += dy
    
    # String method on the class
    def __str__(self):
        return("Point at {} {} ".format(self.x, self.y))

In [34]:
# To create a new instance of a class:
p1 = Point(0, 0)     # this will invoke the __init__ method in the Point class

print(p1)            # this will invoke the __str__ method

Point at 0 0 
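
The `translate` method is defined above but not yet used. The sketch below repeats the class without its docstrings so that it runs on its own, then moves one point while leaving a second instance untouched.

```python
class Point:
    """Same Point class as above, repeated so this cell runs on its own."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def translate(self, dx, dy):
        self.x += dx
        self.y += dy

    def __str__(self):
        return "Point at {} {} ".format(self.x, self.y)


p1 = Point(0, 0)
p2 = Point(1, 1)

p1.translate(0.25, 1.5)   # changes p1 in place and returns None

print(p1)
print(p2)                 # p2 is a separate instance and is unaffected
```

Each instance carries its own copy of the attributes `x` and `y`, which is why moving `p1` does not move `p2`.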


In [42]:
# Yet another class example

class Greeter(object):

    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print("HELLO, {} ".format(self.name.upper()))
        else:
            print("Hello, {} ".format(self.name))
                  
# Notice this is similar to the hello function from before.

In [43]:
g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred "
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED "

Hello, Fred 
HELLO, FRED 


# Cell class example with more features.

In [47]:
class Cell:
    """ A class representing a cell"""
    
    # Constructor 
    def __init__(self, name):
        """Create a cell with a name and empty organelle and gene lists"""
        self.name = name
        self.organelles = []    # creates a new empty list for each cell
        self.genes = []         # same as above
    
    # The string method
    def __str__(self):
        return ("A model of a cell, current instance is {} ".format(self.name))
    
    # Methods
    def add_organelle(self, organelle):
        self.organelles.append(organelle)
    
    def add_gene(self, gene):
        self.genes.append(gene)
        
    def count_organelle(self):
        outG = len(self.organelles)
        return outG
    
    def count_gene(self):
        outg = len(self.genes)
        return outg
    
    def organelle_gene_ratio(self):
        rati = len(self.organelles) / len(self.genes)
        return rati
        
    def view_cell(self):
        return self.organelles, self.genes

In [48]:
# Initialise a Cell instance and do things
stem_cell = Cell("SC1")
print(stem_cell)

# Add some organelles
stem_cell.add_organelle("golgi")
stem_cell.add_organelle("ribosome")
stem_cell.add_organelle("cytoplasm")
stem_cell.add_organelle("ER")

# Add some genes
stem_cell.add_gene("TP53")
stem_cell.add_gene("THBS1")

# Do some operations
stem_cell.view_cell()

A model of a cell, current instance is SC1 


(['golgi', 'ribosome', 'cytoplasm', 'ER'], ['TP53', 'THBS1'])

In [49]:
# Count the number of genes
stem_cell.count_gene()

2

In [50]:
# Get the organelle to gene ratio
stem_cell.organelle_gene_ratio()

2.0
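
Note that `organelle_gene_ratio` divides by `len(self.genes)`, so calling it on a cell with no genes raises a `ZeroDivisionError`. One possible guard is sketched below as a stand-alone helper; the name `safe_ratio` and the choice to return `None` are assumptions made for this illustration.

```python
def safe_ratio(n_organelles, n_genes):
    """Return n_organelles / n_genes, or None when there are no genes."""
    if n_genes == 0:
        return None
    return n_organelles / n_genes

print(safe_ratio(4, 2))
print(safe_ratio(4, 0))
```

The same check could be placed directly inside the method before the division.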

# Inheritance

A key feature of object-oriented programming is inheritance. Inheritance facilitates code reuse, and most libraries you will use sooner or later rely heavily on it. In essence, you can think of classes as forming a taxonomy, the root of which is (you guessed it) the type `object`. This type is rather plain; it is essentially a placeholder that every other class builds on.
Consider the code below, which extends the `Cell` class:

In [1]:
# We can extend the Cell class with new features 

class SP_Cell(Cell):
    """A class extending the cell class"""
    
    def __init__(self, name, age, sex):
        Cell.__init__(self, name) # Call __init__ for Cell
        self.age = age
        self.sex = sex
    
    def view_cell(self): # This will redefine the view_cell method entirely
        print("Age of cell is {} and sex is {}".format(self.age, self.sex))
        return

In [53]:
# Let's use the SP_Cell
IPS = SP_Cell("IPS1", 40, "male")
# Notice we can still add organelles and genes using the parent methods
IPS.add_organelle("golgi")
IPS.add_gene("THBS1")

# However, view_cell now works differently
IPS.view_cell()

Age of cell is 40 and sex is male


In [54]:
# Notice that print still uses the __str__ method inherited from Cell, with the current name
print(IPS)

A model of a cell, current instance is IPS1 
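
A subclass does not have to replace a parent method entirely; it can also build on it. The sketch below uses a cut-down `Cell` and a hypothetical subclass `AgedCell` to show `super()`, which is a common alternative to naming the parent class explicitly, along with two ways of inspecting the class hierarchy.

```python
class Cell:
    """Cut-down version of the Cell class above."""

    def __init__(self, name):
        self.name = name
        self.genes = []

    def add_gene(self, gene):
        self.genes.append(gene)

    def view_cell(self):
        return self.genes


class AgedCell(Cell):
    """Hypothetical subclass that also records an age."""

    def __init__(self, name, age):
        super().__init__(name)      # same effect as Cell.__init__(self, name)
        self.age = age

    def view_cell(self):
        # Extend, rather than replace, the parent's behaviour
        return super().view_cell(), self.age


c = AgedCell("IPS1", 40)
c.add_gene("THBS1")                 # inherited from Cell

print(c.view_cell())
print(isinstance(c, Cell))          # an AgedCell is also a Cell
print(AgedCell.__mro__)             # lookup order: AgedCell, Cell, object
```

The last line shows the taxonomy mentioned earlier: every class chain ends at `object`.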


# Self practice

You have now learnt how to write Python functions and classes.
Complete the tasks below to test your knowledge.

# 1

**FASTA processor**

The code below reads a protein from a FASTA file and prints several pieces of information about the protein.
Task: Convert the code into a Python function.

An example FASTA file (P89u.fas) is provided to help you test your code.

**[Hint: Review the function section]**

In [55]:
# 1

FASTA = open("P89u.fas", "r")
header = FASTA.readline()

protein = "" # build up the sequence here

for ll in FASTA:
    protein += ll.rstrip() # remove trailing '\n'

FASTA.close()

# Print the header
(code, name) = header.split('|')

print("Accession code:")
print(code)

print("\nName:")
print(name)

print("Protein:")
print(protein)

print("\nNumber of residues:")
print(len(protein))

Accession code:
>P04637

Name:
P53_HUMAN Cellular tumor antigen p53 - Homo sapiens (Human).

Protein:
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD

Number of residues:
393


# 2

Write a subclass called `mCell` that models a mouse cell and extends the `Cell` class defined in block 47. It should have self.cytoplasm_size and self.nuclie_size attributes. 

Code the following methods:
* .__init__(*****) constructor. Identify and include all the parameters, and make sure the `Cell` .__init__() is called.
* .count_gene() calculate the ratio of cytoplasm_size to nuclie_size

**[Hint: Review the Inheritance section]**

# 3

**Single cell counter class**

Write a class called `SCcounter` that represents a single cell counter. It should have a self.count attribute that starts from 0. It should also have a self.max attribute that represents a limit for the count, and a self.name attribute that holds a string name for the current instance.

Code the following methods:
* **.__init__(self, m, name)** constructor. Set self.max to m, self.name to name and self.count to 0
* **.increment(self)** increase the count if it is smaller than the max, otherwise complain
* **.decrement(self)** decrease the count if it is larger than zero, otherwise complain
* **.bulk(self, by)** increase the count by a fixed number if the max allows it, otherwise complain
* **.reset(self)** reset counter to 0
* **.view(self)** view the current max and count
* **.__str__(self)** return a string that describes the class and the name of the current instance (this message will be displayed when the object is printed)


Write a test program for your counter class. Allocate two objects of type SCcounter: one called blood with a maximum of 6 and one called kidney with a maximum of 4. 

Program a while loop to keep asking the user for input. 
Process input as following:
* "monocyte" increment blood counter
* "podocyte" increment kidney counter
* "endothelial" decrement blood counter
* "macula" decrement kidney counter
* "error" reset both counters
* "quit" exit
* Call the .view() method of the two counter objects after each iteration, and print each object as well (printing an object calls the __str__ method you coded and displays the string it returns).

**[Hint: Review the OOP section]**