## Why create functions?
**Reusability!** A function is written once and can then be called as many times as needed.

Some useful functions:

- Computing the GC percentage of a DNA sequence
- Checking whether a DNA sequence has an in-frame stop codon
- Reverse complementing a DNA sequence


## General function syntax

def function_name(input_arguments):
    "string documenting the function"
    function_code_block
    return output
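
As a concrete instance of this template, here is a minimal sketch. The function name `sequence_length` is a hypothetical example, not part of the original notes. It also shows that the documentation string is stored on the function itself and can be read back through `__doc__` (or with `help()`).

```python
def sequence_length(dna):
    "Return the number of bases in a DNA sequence."
    length = len(dna)  # function code block
    return length      # output

print(sequence_length('GATTACA'))
print(sequence_length.__doc__)  # the documenting string
```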
    

### Example 1: GC percentage

In [1]:
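def gc(dna):
    "This function computes the GC percentage of a DNA sequence, ignoring undetermined bases (N)."
    # Undetermined bases are excluded from the sequence length
    nbases = dna.count('n') + dna.count('N')
    gc_count = dna.count('c') + dna.count('C') + dna.count('g') + dna.count('G')
    gcpercent = gc_count * 100.0 / (len(dna) - nbases)
    return gcpercent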

In [2]:
gc('ATCAGCTACGACTAGCATCGACT')

47.82608695652174

#### You can also use the `input` function to collect a DNA sequence and assign it to the variable `dna`

In [3]:
dna=input('Enter DNA sequence: ')

Enter DNA sequence: ACTAGCATCGACTACG


In [4]:
gc(dna)

50.0
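
One thing to watch with user-supplied input: `gc` divides by the number of determined bases, so an empty sequence or one made only of `N` raises `ZeroDivisionError`. The following is a minimal sketch of one way to guard against that. The name `gc_safe` and the choice to return `None` for an undefined percentage are assumptions made for illustration, not part of the original function.

```python
def gc_safe(dna):
    "Return the GC percentage of dna, or None if it has no determined bases."
    dna = dna.upper()  # handle upper and lower case in one pass
    determined = len(dna) - dna.count('N')
    if determined == 0:
        # Assumption: report an undefined percentage as None
        # instead of raising ZeroDivisionError
        return None
    return (dna.count('G') + dna.count('C')) * 100.0 / determined

print(gc_safe('ACTAGCATCGACTACG'))
print(gc_safe('NNNN'))
```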

### Example 2: In-frame stop codon

Problem: Write a function that checks whether a given DNA sequence contains an in-frame stop codon.

In [5]:
dna=input('Enter DNA sequence: ')

Enter DNA sequence: ATCAGCTACGATC


In [6]:
def has_stop_codon(dna):
    "This function checks whether the given DNA sequence has an in-frame stop codon."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(0,len(dna),3):
        codon=dna[i:i+3].lower()
        if codon in stop_codons:
            stop_codon_found=True
            break
    return stop_codon_found


In [7]:
has_stop_codon(dna)

False
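
To see what "in-frame" means in practice, it can help to compare two short sequences that both contain `TGA`, only one of which has it aligned to a codon boundary. The sketch below restates `has_stop_codon` compactly so that it runs on its own. The two test sequences are made up for illustration.

```python
def has_stop_codon(dna):
    "Check whether dna has a stop codon when read in codons from position 0."
    stop_codons = ['tga', 'tag', 'taa']
    for i in range(0, len(dna), 3):
        if dna[i:i+3].lower() in stop_codons:
            return True
    return False

print(has_stop_codon('ATGTGA'))  # codons read: ATG, TGA
print(has_stop_codon('ATTGAC'))  # codons read: ATT, GAC (TGA starts at index 2, out of frame)
```

The second sequence contains the letters `TGA`, but they straddle two codons, so the function does not report a stop.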

### We can modify this function so that reading starts at any frame offset we want

In [8]:
def has_stop_codon_frame(dna, frame=0): # frame defaults to 0; pass a different value when calling the function
    "This function checks whether the given DNA sequence has a stop codon in the given frame."
    stop_codon_found=False
    stop_codons=['tga', 'tag', 'taa']
    for i in range(frame,len(dna),3):
        codon=dna[i:i+3].lower()
        if codon in stop_codons:
            stop_codon_found=True
            break
    return stop_codon_found

In [9]:
has_stop_codon_frame(dna) #frame default is 0

False

In [10]:
has_stop_codon_frame(dna, 0)

False

In [11]:
has_stop_codon_frame(dna, 1)

False
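
Because `frame` is an ordinary argument, the three forward reading frames can be checked in a loop. The following is a minimal sketch: the helper name `frames_with_stop` is hypothetical, and `has_stop_codon_frame` is restated compactly so that the block runs on its own.

```python
def has_stop_codon_frame(dna, frame=0):
    "Check whether dna has a stop codon when read from offset frame."
    stop_codons = ['tga', 'tag', 'taa']
    for i in range(frame, len(dna), 3):
        if dna[i:i+3].lower() in stop_codons:
            return True
    return False

def frames_with_stop(dna):
    "Return the frame offsets (0, 1, 2) in which dna contains a stop codon."
    return [frame for frame in range(3) if has_stop_codon_frame(dna, frame)]

print(frames_with_stop('ATTGAC'))         # TGA starts at index 2
print(frames_with_stop('ATCAGCTACGATC'))  # the sequence entered above
```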

### Example 3: Reverse complement

In [29]:
dna=input('Enter DNA sequence: ')

Enter DNA sequence: GATTACA


In [30]:
dna[::-1] # a slice with step -1 returns the entire string reversed

'ACATTAG'

In [31]:
l = list(dna) #strings can be converted to a list
l

['G', 'A', 'T', 'T', 'A', 'C', 'A']

In [47]:
d = '-'.join(l) # joins the list elements into one string, with '-' between them
d

'G-A-T-T-A-C-A'

In [48]:
x = ''.join(l) # joining with an empty separator recovers the original string
x

'GATTACA'

In [49]:
def reverse_string(seq):
    return seq[::-1]
    

In [50]:
def complement(dna):
    "Return the complementary sequence string."
    basecomplement = {'A':'T', 'C':'G', 'G':'C', 'T':'A', 'N': 'N', 'a':'t', 'c':'g', 'g':'c', 't':'a', 'n':'n'}
    letters = list(dna) # converts the string to a list of single characters
    letters = [basecomplement[base] for base in letters] # replaces each base in the list with its complement
    return ''.join(letters) # joins the list of letters back into a single string

In [51]:
def reversecomplement(seq):
    "Return the reverse complement of the dna string"
    seq = reverse_string(seq)
    seq = complement(seq)
    return seq

In [52]:
reversecomplement(dna)

'TGTAATC'
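
As an alternative to the dictionary and list comprehension, the same result can be obtained with the built-in `str.maketrans` and `str.translate`. This is a sketch of a different technique, not the approach used above. The names `COMPLEMENT_TABLE` and `reversecomplement_translate` are hypothetical.

```python
# Translation table mapping each base to its complement (upper and lower case)
COMPLEMENT_TABLE = str.maketrans('ACGTNacgtn', 'TGCANtgcan')

def reversecomplement_translate(seq):
    "Return the reverse complement of seq using str.translate."
    # Complement every base, then reverse the resulting string
    return seq.translate(COMPLEMENT_TABLE)[::-1]

print(reversecomplement_translate('GATTACA'))
```

One behavioral difference to keep in mind: characters that are not in the table pass through `translate` unchanged, whereas the dictionary lookup in `complement` raises a `KeyError` for them.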

### Variable number of function arguments

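General syntax of a function with a fixed number of arguments:

def myfunction(first, second, third):
    # do something with the 3 variables
    ...

To accept any number of extra arguments, prefix the last parameter name with `*`. The extra values are collected into a tuple: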

In [54]:
def newfunction(first, second, third, *therest):
    "Prints the three required arguments, then the tuple of any extra arguments."
    print('First:', first)
    print('Second:', second)
    print('Third:', third)
    print('And all the rest...', therest)
    return
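
To illustrate how a starred parameter behaves when the function is called, here is a minimal sketch. The functions `show_args` and `total_length` are hypothetical examples, not part of the original notes. The extra positional arguments arrive as a tuple, which is empty when none are passed, and a `*` at the call site unpacks a list into separate arguments.

```python
def show_args(first, *therest):
    "Print the required argument and the tuple of extra arguments."
    print('First:', first)
    print('The rest:', therest)

def total_length(*seqs):
    "Return the combined length of any number of sequences."
    return sum(len(s) for s in seqs)

show_args('GATTACA')                # therest is the empty tuple ()
show_args('GATTACA', 'ACGT', 'TT')  # therest is ('ACGT', 'TT')

seqs = ['GATTACA', 'ACGT']
print(total_length(*seqs))          # * unpacks the list into separate arguments
```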