# Python Tutorial - Part 2

---
<a id='translate_seq'></a>
# Example Program

# Translating a Nucleotide Sequence

---

A common task in bioinformatics is to use a scripting language, such as Python, to parse the output from an analysis program, such as BLAST, and produce a file of edited data.

Scripting languages can be used to rapidly write code to read files, manipulate the text and produce an output.

An example might be to read a fasta file of nucleotide sequence and produce the possible amino acid sequence. In other words, translate the sequence.

This will demonstrate the use of file handling, control structures, string (list) manipulation and dictionaries.

---

## Input Files

---

For the example we will have 2 input files.

The fasta sequence:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;>BF246290<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ggcgtcgtagtctcctgcagcgtctggggtttccgttgcagtcctcggaaccaggacctc<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ggcgtggcctagcgagttatggcgacgaaggccgtgtgcgtgctgaagggcgacgg<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cccagtgcagggctatcatcaattcgagcagaaggaaagtaatggcaccagtgaag<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;…<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cacagatggtgtgccgatgtgtctatggaacgattctgtgatctcactctcaggagacca<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tggcatcatgtggccgcacaactgtggtccatgaaaaagcaagatgactgtgggcca<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ggg</b><br>

NOTE: The sequence is split across several lines in the file, so the lines must be joined to obtain the complete sequence.

A file of codon translations:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ttt<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;F<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ttc<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;F<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tta<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;L<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ttg<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;L<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;etc</b><br>

---

## Storing the Sequence

---

The first step is to read the fasta sequence from file into a string:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with open('sequence.txt') as in_file:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Open the file<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq_list = in_file.readlines()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Read the file to a list<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq_name = seq_list.pop(0)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Remove the sequence name<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq = seq_list.pop(0)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Initialise the sequence<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq = seq.rstrip()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Remove the newline character<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq = seq.lower()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Lower case the sequence<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for line in seq_list:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#  Append the rest of the sequence to seq<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq += line.rstrip().lower()</b><br>

NOTE: Methods can be appended to each other - seq += line.rstrip().lower()
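The chaining described in the note can be tried without a file. The following is a minimal sketch, assuming a hypothetical list `fasta_lines` that stands in for the list returned by `readlines()`; the sequence lines are shortened for illustration:

```python
# Hypothetical stand-in for the list returned by in_file.readlines()
fasta_lines = [">BF246290\n", "GGCGTCGTAG\n", "tctcctgcag\n"]

seq_name = fasta_lines[0].rstrip()   # Header line without the newline

# rstrip() returns a new string, so lower() can be called on its result
seq = ""
for line in fasta_lines[1:]:
    seq += line.rstrip().lower()

print(seq_name)
print(seq)
```

Each method returns a new string, which is why the calls can be joined with dots and read from left to right.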

---

## Building the Dictionary

---

The next step is to build the dictionary using the codons file:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with open('codons.txt') as in_file:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Open the file<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons_list = in_file.readlines()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Read the file to a list<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons = {}&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Initialise the dictionary<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for count in range(0, len(codons_list), 2):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Work through the list<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key = codons_list[count].rstrip()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Get the codon<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key = key.lower()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value = codons_list[count+1].rstrip()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Get the aa, on next line<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value = value.lower()<br>   
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons[key] = value&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Add to dictionary</b><br><br>
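As an alternative to indexing with a counter, the alternating codon and amino acid lines can be paired with slicing and `zip`. This is only a sketch of one possible approach, assuming a hypothetical `codons_list` that holds the first few lines of the codons file:

```python
# Hypothetical stand-in for the lines of codons.txt
codons_list = ["ttt\n", "F\n", "ttc\n", "F\n", "tta\n", "L\n", "ttg\n", "L\n"]

# Strip the newline and lower-case every line once
cleaned = [line.rstrip().lower() for line in codons_list]

# Even positions hold codons, odd positions hold amino acids
codons = dict(zip(cleaned[0::2], cleaned[1::2]))

print(codons)
```

The result is the same kind of dictionary as the loop version builds, with lower-case keys and values.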

---

## Translating the Sequence

---

Now that we have the sequence and the codon translations, we can create the amino acid sequence:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for count in range(0, len(seq), 3):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Work through the list<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codon = seq[count:count+3]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Get 3 nucleotides<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;aa = codons[codon]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Get the associated aa<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print (aa, end=" ")<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# In 2.7 this would be:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# print aa,</b><br><br>
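The loop above looks up every codon directly, so a slice that is not a key in the dictionary (for example one containing an ambiguous base, or a final fragment shorter than 3 nucleotides) would raise a KeyError. A minimal sketch of one way to guard against this, using the dictionary `get` method with a made-up three-codon dictionary and test sequence:

```python
codons = {"atg": "m", "gcg": "a", "taa": "*"}   # Illustrative subset only
seq = "atggcgnnntaag"                            # Ends with a lone 'g'

protein = ""
for count in range(0, len(seq), 3):
    codon = seq[count:count+3]          # May be shorter than 3 at the end
    protein += codons.get(codon, "-")   # '-' when the codon is not a key

print(protein)
```

Here `get` returns the second argument whenever the key is missing, so unknown codons are shown as `-` rather than stopping the program.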

---

## Complete Program

---

Putting it all together, with a few short cuts:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with open('sequence.txt') as in_file:<br>                                 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq_list = in_file.readlines()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq_name = seq_list.pop(0)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq = seq_list.pop(0).rstrip().lower()<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for line in seq_list:<br> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;seq += line.rstrip().lower()<br><br>     
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;with open('codons.txt') as in_file:<br>     
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons_list = in_file.readlines()<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons = {}  # Initialise the dictionary<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for count in range(0, len(codons_list), 2):<br>                      &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codons[codons_list[count].rstrip().lower()] =  codons_list[count+1].rstrip().lower()<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for count in range(0, len(seq), 3):<br> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;codon = seq[count:count+3]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if codon in codons:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;aa = codons[codon]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;aa = '-'<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print (aa, end=" ")</b><br><br>

---

## Exercise 9

---

Write a program to test the sequence translation code. Modify the code so that it translates the sequence in 3 reading frames and prints them to a file. The sequences should be in fasta format with the frame number in the description line.

The fasta file (<b>sequence.txt</b>) and codons file (<b>codons.txt</b>) needed to test the script can be downloaded from the <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">Exercise Answers page</a>.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [16]:
with open('sequence.txt') as in_file:
    
    seq_list = in_file.readlines()  # Read the file to a list
    seq_name = seq_list.pop(0)  # Remove the sequence name
    seq = seq_list.pop(0)  # Initialise the sequence
    seq = seq.rstrip()  # Remove the newline character
    seq = seq.lower()  # Lower case the sequence

    for line in seq_list:  #  Append the rest of the sequence to seq
        seq += line.rstrip().lower()

with open('codons.txt') as codons_file:
    codons_list = codons_file.readlines()
    codons = {}  # Initialise the dictionary
    for count in range(0, len(codons_list), 2):  # Work through the list
        key = codons_list[count].rstrip()   # Get the codon
        key = key.lower()
        value = codons_list[count+1].rstrip()  # Get the aa, on next line
        value = value.lower()	
        codons[key] = value  # Add to dictionary

out_file = open("translated_seqs.fasta", "w")
for frame in range(0, 3):
    out_file.write("> Frame " + str(frame) + "\n")

    for count in range(frame, len(seq), 3):  # Work through the list
        codon = seq[count:count+3]   # Get 3 nucleotides
        if codon in codons:
            aa = codons[codon]   # Get the associated aa
        else:
            aa = '-'
        out_file.write(aa)
    out_file.write("\n") # Sequence is finished so print newline for next sequence

out_file.close()



---
<a id='modules'></a>

# Modules

---

Modules, or libraries, are common to most programming languages, including Perl, C++ and Java.

Modules provide code for particular tasks that can be included in your own code. They are essentially programs containing functions that can be called from your own program.

Python provides a large standard library of modules, and many third-party packages, such as Biopython and PyCogent, are also available. Modules are imported with the "import" command.

The most commonly used is probably the sys module which contains system-specific functionality. This is required for functions that are often included by default in other languages.
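As a short illustration of importing and using modules, the sketch below uses two standard library modules, `math` and `os.path`. The file path is a made-up example and does not need to exist:

```python
import math       # Names from the module are used with the math. prefix
import os.path    # Functions for working with file paths

print(math.sqrt(16))                            # Square root of 16
print(os.path.basename("/data/sequence.txt"))   # File name part of a path
```

After the import, every function in the module is reached through the module name followed by a dot.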

---
<a id='command_line_arg'></a>
# Command Line Arguments

---

Often when writing a Python script you want to include the option to read values in to the script (command line arguments).

In many languages this is a built in function but with Python you need to import the <b>sys</b> module.

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;import sys</b><br>

Arguments can then be included when you run the script. For example, if the script is called test.py:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;python test.py arg1 arg2 arg3</b><br>

To read them in to the script the <b>sys.argv</b> list is used. The first item, sys.argv[0], is the name of the script itself, so the arguments start at index 1:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;import sys<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a1 = sys.argv[1]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a2 = sys.argv[2]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a3 = sys.argv[3]</b><br>

Now the variable a1 has the value of arg1 etc.

It is possible to avoid having to write sys each time argv is used by explicitly importing that name.

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;from sys import argv</b><br>

Or to import all names from the module:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;from sys import *</b><br>

These options work for all modules and using either of the above the rest of the code is now simply:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a1 = argv[1]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a2 = argv[2]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a3 = argv[3]</b><br>
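If the script is run with too few arguments, indexing `argv` raises an IndexError. A possible way to check the number of arguments first is sketched below; the function name `read_args` is hypothetical, and the command line is simulated with a plain list so the example runs anywhere:

```python
import sys

def read_args(argv):
    """Return the 3 arguments after the script name, or None if any are missing."""
    if len(argv) < 4:            # argv[0] is the script name itself
        return None
    return argv[1], argv[2], argv[3]

# Simulated command line: python test.py arg1 arg2 arg3
print(read_args(["test.py", "arg1", "arg2", "arg3"]))

# Simulated command line with arguments missing
print(read_args(["test.py", "arg1"]))

# In a real script the call would be read_args(sys.argv)
```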

---

## Exercise 10

---

Write a script that reads two numbers as command line arguments, adds them together and prints the result.
 
(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [17]:
from sys import *

def add(num1, num2): 
	num3 = num1 + num2
	return num3 

n1 = int(argv[1])
n2 = int(argv[2])

total = add(n1, n2)
print (n1, "+", n2, "=", total)

# Note that input needs to be converted to an integer.
# Without this the 2 numbers would be treated as strings
# and just appended together 


ValueError: invalid literal for int() with base 10: '-f'

---
<a id='string_methods'></a>

# String Methods

---

Strings contain numerous built in methods enabling you to manipulate and interrogate them. Examples include:
<a id='string_search'></a>
## String Searches

---

<b>startswith(string)</b> - This will return True if the string starts with the argument string, for example:


In [None]:

protein = 'MEFTIKRDYFITQLNDT' 

# Does the protein start with methionine
if protein.startswith('M'): 
    print ('Yes, the protein starts with methionine')
           
# Will print “Yes, the protein starts with methionine”



A substring can also be searched for within a string: 
    

In [None]:

protein = 'MEFTIKRDYFITQLNDT'

# Does the protein contain DYF
if 'DYF' in protein: 
    print ('Yes, it contains the pattern DYF')
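Two related string methods, `endswith` and `find`, work in a similar way. A brief sketch using the same protein string:

```python
protein = 'MEFTIKRDYFITQLNDT'

print(protein.endswith('DT'))   # True if the string ends with 'DT'
print(protein.find('DYF'))      # Index of the first match, counting from 0
print(protein.find('WWW'))      # -1 when the substring is absent
```

Unlike the `in` test, `find` reports where the substring occurs, which is useful when the position matters.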



---
<a id='string_split'></a>
## Splitting Strings

---

Strings have an inbuilt split method that splits the string based on a delimiter and returns the split items in a list:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;str.split([delim[, maxsplit]])<br><br>

If no arguments are provided the string will be split on whitespace:

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;' 1  2   3  '.split()<br>        
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;returns ['1', '2', '3']</b><br><br>

With delimiter:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'1XX2XX3'.split('XX')<br><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;returns ['1', '2', '3']</b><br><br>


In [None]:

test_string =  '1 2 3'

split_list = test_string.split()

print (split_list)



Alternatively a delimiter can be used:


In [None]:

test_string =  '1XX2XX3'   

split_list = test_string.split('XX')

print (split_list)



If the optional <b>maxsplit</b> argument is provided it sets the maximum number of splits in the string:


In [None]:

test_string =  '1 2 3 4 5 6'

split_list = test_string.split(" ", 2)

print (split_list)
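The effect of `maxsplit`, and a difference between `split()` and `split(" ")` that is easy to miss, can be seen in this short sketch:

```python
test_string = '1 2 3 4 5 6'

print(test_string.split())         # ['1', '2', '3', '4', '5', '6']
print(test_string.split(" ", 2))   # ['1', '2', '3 4 5 6']

# An explicit " " delimiter keeps empty strings between repeated spaces
print('1  2'.split(" "))           # ['1', '', '2']
print('1  2'.split())              # ['1', '2']
```

With `maxsplit` set to 2 only the first two delimiters are used, so the remainder of the string is returned as the final item.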



---
<a id='string_other'></a>
## Other Methods

---

<b>lower() / upper()</b> - Returns a copy of the string converted to lower/upper case.

<b>capitalize()</b> - Returns a copy of the string with only its first character capitalized.

<b>replace(old, new[, count])</b> - Returns a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
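The case methods and the optional count argument of `replace` can be tried with a short sketch. The string `seq_id` is a made-up example value:

```python
seq_id = "bgh01634 Hydroxyisocaproate"

print(seq_id.upper())        # All characters in upper case
print(seq_id.lower())        # All characters in lower case
print(seq_id.capitalize())   # First character upper, the rest lower

# count limits the number of replacements, working from the left
print("vlspadktnv".replace("v", "y", 1))
```

Note that `capitalize` also lower-cases the rest of the string, so the capital H in the second word is lost.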

There are many more.


In [None]:
# Replace valine with tyrosine

protein = "vlspadktnv"
print(protein)

new_protein = protein.replace("v", "y")

print(new_protein)


In [None]:
# Replace more than one amino acid

protein = "vlspadktnv"
print(protein)

new_protein = protein.replace("pad", "tyc")

print(new_protein)

---

## Exercise 11

---

The file CAUH01000012.gff is a sequence annotation file in GFF format and is available on the <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">Exercise Answers</a> page.

a) Write a script that reads this file and prints out the lines that start with "CAUH01000012", the ID for the sequence.

b) Modify the script to print out the start and end position of the CDS regions.

The columns in the GFF file are Sequence ID, Source, Type, Start, End, Score, Strand, Phase and Attributes. The CDS is identified in the third column and start and end in the fourth and fifth columns. The columns are all separated by tabs, which are whitespace.
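The column layout described above can be explored on a single line before reading the whole file. This is a minimal sketch that splits one CDS line from the file on tab characters only; the variable names are chosen for illustration:

```python
# One line from the file, written with explicit tab characters
line = "CAUH01000012\tBluGen\tCDS\t4550\t4636\t.\t+\t0\tParent=BGHDH14_bgh01634m1\n"

columns = line.rstrip("\n").split("\t")   # Split on tabs only

seq_id, source, feature_type = columns[0], columns[1], columns[2]
start, end = int(columns[3]), int(columns[4])   # Positions as integers

print(feature_type, start, end)
print("Length:", end - start + 1)   # GFF coordinates are inclusive
```

Splitting on `"\t"` rather than on any whitespace keeps the Attributes column in one piece even when it contains spaces.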

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [23]:
with open('CAUH01000012.gff') as in_file:
    for line in in_file: 
        if line.startswith('CAUH01000012'):
            print (line) 
# 


CAUH01000012	BluGen	gene	4550	5817	.	+	.	ID=BGHDH14_bgh01634;Alias=bgh01634;Name=hydroxyisocaproate dehydrogenase;Ontology_term=GO:0008152,GO:0005488;inference=ab initio prediction:Eugene:3.5g:similar to RNA sequence, EST (same species):dbEST:GT062874.1	

CAUH01000012	BluGen	mRNA	4550	4636	.	+	.	ID=BGHDH14_bgh01634m1;Parent=BGHDH14_bgh01634;Name=BGHDH14_bgh01634m1	

CAUH01000012	BluGen	mRNA	4698	4796	.	+	.	ID=BGHDH14_bgh01634m2;Parent=BGHDH14_bgh01634;Name=BGHDH14_bgh01634m2	

CAUH01000012	BluGen	mRNA	4852	5565	.	+	.	ID=BGHDH14_bgh01634m3;Parent=BGHDH14_bgh01634;Name=BGHDH14_bgh01634m3	

CAUH01000012	BluGen	mRNA	5623	5817	.	+	.	ID=BGHDH14_bgh01634m4;Parent=BGHDH14_bgh01634;Name=BGHDH14_bgh01634m4	

CAUH01000012	BluGen	CDS	4550	4636	.	+	0	Parent=BGHDH14_bgh01634m1	

CAUH01000012	BluGen	CDS	4698	4796	.	+	0	Parent=BGHDH14_bgh01634m2	

CAUH01000012	BluGen	CDS	4852	5565	.	+	0	Parent=BGHDH14_bgh01634m3	

CAUH01000012	BluGen	CDS	5623	5715	.	+	0	Parent=BGHDH14_bgh01634m4	

CAUH01000012	BluGen	

In [24]:
with open('CAUH01000012.gff') as in_file:
    for line in in_file: 
        if line.startswith('CAUH01000012'):
            values = line.split()
            if values[2] == 'CDS':	
                print ("Start =", values[3], "and end =", values[4])



Start = 4550 and end = 4636
Start = 4698 and end = 4796
Start = 4852 and end = 5565
Start = 5623 and end = 5715
Start = 7647 and end = 7759
Start = 7854 and end = 8216
Start = 8267 and end = 8660
Start = 17433 and end = 17709
Start = 17243 and end = 17370


---
<a id='functions'></a>

# Functions

---

Often when writing a program there are times when you want to use the same piece of code multiple times. An example is a sequence translation program. What if you wanted to translate multiple sequences?

This is where functions are used (or subroutines or methods, depending on the language).

A function is a block of code that can be used multiple times to perform a particular piece of work.

A function is defined by the keyword def followed by the function name. Note the brackets following the name, as functions can optionally take arguments.

The structure of a function is like that of Python control structures: the definition line ends in a colon and the body of the function is identified by indented code:

<b>def hello():<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print ("Hello World!")</b><br><br>
    

A function has to be defined in the program before it can be called.

The function is called by simply using its name, and including arguments if required.

The code within the function is then executed:


In [None]:

def hello():

    print ("Hello World!")
       
hello()
hello()
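The requirement that a function is defined before it is called can be demonstrated with a small sketch. The names `greet` and `failed_early` are hypothetical, and the early call is wrapped in `try`/`except` only so that the script keeps running:

```python
failed_early = False
try:
    greet()                  # The def statement below has not run yet
except NameError as err:
    failed_early = True
    print("Error:", err)

def greet():
    return "Hello World!"

print(greet())               # Works now that the function is defined
```

Python runs a script from top to bottom, so the name only exists once the `def` statement has been executed.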



---
<a id='fun_pass_arg'></a>
## Passing Arguments

---

Any number of arguments can be passed to a function:


In [None]:

def hello(name1, name2):
    print ("Hello", name1, "and", name2)
       

hello("Lenka", "Ema")
                        


---
<a id='fun_ret_val'></a>
## Returning Values

---

Functions can also return data:


In [None]:

def add(num1, num2):
    num3 = num1 + num2
    return num3


num = add(3, 4)
print (num)



Multiple values can be returned and assigned to variables when the function is called:


In [None]:

def calc(num1, num2):
    a = num1 + num2
    b = num1 * num2
    c = num1 - num2

    return a,b,c

   
d, e, f = calc(10, 5)
print (d, e, f)



Although the previous version actually returns a tuple it is possible to explicitly return one, or a list:


In [None]:

def calcTuple(num1, num2):
    a = num1 + num2
    b = num1 * num2
    c = num1 - num2
    
    return (a,b,c)

def calcList(num1, num2):
    a = num1 + num2
    b = num1 * num2
    c = num1 - num2

    return [a,b,c] 

t = calcTuple(10, 5)

print ("A tuple of values is returned:")
print (t)

l = calcList(10, 5)

print ("A list of values is returned:")
print (l)



---

## Exercise 12

a) Write a script that includes a function that takes 2 numbers, multiplies them together and returns the result. The script should prompt the user for the numbers, pass them to the function and print out the result.

b) Create a script that tests the example functions above, which return multiple values, a list and a tuple.

Edit the script to demonstrate that it does actually return a list and a tuple.

HINT: You can do this by assigning the returned values to a single variable, which will be the tuple, and then printing the index positions.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [1]:
def multiply(num1, num2): 
	num3 = num1 * num2
	return num3 

n1 = int(input("Enter first number: "))
n2 = int(input("Enter second number: "))

total = multiply(n1, n2)
print (n1, "*", n2, "=", total)


# This code is obviously more verbose than it needs to be,
# for instance there is no need for the "total" variable
# and the print statement could be:
#
# print (n1, "*", n2, "=", multiply(n1, n2))




Enter first number: 1
Enter second number: 2
1 * 2 = 2


In [2]:

# Calc function

def calc(num1, num2):
    	a = num1 + num2
    	b = num1 * num2
    	c = num1 - num2

    	return a,b,c 
	
d, e, f = calc(10, 5)
print (d, e, f)


# Tuple function

def calcTuple(num1, num2):
    	a = num1 + num2
    	b = num1 * num2
    	c = num1 - num2

    	return (a,b,c) 
	
tup = calcTuple(10, 5)
print ("First value =", tup[0])
print ("Second value =", tup[1])
print ("Third value =", tup[2])


# List function

def calcList(num1, num2):
    	a = num1 + num2
    	b = num1 * num2
    	c = num1 - num2

    	return [a,b,c] 
	
lis = calcList(10, 5)
print ("First value =", lis[0])
print ("Second value =", lis[1])
print ("Third value =", lis[2])




15 50 5
First value = 15
Second value = 50
Third value = 5
First value = 15
Second value = 50
Third value = 5


---
<a id='fun_pass_other'></a>
## Passing Other Data as an Argument

---

Any data type, or object, can be passed to a Python function e.g. list, tuple, dictionary etc:


In [None]:

def calcSum(list1):
    total = 0
    for num in list1:
        total += num

    return total


sumlist = [1, 2, 3, 4, 5]
print (calcSum(sumlist))



---

## Exercise 13

---

Create a script that contains a function that takes a dictionary as an argument and tests it.

Edit the script further so that the function takes a list and dictionary as arguments and tests it. A suitable test would be for the list to contain the key values from the dictionary and then iterate the list in the function and print out the associated dictionary values.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [3]:
# Dictionary function
# Prints the keys and values of a dictionary

def showDictionary(dic):
	for k in dic.keys():
		print ("Key is ", k, "and value is", dic[k])

	
codons = {'ttt':'F', 'tta':'L', 'gga':'G'}
showDictionary(codons)

# Dictionary and list function
# Prints the keys and values of a dictionary using list of keys

def showDictionary(lis, dic):
	for k in lis:
		print ("Key is", k, "and value is", dic[k])


codon_list = ['ttt', 'tta', 'gga']
codons = {'ttt':'F', 'tta':'L', 'gga':'G'}
showDictionary(codon_list, codons)

Key is  ttt and value is F
Key is  tta and value is L
Key is  gga and value is G
Key is ttt and value is F
Key is tta and value is L
Key is gga and value is G



---
<a id='fun_pass_keyword'></a>
## Passing Function Arguments by Keywords


In the examples above, when multiple arguments are passed to a function, they are associated to parameters by their position. For example, with the following function definition:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;def calcValue(num1, num2, num3):</b><br>

If it is called by:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;calcValue(5, 12, 16)</b><br>

Then 5 would be assigned to num1, 12 to num2 and 16 to num3, as expected.

However, arguments can also be passed by keywords and are then assigned to parameters by explicitly naming them. A simple calculation function could be:


In [5]:
 
def calcValue(num1, num2, num3):
    total = num1 + num2 - num3
   
    return total


ans = calcValue(5, 10, 6)

print (ans)


9



Alternatively the arguments to the function can be explicitly named, and therefore in any order:
  

In [None]:

def calcValue(num1, num2, num3):
    total = num1 + num2 - num3
   
    return total

ans = calcValue(num1=5, num2=10, num3=6)
print (ans)

ans = calcValue(num3=6, num2=10, num1=5)
print (ans)

ans = calcValue(num2=10, num1=5, num3=6)
print (ans)

# etc ...



All versions will produce the same answer.

An advantage is that you do not have to know in what order parameters are declared in the function.
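Positional and keyword arguments can also be mixed in one call, provided the positional arguments come first. A brief sketch using the same function:

```python
def calcValue(num1, num2, num3):
    total = num1 + num2 - num3
    return total

# Positional argument first, then keywords for the remaining parameters
print(calcValue(5, num3=6, num2=10))

# A positional argument after a keyword argument is a SyntaxError:
# calcValue(num1=5, 10, 6)
```

The first value is matched to num1 by position and the other two by name, so the result is the same as in the calls above.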

---

## Exercise 14

---

Create a script that tests the various ways to pass arguments to a function.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [4]:
def calcValue(num1, num2, num3):
	total = num1 + num2 - num3
	return total

# Arguments passed by position
print (calcValue(5, 10, 6))

# Arguments passed by keyword, in the declared order
print (calcValue(num1=5, num2=10, num3=6))

# Arguments passed by keyword, in a different order
print (calcValue(num3=6, num2=10, num1=5))

# All three calls pass the same values to the same parameters,
# so they print the same result

9
9
9


---
<a id='fun_default_values'></a>

## Default Values of Parameters

---

It is possible to set default values of parameters in the function definition:
 
<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;def calcValue(num1, num2=20, num3=8):</b><br>

The parameters num2 and num3 now have default values so the function can be called with just a single value for num1, or with 2 values and num3 will be the default, or with 3 values and both defaults will be overwritten. Finally, one of the default values can be changed if required by specifying the parameter name:
 

In [None]:

def calcValue(num1, num2=20, num3=8):
    total = num1 + num2 - num3

    return total

# Function can be called with just a single value for num1
ans = calcValue(6)
print ("Version 1: ", ans)

# Called with 2 values and num3 will be the default
ans = calcValue(6, 10)
print ("Version 2: ", ans)

# Called  with 3 values and both defaults will be overwritten
ans = calcValue(6, 10, 5)
print ("Version 3: ", ans)

# Default value changed by specifying the defined name
ans = calcValue(6, num3=4)
print ("Version 4: ", ans)



NOTE: An example of the use of default values is with the Python open file command when opening a file for reading:

<b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f = open('test.txt', 'r')</b><br>

The 'r' is optional as it is the default value.
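The default read mode can be checked with a short sketch. It creates a throwaway file with the standard `tempfile` module so that it is self-contained; the file contents are made up:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("acgt\n")
    path = tmp.name

with open(path) as f:          # Mode omitted, so the default is used
    default_mode = f.mode
    text_default = f.read()

with open(path, "r") as f:     # Mode given explicitly
    text_explicit = f.read()

os.remove(path)                # Tidy up the temporary file

print(default_mode)
print(text_default == text_explicit)
```

Both calls open the file for reading as text, so the contents read back are identical.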

---
<a id='fun_var_params'></a>
## Variable Number of Parameters

---

A function can take additional optional arguments by prefixing the last parameter with *.

Optional arguments are then available in a tuple referenced by this parameter:


In [None]:

def calcValue(num1, num2, *morenums):
    total = num1 + num2
    for n in morenums:
        total += n

    return total

  
ans = calcValue(2, 4,  5, 12, 15)
print (ans)



In this case 2 would be assigned to num1, 4 to num2 and a tuple would be created from the rest (5, 12, 15).
 
If ** is used, any additional keyword arguments are collected into a dictionary.


In [None]:

def testFunc (val1, val2, **morevals):
    print ("Val1 is", val1)
    print ("Val2 is", val2)

    print ("The dictionary:")

    for k in morevals.keys():
        print ("Key is", k, "Value is", morevals[k])

testFunc("First", "Second", Third= 3, Fourth= 4, Fifth= 5)



NOTE: The keys in the dictionary section of the function call do not have quotes.
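Both forms can be combined in a single function, with the * parameter before the ** parameter. A minimal sketch, using a hypothetical function `describe`:

```python
def describe(first, *numbers, **labels):
    # numbers is a tuple of the extra positional arguments,
    # labels is a dictionary of the extra keyword arguments
    return first, numbers, labels

result = describe("seq", 1, 2, 3, strand="+", frame=0)
print(result)

# With no extras, the tuple and the dictionary are simply empty
print(describe("seq"))
```

Inside the function the keyword names become string keys, which is why they are written without quotes in the call.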

---

## Exercise 15

---

Create a script that tests the examples given above for default parameters and variable numbers of parameters.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)


In [6]:
# Default parameters

def calcValue(num1, num2=20, num3=8):

	total = num1 + num2 - num3

	return total


ans = calcValue(6)

print (ans)


# Call it with 2 numbers so only num3 uses the default

ans = calcValue(6, 10)

print (ans)


# Can still call it with 3 different numbers

ans = calcValue(6, 10, 5)

print (ans)


# Can also change just one of the defaults

ans = calcValue(6, num3=4)

print (ans)


# Variable number of parameters in a tuple

def calcValue(num1, num2, *morenums):

	total = num1 + num2
	for n in morenums:
		total += n

	return total


ans = calcValue(2, 4,  5, 12, 15)
print (ans)
		

# Variable number of parameters in a dictionary

def testFunc (val1, val2, **morevals):

	print ("Val1 is", val1)
	print ("Val2 is", val2)

	print ("The dictionary")

	for k in morevals.keys():
		print ("Key is", k, "Value is", morevals[k])



testFunc("First", "Second", Third=3, Fourth=4, Fifth=5)


18
8
11
22
38
Val1 is First
Val2 is Second
The dictionary
Key is Third Value is 3
Key is Fourth Value is 4
Key is Fifth Value is 5


---

## Exercise 16

---

In the Example Program - Sequence Translation part of the tutorial you were provided with the code to translate a nucleotide sequence into amino acids, which you tested in Exercise 9.

The fasta file (<b>sequence.txt</b>) and codons file (<b>codons.txt</b>) needed to test the script can be downloaded from the <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">Exercise Answers</a> page.

Modify the code so that it translates the sequence in all 6 reading frames. You should include a function that carries out the translation and call it 6 times.

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [21]:
# Dictionary for the reverse complementing of the nucleic acids

trans = {'A':'T', 'T':'A', 'G':'C', 'C':'G'}


def translate(seq):
    aaseq = ""
    for count in range(0, len(seq), 3):  # Work through the list
        codon = seq[count:count+3]   # Get 3 nucleotides
        if codon in codons:
            aa = codons[codon]   # Get the associated aa
        else:
            aa = '-'

        aaseq += aa

    return aaseq


with open('codons.txt') as codons_file:
    codons_list = codons_file.readlines()

    codons = {}  # Initialise the dictionary
    for count in range(0, len(codons_list), 2):  # Work through the list
        key = codons_list[count].rstrip()   # Get the codon
        key = key.upper()
        value = codons_list[count+1].rstrip()  # Get the aa, on next line
        value = value.upper()	
        codons[key] = value  # Add to dictionary

with open('sequence.txt') as in_file:

    seq_list = in_file.readlines()  # Read the file to a list
    seq_name = seq_list.pop(0)  # Remove the sequence name
    seq = seq_list.pop(0)  # Initialise the sequence
    seq = seq.rstrip()  # Remove the newline character

    for line in seq_list:  #  Append the rest of the sequence to seq
        seq += line.rstrip()
    seq = seq.upper()  # Upper case the sequence

    for i in range(1, 3):
        if i == 1:
            print("Forward Strand")
        else:
            print("\nReverse Strand")
        for j in range(1, 4):
            aaseq = translate(seq[j-1:])  # Frame j starts at offset j-1

            print("Frame", j, ":\n", aaseq)

        # After the forward pass, reverse complement the sequence
        if i == 1:
            revseq = ""
            seq = seq[::-1]
            for base in seq:
                if base in trans:
                    revseq += trans[base]
                else:
                    revseq += "N"  # Unknown base
            seq = revseq


Forward Strand
Frame 1 :
 MDLEKNYPTPRTSRTGHGGVNQLGGVFVNGRPLPDVVRQRIVELAHQGVRPCDISRQLRVSHGCVSKILGRYYETGSIKPGVIGGSKPKVATPKVVEKIAEYKRQNPTMFAWEIRGRLLAERVCDNDTVPSVSSINRIIRTKVQQPPNQPVPASSHSIVSTGSVTQVSSVSTDSAGSSYSISGILGITSPSADTNKRKRDEGIQESPVPNGYSLPGRDFLRKQMRGDLFTQQQLEVLDRVFERQHYSDIFTTTEPIKPEQTTEYSAMASLAGGLDDMKANLASPTPADIGSSVPGPQSYPIVTGRDLASTTLPGYPPHVPPAGQGSYSAPTLTGMVPGSEFSGSPYSHPQYSSYNDSWRFPNPGLLGSPYYYSAAARGAAPPAAATAYDRH-
Frame 2 :
 WI*RKIIRLLGPAGQDMEE*ISLGGFL*MDGHSRM*SARG*WNLLIKVSGPATSPGSFGSAMVVSAKFLAGIMRQEASSLG*LEDPNQRSPHPKWWKKSLSINAKIPPCLPGRSGAGCWQSGCVTMTPCLASVPSTGSSGQKYSSHPTNQSQLPVTA*CPLAP*RRCPR*ARIRPARRTPSAASWASRPPAPTPTSARETKVFRSLRCRTATRFRAETSSGSRCGETCSHSSSWRCWTACLRGSTTQTSSPPQSPSSPSRPQSIQPWPRWLVGWTT*RPIWPAPPLLTSGAVCQARSPTPL*QAVTWRARPSPGTLHTSPPLDRAATQHRR*QGWCLGVSFPGVPTATLSIPRTTTPGGSPTRGCLAPPTIIALPPEEPPHLQPPLPMTVT-
Frame 3 :
 GFREKLSDSSDQQDRTWRSESAWGGFCEWTATPGCSPPEDSGTCSSRCQALRHLQAASGQPWLCQQNSWQVL*DRKHQAWGNWRIQTKGRHTQSGGKNR*V*TPKSHHVCLGDQGPAAGRAGV*Q*HRA*RQFHQQDHPDKSTAATQPTSPSFQSQHSVHWLRDAG

---

## Exercise 17

---

Modify your code from the previous exercise so that the function translates only a section of the sequence. It should translate that section in a single frame on a single strand, both determined by the function's parameters.


For example, the function could take 4 arguments:

    Sequence
    Start
    End
    Sequence strand (1 or -1)

Remember, for the minus strand you will need to reverse complement the sequence.
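
For the minus strand the bases are complemented and their order is reversed, so that the result reads 5' to 3'. A minimal sketch of the difference, using Python's built-in `str.maketrans` and `str.translate` on a made-up sequence (the names `COMPLEMENT_TABLE` and `reverse_complement` are illustrative). Note that `str.translate` leaves characters that are not in the table unchanged, whereas the answer code replaces unknown bases with `N`.

```python
# Illustrative sketch: complement vs. reverse complement of a made-up sequence.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]


forward = "ATGCCG"
print(forward.translate(COMPLEMENT_TABLE))  # Complement only
print(reverse_complement(forward))          # Minus strand, read 5' to 3'
```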

(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)


In [22]:
# Dictionary for the reverse complementing of the nucleic acids

trans = {'A':'T', 'T':'A', 'G':'C', 'C':'G'}


def translate(seq, start, end, strand):

    seq = seq[start:end]

    if strand == -1:
        # Reverse complement the sequence
        revseq = ""

        seq = seq[::-1]
        for base in seq:
            if base in trans:
                revseq += trans[base]
            else:
                revseq += "N"
        seq = revseq

    aaseq = ""
    for count in range(0, len(seq), 3):  # Step through the sequence, 3 bases at a time
        codon = seq[count:count+3]   # Get 3 nucleotides
        if codon in codons:
            aa = codons[codon]   # Get the associated aa
        else:
            aa = '-'

        aaseq += aa

    return aaseq


with open('codons.txt') as codons_file:
    codons_list = codons_file.readlines()

    codons = {}  # Initialise the dictionary
    for count in range(0, len(codons_list), 2):  # Work through the list
        key = codons_list[count].rstrip()   # Get the codon
        key = key.upper()
        value = codons_list[count+1].rstrip()  # Get the aa, on next line
        value = value.upper()
        codons[key] = value  # Add to dictionary

with open('sequence.txt') as in_file:

    seq_list = in_file.readlines()  # Read the file to a list
    seq_name = seq_list.pop(0)  # Remove the sequence name
    seq = seq_list.pop(0)  # Initialise the sequence
    seq = seq.rstrip()  # Remove the newline character

    for line in seq_list:  #  Append the rest of the sequence to seq
        seq += line.rstrip()
    seq = seq.upper()  # Upper case the sequence

    aaseq = translate(seq, 10, 109, -1)

    print(aaseq)



LHPGVAVHSQKPPQADSLLHVLSCWSEESDNFS


---

## Exercise 18

---

Modify your code from the previous exercise so that the function extracts multiple sections of the sequence, joins them and translates the result. The previous version could only translate a single-exon gene, but this version should translate a multi-exon gene.

The number of exons will vary so your function will have to take a single list of start and end positions.
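
One possible way to handle a variable number of exons is to pass a list of `(start, end)` pairs and join the slices before translating. The sketch below assumes 1-based, inclusive positions (as in the annotation that follows) and uses a made-up sequence; `splice_exons` is an illustrative name and the function handles the forward strand only.

```python
# Illustrative sketch: join exon sub-sequences from a list of positions.
def splice_exons(seq, exons):
    """Join the exons of seq, given 1-based, inclusive (start, end) pairs."""
    coding = ""
    for start, end in exons:
        coding += seq[start - 1:end]  # 1-based inclusive -> 0-based slice
    return coding


seq = "AAATGGCCTTTTAAGG"
print(splice_exons(seq, [(3, 8), (11, 14)]))
```

The joined string can then be passed to the translation loop as a single coding sequence.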

To test this version of the function you can use a different and much larger sequence (<b>blumeria_seq.fasta</b>), available from the <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">Exercise Answers</a> page. This is a contig annotation from Blumeria graminis and the genes in the sequence are annotated so you can use those positions to test your function. There are 3 genes, 2 on the forward strand and 1 on the reverse:

Gene 1 (Strand +)

    Exon 1    Start    4550    End    4636
    Exon 2    Start    4698    End    4796
    Exon 3    Start    4852    End    5565
    Exon 4    Start    5623    End    5715

Gene 2 (Strand +)

    Exon 1    Start    7647    End    7759
    Exon 2    Start    7854    End    8216
    Exon 3    Start    8267    End    8660

Gene 3 (Strand –)

    Exon 1    Start    17433    End    17709
    Exon 2    Start    17243    End    17370

You could also change the function to use arguments with default values. For example, a default could be “strand=1”, so the function assumes the positive strand.
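
A small sketch of how a default value behaves, assuming the positions are passed as a list so that `strand` can be the last, optional parameter. The function `describe_gene` is hypothetical and only reports what it was given.

```python
# Illustrative sketch: a parameter with a default value.
def describe_gene(positions, strand=1):
    """Summarise a gene; strand defaults to 1 (the positive strand)."""
    symbol = "+" if strand == 1 else "-"
    return f"{len(positions)} exon(s) on the {symbol} strand"


# strand omitted, so the default of 1 is used
print(describe_gene([(7647, 7759), (7854, 8216), (8267, 8660)]))
# strand given explicitly as a keyword argument
print(describe_gene([(17433, 17709), (17243, 17370)], strand=-1))
```

Parameters with defaults must come after those without, which is one reason to pass the positions as a single list rather than as a variable number of arguments.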

NOTE: The correct translations are:

Gene 1 Translation: MSLPTVLLVYKIIHTVDEWNDLSRIATLKVFNGNSRQDFLKSCEDGEFNEVTIIYYSNESVRLIGQLDQELLFKLPPSIRYICHHGAGYDSIDIAACTERSIQVSHTPIAVNKSTADMTLFLILGALRRIHVPYLAVRAGQWRGSTQLGHDPENKLLGILGMGGIGQEVAKRAKAFGMKVQYHNRSRLVPELEQGADYVSFEHLIKTSDILSLNCSLNKATIGIIGKDELAQMKQGVIVVNTARGKLIDEFALVKALESGKVFSAGLDVFEKEPSIDDTLLSSPNVVLTPHIGTATVETQRAMELLVLQNIENALKTDALVTQVQEQIQA*
Gene 2 Translation: MSVVSLLGVNVLQNPARFGDPYEFEITFECLETLQKGTYIVGHKLTPVFNKLIFNSNDHDQELDSLLVGPIPVGVNKFIFVADPPDTNKIPDAEILGVTVILLTCAYDGREFVRVGYYVNNEYDSDELNTDPPAKPILEKVRRNILAEKPRVTRFAIKWDSDDSAPPLYPPEQPEADLVADGEEYGAEEAEDEEEEESADGPEVPADPDVMIEDSEVAGAMVETVKATEEESDAGSEDLEAESSGSEEDEIEEDEEREDEPEEAMDLDGAGKPNGATSSTHNTDTTMAH*
Gene 3 Translation: MWSLQYLALLVFVSRAAANYQHWDIDSAVNCQNNIWTSNYLKQVRLRYCNHPPSPQSLVDISRHPNSQLHSNRSTNIYFLSIPRPGPDGELGDDVSNYYMLVDADCNYYSVVGLNARYAQGFVLNSQTVPCRLA*


(Answers to all exercises are available <a href="http://teaching.bc.ic.ac.uk/msc/ipython-files/exercises.html">here</a>.)

In [25]:
# Dictionary for the reverse complementing of the nucleic acids

trans = {'A':'T', 'T':'A', 'G':'C', 'C':'G'}


def translate(seq, strand, *pos):

    code_seq = ""
    for p in range(0, len(pos), 2):
        start = pos[p]
        end = pos[p+1]

        ex_seq = seq[start-1:end]

        if strand == -1:
            # Reverse complement the exon
            revseq = ""

            ex_seq = ex_seq[::-1]
            for base in ex_seq:
                if base in trans:
                    revseq += trans[base]
                else:
                    revseq += "N"
            ex_seq = revseq

        code_seq += ex_seq

    aaseq = ""
    for count in range(0, len(code_seq), 3):  # Step through the coding sequence, 3 bases at a time
        codon = code_seq[count:count+3]   # Get 3 nucleotides
        if codon in codons:
            aa = codons[codon]   # Get the associated aa
        else:
            aa = '-'

        aaseq += aa

    return aaseq


with open('codons.txt') as codons_file:
    codons_list = codons_file.readlines()

    codons = {}  # Initialise the dictionary
    for count in range(0, len(codons_list), 2):  # Work through the list
        key = codons_list[count].rstrip()   # Get the codon
        key = key.upper()
        value = codons_list[count+1].rstrip()  # Get the aa, on next line
        value = value.upper()
        codons[key] = value  # Add to dictionary

with open('blumeria_seq.fasta') as in_file:

    seq_list = in_file.readlines()  # Read the file to a list
    seq_name = seq_list.pop(0)  # Remove the sequence name
    seq = seq_list.pop(0)  # Initialise the sequence
    seq = seq.rstrip()  # Remove the newline character

    for line in seq_list:  #  Append the rest of the sequence to seq
        seq += line.rstrip()
    seq = seq.upper()  # Upper case the sequence

    aaseq = translate(seq, 1, 4550, 4636, 4698, 4796, 4852, 5565, 5623, 5715)

    print("Gene 1 Translation:", aaseq)

    aaseq = translate(seq, 1, 7647, 7759, 7854, 8216, 8267, 8660)

    print("Gene 2 Translation:", aaseq)

    aaseq = translate(seq, -1, 17433, 17709, 17243, 17370)

    print("Gene 3 Translation:", aaseq)






Gene 1 Translation: MSLPTVLLVYKIIHTVDEWNDLSRIATLKVFNGNSRQDFLKSCEDGEFNEVTIIYYSNESVRLIGQLDQELLFKLPPSIRYICHHGAGYDSIDIAACTERSIQVSHTPIAVNKSTADMTLFLILGALRRIHVPYLAVRAGQWRGSTQLGHDPENKLLGILGMGGIGQEVAKRAKAFGMKVQYHNRSRLVPELEQGADYVSFEHLIKTSDILSLNCSLNKATIGIIGKDELAQMKQGVIVVNTARGKLIDEFALVKALESGKVFSAGLDVFEKEPSIDDTLLSSPNVVLTPHIGTATVETQRAMELLVLQNIENALKTDALVTQVQEQIQA*
Gene 2 Translation: MSVVSLLGVNVLQNPARFGDPYEFEITFECLETLQKGTYIVGHKLTPVFNKLIFNSNDHDQELDSLLVGPIPVGVNKFIFVADPPDTNKIPDAEILGVTVILLTCAYDGREFVRVGYYVNNEYDSDELNTDPPAKPILEKVRRNILAEKPRVTRFAIKWDSDDSAPPLYPPEQPEADLVADGEEYGAEEAEDEEEEESADGPEVPADPDVMIEDSEVAGAMVETVKATEEESDAGSEDLEAESSGSEEDEIEEDEEREDEPEEAMDLDGAGKPNGATSSTHNTDTTMAH*
Gene 3 Translation: MWSLQYLALLVFVSRAAANYQHWDIDSAVNCQNNIWTSNYLKQVRLRYCNHPPSPQSLVDISRHPNSQLHSNRSTNIYFLSIPRPGPDGELGDDVSNYYMLVDADCNYYSVVGLNARYAQGFVLNSQTVPCRLA*


---

<b>The third part of the Python tutorial is available <a href="PythonTutorial_Pt3.ipynb">here</a>.</b><br><br>
