# 1. Source

The problem statement is on **Rosalind**: [Protein Translation](https://rosalind.info/problems/ptra/)

**Problem**

![Protein Translation](ptra_problem.png "Protein Translation")

**Sample Dataset**

ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA<br>
MAMAPRTEINSTRING

**Sample Output**

1

# 2. Workspace

In [1]:
# read the input file and extract both sequences

with open('ptra_test.txt', 'r') as file:
    dnaSeq = file.readline().rstrip().upper()
    proSeq = file.readline().rstrip().upper()
    
# print

print(dnaSeq)
print(proSeq)

ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA
MAMAPRTEINSTRING


In [2]:
# import biopython for easy translation

from Bio.Seq import Seq
from Bio.Data import CodonTable

dnaSeq = Seq(dnaSeq)

dnaSeq

Seq('ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA')

In [3]:
# translate with table 1 (Standard), stopping at the first stop codon

dnaSeq.translate(table = 1, to_stop = True)

Seq('MAMAPRTEINSTRING')

In [4]:
# same translation with table 6 (Ciliate Nuclear)

dnaSeq.translate(table = 6, to_stop = True)

Seq('MAMAPRTEINSTRING')
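
Tables 1 and 6 return the same protein here. A minimal pure-Python sketch of why, assuming the compact `AAs` strings that NCBI publishes for these two tables (transcribed by hand, so worth checking against the NCBI page). `build_table` and `translate_to_stop` are hypothetical helpers, not Biopython functions:

```python
from itertools import product

# Codons in NCBI order: each base position cycles through T, C, A, G.
CODONS = ["".join(bases) for bases in product("TCAG", repeat=3)]

# Compact amino-acid strings ('*' marks a stop codon).
# Assumption: transcribed by hand for tables 1 and 6.
AAS = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    6: "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
}

def build_table(table_id):
    """Map each codon to its amino acid (or '*') for the given table."""
    return dict(zip(CODONS, AAS[table_id]))

def translate_to_stop(dna, table_id):
    """Translate codon by codon and stop before the first stop codon."""
    table = build_table(table_id)
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGGCCATGGCGCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA"

# Codons on which the two tables disagree.
differing = [c for c in CODONS if build_table(1)[c] != build_table(6)[c]]
print(differing)
print(translate_to_stop(dna, 1))
print(translate_to_stop(dna, 6))
```

Under these assumptions the two tables differ only at TAA and TAG, and the sample DNA has neither in frame (its stop codon is TGA), so both translations agree.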

In [5]:
# we can get codon tables directly from biopython

tables = CodonTable.generic_by_id

tables

{1: NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...),
 2: NCBICodonTable(id=2, names=['Vertebrate Mitochondrial', 'SGC1'], ...),
 3: NCBICodonTable(id=3, names=['Yeast Mitochondrial', 'SGC2'], ...),
 4: NCBICodonTable(id=4, names=['Mold Mitochondrial', 'Protozoan Mitochondrial', 'Coelenterate Mitochondrial', 'Mycoplasma', 'Spiroplasma', 'SGC3'], ...),
 5: NCBICodonTable(id=5, names=['Invertebrate Mitochondrial', 'SGC4'], ...),
 6: NCBICodonTable(id=6, names=['Ciliate Nuclear', 'Dasycladacean Nuclear', 'Hexamita Nuclear', 'SGC5'], ...),
 9: NCBICodonTable(id=9, names=['Echinoderm Mitochondrial', 'Flatworm Mitochondrial', 'SGC8'], ...),
 10: NCBICodonTable(id=10, names=['Euplotid Nuclear', 'SGC9'], ...),
 11: NCBICodonTable(id=11, names=['Bacterial', 'Archaeal', 'Plant Plastid', None], ...),
 12: NCBICodonTable(id=12, names=['Alternative Yeast Nuclear', None], ...),
 13: NCBICodonTable(id=13, names=['Ascidian Mitochondrial', None], ...),
 14: NCBICodonTable(id=14, names=['Alternativ

In [6]:
# just the keys

tables = list(CodonTable.ambiguous_generic_by_id.keys())

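# tables 27, 28 and 31 cannot be used with to_stop = True:
# each has a codon that can be read as either STOP or an amino acid,
# and Biopython raises a ValueError for such tables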

tables.remove(27)
tables.remove(28)
tables.remove(31)

In [7]:
# we can try each of them

correct_tables = list()

# translations for each table

for table in tables:
    if dnaSeq.translate(table = table, to_stop = True) == proSeq:
        correct_tables.append(table)
        
# display

print(*correct_tables, sep = ' ')

1 6 11 12 15 16 22 23 26 29 30 32


# 3. Implementation

In [8]:
def ptra(filename, return_all = False):
    
    '''
    input
        filename: a file containing
            a dna sequence on the first line: dnaSeq
            a protein sequence on the second line: proSeq
        return_all: a boolean flag; if True, report every matching table,
            otherwise report only one of them
    process
        translates dnaSeq with each codon table variant
        compares each translation with proSeq
    output
        prints the matching translation table(s) to the console
        writes the same answer to a file
    '''
    
    from Bio.Seq import Seq
    from Bio.Data import CodonTable
    import random

    # fixed seed so the single-table choice is reproducible
    random.seed(59)
    
    # read input file and extract dnaSeq & proSeq
    with open(filename, 'r') as file:
        dnaSeq = file.readline().rstrip().upper()
        proSeq = file.readline().rstrip().upper()
        
    # create codon tables list
    tables = list(CodonTable.ambiguous_generic_by_id.keys())
    for i in [27, 28, 31]:
        tables.remove(i)
        
    # convert dnaSeq into a Seq() object
    dnaSeq = Seq(dnaSeq)
    
    # translate with each table and compare the result with proSeq
    correct_tables = list()
    for table in tables:
        if dnaSeq.translate(to_stop = True, table = table) == proSeq:
            correct_tables.append(str(table))
            
    # final representation
    if return_all:
        result = ' '.join(correct_tables)
    else:
        result = random.choice(correct_tables)
        
    # print answer to console
    print('\n\x1B[1mANSWER\x1B[0m\n______\n')
    print(f'{result}')
    
    # write the answer to a file named after the input file
    answer_file = f'{filename.split(".")[0]}_answer.txt'
    with open(answer_file, 'w') as file:
        file.write(f'{result}')
    print('\n\n#! The answer has been written into the file:',
          f'\x1B[1m./{answer_file}\x1B[0m\n')
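
The answer file name above comes from `filename.split(".")[0]`, which keeps only the text before the first dot. A hedged alternative using the standard library's `pathlib` is sketched here. `answer_path` is a hypothetical helper, not part of the notebook:

```python
from pathlib import Path

def answer_path(filename):
    """Return '<stem>_answer.txt' next to the input file."""
    path = Path(filename)
    return path.with_name(f"{path.stem}_answer.txt")

print(answer_path("ptra_test.txt"))

# A dot elsewhere in the path truncates the split-based name,
# while the pathlib version only drops the final extension.
print(repr("./runs.v2/ptra.txt".split(".")[0]))
print(answer_path("./runs.v2/ptra.txt").name)
```

For the file names used in this notebook both approaches give the same result. The difference only shows up for inputs such as relative paths starting with `./` or directories containing dots.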

In [9]:
ptra('ptra_test.txt', True)


ANSWER
______

1 6 11 12 15 16 22 23 26 29 30 32


#! The answer has been written into the file: ./ptra_test_answer.txt



In [10]:
ptra('ptra_test.txt')


ANSWER
______

12


#! The answer has been written into the file: ./ptra_test_answer.txt
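
With `return_all = False` the function picks one matching table via `random.choice`, which raises `IndexError` when no table matches. A small sketch of a deterministic alternative, assuming any single matching table is an acceptable answer. `pick_table` is a hypothetical helper:

```python
def pick_table(correct_tables, return_all=False):
    """Format the answer from a list of table ids given as strings."""
    if not correct_tables:
        raise ValueError("no codon table reproduces the protein sequence")
    if return_all:
        return " ".join(correct_tables)
    # Smallest numeric id, so repeated runs always agree.
    return min(correct_tables, key=int)

print(pick_table(["6", "15"]))
print(pick_table(["6", "15"], return_all=True))
```

The `key=int` matters because the ids are stored as strings, and a plain string comparison would rank `"15"` before `"6"`.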



In [11]:
ptra('rosalind_ptra_2_dataset.txt', True)


ANSWER
______

6 15


#! The answer has been written into the file: ./rosalind_ptra_2_dataset_answer.txt



In [12]:
ptra('rosalind_ptra_2_dataset.txt', False)


ANSWER
______

6


#! The answer has been written into the file: ./rosalind_ptra_2_dataset_answer.txt



In [13]:
ptra('rosalind_ptra.txt', True)


ANSWER
______

2 5 13


#! The answer has been written into the file: ./rosalind_ptra_answer.txt



In [14]:
ptra('rosalind_ptra.txt')


ANSWER
______

2


#! The answer has been written into the file: ./rosalind_ptra_answer.txt



<p style='text-align: right;'>
    <!--<b><font size = '5'>Contact</font></b><br>-->
    <b>Orcun Tasar</b><br>
    <i>Bioinformatician / Data Scientist</i><br>
    orcuntasar |at@| ogr.iu.edu.tr<br>
    tasar.orcun |at@| gmail.com<br>
    <a href = 'https://www.linkedin.com/in/orçun-taşar-7b5992a1/'>Linkedin</a> | <a href = 'https://www.instagram.com/shatranuchor/'>Instagram</a>
</p>