The following problem asks you to find the translation of an RNA string into an amino acid string.

Protein Translation Problem: Translate an RNA string into an amino acid string.

Input: An RNA string Pattern and the array GeneticCode.<br>
Output: The translation of Pattern into an amino acid string Peptide.<br>
Code Challenge: Solve the Protein Translation Problem.

Notes:

The "Stop" codon should not be translated, as shown in the sample below.<br>
For your convenience, we provide a downloadable RNA codon table indicating which codons encode which amino acids.

Sample Input:

    AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA
Sample Output:

    MAMAPRTEINSTRING

In [5]:
def translation(rna):
    result = ""
    dictionary = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
    "UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
    "UAU":"Y", "UAC":"Y", "UAA":"", "UAG":"",
    "UGU":"C", "UGC":"C", "UGA":"", "UGG":"W",
    "CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
    "CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
    "CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
    "CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
    "AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
    "ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
    "AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
    "AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
    "GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
    "GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
    "GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
    "GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G"}
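    # Walk the string one codon (three bases) at a time, ignoring a trailing partial codon.
    for i in range(0, len(rna) - 2, 3):
        amino_acid = dictionary[rna[i:i + 3]]
        if amino_acid == "":
            # Stop codon: translation ends here and the codon itself is not emitted.
            break
        result += amino_acid
    return result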

'MIVRIQDRALWLWASRIVQTAVPRAIKPKALLNQTLLRNKASHDPCSILGPRRSALILFLLVPSLIICVLLPPKDMISGNPPLTPLPRAADSVALPNTRSTAEDLTEHSDLMGHASSEEVACSDGAASTLLGAVRSSSRIKGGGYHKNKLALSFRIAVQTLSSTDVKRAISPSRPRGGTLIRILRPQLSFHYHRNNIAYKYFVGNSIFAAPGIVTLYPGTSVQVLLRLSRLHFMLLSMRGQRSETHYSNLGRMSSPTLFSGGMSWYSDGLMRSVPGLVTERPWGYCLGDLITRLCRVFTIPLEPSCLLQQLRTRIQLLDVQLPWKSLAARDAKGRAHGSDDAWGPTFRRSKFPELRPRPCSFDGPLTSSLVVARLVTPPNTPTVIYNSRGSHFIIAPWVYRLGVLCHRACVRTHMLPEDTIRVKWLYLEIGGGCYTPQSRAFKPCVAKPTFVHQDDTEESTRKRPTSGKAVRPTKHLSPLAAVLGQPAVLTCYRVRFCKCQVLGLSREGTTRVTCKYTRQTRWNLGIPTEGPSRHESQALPGNPQPSIHSGISSMARVTTYSPTNGTGDMAHIDLIGIGLMARVYHHKHASKFVMDKDNIKQLSSRFSDFHHNGNDVLPGVFDGEATPKSRWCLTLRIETGASFIDNVSAIAGISSIVPGTINEYRFGPISQRYREVPYWSGYRPAQCRHGDVKPSKGGTLPLLRELHIGTFAARLLLCFILIILKKLTDGRGKPYVASSTVSNLHVNLAGEFTLSVRPSIIEWSYKVRVVKDKDLATRRYSQGSGLAGKRSFWGKTTYRYLNGRGHLDRDRVPAGILLGGEPIRHWRPPTCVSTCSRSTSHPTPGHMTEGGGCPIREGVGSLGLLRWDRFLTAHNCPKRPGHKLKLVHIYSLQRIRDQFGSSGPKLYLPSLHLGDAYRIPCRTPVKASHSVTARNGVAFRFRTFTYNSTGSIISSEGLETPAGLQPGRKPPDEEREAWSGVPLSYFMSLQDPHARSAS

We say that a DNA string Pattern encodes an amino acid string Peptide if the RNA string transcribed from either Pattern or its reverse complement translates into Peptide. For example, the DNA string GAAACT is transcribed into GAAACU and translated into ET. The reverse complement of this DNA string, AGTTTC, is transcribed into AGUUUC and translated into SF. Thus, GAAACT encodes both ET and SF.
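
The GAAACT example can be checked with a few lines of Python. This is only a minimal sketch: the helper names (`transcribe`, `revcomp`, `translate`, `encoded_peptides`) are made up for illustration, and the codon table is built from the standard genetic code written in UCAG order rather than taken from the GeneticCode array mentioned in the problem.

```python
from itertools import product

# Standard genetic code in UCAG order; "*" marks the three stop codons.
# (Assumption: this stands in for the GeneticCode array of the problem.)
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(dna):
    """Transcribe a DNA string into RNA by replacing T with U."""
    return dna.replace("T", "U")


def revcomp(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(rna):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def encoded_peptides(dna):
    """Return the peptides encoded by a DNA string and by its reverse complement."""
    return translate(transcribe(dna)), translate(transcribe(revcomp(dna)))


print(encoded_peptides("GAAACT"))  # ('ET', 'SF')
```

Building the table from a 64-character string keeps the sketch short; a full dictionary literal does the same job and is easier to read.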

Peptide Encoding Problem: Find substrings of a genome encoding a given amino acid sequence.

Input: A DNA string Text, an amino acid string Peptide, and the array GeneticCode.<br>
Output: All substrings of Text encoding Peptide (if any such substrings exist).<br>
Code Challenge: Solve the Peptide Encoding Problem.

Note: The solution may contain repeated strings if the same string occurs more than once as a substring of Text and encodes Peptide.

Sample Input:

    ATGGCCATGGCCCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA
    MA
Sample Output:

    ATGGCC
    GGCCAT
    ATGGCC
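
One possible way to approach this problem is to slide a window of length 3·|Peptide| along Text and keep every window whose translation, or whose reverse complement's translation, equals Peptide. The sketch below is written under assumptions: the names `translate_dna`, `revcomp` and `peptide_encoding` are hypothetical, and the codon table is the standard genetic code written over the DNA alphabet (T in place of U), so no separate transcription step is needed.

```python
from itertools import product

# Standard genetic code in TCAG order over the DNA alphabet; "*" marks stop codons.
# (Assumption: this stands in for the GeneticCode array of the problem.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
DNA_CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def revcomp(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_dna(dna):
    """Translate a DNA string whose length is a multiple of 3.

    Stop codons are emitted as "*", so a window containing one can never
    equal a peptide made only of amino acid letters.
    """
    return "".join(DNA_CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def peptide_encoding(text, peptide):
    """Return every substring of text (in order of position) that encodes peptide."""
    if not peptide:
        return []
    k = 3 * len(peptide)
    hits = []
    for i in range(len(text) - k + 1):
        window = text[i:i + k]
        if translate_dna(window) == peptide or translate_dna(revcomp(window)) == peptide:
            hits.append(window)
    return hits


sample = "ATGGCCATGGCCCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA"
print("\n".join(peptide_encoding(sample, "MA")))
```

Each window is examined once, so overlapping and repeated occurrences are reported in order of position, and the running time grows with the length of Text rather than with the number of DNA strings that could encode Peptide. A window that encodes Peptide on both strands is reported only once here, which is a design choice of this sketch.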

In [24]:
from itertools import product

def reverse_translate(peptide):
    aminoacids = {
    "M": ["ATG"],
    "I": ["ATA", "ATC", "ATT"],
    "A": ["GCT", "GCA", "GCC", "GCG"],
    "S": ["TCA", "TCC", "TCG", "TCT"],
    "F": ["TTC", "TTT"],
    "P": ["CCA", "CCC", "CCG", "CCT"],
    "C": ["TGC", "TGT"],
    "K": ["AAG", "AAA"],
    "H": ["CAT", "CAC"],
    "D": ["GAT", "GAC"],
    "V": ["GTA", "GTC", "GTG", "GTT"],
    "L": ["TTG", "TTA", "CTA", "CTC", "CTG", "CTT"],
    "W": ["TGG"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
    "R": ["AGA", "AGG", "CGA", "CGG","CGT", "CGC"],
    "Y": ["TAT", "TAC"],
    "N": ["AAC", "AAT"],
    "Q": ["CAA", "CAG"],
    "E": ["GAA", "GAG"],
    "G": ["GGA", "GGC", "GGT", "GGG"],
    "*": ["TAA", "TAG", "TGA"]
    }
    result = []
    for i in range(len(peptide)):
        result.append(aminoacids[peptide[i]])
    result2 = list(map(lambda x: "".join(x), product(*result)))
    return result2


def reverse_complement(dna):
    result = ""
    for i in range(len(dna)):
        if dna[i] == "A":
            result += "T"
        elif dna[i] == "T":
            result += "A"
        elif dna[i] == "C":
            result += "G"
        elif dna[i] == "G":
            result += "C"
    return result[::-1]


def main(text, peptide):
    result = reverse_translate(peptide)
    result.extend(list(map(reverse_complement, result)))
    final_result = []
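    for item in result:
        # str.count skips overlapping matches, so scan with str.find instead.
        start = text.find(item)
        while start != -1:
            final_result.append(item)
            start = text.find(item, start + 1)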
    # Print answer
    for item in final_result:
        print(item)
    return final_result


ACATACAATACTCAAATGATCTGGACA
ACATACAATACTCAGATGATTTGGACC
ACCTATAACACGCAGATGATCTGGACG
ACCTACAACACCCAAATGATCTGGACA
ACGTACAACACGCAGATGATCTGGACC
ACTTATAACACACAGATGATTTGGACG
ACTTATAATACCCAGATGATTTGGACG
ACTTACAACACACAAATGATTTGGACT
AGTCCATATCATCTGAGTATTATATGT
CGTCCAAATCATTTGCGTGTTGTATGT
TGTCCAGATCATCTGGGTATTGTATGT
GGTCCAGATCATCTGCGTATTGTATGT
CGTCCAGATCATCTGAGTGTTATAGGT
TGTCCATATCATTTGAGTATTGTAGGT
TGTCCATATCATCTGGGTATTATACGT
CGTCCATATCATCTGGGTATTATACGT
GGTCCAAATCATCTGCGTATTATAAGT
GGTCCATATCATTTGTGTGTTGTAAGT
CGTCCAGATCATCTGCGTGTTGTAAGT
TGTCCATATCATTTGCGTATTGTAAGT

