<a href="https://colab.research.google.com/github/wvirany/rosalind/blob/main/solutions/rosalind08.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Translating RNA into Protein


---


### New Terms:
* proteins - the functional unit of the cell
* amino acids - the monomer unit for proteins; the same 20 amino acids commonly occur in most species
* primary structure - the order of amino acids on a protein
* polypeptides - a long chain of amino acids
* proteomics - the study of proteins and their properties
* genetic code - the exact specifications for translating nucleic acid codons into amino acids
* translation - the proces by which mRNA is converted into a peptide chain for the creation of a protein
* messenger RNA (mRNA) - an RNA molecule that serves as the blueprint for translation into protein
* codons - a triplet of contiguous nucleotides
* start codon (AUG) - the RNA codon AUG, which codes for the amino acid methionine and indicates the beginning of translation into protein
* stop codons (UAA, UAG, UGA) - one of three possible RNA codons that indicate the termination of protein translation
* central dogma of molecular biology - a postulate dictating that protein is always translated from RNA, which in turn is always transcribed from DNA
* organelle - a structure in the cell that serves as a central hub for some particular group of cellular functions
* ribosome - an organelle that carries out the assembly of peptides from mRNA during translation
* transfer RNA (tRNA) - the helper molecule used by ribosomes for physically translating codons into amino acids
* gene - an interval of DNA whose nucleotides are translated into a polypeptide for protein creation
* heredity - the scientific study of the inheritance of traits
* protein strings - a string composed of symbols from the English alphabet sans B, J, O, U, X, and Z; representing a peptide chain formed from amino acids
* genetic string - a DNA, RNA, or amino acid string
* the RNA codon table - a table indicating the translation of individual RNA codons into amino acids for the purpose of protein creation

### Problem:

`Given`: An RNA string $s$ corresponding to a strand of mRNA (of length at most 10 kbp)

`Returns`: The protein string encoded by $s$

In [7]:
from textwrap import wrap

In [105]:
'''
Hash Function:
  Takes codon as argument, i.e., 3 letter string of nucleotides
  Returns amino acid corresponding to codon
'''

def hash(codon):

  a, b, c = codon

  if a == 'U':
    if b == 'U':
      return 'F' if (c == 'U' or c == 'C') else 'L'
    elif b == 'C': return 'S'
    elif b == 'A':
      return 'Y' if (c == 'U' or c == 'C') else None
    elif b == 'G':
      return 'C' if (c == 'U' or c == 'C') else None if c == 'A' else 'W'
  elif a == 'C':
    if b == 'U': return 'L'
    elif b == 'C': return 'P'
    elif b == 'A':
      return 'H' if (c == 'U' or c == 'C') else 'Q'
    elif b == 'G': return 'R'
  elif a == 'A':
    if b == 'U':
      return 'M' if c == 'G' else 'I'
    elif b == 'C': return 'T'
    elif b == 'A':
      return 'N' if (c == 'U' or c == 'C') else 'K'
    elif b == 'G':
      return 'S' if (c == 'U' or c == 'C') else 'R'
  elif a == 'G':
    if b == 'U': return 'V'
    elif b == 'C': return 'A'
    elif b == 'A':
      return 'D' if (c == 'U' or c == 'C') else 'E'
    else: return 'G'

In [110]:
def transcribe_RNA(s):

  codons = wrap(s, 3)

  p_string = ''

  for i in codons:

    j = hash(i)

    if j: p_string += j
    else: break

  return p_string

In [113]:
# Sample data:

s = 'AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA'

p_string1 = transcribe_RNA(s)

print(p_string1)

MAMAPRTEINSTRING


In [112]:
# Problem data:

t = 'AUGAGGAAACAGACGUUGGUUUGUACACUGGGAACGCUUUGGGAGGAAGUUGUUCGUGAUAGCGAAGACAACCACCCGUUUUGGAGUGUAAGAAUCAGUUCCCGUCGACCCCUCAUCGGCCGGGAGCAAAAACACGGACUGCCACGAUGUCUUUGCUCGCAUGGAGCCUUACCCGUUGCGAAAGGUUAUACGCUAGGUCCAUUUAUAAAAGAAACAGAACCUUGCUACCGAUUCAAGUGGUCGCUCAGUAGUAAGAGUAACUGGUACGGUAUAUUUAAGACGAUCUCUCGGAGUCGGCCGAAGGUAUCGGGUAAUCGGCAGCUCCGUGUUUCCGACCCUCAUGCUAACGAUACCAAAGUCCACACUCCUAUUCGAUGGCGGGUUCAUGGAGCCCUAGCGCUAUGGCUAGCCAUCGGAAUCUGGAAACGUGGUGUGCCCUCAGGGGAACAAGUCUGGCUGACCAAAAUAUAUGCUAUUACGGGUCGUUCAGGUGACGCUAGCUUCCUGACUAUGACAGCGAACAUUUACGUUGUUAGUAUACCAAGAUCGGCGUCACAGAACCAACACUGCUUCCGAUUAUCCGUAUCCAUAACUGUCUGGGGCGUCAUCGUGUCUGAGCAGACGCCAGCAGAGAAUUAUGCGUUCCGGAGGACGUUCGCGGAUCUCAACGAUUCAACCGAAGACAUACAUCCAUGUGGAUAUCAGGCGAUGAAGAGCCAACCUCUCCCAAUGAUCUCACCCCAGAGUAUAAGCGUUUUAAUUGCCCCACAAAACUUGAGCUUUACUCUUCUAAUAUGUCUAUACCAUGUCCACGUGACGAAACCCUUGUCAAAUGGCUUGGUCGGCAUAUUUAAGCGAUGGAUCGAACUGACACAUGUCAUCUCACGGUUACGGCUCGACAAACGCGCGUUCGUUCUCCUCGGUUAUCCGGCCGACCGGCCAGCCACCGAUAGGAGGUUCGUGGGGACGGCGUCAACGGGGAGUGUACGGAUAUGGUUACAAAGCCCACUAUACGGGGCAAGCCAGCGUGCCUCAAUUCCACUGGAAGAUCUCGUAAUGCGACGCUCGUUCGAGUGGAUUAUUACAUCACAUCUUUCGUACAGGGAACAAUUAUCAUAUUACACUCGUCCUUACAGCUUGCACCUUUUCGCAUAUAAGCGAGGUCGGUCCUUUGGAGACAUUGGACUAGUCCAUGAGCCGUCGCCUAUUAGUCCAGAGAUCGCAUCGCCUUUGUGUCGCGCACCGGUUAGCCUCCAUCAUUGCCUUAGGCUGAAGCUCCCCUACGGUAUAGAGAGUCCUAAAACGCUCUUAUUCGAGCCAGCCCAUUGUUUUGUACGCCAGGUGGUGCCAGCGACGCACGUAUCACCCCCGCUCGAGUUUUAUGAUGGGCUGCAGGCAACGAUCGAAAGCUUUUUGACAUGGGUUGCAGCCGCUACAGCUCGAACGAAUCAGUCGUUUGUUCCAGGAUCCGUUCUGCUUACACCCAGGGGCAUGUCGGUAGUCAAGUACUUAUCGUGGACUGGACUGUCGGCAUUCUAUCCUCGUCCAGCUCACAGGGAACCAGCUUCGUCGCCGUGUUCAGAGAUACUUCAUAGGAACCCCAGAUCACUUGUAUACUCUUCGAUCUCACGGGGGCUACGGACGGACCUGGGGUGUCCGAGAGCAAAAAGAGGAACGACUCAUGUGCUGCUGUGGCUAGGGCAGAGGCCUUCUCCUCGAGGCACAUCCCCAUACUCUAAGUUUACGAAUCAGCCCUACAGGCCCUUACUUCAGGCGUUCCUGACCUCGGAAAAGGUGUGGCCAGAACAUCUUCAAUGGUACAUGCUGGCGAUAGGGAGCAUCAUAAACGGCGGCAAUCUUCUACUUUCGAAUAAUCUGGCAAGAUCAAAUGGGUUCUCGGCCUUGCAGGGAGUUCCCGAUCACCGGGAUGGGUGUCGUACUGGCUCCUGCGCCAAGUCACGACGACAACGACUCGUGAGGACGGAACGUAAGUUAACGUUAAUAAGGCCGGUAAGGACUAUAAGUCCACUCGGCCUCACCGGCUCUAUACGCGGGACGGUACAGGGUGAAUAUCUGUCCUGUACCCUAAAGUCCGCGUGUCGCCGGGUUCCGUUUAUCACCUCCAUACAGAAGGGAUUCUCUAGCUCUCAGGUGAAUGAGAAUGAGACUGAACCCCCACCGACAAAUAAAUUACAAGAUGACAACUCGAGAAUUGAACGGUGGAGGUUGGCUAGCAUCCGUACCGCUCUAGUACUAUUUAUCGACUCUUCCAUAAUGAGCAGGGACCGUCUUAACUCCUUCAAACGGGGUACGCAGAGAGUUUACGUCAGGCCACUCGCGAAUUUAUGGCAGCAGACUAAGUGCAUUAGAUUUAGCACCCCAAAUGAUGAGGGCGCUAUCGCUCAAGCUGGCCGAGUCAAAAUCCUAGAGGCUGCCGGCUCCGGAACAUCUGUCCACCUGGGCCUUGCAAACUACUCACUCAUACACAAGAGGCCACCCCUGACCCGCAAACGUAGAACUGCACAAGACAGUAAAAUUACUCGGGGGGACUCUAGCGCUGAGCUCAACUGCCCAGUAGAUGUUAUAGAUUAUGCAUCCAAAAACUGUUGGGGCGCUCAGACACGUACCACCGUUGGGUUCUCAAGCCAGUAUCUCGGUUGCGGCCGCAACGUACUACAUAAGUGGCAUGGCUUUAAGAGAGUUACCCCGUCUUUUGCCCGUUACUUAGCACGAGAGAAUCAUAUUCAGGUUGGCAUACUUGUAGAGCAAGCGGCAGCCGCGAAUCUCGAGACUCACCCGAGAGAGCUUCAUUUGGUGUUGGCAUACUCCCCGUAUAGCCUGCGUGACGUUCCGACUGAACGCUCACCUGAUUACCAUAUGAGCGGACAAGCGGGACGAAAACGAGCUGAGCCUACCCAUCACUUUAGGGGCUUGAUCGAGUACGCUAGAAUGUGUAACAAGUUGCAUUCUGGUGGAUGGAGUGGCUAUUCCGGGCCCGUGCCGUUCGGCCGUGAAGCCGCUGGGUUGAGCGGGACAAUUACUAGGGAAGUGGAGCGUUGCUCGGGGCAUCGGGCUUGUUUAAAAUCUCGUCGGAAUGACUCAUCUUCGUUGGACCUCUUGUGCCGAAUGCCAAUUUAUCGCUCUCGCUCGACGGAUUCCAAGAGCAAAGUCCGUACCGCCUUUGUUCGGCUGGUUGGCGAUGCGGCGCUAAGACCGGACCUGAUCCUUACACCCGGGAACAGUUGCCGUAAGGCCCGGCGAACGAGACCACGCGCCGGUGCGUAUCAAACUGUGUCUAAAGCCAGUGUAUGGUGCCAGCAAGAGUCAAAAACAAGAAGGAACCAAUGGCGAAAUUGUCGUCUCCCCAUCGUGCCGGGGCAGAAGCGAAGCUUUCACAGCAUCGUCGUUAUUCGAACCCUUAACAUUACCACACGUCCACUUUCGUCUGCUGAUCUAUCCACGCCAAAGUCUUGGCUUACAAAAAGACCUGGAGGACCUCGCCCAUUUCGGGCCAAAAGUCCGGCCGCUCUCCCUAAAAUUAUGGCAGCUCCCCUUCGAACAAGUCCACCGAGGGGGUUAUCAGCACCAGAAGGUUUUCUGAGUUUCCACAGCUCACCUGUAGAUAAGCUCUUCUCGGUAGCGAUACAAUCGGAACUCCGCUUCACGUAUUGCGUCGGCGCCUGUGGCCCGAAGUUCGGAGCAGACGAACCGGAUCAAAUUCCCUCCGUAAGCAUGGUCGACUUUAAGUCGUCCGGCCGCACUGAAGAGCAGAGCAACAGCUUUGCAGACUUAUUCGCUCUGAGAGCGAAUGGCAGACAGUGCCCUCUUUGUCUUCUCCUUCUUUGCCCUCGGGCCGCAUCCCAUAAGAAUCAAACAAAAAGAAUUCCCGACCCACCGGCUUUCACCAGUCUUUCCUAUGCACCAGGAAACCGCAUGGCCUCGGGCUCGAUCGAGGAGGAUCCGACAUCUAUGGUCAUCACCGGCUGUAUUGCAGACCUGGGAGCUGCUGCUGGCUGGAAAGGUUUAAUCAGGGUUCUGCAUCUGCCAAAGGUCCAGCAACCAGGCCUCAUCGGAUCCUACGCAGACACGAAUAGAAUACAUAACCAACCCCCAUCGCCUGACAAACAACCGUCAGGUGCGAUAUGGGUGCCCGUUCCUACCCCAUAUAUAGGACAUGGGAUGGUCCGUAAUCAUCGUGUGCCAUUAUUUCAGCCAACACGCUAUUUGGCGCCGCAAGCCGUGACAAAUGUUACGGGCCACUAUGUACUCGCAAUCAGUUCGUGCGCCAGAGAGCGCCUAAACGCUGCCUCGAGCAGGGCCUCAGCUGUCAUUACGAACAUCAAGAAUGCCCCUCUUCGUACGACGAGAUGCCACUCAUUGACCACCCGGCACCCUUACUCGGCGUUUCUACACAAACAGAGGACGAGGACUGCUCCUUGGACUCGCUGUCUCAAGACCUAUUGUGACGGUUACAGGACUCGCAACAGAAGAAGUGCAAACACUUCUGACGCAGUCCGGCUAAAAAUACUAGAAUUCGGUAGUGCGUGGGCGUGCGUGCGCCCCAUACCCUUUAGAUUCGUAGAGCCCCACAAAACUGAGUGCACUUCCGCUGAGAAGAAUGUUACUUCAGAGGUAGUUCGUGUACCCAUCUGCGUUGUGGUCGGAUUUCAUGCAUGGGCUGGCGCCGCGAAGAGCCCCAUUGCACCAGAGAGGAGAAGCCCUGGGUACGCUCACCUGCGGUACAUUCCGCUUACCAAACGGGUACGCUCGUCGGCAGCUCCCGGAUCUUACUACCCGUUCAAUUUCGUUCCACGAAUCGGAGGUGGGAGUCGCGAUUCACUAGAUCCACAUUUAACUCCAUCAGUAGUCACUGUGGGCCGUCUUGGCCGAGGACUUUAUGGCGUCAGCCACGCCCCGUACCGAAGAUACCUCGACCCUUCGUUUUAUGUCCCCGCAUUGCAGCAAGGUUCGGAGGGUGUCAUCUGUUUCCGGGUAUGUUAUGUCUCGGACUGGACAGAGGGACGGGCCUAUCACAAAAAAUGGGUAGGGCAUAUCGGCAAUUGCUUGACAGCAUACGAGGCAGUUGCUAUGACGAGUACUGAGCUCACUUCCGACCGUACGGCAACCAUCGAAUGCUCAAUGUACUGGGCGUUAGCCCCUAUACGUAUCACCGCACGGUAUAAUAUCAUUGAUCGAUGGCCAGCACACCUAGCGAUGUCAGCGCCUCGGGUAAUAUUAUUUGCGGACCUUACGUUUAAGUGGAUGUCGAAAGAAGGCUUCCGCAGGACGCUCUCCGGCUAUCGCCUCCACGCGCAUAAUUGGACCGAUGGUACUCAUCCCGUAUGCCACGAAUACACACUGUUAGCUCUACAAGAACUACUUUCGUUCGGGUAUCAUAUCACACAACUACUAGUAUUCGCUCGUCGGUGCUCAUUCUUGAGAAUCUCACGCUCGUCGGGGUGUCUUAGUCACAGGCCUUCCAGGAUUAAAACUCCCAAUCUCGCCGAAGACGCACUUGUGCCUAGGAUUGAUGGGUCGGUUCCACCUGUGGUCCGUAAUGGGGCUUACUUAACGCCGUUCCGGGUCCAGCCCAGACCUCUGCGAUUUGAUAUAAAAUCCUACACAUCCGACCGCCGCCCGGCCGCAAAAACCUCACUCGGUUUUAGCCGUAUAGUAUCGCAUCCUCUGUGUGGGCUCCAGUUGGAAAGGAACGCAUCAAAACAGCUAUGUGCUUCGCCUUGCGCCAGCCAAACGUGCCUUAUGGGUAUAAUCGAAUUUCUUUUGCAAGGGAGCCUCACUUGGCUGCAUUCGAUAAAGAUUAGCACGAACGGUGGUGCCUGCAAGGCUCCCGUUUGCAUAAUGCACUUGCAACUAGCACGUGUGAUUUCAUGGUCUGUACAAGUUCCCCAAAAUACGCUGUUAGGGUUCAUUUCUGAUAGAAUAUUCAGUGUCAGAUUUUCUAAGGACAAAUUCCUCAGUUUGAGUCGGAGUACCACGUCGAGGCGGGCCUGGAAGGGAGGUUCAUCCAGCCGCAAGACGUCUCACAACCUGUCCCGUUUCCCGUUGACGAAUAAGCAUAAGACAAUCAUACCAAUUGCGACCGAACAGGGCAGAUUGCUUUUUCAGUCCGGACCUCUUACCAUUCUCAGAGGUCUGCGCCGGGUAAGAAUACAGGUACGCACUGCAUAUCUCCGAAACGUGAGUCAUGAUCUAAAGCAGGUCUCGCGCGCCAUUGAAAUUUCGGUGUUACCGUACCCGAGGGGCUUAUGUGAAAAAUUUAAUAGGCAAUUCAUGGUUAACGUUCCAGAACAAAUACUUUCACGAGUCCUGACCCACGUUAGAAAUCUCAUAGAAAAGGUCGUGGCGGAGCUUACCUCGAAAGACUCACACCCCGGGCUACGGAACUUGGCGCCAUACAACUGUUCACUACACACUUUAAGCAUGUGCUCGCAGCAAGUGCAUGGCCCAUUACCCAAUACAGUGUGGGCAGGCACGCUUCCGGACAUCCUUAGCAGCACUUGUUCGGGAAAGAGUCAUGAACAAAAACGCUGGCUGGCAGCGCAAACUCGGCAGAUUACAGUAUGCGUAGAGGUCCUCUCUCACCCCCAUGCGAUGUGCUCUCAAGCUAGCGAGACAAGGCCUCUGUUAUUGUUUGAGCGGGCCCGGGAACUACUUCUACCGUCACCCAGGGGUGCGCGCCUCGCCUCAAGGCUAUUCGUGCCAUCGCUUCUGGCGCCCGCACAUUCCUUCACUUUUCAAGCCUUUAGGGCAAUGCUGUAUAUCGGAGGUCAAGGCUUAGGUGGGGAGCAUGUCUCCGCAAGACUUUUCAUAGCCCAACUACCACUCCGAGUACGUAUGCUGCAGAUAACGCUGAGCGGCACAGAGUAUAGCUAUGACUACAGUAGGGGCCUGAAGAUAAUUGGCAGCACCUUGUUGGCAGAUCCGAAUCUUUCUGUCGGGAGCACCCCUACGCGUGAGCGUUGCCGCGCUGGUCGGCUCAUAUUGUCCUUAUCCAUGAUGAGACAGAUAUGUUCUGAUUUCUCCAUUAGAUACACAUGUACGAACAGGUAUUCAGUCGAUUUACUAGGUUUACUUAACAAAGGAAGGCUGUGUACUAGACGACCCCUCACGGCUACCAUCCGCUCAUACCCUGCCGUUACGUUGAAACUCAAGACUGUCGACUCUUAUGGGGGCGGCACAGAGAACCCGCCGUUCCGCAAAGCGACAUUGUCCUCGCUUGCAUUCGGUCACUGGGAAAUAGCAACUAACUUGGUGCUUAACACGGAACCACUUGAACUAUUCGUGAAAAGGAGCUUACCAGGGUUGCCAACACCUUGGGCGUCACCACUGCUUCUUAUCAGAUCUACCUUUACGUACUGUUUGUUUGGUCAUGAAUAUGGUGACCAGGGGCAGGCGUCAUGUCCAAUGGAGAGCCCAUGCCCUGCGCACGAUCAAUUCUAUGGCCAUGUUCAACUGACACAUAAUCGUUUACGGGCCUCCAUUGGCUCAUAUGACGGUUUGCUAUGUAUCCUGGUGGCAUCUGCGAUUGGGCUAAGGCUCUCCUGUGGCGACAGAACGGCCGGGUCCCUCGCUACUGGCUCGCGUGCACGACAGAUGAGUACUGUUCGGAAAAACGUCCCAGACCCAAUCCAUAUCUCCAUAACUCGCAUAAGUCGACGUCGAGCGGCAGGGCUACUGAUUUACAGCUACCCUUACGAGCCACUAACGUUUCGACUGUAUUCACCAGCAGAAGAUUUAACCAAUACAGACAGAGCCAAUUUAAAACUCGUGCCCGCCCAGCCCUACCUCCAGGCGCGUAAGUGGCUAUUGGGGCCUAAGUGCGCAAAGGAGCCAGUCCUACGCAUCCGUCGGAUUGGUAGUUUGCCGGUCGCUUCCAUGGAACCUAAUGACUCACAUCAUGUAUUAUCUCGGGUCGGUAACGAUACGAGUAAAAUAGACUUACAAAAUGUCCCGACAUACUUGAUCUCCAAUGGCCACCAUAUAGACGACCGUACGCGCCUCCAGCCAACUGGCGGUCUUGCAAGAGUAUACUUUAUUUUGAGAUGCGAAUCGGGGUUAGUCCCGCCAAUAUUCAGUUCUUUCACCGACCGUCCACUAUACAGGAGAAGAGCUUCACACCAGCAGUUCAGAGCAUUCACGCGAUCGAAUUCUGAGUUCGUUUCGGAGCACCCCAUGGGACAGGCAUUGCGAGAGCUGAGGCGUGACUGCACAACUUUAACGGGCAGUAUAAUCGAAAGGCAUCCCGAGCUACGUCCCCUCAACACCUGCUACCUUAAGGAAAGAAGAGCUUUUGUUCUGCACCGGAAGAAUUGGACAGGAAAUCCGUUUACACGCCAGGAAGCACGAAGAUUGACAAAGCGUAAAGAAUCGCGCGGGUUAGGCUCACUUACACGAAUUCAGCCUAACGUCCCUGUCAAACAAUAUGUAAGUGGGCAUAUGGUGAAACAUGGCGUUUACCAUUAUACACUGACUAGGCUAAACGAUUGGUCAUGGACUGGCCAUGUACUCAGCGGGGUUGUUUUUUUGAACUGCUCCGUACUACGGGCUGUUAGUGCAUUGCACCAAUUUUCCCUUUCCGACACAGCCUACAGCCUUGUGAAGUACGCGCUACCAAAGUUUGUAACCCUCUUUAACUCCAAUGGUCUUUUGGAGGCUGGCGCCCUGCAUAUGCGACCUUGUGUUAUAAUAGUCGUGAGAGACUAG'

p_string2 = transcribe_RNA(t)

print(p_string2)

MRKQTLVCTLGTLWEEVVRDSEDNHPFWSVRISSRRPLIGREQKHGLPRCLCSHGALPVAKGYTLGPFIKETEPCYRFKWSLSSKSNWYGIFKTISRSRPKVSGNRQLRVSDPHANDTKVHTPIRWRVHGALALWLAIGIWKRGVPSGEQVWLTKIYAITGRSGDASFLTMTANIYVVSIPRSASQNQHCFRLSVSITVWGVIVSEQTPAENYAFRRTFADLNDSTEDIHPCGYQAMKSQPLPMISPQSISVLIAPQNLSFTLLICLYHVHVTKPLSNGLVGIFKRWIELTHVISRLRLDKRAFVLLGYPADRPATDRRFVGTASTGSVRIWLQSPLYGASQRASIPLEDLVMRRSFEWIITSHLSYREQLSYYTRPYSLHLFAYKRGRSFGDIGLVHEPSPISPEIASPLCRAPVSLHHCLRLKLPYGIESPKTLLFEPAHCFVRQVVPATHVSPPLEFYDGLQATIESFLTWVAAATARTNQSFVPGSVLLTPRGMSVVKYLSWTGLSAFYPRPAHREPASSPCSEILHRNPRSLVYSSISRGLRTDLGCPRAKRGTTHVLLWLGQRPSPRGTSPYSKFTNQPYRPLLQAFLTSEKVWPEHLQWYMLAIGSIINGGNLLLSNNLARSNGFSALQGVPDHRDGCRTGSCAKSRRQRLVRTERKLTLIRPVRTISPLGLTGSIRGTVQGEYLSCTLKSACRRVPFITSIQKGFSSSQVNENETEPPPTNKLQDDNSRIERWRLASIRTALVLFIDSSIMSRDRLNSFKRGTQRVYVRPLANLWQQTKCIRFSTPNDEGAIAQAGRVKILEAAGSGTSVHLGLANYSLIHKRPPLTRKRRTAQDSKITRGDSSAELNCPVDVIDYASKNCWGAQTRTTVGFSSQYLGCGRNVLHKWHGFKRVTPSFARYLARENHIQVGILVEQAAAANLETHPRELHLVLAYSPYSLRDVPTERSPDYHMSGQAGRKRAEPTHHFRGLIEYARMCNKLHSGGWSGYSGPV