# Translating Mutated RNA to Protein

## Description
This notebook shows how to:
- Read a DNA sequence from a file, transcribe it to RNA, and translate the RNA into a protein
- Handle mutated sequences
- Apply Python string manipulation and bioinformatics translation methods

Challenge: translate a mutated RNA sequence into a protein.

Task:

1. Read a DNA sequence from a file (as before).
2. Clean it so that only the bases A, T, G, and C remain.
3. Apply the same mutation (A→G, T→C).
4. Transcribe the mutated DNA to RNA (replace T with U).
5. Translate the RNA sequence into a protein sequence using the standard codon table.
6. Stop translation at the first stop codon.
7. Output the mutated DNA, the RNA, and the translated protein sequence.
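
The cleaning, mutation, and transcription steps can also be sketched with Python's built-in `str.translate`, which applies a per-character mapping in one pass. This is a minimal sketch: the inline string `raw` is a made-up stand-in for the contents of `dna.txt`, and the variable names are illustrative only.

```python
# Hypothetical input standing in for the contents of dna.txt
raw = "atgc\nAAT xTG"

# Clean: upper-case, then keep only the four DNA bases
cleaned = "".join(base for base in raw.upper() if base in "ATGC")

# Mutate: A -> G and T -> C; characters not in the map are left unchanged
mutation = str.maketrans({"A": "G", "T": "C"})
mutated_dna = cleaned.translate(mutation)

# Transcribe: replace T with U (a no-op here, because the mutation removed every T)
rna = mutated_dna.replace("T", "U")

print(cleaned)      # ATGCAATTG
print(mutated_dna)  # GCGCGGCCG
print(rna)          # GCGCGGCCG
```

Building the mapping once with `str.maketrans` avoids repeated string concatenation inside a loop, which matters mainly for long sequences.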

In [10]:
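# Read the raw sequence and normalise it to upper case
with open("dna.txt", "r") as file:
    sequence = file.read().upper()

# Keep only valid DNA bases (drops newlines, spaces, and stray characters)
cleaned_sequence = "".join(base for base in sequence if base in "ATGC")

# Apply the mutation A→G, T→C; G and C are left unchanged
mutated_dna = ""
for base in cleaned_sequence:
    if base == 'A':
        mutated_dna += 'G'
    elif base == 'T':
        mutated_dna += 'C'
    else:
        mutated_dna += base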

codon_table = {
    'UUU':'F', 'UUC':'F', 'UUA':'L', 'UUG':'L',
    'CUU':'L', 'CUC':'L', 'CUA':'L', 'CUG':'L',
    'AUU':'I', 'AUC':'I', 'AUA':'I', 'AUG':'M',
    'GUU':'V', 'GUC':'V', 'GUA':'V', 'GUG':'V',
    'UCU':'S', 'UCC':'S', 'UCA':'S', 'UCG':'S',
    'CCU':'P', 'CCC':'P', 'CCA':'P', 'CCG':'P',
    'ACU':'T', 'ACC':'T', 'ACA':'T', 'ACG':'T',
    'GCU':'A', 'GCC':'A', 'GCA':'A', 'GCG':'A',
    'UAU':'Y', 'UAC':'Y', 'UAA':'_', 'UAG':'_',
    'CAU':'H', 'CAC':'H', 'CAA':'Q', 'CAG':'Q',
    'AAU':'N', 'AAC':'N', 'AAA':'K', 'AAG':'K',
    'GAU':'D', 'GAC':'D', 'GAA':'E', 'GAG':'E',
    'UGU':'C', 'UGC':'C', 'UGA':'_', 'UGG':'W',
    'CGU':'R', 'CGC':'R', 'CGA':'R', 'CGG':'R',
    'AGU':'S', 'AGC':'S', 'AGA':'R', 'AGG':'R',
    'GGU':'G', 'GGG':'G', 'GGC':'G', 'GGA':'G'
}
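# Transcribe: replace every T with U
rna = ""
for base in mutated_dna:
    if base == 'T':
        rna += 'U'
    else:
        rna += base

# Split into complete codons; trailing bases that do not fill a codon are dropped
codons = [rna[i:i+3] for i in range(0, len(rna) - 2, 3)]
# Look up each codon; '?' marks a codon missing from the table
aminoacids = [codon_table.get(codon, '?') for codon in codons]

# Keep amino acids up to, but not including, the first stop codon ('_')
translated_protein = []
for aa in aminoacids:
    if aa == '_':
        break
    translated_protein.append(aa)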

print(f"Mutated DNA: {mutated_dna}\nRNA: {rna}\nTranslated Protein: {''.join(translated_protein)}")


Mutated DNA: GGCCGGGCGCGCCGGGCGC
RNA: GGCCGGGCGCGCCGGGCGC
Translated Protein: GRARRA
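
The RNA line is identical to the mutated DNA line because the mutation converts every T to C before transcription, so no T is left to become U. For the same reason, a fully mutated sequence contains only G and C and cannot contain a stop codon, since UAA, UAG, and UGA all require U and A. The sequence has 19 bases, so the last base does not fill a codon and is dropped. A short check of these points, using the sequence from the output above:

```python
mutated_dna = "GGCCGGGCGCGCCGGGCGC"  # copied from the output above
rna = mutated_dna.replace("T", "U")

stop_codons = {"UAA", "UAG", "UGA"}
codons = [rna[i:i+3] for i in range(0, len(rna) - 2, 3)]
leftover = len(rna) % 3  # bases at the end that do not form a full codon

print(rna == mutated_dna)                      # True: transcription changed nothing
print(set(rna) <= {"G", "C"})                  # True: only G and C remain
print(any(c in stop_codons for c in codons))   # False: no stop codon present
print(codons, leftover)
```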


In [9]:
# Another method: dictionary lookup for the mutation, str.replace for transcription
with open("dna.txt", "r") as file:
    sequence = file.read().upper()

# Keep only valid DNA bases
cleaned_sequence = "".join(base for base in sequence if base in "ATGC")

# Substitution map for the mutation (A→G, T→C); other bases pass through unchanged
mutated_complement = {'A': 'G', 'T': 'C'}
mutated_dna = ""
for base in cleaned_sequence:
    mutated_dna += mutated_complement.get(base, base)

# Transcribe: replace every T with U
rna = mutated_dna.replace('T', 'U')

codon_table = {
    'UUU':'F', 'UUC':'F', 'UUA':'L', 'UUG':'L',
    'CUU':'L', 'CUC':'L', 'CUA':'L', 'CUG':'L',
    'AUU':'I', 'AUC':'I', 'AUA':'I', 'AUG':'M',
    'GUU':'V', 'GUC':'V', 'GUA':'V', 'GUG':'V',
    'UCU':'S', 'UCC':'S', 'UCA':'S', 'UCG':'S',
    'CCU':'P', 'CCC':'P', 'CCA':'P', 'CCG':'P',
    'ACU':'T', 'ACC':'T', 'ACA':'T', 'ACG':'T',
    'GCU':'A', 'GCC':'A', 'GCA':'A', 'GCG':'A',
    'UAU':'Y', 'UAC':'Y', 'UAA':'_', 'UAG':'_',
    'CAU':'H', 'CAC':'H', 'CAA':'Q', 'CAG':'Q',
    'AAU':'N', 'AAC':'N', 'AAA':'K', 'AAG':'K',
    'GAU':'D', 'GAC':'D', 'GAA':'E', 'GAG':'E',
    'UGU':'C', 'UGC':'C', 'UGA':'_', 'UGG':'W',
    'CGU':'R', 'CGC':'R', 'CGA':'R', 'CGG':'R',
    'AGU':'S', 'AGC':'S', 'AGA':'R', 'AGG':'R',
    'GGU':'G', 'GGG':'G', 'GGC':'G', 'GGA':'G'
}

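# Split into complete codons and translate until the first stop codon ('_')
codons = [rna[i:i+3] for i in range(0, len(rna) - 2, 3)]
aminoacids = []
for codon in codons:
    # Codons missing from the table are skipped silently
    if codon in codon_table:
        aa = codon_table[codon]
        if aa == '_':
            break
        aminoacids.append(aa)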

print(f"Mutated DNA: {mutated_dna}\nRNA: {rna}\nTranslated Protein: {''.join(aminoacids)}")


Mutated DNA: GGCCGGGCGCGCCGGGCGC
RNA: GGCCGGGCGCGCCGGGCGC
Translated Protein: GRARRA
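
Both cells repeat the same 64-entry codon table. One way to avoid the duplication is to generate the table from the standard genetic code listed in U, C, A, G order and to wrap translation in a function. The helpers `build_codon_table` and `translate_rna` below are hypothetical names, not part of the notebook, and `_` marks a stop codon as in the cells above. Because the mutated sequence cannot contain a stop codon, the second call uses an unmutated RNA string to exercise the early stop.

```python
from itertools import product

BASES = "UCAG"
# Amino acids for all 64 codons in UCAG order (UUU, UUC, UUA, UUG, UCU, ...)
AMINO_ACIDS = "FFLLSSSSYY__CC_WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_codon_table():
    """Map each RNA codon to its one-letter amino acid ('_' for stop)."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return dict(zip(codons, AMINO_ACIDS))


def translate_rna(rna, table):
    """Translate complete codons from position 0 until the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table.get(rna[i:i+3], "?")  # '?' marks an unknown codon
        if aa == "_":
            break
        protein.append(aa)
    return "".join(protein)


codon_table = build_codon_table()
print(translate_rna("GGCCGGGCGCGCCGGGCGC", codon_table))  # GRARRA
print(translate_rna("AUGGCCUAAGGG", codon_table))         # MA
```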
