# DNA & RNA Sequence Manipulation Exercises

This notebook contains exercises on:
- Reversing DNA sequences
- Finding reverse complements
- Counting codon occurrences (e.g., ATG)
- Sliding window analysis
- Replacing nucleotides in RNA/DNA
- Removing special characters
- Splitting multifasta strings
- Motif position identification
- Checking for start and stop codons

### Python & Bioinformatics Exercises Summary, 1st to 7th July 2025


In [1]:
# Exercise 1: Reverse a DNA sequence

seq = input("Enter a dna sequence").upper()
# A slice with step -1 walks the string from the last base to the first
reverse = seq[::-1]
print(reverse)

Enter a dna sequence ACCCTTTGGG


GGGTTTCCCA


In [3]:
# Exercise 2: Get the reverse complement of a DNA sequence
seq = input("Enter a DNA sequence").upper()
complement = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
result = ''
# Walk the sequence backwards so the complement comes out reversed
for base in reversed(seq):
    # Bases without a defined complement (e.g. N) are kept unchanged
    result += complement.get(base, base)
print(result)


Enter a DNA sequence agggttttccc


GGGAAAACCCT
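
The reverse complement can also be sketched more compactly with `str.maketrans` and `str.translate`. The helper name `reverse_complement` and the constant `COMPLEMENT` are illustrative choices, not part of the original exercise. Characters outside A/T/G/C pass through unchanged, as in the loop above.

```python
# Translation table mapping each base to its complement (A<->T, G<->C).
COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    # Complement every base, then reverse the whole string with a -1 step slice.
    return seq.upper().translate(COMPLEMENT)[::-1]

print(reverse_complement("agggttttccc"))  # GGGAAAACCCT
```

Using a translation table avoids building the result one character at a time, which may matter for long sequences.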


In [5]:
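# Exercise 3: Count how many times "ATG" appears
seq = input("Enter a dna sequence").upper()
# str.count counts non-overlapping occurrences anywhere in the string, regardless of reading frame
count = seq.count('ATG')
print(count)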

Enter a dna sequence aaaatgggggagtttagttttcccc


1


In [8]:
# Another method (in-frame counting: only the reading frame that starts at position 0)
seq = input("Enter a DNA sequence:").upper()
count_atg = 0
# Split into non-overlapping codons; the range stops before any incomplete trailing codon
codons = [seq[i:i+3] for i in range(0, len(seq) - 2, 3)]
for codon in codons:
    if codon == 'ATG':
        count_atg += 1
print(f"count of atg :{count_atg}")



Enter a DNA sequence: aaatgatgatgggggtttcccatg


count of atg :1


In [1]:
# Another method (sliding window: counts every match in any frame, including overlapping ones)
seq = input("Enter a dna sequence:").upper()
count_atg = 0
motif = "ATG"
# Move a window of len(motif) one base at a time and compare it to the motif
for i in range(len(seq) - len(motif) + 1):
    if seq[i:i + len(motif)] == motif:
        count_atg += 1
print(count_atg)


Enter a dna sequence: aaatgatgatgggggtttcccatg


4
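
The in-frame count (1) and the sliding-window count (4) differ because the sliding window sees all three forward reading frames at once. A minimal sketch that breaks the count down per frame is given here; the function name `count_codon_by_frame` is hypothetical and not part of the original exercises.

```python
def count_codon_by_frame(seq: str, codon: str = "ATG") -> dict:
    """Count `codon` separately in forward reading frames 0, 1, and 2."""
    seq = seq.upper()
    counts = {}
    for frame in range(3):
        # Step through the sequence three bases at a time from this offset;
        # the range end (len - 2) skips any incomplete trailing codon.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        counts[frame] = codons.count(codon)
    return counts

print(count_codon_by_frame("aaatgatgatgggggtttcccatg"))  # {0: 1, 1: 0, 2: 3}
```

The per-frame counts sum to 4, the sliding-window result, and frame 0 alone gives the in-frame result of 1.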


In [22]:
# Exercise 4: Replace all "U" with "T" (RNA to DNA)
rna = input("Enter an RNA sequence:").upper()
dna = ''
for base in rna:
    # Uracil becomes thymine; every other base is copied unchanged
    if base == 'U':
        dna += 'T'
    else:
        dna += base
print(dna)


Enter an RNA sequence: auuuggggcccaaaugc


ATTTGGGGCCCAAATGC
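
The same conversion can be sketched in one line with the built-in `str.replace`. The helper names `rna_to_dna` and `dna_to_rna` are illustrative, not from the original notebook; the second simply applies the substitution in the opposite direction.

```python
def rna_to_dna(rna: str) -> str:
    """Replace every U with T (hypothetical helper)."""
    return rna.upper().replace("U", "T")

def dna_to_rna(dna: str) -> str:
    """Replace every T with U, the inverse substitution (hypothetical helper)."""
    return dna.upper().replace("T", "U")

print(rna_to_dna("auuuggggcccaaaugc"))  # ATTTGGGGCCCAAATGC
```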


In [25]:
# Exercise 5: Remove special characters from sequence header
# Task: Given a FASTA header string (which should contain only letters, numbers, underscores,
# and the leading '>'), remove any unwanted special characters such as #, *, @, etc.

header = ">seq#1*alpha"
# Chained replace calls handle the two special characters present in this example header
clean_header = header.replace('#', '').replace('*', '')
print(clean_header)

>seq1alpha
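
Chained `replace` calls only remove the characters that are listed explicitly. A more general sketch, assuming the rule stated in the task (keep letters, digits, underscores, and a leading `>`), can use `re.sub` with a negated character class. The function name `clean_fasta_header` is hypothetical.

```python
import re

def clean_fasta_header(header: str) -> str:
    """Keep letters, digits, and underscores; preserve a leading '>' if present."""
    prefix = ">" if header.startswith(">") else ""
    body = header[len(prefix):]
    # [^A-Za-z0-9_] matches any character that is NOT a letter, digit, or underscore.
    return prefix + re.sub(r"[^A-Za-z0-9_]", "", body)

print(clean_fasta_header(">seq#1*alpha"))      # >seq1alpha
print(clean_fasta_header(">seq#1*alpha@v_2"))  # >seq1alphav_2
```

Note that this also strips spaces and any `>` that appears after the first character, which may or may not be desirable for real FASTA headers.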


In [26]:
# Exercise 6: Split a multi-FASTA string into lines
# Task: Given a multi-FASTA formatted string with headers and sequences separated by newlines,
# split it into a list of lines.

fasta = ">seq1\nATGC\n>seq2\nGGA"
lines = fasta.splitlines()
print(lines)

['>seq1', 'ATGC', '>seq2', 'GGA']


In [13]:
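# Another method for Exercise 6
# Unlike splitlines(), split("\n") yields a trailing '' if the string ends with a newline
# and leaves '\r' in place for Windows-style line endings.

fasta = ">seq1\nATGC\n>seq2\nGGA"
lines = fasta.split("\n")
print(lines)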


['>seq1', 'ATGC', '>seq2', 'GGA']
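
Once the string is split into lines, a natural next step is to group each header with its sequence. The following is a minimal sketch, assuming headers start with `>` and that a sequence may span several lines; the function name `parse_fasta` is hypothetical and this is not a full FASTA parser.

```python
def parse_fasta(text: str) -> dict:
    """Map each header (without '>') to its concatenated sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines are appended to the most recent header.
            records[header] += line.upper()
    return records

print(parse_fasta(">seq1\nATGC\n>seq2\nGGA"))  # {'seq1': 'ATGC', 'seq2': 'GGA'}
```

Duplicate headers would overwrite each other in this sketch, so a list of tuples may be a better fit when headers are not unique.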


In [14]:
# Exercise 7: Find all positions of motif "CG"
# Task: Given a DNA sequence, find all positions where the motif "CG" occurs.
# Positions are 0-based and refer to the first character of each match.
# Return them as a list of numbers.

seq = input("Enter a dna sequence:").upper()
motif = "CG"
positions = []
# Slide a window of len(motif) across the sequence and record each matching start index
for i in range(len(seq) - len(motif) + 1):
    if seq[i:i + len(motif)] == motif:
        positions.append(i)

print(positions)


Enter a dna sequence: aaacgcgcgcgccgcgcg


[3, 5, 7, 9, 12, 14, 16]
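
An alternative sketch uses `re.finditer` with a zero-width lookahead, which reports overlapping matches as well. The function name `find_motif_positions` and the `one_based` flag are illustrative additions; biological coordinates are often 1-based, whereas the loop above returns 0-based indices.

```python
import re

def find_motif_positions(seq: str, motif: str, one_based: bool = False) -> list:
    """Return the start positions of every (possibly overlapping) motif match."""
    # (?=...) is a lookahead: it matches at a position without consuming
    # characters, so overlapping occurrences are all reported.
    pattern = f"(?={re.escape(motif.upper())})"
    offset = 1 if one_based else 0
    return [m.start() + offset for m in re.finditer(pattern, seq.upper())]

print(find_motif_positions("aaacgcgcgcgccgcgcg", "CG"))  # [3, 5, 7, 9, 12, 14, 16]
```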


In [11]:
# Exercise 8: Check if sequence starts with "ATG" and ends with a stop codon
# Task:
# Write code to check whether:
# - the sequence starts with "ATG" (start codon)
# - the sequence ends with any one of these stop codons: "TAA", "TAG", or "TGA"
# If both conditions are true, print True; otherwise print False.

seq = input("Enter a sequence:").upper()

start_codon = "ATG"
stop_codons = ["TAA", "TAG", "TGA"]

start = seq.startswith(start_codon)
# str.endswith accepts a tuple of suffixes and returns True if any of them matches
stop = seq.endswith(tuple(stop_codons))

if start and stop:
    print(True)
else:
    print(False)


Enter a sequence: aaatttgggggggtga


False
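
The start/stop check can be extended into a stricter test for an open reading frame. The sketch below additionally assumes that the length must be a multiple of three and that no stop codon may occur in frame before the last codon; these extra conditions and the function name `looks_like_orf` are assumptions beyond the original task.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")

def looks_like_orf(seq: str) -> bool:
    """Check start codon, terminal stop codon, frame length, and internal stops."""
    seq = seq.strip().upper()
    # Need at least a start and a stop codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    if not seq.startswith("ATG") or not seq.endswith(STOP_CODONS):
        return False
    # Codons between the start codon and the final stop codon.
    internal = [seq[i:i + 3] for i in range(3, len(seq) - 3, 3)]
    return not any(codon in STOP_CODONS for codon in internal)

print(looks_like_orf("ATGAAATGA"))         # True
print(looks_like_orf("aaatttgggggggtga"))  # False
```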
