# Solving Bioinformatics Challenges

This notebook contains my solutions to a variety of bioinformatics challenges from Rosalind (http://rosalind.info/problems/locations/).

### 1) Counting DNA Nucleotides (http://rosalind.info/problems/dna/)

<b>Problem: </b>
A string is simply an ordered collection of symbols selected from some alphabet and formed into a word; the length of a string is the number of symbols that it contains.

An example of a length 21 DNA string (whose alphabet contains the symbols 'A', 'C', 'G', and 'T') is "ATGCTTCAGAAAGGTCTTACG".

Given: A DNA string s of length at most 1000 nt.

Return: Four integers (separated by spaces) counting the respective number of times that the symbols 'A', 'C', 'G', and 'T' occur in s.

In [14]:
def countDna(seq):
    # Normalize case before validating so lowercase input is accepted
    seq = seq.upper()
    if not is_dna(seq):
        return None
    dna_dict = {'A': 0, 'C': 0, 'G': 0, 'T': 0}
    # Tally each nucleotide in a single pass over the sequence
    for base in seq:
        dna_dict[base] += 1
    return dna_dict

# Validates that seq contains only the DNA symbols A, C, G, and T
def is_dna(seq):
    symb = {'A', 'C', 'G', 'T'}
    for base in seq:
        if base not in symb:
            return False
    return True

In [15]:
#Testing: 
seq = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
count = countDna(seq)
correct_ans = '20 12 17 21'
print(count)
print(correct_ans)


{'A': 20, 'C': 12, 'G': 17, 'T': 21}
20 12 17 21


In [16]:
#Error testing:
seq = 'AGCTTTTCATTCTGAzTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
count = countDna(seq)
print(count)

None


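Time complexity: O(n). The sequence is scanned a constant number of times (validation, then counting), and constant factors such as the 2 in O(2n) are dropped.
Space complexity: O(1) for the count dictionary, which always holds four entries. The uppercased copy of the input takes O(n).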
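
The problem asks for four space-separated integers, while `countDna` returns a dictionary. As a minimal sketch of one way to produce the requested format, the standard library's `collections.Counter` can do the tallying. The function name `count_nucleotides` is hypothetical and not part of the original solution.

```python
from collections import Counter

def count_nucleotides(seq):
    """Return the A, C, G, T counts as a space-separated string, or None if seq is not DNA."""
    seq = seq.upper()
    # Any symbol outside the DNA alphabet makes the input invalid
    if set(seq) - set('ACGT'):
        return None
    counts = Counter(seq)
    # A Counter returns 0 for missing keys, so absent bases are handled
    return ' '.join(str(counts[base]) for base in 'ACGT')

seq = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC'
print(count_nucleotides(seq))
```

For the sample sequence this prints `20 12 17 21`, which can be compared directly against the expected answer string.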

### 2) Transcribing DNA into RNA (http://rosalind.info/problems/rna/)
<b>Problem:</b>
An RNA string is a string formed from the alphabet containing 'A', 'C', 'G', and 'U'.

Given a DNA string t corresponding to a coding strand, its transcribed RNA string u is formed by replacing all occurrences of 'T' in t with 'U' in u.

Given: A DNA string t having length at most 1000 nt.

Return: The transcribed RNA string of t.

Example input: GATGGAACTTGACTACGTAAATT

Example output: GAUGGAACUUGACUACGUAAAUU

In [17]:
def dna_rna_transcribe(seq):
    seq = seq.upper()
    if not is_dna(seq):
        return None

    # Build the RNA string, swapping each 'T' for 'U'
    answ = ''
    for base in seq:
        if base == 'T':
            answ = answ + 'U'
        else:
            answ = answ + base
    return answ

In [19]:
#Testing
seq = 'GATGGAACTTGACTACGTAAATT'
correct = 'GAUGGAACUUGACUACGUAAAUU'
answ = dna_rna_transcribe(seq)
print(answ)
print(answ == correct)

GAUGGAACUUGACUACGUAAAUU
True


In [20]:
#Error Testing 
seq = 'GATGGAACTTGACTACGTAAAdeT'
print(dna_rna_transcribe(seq))

None


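Time complexity: O(N) in the typical case (the constant factor in O(2N) is dropped). Because Python strings are immutable, building the result by repeated concatenation can degrade toward O(N^2) in the worst case.
Space complexity: O(N)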
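
Since transcription here is a single-character substitution, the built-in `str.replace` can do the same work without building the string one character at a time. The following is a minimal sketch of that alternative. The function name `transcribe` is hypothetical, and the validation is inlined so the block runs on its own.

```python
def transcribe(seq):
    """Return the RNA transcript of a DNA string, or None if seq is not DNA."""
    seq = seq.upper()
    # Reject input containing symbols outside the DNA alphabet
    if set(seq) - set('ACGT'):
        return None
    # Replace every 'T' with 'U' in one call
    return seq.replace('T', 'U')

print(transcribe('GATGGAACTTGACTACGTAAATT'))
```

On the example input this prints `GAUGGAACUUGACUACGUAAAUU`, and an invalid sequence such as `'GATGGAACTTGACTACGTAAAdeT'` yields `None`, matching the behavior of `dna_rna_transcribe`.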