# Counting Point Mutations
**Given:** Two DNA strings *s* and *t* of equal length (not exceeding 1 kbp).

**Return:** The Hamming distance *d*<sub>H</sub>(*s*, *t*).

# Sample Dataset

In [1]:
%%file Sample_Dataset.txt
GAGCCTACTAACGGGAT
CATCGTAATGACGGCCT



Overwriting Sample_Dataset.txt


# Sample Output

In [2]:
%%file Sample_Output.txt
7



Overwriting Sample_Output.txt


# Solution

In [3]:
def countPointMutations(s, t):
    """Given two DNA strings s and t of equal length (not exceeding 1 kbp), return the Hamming distance dH(s, t)."""

    # Walk both strings in step and count the positions where the symbols differ.
    hamming_distance = 0
    for s_n, t_n in zip(s, t):
        if s_n != t_n:
            hamming_distance += 1

    return hamming_distance

def countPointMutationsFromFileToFile(input_file_path, output_file_path):
    """Wrap countPointMutations to read from input_file_path and write to output_file_path."""

    # The first two lines of the input file hold the DNA strings s and t.
    with open(input_file_path, 'r') as input_file:
        input_strings = input_file.readlines()

    s = input_strings[0].strip()
    t = input_strings[1].strip()

    # Write the distance followed by a newline; the with block closes the file
    # even if an error occurs.
    with open(output_file_path, 'w') as output_file:
        output_file.write("%d\n" % countPointMutations(s, t))
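
One caveat about the solution above: `zip` stops at the end of the shorter string, so inputs of unequal length would be compared only over their common prefix without any warning. The following is a minimal sketch of a stricter variant, assuming that a length mismatch should be treated as an error. The name `hamming_distance_checked` is hypothetical and not part of the original notebook.

```python
def hamming_distance_checked(s, t):
    """Return the Hamming distance between s and t, rejecting unequal lengths."""
    # zip() stops at the shorter input, so compare the lengths explicitly first.
    if len(s) != len(t):
        raise ValueError("strings differ in length: %d vs %d" % (len(s), len(t)))
    # Count the positions at which the two strings disagree.
    return sum(1 for s_n, t_n in zip(s, t) if s_n != t_n)


# The sample dataset from this notebook.
print(hamming_distance_checked("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"))  # 7
```

Raising an exception is only one possible design choice; returning a sentinel value or logging a warning would also work, depending on how the caller wants to handle malformed input.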


# Test Solution

In [4]:
countPointMutationsFromFileToFile("Sample_Dataset.txt", "Test_Output.txt")

In [5]:
%%bash
echo Sample_Output.txt
md5sum Sample_Output.txt
cat Sample_Output.txt

Sample_Output.txt
84bc3da1b3e33a18e8d5e1bdd7a18d7a  Sample_Output.txt
7


In [6]:
%%bash
echo Test_Output.txt
md5sum Test_Output.txt
cat Test_Output.txt

Test_Output.txt
84bc3da1b3e33a18e8d5e1bdd7a18d7a  Test_Output.txt
7


In [7]:
%%bash
if [ "$(md5sum Sample_Output.txt | cut -f1 -d' ')" = "$(md5sum Test_Output.txt | cut -f1 -d' ')" ]
then
    echo Sample output matches test output.
else
    echo Sample output does not match test output.
fi

Sample output matches test output.
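
The same checksum comparison can be sketched in Python for environments where `md5sum` is not available. This is a rough equivalent built on the standard library's `hashlib`, not part of the original notebook; the helper names `md5_of_file` and `outputs_match` are hypothetical.

```python
import hashlib


def md5_of_file(path):
    """Return the hexadecimal MD5 digest of the file at path."""
    # Read in binary mode so the digest covers the exact bytes on disk.
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()


def outputs_match(path_a, path_b):
    """Return True if the two files have identical MD5 digests."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Calling `outputs_match("Sample_Output.txt", "Test_Output.txt")` should then return `True` whenever the bash cell above reports a match.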


# Downloaded Dataset

In [8]:
%%bash
cp ~/Downloads/rosalind_hamm.txt ./
cat rosalind_hamm.txt

CAATTCCCCAGGCATGGAGTCTGTCGCGGCCTGATCCGTCTCCCATCGCTATCGCTACAAAGAACGTTAGGTGGGAGCTAGATAGTTGCGGGGACATAGATAAACGCCCATCATCGATCTAAACTCGTACCTTAGAAGGCAGCACTGGCGTATGATTTCTCGTCAGGCCAGGCTGTACAATCGTTAATGCCGTCAGTTGTGACCGTTCTCTACGTGTCACACTCTTTGCCCCGAAAGGCGTCCCACGGTGAGCCTTGTGGTAGAATGTCCTGCCTCGCACTAACCCGCTTTGCACGCTAAATCCATCGCGTAGGGGGCACTTATCCTACCAGGCAGTCGTGGTATAGGACAGGATGGTCGTCAGCTTCGGCGGGCAAACGGACCTCGCAGGGGATTGGGTATTGTCAGGGACCACCTTACTTACTTTGAGCGATAAATATCTAAATGGAAAATGTGATACTCGCGTCTGCTTAATAGTAGCGTCAATATGCACGACCACATTAATGCTAAAGTTGAGGACTAGTGATACAGCCGTCTTTGTCTCCATCTGCACCGGGCCGTCGACTGTCATCTTCTCTAAACGACGGGGGGAGAGGTCCTAATATAGAGGCATACACCCAGTCCGGTGTAGGTTAACGACGAGCTGGACTATTCTTACAGCCTGCTGAGTTGTCGATGGCAAAAGCGCAAGGGTATCCCCGCTCCACCGGCGTCCCCCGTTAGTTGGGCCATTTAGCGTTCCCCGCTAATCATTGCCCATTCCTATTATTCAAAAAAACACGATCGGACTGCTAGAGATATTACGCTCCCACTAACCTATATTCCGAATCCTGCGTTTGAATTGCTAAGCGTAAAGTAGTCACTGTAGGTGCTACTCCCATCAGTGATCATTTGCATAGACCACGGAGCGATACGTACCACAAAGCCTCCTAGGCCCTCTTTT
CCCTTCGCATTGCTTCTACTCGGTCACAGCCCGACCTTGCTCTCAGCGATGAGCCC

# Solution to Downloaded Dataset

In [9]:
countPointMutationsFromFileToFile("rosalind_hamm.txt", "Solution_Output.txt")

In [10]:
%%bash
cat Solution_Output.txt

467
