# Counting Point Mutations

## Evolution as a Sequence of Mistakes

A __mutation__ is simply a mistake that occurs during the creation or copying of a nucleic acid, in particular DNA.
Because nucleic acids are vital to cellular functions, mutations tend to cause a ripple effect throughout the
cell. Although mutations are technically mistakes, a very rare mutation may equip the cell with a beneficial
attribute. In fact, the macro effects of evolution are attributable to the accumulated result of beneficial
microscopic mutations over many generations.

The simplest and most common type of nucleic acid mutation is a __point mutation__, which replaces one base with 
another at a single nucleotide. In the case of DNA, a point mutation must change the complementary base 
accordingly.
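
To make the last point concrete, here is a minimal sketch, assuming a DNA strand is represented as a plain Python string over `A`, `C`, `G`, `T`. The helper names `complement` and `point_mutate` are hypothetical and chosen only for illustration; the complementary strand is written position by position (not reversed) so that paired bases line up.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the base-by-base complement of strand (not reversed)."""
    return "".join(PAIRS[base] for base in strand)

def point_mutate(strand, position, new_base):
    """Return a copy of strand with the base at position replaced."""
    return strand[:position] + new_base + strand[position + 1:]

original = "GATTACA"
mutated = point_mutate(original, 2, "C")  # T -> C at index 2

print(mutated)               # GACTACA
print(complement(original))  # CTAATGT
print(complement(mutated))   # CTGATGT
```

The two complements differ at index 2 as well (`A` becomes `G`), which is the sense in which the complementary base changes along with the mutated one.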

Two DNA strands taken from the genomes of different organisms or species are __homologous__ if they share a recent ancestor;
thus, counting the number of bases at which homologous strands differ provides us with the minimum number of
point mutations that could have occurred on the evolutionary path between the two strands.

We are interested in minimizing the number of (point) mutations separating two species because of the biological
principle of __parsimony__, which demands that evolutionary histories be explained as simply as possible.

## Problem

Given two strings $s$ and $t$ of equal length, the Hamming distance between $s$ and $t$, denoted $d_H(s,t)$, is 
the number of corresponding symbols that differ in $s$ and $t$.
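
As an illustration of the definition, the sketch below lists the positions at which two short strings differ; the length of that list is the Hamming distance. The function name `mismatch_positions` and the example strings are assumptions made for this sketch, not part of the problem statement.

```python
def mismatch_positions(s, t):
    """Return the 0-based indices at which s and t differ."""
    # The definition only applies to strings of equal length.
    if len(s) != len(t):
        raise ValueError("Hamming distance requires strings of equal length")
    return [i for i, (a, b) in enumerate(zip(s, t)) if a != b]

positions = mismatch_positions("GATTACA", "GACTATA")
print(positions)       # [2, 5]
print(len(positions))  # 2
```

Here the strings differ at indices 2 (`T` vs `C`) and 5 (`C` vs `T`), so $d_H(s,t) = 2$.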

**Given**: Two DNA strings $s$ and $t$ of equal length (not exceeding 1 kbp).

**Return**: The Hamming distance $d_H(s,t)$.

## Sample Dataset
```
GAGCCTACTAACGGGAT
CATCGTAATGACGGCCT
```

## Sample Output
```
7
```

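```python
def HammingDistance(s, t):
    """Return the number of positions at which s and t differ."""
    # The Hamming distance is only defined for strings of equal length.
    if len(s) != len(t):
        raise ValueError("s and t must have the same length")

    hamm = 0
    for a, b in zip(s, t):
        if a != b:
            hamm += 1

    return hamm

HammingDistance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")
```

```
7
```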

```python
s = "TTTGCCCGCAAGATAGATTCATGGTTCGACACTAGTATAGATCAAAACCAGTAACCTGCCAATGGCAGGCAATTCACCATTGTTGAACGGGGGCATGTTCGAACAGAACTTTTTCCTATAGGAAGACCAGATACTCACACACGTTCAAGCCGTAGTCCCGCGAGCGGGTCCATGTCTCAACACGCTACATTTTCCGTTAATTTGGGCGCACCAGGCGCGGTCGTCGAACGGCTATCTAGAAAAGATAAGGGTCGGCTAATGTTATAGTCGGGACTCCAAATGCTGTCCCACCTGACCATACCCTCCGCCGGGATGCCCTGCTGATCAGCGGAACTAATGAAACAGCGAGAATTCTCACGAGGGTGCGGATCGGGTTAATTATGCAAACGACGTAGAATGCTAGTCCAGGCCCGTAAACCCGGGTTCCTCTTGGAGCGTTGCCTGCTGCTTTGTCTTGTGATGTGTTTGTTTTGCCTCCAAACTCATCGTTCTATTATGAGCAGCTGAAGTACTAACCAGAGTCGTATCAATGCGTACGGTATTTTTCTACACGTTGTTGTCAATTCCAGACGTCCATACTGGCATCCCCTGTAAAGAAGCCAACACGATAACAGTGCTTGTCAGGCCTAGACCATAGCCTAGGCCACGGGAATACAGTATTAGACCGGCGAAGTAACATCAGGGGTTATTAATCTCGTCGATATGACGAAACTATTCTGTTTTGGCTGCTCGCTTATAAGAAGAGAATTAACCTAGGAGCGGGTTCTGGAGCCAGCACGTCGTATCCCTATGGTAGTGGCGAGTTAGAAGGGGGATTCCTAATCATCCTCTGCAGCTCCTATATCTACGCGCAGATCGCGCCTCGCTGTGGGCACGCTGTGGATGTGGCACATTGCCTGACCTGTTGGTGGGTAGATGGCCCGTCAACTTAGTGATTA"
t = "AGCAGAGACATAAGATAGTCAGAGATGGTCGGTAGGGAGCGTGAAAAATTTCAGCCTGGTTTTTGTAGCCTCGGTAACATCGATGTAATGCAGACCTGTATTACAGACTGATAGCCATGAGGTTGACTATAAGGGCGATTACGCTGAGCAGGTTGTGCGTTGATTCCGTTGACCTCTGACCAGGATAAATGCCGATCCACACTGGGTTTTCCAGACGCCTCGACAAGACTTTAATCGATGAGCGGTCAGCATCTGCAAATATTTTACTCCGGCCTTCCTAACACGGCCTACCCCTGAATAGGATGCTTGACGTTGTTCTCGTCACAACCCTAAATACGGATGTCACGATCAGTCTTTGGCGGGTCCGGATCGGGTTAATTGAGAAAACCACGCAGATGAGAGGTCCAACTCCCTAAGCGCAGCTTACCCATCAGACTTTGGCCTTAGGTTTAAGGAAGGCTATCCTTCATAAGCATCCGCTAACAGCTTTCTATTGTAAGGGACTGCTGTACCTACTAGATCCCTCTCAGGACGTAGACTCTGTTCATGACTATAGTGGACGAGTCCAGGCATCCTGTTTGGGAGCCAATGTAAAGTAGCGGACCTCATCTTAGTGCGAAGCACGGCTTGGCGATAGACTGAGTTGTAACAATTTAGCGAAAAGGCGCTGAGGTAAGATCACTGGTAACTACTAGCGTTTCTACGCCTTTACAAATTACTGGTATAGACTTCCTCTGAATATGCTACTCACGCGAGTAGCGAGGTGCAGCGCGCGCAACTAGTGTAACAACGGTCTCATGGGGGACTAAGGGGGACTCCACAGCACCTGTTCAAGCCCAGATCCCCACGATCAGTTCTCACCCCACTTTATGGAAGTTTGCGAGTTTTCGAGCGACCAGTTCGGTTTCTAGACTTATGGCCGACAACATTCGCTAAGA"
HammingDistance(s, t)
```

```
475
```