# Counting Point Mutations

## Problem

Given two strings $s$ and $t$ of equal length, the **Hamming distance** between $s$ and $t$, denoted $d_H(s,t)$, is the number of corresponding symbols that differ in $s$ and $t$.
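
The definition can also be written as a formula. The notation here is an assumption introduced for illustration: $s_i$ denotes the $i$-th symbol of $s$, and $n$ is the common length of the two strings.

$$d_H(s,t) = \left|\{\, i : 1 \le i \le n,\ s_i \ne t_i \,\}\right|$$

For example, the strings GAT and CAT differ only in their first symbol, so their Hamming distance is 1.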

_Given_: Two DNA strings $s$ and $t$ of equal length (not exceeding 1 kbp).

_Return_: The Hamming distance $d_H(s,t)$.

**Sample Dataset**

    GAGCCTACTAACGGGAT
    CATCGTAATGACGGCCT

**Sample Output**

    7

________________
## Solution

Because the solution to this problem is straightforward, we present only one approach. Any algorithm for calculating the Hamming distance runs in linear time, since we only have to traverse the two input strings once, in lockstep, and perform one comparison per position. So, instead of focusing on running time, we aim for an elegant, _pythonic_ way of writing this function.

First we need to make sure the input strings are of the same length. Then we can combine both strings into an iterable _zip_ object, which can be thought of as a sequence of tuples, each pairing the characters at the same position in the two strings. Then we simply iterate through the zip and count the positions where the two characters differ. Each comparison `c1 != c2` yields a boolean, and `sum` counts every `True` as 1, so the total is the number of mismatches. Although this approach may be slightly slower than a simple loop, it can be elegantly written in one line:

    def hamming(s, t):
        # The Hamming distance is only defined for strings of equal length
        if len(s) != len(t):
            raise ValueError("Strings must be the same length")
        # Each mismatch yields True, which sum() counts as 1
        return sum(c1 != c2 for c1, c2 in zip(s, t))

    seq_a = 'GAGCCTACTAACGGGAT'
    seq_b = 'CATCGTAATGACGGCCT'

    print(hamming(seq_a, seq_b))

    7
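
For comparison, the "simple loop" mentioned above could look something like the sketch below. The name `hamming_loop` is a hypothetical helper introduced here for illustration, not part of the original solution. The snippet also shows why the length check matters: `zip` stops at the end of the shorter input, so extra characters in a longer string would otherwise be ignored without warning.

```python
def hamming_loop(s, t):
    """Count mismatched positions with an explicit index loop (illustrative helper)."""
    if len(s) != len(t):
        raise ValueError("Strings must be the same length")
    distance = 0
    for i in range(len(s)):
        if s[i] != t[i]:
            distance += 1
    return distance

# zip pairs characters by position and stops at the shorter input,
# so the final 'C' of "GACC" never appears in any pair.
pairs = list(zip("GAT", "GACC"))
print(pairs)  # [('G', 'G'), ('A', 'A'), ('T', 'C')]

print(hamming_loop('GAGCCTACTAACGGGAT', 'CATCGTAATGACGGCCT'))  # 7
```

Both versions do the same work per position; the one-line form just hands the counting to `sum`.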
