## Hamming Distance


By counting the number of differences between two homologous DNA strands
taken from different genomes with a common ancestor, we get a measure of
the minimum number of point mutations that could have occurred on the
evolutionary path between the two strands.

This is called the 'Hamming distance'.

It is found by comparing two DNA strands of equal length position by position
and counting how many nucleotides differ from their counterpart in the other strand.
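As a minimal sketch of that position-by-position comparison, the helper below lists where two strands differ, so the candidate point mutations can be inspected as well as counted. The name `point_differences` is hypothetical and chosen only for illustration; it assumes both strands are plain strings of equal length.

```python
def point_differences(strand_a, strand_b):
    """Return (position, base_a, base_b) for every position where the strands differ."""
    # The comparison is only meaningful for strands of equal length.
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(strand_a, strand_b))
        if a != b
    ]


diffs = point_differences("GGACG", "GGTCG")
print(diffs)       # [(2, 'A', 'T')]
print(len(diffs))  # 1, the Hamming distance
```

The length of the returned list is the Hamming distance; keeping the positions is a design choice that makes it easy to see which bases changed.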

In information theory, the Hamming distance between two strings of equal length is the 
number of positions at which the corresponding symbols are different.

The symbols may be letters, bits, or decimal digits, among other possibilities. 
For example, the Hamming distance between:

    "karolin" and "kathrin" is 3.
    "karolin" and "kerstin" is 3.
    "kathrin" and "kerstin" is 4.
    1011101 and 1001001 is 2.
    2173896 and 2233796 is 3.
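The examples above can be checked with a few lines of Python. This is a sketch, not a reference implementation: `hamming_distance` is a hypothetical helper name, and the binary and decimal examples are passed as strings so that each digit counts as one symbol.

```python
def hamming_distance(seq_a, seq_b):
    """Return the number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    # True counts as 1 when summed, so this counts the mismatching positions.
    return sum(a != b for a, b in zip(seq_a, seq_b))


examples = [
    ("karolin", "kathrin", 3),
    ("karolin", "kerstin", 3),
    ("kathrin", "kerstin", 4),
    ("1011101", "1001001", 2),
    ("2173896", "2233796", 3),
]

for a, b, expected in examples:
    # Each assertion passes only if the computed distance matches the listed value.
    assert hamming_distance(a, b) == expected
    print(a, b, hamming_distance(a, b))
```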


### Algorithm

    def hamming(str_a, str_b):
        """Print the Hamming distance between two strings of equal length."""
        # The Hamming distance is only defined for strings of equal length.
        if len(str_a) != len(str_b):
            print('These strings are not equal length!')
            return
        # Walk both strings in parallel and count the positions that differ.
        distance = 0
        for char_a, char_b in zip(str_a, str_b):
            if char_a != char_b:
                distance += 1
        print(f'The hamming distance for "{str_a}" and "{str_b}" is {distance}')


    hamming('GGACGGATTCTG', 'AGGACGGATTCT')
    hamming("karolin", "kathrin")
    hamming('GGACGGATTC', 'AGGACGGATTCT')  # unequal length
    hamming('GGACG', 'GGTCG')
    hamming('1011101', '1001001')

Output:

    The hamming distance for "GGACGGATTCTG" and "AGGACGGATTCT" is 9
    The hamming distance for "karolin" and "kathrin" is 3
    These strings are not equal length!
    The hamming distance for "GGACG" and "GGTCG" is 1
    The hamming distance for "1011101" and "1001001" is 2
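When the symbols are bits and the two values are stored as integers rather than strings, the same count can be obtained with XOR, which sets a bit exactly where the inputs differ. The sketch below assumes non-negative integers; `hamming_bits` is a hypothetical name used only for this example.

```python
def hamming_bits(x, y):
    """Hamming distance between two non-negative integers viewed as bit strings."""
    # x ^ y has a 1 bit at every position where x and y differ.
    # Counting the '1' characters in its binary representation gives the distance.
    return bin(x ^ y).count("1")


print(hamming_bits(0b1011101, 0b1001001))  # 2
```

Leading zeros do not affect the result, so the two integers are implicitly treated as bit strings of the same width.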
