# CMPE 484: Bioinformatics and Computational Genomics, Spring 2022
## Assignment I - Pairwise Sequence Alignment
### Halil Burak Pala - 2019400282

In this project, I implemented global and local alignment with linear and affine gap penalties.

To get the BLOSUM62 scoring matrix, I used a small Python module, _blosum_. More information can be found [here](https://pypi.org/project/blosum/).

In [1]:
import blosum as bl
blosum62 = bl.BLOSUM(62) # BLOSUM62 scoring matrix

### Needleman-Wunsch Algorithm (Global Sequence Alignment)

#### NW with Linear Gap Penalty
First, I created a function for the Needleman-Wunsch algorithm with a _linear gap penalty_:

In [2]:
def nw_linear(seq1, seq2, gap_score):
    # Our tracing matrix:
    table = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    
    # Our backtrack matrix:
    # The backtrack matrix keeps track of where we came from to reach a specific
    # entry in the tracing matrix. I use integer codes for this purpose.
    backtrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]

    # Fill the first column with gap penalties:
    for i in range(1,len(seq1)+1):
        table[i][0] = i * gap_score

    # Fill the first row with gap penalties as well:
    for j in range(1,len(seq2)+1):
        table[0][j] = j * gap_score

    # Fill the rest of the table:
    for i in range(1, len(seq1)+1): 
        for j in range(1, len(seq2)+1):
            # For every table entry, choose the maximum of delete, insert and match scores.
            scores = [table[i-1][j] + gap_score, table[i][j-1] + gap_score, table[i-1][j-1] + blosum62[seq1[i-1]+seq2[j-1]]]
            table[i][j] = max(scores)
            
            # If backtrack[i][j] is:
            # 0 -> reached from top
            # 1 -> reached from left
            # 2 -> reached from diagonal
            backtrack[i][j] = scores.index(table[i][j])

    # Our alignment score is at the right bottom corner of the table:
    alignmentScore = table[len(seq1)][len(seq2)]
    
    # This is a lambda function for inserting indels:
    insertIndel = lambda seq, i: seq[:i] + '-' + seq[i:]

    # Now, backtrack and get the alignments:
    alignment1 = seq1
    alignment2 = seq2
    alignStr = "" # Indicates matches, mismatches and indels

    # We start from the bottom-right corner and follow the backtrack codes
    # to find which table entry each cell was reached from.
    i, j = len(seq1), len(seq2)
    while i*j != 0:
        # Reached from top: seq1[i-1] is aligned with a gap in seq2.
        if backtrack[i][j] == 0:
            i -= 1
            alignment2 = insertIndel(alignment2, j)
            alignStr += "-"
        elif backtrack[i][j] == 1:
            j -= 1
            alignment1 = insertIndel(alignment1, i)
            alignStr += "-"
        else:
            i -= 1
            j -= 1
            alignStr += "|" if alignment1[i] == alignment2[j] else "."
            
    # Fill the rest of the alignments with indels:
    for _ in range(i):
        alignment2 = insertIndel(alignment2, 0)
        alignStr += "-"
    for _ in range(j):
        alignment1 = insertIndel(alignment1, 0)
        alignStr += "-"
    
    alignStr = alignStr[::-1]
    
    return "Score: " + str(alignmentScore), alignment1, alignStr, alignment2
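The table-filling step above can be tried in isolation with a score-only version. This is a minimal sketch, not the function used in this notebook: `nw_score_linear` and `toy_score` are hypothetical names, the scoring is a toy +1/-1 scheme instead of BLOSUM62, and only two rows of the table are kept because no backtracking is done.

```python
def nw_score_linear(seq1, seq2, score, gap_score):
    """Return only the optimal global alignment score (no traceback)."""
    # prev holds row i-1 of the score table; row 0 is all leading gaps.
    prev = [j * gap_score for j in range(len(seq2) + 1)]
    for i in range(1, len(seq1) + 1):
        curr = [i * gap_score]  # first column: i leading gaps
        for j in range(1, len(seq2) + 1):
            curr.append(max(
                prev[j] + gap_score,                            # reached from top
                curr[j - 1] + gap_score,                        # reached from left
                prev[j - 1] + score(seq1[i - 1], seq2[j - 1]),  # reached from diagonal
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy scoring (assumption): +1 for a match, -1 for a mismatch.
def toy_score(a, b):
    return 1 if a == b else -1


print(nw_score_linear("ACGT", "ACGT", toy_score, -1))  # 4
print(nw_score_linear("AC", "A", toy_score, -1))       # 0
```

Passing a function that looks up BLOSUM62 instead of `toy_score` should give the same score as the bottom-right entry of `table` in `nw_linear`.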

#### NW with Affine Gap Penalty
Second, I created the Needleman-Wunsch algorithm with an _affine gap penalty_:

In [3]:
def nw_affine(seq1, seq2, open_gap_score, extend_gap_score):
    # In affine gap penalty case, we have three tables: lower, main and upper tables.
    # upper table is for deletions, lower table is for insertions, main table is for
    # matches.
    
    # Initialization of tables:
    
    # Here, I initialized the upper table. The first row (including position [0,0])
    # is filled with -infinity, and the rest of the first column with gap penalties.
    upperTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(0,len(seq2)+1):
        upperTable[0][i] = -float("inf")
    for i in range(1,len(seq1)+1):
        upperTable[i][0] = open_gap_score + (i-1) * extend_gap_score

    # Here, I initialized the main table. Except for position [0,0], the first row
    # and first column of the table are initialized with gap penalties.
    mainTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(1,len(seq2)+1):
        mainTable[0][i] = open_gap_score + (i-1) * extend_gap_score
    for i in range(1,len(seq1)+1):
        mainTable[i][0] = open_gap_score + (i-1) * extend_gap_score

    # Here, I initialized the lower table. The first column (including position [0,0])
    # is filled with -infinity, and the rest of the first row with gap penalties.
    lowerTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(0,len(seq1)+1):
        lowerTable[i][0] = -float("inf")
    for i in range(1,len(seq2)+1):
        lowerTable[0][i] = open_gap_score + (i-1) * extend_gap_score

    # Here, I created the backtracking matrices. Since we have three tables, we also
    # have three backtracking tables, one for each of them.
    lowerBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    mainBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    upperBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]

    # Filling the tables:
    for i in range(1, len(seq1)+1):
        for j in range(1, len(seq2)+1):
            # Either extend an existing deletion (index 0) or open a new one from the main table (index 1):
            upperScores = [upperTable[i-1][j] + extend_gap_score, mainTable[i-1][j] + open_gap_score]
            upperTable[i][j] = max(upperScores)
            # Store which option won, as its index in upperScores. The same
            # procedure applies to the other tables:
            upperBacktrack[i][j] = upperScores.index(upperTable[i][j])

            # Either extend an existing insertion (index 0) or open a new one from the main table (index 1):
            lowerScores = [lowerTable[i][j-1] + extend_gap_score, mainTable[i][j-1] + open_gap_score]
            lowerTable[i][j] = max(lowerScores)
            lowerBacktrack[i][j] = lowerScores.index(lowerTable[i][j])

            # Choose the best of: ending a deletion (0), a match/mismatch (1), or ending an insertion (2):
            middleScores = [upperTable[i][j], mainTable[i-1][j-1] + blosum62[seq1[i-1]+seq2[j-1]], lowerTable[i][j]]
            mainTable[i][j] = max(middleScores)
            mainBacktrack[i][j] = middleScores.index(mainTable[i][j])

    # Backtrack and get the alignments:
    
    # Backtracking in the affine gap penalty case is somewhat tricky. We need to check every backtrack table and go back and forth
    # between these tables to get the exact alignments. 
    
    # Initialization of variables:
    i = len(seq1)
    j = len(seq2)
    alignment1 = seq1
    alignment2 = seq2

    # We first choose the maximum of the bottom-right corner entries of the three score tables.
    # Then we start checking the backtrack tables. We use a backtracking code, backtrackMatrixNo, to indicate
    # which backtrack table we are currently in. According to this code, we determine which table
    # to look at next in order to follow the correct path.
    matrixScores = [upperTable[i][j], mainTable[i][j], lowerTable[i][j]]
    alignmentScore = max(matrixScores)
    backtrackMatrixNo = matrixScores.index(alignmentScore)

    # This is a lambda function for inserting indels:
    insertIndel = lambda seq, i: seq[:i] + '-' + seq[i:]
    
    alignStr = ""

    while i*j != 0: 
        if backtrackMatrixNo == 0:  # Look at the upper backtrack table:
            if upperBacktrack[i][j] == 1: # If the deletion was opened from the main table...
                backtrackMatrixNo = 1 # ...then continue in the main table.
            i -= 1 # Move up...
            alignment2 = insertIndel(alignment2, j) # ...and insert an indel into seq2.
            alignStr += "-"

        elif backtrackMatrixNo == 1:  # Look at the main backtrack table:
            if mainBacktrack[i][j] == 0: # If this entry was reached from the upper table (deletion)...
                backtrackMatrixNo = 0 # ...then change the backtrack code accordingly.
            elif mainBacktrack[i][j] == 2: # If this entry was reached from the lower table (insertion)...
                backtrackMatrixNo = 2 # ...then change the backtrack code accordingly.
            else: # Otherwise it is a match/mismatch, so move diagonally:
                i -= 1
                j -= 1
                alignStr += "|" if alignment1[i] == alignment2[j] else "."

        else: # Look at the lower backtrack table:
            if lowerBacktrack[i][j] == 1: # If the insertion was opened from the main table...
                backtrackMatrixNo = 1 # ...then continue in the main table.
            j -= 1 # Move left...
            alignment1 = insertIndel(alignment1, i) # ...and insert an indel into seq1.
            alignStr += "-"
    # Insert indels for the rest:
    for _ in range(i):
        alignment2 = insertIndel(alignment2, 0)
        alignStr += "-"
    for _ in range(j):
        alignment1 = insertIndel(alignment1, 0)
        alignStr += "-"
        
    alignStr = alignStr[::-1]
        
    return "Score: " + str(alignmentScore), alignment1, alignStr, alignment2
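The three tables filled in `nw_affine` can be summarized by the recurrences below. The symbols are notation introduced here for illustration only: $U$, $M$ and $L$ stand for `upperTable`, `mainTable` and `lowerTable`, $o$ and $e$ for `open_gap_score` and `extend_gap_score`, and $s(a_i, b_j)$ for the BLOSUM62 score of `seq1[i-1]` and `seq2[j-1]`.

$$
\begin{aligned}
U_{i,j} &= \max\bigl(U_{i-1,j} + e,\; M_{i-1,j} + o\bigr) \\
L_{i,j} &= \max\bigl(L_{i,j-1} + e,\; M_{i,j-1} + o\bigr) \\
M_{i,j} &= \max\bigl(U_{i,j},\; M_{i-1,j-1} + s(a_i, b_j),\; L_{i,j}\bigr)
\end{aligned}
$$

With this convention, a gap of length $k$ costs $o + (k-1)\,e$, which is also how the first row and column are initialized.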

Here is our __Needleman-Wunsch Algorithm__:

In [4]:
def needleman_wunsch_algorithm(seq1, seq2, penalty_params):
    n = len(seq1) # nofrows  
    m = len(seq2) # nofcolumns
    
    if penalty_params['penalty_type'] == 'linear':
        # Assumed indel score given as gap_opening_penalty
        return nw_linear(seq1, seq2, penalty_params['gap_opening_penalty'])
    
    else: # AFFINE GAP PENALTY 
        return nw_affine(seq1, seq2, penalty_params['gap_opening_penalty'], penalty_params['gap_extension_penalty'])
    

### Smith-Waterman Algorithm (Local Sequence Alignment)
#### SW with Linear Gap Penalty
Here, I created a function for the Smith-Waterman algorithm with a _linear gap penalty_. It is almost the same as Needleman-Wunsch, but this time every negative entry in the tracing matrix is set to 0.

In [5]:
def sw_linear(seq1, seq2, gap_score):
    
    # Our tracing matrix:
    table = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]

    # Our backtrack matrix:
    # The backtrack matrix keeps track of where we came from to reach a specific
    # entry in the tracing matrix. I use integer codes for this purpose.
    backtrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]

    # We should keep track of maximum score and its position
    maxScore = -float("inf")
    maxScorePosition = (0,0)

    # Filling the score and backtrack matrices:
    for i in range(1, len(seq1)+1):
        for j in range(1, len(seq2)+1):
            scores = [table[i-1][j] + gap_score, table[i][j-1] + gap_score, table[i-1][j-1] + blosum62[seq1[i-1]+seq2[j-1]], 0]
            table[i][j] = max(scores)

            # If backtrack[i][j] is:
            # 0 -> reached from top
            # 1 -> reached from left
            # 2 -> reached from diagonal
            # 3 -> score floored at 0 (the local alignment starts here)
            backtrack[i][j] = scores.index(table[i][j])
            # Update maximum if necessary:
            if table[i][j] > maxScore:
                maxScore = table[i][j]
                maxScorePosition = (i,j)

    (i,j) = maxScorePosition

    alignment1, alignment2 = seq1, seq2

    # This is a lambda function for inserting indels:
    insertIndel = lambda seq, i: seq[:i] + '-' + seq[i:]

    alignStr = "" # Indicates matches, mismatches and indels

    # We backtrack until we reach a 0 entry or the edges of the table
    while backtrack[i][j] != 3 and i*j != 0:
        if backtrack[i][j] == 0: # Reached from top:
            i -= 1
            alignment2 = insertIndel(alignment2, j)
            alignStr += "-"
        elif backtrack[i][j] == 1: # Reached from left:
            j -= 1
            alignment1 = insertIndel(alignment1, i)
            alignStr += "-"
        elif backtrack[i][j] == 2: # Reached from diagonal:
            i -= 1
            j -= 1
            alignStr += "|" if alignment1[i] == alignment2[j] else "."

    # Fill the rest with spaces:
    if min(i,j) == i:
        alignment1 = (j-i) * " " + alignment1
    else:    
        alignment2 = (i-j) * " " + alignment2

    alignStr += max(i,j) * " "
    alignStr = alignStr[::-1]
    
    return "Score: " + str(maxScore), alignment1, alignStr, alignment2
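The only changes relative to the global case are the extra 0 candidate and taking the maximum over the whole table. A minimal score-only sketch, using hypothetical names (`sw_score_linear`, `toy_score`) and a toy +1/-1 scoring rather than BLOSUM62, could look like this:

```python
def sw_score_linear(seq1, seq2, score, gap_score):
    """Return only the best local alignment score (no traceback)."""
    best = 0
    prev = [0] * (len(seq2) + 1)  # row 0 of a local alignment table is all zeros
    for i in range(1, len(seq1) + 1):
        curr = [0]  # first column is also zero
        for j in range(1, len(seq2) + 1):
            cell = max(
                prev[j] + gap_score,                            # reached from top
                curr[j - 1] + gap_score,                        # reached from left
                prev[j - 1] + score(seq1[i - 1], seq2[j - 1]),  # reached from diagonal
                0,                                              # start a new local alignment
            )
            curr.append(cell)
            best = max(best, cell)  # the answer may sit anywhere in the table
        prev = curr
    return best


# Hypothetical toy scoring (assumption): +1 for a match, -1 for a mismatch.
def toy_score(a, b):
    return 1 if a == b else -1


print(sw_score_linear("XXACGTXX", "YYACGTYY", toy_score, -1))  # 4
print(sw_score_linear("AAAA", "TTTT", toy_score, -1))          # 0
```

Note that this sketch returns 0 for empty inputs, whereas `sw_linear` above starts `maxScore` at negative infinity.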

#### SW with Affine Gap Penalty
Here, I created the Smith-Waterman algorithm with an _affine gap penalty_. It is almost the same algorithm as the affine Needleman-Wunsch, except that this time every negative entry in the main tracking table is treated as 0.

In [6]:
def sw_affine(seq1, seq2, open_gap_score, extend_gap_score):
    # In affine gap penalty case, we have three tables: lower, main and upper tables.
    # upper table is for deletions, lower table is for insertions, main table is for
    # matches.
    
    # Initialization of tables:
    
    # Here, I initialized the upper table. The first row (including position [0,0])
    # is filled with -infinity, and the rest of the first column with gap penalties.
    upperTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(0,len(seq2)+1):
        upperTable[0][i] = -float("inf")
    for i in range(1,len(seq1)+1):
        upperTable[i][0] = open_gap_score + (i-1) * extend_gap_score

    # Here, I initialized the main table. The first row and first column of the
    # table are initialized with 0 (since this is a local alignment).
    mainTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(1,len(seq2)+1):
        mainTable[0][i] = 0
    for i in range(1,len(seq1)+1):
        mainTable[i][0] = 0

    # Here, I initialized the lower table. The first column (including position [0,0])
    # is filled with -infinity, and the rest of the first row with gap penalties.
    lowerTable = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    for i in range(0,len(seq1)+1):
        lowerTable[i][0] = -float("inf")
    for i in range(1,len(seq2)+1):
        lowerTable[0][i] = open_gap_score + (i-1) * extend_gap_score

    # Here, I created the backtracking matrices. Since we have three tables, we also
    # have three backtracking tables, one for each of them.
    lowerBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    mainBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]
    upperBacktrack = [[0 for j in range(len(seq2)+1)] for i in range(len(seq1)+1)]

    # We should keep track of max score and its coordinates:
    maxScore = -float("inf")
    maxScorePosition = (0,0)
    
    alignStr = ""
    # Filling the tables:
    for i in range(1, len(seq1)+1):
        for j in range(1, len(seq2)+1):
            # Either extend an existing deletion (index 0) or open a new one from the main table (index 1):
            upperScores = [upperTable[i-1][j] + extend_gap_score, mainTable[i-1][j] + open_gap_score]
            upperTable[i][j] = max(upperScores)
            # Store which option won, as its index in upperScores. The same
            # procedure applies to the other tables:
            upperBacktrack[i][j] = upperScores.index(upperTable[i][j])

            # Either extend an existing insertion (index 0) or open a new one from the main table (index 1):
            lowerScores = [lowerTable[i][j-1] + extend_gap_score, mainTable[i][j-1] + open_gap_score]
            lowerTable[i][j] = max(lowerScores)
            lowerBacktrack[i][j] = lowerScores.index(lowerTable[i][j])

            # Same candidates as in the global case, but this time 0 is also a candidate (index 3):
            middleScores = [upperTable[i][j], mainTable[i-1][j-1] + blosum62[seq1[i-1]+seq2[j-1]], lowerTable[i][j], 0] 
            mainTable[i][j] = max(middleScores)
            mainBacktrack[i][j] = middleScores.index(mainTable[i][j])

            if mainTable[i][j] > maxScore:
                maxScore = mainTable[i][j]
                maxScorePosition = (i,j)

    # Backtrack and get the alignments:
    
    # Backtracking in the affine gap penalty case is somewhat tricky. We need to check every backtrack table and go back and forth
    # between these tables to get the exact alignments. 
    
    # Initialization of variables:
    i, j = maxScorePosition
    alignment1 = seq1
    alignment2 = seq2

    # This time we start inspecting tables by the main one:
    backtrackMatrixNo = 1

    # This is a lambda function for inserting indels:
    insertIndel = lambda seq, i: seq[:i] + '-' + seq[i:]
    
    alignStr = ""
    while mainBacktrack[i][j] != 3 and i*j != 0: 
        if backtrackMatrixNo == 0:  # Look at the upper backtrack table:
            if upperBacktrack[i][j] == 1: # If the deletion was opened from the main table...
                backtrackMatrixNo = 1 # ...then continue in the main table.
            i -= 1 # Move up...
            alignment2 = insertIndel(alignment2, j) # ...and insert an indel into seq2.
            alignStr += "-"

        elif backtrackMatrixNo == 1:  # Look at the main backtrack table:
            if mainBacktrack[i][j] == 0: # If this entry was reached from the upper table (deletion)...
                backtrackMatrixNo = 0 # ...then change the backtrack code accordingly.
            elif mainBacktrack[i][j] == 2: # If this entry was reached from the lower table (insertion)...
                backtrackMatrixNo = 2 # ...then change the backtrack code accordingly.
            else: # Otherwise it is a match/mismatch, so move diagonally:
                i -= 1
                j -= 1
                alignStr += "|" if alignment1[i] == alignment2[j] else "."

        else: # Look at the lower backtrack table:
            if lowerBacktrack[i][j] == 1: # If the insertion was opened from the main table...
                backtrackMatrixNo = 1 # ...then continue in the main table.
            j -= 1 # Move left...
            alignment1 = insertIndel(alignment1, i) # ...and insert an indel into seq1.
            alignStr += "-"
    
    # Insert spaces for the rest:
    if min(i,j) == i:
        alignment1 = (j-i) * " " + alignment1
    else:    
        alignment2 = (i-j) * " " + alignment2
    
    alignStr += max(i,j) * " "
    
    alignStr = alignStr[::-1]
    
    return "Score: " + str(maxScore), alignment1, alignStr, alignment2
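In recurrence form, the local variant filled in `sw_affine` differs from the global one only in the main table and in where the score is read. The symbols are notation introduced here for illustration: $U$, $M$, $L$ stand for `upperTable`, `mainTable`, `lowerTable`, $o$ and $e$ for `open_gap_score` and `extend_gap_score`, and $s(a_i, b_j)$ for the BLOSUM62 score of `seq1[i-1]` and `seq2[j-1]`.

$$
\begin{aligned}
U_{i,j} &= \max\bigl(U_{i-1,j} + e,\; M_{i-1,j} + o\bigr) \\
L_{i,j} &= \max\bigl(L_{i,j-1} + e,\; M_{i,j-1} + o\bigr) \\
M_{i,j} &= \max\bigl(U_{i,j},\; M_{i-1,j-1} + s(a_i, b_j),\; L_{i,j},\; 0\bigr) \\
\text{score} &= \max_{i,j} M_{i,j}
\end{aligned}
$$

Backtracking starts at the cell that attains this maximum and stops at the first cell whose value came from the 0 candidate.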
    

And here is the __Smith-Waterman algorithm__:

In [7]:
def smith_waterman_algorithm(seq1, seq2, penalty_params):
    
    if penalty_params['penalty_type'] == 'linear':
        # Assumed indel score given as gap_opening_penalty
        return sw_linear(seq1, seq2, penalty_params['gap_opening_penalty'])

    else: # AFFINE GAP PENALTY 
        return sw_affine(seq1,seq2,penalty_params['gap_opening_penalty'],penalty_params['gap_extension_penalty'])

### Generic Sequence Alignment Function

Here is the generic function, which dispatches to the chosen algorithm and prints the result:

In [8]:
def align_sequences(seq1, seq2, algorithm, penalty_params):
    if algorithm == 'global':
        print('\n'.join(needleman_wunsch_algorithm(seq1, seq2, penalty_params)))
    elif algorithm == 'local': 
        print('\n'.join(smith_waterman_algorithm(seq1, seq2, penalty_params)))
    else:
        raise NotImplementedError

### Testing

In [9]:
# To generate random sequences:
import random
random.seed(2022)

aminoacids = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']

def generate_sequence(length=50):    
    return ''.join([random.choice(aminoacids) for i in range(length)])

def mutate_sequence(seq, n_mutations=35):
    seq = list(seq)
    pos = {random.randint(1, len(seq)): random.choice(['substitute', 'delete']) for i in range(n_mutations)}   
    mutated_sequence = ''
    for ix, aminoacid in enumerate(seq):
        if ix in pos:
            if pos[ix] == 'substitute':
                mutated_sequence += random.choice(aminoacids)
        else:
            mutated_sequence += aminoacid             
    return mutated_sequence

def create_random_sequences():
    sample_sequence = generate_sequence()
    sequence1 = mutate_sequence(sample_sequence)
    sequence2 = mutate_sequence(sample_sequence)
    return sequence1, sequence2
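One detail of `mutate_sequence` is worth noting: `random.randint(1, len(seq))` is inclusive on both ends, so it draws positions from 1 to `len(seq)`, while `enumerate` yields indices from 0 to `len(seq) - 1`. As a result, index 0 is never mutated and a draw of `len(seq)` has no effect. The samples below were generated with the function exactly as written above. A 0-based variant, given only as a sketch with hypothetical names (`mutate_sequence_zero_based`, `AMINO_ACIDS`, and the `rng` parameter are assumptions), could look like this:

```python
import random

AMINO_ACIDS = list("ARNDCQEGHILKMFPSTWYV")


def mutate_sequence_zero_based(seq, n_mutations=35, rng=random):
    """Like mutate_sequence, but draws 0-based positions so index 0 can be hit."""
    # Repeated draws of the same position overwrite each other, as in the original.
    pos = {rng.randrange(len(seq)): rng.choice(['substitute', 'delete'])
           for _ in range(n_mutations)}
    mutated = []
    for ix, aminoacid in enumerate(seq):
        if ix not in pos:
            mutated.append(aminoacid)                # untouched position
        elif pos[ix] == 'substitute':
            mutated.append(rng.choice(AMINO_ACIDS))  # replace with a random residue
        # 'delete': append nothing
    return ''.join(mutated)


rng = random.Random(2022)
print(mutate_sequence_zero_based("ARNDCQEGHILKMFPSTWYV", n_mutations=5, rng=rng))
```

Switching to this variant would change the random samples, so the outputs shown in this notebook would no longer be reproduced.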

In [10]:
# BioPython's alignment function for checking our results:
from Bio.Align import substitution_matrices
from Bio import Align 
def align_sequences_biopython(seq1, seq2, algorithm, penalty_params):
    aligner = Align.PairwiseAligner()
    aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
    if penalty_params['penalty_type'] == 'linear':
        aligner.open_gap_score = penalty_params['gap_opening_penalty'] 
        aligner.extend_gap_score = penalty_params['gap_opening_penalty'] 
    else:
        aligner.open_gap_score = penalty_params['gap_opening_penalty'] # -11
        aligner.extend_gap_score = penalty_params['gap_extension_penalty'] # -1
    aligner.mode = algorithm
    for alignment in aligner.align(seq1, seq2):
        print('Score:', alignment.score)
        print(alignment)

I tested my functions with the 3 given samples and 4 new random samples that I generated.

#### Sample 1

In [11]:
sample1_sequence1 = "PLEASANTLY"
sample1_sequence2 = "MEANLY"

##### Global - Linear Gap Penalty

Here is what my function gives:

In [12]:
align_sequences(sample1_sequence1, sample1_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -16.0
PLEASANTLY
-.||--|-||
-MEA--N-LY


Here is what BioPython gives:

In [13]:
align_sequences_biopython(sample1_sequence1, sample1_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -16.0
PLEASANTLY
-.||--|-||
-MEA--N-LY

Score: -16.0
PLEASANTLY
-.|--||-||
-ME--AN-LY



As can be seen, the first result is the same as mine.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [14]:
align_sequences(sample1_sequence1, sample1_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -1.0
PLEASANTLY
-.||.---||
-MEAN---LY


Here is what BioPython gives:

In [15]:
align_sequences_biopython(sample1_sequence1, sample1_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -1.0
PLEASANTLY
-.||.---||
-MEAN---LY



As can be seen, they are the same.
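Both Sample 1 global scores can also be checked by hand by summing the columns of the printed alignments. The sketch below does this with a hypothetical helper, `score_alignment`, and a hand-copied subset of BLOSUM62 (`BLOSUM62_SUBSET`, an assumption made to keep the example self-contained). It treats the linear penalty as an affine penalty with equal opening and extension scores, the same convention used in `align_sequences_biopython`.

```python
# A few BLOSUM62 entries, copied by hand for this check only (assumption).
BLOSUM62_SUBSET = {
    ('L', 'M'): 2, ('E', 'E'): 5, ('A', 'A'): 4, ('N', 'N'): 6,
    ('S', 'N'): 1, ('L', 'L'): 4, ('Y', 'Y'): 7,
}


def score_alignment(row1, row2, subs, open_gap, extend_gap):
    """Score two equal-length gapped rows; a gap of length k costs
    open_gap + (k - 1) * extend_gap."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    total = 0
    prev_gap_row = None  # row holding the gap in the previous column, if any
    for a, b in zip(row1, row2):
        if a == '-' or b == '-':
            gap_row = 1 if a == '-' else 2
            # Continuing a gap in the same row extends it; otherwise open one.
            total += extend_gap if gap_row == prev_gap_row else open_gap
            prev_gap_row = gap_row
        else:
            total += subs[(a, b)] if (a, b) in subs else subs[(b, a)]
            prev_gap_row = None
    return total


# Linear penalty: every gap column costs -11.
print(score_alignment("PLEASANTLY", "-MEA--N-LY", BLOSUM62_SUBSET, -11, -11))  # -16
# Affine penalty: opening costs -11, each extension costs -1.
print(score_alignment("PLEASANTLY", "-MEAN---LY", BLOSUM62_SUBSET, -11, -1))   # -1
```

The affine alignment replaces three separate gaps (4 gap columns at -11 each) with one single-column gap and one gap of length 3 (-11 and -13), which is why it scores higher despite the weaker S/N pairing.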

##### Local - Linear Gap Penalty

Here is what my function gives:

In [16]:
align_sequences(sample1_sequence1, sample1_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 12.0
PLEASANTLY
 .||.
 MEANLY


Here is what BioPython gives:

In [17]:
align_sequences_biopython(sample1_sequence1, sample1_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 12.0
PLEASANTLY
 .||.
 MEANLY



As can be seen, they are the same.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [18]:
align_sequences(sample1_sequence1, sample1_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 12.0
PLEASANTLY
 .||.
 MEANLY


Here is what BioPython gives:

In [19]:
align_sequences_biopython(sample1_sequence1, sample1_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 12.0
PLEASANTLY
 .||.
 MEANLY



As can be seen, they are the same.

#### Sample 2

In [20]:
sample2_sequence1 = 'PRTEINS'
sample2_sequence2 = 'PRTWPSEIN'

##### Global - Linear Gap Penalty

Here is what my function gives:

In [21]:
align_sequences(sample2_sequence1, sample2_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -7.0
PRT-EIN-S
|||-...-.
PRTWPSEIN


Here is what BioPython gives:

In [22]:
align_sequences_biopython(sample2_sequence1, sample2_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -7.0
PRT-EIN-S
|||-...-.
PRTWPSEIN



As can be seen, they are the same.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [23]:
align_sequences(sample2_sequence1, sample2_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 8.0
PRT---EINS
|||---|||-
PRTWPSEIN-


Here is what BioPython gives:

In [24]:
align_sequences_biopython(sample2_sequence1, sample2_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 8.0
PRT---EINS
|||---|||-
PRTWPSEIN-



As can be seen, they are the same.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [25]:
align_sequences(sample2_sequence1, sample2_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 17.0
PRTEINS
|||
PRTWPSEIN


Here is what BioPython gives:

In [26]:
align_sequences_biopython(sample2_sequence1, sample2_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 17.0
PRTEINS
|||
PRTWPSEIN



As can be seen, they are the same.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [27]:
align_sequences(sample2_sequence1, sample2_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
PRT---EINS
|||---|||
PRTWPSEIN


Here is what BioPython gives:

In [28]:
align_sequences_biopython(sample2_sequence1, sample2_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
PRT---EINS
|||---|||
PRTWPSEIN



As can be seen, they are the same.

#### Sample 3

In [29]:
sample3_sequence1 = 'YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW'
sample3_sequence2 = 'YHEDVAHEDAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPIISATCARMRVRTVWE'

##### Global - Linear Gap Penalty

Here is what my function gives:

In [30]:
align_sequences(sample3_sequence1, sample3_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCR--W---LPIRGKKC-SSCCTRMRVRTVWEW
||.||....|....|-|.....-.-|||.|.|||||||.........|...|--|---...|....-|..|.|||||||||-
YHEDVAHEDAIAQMV-NTFGFV-W-QICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPIISATCARMRVRTVWE-


Here is what BioPython gives:

In [31]:
align_sequences_biopython(sample3_sequence1, sample3_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCR--W---LPIRGKKC-SSCCTRMRVRTVWEW
||.||....|....|-|.....-.-|||.|.|||||||.........|...|--|---...|....-|..|.|||||||||-
YHEDVAHEDAIAQMV-NTFGFV-W-QICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPIISATCARMRVRTVWE-

Score: 38.0
YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCR--W---LPIRGKKC-SSCCTRMRVRTVWEW
||.||....|....|-|--......|||.|.|||||||.........|...|--|---...|....-|..|.|||||||||-
YHEDVAHEDAIAQMV-N--TFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPIISATCARMRVRTVWE-

Score: 38.0
YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCR--W---LPIRGKK-CSSCCTRMRVRTVWEW
||.||....|....|-|.....-.-|||.|.|||||||.........|...|--|---...|...-.|..|.|||||||||-
YHEDVAHEDAIAQMV-NTFGFV-W-QICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPIISATCARMRVRTVWE-

Score: 38.0
YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCR--W---LPIRGKK-CSSCCTRMRVRTVWEW
||.||....|....|-|--......|||.|.|||||||.........|...|--|---...|...-.|..|.|||||||||-
YHEDVAHEDAIAQMV-N--TFGFVWQICLNQFPSMM

As can be seen, the first result is the same as mine.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [32]:
align_sequences(sample3_sequence1, sample3_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 133.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVF---KVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||.---....|||||||||----.|..|.|||||||||-
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE-


Here is what BioPython gives:

In [33]:
align_sequences_biopython(sample3_sequence1, sample3_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 133.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVF---KVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||.---....|||||||||----.|..|.|||||||||-
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE-

Score: 133.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVFK---VDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||..---...|||||||||----.|..|.|||||||||-
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE-

Score: 133.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVFKV---DHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||...---..|||||||||----.|..|.|||||||||-
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE-



As can be seen, the first one is the same as mine.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [34]:
align_sequences(sample3_sequence1, sample3_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 74.0
       YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
                                                      |||||||||----.|..|.|||||||||
YHEDVAHEDAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE


Here is what BioPython gives:

In [35]:
align_sequences_biopython(sample3_sequence1, sample3_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 74.0
       YHFDVPDCWAHRYWVENPQAIAQMEQICFNWFPSMMMKQPHVFKVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
                                                      |||||||||----.|..|.|||||||||
YHEDVAHEDAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE



As can be seen, they are the same.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [36]:
align_sequences(sample3_sequence1, sample3_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 144.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVF---KVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||.---....|||||||||----.|..|.|||||||||
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE


Here is what BioPython gives:

In [37]:
align_sequences_biopython(sample3_sequence1, sample3_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 144.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVF---KVDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||.---....|||||||||----.|..|.|||||||||
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE

Score: 144.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVFK---VDHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||..---...|||||||||----.|..|.|||||||||
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE

Score: 144.0
YHFDVPDCWAHRYWVENPQAIAQME-------QICFNWFPSMMMK-------QPHVFKV---DHHMSCRWLPIRGKKCSSCCTRMRVRTVWEW
||.||----||.------.|||||.-------|||.|.|||||||-------..||...---..|||||||||----.|..|.|||||||||
YHEDV----AHE------DAIAQMVNTFGFVWQICLNQFPSMMMKIYWIAVLSAHVADRKTWSKHMSCRWLPI----ISATCARMRVRTVWE



As can be seen, the first one is the same as mine.

#### Sample 4
Now, I am going to generate new samples:

In [38]:
sample4_sequence1, sample4_sequence2 = create_random_sequences()

##### Global - Linear Gap Penalty

Here is what my function gives:

In [39]:
align_sequences(sample4_sequence1, sample4_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 17.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|.-.|..|........|...|.-|.....|
WDMWDYRTFAFIPNRHRMNFHEQQRHFRDW-HPMRCGA


Here is what BioPython gives:

In [40]:
align_sequences_biopython(sample4_sequence1, sample4_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 17.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|.-.|..|........|...|.-|.....|
WDMWDYRTFAFIPNRHRMNFHEQQRHFRDW-HPMRCGA



As can be seen, they are the same.
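In the printouts above, the middle line marks each column of the alignment: `|` for identical residues, `.` for a substitution and `-` for a gap. The following is only a minimal sketch of how such a line can be rebuilt from the two gapped rows of a global alignment; `match_line` is a hypothetical helper name, not a function from my implementation or from BioPython.

```python
def match_line(row1, row2):
    """Rebuild the middle line of a printed alignment from its two gapped rows."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    symbols = []
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            symbols.append("-")   # gap in either sequence
        elif a == b:
            symbols.append("|")   # identical residues
        else:
            symbols.append(".")   # substitution
    return "".join(symbols)


# Rows copied from the Sample 4 global / linear output above.
row1 = "WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAPHCLHACA"
row2 = "WDMWDYRTFAFIPNRHRMNFHEQQRHFRDW-HPMRCGA"
print(match_line(row1, row2))
```

Comparing this line for two outputs is a quick way to check that two alignments agree column by column.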

##### Global - Affine Gap Penalty

Here is what my function gives:

In [41]:
align_sequences(sample4_sequence1, sample4_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 20.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAP-HCLHACA
|..|.|..|.-.|..|.--|||.........-|.....|
WDMWDYRTFAFIPNRHR--MNFHEQQRHFRDWHPMRCGA


Here is what BioPython gives:

In [42]:
align_sequences_biopython(sample4_sequence1, sample4_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 20.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAP-HCLHACA
|..|.|..|.-.|..|.--|||.........-|.....|
WDMWDYRTFAFIPNRHR--MNFHEQQRHFRDWHPMRCGA



As can be seen, they are the same.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [43]:
align_sequences(sample4_sequence1, sample4_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 26.0
WIPWIYNIFQCPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|
WDMWDYRTFAFIPNRHRMNFHEQQRHFRDWHPMRCGA


Here is what BioPython gives:

In [44]:
align_sequences_biopython(sample4_sequence1, sample4_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 26.0
WIPWIYNIFQCPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|
WDMWDYRTFAFIPNRHRMNFHEQQRHFRDWHPMRCGA



As can be seen, they are the same.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [45]:
align_sequences(sample4_sequence1, sample4_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 31.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|.-.|..|.--|||
WDMWDYRTFAFIPNRHR--MNFHEQQRHFRDWHPMRCGA


Here is what BioPython gives:

In [46]:
align_sequences_biopython(sample4_sequence1, sample4_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 31.0
WIPWIYNIFQ-CPTYHEKIMNFPIRSENDAPHCLHACA
|..|.|..|.-.|..|.--|||
WDMWDYRTFAFIPNRHR--MNFHEQQRHFRDWHPMRCGA



As can be seen, they are the same.

#### Sample 5

In [47]:
sample5_sequence1, sample5_sequence2 = create_random_sequences()

##### Global - Linear Gap Penalty

Here is what my function gives:

In [48]:
align_sequences(sample5_sequence1, sample5_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
CNYEILNFRDCRIKF-LVDAYNP-KMRPWLPTICHWKFDWE
|..|||.|...||..-..||.||-.......|.......|.
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK


Here is what BioPython gives:

In [49]:
align_sequences_biopython(sample5_sequence1, sample5_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
CNYEILNFRDCRIKF-LVDAYNP-KMRPWLPTICHWKFDWE
|..|||.|...||..-..||.||-.......|.......|.
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK

Score: 19.0
CNYEILNFRDCRIK-FLVDAYNP-KMRPWLPTICHWKFDWE
|..|||.|...||.-...||.||-.......|.......|.
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK



As can be seen, the first result is the same as mine.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [50]:
align_sequences(sample5_sequence1, sample5_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFD---WE
|..|||.|...||.-...||.||.--|.....|..|..---|.
CTIEILFFKMHRIPWWFSDALNPC--PCKRVNCTSKIARSVWK


Here is what BioPython gives:

In [51]:
align_sequences_biopython(sample5_sequence1, sample5_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFD---WE
|..|||.|...||.-...||.||--.|.....|..|..---|.
CTIEILFFKMHRIPWWFSDALNP--CPCKRVNCTSKIARSVWK

Score: 27.0
CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFD---WE
|..|||.|...||..-..||.||--.|.....|..|..---|.
CTIEILFFKMHRIPWWFSDALNP--CPCKRVNCTSKIARSVWK

Score: 27.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFD---WE
|..|||.|...||.-...||.||.--|.....|..|..---|.
CTIEILFFKMHRIPWWFSDALNPC--PCKRVNCTSKIARSVWK

Score: 27.0
CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFD---WE
|..|||.|...||..-..||.||.--|.....|..|..---|.
CTIEILFFKMHRIPWWFSDALNPC--PCKRVNCTSKIARSVWK



As can be seen, the third result is the same as mine.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [52]:
align_sequences(sample5_sequence1, sample5_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||..-..||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK


Here is what BioPython gives:

In [53]:
align_sequences_biopython(sample5_sequence1, sample5_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||..-..||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK

Score: 38.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||.-...||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK



As can be seen, the first one is the same as mine.
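When BioPython reports several co-optimal alignments, checking by eye which one equals mine is error-prone. Below is a small sketch of doing that check programmatically. It assumes each alignment is stored as a tuple of its three printed lines; `find_matching_alignment` is a hypothetical helper, not part of my functions above.

```python
def find_matching_alignment(mine, candidates):
    """Return the 1-based position of `mine` among `candidates`, or None."""
    for position, candidate in enumerate(candidates, start=1):
        if candidate == mine:
            return position
    return None


# Lines copied from the Sample 5 local / linear outputs above.
seq2_line = "CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK"
mine = ("CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFDWE",
        "|..|||.|...||..-..||.||",
        seq2_line)
biopython_results = [
    ("CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFDWE",
     "|..|||.|...||..-..||.||",
     seq2_line),
    ("CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFDWE",
     "|..|||.|...||.-...||.||",
     seq2_line),
]
print(find_matching_alignment(mine, biopython_results))
```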

##### Local - Affine Gap Penalty

Here is what my function gives:

In [54]:
align_sequences(sample5_sequence1, sample5_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||.-...||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK


Here is what BioPython gives:

In [55]:
align_sequences_biopython(sample5_sequence1, sample5_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 38.0
CNYEILNFRDCRIK-FLVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||.-...||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK

Score: 38.0
CNYEILNFRDCRIKF-LVDAYNPKMRPWLPTICHWKFDWE
|..|||.|...||..-..||.||
CTIEILFFKMHRIPWWFSDALNPCPCKRVNCTSKIARSVWK



As can be seen, the first one is the same as mine.

#### Sample 6

In [56]:
sample6_sequence1, sample6_sequence2 = create_random_sequences()

##### Global - Linear Gap Penalty

Here is what my function gives:

In [57]:
align_sequences(sample6_sequence1, sample6_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|........--||...|...|...|.-|||-||-||.
KSSEWINPP--IQLHIQEPYAHLVPN-YEV-YD-YHC


Here is what BioPython gives:

In [58]:
align_sequences_biopython(sample6_sequence1, sample6_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|........--||...|...|...|.-|||-||-||.
KSSEWINPP--IQLHIQEPYAHLVPN-YEV-YD-YHC

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|.......--.||...|...|...|.-|||-||-||.
KSSEWINP--PIQLHIQEPYAHLVPN-YEV-YD-YHC

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|......-.-.||...|...|...|.-|||-||-||.
KSSEWIN-P-PIQLHIQEPYAHLVPN-YEV-YD-YHC

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|........--||...|...|...|-.|||-||-||.
KSSEWINPP--IQLHIQEPYAHLVP-NYEV-YD-YHC

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|.......--.||...|...|...|-.|||-||-||.
KSSEWINP--PIQLHIQEPYAHLVP-NYEV-YD-YHC

Score: -5.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|......-.-.||...|...|...|-.|||-||-||.
KSSEWIN-P-PIQLHIQEPYAHLVP-NYEV-YD-YHC



As can be seen, the first result is the same as mine.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [59]:
align_sequences(sample6_sequence1, sample6_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 10.0
KNH--YCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|..--.......||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC


Here is what BioPython gives:

In [60]:
align_sequences_biopython(sample6_sequence1, sample6_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 10.0
KNH--YCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|..--.......||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC

Score: 10.0
KNHYC--SKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|....--.....||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC

Score: 10.0
KNHYCSK--MVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
|......--...||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC

Score: 10.0
K-NHYCSKMV-FHIQEGDQDHVAAIPPREYEVLYDWYHQ
|-........-.||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC

Score: 10.0
KN-HYCSKMV-FHIQEGDQDHVAAIPPREYEVLYDWYHQ
|.-.......-.||||---......|--.|||-||-||.
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC



As can be seen, the first one is the same as mine.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [61]:
align_sequences(sample6_sequence1, sample6_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 25.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           ||...|...|...|.-|||-||..
  KSSEWINPPIQLHIQEPYAHLVPN-YEV-YDYHC


Here is what BioPython gives:

In [62]:
align_sequences_biopython(sample6_sequence1, sample6_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 25.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           ||...|...|...|.-|||-||..
  KSSEWINPPIQLHIQEPYAHLVPN-YEV-YDYHC

Score: 25.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           ||...|...|...|-.|||-||..
  KSSEWINPPIQLHIQEPYAHLVP-NYEV-YDYHC

Score: 25.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           ||...|...|...|.-|||-||-||
  KSSEWINPPIQLHIQEPYAHLVPN-YEV-YD-YHC

Score: 25.0
KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           ||...|...|...|-.|||-||-||
  KSSEWINPPIQLHIQEPYAHLVP-NYEV-YD-YHC



As can be seen, the first one is the same as mine.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [63]:
align_sequences(sample6_sequence1, sample6_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
  KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
           .||||---......|--.|||-||..
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YDYHC


Here is what BioPython gives:

In [64]:
align_sequences_biopython(sample6_sequence1, sample6_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
  KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
            ||||---......|--.|||-||..
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YDYHC

Score: 27.0
  KNHYCSKMVFHIQEGDQDHVAAIPPREYEVLYDWYHQ
            ||||---......|--.|||-||-||
KSSEWINPPIQLHIQE---PYAHLVP--NYEV-YD-YHC



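Here the scores are the same, but my alignment string differs slightly from BioPython's first result: mine starts one column earlier, with an extra leading F/L pair (the `.` before `||||`), and is otherwise identical. Since the scores are equal, that extra column contributes nothing to the score.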
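The only visible difference in this case is that my alignment starts one column earlier, pairing F with L before the `HIQE` block. The sketch below checks that this extra column leaves the score of that ungapped block unchanged. It is only an illustration: `BLOSUM62_SUBSET` holds a few BLOSUM62 entries copied by hand for the pairs involved (an assumption, not a lookup through the _blosum_ module), and `score_ungapped` is a hypothetical helper.

```python
# Hand-copied BLOSUM62 entries for the residue pairs used in this illustration.
BLOSUM62_SUBSET = {
    ("F", "L"): 0,
    ("H", "H"): 8,
    ("I", "I"): 4,
    ("Q", "Q"): 5,
    ("E", "E"): 5,
}


def score_ungapped(row1, row2, scores=BLOSUM62_SUBSET):
    """Sum the substitution scores of an ungapped aligned block."""
    total = 0
    for a, b in zip(row1, row2):
        # The matrix is symmetric, so try both orderings of the pair.
        total += scores[(a, b)] if (a, b) in scores else scores[(b, a)]
    return total


with_extra_column = score_ungapped("FHIQE", "LHIQE")   # start of my alignment
without_extra_column = score_ungapped("HIQE", "HIQE")  # start of BioPython's
print(with_extra_column, without_extra_column)
```

Because the F/L entry is 0, both blocks score the same, which is consistent with the two alignments sharing the score 27.0.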

#### Sample 7

In [65]:
sample7_sequence1, sample7_sequence2 = create_random_sequences()

##### Global - Linear Gap Penalty

Here is what my function gives:

In [66]:
align_sequences(sample7_sequence1, sample7_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -2.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDK--NL
|||..-.||..................|....|||.|--..
YCPHA-DGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM


Here is what BioPython gives:

In [67]:
align_sequences_biopython(sample7_sequence1, sample7_sequence2, 'global', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: -2.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDK--NL
|||..-.||..................|....|||.|--..
YCPHA-DGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM

Score: -2.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDK--NL
|||.-..||..................|....|||.|--..
YCPH-ADGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM



As can be seen, the first result is the same as mine.

##### Global - Affine Gap Penalty

Here is what my function gives:

In [68]:
align_sequences(sample7_sequence1, sample7_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDK--NL
|||..-.||.-|....--.||.|.---...|....|||.|--..
YCPHA-DGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM


Here is what BioPython gives:

In [69]:
align_sequences_biopython(sample7_sequence1, sample7_sequence2, 'global', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 19.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDK--NL
|||.-..||.-|...--..||.|.---...|....|||.|--..
YCPH-ADGYA-HVRE--HCRGQTMLFWVNQNIPCTMICFKCPGM

Score: 19.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDK--NL
|||..-.||.-|...--..||.|.---...|....|||.|--..
YCPHA-DGYA-HVRE--HCRGQTMLFWVNQNIPCTMICFKCPGM

Score: 19.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDK--NL
|||.-..||.-|....--.||.|.---...|....|||.|--..
YCPH-ADGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM

Score: 19.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDK--NL
|||..-.||.-|....--.||.|.---...|....|||.|--..
YCPHA-DGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM



As can be seen, the fourth result is the same as mine.

##### Local - Linear Gap Penalty

Here is what my function gives:

In [70]:
align_sequences(sample7_sequence1, sample7_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDKNL
|||..-.||
YCPHA-DGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM


Here is what BioPython gives:

In [71]:
align_sequences_biopython(sample7_sequence1, sample7_sequence2, 'local', {'penalty_type': 'linear', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 27.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDKNL
|||..-.||
YCPHA-DGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM

Score: 27.0
YCPQSSNGYVFHMVKWPMLRGSTRPSHNQAIDMICDKNL
|||.-..||
YCPH-ADGYAHVREHCRGQTMLFWVNQNIPCTMICFKCPGM



As can be seen, the first one is the same as mine.

##### Local - Affine Gap Penalty

Here is what my function gives:

In [72]:
align_sequences(sample7_sequence1, sample7_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 29.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDKNL
|||..-.||.-|....--.||.|.---...|....|||.|
YCPHA-DGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM


Here is what BioPython gives:

In [73]:
align_sequences_biopython(sample7_sequence1, sample7_sequence2, 'local', {'penalty_type': 'affine', 'gap_opening_penalty': -11, 'gap_extension_penalty': -1})

Score: 29.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDKNL
|||.-..||.-|...--..||.|.---...|....|||.|
YCPH-ADGYA-HVRE--HCRGQTMLFWVNQNIPCTMICFKCPGM

Score: 29.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDKNL
|||..-.||.-|...--..||.|.---...|....|||.|
YCPHA-DGYA-HVRE--HCRGQTMLFWVNQNIPCTMICFKCPGM

Score: 29.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDKNL
|||.-..||.-|....--.||.|.---...|....|||.|
YCPH-ADGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM

Score: 29.0
YCPQSSNGYVFHMVKWPMLRGSTR---PSHNQAIDMICDKNL
|||..-.||.-|....--.||.|.---...|....|||.|
YCPHA-DGYA-HVREH--CRGQTMLFWVNQNIPCTMICFKCPGM



As can be seen, the fourth result is the same as mine.

### Conclusion
I successfully implemented the Global Alignment algorithm for both linear and affine gap penalties. I also implemented the Local Alignment algorithm for both penalty types. The scores I get always agree with BioPython's, but for local alignment the alignment strings are sometimes slightly different. When BioPython reports several alignments with the same score, my result usually matches one of them, though not always the first. Overall, in the tests I ran, my algorithms work almost perfectly.
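One plausible reason why two correct implementations can print different alignment strings with the same score is tie-breaking during traceback: when more than one move reaches a cell with the optimal value, the order in which the moves are tried decides which co-optimal alignment is reported. The sketch below illustrates this on a toy example. It is not my implementation from above: it uses simple match/mismatch scores instead of BLOSUM62, a linear gap penalty, and a hypothetical `prefer` parameter that sets the tie-breaking order.

```python
def nw_with_preference(seq1, seq2, prefer, match=1, mismatch=-1, gap=-1):
    """Toy global alignment; `prefer` orders the moves tried when several tie."""
    n, m = len(seq1), len(seq2)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap

    def sub(i, j):
        return match if seq1[i - 1] == seq2[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(table[i - 1][j - 1] + sub(i, j),
                              table[i - 1][j] + gap,
                              table[i][j - 1] + gap)

    # Traceback: collect every move consistent with the table, then pick
    # the first one in the caller's preference order.
    row1, row2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        valid = set()
        if i > 0 and j > 0 and table[i][j] == table[i - 1][j - 1] + sub(i, j):
            valid.add("diag")
        if i > 0 and table[i][j] == table[i - 1][j] + gap:
            valid.add("up")
        if j > 0 and table[i][j] == table[i][j - 1] + gap:
            valid.add("left")
        move = next(d for d in prefer if d in valid)
        if move == "diag":
            row1.append(seq1[i - 1])
            row2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            row1.append(seq1[i - 1])
            row2.append("-")
            i -= 1
        else:
            row1.append("-")
            row2.append(seq2[j - 1])
            j -= 1
    return table[n][m], "".join(reversed(row1)), "".join(reversed(row2))


print(nw_with_preference("GA", "GAA", prefer=("diag", "up", "left")))  # (1, 'G-A', 'GAA')
print(nw_with_preference("GA", "GAA", prefer=("left", "up", "diag")))  # (1, 'GA-', 'GAA')
```

Both calls return the same score, but the gap lands in a different column, which is the same kind of difference seen between my outputs and BioPython's alternative results.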

### References
1) Lecture Notes
2) [Wikipedia: Needleman-Wunsch Algorithm](https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm)
3) [Wikipedia: Smith-Waterman Algorithm](https://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm)
4) [CMU School of Computer Science Bioinfo-Lectures](https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/align.py)
5) [@xuyk's _Rosalind_ GitHub Repo](https://github.com/xuyk/Rosalind)