# INTRODUCTION

Project Title: Pairwise Sequence Alignment: Global and Local Alignment Using Python

Introduction:
Pairwise sequence alignment is a fundamental technique in bioinformatics used to identify regions of similarity between two sequences, such as DNA, RNA, or proteins. These similarities can provide insights into functional, structural, and evolutionary relationships. There are two primary types of pairwise alignment: global alignment and local alignment.

Global alignment aims to align sequences along their entire length, which is useful when comparing sequences of similar length and overall similarity.
Local alignment focuses on finding the most similar regions within two sequences, which is particularly useful when comparing sequences with different lengths or when only certain regions are conserved.
Both methods help in understanding biological function and evolutionary relationships, and in identifying conserved motifs or domains in sequences.

Aim:
The aim of this project is to implement both global and local pairwise sequence alignment algorithms in Python, providing a tool to compare biological sequences and identify regions of similarity.

Objectives:
To implement the Needleman-Wunsch algorithm for global alignment of two sequences.
To implement the Smith-Waterman algorithm for local alignment of two sequences.

# **PAIRWISE ALIGNMENT**

In [9]:
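# Biopython modules used in this notebook.
import Bio
from Bio import SeqIO, Entrez
from Bio.Seq import Seq
from Bio.Blast import NCBIWWW, NCBIXML
from Bio import SearchIO
# Note: pairwise2 is deprecated in recent Biopython releases in favor of Bio.Align.PairwiseAligner.
from Bio import pairwise2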

# GLOBAL ALIGNMENT
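
The Needleman-Wunsch algorithm named in the objectives fills a dynamic-programming table in which each cell holds the best score for aligning two prefixes. The following is a minimal sketch of that recurrence, not the code that `pairwise2` runs internally. The function name `needleman_wunsch_score` and the single linear `gap` cost are assumptions made for illustration; the `pairwise2` calls in this notebook use separate gap-open and gap-extend penalties instead.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of a and b.

    Hypothetical helper for illustration: uses one linear gap cost,
    unlike the open/extend penalties passed to pairwise2.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Global alignment: the answer is the cell covering both full sequences.
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

Because the whole of both sequences must be aligned, the result is read from the bottom-right cell, and leading or trailing gaps are charged like any other gap.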

In [3]:
seq1= Seq("ATCGTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG")
seq2= Seq("GATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA")

In [4]:

# globalxx: global alignment with match = 1, mismatch = 0, and no gap penalties.
alignments = pairwise2.align.globalxx(seq1, seq2)

for alignment in alignments:
    print(pairwise2.format_alignment(*alignment))

-ATCGTA-CGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG-
 |||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
GATCG-ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA
  Score=59

-ATCG-TACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG-
 |||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
GATCGAT-CGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA
  Score=59



In [5]:
# globalms: match = 2, mismatch = -1, gap open = -0.5, gap extend = -0.1.
alignments = pairwise2.align.globalms(seq1, seq2, 2, -1, -0.5, -0.1)

for alignment in alignments:
    print(pairwise2.format_alignment(*alignment))

-ATCGTA-CGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG-
 |||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
GATCG-ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA
  Score=116

-ATCG-TACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG-
 |||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
GATCGAT-CGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA
  Score=116
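
The two global scores can be checked by hand. Each alignment has 59 matching columns, no mismatches, and four separate one-column gaps, so `globalxx` gives 59 × 1 = 59 and `globalms` gives 59 × 2 − 4 × 0.5 = 116. The sketch below re-scores the first alignment column by column. The helper `score_alignment` is hypothetical, and it assumes that the first position of a gap is charged only the open penalty and that end gaps are penalized, which is consistent with the scores printed above. The aligned strings are the ones printed above, with the `ATCG` repeats written compactly.

```python
def score_alignment(a, b, match, mismatch, gap_open, gap_extend):
    """Score two already-aligned strings of equal length ('-' marks a gap).

    Hypothetical helper: the first column of a gap costs gap_open,
    each further column of the same gap costs gap_extend.
    """
    if len(a) != len(b):
        raise ValueError("aligned strings must have the same length")

    score = 0.0
    prev_gap = None  # which string ("a" or "b") had a gap in the previous column
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            gap_in = "a" if x == "-" else "b"
            score += gap_extend if gap_in == prev_gap else gap_open
            prev_gap = gap_in
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score


# First global alignment from the output above.
aligned1 = "-ATCGTA-CG" + "ATCG" * 13 + "-"
aligned2 = "GATCG-ATCG" + "ATCG" * 13 + "A"

print(score_alignment(aligned1, aligned2, 1, 0, 0, 0))         # globalxx parameters
print(score_alignment(aligned1, aligned2, 2, -1, -0.5, -0.1))  # globalms parameters
```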



# LOCAL ALIGNMENT
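
The Smith-Waterman algorithm differs from Needleman-Wunsch in two ways: no cell is allowed to drop below zero, so an alignment can start anywhere, and the result is the highest value anywhere in the table, so it can end anywhere. The following is a minimal sketch of that idea rather than the `pairwise2` implementation. The function name `smith_waterman_score` and the single linear `gap` cost are assumptions made for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between a and b.

    Hypothetical helper for illustration: uses one linear gap cost,
    unlike the open/extend penalties passed to pairwise2.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best = 0

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                           # start a new alignment here
                score[i - 1][j - 1] + pair,  # extend with a match or mismatch
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            best = max(best, score[i][j])

    # Local alignment: the answer is the best cell anywhere in the table.
    return best


print(smith_waterman_score("AAAGGGTTT", "CCCGGGCCC"))
```

In the example call only the shared `GGG` stretch contributes to the score; the unrelated flanking regions are ignored instead of being penalized.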

In [6]:
seq1 = Seq("TAAGGATGCGTATAGCTTGGCTT")
seq2 = Seq("GGATCGAAGCTTGGCTTAGCTT")

In [7]:
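# localxx: local alignment with match = 1, mismatch = 0, and no gap penalties.
# With free gaps, many gap placements tie for the best score, so several alignments are returned.
alignments = pairwise2.align.localxx(seq1, seq2)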

for alignment in alignments:
    print(pairwise2.format_alignment(*alignment))

4 GGATGCGTATAGCTTGGCT-----T
  |||| || | |||||||||     |
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTTGGC-T----T
  |||| || | |||||||| |    |
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTTGGC-----TT
  |||| || | ||||||||     ||
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTTGG-----CTT
  |||| || | |||||||     |||
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTTG-----GCTT
  |||| || | ||||||     ||||
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTT-G----GCTT
  |||| || | ||||| |    ||||
1 GGAT-CG-A-AGCTTGGCTTAGCTT
  Score=17

4 GGATGCGTATAGCTTGGCT----T
  |||| || | |||||||||    |
1 GGAT-CG-A-AGCTTGGCTTAGCT
  Score=17

4 GGATGCGTATAGCTTGGC-T---T
  |||| || | |||||||| |   |
1 GGAT-CG-A-AGCTTGGCTTAGCT
  Score=17

4 GGATGCGTATAGCTTGGCTT
  |||| || | ||||||||||
1 GGAT-CG-A-AGCTTGGCTT
  Score=17



In [8]:
# localms: match = 2, mismatch = -1, gap open = -0.5, gap extend = -0.1.
alignments = pairwise2.align.localms(seq1, seq2, 2, -1, -0.5, -0.1)

for alignment in alignments:
    print(pairwise2.format_alignment(*alignment))

4 GGATGCGTATAGCTTGGCTT
  |||| || | ||||||||||
1 GGAT-CG-A-AGCTTGGCTT
  Score=32.5

