SeqAlignR is a sequence alignment and visualization tool designed to make sequence alignment algorithms, such as the Needleman-Wunsch algorithm, easier to understand. It draws the alignment matrix with arrows tracing the path of each alignment, showing how the algorithm scores candidate alignments and identifies the optimal ones between two sequences.
To use the SeqAlignR package, simply run:
install.packages("SeqAlignR")
library(SeqAlignR)
Define the two sequences to align.
seq1 <- "GCATGCG"
seq2 <- "GATTACA"
Then run the alignment algorithm. Here we also specify d (gap penalty), mismatch, and match; see the Needleman-Wunsch Wikipedia article for details on the algorithm.
# Run the Needleman-Wunsch algorithm
alignment1 <- align_sequences(seq1, seq2, d = -1, mismatch = -1, match = 1, method = "needleman")
A plot displaying the alignment matrix of seq1 and seq2 can then be generated.
# Plot the matrix
plot(alignment1)
The first sequence, seq1, is represented by the columns and the second sequence, seq2, by the rows. The first column and first row are left blank, representing a gap. Each cell in the matrix displays the score. The subtitle states the match, mismatch, and gap penalty d values used in the algorithm. Red arrows indicate a mismatch, blue arrows a match, and green arrows a gap. The alignment(s) with the highest score are highlighted with thick gray borders.
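To make the cell scores in the plot concrete, the matrix can be reproduced with a few lines of base R. This is a minimal sketch of the standard Needleman-Wunsch recurrence, not SeqAlignR's own implementation; the function name nw_score_matrix is hypothetical, and its arguments simply mirror the d, mismatch, and match values used above.

```r
# Base-R sketch of the Needleman-Wunsch score matrix.
# nw_score_matrix is a hypothetical helper, not part of SeqAlignR.
nw_score_matrix <- function(seq1, seq2, d = -1, mismatch = -1, match = 1) {
  a <- strsplit(seq1, "")[[1]]  # seq1 indexes the columns
  b <- strsplit(seq2, "")[[1]]  # seq2 indexes the rows
  m <- matrix(0, nrow = length(b) + 1, ncol = length(a) + 1,
              dimnames = list(c("-", b), c("-", a)))
  # First row and first column: a prefix aligned against gaps only
  m[1, ] <- d * (0:length(a))
  m[, 1] <- d * (0:length(b))
  for (i in seq_along(b)) {
    for (j in seq_along(a)) {
      s <- if (a[j] == b[i]) match else mismatch
      m[i + 1, j + 1] <- max(m[i, j] + s,      # diagonal step: match or mismatch
                             m[i, j + 1] + d,  # vertical step: gap in seq1
                             m[i + 1, j] + d)  # horizontal step: gap in seq2
    }
  }
  m
}

scores <- nw_score_matrix("GCATGCG", "GATTACA")
scores                                # full matrix, rows = seq2, columns = seq1
scores[nrow(scores), ncol(scores)]    # bottom-right cell: optimal global score
```

The bottom-right cell holds the score of the best global alignment, so it should agree with the maximum score that SeqAlignR reports for the same parameters.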
The alignments can also be printed.
# Print alignment
print(alignment1)
Alignments with a max score of 0
GCA-TGCG
| | | |
G-ATTACA

GCAT-GCG
| ||  |
G-ATTACA

GCATG-CG
| ||  |
G-ATTACA
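As a sanity check, the score of each printed alignment can be recomputed column by column. The sketch below assumes the same parameters as above (match = 1, mismatch = -1, d = -1); score_alignment is a hypothetical helper written for illustration and is not part of SeqAlignR.

```r
# Recompute the score of a gapped alignment, one column at a time.
# score_alignment is a hypothetical helper, not a SeqAlignR function.
score_alignment <- function(top, bottom, d = -1, mismatch = -1, match = 1) {
  x <- strsplit(top, "")[[1]]
  y <- strsplit(bottom, "")[[1]]
  stopifnot(length(x) == length(y))   # aligned strings must have equal length
  gap  <- x == "-" | y == "-"         # column contains a gap
  same <- x == y & !gap               # column is a match
  sum(ifelse(gap, d, ifelse(same, match, mismatch)))
}

# The three alignments printed above
printed <- list(c("GCA-TGCG", "G-ATTACA"),
                c("GCAT-GCG", "G-ATTACA"),
                c("GCATG-CG", "G-ATTACA"))
totals <- vapply(printed, function(p) score_alignment(p[1], p[2]), numeric(1))
totals
```

Each alignment contains four matches, two mismatches, and two gaps, so every total is 4 - 2 - 2 = 0, matching the reported maximum score.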