DNA-Pattern-Matching

Analysing different pattern-searching algorithms for finding substring occurrences in a given main DNA string.

Problem Statement

Given a long DNA string, find all occurrences of a chosen substring.
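
To make the problem concrete, here is a minimal sketch of the Naive/Brute Force approach. It is not taken from the project notebook; the function name `naive_search` and the sample strings are illustrative assumptions. It checks every alignment of the substring against the main string and reports each start index, including overlapping matches.

```python
def naive_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every (possibly overlapping) occurrence.

    Hypothetical helper for illustration; worst case O(n * m) comparisons.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Try every alignment of the pattern against the text.
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


print(naive_search("ACGTACGTAC", "ACGT"))  # [0, 4]
```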

Implemented three algorithms: the Finite Automata (FA) algorithm, the Boyer-Moore-Horspool algorithm, and the Naive/Brute Force algorithm.
Calculated and compared the time complexities of the three algorithms for different sizes of the main DNA string and the substring.
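
As an illustration of the FA approach, the sketch below builds a transition table for the substring and then scans the main string once, one character at a time. It is an assumption-laden example rather than the notebook's code: the names `build_transitions` and `fa_search`, and the default alphabet `"ACGT"`, are hypothetical.

```python
def build_transitions(pattern: str, alphabet: str = "ACGT") -> list[dict[str, int]]:
    """Build a DFA table where table[state][char] is the next state.

    State k means "the last k characters read match pattern[:k]".
    """
    m = len(pattern)
    # Assumption: include any pattern characters missing from the alphabet.
    symbols = sorted(set(alphabet) | set(pattern))
    table = [{c: 0 for c in symbols} for _ in range(m + 1)]
    if m == 0:
        return table
    table[0][pattern[0]] = 1
    fallback = 0  # state to restart from after a mismatch
    for state in range(1, m + 1):
        # On a mismatch, behave as the fallback state would.
        for c in symbols:
            table[state][c] = table[fallback][c]
        if state < m:
            # On a match, advance to the next state.
            table[state][pattern[state]] = state + 1
            fallback = table[fallback][pattern[state]]
    return table


def fa_search(text: str, pattern: str, alphabet: str = "ACGT") -> list[int]:
    """Return all start indices of pattern in text using the DFA."""
    m = len(pattern)
    if m == 0:
        return []
    table = build_transitions(pattern, alphabet)
    matches, state = [], 0
    for i, c in enumerate(text):
        # Characters outside the table (e.g. 'N') reset the automaton.
        state = table[state].get(c, 0)
        if state == m:
            matches.append(i - m + 1)
    return matches


print(fa_search("ACGTACGTAC", "ACGT"))  # [0, 4]
```

Each text character triggers exactly one table lookup, so the scan is linear in the length of the main string once the table is built.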

The FA algorithm is better suited to general pattern searching, and its time complexity remains stable regardless of the size of the main DNA string and the substring.
The Boyer-Moore-Horspool algorithm is better suited to substrings made up of a few types of characters. Otherwise, its time complexity is similar to that of the Naive/Brute Force algorithm.
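
For comparison, here is a minimal sketch of Boyer-Moore-Horspool, again not taken from the notebook; `horspool_search` is a hypothetical name. It precomputes a shift for each character of the substring and, after every alignment, skips ahead by the shift of the text character under the substring's last position. For brevity it compares the window with a slice rather than character by character from the right.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return all start indices of pattern in text (Horspool variant)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Distance from the last occurrence of each character (excluding the
    # final position) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    matches = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            matches.append(i)
        # Characters absent from the table allow a full-length shift of m.
        i += shift.get(text[i + m - 1], m)
    return matches


print(horspool_search("GATTACAGATTACA", "TTA"))  # [2, 9]
```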
