Skip to content

Analysing different pattern-searching algorithms when finding for substring occurrences in a given DNA main string

Notifications You must be signed in to change notification settings

nikita-bachhas/DNA-Pattern-Matching

Repository files navigation

DNA-Pattern-Matching

Analysing different pattern-searching algorithms when finding for substring occurrences in a given DNA main string

Problem Statement

Give a long DNA string, find all occurrences of a chosen substring

Functionality

  1. Implemented three different algorithms: Finite Automata (FA) algorithm, Boyer Moore Horspool algorithm and the Naive/Brute Force algorithm
  2. Calculated and compared time complexities for the three different algorithms on different sizes of main DNA string and substrings

Conclusion

Screenshot 2022-08-07 at 2 54 06 PM

  1. FA algorithm is more suitable to be used for general pattern searching and has a stable time complexity regardless of the size of the main DNA string and the substrings
  2. Boyer Moore Horspool algorithm is more suitable for substrings with a few types of characters. Otherwise, this has a similar time complexity to the Naive/Brute Force algorithm

Documentation

  1. Full detailed report of the project: CZ2001 Project 1 Report.pdf
  2. Summary and presentation of the project: Project 1 Presentation.pptx

Developed by:

  1. Ang Shu Hui
  2. Bachhas Nikita
  3. Kundu Koushani
  4. Leonardo Irvin Pratama

About

Analysing different pattern-searching algorithms when finding for substring occurrences in a given DNA main string

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published