String Related Algorithms

Substring Search

Basic implementations of several substring search algorithms:

| Algorithm | Explanation | Cycles per iteration | Time for 1M iterations (s) |
| --- | --- | --- | --- |
| STL Find | | 166K | 6.25 |
| Brute Force | Treat every position in the text as a starting point and check for a match (2 loops). | 164K | 6.17 |
| Brute Force Optimized | Use a single loop and backtrack both the text and pattern indices on a mismatch. | 151K | 5.68 |
| KMP | Uses a Deterministic Finite Automaton (DFA) to create a prefix table for searching (longest prefix which is also a suffix). Best for long texts and patterns over a small radix (e.g. DNA sequences). | 156K | 5.86 |
| Boyer-Moore | Scans pattern characters from right to left and skips up to M characters (the pattern length) on a mismatch. Best for longer patterns over a larger radix. | 212K | 7.96 |
| Rabin-Karp (Monte Carlo approach) | Uses a rolling hash to compare the pattern hash against the hash at each position in the text. The Monte Carlo approach tolerates hash false positives (a hash match is not verified against the text). | 378K | 14.19 |
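
As a rough illustration of three of the rows above, here is a minimal sketch of the single-loop brute force, KMP with a prefix table, and Monte Carlo Rabin-Karp. It is not the code from `substr_match.cc` or `substring.h`: the function names, the `kNotFound` sentinel, and the hash constants (`R = 256`, `Q = 1000000007`) are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed sentinel for "no match" (same value as std::string::npos).
constexpr std::size_t kNotFound = std::string::npos;

// Brute force, single loop: on a mismatch, rewind the text index to one
// past the start of the current attempt and restart the pattern.
std::size_t brute_force_search(const std::string& text, const std::string& pat) {
    const std::size_t n = text.size(), m = pat.size();
    std::size_t i = 0, j = 0;
    while (i < n && j < m) {
        if (text[i] == pat[j]) {
            ++i;
            ++j;
        } else {
            i = i - j + 1;  // backtrack the text index
            j = 0;          // backtrack the pattern index
        }
    }
    return j == m ? i - m : kNotFound;
}

// KMP prefix table: lps[i] is the length of the longest proper prefix of
// pat[0..i] that is also a suffix of pat[0..i].
std::vector<std::size_t> build_prefix_table(const std::string& pat) {
    std::vector<std::size_t> lps(pat.size(), 0);
    std::size_t len = 0;
    std::size_t i = 1;
    while (i < pat.size()) {
        if (pat[i] == pat[len]) {
            ++len;
            lps[i] = len;
            ++i;
        } else if (len > 0) {
            len = lps[len - 1];  // fall back to a shorter border
        } else {
            lps[i] = 0;
            ++i;
        }
    }
    return lps;
}

// KMP search: the text index never moves backwards.
std::size_t kmp_search(const std::string& text, const std::string& pat) {
    if (pat.empty()) return 0;
    const std::vector<std::size_t> lps = build_prefix_table(pat);
    std::size_t j = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        while (j > 0 && text[i] != pat[j]) j = lps[j - 1];
        if (text[i] == pat[j]) ++j;
        if (j == pat.size()) return i + 1 - j;
    }
    return kNotFound;
}

// Rabin-Karp, Monte Carlo variant: a hash match is reported without
// verifying the characters, so a false positive is possible in principle.
std::size_t rabin_karp_search(const std::string& text, const std::string& pat) {
    const std::size_t n = text.size(), m = pat.size();
    if (m == 0) return 0;
    if (m > n) return kNotFound;
    const std::uint64_t R = 256, Q = 1000000007ULL;  // assumed radix and prime
    std::uint64_t rm = 1;                            // R^(m-1) mod Q
    for (std::size_t k = 1; k < m; ++k) rm = (rm * R) % Q;
    std::uint64_t hp = 0, ht = 0;
    for (std::size_t k = 0; k < m; ++k) {
        hp = (hp * R + static_cast<unsigned char>(pat[k])) % Q;
        ht = (ht * R + static_cast<unsigned char>(text[k])) % Q;
    }
    if (hp == ht) return 0;
    for (std::size_t i = m; i < n; ++i) {
        // Remove the leading character, then append the trailing one.
        ht = (ht + Q - rm * static_cast<unsigned char>(text[i - m]) % Q) % Q;
        ht = (ht * R + static_cast<unsigned char>(text[i])) % Q;
        if (hp == ht) return i + 1 - m;
    }
    return kNotFound;
}
```

All three functions return the index of the first match. A Las Vegas variant of Rabin-Karp would compare the actual characters after each hash match, trading some speed for a guaranteed-correct result.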

Note:
- Micro-benchmarks were measured on an Ubuntu 16.04 VM running on an Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, compiled with GCC 5.4.0 as C++11 with optimization flags (-O3 -Os, -march=native).
- The data in the table was obtained from 1M randomized test strings with a text size of 100K characters and a pattern size of 100.
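
A timing loop for this kind of measurement could look roughly like the sketch below. It is not the repository's benchmark harness: `random_string`, `time_search`, the lowercase alphabet, and the fixed seed are all assumptions for illustration, and it measures wall-clock time only, not cycle counts.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: a random lowercase string of the given length.
std::string random_string(std::size_t len, std::mt19937& rng) {
    std::uniform_int_distribution<int> dist(0, 25);
    std::string s(len, 'a');
    for (char& c : s) c = static_cast<char>('a' + dist(rng));
    return s;
}

// Hypothetical harness: runs `iterations` searches, each on a freshly
// generated text and pattern, and returns the seconds spent inside
// `search` (string generation is excluded from the timing).
template <typename SearchFn>
double time_search(SearchFn search, std::size_t iterations,
                   std::size_t text_len, std::size_t pat_len,
                   unsigned seed = 42) {
    std::mt19937 rng(seed);
    volatile std::size_t sink = 0;  // keeps the compiler from dropping the call
    auto total = std::chrono::steady_clock::duration::zero();
    for (std::size_t it = 0; it < iterations; ++it) {
        const std::string text = random_string(text_len, rng);
        const std::string pat = random_string(pat_len, rng);
        const auto start = std::chrono::steady_clock::now();
        sink = sink + search(text, pat);
        total += std::chrono::steady_clock::now() - start;
    }
    return std::chrono::duration<double>(total).count();
}
```

For example, `time_search([](const std::string& t, const std::string& p) { return t.find(p); }, 1000000, 100000, 100)` would time `std::string::find` with the text and pattern sizes quoted in the note, although results on other hardware and compilers will differ from the table.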