# Knuth-Morris-Pratt (KMP) Algorithm

* the substring search problem: we want to find the occurrences of a word (pattern) w of length M within a text S of length N.
  * for example, the word **computer** in 'My name is Adam and I am a huge fan of **computer** science'
* the brute-force approach and the basic Boyer-Moore algorithm have O(N*M) worst-case running time.
* the Knuth-Morris-Pratt algorithm has O(N+M) linear running time complexity, even in the worst case.
* it was first published in 1977.
* the algorithm must preprocess the pattern, which takes O(M) running time and O(M) additional memory.
  * this is when we construct the **partial match table** (also called the failure function)
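
For comparison, the brute-force approach mentioned above can be sketched as follows. This is a minimal illustration, not code from the original notes: the function name `brute_force_search` and the choice to return an empty list for an empty pattern are assumptions made for this example.

```python
from typing import List


def brute_force_search(text: str, pattern: str) -> List[int]:
    """Return every index at which pattern occurs in text (naive method)."""
    n, m = len(text), len(pattern)
    matches: List[int] = []
    if m == 0:
        return matches  # convention assumed for this sketch
    # Try every possible alignment of the pattern against the text.
    for start in range(n - m + 1):
        j = 0
        # Compare character by character until a mismatch or a full match.
        while j < m and text[start + j] == pattern[j]:
            j += 1
        if j == m:
            matches.append(start)
    return matches


text = 'My name is Adam and I am a huge fan of computer science'
print(brute_force_search(text, 'computer'))  # [39]
print(brute_force_search('aaaa', 'aa'))      # [0, 1, 2]
```

Every alignment restarts the comparison from the first pattern character, which is where the O(N*M) worst case comes from (for instance a text of many `a` characters and a pattern such as `aaab`).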

### How can we boost the brute-force algorithm?

* We have to analyze the prefixes and suffixes of the pattern
  * <u>Prefix:</u> a prefix of a string is any substring that starts at its first character.
      * If the pattern is the word **apple** then the prefixes of the pattern are **[a, ap, app, appl, apple]**
  * <u>Suffix:</u> a suffix of a string is any substring that ends at its last character.
      * If the pattern is the word **apple** then the suffixes of the pattern are **[e, le, ple, pple, apple]**

  * The **Knuth-Morris-Pratt** algorithm's preprocessing stage analyzes the pattern and checks whether any of its proper prefixes (prefixes shorter than the whole pattern) is also a suffix
    * we look for the longest proper prefix that is also a suffix.

    * This is how the algorithm can reduce the number of comparisons: after a mismatch, the part of the text that already matched does not have to be scanned again.
    
The &#960; table of a pattern P encapsulates knowledge about how the pattern matches against shifts of itself. This information can be used to avoid useless shifts of the pattern P.
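
To make the prefix/suffix idea concrete, here is a small sketch that lists every proper prefix of a pattern that is also a suffix (a proper prefix is one shorter than the whole pattern). The helper name `borders` is hypothetical and was picked for this example only.

```python
from typing import List


def borders(pattern: str) -> List[str]:
    """Proper prefixes of pattern that are also suffixes, shortest first."""
    found: List[str] = []
    # k is the candidate length; k < len(pattern) keeps the prefix proper.
    for k in range(1, len(pattern)):
        if pattern[:k] == pattern[-k:]:
            found.append(pattern[:k])
    return found


print(borders('apple'))  # []
print(borders('abcab'))  # ['ab']
print(borders('aaaa'))   # ['a', 'aa', 'aaa']
```

The last element of the list, when there is one, is the longest such prefix, and its length is the quantity the preprocessing stage is interested in.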
     

```python
from typing import List


def get_prefixes(s: str) -> List[str]:
    # All non-empty prefixes of s, shortest first: s[:1], s[:2], ..., s itself.
    return [s[:i] for i in range(1, len(s) + 1)]


def get_suffixes(s: str) -> List[str]:
    # All non-empty suffixes of s, shortest first: s[-1:], s[-2:], ..., s itself.
    return [s[i:] for i in range(len(s) - 1, -1, -1)]


s = 'apple'

prefixes = get_prefixes(s)
suffixes = get_suffixes(s)

print('prefixes: %s' % prefixes)
print('suffixes: %s' % suffixes)
```

```
prefixes: ['a', 'ap', 'app', 'appl', 'apple']
suffixes: ['e', 'le', 'ple', 'pple', 'apple']
```


### Partial Match Table (or the &#960; table)

#### Example 1
![Alt Text](./imgs/knuth1.png)


#### Example 2 
![Alt Text](./imgs/knuth2.png)
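
The table can also be computed directly from its definition, which may help when checking examples by hand. The sketch below assumes the common 0-based convention in which entry `i` holds the length of the longest proper prefix of `pattern[:i + 1]` that is also a suffix of it; the figures above may index the table differently. The name `naive_pi_table` is hypothetical, and this version is deliberately simple rather than O(M).

```python
from typing import List


def naive_pi_table(pattern: str) -> List[int]:
    """pi[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of pattern[:i + 1]."""
    pi: List[int] = []
    for i in range(len(pattern)):
        sub = pattern[:i + 1]
        best = 0
        for k in range(1, len(sub)):
            if sub[:k] == sub[-k:]:
                best = k  # later hits are longer, so keep overwriting
        pi.append(best)
    return pi


print(naive_pi_table('ababaca'))  # [0, 0, 1, 2, 3, 0, 1]
print(naive_pi_table('aaaa'))     # [0, 1, 2, 3]
```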

### The Algorithm Visualization 

![Alt Text](./imgs/knuth3.png)


![Alt Text](./imgs/knuth4.png)
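
The search phase can be sketched in code as follows. This is a minimal illustration under stated assumptions, not necessarily the exact procedure shown in the figures: the names `build_pi_table` and `kmp_search` are chosen for this example, the table uses 0-based indexing, and the table is built with the usual linear-time construction so that the block is self-contained.

```python
from typing import List


def build_pi_table(pattern: str) -> List[int]:
    """pi[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it (assumed 0-based convention)."""
    pi = [0] * len(pattern)
    k = 0  # length of the prefix currently matched against the suffix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = pi[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        pi[i] = k
    return pi


def kmp_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []  # convention assumed for this sketch
    pi = build_pi_table(pattern)
    matches: List[int] = []
    j = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # The text index i never moves backwards; only j is reduced.
        while j > 0 and ch != pattern[j]:
            j = pi[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = pi[j - 1]  # keep going to find overlapping matches
    return matches


print(kmp_search('abababa', 'aba'))  # [0, 2, 4]
```

Because the text index only moves forward and `j` can fall back no more often than it has advanced, the search does O(N) work on top of the O(M) preprocessing.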

### Constructing the &#960; Table