# Strings
*Author: Jacob Park*

A string is a zero-indexed array of $n$ characters.

- **Substring**: A consecutive sequence of characters in a string.
    - e.g., 'TOM' is a substring of 'PSYCHOTOMIMETIC'.
- **Subsequence**: A sequence of characters taken from a string in their original order, not necessarily contiguous.
    - e.g., 'TMM' is a subsequence of 'PSYCHOTOMIMETIC'.
- **Prefix**: A substring containing the first character of a string.
    - e.g., 'PSYCHO' is a prefix of 'PSYCHOTOMIMETIC'.
- **Suffix**: A substring containing the last character of a string.
    - e.g., 'MIMETIC' is a suffix of 'PSYCHOTOMIMETIC'.
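
The four definitions above can be checked directly in Java. The following is a minimal sketch: the class name `StringTerms` and the helper `isSubsequence` are hypothetical names introduced for illustration, while `contains`, `startsWith`, and `endsWith` are standard `String` methods.

```java
public class StringTerms {

    // True if candidate's characters appear in text in the same order, gaps allowed.
    public static boolean isSubsequence(String candidate, String text) {
        int matched = 0;
        for (int i = 0; i < text.length() && matched < candidate.length(); i++) {
            if (text.charAt(i) == candidate.charAt(matched)) {
                matched++;
            }
        }
        return matched == candidate.length();
    }

    public static void main(String[] args) {
        String text = "PSYCHOTOMIMETIC";
        System.out.println(text.contains("TOM"));        // substring check
        System.out.println(isSubsequence("TMM", text));  // subsequence check
        System.out.println(text.startsWith("PSYCHO"));   // prefix check
        System.out.println(text.endsWith("MIMETIC"));    // suffix check
    }
}
```

Every substring is also a subsequence, but not the reverse: 'TMM' is a subsequence of 'PSYCHOTOMIMETIC' without being a substring of it.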

## Polynomial Hashing
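A polynomial hash maps a string to an integer modulo $m$ using a base $p$. Precomputing the prefix hashes $\text{h}$ and the powers $\text{p}$ in $O(n)$ lets the hash of any substring be computed in $O(1)$. See [Rolling hash](https://en.wikipedia.org/wiki/Rolling_hash) on Wikipedia.
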
$$
\begin{align}
H(\text{s}) &= \sum_{i=0}^{n-1} \text{s}\left[i\right] \cdot p^{n-1-i} \pmod m \\
\text{h}\left[0\right] &= \text{s}\left[0\right] \\
\text{h}\left[k\right] &= \left(\text{h}\left[k-1\right] \cdot p + \text{s}\left[k\right]\right) \pmod m\\
\text{p}\left[0\right] &= 1 \\
\text{p}\left[k\right] &= \left(\text{p}\left[k-1\right] \cdot p\right) \pmod m \\
H(\text{s}\left[a \ldots b\right]) &= \left(\text{h}\left[b\right] - \text{h}\left[a-1\right] \cdot \text{p}\left[b-a+1\right]\right) \pmod m, \quad a \geq 1 \\
H(\text{s}\left[0 \ldots b\right]) &= \text{h}\left[b\right]
\end{align}
$$
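
As a small worked example of the recurrences above, assume $p = 31$, the string 'abc' with character codes 97, 98, 99, and a modulus $m$ large enough that no reduction occurs (these values are chosen purely for illustration):

$$
\begin{align}
\text{h}\left[0\right] &= 97 \\
\text{h}\left[1\right] &= 97 \cdot 31 + 98 = 3105 \\
\text{h}\left[2\right] &= 3105 \cdot 31 + 99 = 96354 \\
\text{p}\left[2\right] &= 31^{2} = 961 \\
H(\text{s}\left[1 \ldots 2\right]) &= 96354 - 97 \cdot 961 = 3137
\end{align}
$$

Hashing 'bc' on its own gives $98 \cdot 31 + 99 = 3137$, which agrees with the value obtained from the prefix hashes.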

In [1]:
package strings.polynomial_hashing;

import java.util.*;

public class PolynomialHasher {

    private static final long PA = 31;
    private static final long PB = 37;
    private static final long MA = 1_000_000_007L;
    private static final long MB = 1_000_000_009L;

    private final String string;
    private final long[] powers;
    private final long[] hashes;
    private final long p;
    private final long m;

    // alphaFlag selects the (base, modulus) pair: (PA, MA) when true, (PB, MB) when false.
    public PolynomialHasher(String string, boolean alphaFlag) {
        this.string = string;
        this.powers = new long[string.length()];
        this.hashes = new long[string.length()];
        this.p = (alphaFlag) ? PA : PB;
        this.m = (alphaFlag) ? MA : MB;
        initialize();
    }

    private void initialize() {
        if (string.isEmpty()) {
            return;
        }
        powers[0] = 1;
        hashes[0] = string.charAt(0);
        for (int index = 1; index < string.length(); index++) {
            powers[index] = (powers[index - 1] * p) % m;
            hashes[index] = (hashes[index - 1] * p + string.charAt(index)) % m;
        }
    }

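    // Returns the hash of string[beginIndex..endIndex], with both indices inclusive.
    public long hash(int beginIndex, int endIndex) {
        if (string.isEmpty()) {
            return 0;
        }
        // endIndex is inclusive, so it must be strictly less than the string's length.
        if (beginIndex < 0 || beginIndex > endIndex || endIndex >= string.length()) {
            throw new IllegalArgumentException();
        }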
        if (beginIndex == 0) {
            return hashes[endIndex];
        } else {
            // Note: Java's % operator returns the remainder which can be negative.
            // Accordingly, A mod B == ((A % B) + B) % B
            return ((hashes[endIndex] - hashes[beginIndex - 1] * powers[endIndex - beginIndex + 1]) % m + m) % m;
        }
    }
}

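
The note inside `hash` about Java's `%` operator can be seen in isolation. The following is a minimal sketch: the class name `ModDemo` and the helper `mod` are hypothetical names for illustration, while `Math.floorMod` is a standard library method that gives the same non-negative result for a positive modulus.

```java
public class ModDemo {

    // Maps a possibly negative remainder into the range [0, m), assuming m > 0.
    public static long mod(long a, long m) {
        return ((a % m) + m) % m;
    }

    public static void main(String[] args) {
        System.out.println(-7L % 3L);                // prints -1
        System.out.println(mod(-7L, 3L));            // prints 2
        System.out.println(Math.floorMod(-7L, 3L));  // prints 2
    }
}
```

The subtraction in the substring hash can go below zero before the modulus is applied, which is why the extra `+ m` step is needed there.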

### Application: Pattern Matching ($O(n)$)
1. Compute the prefix hashes of the pattern and the text in $O(n)$ time and space.
2. Slide a window of the pattern's length across the text. If the hash of the text substring equals the hash of the pattern, the pattern matches at that position with high probability; because distinct strings can share a hash, a direct character comparison is needed to confirm the match.
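
The two steps above can be sketched as a self-contained search that reports every match rather than only the first. The class name `RollingHashSearch`, the method `findAll`, and the constants `P` and `M` are assumptions made for illustration and are not part of `PolynomialHasher`. This sketch also confirms each candidate with a direct character comparison, since equal hashes do not guarantee equal strings.

```java
import java.util.ArrayList;
import java.util.List;

public class RollingHashSearch {

    // Assumed parameters for illustration: a small base and a large prime modulus.
    private static final long P = 31;
    private static final long M = 1_000_000_007L;

    // Returns every index at which pattern occurs in text.
    public static List<Integer> findAll(String text, String pattern) {
        List<Integer> matches = new ArrayList<>();
        int n = text.length();
        int k = pattern.length();
        if (k == 0 || k > n) {
            return matches;
        }
        // prefix[i] holds the hash of text[0..i-1]; power[i] holds P^i mod M.
        long[] prefix = new long[n + 1];
        long[] power = new long[n + 1];
        power[0] = 1;
        for (int i = 0; i < n; i++) {
            prefix[i + 1] = (prefix[i] * P + text.charAt(i)) % M;
            power[i + 1] = (power[i] * P) % M;
        }
        long patternHash = 0;
        for (int i = 0; i < k; i++) {
            patternHash = (patternHash * P + pattern.charAt(i)) % M;
        }
        for (int begin = 0; begin + k <= n; begin++) {
            // Hash of text[begin..begin+k-1], kept non-negative.
            long windowHash = ((prefix[begin + k] - prefix[begin] * power[k]) % M + M) % M;
            // Equal hashes only suggest a match; compare characters to rule out a collision.
            if (windowHash == patternHash && text.regionMatches(begin, pattern, 0, k)) {
                matches.add(begin);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [14]
        System.out.println(findAll("abacadabrabracabracadabrabrabracad", "abracadabra"));
    }
}
```

The comparison runs only when the hashes agree, so with few collisions the expected running time stays linear in the length of the text plus the total length of the reported matches.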

In [2]:
package strings.polynomial_hashing;

import java.util.*;

final String pattern = "abracadabra";
final String text = "abacadabrabracabracadabrabrabracad";

final PolynomialHasher patternHasher = new PolynomialHasher(pattern, true);
final PolynomialHasher textHasher = new PolynomialHasher(text, true);

final long patternHash = patternHasher.hash(0, pattern.length() - 1);
for (int beginIndex = 0; beginIndex <= text.length() - pattern.length(); beginIndex++) {
    final int endIndex = beginIndex + pattern.length() - 1;
    final long textSubstringHash = textHasher.hash(beginIndex, endIndex);
    if (patternHash == textSubstringHash) {
        System.out.println(String.format("Pattern Matched [beginIndex=%d, endIndex=%d]", beginIndex, endIndex));
        break;
    }
}

Pattern Matched [beginIndex=14, endIndex=24]



## Z Algorithm
TODO.

## Suffix Array
TODO.