1. Define the node for the trie
   - ```children``` is a dictionary mapping each character to its child node
   - ```count``` tracks how many inserted words pass through this node (a word inserted twice is counted twice)

In [2]:
class SimpleTrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0
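
To make the role of ```count``` concrete, here is a minimal sketch that inserts three words by hand and inspects the counters. The ```insert``` helper is a hypothetical stand-in that mirrors the insertion loop used by the trie classes in this notebook; the node class is repeated only so the snippet runs on its own.

```python
class SimpleTrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0


def insert(root, word):
    # Hypothetical helper: walk the word, creating nodes as needed,
    # and bump the counter of every node on the path.
    node = root
    for letter in word:
        node = node.children.setdefault(letter, SimpleTrieNode())
        node.count += 1


root = SimpleTrieNode()
for w in ["cat", "cats", "car"]:
    insert(root, w)

c = root.children["c"]
a = c.children["a"]
print(c.count)                # 3: all three words start with "c"
print(sorted(a.children))     # ['r', 't']: "ca" branches into "car" and "cat..."
print(a.children["t"].count)  # 2: "cat" and "cats"
print(root.count)             # 0: the root itself is never incremented
```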

**PrefixTrie**
- Initializes the prefix trie with an empty root node.

**add_word**
- Adds a word to the trie letter by letter (from left to right).
- If a letter is missing, creates a new node.
- Increments the count of each node visited.

**find_stem_suffix**
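- Finds the best split point in the word for stem and suffix.
- Traverses the trie along the word, looking for the node with the most child branches (heavy branching suggests the end of a shared stem). The first letter is never used as a split point, and on ties the earliest candidate is kept.
- Returns the split at that node, or the whole word with an empty suffix if the word leaves the trie or no split point is found.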

In [3]:
class PrefixTrie:
    def __init__(self):
        self.root = SimpleTrieNode()

    def add_word(self, word):
        node = self.root
        for letter in word:
            if letter not in node.children:
                node.children[letter] = SimpleTrieNode()
            node = node.children[letter]
            node.count += 1

    def find_stem_suffix(self, word):
        node = self.root
        best_split = 0
        max_branches = 0
        
        for i, letter in enumerate(word):
            if letter not in node.children:
                return word, ""
            
            node = node.children[letter]
            branches = len(node.children)
            
            if branches > max_branches and i >= 1:
                max_branches = branches
                best_split = i + 1
        
        if best_split == 0:
            return word, ""
        
        stem = word[:best_split]
        suffix = word[best_split:]
        return stem, suffix
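
As a quick illustration of the branching heuristic, the sketch below builds a prefix trie from the small sample list used as the fallback in ```load_words``` and splits a few words. The classes are repeated (slightly condensed) so the snippet runs on its own; the results apply to this tiny sample only, not to the Brown corpus.

```python
class SimpleTrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0


class PrefixTrie:
    def __init__(self):
        self.root = SimpleTrieNode()

    def add_word(self, word):
        node = self.root
        for letter in word:
            if letter not in node.children:
                node.children[letter] = SimpleTrieNode()
            node = node.children[letter]
            node.count += 1

    def find_stem_suffix(self, word):
        node = self.root
        best_split = 0
        max_branches = 0
        for i, letter in enumerate(word):
            if letter not in node.children:
                return word, ""
            node = node.children[letter]
            branches = len(node.children)
            # Strict ">" keeps the earliest node among equally branching ones.
            if branches > max_branches and i >= 1:
                max_branches = branches
                best_split = i + 1
        if best_split == 0:
            return word, ""
        return word[:best_split], word[best_split:]


sample_words = ['go', 'goes', 'going', 'gone', 'cat', 'cats', 'dog', 'dogs',
                'kite', 'kites', 'book', 'books', 'look', 'looks', 'looking']

trie = PrefixTrie()
for w in sample_words:
    trie.add_word(w)

print(trie.find_stem_suffix("going"))    # ('go', 'ing')
print(trie.find_stem_suffix("looking"))  # ('look', 'ing')
print(trie.find_stem_suffix("cats"))     # ('ca', 'ts')
print(trie.find_stem_suffix("run"))      # ('run', '')
```

With this sample, "go" and "look" are real branching points, so the split lands where expected. "cats" shows the weak spot: every node on its path has at most one child, so the earliest allowed position wins and the split falls after "ca".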

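- Initializes the suffix trie with an empty root node.
- Adds a word to the trie letter by letter, but in reverse (right to left), so words with the same ending share a path.
- Finds the best split point for stem and suffix by walking the word from its last letter; the suffix must leave at least one letter for the stem.
- Returns the split at the best branching point found in the reversed trie, or the whole word with an empty suffix if none is found.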

In [4]:
class SuffixTrie:
    def __init__(self):
        self.root = SimpleTrieNode()
    
    def add_word(self, word):
        node = self.root
        for letter in reversed(word):
            if letter not in node.children:
                node.children[letter] = SimpleTrieNode()
            node = node.children[letter]
            node.count += 1
    
    def find_stem_suffix(self, word):
        node = self.root
        best_suffix_length = 0
        max_branches = 0
        for i, letter in enumerate(reversed(word)):
            if letter not in node.children:
                return word, ""
            node = node.children[letter]
            branches = len(node.children)
            suffix_length = i + 1
            if branches > max_branches and 1 <= suffix_length <= len(word) - 1:
                max_branches = branches
                best_suffix_length = suffix_length
        if best_suffix_length == 0:
            return word, ""
        stem = word[:-best_suffix_length]
        suffix = word[-best_suffix_length:]
        return stem, suffix
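
The same kind of check can be run for the reversed trie. This sketch again uses the fallback sample list from ```load_words``` and repeats the classes (slightly condensed) so it runs on its own; the splits shown hold for this small sample only.

```python
class SimpleTrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0


class SuffixTrie:
    def __init__(self):
        self.root = SimpleTrieNode()

    def add_word(self, word):
        node = self.root
        for letter in reversed(word):
            if letter not in node.children:
                node.children[letter] = SimpleTrieNode()
            node = node.children[letter]
            node.count += 1

    def find_stem_suffix(self, word):
        node = self.root
        best_suffix_length = 0
        max_branches = 0
        for i, letter in enumerate(reversed(word)):
            if letter not in node.children:
                return word, ""
            node = node.children[letter]
            branches = len(node.children)
            suffix_length = i + 1
            # The upper bound keeps at least one letter for the stem.
            if branches > max_branches and 1 <= suffix_length <= len(word) - 1:
                max_branches = branches
                best_suffix_length = suffix_length
        if best_suffix_length == 0:
            return word, ""
        return word[:-best_suffix_length], word[-best_suffix_length:]


sample_words = ['go', 'goes', 'going', 'gone', 'cat', 'cats', 'dog', 'dogs',
                'kite', 'kites', 'book', 'books', 'look', 'looks', 'looking']

trie = SuffixTrie()
for w in sample_words:
    trie.add_word(w)

print(trie.find_stem_suffix("cats"))     # ('cat', 's')
print(trie.find_stem_suffix("looking"))  # ('lookin', 'g')
print(trie.find_stem_suffix("book"))     # ('b', 'ook')
print(trie.find_stem_suffix("run"))      # ('run', '')
```

In this sample the final "s" node has four children, so plurals split cleanly. "looking" loses only its last letter because the "g" node already has two branches and no deeper node has strictly more, and "book" is cut to "b" because "book" and "look" share the ending "ook".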

- Tries to load words from ```brown_nouns.txt```
- If the file is missing, uses a small sample list of words.

In [5]:
def load_words():
    try:
        with open("brown_nouns.txt", "r") as f:
            words = [word.strip().lower() for word in f if word.strip()]
        print(f"Loaded {len(words)} words from brown_nouns.txt")
        return words
    except FileNotFoundError:
        print("File not found. Using sample words...")
        return ['go', 'goes', 'going', 'gone', 'cat', 'cats', 'dog', 'dogs', 
                'kite', 'kites', 'book', 'books', 'look', 'looks', 'looking']

- Loads words, builds both tries, and compares their ability to find suffixes.
- Prints results for the first 10 words.
- Counts, for each trie, how many of those words received a non-empty suffix and reports which count is higher.

In [6]:
def test_both_tries():
    words = load_words()

    print("\nPrefix Trie...")
    prefix_trie = PrefixTrie()
    for word in words:
        prefix_trie.add_word(word)
    
    print("Suffix Trie...")
    suffix_trie = SuffixTrie()
    for word in words:
        suffix_trie.add_word(word)

    test_words = words[:10]
    
    print("\n" + "="*50)
    print("COMPARISON RESULTS")
    print("="*50)
    
    prefix_good = 0
    suffix_good = 0
    
    for word in test_words:
        p_stem, p_suffix = prefix_trie.find_stem_suffix(word)
        s_stem, s_suffix = suffix_trie.find_stem_suffix(word)
        
        print(f"\nWord: {word}")
        print(f"Prefix Trie:  {p_stem}+{p_suffix}")
        print(f"Suffix Trie:  {s_stem}+{s_suffix}")
        
        if p_suffix:
            prefix_good += 1
        if s_suffix:
            suffix_good += 1

    print("\n" + "="*50)
    print("FINAL RESULTS")
    print("="*50)
    print(f"Prefix Trie found suffixes for: {prefix_good} words")
    print(f"Suffix Trie found suffixes for: {suffix_good} words")
    
    if suffix_good > prefix_good:
        print("\nWINNER: Suffix Trie!")
    elif prefix_good > suffix_good:
        print("\nWINNER: Prefix Trie!")
    else:
        print("\nIt's a tie!")
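
The recorded output lists "election" twice, because ```words[:10]``` is taken from the raw word list, which contains repeats. If ten distinct test words are wanted, one possible variation is to drop duplicates while keeping the original order. The sketch below assumes this is acceptable for the comparison; ```first_unique``` is a hypothetical helper, not part of the notebook.

```python
def first_unique(words, n=10):
    """Return the first n distinct words, keeping their original order."""
    # dict.fromkeys preserves insertion order (Python 3.7+),
    # so the first occurrence of each word is kept.
    return list(dict.fromkeys(words))[:n]


sample = ["investigation", "primary", "election", "evidence", "election", "place"]
print(first_unique(sample, 4))  # ['investigation', 'primary', 'election', 'evidence']
print(first_unique(sample, 5))  # ['investigation', 'primary', 'election', 'evidence', 'place']
```

Inside ```test_both_tries``` this would replace ```words[:10]``` with ```first_unique(words)```; the tries themselves would still be built from the full list.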


- Runs the test if the script is executed directly.
- Prints a header and calls the test function.

In [7]:
if __name__ == "__main__":
    print("SIMPLE TRIE STEMMING ANALYSIS")
    print("="*40)
    test_both_tries()

SIMPLE TRIE STEMMING ANALYSIS
Loaded 202793 words from brown_nouns.txt

Prefix Trie...
Suffix Trie...

COMPARISON RESULTS

Word: investigation
Prefix Trie:  in+vestigation
Suffix Trie:  investigati+on

Word: primary
Prefix Trie:  pri+mary
Suffix Trie:  primar+y

Word: election
Prefix Trie:  el+ection
Suffix Trie:  electi+on

Word: evidence
Prefix Trie:  ev+idence
Suffix Trie:  evidenc+e

Word: irregularities
Prefix Trie:  ir+regularities
Suffix Trie:  irregularitie+s

Word: place
Prefix Trie:  pla+ce
Suffix Trie:  plac+e

Word: jury
Prefix Trie:  ju+ry
Suffix Trie:  jur+y

Word: presentments
Prefix Trie:  pre+sentments
Suffix Trie:  presentment+s

Word: charge
Prefix Trie:  cha+rge
Suffix Trie:  charg+e

Word: election
Prefix Trie:  el+ection
Suffix Trie:  electi+on

FINAL RESULTS
Prefix Trie found suffixes for: 10 words
Suffix Trie found suffixes for: 10 words

It's a tie!
