### T9 predictive text

No, T9 doesn't guess character by character — that's the key insight that made it clever. It works at the **word level**, not the character level.

Each time you press a digit, T9 looks up every dictionary word that matches the entire digit sequence entered so far. So pressing `4-3` doesn't first guess `g/h/i` then `d/e/f`; it finds all words whose letters map to `4` then `3`, which gives you `he`, `if`, `id` and so on, and surfaces the most frequent one.
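To make the word-level lookup concrete, here is a minimal sketch that encodes each dictionary word as its digit sequence and groups words by that key. The names `KEYPAD`, `to_digits` and `build_index` are made up for illustration, and the six-word list stands in for a real dictionary.

```python
from collections import defaultdict

# Standard phone keypad layout (letters only).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
# Reverse mapping: letter -> digit.
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def to_digits(word: str) -> str:
    """Encode a word as the digit sequence you would press to type it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(words):
    """Group words by digit sequence so a lookup is a single dict access."""
    index = defaultdict(list)
    for word in words:
        index[to_digits(word)].append(word)
    return index


index = build_index(["he", "if", "id", "good", "gone", "home"])
print(index["43"])    # ['he', 'if', 'id']
print(index["4663"])  # ['good', 'gone', 'home']
```

A plain dict like this is enough to show the idea; a trie does the same job while sharing prefixes between words.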

**How frequency ranking works in practice:**

The phone ships with a static dictionary where each word has a pre-assigned frequency score based on general English usage. The highest frequency match for the digit sequence is shown first. So for `4663` it knows `good` is more common than `gone` or `home` and shows that by default.

Over time many T9 implementations also did **personal learning** — if you repeatedly selected `gone` over `good` for `4663`, it would bump `gone`'s personal frequency score and start showing it first for you.
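A rough sketch of how static frequencies and personal learning could combine. The scores in `BASE_FREQ`, the `boost` value and the `Ranker` class are all invented for illustration; real T9 dictionaries and their learning rules were vendor-specific.

```python
from collections import Counter

# Hypothetical base frequencies; not taken from any real T9 dictionary.
BASE_FREQ = {"good": 900, "home": 700, "gone": 400}


class Ranker:
    def __init__(self, base_freq, boost=500):
        self.base_freq = base_freq
        self.boost = boost      # assumed weight of one user selection
        self.picks = Counter()  # how often the user chose each word

    def score(self, word):
        # Static score plus a bonus for every time the user picked this word.
        return self.base_freq.get(word, 0) + self.boost * self.picks[word]

    def rank(self, candidates):
        # Highest score first.
        return sorted(candidates, key=self.score, reverse=True)

    def select(self, word):
        self.picks[word] += 1


ranker = Ranker(BASE_FREQ)
candidates = ["gone", "good", "home"]
print(ranker.rank(candidates))  # ['good', 'home', 'gone']

# The user picks 'gone' twice, so it overtakes 'good'.
ranker.select("gone")
ranker.select("gone")
print(ranker.rank(candidates))  # ['gone', 'good', 'home']
```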

**The user experience flow:**

1. You press `4663`
2. T9 looks up all words matching that sequence and sorts by frequency
3. It displays the top match, say `good`
4. If that's wrong you press the next-word key (`0` or `*`, depending on the handset) to cycle through alternatives: `gone`, `home`
5. You confirm with a space or punctuation
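The flow above can be simulated with a small key-stream interpreter. This is only a sketch: it assumes `*` is the next-word key and a literal space confirms, and `CANDIDATES` is a hypothetical, already-ranked lookup table.

```python
# Hypothetical pre-ranked candidate lists, keyed by digit sequence.
CANDIDATES = {
    "4663": ["good", "gone", "home"],
    "43": ["he", "if", "id"],
}


def compose(keys: str, next_key: str = "*", confirm_key: str = " ") -> list[str]:
    """Replay a stream of key presses and return the confirmed words."""
    words = []
    digits = ""   # digit sequence of the word being typed
    choice = 0    # how many times the next-word key was pressed
    for key in keys:
        if key == next_key:
            choice += 1
        elif key == confirm_key:
            options = CANDIDATES.get(digits, [])
            if options:
                # Cycling wraps around once the list is exhausted.
                words.append(options[choice % len(options)])
            digits, choice = "", 0
        else:
            digits += key
            choice = 0  # a new digit restarts from the top candidate
    return words


print(compose("4663 "))       # ['good']
print(compose("4663** 43 "))  # ['home', 'he']
```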

**Where it gets interesting** is ambiguous short sequences like `43`, which could be `he`, `if` or `id`. Frequency scoring is what makes it not feel like a guessing game. And for words not in the dictionary, there was usually a fallback "multi-tap" mode, the old pre-T9 method, where pressing `4` once gives you `g`, twice gives `h`, and three times gives `i`.
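For comparison, a minimal multi-tap decoder. Real phones used a short timeout to separate two letters on the same key; this sketch assumes a space in the input stands in for that pause, and `multitap` is a made-up name.

```python
from itertools import groupby

KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def multitap(presses: str) -> str:
    """Decode multi-tap input; a space models the pause between letters."""
    out = []
    for chunk in presses.split(" "):
        # Each run of the same digit selects one letter on that key.
        for digit, run in groupby(chunk):
            letters = KEYPAD[digit]
            count = len(list(run))
            # Extra presses wrap around to the first letter again.
            out.append(letters[(count - 1) % len(letters)])
    return "".join(out)


print(multitap("44"))      # h
print(multitap("444"))     # i
print(multitap("44 444"))  # hi
```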

So the trie implementation above is actually a reasonable model of the core lookup, but a real T9 would attach a frequency score to each `is_end` node and sort results by that rather than returning an unordered list.
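Here is one way that change could look: a sketch of a trie whose word-ending nodes carry a frequency, with results sorted by it. `FreqTrieNode`, `RankedT9` and the frequency values are illustrative assumptions, not the behaviour of any particular phone.

```python
T9_MAP = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


class FreqTrieNode:
    def __init__(self):
        self.children = {}
        self.freq = 0  # 0 means no word ends here


class RankedT9:
    def __init__(self):
        self.root = FreqTrieNode()

    def insert(self, word: str, freq: int) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, FreqTrieNode())
        node.freq = freq

    def predict(self, digits: str) -> list[str]:
        matches = []
        stack = [(self.root, 0, "")]
        while stack:
            node, depth, prefix = stack.pop()
            if depth == len(digits):
                if node.freq:
                    matches.append((node.freq, prefix))
                continue
            # Keys with no letters (0, 1) simply match nothing.
            for ch in T9_MAP.get(digits[depth], ""):
                child = node.children.get(ch)
                if child is not None:
                    stack.append((child, depth + 1, prefix + ch))
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda m: (-m[0], m[1]))
        return [word for _, word in matches]


t9 = RankedT9()
# Made-up frequencies, for illustration only.
for word, freq in [("good", 900), ("home", 700), ("gone", 400), ("he", 950), ("if", 800)]:
    t9.insert(word, freq)

print(t9.predict("4663"))  # ['good', 'home', 'gone']
print(t9.predict("43"))    # ['he', 'if']
```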

In [None]:
# Blog
from functools import cached_property
from typing import Iterable


class TrieNode:
    def __init__(self, value: str, parent: "TrieNode | None" = None):
        self._children: dict[str, TrieNode] = {}
        self._insertion_count = 0
        self._value = value
        self._parent = parent

    def __repr__(self):
        return f"TrieNode<{self._value}>"  # pragma: no cover

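    @cached_property
    def word(self) -> str:
        # Collect one character per node on the way up to the root, then
        # reverse so the characters read root-to-leaf.
        return "".join(n._value for n in self._bottom_up_traversal())[::-1]

    def _bottom_up_traversal(self) -> Iterable["TrieNode"]:
        # Yield this node, then each ancestor up to and including the root.
        current: "TrieNode | None" = self
        while current is not None:
            yield current
            current = current._parent

    @property
    def word_nodes(self) -> Iterable["TrieNode"]:
        # Iterative DFS over this subtree, keeping every node with a
        # non-zero insertion count.
        result = []
        dfs = [self]
        while dfs:
            node = dfs.pop()
            if node._insertion_count:
                result.append(node)

            for child in node._children.values():
                dfs.append(child)

        return result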


In [None]:
# Claude

T9_MAP = {
    '2': 'abc',
    '3': 'def',
    '4': 'ghi',
    '5': 'jkl',
    '6': 'mno',
    '7': 'pqrs',
    '8': 'tuv',
    '9': 'wxyz'
}

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class T9:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True

    def predict(self, digits: str) -> list[str]:
        results = []
        self._dfs(self.root, digits, 0, "", results)
        return results

    def _dfs(self, node: TrieNode, digits: str, depth: int, current: str, results: list) -> None:
        # Collected a full match for the digit sequence
        if depth == len(digits):
            if node.is_end:
                results.append(current)
            return

        # Try every letter mapped to the current digit; keys without letters
        # (such as 0 and 1) match nothing instead of raising KeyError
        for char in T9_MAP.get(digits[depth], ""):
            if char in node.children:
                self._dfs(node.children[char], digits, depth + 1, current + char, results)

In [None]:
t9 = T9()

words = ["the", "of", "and", "he", "she", "his", "her", "home", "good", "gone"]
for word in words:
    t9.insert(word)

print(t9.predict("4663"))   # ['gone', 'good', 'home']
print(t9.predict("843"))    # ['the']
print(t9.predict("43"))     # ['he'] — 'he' = 4→h, 3→e
print(t9.predict("743"))    # ['she'] — 'she' = 7→s, 4→h, 3→e
print(t9.predict("447"))    # ['his'] — 'his' = 4→h, 4→i, 7→s