# 433. Minimum Genetic Mutation


## Topic Alignment
- **Role Relevance**: Explores BFS on string state space, similar to feature toggle transformations.
- **Scenario**: Useful for enumerating minimal edits in configuration strings.


## Metadata Summary
- Source: [LeetCode - Minimum Genetic Mutation](https://leetcode.com/problems/minimum-genetic-mutation/)
- Tags: `BFS`, `Graph`, `String`
- Difficulty: Medium
- Recommended Priority: Medium


## Problem Statement
A gene string can be transformed by changing one character to one of {A,C,G,T}. Given start, end, and a bank of valid genes, return the minimum number of mutations needed to reach end, where every gene produced along the way must be in the bank; return -1 if impossible.


## Progressive Hints
- Hint 1: Treat each gene as a node in a graph where edges represent one-character mutations.
- Hint 2: BFS ensures minimal mutation count.
- Hint 3: Precompute valid characters to attempt at each position.
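
The edge definition in Hint 1 can be made concrete with a small neighbor generator. This is a minimal sketch: `one_step_mutations` and `NUCLEOTIDES` are illustrative names that do not appear in the original notes, and the generator deliberately ignores the bank, which the BFS filters on later.

```python
from typing import Iterator

NUCLEOTIDES = "ACGT"  # assumed alphabet, taken from the problem statement


def one_step_mutations(gene: str) -> Iterator[str]:
    """Yield every gene that differs from `gene` in exactly one position."""
    for i, original in enumerate(gene):
        for letter in NUCLEOTIDES:
            if letter != original:
                # Rebuild the string with position i swapped to `letter`.
                yield gene[:i] + letter + gene[i + 1:]


neighbors = list(one_step_mutations("AACCGGTT"))
print(len(neighbors))  # 8 positions * 3 alternative letters = 24
```

Each gene of length L has exactly 3L neighbors, which is why the per-node work in the BFS stays small even though the full state space has 4^L genes.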


## Solution Overview
Run BFS from start. At each step, mutate every position to each of the other three letters; if the new gene is in the bank and has not been visited, enqueue it with step+1.


## Detailed Explanation
1. Convert bank to a set for O(1) lookup; early exit if end not in bank.
2. Initialize queue with start gene and mutation count 0.
3. For each dequeued gene, if it equals end, return its count.
4. For each position, try the three other nucleotides; when the new gene is in the bank and unseen, mark it visited and add it to the queue.
5. If BFS completes without reaching end, return -1.


## Complexity Trade-off Table
| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
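| BFS over states | O(B * L^2) for B bank genes; the full state space has 4^L genes | O(bank) | Only genes in the bank are enqueued, so the bank bounds the search. |
| Bidirectional BFS | About O(b^(d/2)) expansions per side, for branching factor b and distance d | O(bank) | Faster for large search spaces. |
| Dijkstra | O(E log V) | O(V) | Unnecessary since every edge has weight 1. |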
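
The bidirectional BFS row can be sketched as follows. This is one possible formulation, not the reference solution: `min_mutation_bidirectional` is a hypothetical name, and the choice to always expand the smaller frontier is a common heuristic assumed here rather than something the notes prescribe.

```python
def min_mutation_bidirectional(start: str, end: str, bank: list[str]) -> int:
    """Bidirectional BFS sketch: grow the smaller frontier one level at a time."""
    bank_set = set(bank)
    if end not in bank_set:
        return -1
    if start == end:
        return 0
    front, back = {start}, {end}
    visited = {start, end}
    steps = 0
    while front and back:
        # Expand the smaller frontier to keep the work balanced.
        if len(front) > len(back):
            front, back = back, front
        steps += 1
        next_front = set()
        for gene in front:
            for i, original in enumerate(gene):
                for letter in "ACGT":
                    if letter == original:
                        continue
                    candidate = gene[:i] + letter + gene[i + 1:]
                    if candidate in back:
                        # The two searches meet; steps counts levels from both sides.
                        return steps
                    if candidate in bank_set and candidate not in visited:
                        visited.add(candidate)
                        next_front.add(candidate)
        front = next_front
    return -1
```

The opposite-frontier check comes before the bank check so that the search can meet at start even when start itself is not in the bank.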


## Reference Implementation


```python
from collections import deque


def min_mutation(start: str, end: str, bank: list[str]) -> int:
    """Return minimal mutations between start and end using BFS."""
    bank_set = set(bank)  # O(1) membership checks; also removes duplicates
    if end not in bank_set:
        return -1  # end can never be produced by a valid mutation
    letters = ['A', 'C', 'G', 'T']
    queue = deque([(start, 0)])  # (gene, mutations used to reach it)
    visited = {start}
    while queue:
        gene, steps = queue.popleft()
        if gene == end:
            return steps
        for i, original in enumerate(gene):
            for letter in letters:
                if letter == original:
                    continue  # not a mutation
                candidate = gene[:i] + letter + gene[i + 1:]
                # Only bank genes are valid intermediate states.
                if candidate in bank_set and candidate not in visited:
                    visited.add(candidate)
                    queue.append((candidate, steps + 1))
    return -1
```


## Validation


```python
assert min_mutation('AACCGGTT', 'AACCGGTA', ['AACCGGTA']) == 1
assert min_mutation('AACCGGTT', 'AAACGGTA', ['AACCGGTA','AACCGCTA','AAACGGTA']) == 2
assert min_mutation('AAAAACCC', 'AACCCCCC', ['AAAACCCC','AAACCCCC','AACCCCCC']) == 3
print('All tests passed for LC 433.')
```


## Complexity Analysis
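- Time Complexity: O(B * L^2) for a bank of B genes: at most B + 1 genes are dequeued, and each yields 3L candidates that cost O(L) to build and hash. The O(L * 4^L) worst case applies only if the bank covers the full state space (L=8).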
- Space Complexity: O(bank) for visited set and queue.
- Bottleneck: Number of candidate mutations per step.


## Edge Cases & Pitfalls
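- Start equals end: the implementation returns 0 only when end is in the bank, because the bank check runs first; if end is missing from the bank it returns -1 even though no mutation is needed.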
- Bank without end yields -1 immediately.
- Duplicate entries in bank handled by set conversion.


## Follow-up Variants
- Use bidirectional BFS when bank is large.
- Assign weights to mutations and run Dijkstra.
- Track actual mutation sequence.
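
For the last variant, one way to recover the mutation sequence is to store a parent pointer for each discovered gene and walk back from end. The sketch below is an assumption about how that could look: `mutation_path` is a hypothetical name, and returning an empty list when no path exists is a convention chosen for this example.

```python
from collections import deque


def mutation_path(start: str, end: str, bank: list[str]) -> list[str]:
    """Return one shortest mutation sequence from start to end, or [] if none."""
    bank_set = set(bank)
    if end not in bank_set:
        return []
    parent = {start: None}  # maps each gene to its predecessor; doubles as visited
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        if gene == end:
            # Walk parent pointers back to start, then reverse.
            path = []
            while gene is not None:
                path.append(gene)
                gene = parent[gene]
            return path[::-1]
        for i, original in enumerate(gene):
            for letter in "ACGT":
                if letter == original:
                    continue
                candidate = gene[:i] + letter + gene[i + 1:]
                if candidate in bank_set and candidate not in parent:
                    parent[candidate] = gene
                    queue.append(candidate)
    return []


path = mutation_path('AACCGGTT', 'AAACGGTA', ['AACCGGTA', 'AACCGCTA', 'AAACGGTA'])
print(path)  # ['AACCGGTT', 'AACCGGTA', 'AAACGGTA']
```

The number of mutations is `len(path) - 1` whenever the path is non-empty, so the count from the plain BFS falls out of the same traversal.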


## Takeaways
- BFS on discrete state spaces mirrors shortest path in unweighted graphs.
- Keeping the four candidate letters in one list reduces the mutation step to a simple double loop over positions and letters.
- Visited set prevents cycling through previously seen genes.


## Similar Problems
| Problem ID | Problem Title | Technique |
| --- | --- | --- |
| 752 | Open the Lock | BFS on strings with wheel mutations |
| 127 | Word Ladder | BFS on dictionary transformations |
| 126 | Word Ladder II | BFS + backtracking for all paths |
