# Shortest Transformation Sequence
Given two words, `start` and `end`, and a dictionary (an array of words), return the length of the shortest transformation sequence that transforms `start` into `end`. The length is the number of words in the sequence, including `start` and `end`. A transformation sequence is a series of words in which:
- Each word differs from the preceding word by exactly one letter.
- Each word in the sequence exists in the dictionary.
If no such transformation sequence exists, return 0.


**Example:**
```text
Input: start = 'red', end = 'hit',
       dictionary = [
            'red', 'bed', 'hat', 'rod', 'rad', 'rat', 'hit', 'bad', 'bat'
       ]
Output: 5
```

One shortest sequence is `red` → `rad` → `rat` → `hat` → `hit`, which contains 5 words.

**Constraints:**
- All words are the same length.
- All words contain only lowercase English letters.
- The dictionary contains no duplicate words.

## Intuition
In a **transformation sequence**, each step involves changing **one letter** of a word to obtain another valid word. Given a **starting word** and a **target word**, we can model this as a **graph problem**, where:
- Each **word** is a **node**.
- An **edge** exists between two words if they **differ by exactly one letter**.

To find the **shortest transformation sequence**, we need to determine the **shortest path** from the **start word** to the **end word**.
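
As a minimal sketch of the edge rule above, the helper below checks whether two words would be connected in this graph. The name `differs_by_one` is hypothetical and is not part of the solution presented later.

```python
def differs_by_one(a: str, b: str) -> bool:
    """Return True if a and b have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    # Count the positions at which the two words disagree.
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches == 1

print(differs_by_one('red', 'rad'))  # True: only the middle letter changes
print(differs_by_one('red', 'rat'))  # False: two letters change
```

Note that a word is not its own neighbor under this rule, since zero letters differ.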

---

## Finding the Shortest Path
Whenever a problem requires the **shortest path** in an **unweighted graph**, **Breadth-First Search (BFS)** should be considered. BFS explores all nodes at a **distance of 1** from the start, then a **distance of 2**, and so on. The first time we reach the **end word**, we are guaranteed to have done so using the **fewest** transformations.

To implement BFS:
1. Use **level-order traversal**, where each level represents words that are a **specific distance** from the start word.
2. Track visited words using a **hash set** to avoid revisiting them.
3. Return the length of the shortest transformation sequence, which is the number of the level on which the end word is found, counting the start word as level 1.

---

## Building the Graph
The challenge is identifying **neighbors** for each word. Given a word `"red"`, its neighbors can be found by:
- Changing **each letter** to **every other letter** in the alphabet.
- Checking if the new word exists in the **dictionary**.

To make dictionary lookups **efficient**, we store the words in a **hash set**, allowing for **O(1) lookups**.
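
A minimal sketch of this neighbor search is shown below. The helper name `get_neighbors` is hypothetical; the solution later in this section inlines the same loops instead of calling a helper.

```python
import string
from typing import List, Set

def get_neighbors(word: str, dictionary_set: Set[str]) -> List[str]:
    """Return the dictionary words that differ from word by exactly one letter."""
    neighbors = []
    for i in range(len(word)):
        for c in string.ascii_lowercase:
            if c == word[i]:
                continue  # replacing a letter with itself gives the same word
            candidate = word[:i] + c + word[i + 1:]
            if candidate in dictionary_set:  # average O(1) set lookup
                neighbors.append(candidate)
    return neighbors

words = {'red', 'bed', 'hat', 'rod', 'rad', 'rat', 'hit', 'bad', 'bat'}
print(sorted(get_neighbors('red', words)))  # ['bed', 'rad', 'rod']
```

This tries 25 replacement letters at each of the L positions, so the cost per word depends on the word length rather than on the size of the dictionary.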

### Edge Case:
- If the **start or end word** is missing from the dictionary, return `0`, since a valid transformation **must** use words from the dictionary.

---

## Space Optimization
Instead of **precomputing an adjacency list**, we can **generate neighbors on demand** during BFS traversal. This saves memory while still allowing efficient traversal.

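```python
from typing import List
from collections import deque

def shortest_transformation_sequence(start: str, end: str, dictionary: List[str]) -> int:
    dictionary_set = set(dictionary)
    # Both endpoints must be dictionary words for a valid sequence to exist.
    if start not in dictionary_set or end not in dictionary_set:
        return 0

    # A sequence consisting of a single word has length 1.
    if start == end:
        return 1

    lower_case_alphabet = 'abcdefghijklmnopqrstuvwxyz'
    queue = deque([start])
    visited = {start}
    dist = 0  # number of transformations needed to reach the current level

    while queue:
        # Process every word on the current level before moving to the next.
        for _ in range(len(queue)):
            curr_word = queue.popleft()

            if curr_word == end:
                # The sequence length counts words, so add 1 for the start word.
                return dist + 1

            # Generate neighbors on demand by changing one letter at a time.
            for i in range(len(curr_word)):
                for c in lower_case_alphabet:
                    next_word = curr_word[:i] + c + curr_word[i+1:]

                    if (next_word in dictionary_set
                            and next_word not in visited):
                        visited.add(next_word)
                        queue.append(next_word)

        dist += 1

    return 0
```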

## Complexity Analysis

### Time Complexity
The time complexity is **O(n * L²)**, where:
- **n** is the number of words in the dictionary.
- **L** is the length of each word.

#### Breakdown:
1. **Building the Hash Set**: 
   - Storing all words in a hash set takes **O(n * L)**, as hashing each word requires **O(L)** time.
  
2. **BFS Traversal**:
   - In the worst case, BFS processes **n** words.
   - For each word, we generate **26 * L** possible transformations by modifying each letter.
   - Building each transformation by slicing, and hashing it to check the **dictionary_set** and **visited** set, takes **O(L)**.
   - Enqueuing a valid transformation takes **O(1)**.
   - Thus, BFS traversal takes **O(n * 26 * L * L) = O(n * L²)**.

Since **O(n * L²)** dominates **O(n * L)**, the overall **time complexity is O(n * L²)**.

---

### Space Complexity
The space complexity is **O(n * L)**, due to:
- **Dictionary Set (`dictionary_set`)**: Stores up to **n** words, each of length **L** → **O(n * L)**.
- **Visited Set (`visited`)**: Stores at most **n** words → **O(n * L)**.
- **Queue (for BFS)**: Stores at most **n** words → **O(n * L)**.

Thus, the total **space complexity is O(n * L)**.

## Optimization - Bidirectional Traversal

An important observation is that we don't necessarily need to begin level-order traversal at the start word; we can also start a search at the end word. In fact, we can combine these by performing them simultaneously to find the shortest path. This is known as **bidirectional BFS**, or in this case, **bidirectional level-order traversal**.

When we perform two searches, one from the start and one from the end, they will meet in the middle if a path between the two words exists. If no path exists, the searches never meet: one of them runs out of words to explore, which tells us that no transformation sequence exists. Because each search only has to cover roughly half of the path, this usually finds the shortest distance after visiting fewer words.

To simulate this process, we alternate between the two level-order traversals, progressing through each search one level at a time:

- **Start traversal**: Process one level in the traversal that started from the start node.
- **End traversal**: Process one level in the traversal that started from the end node.

We continue alternating between these two steps until a node visited in one search has already been visited in the other search, indicating that the traversals have met. To check if a node has already been visited by the other traversal, we can query the **visited hash set** used by the opposite search. If a word exists in the visited hash set of the other traversal, we know the searches have met.

```python
from typing import List
from collections import deque

def shortest_transformation_sequence(start: str, end: str, dictionary: List[str]) -> int:
    dictionary_set = set(dictionary)
    if start not in dictionary_set or end not in dictionary_set:
        return 0

    if start == end:
        return 1

    # One queue and one visited set per search direction.
    start_queue = deque([start])
    start_visited = {start}
    end_queue = deque([end])
    end_visited = {end}
    level_start = level_end = 0

    # If either queue empties, that search is exhausted and no path exists.
    while start_queue and end_queue:
        # Advance the search from the start word by one level.
        level_start += 1
        if explore_level(start_queue, start_visited, end_visited, dictionary_set):
            # Levels count transformations; add 1 to count words.
            return level_start + level_end + 1

        # Advance the search from the end word by one level.
        level_end += 1
        if explore_level(end_queue, end_visited, start_visited, dictionary_set):
            return level_start + level_end + 1

    return 0

def explore_level(queue, visited, other_visited, dictionary_set) -> bool:
    """Expand one level of a search; return True if it meets the other search."""
    lower_case_alphabet = 'abcdefghijklmnopqrstuvwxyz'

    for _ in range(len(queue)):
        curr_word = queue.popleft()

        for i in range(len(curr_word)):
            for c in lower_case_alphabet:
                next_word = curr_word[:i] + c + curr_word[i+1:]

                # A word already seen by the opposite search joins the two halves.
                if next_word in other_visited:
                    return True

                if (next_word in dictionary_set
                        and next_word not in visited):
                    visited.add(next_word)
                    queue.append(next_word)
    return False
```


## Complexity Analysis

### Time Complexity
The time complexity is **O(n * L²)**: each of the two level-order traversals processes at most **n** words at **O(L²)** per word, and two such traversals together are still **O(n * L²)**. In practice this is usually faster, because each search only explores about half the depth and therefore tends to visit fewer nodes.

---

### Space Complexity
The space complexity is **O(n * L)**, taken up by the dictionary hash set, both visited hash sets, and both queues.
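
To get a rough feel for the practical difference, the sketch below instruments both strategies to count how many words each one expands. The names `neighbors`, `bfs_one_way`, and `bfs_two_way` are hypothetical, and the two-way version swaps the roles of the two searches after every level rather than calling a helper twice; it is meant as an illustration under those assumptions, not as a benchmark.

```python
import string
from collections import deque
from typing import Iterator, List, Set, Tuple

def neighbors(word: str, words: Set[str]) -> Iterator[str]:
    """Yield dictionary words that differ from word by exactly one letter."""
    for i in range(len(word)):
        for c in string.ascii_lowercase:
            if c != word[i]:
                candidate = word[:i] + c + word[i + 1:]
                if candidate in words:
                    yield candidate

def bfs_one_way(start: str, end: str, dictionary: List[str]) -> Tuple[int, int]:
    """Return (sequence length, number of words expanded) for a single BFS."""
    words = set(dictionary)
    if start not in words or end not in words:
        return 0, 0
    queue, visited = deque([start]), {start}
    length, expanded = 1, 0
    while queue:
        for _ in range(len(queue)):
            word = queue.popleft()
            if word == end:
                return length, expanded
            expanded += 1
            for nxt in neighbors(word, words):
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(nxt)
        length += 1
    return 0, expanded

def bfs_two_way(start: str, end: str, dictionary: List[str]) -> Tuple[int, int]:
    """Return (sequence length, number of words expanded) for bidirectional BFS."""
    words = set(dictionary)
    if start not in words or end not in words:
        return 0, 0
    if start == end:
        return 1, 0
    front, back = deque([start]), deque([end])
    seen_front, seen_back = {start}, {end}
    levels, expanded = 0, 0  # levels = total levels processed by both searches
    while front and back:
        levels += 1
        for _ in range(len(front)):
            word = front.popleft()
            expanded += 1
            for nxt in neighbors(word, words):
                if nxt in seen_back:
                    return levels + 1, expanded  # +1 converts edges to words
                if nxt not in seen_front:
                    seen_front.add(nxt)
                    front.append(nxt)
        # Swap roles so that the other search advances on the next iteration.
        front, back = back, front
        seen_front, seen_back = seen_back, seen_front
    return 0, expanded

words = ['red', 'bed', 'hat', 'rod', 'rad', 'rat', 'hit', 'bad', 'bat']
print(bfs_one_way('red', 'hit', words))
print(bfs_two_way('red', 'hit', words))
```

Both calls report the same sequence length. On a dictionary this small the expansion counts are close, so the saving is only suggestive; the gap tends to grow with larger dictionaries and longer paths.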