In [1]:
from collections import Counter

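def findSubstring(s: str, words: list[str]) -> list[int]:
    """Brute force: test every start index in s for a concatenation of all words."""
    if not s or not words:
        return []

    word_len = len(words[0])        # all words share this length
    num_words = len(words)
    sub_len = word_len * num_words  # length of any valid concatenation
    word_count = Counter(words)     # required count of each word

    result = []

    # Try every start index where a full-length window still fits.
    for i in range(len(s) - sub_len + 1):
        seen_count = Counter()
        is_match = True

        # Read the window as num_words consecutive chunks of word_len characters.
        for j in range(num_words):
            word_start = i + j * word_len
            word = s[word_start:word_start + word_len]

            # Unknown word: this window cannot match.
            if word not in word_count:
                is_match = False
                break

            seen_count[word] += 1

            # More copies than required: this window cannot match.
            if seen_count[word] > word_count[word]:
                is_match = False
                break

        if is_match:
            result.append(i)

    return result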

LeetCode problem 30, "Substring with Concatenation of All Words," is a Hard-rated string search problem: find every starting index in a main string $S$ (the `haystack`) at which a substring is formed by concatenating all words from a given list `words` (the `needle` set), each used **exactly once and with no intervening characters**. All words in the `words` list have the same length.

---

### **Problem Constraints and Complexity**

The difficulty of this problem arises from three constraints:
1.  **Fixed Length:** The target substring must have a fixed, predetermined length, equal to $L \times M$, where $L$ is the length of each word and $M$ is the number of words.
2.  **Order Irrelevant:** The specific order of the words in the original `words` list does not matter; any permutation that forms a contiguous substring in $S$ is a valid match.
3.  **Duplicates Must Be Handled:** If the `words` list contains duplicate words (e.g., `["foo", "bar", "foo"]`), the matching substring must contain the word "foo" exactly twice and "bar" exactly once.

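These constraints rule out a naive character-by-character window check or a single-pattern matcher such as KMP, since with $M$ distinct words there can be up to $M!$ different concatenations to search for. The approach described here instead combines hashing with word-level frequency counting.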
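The three constraints can be illustrated with a small checker for a single candidate substring. This is a minimal sketch, not part of the solution itself; the helper name `is_concatenation` is hypothetical and it assumes `words` is non-empty.

```python
from collections import Counter

def is_concatenation(chunk: str, words: list[str]) -> bool:
    """Return True if `chunk` is some ordering of `words` joined together."""
    word_len = len(words[0])
    if len(chunk) != word_len * len(words):
        return False  # constraint 1: fixed total length
    # Cut the chunk into consecutive word-sized pieces.
    pieces = [chunk[k:k + word_len] for k in range(0, len(chunk), word_len)]
    # Constraints 2 and 3: order is ignored, multiplicities must match.
    return Counter(pieces) == Counter(words)

print(is_concatenation("foobarfoo", ["foo", "bar", "foo"]))  # True
print(is_concatenation("barfoofoo", ["foo", "bar", "foo"]))  # True
print(is_concatenation("foobarbar", ["foo", "bar", "foo"]))  # False
```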

---

### **Preprocessing: The Master Frequency Map**

The first crucial step is to preprocess the input `words` list to create a **master frequency map** (a hash map). This map stores the required count for every unique word in the input list. For example, if `words = ["a", "b", "a"]`, the master map would be `{"a": 2, "b": 1}`. This map provides the target fingerprint that every potential matching substring must satisfy. 
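In Python, one way to build such a map is `collections.Counter`, as the code cell at the top already does. The short sketch below shows the resulting counts and how lookups behave for a word that is not in the list.

```python
from collections import Counter

words = ["a", "b", "a"]
master = Counter(words)

print(dict(master))                            # {'a': 2, 'b': 1}
# Looking up a missing key returns 0 and does not insert it.
print(master["a"], master["b"], master["z"])   # 2 1 0
print("z" in master)                           # False
```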

---

### **The Sliding Window Strategy**

The core of the solution is a fixed-size **sliding window** that moves through the main string $S$. The size of the window is fixed at $L_{\text{total}} = \text{length of word} \times \text{number of words} = L \times M$. The brute-force approach, which is what the `findSubstring` function above does, slides this window one character at a time and re-counts the words of each $L \times M$ substring against the master map. Every position is checked from scratch, so the cost is $O(M \cdot L)$ per position and $O(N \cdot M \cdot L)$ overall, where $N$ is the length of $S$.

The optimal strategy involves a clever modification: the search is performed in $L$ (word length) distinct passes.

---

### **The $L$ Separate Passes (The $L$ Start Points)**

Instead of treating all $N$ starting positions independently, we group them by their offset modulo $L$. There are only $L$ such offsets $i$, with $0 \le i < L$, and each one gets its own pass:
* Pass 1: Start at index $0, L, 2L, 3L, \dots$
* Pass 2: Start at index $1, 1+L, 1+2L, 1+3L, \dots$
* ...
* Pass $L$: Start at index $L-1, 2L-1, 3L-1, \dots$

Every possible starting position is covered exactly once, because each index has exactly one remainder modulo $L$. For each of these $L$ starting offsets, we run an internal sliding window specifically for that pass, where the window size is $L \times M$ and the window only shifts by $L$ characters at a time. Consecutive windows in a pass therefore share all but one word, which is what makes incremental updates possible. This decouples the search into $L$ simpler, independent problems.
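To make the coverage claim concrete, the following sketch lists the window start indices each pass visits. The helper `pass_starts` and the sizes used (a string of length 12, word length 3, two words) are illustrative assumptions.

```python
def pass_starts(n: int, word_len: int, num_words: int) -> list[list[int]]:
    """For each offset 0..word_len-1, list the window starts that pass visits."""
    sub_len = word_len * num_words
    last_start = n - sub_len  # largest index where a full window still fits
    return [list(range(offset, last_start + 1, word_len))
            for offset in range(word_len)]

# Hypothetical sizes: len(s) = 12, word length 3, two words.
passes = pass_starts(12, 3, 2)
for offset, starts in enumerate(passes):
    print(offset, starts)
# 0 [0, 3, 6]
# 1 [1, 4]
# 2 [2, 5]

# Every candidate start 0..6 appears in exactly one pass.
assert sorted(i for starts in passes for i in starts) == list(range(7))
```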

---

### **Internal Window Management and Verification**

Within each of the $L$ passes, we maintain a **current frequency map** for the words currently inside the $L \times M$ window. The window moves word by word (i.e., it slides by $L$ characters).

1.  **Adding a Word:** A new word enters the window. We update the `current frequency map`.
2.  **Checking for Validity:**
    * If the new word is not in the `master frequency map`, no window containing it can match. We reset the `current frequency map`, move the start of the window to just past the offending word, and continue.
    * If the frequency of the word in the `current map` exceeds its required frequency in the `master map`, the window has too many of that word. We shrink the window from the left, removing one word at a time, until the count is back within its limit.
3.  **Found Match:** If the number of words currently in the window equals $M$ (the total number of words), we have found a valid concatenation: step 2 guarantees that no word exceeds its required count, and the required counts sum to $M$, so every count must match exactly. The starting index of the window is added to the result list. We then slide the window forward by one word (remove the leftmost word, then add the next one) to continue the search for an overlapping match.
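The three steps above can be sketched as follows. This is one possible implementation under the stated rules, not necessarily the only or canonical one; the function name `find_substring_windowed` and the variable names are assumptions chosen for illustration.

```python
from collections import Counter

def find_substring_windowed(s: str, words: list[str]) -> list[int]:
    if not s or not words:
        return []
    word_len, num_words = len(words[0]), len(words)
    master = Counter(words)
    result = []

    for offset in range(word_len):      # one pass per starting offset
        left = offset                   # start index of the current window
        current = Counter()             # counts of words inside the window
        in_window = 0                   # number of words inside the window

        # `right` is the start index of the word entering the window.
        for right in range(offset, len(s) - word_len + 1, word_len):
            word = s[right:right + word_len]

            if word not in master:
                # Unknown word: restart the window just past it.
                current.clear()
                in_window = 0
                left = right + word_len
                continue

            current[word] += 1
            in_window += 1

            # Too many copies: drop words from the left until the count fits.
            while current[word] > master[word]:
                current[s[left:left + word_len]] -= 1
                in_window -= 1
                left += word_len

            if in_window == num_words:
                result.append(left)
                # Slide by one word to look for an overlapping match.
                current[s[left:left + word_len]] -= 1
                in_window -= 1
                left += word_len

    # Passes report indices offset by offset, so sort for a stable order.
    return sorted(result)

print(find_substring_windowed("barfoothefoobarman", ["foo", "bar"]))  # [0, 9]
```

The sketch tracks the number of words in the window with a plain counter (`in_window`) rather than comparing the two maps, which keeps each step's bookkeeping constant apart from slicing and hashing the word.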

---

### **Complexity Analysis**

* **Time Complexity:** Each of the $L$ passes handles $O(N/L)$ words, where $N$ is the length of $S$, and every word enters the window once and leaves it at most once. The total number of word operations across all passes is therefore $L \times O(N/L) = O(N)$. Slicing and hashing each word takes $O(L)$, so the overall time complexity is $O(N \cdot L)$, plus $O(M \cdot L)$ to build the master map. This does not grow with the number of words per window, unlike the $O(N \cdot M \cdot L)$ brute-force check.
* **Space Complexity:** The `master frequency map` and the `current frequency map` each hold at most $M$ distinct words of length $L$, so the auxiliary space complexity is $O(M \cdot L)$. The result list is output rather than working space and can hold up to $O(N)$ indices.
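
As a rough worked comparison, take illustrative sizes that are assumptions rather than values from the problem statement: $N = 10{,}000$, $L = 3$, $M = 100$. The worst-case number of word slices each approach performs is then approximately:

$$
\begin{aligned}
\text{brute force:} \quad & (N - LM + 1) \cdot M = 9{,}701 \times 100 = 970{,}100 \\
\text{$L$ passes:} \quad & \text{at most } 2N = 20{,}000
\end{aligned}
$$

The bound of $2N$ for the $L$ passes counts each word once when it enters a window and at most once when it leaves. Each slice costs $O(L)$ in both approaches.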