# 392. Is Subsequence

[Link to Problem](https://leetcode.com/problems/is-subsequence/)

### Description

Given two strings `s` and `t`, return `true` if `s` is a **subsequence** of `t`, or `false` otherwise.

A **subsequence** of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., `"ace"` is a subsequence of `"abcde"` while `"aec"` is not).

**Example 1:**
```
Input: s = "abc", t = "ahbgdc"
Output: true
```

**Example 2:**
```
Input: s = "axc", t = "ahbgdc"
Output: false
```

**Constraints:**
- `0 <= s.length <= 100`
- `0 <= t.length <= 10^4`
- `s` and `t` consist only of lowercase English letters.

**Follow up:** Suppose there are lots of incoming `s`, say $s_1, s_2, ..., s_k$ where $k \ge 10^9$, and you want to check one by one to see if $t$ has its subsequence. In this scenario, how would you change your code?

## My Intuition

- **Observations:**
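  1. Walk two pointers through `s` and `t`: advance the `t` pointer on every step and the `s` pointer only when the characters match.
- **Edge cases:**
  - `t.length < s.length`: `s` cannot be a subsequence, so return `false` immediately.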
- **Expected approach and complexity:**
  1. Time: `O(m+n); m = s.length, n = t.length`
  2. Space: `O(1)`

In [9]:
def isSubsequence(s: str, t: str) -> bool:
    if len(s) > len(t):
        return False

    i, j = 0, 0
    while i != len(s) and j != len(t):
        if s[i] == t[j]:
            i += 1
            j += 1
        else:
            j += 1

    if i == len(s):
        return True
        
    return False

if __name__ == '__main__':
    # Test Case 1
    s1, t1 = "abc", "ahbgdc"
    result1 = isSubsequence(s1, t1)
    print(f"Test 1 Input: s = \"{s1}\", t = \"{t1}\"")
    print(f"Test 1 Output: {result1}") # Expected: True

    # Test Case 2
    s2, t2 = "axc", "ahbgdc"
    result2 = isSubsequence(s2, t2)
    print(f"Test 2 Input: s = \"{s2}\", t = \"{t2}\"")
    print(f"Test 2 Output: {result2}") # Expected: False

Test 1 Input: s = "abc", t = "ahbgdc"
Test 1 Output: True
Test 2 Input: s = "axc", t = "ahbgdc"
Test 2 Output: False


I will generate a Python file containing the detailed code review and refactored solutions. This includes a "Pythonic" one-liner for the standard case and a robust, class-based solution using binary search to address your specific concern regarding the high-volume follow-up scenario.

### **Revised User Query**

**User's Original Query:** "Act as a Principal Software Engineer... reviewing code... I am worried about this time complexity for follow up question"

**Rewritten Query:** "Conduct a code review of a Python solution for the 'Is Subsequence' problem, focusing on production readiness and scalability. Specifically, analyze the time complexity implications for the follow-up scenario involving a massive number of incoming `s` strings ($k \ge 10^9$) and provide a refactored solution that optimizes for this high-throughput case."

### **Code Review**

**1. Executive Summary**
The submitted code is functionally correct and optimal ($O(m + n)$) for a single execution. However, it does not address the scalability requirement of the follow-up question: repeating an $O(n)$ scan of `t` for each of $k \ge 10^9$ incoming strings is computationally infeasible.

**2. Critical Issues (Must Fix)**

* **Scalability (Follow-up):** The current implementation re-scans `t` (length up to $10^4$) for every single `s`. With $10^9$ incoming `s` strings, the total work approaches $10^{13}$ character comparisons, which will time out. You need to preprocess `t`.
* **Loop Safety:** `i != len(s)` works here because each pointer advances by exactly one, but `<` is generally the safer habit: if a later change ever lets an index skip past the end, `<` still terminates the loop, whereas `!=` keeps going and, in Python, raises `IndexError`.
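
To put rough numbers on the scalability concern, the sketch below compares worst-case step counts for rescanning `t` on every query against preprocessing `t` once and using binary search per character. The bounds come from the problem constraints; the variable names and the cost model (one unit per character comparison or bisect probe) are illustrative assumptions, not measurements.

```python
import math

n = 10**4   # maximum length of t (from the constraints)
m = 100     # maximum length of s (from the constraints)
k = 10**9   # number of incoming s strings (from the follow-up)

# Rescanning t for every query: up to n comparisons per query.
rescan_steps = k * n

# Preprocess t once (n steps), then at most about log2(n) probes
# for each of the m characters in a query.
probes_per_query = m * math.ceil(math.log2(n))
indexed_steps = n + k * probes_per_query

print(f"rescan:  {rescan_steps:.1e} steps")
print(f"indexed: {indexed_steps:.1e} steps")
```

Under this model the indexed approach wins by roughly a factor of seven at the worst-case bounds, and by more when `s` is much shorter than 100 characters.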

**3. Suggestions (Nice to Have)**

* **Pythonic Idioms:** For the simple case (without the follow-up), Python iterators allow for a compact $O(n)$ solution: `t_iter = iter(t); return all(c in t_iter for c in s)`.
* **Preprocessing for Follow-up:** To solve the follow-up, create a hash map where keys are characters and values are sorted lists of indices where that character appears in `t`. Then, for every character in `s`, use Binary Search to find the first valid index greater than the previous character's index.
* **Variable Naming:** `i` and `j` are acceptable, but `s_ptr` and `t_ptr` add clarity in complex logic.
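
The iterator idiom works because a membership test on an iterator consumes items up to and including the first match, so every later test only sees what follows. A minimal sketch of that behavior (the variable names are illustrative):

```python
t_iter = iter("ahbgdc")

# `in` advances the iterator until it finds a match (or runs out).
found_b = "b" in t_iter            # consumes 'a', 'h', 'b'
remaining_after_b = list(t_iter)   # only the characters after 'b' are left

# Order matters: once 'c' has been found, nothing before it is reachable.
t_iter = iter("ahbgdc")
found_c = "c" in t_iter            # consumes everything through 'c'
found_a_after_c = "a" in t_iter    # the iterator is exhausted

print(found_b, remaining_after_b, found_c, found_a_after_c)
```

This is why the iterator must be created once, outside the generator expression: writing `c in iter(t)` would restart the search from the beginning of `t` for every character.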

**4. Refactored Code**
I have provided two solutions in the file below:

1. **`is_subsequence_simple`**: The clean, Pythonic version of your current logic.
2. **`SubsequenceMatcher`**: The scalable Class-based solution required for the follow-up question.

### **Conclusion**

The refactored code provides two paths:

1. **`is_subsequence_simple`**: Use this for standard daily coding. It leverages Python's internal C-optimized iterators.
2. **`SubsequenceMatcher`**: Use this for the follow-up question. By preprocessing `t` into an index map, we reduce the per-query complexity from $O(n)$ to $O(m \log n)$. This allows you to handle millions of queries efficiently without re-scanning the large target string every time.
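
The core of the index-map lookup is a single binary search: given the sorted positions of one character, find the first position strictly after the current one. A small sketch of that step, where `next_position` is a hypothetical helper introduced only for illustration:

```python
import bisect

t = "bahbgdabca"
# Sorted positions of 'a' in t, as the index map would store them.
a_positions = [i for i, ch in enumerate(t) if ch == "a"]

def next_position(positions, after):
    """Return the first entry of sorted `positions` that is > `after`, or None."""
    k = bisect.bisect_right(positions, after)
    return positions[k] if k < len(positions) else None

first = next_position(a_positions, -1)       # start before index 0
second = next_position(a_positions, first)   # must move strictly forward
none_left = next_position(a_positions, 9)    # nothing after the last 'a'

print(a_positions, first, second, none_left)
```

Using `bisect_right` rather than `bisect_left` is what makes the search strict, so the same position in `t` is never matched twice.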

In [11]:
# Solutions from Gemini3 Pro

import bisect
from collections import defaultdict

# ==========================================
# APPROACH 1: The "Pythonic" Solution
# Best for: Single checks or low volume of 's'
# Time Complexity: O(T) where T is length of t
# Space Complexity: O(1)
# ==========================================

def is_subsequence_simple(s: str, t: str) -> bool:
    """
    Checks if s is a subsequence of t using Python iterators.
    
    This is a compact, idiomatic way to write the check in Python.
    It relies on the fact that `char in t_iter` advances the iterator
    past the first match, so each search resumes where the last one ended.
    """
    # Create an iterator for t. This allows us to consume characters 
    # one by one without managing integer indices manually.
    t_iter = iter(t)
    
    # Check if every character in 's' can be found in the remaining 
    # part of 't_iter'.
    return all(char in t_iter for char in s)


# ==========================================
# APPROACH 2: The Scalable Solution (Follow-up)
# Best for: High volume of 's' (k >= 10^9)
# Time Complexity: 
#   - Preprocessing: O(T)
#   - Per Query: O(S * log T) where S is length of s
# Space Complexity: O(T) to store the index map
# ==========================================

class SubsequenceMatcher:
    def __init__(self, t: str):
        """
        Preprocesses string t to allow for fast subsequence queries.
        
        We build an inverted index: a map where keys are characters
        and values are sorted lists of indices where that char appears.
        
        Example: t = "bahbgdabca"
        index_map = {
            'a': [1, 6, 9],
            'b': [0, 3, 7],
            ...
        }
        """
        self.t_len = len(t)
        self.index_map = defaultdict(list)
        for index, char in enumerate(t):
            self.index_map[char].append(index)

    def is_subsequence(self, s: str) -> bool:
        """
        Determines if s is a subsequence using Binary Search (bisect).
        """
        # If s is longer than t, it cannot be a subsequence
        if len(s) > self.t_len:
            return False

        current_t_index = -1
        
        for char in s:
            if char not in self.index_map:
                return False

            indices = self.index_map[char]
            
            # We need to find an index in 'indices' that is strictly GREATER
            # than our current_t_index. 
            # bisect_right returns the insertion point to maintain order.
            # Effectively, it finds the first index in the list > current_t_index.
            insertion_point = bisect.bisect_right(indices, current_t_index)
            
            # If the insertion point is at the end of the list, it means
            # no valid index exists after our current position.
            if insertion_point == len(indices):
                return False
            current_t_index = indices[insertion_point]
            
        return True

# ==========================================
# DRIVER / TEST CODE
# ==========================================

if __name__ == '__main__':
    # 1. Test the Simple Approach
    print("--- Testing Simple Approach ---")
    s1, t1 = "abc", "ahbgdc"
    print(f"Input: s='{s1}', t='{t1}' -> {is_subsequence_simple(s1, t1)}") # True
    
    s2, t2 = "axc", "ahbgdc"
    print(f"Input: s='{s2}', t='{t2}' -> {is_subsequence_simple(s2, t2)}") # False

    # 2. Test the Follow-up Approach
    print("\n--- Testing Scalable Approach (Follow-up) ---")
    
    # Large t simulation
    t_complex = "ahbgdc" * 1000 
    matcher = SubsequenceMatcher(t_complex)
    
    # Test cases
    test_cases = ["abc", "axc", "ahbgdc", "aaaa"]
    
    for s in test_cases:
        result = matcher.is_subsequence(s)
        print(f"Query: '{s}' -> {result}")

--- Testing Simple Approach ---
Input: s='abc', t='ahbgdc' -> True
Input: s='axc', t='ahbgdc' -> False

--- Testing Scalable Approach (Follow-up) ---
Query: 'abc' -> True
Query: 'axc' -> False
Query: 'ahbgdc' -> True
Query: 'aaaa' -> True
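
As a further sanity check, the two strategies can be compared on random inputs. The sketch below re-implements both in compact form so it runs on its own; the function names, the three-letter alphabet, and the trial count are arbitrary choices made for illustration.

```python
import bisect
import random
from collections import defaultdict

def two_pointer(s, t):
    """Linear scan: advance through t, matching characters of s in order."""
    i = 0
    for ch in t:
        if i < len(s) and s[i] == ch:
            i += 1
    return i == len(s)

def build_index(t):
    """Map each character to the sorted list of its positions in t."""
    index = defaultdict(list)
    for pos, ch in enumerate(t):
        index[ch].append(pos)
    return index

def indexed(s, index):
    """Binary-search each character's position list for the next usable slot."""
    current = -1
    for ch in s:
        positions = index.get(ch, [])
        k = bisect.bisect_right(positions, current)
        if k == len(positions):
            return False
        current = positions[k]
    return True

random.seed(0)
mismatches = 0
for _ in range(2000):
    t = "".join(random.choices("abc", k=random.randint(0, 12)))
    s = "".join(random.choices("abc", k=random.randint(0, 5)))
    if two_pointer(s, t) != indexed(s, build_index(t)):
        mismatches += 1

print("mismatches:", mismatches)
```

A small alphabet and short strings keep the share of true and false cases balanced, which makes disagreements between the two implementations more likely to surface.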
