# Large Group Positions

## Problem

In a string `s` of lowercase English letters, each maximal run of consecutive equal letters forms a "group".

- Example: `s = "abbxxxxzzy"`
  The groups are: `"a"`, `"bb"`, `"xxxx"`, `"zz"`, `"y"`.

We define a **group** by an interval `[start, end]`:
- `start` is the index of the first character in the group
- `end` is the index of the last character in the group
- indices are inclusive

A group is considered **large** if it has length ≥ 3.

### Task
Return a list of `[start, end]` intervals for every large group in the string, sorted by starting index.

### Examples

**Example 1**
Input: `s = "abbxxxxzzy"`
Output: `[[3, 6]]`
Explanation: `"xxxx"` is a large group from index 3 to index 6.

**Example 2**
Input: `s = "abc"`
Output: `[]`
Explanation: `"a"`, `"b"`, and `"c"` are all length 1.

**Example 3**
Input: `s = "abcdddeeeeaabbbcd"`
Output: `[[3, 5], [6, 9], [12, 14]]`
Explanation: `"ddd"`, `"eeee"`, and `"bbb"` are the large groups.


## Approach

We walk through the string and detect each "run" (group) of the same character.

We use two pointers (indices):
- `start`: where the current group begins
- `i`: a moving index that scans forward

Algorithm:
1. Initialize `start = 0` and `i = 0`.
2. While `i` is inside the string:
   - Move `i` forward while `i` is still inside the string and `s[i]` equals `s[start]`.
     - When this stops, `i` is the first index past the current group.
   - The current group covers indices `[start, i-1]`.
   - If the length of that group is at least 3, save `[start, i-1]` in the result.
   - Set `start = i` to begin tracking the next group.
3. When we reach the end of the string, return the result.

Why this works:
- Each group is processed exactly once.
- We never "rewind". We only move forward.
- We don't need extra data structures beyond a result list.


In [1]:
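class Solution:
    def largeGroupPositions(self, s):
        result = []
        n = len(s)
        start = 0  # index where the current group begins
        i = 0      # scanning index

        while i < n:
            # Advance i to the first index past the current group.
            while i < n and s[i] == s[start]:
                i += 1

            # The group spans [start, i - 1]; keep it if it is large.
            length = i - start
            if length >= 3:
                result.append([start, i - 1])

            # The next group begins where this one ended.
            start = i

        return result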


In [2]:
sol = Solution()

user_s = input("Enter a lowercase string: ")

print("Input string:", user_s)
print("Large group positions:", sol.largeGroupPositions(user_s))


Input string: abcdddeeeeaabbbcd
Large group positions: [[3, 5], [6, 9], [12, 14]]
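
One way to sanity-check the output above is to compute the same intervals by a different route. The sketch below uses `itertools.groupby` from the standard library to split the string into runs. The function name `large_groups_groupby` and the `min_len` parameter are illustrative assumptions, not part of the original solution.

```python
from itertools import groupby


def large_groups_groupby(s, min_len=3):
    """Return [start, end] for every run of at least min_len equal characters.

    Hypothetical helper for cross-checking; not part of the original solution.
    """
    result = []
    start = 0
    for _, run in groupby(s):
        length = sum(1 for _ in run)  # consume the run to count its characters
        if length >= min_len:
            result.append([start, start + length - 1])
        start += length  # the next run begins right after this one
    return result


# The three examples from the problem statement.
print(large_groups_groupby("abbxxxxzzy"))         # [[3, 6]]
print(large_groups_groupby("abc"))                # []
print(large_groups_groupby("abcdddeeeeaabbbcd"))  # [[3, 5], [6, 9], [12, 14]]
```

If both versions agree on a handful of inputs, including the empty string, that is reasonable evidence that the index arithmetic in the two-pointer loop is right.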


In [3]:
def show_groups(s):
    i = 0
    start = 0
    print("String:", s)
    while i < len(s):
        while i < len(s) and s[i] == s[start]:
            i += 1
        print(f"group '{s[start:i]}' at [{start}, {i-1}], length={i-start}")
        start = i

test_string = input("Enter a string to visualize all groups: ")
show_groups(test_string)


String: abcdddeeeeaabbbcd
group 'a' at [0, 0], length=1
group 'b' at [1, 1], length=1
group 'c' at [2, 2], length=1
group 'ddd' at [3, 5], length=3
group 'eeee' at [6, 9], length=4
group 'aa' at [10, 11], length=2
group 'bbb' at [12, 14], length=3
group 'c' at [15, 15], length=1
group 'd' at [16, 16], length=1


## Rubber Duck Explanation 🦆

Imagine you're reading the string from left to right with a marker.

You put your marker down at `start`, which is the beginning of a group.

Then you walk forward with `i` and ask:
- "Is this still the same letter as where I put the marker?"
- If yes, keep walking.
- If no, stop. You just walked past the end of that group.

At that moment:
- The group started at `start`
- The group ended at `i - 1`
- So its interval is `[start, i-1]`

Now you check: was this group long (3 or more)?
- If yes, you record `[start, i-1]` in the answer.
- If not, you ignore it.

Then you move the marker to the new position:
- `start = i`
- and repeat the process for the next group.

This repeats until you're off the edge of the string.

Example with `abbxxxxzzy`:

- Start at index 0:
  - You see `"a"`. Length is 1 → ignore.
- Then from index 1:
  - You see `"bb"`. Length is 2 → ignore.
- Then from index 3:
  - You see `"xxxx"`. Length is 4 → record `[3,6]`.
- Then from index 7:
  - `"zz"` → ignore.
- Then from index 9:
  - `"y"` → ignore.

Return `[[3,6]]`.

In other words: we're just circling long "stretches" of the same letter and writing down where each long stretch started and ended.


## Time Complexity and Real-World Use

### Time Complexity

Let `n = len(s)`.

- We scan through the string once with the pointer `i`.
- For each group, `i` moves forward and never goes backward.
- Each character in `s` is visited at most once.

Therefore:
- Time complexity is **O(n)**.
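- Space complexity is **O(k)** where `k` is the number of large groups we find. The worst case is O(n), reached when the string is made entirely of length-3 groups such as `"aaabbbccc..."`, which gives `k = n / 3`. A single long run like `"aaaaa..."` produces only one interval.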

How we measure time complexity here:
1. We count how many times the loops actually *touch* a character.
2. The outer loop and the inner loop are not nested in the "expensive" way. They share the same index `i`, and neither loop ever resets `i` or moves it backward, so the total work across both loops is proportional to `n`, not `n^2`.

This is a classic linear scan.
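
To make the linear bound concrete, here is a small instrumented sketch of the same scan that counts how many times `s[i]` is compared with `s[start]`. The function name `count_comparisons` is a hypothetical helper introduced only for this illustration. The pointer `i` passes each index once, and the count of equality checks is slightly higher because the character that ends one group is checked again as the first character of the next. For a string with `g` groups the total is `n + g - 1`, which is at most `2n - 1`.

```python
def count_comparisons(s):
    """Run the two-pointer scan and count evaluations of s[i] == s[start].

    Hypothetical instrumentation; it mirrors the loop structure of the solution
    but records work done instead of collecting intervals.
    """
    n = len(s)
    start = 0
    i = 0
    comparisons = 0
    while i < n:
        while i < n:
            comparisons += 1
            if s[i] != s[start]:
                break  # mismatch: the current group ended at i - 1
            i += 1
        start = i
    return comparisons


for text in ["abc", "abbxxxxzzy", "aaaaaaaaaa", "abcdddeeeeaabbbcd"]:
    print(len(text), count_comparisons(text))
# 3 5
# 10 14
# 10 10
# 17 25
```

The counts grow with `n` plus the number of groups, never with `n` squared, which matches the O(n) claim.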


### Real-World Applicability

This "find long runs of the same thing and return their ranges" pattern shows up a lot:

1. **Data compression / run-length encoding (RLE)**
   When you compress something like `aaaaabbbbcc`, you want to know runs of identical symbols and their lengths. This is the same idea.

2. **Network / sensor monitoring**
   Imagine a stream of status readings like `"OKOKOKFAILFAILFAILFAILOK"`, where each reading (`OK` or `FAIL`) is one item in the sequence.
   Detecting "large groups" of `FAIL` readings can tell you when there's a serious outage (e.g. 3+ fails in a row).

3. **Genomics / DNA sequences**
   In DNA strings, repeated bases like `"AAAAAA"` (homopolymer runs) can be biologically relevant. You might want to report where long repeats occur.

4. **Text cleanup / validation**
   You might want to detect spammy input, like `"heyyyyyyyyyy"` or `"!!!!!!!!!!!!"`, and flag those long repeated sections.

So even though the problem looks like just string indexing, it's basically:
**"Find stable regions in a 1D signal and report where they're long enough to matter."**
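
As a rough sketch of how the pattern generalizes beyond characters, the function below scans any indexable sequence and reports each long run together with its value. The name `long_runs`, the `min_len` parameter, and the sample `readings` list are assumptions made for illustration. The readings are modeled as a list of tokens so that each `OK` or `FAIL` counts as one item.

```python
def long_runs(items, min_len=3):
    """Return (value, start, end) for each run of at least min_len equal items.

    Hypothetical generalization of the string solution to any sequence.
    """
    runs = []
    start = 0
    n = len(items)
    for i in range(1, n + 1):
        # A run ends at the end of the sequence or where the value changes.
        if i == n or items[i] != items[start]:
            if i - start >= min_len:
                runs.append((items[start], start, i - 1))
            start = i
    return runs


# Sensor-style readings, one token per reading.
readings = ["OK", "OK", "OK", "FAIL", "FAIL", "FAIL", "FAIL", "OK"]
print(long_runs(readings))  # [('OK', 0, 2), ('FAIL', 3, 6)]

# Text cleanup: flag a letter repeated five or more times.
print(long_runs("he" + "y" * 10, min_len=5))  # [('y', 2, 11)]
```

Returning the value alongside the interval makes it easy to filter afterwards, for example keeping only the `FAIL` runs. Dropping the `min_len` filter and emitting `(value, length)` pairs would turn the same loop into a basic run-length encoder.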
