# 992. Subarrays with K Different Integers

Hard

Given an integer array nums and an integer k, return the number of good subarrays of nums.

A good array is an array where the number of different integers in that array is exactly k.

For example, [1,2,3,1,2] has 3 different integers: 1, 2, and 3.

A subarray is a contiguous part of an array.

# Example 1:

```
Input: nums = [1,2,1,2,3], k = 2
Output: 7
Explanation: Subarrays formed with exactly 2 different integers: [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]
```

# Example 2:

```
Input: nums = [1,2,1,3,4], k = 3
Output: 3
Explanation: Subarrays formed with exactly 3 different integers: [1,2,1,3], [2,1,3], [1,3,4].

```

# Constraints:

- 1 <= nums.length <= 2 \* 10^4
- 1 <= nums[i], k <= nums.length


**Problem Statement**

Given an integer array `nums` and an integer `k`, we need to find the number of contiguous subarrays within `nums` that contain exactly `k` distinct integers.

**Understanding the Problem**

A subarray is a contiguous part of the original array. Conceptually, we consider every possible subarray of `nums` and check whether the number of unique integers in it equals `k`. The answer is the total count of such "good" subarrays.

**Example Breakdown**

- **Example 1:** `nums = [1, 2, 1, 2, 3]`, `k = 2`
  The subarrays with exactly 2 distinct integers are:
  `[1, 2]`, `[2, 1]`, `[1, 2]`, `[2, 3]`, `[1, 2, 1]`, `[2, 1, 2]`, `[1, 2, 1, 2]`
  So the output is 7.

- **Example 2:** `nums = [1, 2, 1, 3, 4]`, `k = 3`
  The subarrays with exactly 3 distinct integers are:
  `[1, 2, 1, 3]`, `[2, 1, 3]`, `[1, 3, 4]`
  So the output is 3.

**Brute-Force Approach**

The most straightforward approach is to generate all possible contiguous subarrays and, for each subarray, count the number of distinct integers.

**Algorithm:**

1.  Initialize a counter `count` to 0.
2.  Iterate through all possible starting indices `i` from 0 to `len(nums) - 1`.
3.  For each starting index `i`, iterate through all possible ending indices `j` from `i` to `len(nums) - 1`.
4.  For each subarray `nums[i:j+1]`:
    a. Create a set of the elements in the subarray to find the distinct integers.
    b. If the size of the set is equal to `k`, increment `count`.
5.  Return `count`.

**Python Code (Brute-Force):**

```python
class SolutionBruteForce:
    def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
        n = len(nums)
        count = 0
        for i in range(n):
            for j in range(i, n):
                subarray = nums[i : j + 1]
                distinct_count = len(set(subarray))
                if distinct_count == k:
                    count += 1
        return count

# Test Cases (Brute-Force)
solver_bf = SolutionBruteForce()
print(f"Brute-force [1,2,1,2,3], k=2: {solver_bf.subarraysWithKDistinct([1, 2, 1, 2, 3], 2)}")
print(f"Brute-force [1,2,1,3,4], k=3: {solver_bf.subarraysWithKDistinct([1, 2, 1, 3, 4], 3)}")
print(f"Brute-force [1,1,1,1,1], k=1: {solver_bf.subarraysWithKDistinct([1, 1, 1, 1, 1], 1)}")
print(f"Brute-force [1,2,3,4,5], k=3: {solver_bf.subarraysWithKDistinct([1, 2, 3, 4, 5], 3)}")
print(f"Brute-force [], k=0: {solver_bf.subarraysWithKDistinct([], 0)}")
print(f"Brute-force [1], k=1: {solver_bf.subarraysWithKDistinct([1], 1)}")
print(f"Brute-force [1,2], k=3: {solver_bf.subarraysWithKDistinct([1, 2], 3)}")
```

**Time and Space Complexity (Brute-Force):**

- **Time Complexity:** O(n^3) in the worst case. The outer two loops iterate O(n^2) times, and creating a set for each subarray takes O(n) in the worst case.
- **Space Complexity:** O(n) in the worst case to store the subarray and the set.
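
The O(n) cost of rebuilding the set for every subarray can be avoided by growing one set per starting index. The sketch below is a refinement that is not part of the original write-up; the function name `count_exactly_k_quadratic` is made up for illustration. It runs in O(n^2) time, which is better but still quadratic.

```python
def count_exactly_k_quadratic(nums: list[int], k: int) -> int:
    """Count subarrays with exactly k distinct values in O(n^2) time."""
    count = 0
    for i in range(len(nums)):
        seen = set()  # distinct values in nums[i : j + 1], grown one element at a time
        for j in range(i, len(nums)):
            seen.add(nums[j])
            if len(seen) > k:
                break  # extending the subarray can never reduce the distinct count
            if len(seen) == k:
                count += 1
    return count


print(count_exactly_k_quadratic([1, 2, 1, 2, 3], 2))  # 7
print(count_exactly_k_quadratic([1, 2, 1, 3, 4], 3))  # 3
```

For n = 2 \* 10^4 this is still on the order of 2 \* 10^8 set operations, so it mainly serves as a faster reference implementation for checking other solutions.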

With `n` up to 2 \* 10^4, O(n^3) is on the order of 8 \* 10^12 basic operations, so this brute-force approach will time out for the given constraints. We need a more efficient solution.

**Optimized Approach: Using Inclusion-Exclusion (Counting Subarrays with At Most K Distinct Integers)**

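The problem of counting subarrays with _exactly_ `k` distinct integers is harder to solve directly with a sliding window, because the "exactly `k`" property is not preserved when the window grows or shrinks. However, we can reframe the problem as a difference of two easier counts, an idea often described (somewhat loosely) as inclusion-exclusion.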

The number of subarrays with exactly `k` distinct integers is equal to:

(Number of subarrays with at most `k` distinct integers) - (Number of subarrays with at most `k-1` distinct integers)
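
The identity holds because every subarray with at most `k-1` distinct integers is also counted among those with at most `k`, so the subtraction leaves only the subarrays with exactly `k`. As a sanity check, the sketch below tallies Example 1 by brute force; the helper names `distinct_histogram` and `at_most` are hypothetical and exist only for this illustration.

```python
def distinct_histogram(nums: list[int]) -> dict[int, int]:
    """Map d -> number of subarrays of nums with exactly d distinct values."""
    hist: dict[int, int] = {}
    for i in range(len(nums)):
        seen = set()
        for j in range(i, len(nums)):
            seen.add(nums[j])
            hist[len(seen)] = hist.get(len(seen), 0) + 1
    return hist


def at_most(hist: dict[int, int], m: int) -> int:
    """Number of subarrays with at most m distinct values, read off the histogram."""
    return sum(c for d, c in hist.items() if d <= m)


hist = distinct_histogram([1, 2, 1, 2, 3])
print(hist)  # {1: 5, 2: 7, 3: 3}
print(at_most(hist, 2), at_most(hist, 1), at_most(hist, 2) - at_most(hist, 1))  # 12 5 7
```
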

Now, the problem reduces to efficiently counting the number of subarrays with at most `m` distinct integers for a given `m`. This can be done using a sliding window approach.

**Algorithm (Optimized):**

1.  Create a helper function `count_at_most_k_distinct(nums, k)` that returns the number of subarrays with at most `k` distinct integers. (The class-based code below implements it as the method `subarraysWithAtMostKDistinct`.)
2.  In the main function `subarraysWithKDistinct(nums, k)`:
    a. Return `count_at_most_k_distinct(nums, k) - count_at_most_k_distinct(nums, k - 1)`.

**Helper Function `count_at_most_k_distinct(nums, k)`:**

1.  Initialize `count = 0`, `left = 0`, `distinct_elements = 0`, and a dictionary `distinct_counts` to store the frequency of each element in the current window.
2.  Iterate through the array with the `right` pointer from 0 to `len(nums) - 1`. For each `right`:
    a. If `nums[right]` is not in `distinct_counts` or its count is 0, increment `distinct_elements`.
    b. Increment the count of `nums[right]` in `distinct_counts`.
    c. While `distinct_elements > k`: decrement the count of `nums[left]` in `distinct_counts`; if that count becomes 0, decrement `distinct_elements`; then move the `left` pointer one step to the right.
    d. The number of valid subarrays ending at `right` with at most `k` distinct elements is now `right - left + 1`. Add this to `count`.
3.  Return `count`.

**Python Code (Optimized):**

```python
from collections import defaultdict

class Solution:
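    def subarraysWithAtMostKDistinct(self, nums: list[int], k: int) -> int:
        """Count subarrays of nums that contain at most k distinct integers."""
        n = len(nums)
        count = 0
        left = 0  # left edge of the current window
        distinct_counts = defaultdict(int)  # value -> frequency inside the window
        distinct_elements = 0  # number of values with a nonzero frequency

        for right in range(n):
            # Extend the window to include nums[right].
            if distinct_counts[nums[right]] == 0:
                distinct_elements += 1
            distinct_counts[nums[right]] += 1

            # Shrink from the left until the window is valid again.
            while distinct_elements > k:
                distinct_counts[nums[left]] -= 1
                if distinct_counts[nums[left]] == 0:
                    distinct_elements -= 1
                left += 1
            # Every start index in left..right gives a valid subarray ending at right.
            count += right - left + 1
        return count

    def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
        # Guard: without it, k == 0 would call the helper with a limit of -1.
        if k == 0:
            return 0
        # exactly(k) = atMost(k) - atMost(k - 1)
        return self.subarraysWithAtMostKDistinct(nums, k) - self.subarraysWithAtMostKDistinct(nums, k - 1)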

# Test Cases (Optimized)
solver = Solution()
print(f"Optimized [1,2,1,2,3], k=2: {solver.subarraysWithKDistinct([1, 2, 1, 2, 3], 2)}")
print(f"Optimized [1,2,1,3,4], k=3: {solver.subarraysWithKDistinct([1, 2, 1, 3, 4], 3)}")
print(f"Optimized [1,1,1,1,1], k=1: {solver.subarraysWithKDistinct([1, 1, 1, 1, 1], 1)}")
print(f"Optimized [1,2,3,4,5], k=3: {solver.subarraysWithKDistinct([1, 2, 3, 4, 5], 3)}")
print(f"Optimized [], k=0: {solver.subarraysWithKDistinct([], 0)}")
print(f"Optimized [1], k=1: {solver.subarraysWithKDistinct([1], 1)}")
print(f"Optimized [1,2], k=3: {solver.subarraysWithKDistinct([1, 2], 3)}")
print(f"Optimized [3,2,3,1,3], k=2: {solver.subarraysWithKDistinct([3, 2, 3, 1, 3], 2)}")
print(f"Optimized [1,2,3], k=1: {solver.subarraysWithKDistinct([1, 2, 3], 1)}")
print(f"Optimized [1,2,3], k=2: {solver.subarraysWithKDistinct([1, 2, 3], 2)}")
print(f"Optimized [1,2,3], k=3: {solver.subarraysWithKDistinct([1, 2, 3], 3)}")
print(f"Optimized [1,2,3], k=4: {solver.subarraysWithKDistinct([1, 2, 3], 4)}")
print(f"Optimized [1,0,0,0,1,0,0,1], k=1: {solver.subarraysWithKDistinct([1,0,0,0,1,0,0,1], 1)}")
```

**Time and Space Complexity (Optimized):**

- **Time Complexity:** O(n). The `right` pointer iterates through the array once in `subarraysWithAtMostKDistinct`. The `left` pointer also moves at most `n` times in total. Therefore, the overall time complexity is linear.
- **Space Complexity:** O(n) in the worst case to store the `distinct_counts` dictionary (if all elements in `nums` are distinct).

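This optimized approach, a sliding window applied to the two "at most" counts, is efficient enough to solve the problem within the given constraints.

The same solution can also be written with the helper nested inside the main method. The version below does that and adds a broader set of edge and test cases:
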
```python
from collections import defaultdict

class Solution:
    def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
        def count_at_most_k_distinct(arr, k_limit):
            n = len(arr)
            count = 0
            left = 0
            distinct_counts = defaultdict(int)
            distinct_elements = 0

            for right in range(n):
                if distinct_counts[arr[right]] == 0:
                    distinct_elements += 1
                distinct_counts[arr[right]] += 1

                while distinct_elements > k_limit:
                    distinct_counts[arr[left]] -= 1
                    if distinct_counts[arr[left]] == 0:
                        distinct_elements -= 1
                    left += 1
                count += right - left + 1
            return count

        if k == 0:
            return 0
        return count_at_most_k_distinct(nums, k) - count_at_most_k_distinct(nums, k - 1)

# Edge and Test Cases (Sliding Window Approach)
solver = Solution()

# Edge Case 1: Empty array
print(f"Sliding Window [], k=0: {solver.subarraysWithKDistinct([], 0)}")
print(f"Sliding Window [], k=1: {solver.subarraysWithKDistinct([], 1)}")

# Edge Case 2: k is greater than the number of elements
print(f"Sliding Window [1, 2], k=3: {solver.subarraysWithKDistinct([1, 2], 3)}")

# Edge Case 3: k is 0
print(f"Sliding Window [1, 2], k=0: {solver.subarraysWithKDistinct([1, 2], 0)}")

# Edge Case 4: Single element array
print(f"Sliding Window [1], k=1: {solver.subarraysWithKDistinct([1], 1)}")
print(f"Sliding Window [1], k=0: {solver.subarraysWithKDistinct([1], 0)}")

# Edge Case 5: Array with all same elements
print(f"Sliding Window [1, 1, 1], k=1: {solver.subarraysWithKDistinct([1, 1, 1], 1)}")
print(f"Sliding Window [1, 1, 1], k=2: {solver.subarraysWithKDistinct([1, 1, 1], 2)}")

# Example 1 (from problem description)
nums1 = [1, 2, 1, 2, 3]
k1 = 2
print(f"Sliding Window {nums1}, k={k1}: {solver.subarraysWithKDistinct(nums1, k1)}")

# Example 2 (from problem description)
nums2 = [1, 2, 1, 3, 4]
k2 = 3
print(f"Sliding Window {nums2}, k={k2}: {solver.subarraysWithKDistinct(nums2, k2)}")

# Test Case 3: k equals the number of distinct elements in the whole array
nums3 = [1, 2, 3]
k3 = 3
print(f"Sliding Window {nums3}, k={k3}: {solver.subarraysWithKDistinct(nums3, k3)}")

# Test Case 4: k is less than the number of distinct elements in the whole array
nums4 = [1, 2, 3, 1, 2]
k4 = 2
print(f"Sliding Window {nums4}, k={k4}: {solver.subarraysWithKDistinct(nums4, k4)}")

# Test Case 5: Larger array with repetitions
nums5 = [3, 2, 3, 1, 3, 2, 1]
k5 = 2
print(f"Sliding Window {nums5}, k={k5}: {solver.subarraysWithKDistinct(nums5, k5)}")

# Test Case 6: k is larger than the possible distinct elements in a subarray
nums6 = [1, 2, 1, 2]
k6 = 3
print(f"Sliding Window {nums6}, k={k6}: {solver.subarraysWithKDistinct(nums6, k6)}")

# Test Case 7: Array with zeros (though the problem constraints say nums[i] >= 1)
# If the constraint were different, this would be relevant.
# nums7 = [1, 0, 2, 0, 3]
# k7 = 2
# print(f"Sliding Window {nums7}, k={k7}: {solver.subarraysWithKDistinct(nums7, k7)}")
```

**Explanation of the Sliding Window Approach:**

The solution uses a clever technique based on the inclusion-exclusion principle combined with a sliding window. Instead of directly counting subarrays with exactly `k` distinct integers, it calculates:

`count(at most k distinct) - count(at most k-1 distinct)`

The difference between these two counts gives us the number of subarrays with exactly `k` distinct integers.

**`count_at_most_k_distinct(arr, k_limit)` Function:**

This helper function counts the number of contiguous subarrays in `arr` that have at most `k_limit` distinct integers. It uses a sliding window approach:

1.  **Initialization:**

    - `n`: Length of the input array `arr`.
    - `count`: Counter for the number of valid subarrays.
    - `left`: Left pointer of the sliding window.
    - `distinct_counts`: A `defaultdict(int)` to store the frequency of each element within the current window.
    - `distinct_elements`: The number of distinct elements in the current window.

2.  **Sliding the Right Pointer:**

    - The `right` pointer iterates through the array, expanding the window.
    - For each new element `arr[right]`:
      - If it's a new distinct element in the window (`distinct_counts[arr[right]] == 0`), increment `distinct_elements`.
      - Increment the count of `arr[right]` in `distinct_counts`.

3.  **Shrinking the Left Pointer (Maintaining the Constraint):**

    - `while distinct_elements > k_limit:`: If the number of distinct elements in the current window exceeds `k_limit`, we need to shrink the window from the left.
    - Decrement the count of `arr[left]` in `distinct_counts`.
    - If the count of `arr[left]` becomes 0 after decrementing, it means this element is no longer present in the window, so we decrement `distinct_elements`.
    - Move the `left` pointer one step to the right.

4.  **Counting Valid Subarrays:**

    - `count += right - left + 1`: At each `right` position, after shrinking, the window `[left, right]` contains at most `k_limit` distinct elements. Dropping elements from the front of a window can never add a new distinct value, so every subarray ending at `right` and starting at any index from `left` to `right` (inclusive) also satisfies the condition. The number of such subarrays is `right - left + 1`.

5.  **Return Count:** The function returns the total count of subarrays with at most `k_limit` distinct elements.
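
To make the steps above concrete, here is a small tracing variant of the helper run on Example 1 with a limit of 2. The function name `trace_at_most` and the shortened variable names are assumptions made for this sketch; the logic mirrors `count_at_most_k_distinct`.

```python
from collections import defaultdict


def trace_at_most(arr: list[int], k_limit: int) -> list[tuple[int, int, int]]:
    """Return (right, left, added) for each step of the at-most-k_limit window."""
    steps = []
    left = 0
    freq = defaultdict(int)  # value -> frequency inside the window
    distinct = 0
    for right, value in enumerate(arr):
        if freq[value] == 0:
            distinct += 1
        freq[value] += 1
        while distinct > k_limit:
            freq[arr[left]] -= 1
            if freq[arr[left]] == 0:
                distinct -= 1
            left += 1
        steps.append((right, left, right - left + 1))
    return steps


for right, left, added in trace_at_most([1, 2, 1, 2, 3], 2):
    print(f"right={right} left={left} adds {added}")
# right=0 left=0 adds 1
# right=1 left=0 adds 2
# right=2 left=0 adds 3
# right=3 left=0 adds 4
# right=4 left=3 adds 2
```

The contributions sum to 12, the number of subarrays of `[1, 2, 1, 2, 3]` with at most 2 distinct integers. Only the last step shrinks the window: adding `3` brings the distinct count to 3, and `left` advances to index 3 before the count is valid again.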

**`subarraysWithKDistinct(nums, k)` Function:**

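1.  **Handle `k=0`:** No non-empty subarray has exactly 0 distinct integers, so we return 0 immediately. The early return also keeps the helper from being called with a limit of `k - 1 = -1`, for which the shrinking condition `distinct_elements > k_limit` stays true even on an empty window and `left` runs past the end of a non-empty array.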

2.  **Apply Inclusion-Exclusion:** We call `count_at_most_k_distinct` twice:

    - Once with `k` to get the count of subarrays with at most `k` distinct integers.
    - Once with `k - 1` to get the count of subarrays with at most `k - 1` distinct integers.

3.  **Return the Difference:** The difference between these two counts gives the number of subarrays with exactly `k` distinct integers.
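
The role of the `k == 0` guard can be seen by removing it. The sketch below is illustrative only; `count_at_most` and `exactly_k_unguarded` are hypothetical names, with `count_at_most` following the same logic as the helper above.

```python
from collections import defaultdict


def count_at_most(arr: list[int], k_limit: int) -> int:
    """Same sliding-window logic as the helper described above."""
    count = 0
    left = 0
    freq = defaultdict(int)
    distinct = 0
    for right in range(len(arr)):
        if freq[arr[right]] == 0:
            distinct += 1
        freq[arr[right]] += 1
        while distinct > k_limit:
            freq[arr[left]] -= 1  # with k_limit = -1 this eventually indexes past the end
            if freq[arr[left]] == 0:
                distinct -= 1
            left += 1
        count += right - left + 1
    return count


def exactly_k_unguarded(nums: list[int], k: int) -> int:
    """The difference formula WITHOUT the k == 0 early return."""
    return count_at_most(nums, k) - count_at_most(nums, k - 1)


try:
    exactly_k_unguarded([1, 2], 0)
except IndexError:
    print("k = 0 without the guard: IndexError from the helper called with limit -1")
```

For `k >= 1` the smallest limit ever passed is 0, and the shrinking loop always stops once the window is empty, so the guard is only needed for `k == 0`.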

**Time and Space Complexity:**

- **Time Complexity:** O(n), where n is the length of the `nums` array. Both calls to `count_at_most_k_distinct` run in linear time: the `right` pointer makes a single pass, and the `left` pointer moves at most `n` times in total.
- **Space Complexity:** O(n) in the worst case for the `distinct_counts` dictionary if all elements in `nums` are distinct.

This sliding window approach combined with the inclusion-exclusion principle provides an efficient solution to the problem. The test cases cover various edge scenarios and the examples from the problem description.
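
Beyond hand-picked cases, one way to gain further confidence is to compare the sliding-window solution against the brute-force one on small random inputs. The following is a minimal sketch of such a check; the function names, the trial count, and the input sizes are arbitrary choices made for this example.

```python
import random
from collections import defaultdict


def brute_force(nums: list[int], k: int) -> int:
    """Reference answer: check every subarray with a set."""
    return sum(
        1
        for i in range(len(nums))
        for j in range(i, len(nums))
        if len(set(nums[i : j + 1])) == k
    )


def at_most(nums: list[int], limit: int) -> int:
    """Sliding-window count of subarrays with at most `limit` distinct values."""
    count = left = distinct = 0
    freq = defaultdict(int)
    for right, value in enumerate(nums):
        if freq[value] == 0:
            distinct += 1
        freq[value] += 1
        while distinct > limit:
            freq[nums[left]] -= 1
            if freq[nums[left]] == 0:
                distinct -= 1
            left += 1
        count += right - left + 1
    return count


def sliding_window(nums: list[int], k: int) -> int:
    if k == 0:
        return 0
    return at_most(nums, k) - at_most(nums, k - 1)


def cross_check(trials: int = 500, seed: int = 0) -> int:
    """Compare both solutions on random inputs that respect the constraints."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 12)
        nums = [rng.randint(1, n) for _ in range(n)]  # 1 <= nums[i] <= n
        k = rng.randint(1, n)                         # 1 <= k <= n
        assert brute_force(nums, k) == sliding_window(nums, k), (nums, k)
    return trials


print(f"{cross_check()} random cases agree")
```

Keeping the arrays short keeps the cubic brute force fast while still exercising repeated values, `k` larger than the number of distinct values, and single-element inputs.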
