#### [Python <img src="../../assets/pythonLogo.png" alt="py logo" style="height: 1em; vertical-align: sub;">](../README.md) | Easy 🟢 | [Sliding Window](README.md)
# [219. Contains Duplicate II](https://leetcode.com/problems/contains-duplicate-ii/description/)

Given an integer array `nums` and an integer `k`, return `true` if there are two **distinct indices** `i` and `j` in the array such that `nums[i] == nums[j]` and `abs(i - j) <= k`.

#### Example 1:
> **Input:** `nums = [1,2,3,1], k = 3`  
> **Output:** `true`

#### Example 2:
> **Input:** `nums = [1,0,1,1], k = 1`  
> **Output:** `true`

#### Example 3:
> **Input:** `nums = [1,2,3,1,2,3], k = 2`  
> **Output:** `false`

#### Constraints:
- $1 \leq$ `nums.length` $\leq 10^5$
- $-10^9 \leq$ `nums[i]` $\leq 10^9$
- $0 \leq $ `k` $ \leq 10^5$

## Problem Explanation
- This problem asks us to determine whether an array contains at least two identical elements whose indices are at most `k` apart.
- It is a variant of the classic **contains duplicate** problem, with an added constraint on how close the duplicates' indices must be.
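
To make the proximity constraint concrete, here is a minimal sketch that lists every pair of equal values together with the distance between their indices. The helper name `duplicate_pairs` is hypothetical and is used only for illustration; it is not part of the solutions in this write-up.

```python
from itertools import combinations
from typing import List, Tuple

def duplicate_pairs(nums: List[int]) -> List[Tuple[int, int, int]]:
    """Return (i, j, j - i) for every pair of indices i < j holding equal values."""
    return [
        (i, j, j - i)
        for i, j in combinations(range(len(nums)), 2)
        if nums[i] == nums[j]
    ]

# Example 3: duplicates exist, but every pair is 3 apart, which is more than k = 2
pairs = duplicate_pairs([1, 2, 3, 1, 2, 3])
print(pairs)                                   # [(0, 3, 3), (1, 4, 3), (2, 5, 3)]
print(any(dist <= 2 for _, _, dist in pairs))  # False
```

This shows why Example 3 returns `false` even though the array clearly contains duplicates.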
***

# Approach 1: Hash table
- We can use a hash set to keep track of a sliding window that holds up to $k$ elements.
- As this window moves through the array, it always contains exactly the elements that are within the distance constraint of the current position.

## Intuition
- The intuition behind using a hash table (or a set) is its ability to quickly check for the existence of an element.
- By maintaining a set of elements that are within $k$ distance from the current element, we can efficiently query whether a duplicate exists within the specified range.
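
As an illustration of this idea, the sketch below records what the window contains just before each element is checked. The function name `trace_window` is hypothetical, and the window contents are returned as sorted lists purely so the output is easy to read.

```python
from typing import List, Set

def trace_window(nums: List[int], k: int) -> List[List[int]]:
    """Snapshot the window (values within k positions to the left) before each check."""
    window: Set[int] = set()
    snapshots: List[List[int]] = []
    for R, value in enumerate(nums):
        if R > k:
            # nums[R - k - 1] is now more than k positions away, so drop it
            window.discard(nums[R - k - 1])
        snapshots.append(sorted(window))
        if value in window:
            break  # a nearby duplicate was found, so the trace stops here
        window.add(value)
    return snapshots

# Example 3: the matching value always leaves the window one step too early
print(trace_window([1, 2, 3, 1, 2, 3], 2))
# [[], [1], [1, 2], [2, 3], [1, 3], [1, 2]]
```

At index 3 the window is `[2, 3]`, so the earlier `1` has already been dropped and no duplicate is reported.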

## Algorithm
1. **Initialize an empty set**, let's call it `window`, to represent the sliding window of elements within distance `k`.
2. **Iterate** through `nums` using a **pointer** `R`:
    - If the size of the window exceeds `k`, remove the leftmost element from the window (the one at index `L`) and move `L` to the right.
    - Check if the current element `nums[R]` is in the `window`:
        - If it is in the window, return `True` as a duplicate within the required distance is found.
        - Otherwise, if it's not in the window, add `nums[R]` to the window.
3. Return `False` if no such duplicate is found by the end of iterating through the array.

## Code Implementation

In [1]:
from typing import List

class Solution:
    def containsNearbyDuplicate(self, nums: List[int], k: int) -> bool:
        window = set()
        L = 0   # Initialize the left boundary of the window

        for R in range(len(nums)):      # Iterate through the array with the right boundary
            if R - L > k:                   # If the window already holds more than k elements
                window.remove(nums[L])          # Remove the leftmost element from the window
                L += 1                          # Move the left boundary to the right
            if nums[R] in window:           # If the current element is already in the window
                return True                     # We found a duplicate
            window.add(nums[R])             # Add the current element to the window

        return False            # No duplicate found

## Testing

In [2]:
def test_containsNearbyDuplicate(solution, nums, k, expected):
    result = solution.containsNearbyDuplicate(nums, k)
    print(f"nums: {nums}, k: {k}, Expected: {expected}, Got: {result}")
    assert result == expected, "Test failed."
    print("✅ Test passed!")

# Instance of the solution
sol = Solution()

# Test cases
test_containsNearbyDuplicate(sol, [1,2,3,1], 3, True)
test_containsNearbyDuplicate(sol, [1,0,1,1], 1, True)
test_containsNearbyDuplicate(sol, [1,2,3,1,2,3], 2, False)

nums: [1, 2, 3, 1], k: 3, Expected: True, Got: True
✅ Test passed!
nums: [1, 0, 1, 1], k: 1, Expected: True, Got: True
✅ Test passed!
nums: [1, 2, 3, 1, 2, 3], k: 2, Expected: False, Got: False
✅ Test passed!


## Complexity Analysis
- **Variables**:
    - $n$ is the total number of elements in the input array `nums`.
    - $k$ is the specified distance between the indices of the array.

- ### Time Complexity: $O(n)$
    - Each element in the array is visited only once.
    - The operations within the sliding window (_adding and removing elements_) are $O(1)$, thus making the time complexity linear.

- ### Space Complexity: $O(\min(n,k))$
    - The set `window` never holds more than `k + 1` elements (at most `k` at the moment of the membership check), and it holds fewer if the array has fewer than `k` elements.
    - Thus, the space complexity depends on the smaller of the two values, $n$ or $k$.
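
The space bound can be checked empirically. The sketch below repeats the same window logic as the solution above and reports the largest size the set reaches; `max_window_size` is a hypothetical helper written only for this measurement.

```python
from typing import List

def max_window_size(nums: List[int], k: int) -> int:
    """Run the sliding-window logic and return the peak number of stored values."""
    window = set()
    L = 0
    largest = 0
    for R in range(len(nums)):
        if R - L > k:
            window.remove(nums[L])
            L += 1
        if nums[R] in window:
            break
        window.add(nums[R])
        largest = max(largest, len(window))
    return largest

distinct = list(range(1000))            # worst case: no duplicates at all
print(max_window_size(distinct, 10))    # 11
print(max_window_size(distinct, 5000))  # 1000
```

Right after an element is added, the set briefly holds the `k` earlier values plus the new one, so the peak is `k + 1` or `n`, whichever is smaller. That is still $O(\min(n,k))$.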
***

# Approach 1.1: Optimized Dictionary
- This variation of the hash table approach drops the explicit window and instead tracks the last seen index of each element in the array.
- By mapping each value to its most recent index, we can efficiently check the condition for nearby duplicates.

## Intuition
- The core idea here is that we only need to remember the last index at which each number appeared.
- If the same number appears again, you can check if the current index and the last index of this number are within the distance `k`.
- This modification significantly speeds up the search for duplicates by eliminating the need to compare each element with its subsequent elements directly.
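
A small trace can show why the most recent index is the only one worth remembering. The sketch below assumes a hypothetical helper `trace_last_seen` that reports, for each position, the gap to the previous occurrence of the same value (or `None` if there is none).

```python
from typing import Dict, List, Optional, Tuple

def trace_last_seen(nums: List[int]) -> List[Tuple[int, int, Optional[int]]]:
    """Return (index, value, gap to the previous occurrence) for every element."""
    last_seen: Dict[int, int] = {}
    rows: List[Tuple[int, int, Optional[int]]] = []
    for i, value in enumerate(nums):
        gap = i - last_seen[value] if value in last_seen else None
        rows.append((i, value, gap))
        last_seen[value] = i  # always overwrite with the most recent index
    return rows

# Example 2 with k = 1: the gap at index 2 is too large, the gap at index 3 is not
print(trace_last_seen([1, 0, 1, 1]))
# [(0, 1, None), (1, 0, None), (2, 1, 2), (3, 1, 1)]
```

Because the stored index is overwritten at index 2, the check at index 3 compares against the nearest earlier `1` and finds a gap of 1, which satisfies `k = 1`.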

## Algorithm
1. Initialize an empty dictionary `dict` to store each number as the key and its last seen index as the value.
2. Iterate through `nums`, using `i` to track the current index.
    - If the current number `nums[i]` is already in the dictionary `dict`, calculate the difference between `i` and the last seen index of `nums[i]`.
        - If the difference is less than or equal to `k`, return `True` because a nearby duplicate is found.
    - Update the dictionary with the current index `i` for `nums[i]`.
3. If the iteration completes without finding any nearby duplicates, return `False`.

## Code Implementation

In [3]:
class Solution1:
    def containsNearbyDuplicate(self, nums: List[int], k: int) -> bool:
        dict = {}   # Create a dictionary to store the index of each element
        for i in range(len(nums)):  # Iterate through the array
            if nums[i] in dict:    # Check if the current number has been seen before
                # if the diff btwn the curr index and the prev index of the number is less than or equal to k
                if abs(dict[nums[i]] - i) <= k:
                    return True         # We found a duplicate
            dict[nums[i]] = i       # Update the index of the current number
        return False            # No duplicate found

### Testing

In [4]:
# Testing optimized hash approach
sol1 = Solution1()

# Test cases
test_containsNearbyDuplicate(sol1, [1,2,3,1], 3, True)
test_containsNearbyDuplicate(sol1, [1,0,1,1], 1, True)
test_containsNearbyDuplicate(sol1, [1,2,3,1,2,3], 2, False)

nums: [1, 2, 3, 1], k: 3, Expected: True, Got: True
✅ Test passed!
nums: [1, 0, 1, 1], k: 1, Expected: True, Got: True
✅ Test passed!
nums: [1, 2, 3, 1, 2, 3], k: 2, Expected: False, Got: False
✅ Test passed!


## Complexity Analysis
- ### Time Complexity: $O(n)$
    - Each element in the array is visited exactly once. 
    - Checking for the existence of a key in the hash table and updating the hash table are both $O(1)$ operations, leading to a linear time complexity relative to the size of the input array.
- ### Space Complexity: $O(n)$
    - The space complexity depends on the number of unique elements stored in the hash table.
    - Unlike the set in Approach 1, the dictionary never removes entries. In the worst case (all elements distinct) it stores one index per element, which is $O(n)$ regardless of $k$.
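
If memory is a concern, one possible variant guarantees that the dictionary never holds more than `k` keys after each step by evicting the entry that slides out of range. This is a sketch under that assumption, not the solution given above; the name `contains_nearby_duplicate_bounded` is hypothetical.

```python
from typing import Dict, List

def contains_nearby_duplicate_bounded(nums: List[int], k: int) -> bool:
    """Last-seen-index approach that also evicts entries older than k positions."""
    last_seen: Dict[int, int] = {}
    for i, value in enumerate(nums):
        if value in last_seen and i - last_seen[value] <= k:
            return True
        last_seen[value] = i
        # Index i - k is too far from every later index, so its entry is stale
        stale = i - k
        if stale >= 0 and last_seen.get(nums[stale]) == stale:
            del last_seen[nums[stale]]
    return False

print(contains_nearby_duplicate_bounded([1, 2, 3, 1], 3))        # True
print(contains_nearby_duplicate_bounded([1, 2, 3, 1, 2, 3], 2))  # False
```

The eviction only deletes a key when its stored index is exactly the stale one, so a more recent occurrence of the same value is never lost.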
***

# Approach 2: Brute force/Naive Linear Search
- Another way of approaching this problem is doing a Naive Linear Search where we check each element against all other elements within the distance `k` to find any duplicates.
- This approach doesn't use any advanced data structures/algorithms and relies on simple iteration and comparison.

## Intuition
The main idea for this approach is simple: we compare every element in the array with the subsequent elements that are up to `k` positions away.

### Disclaimer
- While the Naive Linear Search approach is intuitive and ensures that no potential duplicates are missed, it is not the most efficient, especially for large arrays or large values of k. 
- This approach can lead to significant performance degradation due to its potentially high computational cost.
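
To give a rough sense of that cost, the sketch below counts how many element comparisons the naive search performs when the array has no duplicates at all (the worst case). The helper `count_comparisons` is hypothetical and assumes each element is compared with up to `k` elements that follow it.

```python
def count_comparisons(n: int, k: int) -> int:
    """Worst-case number of pairs (i, j) with i < j <= i + k and j < n."""
    return sum(min(i + k + 1, n) - (i + 1) for i in range(n))

print(count_comparisons(1000, 10))    # 9945
print(count_comparisons(1000, 1000))  # 499500
```

With a small `k` the work stays close to `n * k`, but once `k` reaches `n` it grows quadratically with the array size.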


## Algorithm
1. Iterate through the array with an outer loop, using an index `i` to track the current position.
2. For each element at index `i`, use another loop to compare this element with the next `k` elements, or until the end of the array is reached. We'll use `j` to keep track of the inner loop.
3. If a duplicate is found within `k` positions (i.e. `nums[i] == nums[j]` where `abs(i-j) <= k`), return `True`.
4. If the end of the array is reached without finding any duplicates within the allowed distance, return `False`.

## Code Implementation

In [5]:
class Solution2:
    def containsNearbyDuplicate(self, nums: List[int], k: int) -> bool:
        n = len(nums)   # length of the array
        for i in range(n):      # Iterate through the array
            for j in range(i+1, min(i + k + 1, n)): # Check up to k positions ahead
                if nums[i] == nums[j]:      
                    return True         # We found a duplicate
        return False        # else we didn't find a duplicate

### Testing

In [6]:
# Testing naive linear search approach
sol2 = Solution2()

# Test cases
test_containsNearbyDuplicate(sol2, [1,2,3,1], 3, True)
test_containsNearbyDuplicate(sol2, [1,0,1,1], 1, True)
test_containsNearbyDuplicate(sol2, [1,2,3,1,2,3], 2, False)

nums: [1, 2, 3, 1], k: 3, Expected: True, Got: True
✅ Test passed!
nums: [1, 0, 1, 1], k: 1, Expected: True, Got: True
✅ Test passed!
nums: [1, 2, 3, 1, 2, 3], k: 2, Expected: False, Got: False
✅ Test passed!


## Complexity Analysis
- ### Time Complexity: $O(n\min{(k,n)})$
    - In the worst case, for each of the $n$ elements of the array, the algorithm checks up to the next $k$ elements for a duplicate, so the running time grows with the product of the array's size and the distance.
    - Each inner scan takes $O(\min{(k,n)})$ time and we run at most $n$ scans, thus we have $O(n\min{(k,n)})$.
- ### Space Complexity: $O(1)$
    - The algorithm uses a constant amount of space regardless of the input size. 
    - The only additional memory used is for the loop counters and temporary variables, which do not scale with the size of the input array.
***

# Conclusion
For this problem, we can go about approaching it in several ways:
1. **Brute Force/Naive Linear Search:** Check each element against the others within the distance `k`, which leads to a potentially high computational cost, especially for large arrays and values of `k`.
2. **Hash Table:** We can use a hash table/dictionary to keep track of the last seen indices of elements, which allows for efficient checks/lookups for nearby duplicates.
3. **Binary Search Tree (BST):** While a BST could be used to maintain a sorted structure of the last `k` elements, allowing for $O(\log{k})$ insertions and deletions, its complexity in handling duplicates and the overhead of maintaining tree balance make it less practical than a hash table.
### Why using a hash table/dictionary is the way
- **Efficiency:** Using a hash table reduces the time complexity to $O(n)$, down from the $O(n\min{(k,n)})$ of the naive linear search, and it also avoids the $O(\log{k})$ cost per insertion and deletion of a BST. This efficiency comes from the hash table's constant-time (on average) lookups and updates, which enable faster checks for duplicates within the required range of `k`.
- **Scalability:** Hash tables scale well to large data sets. Unlike the naive linear search, which slows down dramatically as the data size or `k` grows, or BSTs, which require more overhead to manage, hash tables maintain consistent performance.
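
As a final sanity check, the approaches can be compared against each other on random inputs. The sketch below re-implements the three ideas as standalone functions so it is self-contained; the names `with_set`, `with_dict`, `brute_force`, and `approaches_agree` are hypothetical, and the input ranges are arbitrary small values chosen so that duplicates are frequent.

```python
import random
from typing import Dict, List

def with_set(nums: List[int], k: int) -> bool:
    window = set()
    for R, value in enumerate(nums):
        if R > k:
            window.remove(nums[R - k - 1])  # drop the value that left the window
        if value in window:
            return True
        window.add(value)
    return False

def with_dict(nums: List[int], k: int) -> bool:
    last_seen: Dict[int, int] = {}
    for i, value in enumerate(nums):
        if value in last_seen and i - last_seen[value] <= k:
            return True
        last_seen[value] = i
    return False

def brute_force(nums: List[int], k: int) -> bool:
    n = len(nums)
    return any(
        nums[i] == nums[j]
        for i in range(n)
        for j in range(i + 1, min(i + k + 1, n))
    )

def approaches_agree(trials: int = 500, seed: int = 0) -> bool:
    """Return True if all three approaches give the same answer on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        nums = [rng.randint(0, 5) for _ in range(rng.randint(1, 12))]
        k = rng.randint(0, 6)
        if len({with_set(nums, k), with_dict(nums, k), brute_force(nums, k)}) != 1:
            return False
    return True

print(approaches_agree())  # True
```

Agreement on random inputs is not a proof of correctness, but it is a cheap way to catch boundary mistakes such as an off-by-one in the window size.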