# LeetCode 217: Contains Duplicate

**Difficulty:** Easy  
**Pattern:** Arrays & Hashing  
**Topics:** Array, Hash Table, Sorting

---

## Problem Statement

Given an integer array `nums`, return `true` if any value appears **at least twice** in the array, and return `false` if every element is distinct.

### Constraints

- `1 <= nums.length <= 10^5`
- `-10^9 <= nums[i] <= 10^9`

### Examples

**Example 1:**
```
Input: nums = [1,2,3,1]
Output: true
```

**Example 2:**
```
Input: nums = [1,2,3,4]
Output: false
```

**Example 3:**
```
Input: nums = [1,1,1,3,3,4,3,2,4,2]
Output: true
```

---

## Approach 1: Sorting

### Intuition

If we sort the array, duplicate values will be adjacent to each other. We can then check neighboring elements.

### Algorithm

1. Sort the array
2. Iterate through the sorted array
3. Check whether any two adjacent elements are equal
4. Return `true` if a duplicate is found, `false` otherwise

### Complexity Analysis

- **Time Complexity:** O(n log n) - dominated by sorting
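- **Space Complexity:** O(1) or O(n) - depends on the sorting algorithm (an in-place heapsort needs O(1) extra space; Python's built-in Timsort can need O(n) auxiliary space in the worst case)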

### Visualization

```
Original: [1, 2, 3, 1]
Sorted:   [1, 1, 2, 3]
          [1, 1] ← Adjacent duplicates found!
```

```python
from typing import List

def contains_duplicate_sort(nums: List[int]) -> bool:
    """
    Sorting approach.

    Time: O(n log n), Space: O(1) to O(n)
    """
    nums.sort()

    for i in range(len(nums) - 1):
        if nums[i] == nums[i + 1]:
            return True

    return False

# Test cases
test_cases = [
    [1, 2, 3, 1],
    [1, 2, 3, 4],
    [1, 1, 1, 3, 3, 4, 3, 2, 4, 2]
]

print("Sorting Approach:")
for nums in test_cases:
    result = contains_duplicate_sort(nums.copy())  # Use copy to avoid modifying original
    print(f"nums={nums} → {result}")
```
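Because `nums.sort()` reorders the caller's list, the test loop above has to pass a copy. The following is a minimal sketch of a non-mutating variant, assuming the O(n) memory for a sorted copy is acceptable. The name `contains_duplicate_sorted_copy` is illustrative and not part of the original solution.

```python
from typing import List


def contains_duplicate_sorted_copy(nums: List[int]) -> bool:
    """Sorting approach that leaves the input list untouched."""
    ordered = sorted(nums)  # sorted() returns a new list; nums is not modified
    # Pair each element with its right-hand neighbor and compare.
    return any(a == b for a, b in zip(ordered, ordered[1:]))


original = [1, 2, 3, 1]
print(contains_duplicate_sorted_copy(original))  # True
print(original)  # [1, 2, 3, 1]
```

The trade-off is explicit here: the input stays intact, but the copy rules out the O(1)-space argument for sorting.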

---

## Approach 2: Hash Set (Optimal)

### Intuition

We can use a hash set to track numbers we've seen. If we encounter a number already in the set, we've found a duplicate.

### Algorithm

1. Create an empty set to track seen numbers
2. Iterate through array:
   - If current number is in set → return `true` (duplicate found)
   - Otherwise, add number to set
3. If we finish the loop → return `false` (no duplicates)

### Complexity Analysis

- **Time Complexity:** O(n) on average - single pass through the array, assuming O(1) average-case set lookups and insertions
- **Space Complexity:** O(n) - set can store up to n elements

### Visualization

For `nums = [1, 2, 3, 1]`:

```
Step 1: num=1, seen={}     → Add 1
        seen={1}

Step 2: num=2, seen={1}    → Add 2
        seen={1, 2}

Step 3: num=3, seen={1,2}  → Add 3
        seen={1, 2, 3}

Step 4: num=1, seen={1,2,3} → 1 already in set! ✓
        Return True
```

```python
def contains_duplicate(nums: List[int]) -> bool:
    """
    Hash set approach (Optimal).

    Time: O(n), Space: O(n)
    """
    seen = set()

    for num in nums:
        if num in seen:
            return True
        seen.add(num)

    return False

print("\nHash Set Approach (Optimal):")
for nums in test_cases:
    result = contains_duplicate(nums)
    print(f"nums={nums} → {result}")
```

---

## Approach 3: Pythonic One-Liner

### Intuition

In Python, we can leverage the fact that sets automatically remove duplicates. If converting a list to a set changes its length, there were duplicates.

### Algorithm

Compare `len(nums)` with `len(set(nums))`

### Complexity Analysis

- **Time Complexity:** O(n) - creating set iterates through array
- **Space Complexity:** O(n) - set storage

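**Note:** While concise, this approach always builds the full set, whereas the iterative version can return as soon as it sees a repeat. When a duplicate appears early, the iterative version does far less work. When there are no duplicates, or they appear late, `set(nums)` is often faster in CPython because its loop runs in C rather than in Python bytecode.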

```python
def contains_duplicate_pythonic(nums: List[int]) -> bool:
    """
    Pythonic one-liner.

    Time: O(n), Space: O(n)
    """
    return len(nums) != len(set(nums))

print("\nPythonic Approach:")
for nums in test_cases:
    result = contains_duplicate_pythonic(nums)
    print(f"nums={nums} → {result}")
```
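To make the early-exit difference concrete, the sketch below counts how many elements each strategy consumes when the duplicate sits at the front of the input. The helper `counted` and the function `has_duplicate_early_exit` are hypothetical names introduced only for this illustration. It measures elements consumed, not wall-clock time.

```python
from typing import Iterable, Iterator, List


def counted(values: Iterable[int], tally: List[int]) -> Iterator[int]:
    """Yield each value while recording how many have been consumed."""
    for value in values:
        tally[0] += 1
        yield value


def has_duplicate_early_exit(values: Iterable[int]) -> bool:
    """Iterative hash-set check that stops at the first repeat."""
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


nums = [7, 7] + list(range(1000))  # duplicate sits at the very front

iterative_tally = [0]
has_duplicate_early_exit(counted(nums, iterative_tally))

one_liner_tally = [0]
len(set(counted(nums, one_liner_tally)))  # set() must consume every element

print(iterative_tally[0])  # 2
print(one_liner_tally[0])  # 1002
```

Fewer elements consumed does not always mean less time, since `set()` iterates in C. The gap matters most when duplicates tend to appear early in large inputs.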

---

## Detailed Walkthrough

Let's trace through the optimal hash set solution with verbose output:

```python
def contains_duplicate_verbose(nums: List[int]) -> bool:
    """
    Verbose version showing each step.
    """
    seen = set()

    print(f"Checking array: {nums}\n")

    for i, num in enumerate(nums):
        print(f"Step {i+1}: Checking {num}")
        print(f"  Current seen: {seen}")

        if num in seen:
            print(f"  ✓ Duplicate found! {num} was already seen.")
            return True

        seen.add(num)
        print(f"  ✗ Not a duplicate. Added {num} to seen.")
        print()

    print("No duplicates found!")
    return False

# Example with duplicates
print("Example 1: Array with duplicates")
print("="*50)
result1 = contains_duplicate_verbose([1, 2, 3, 1])
print(f"\nResult: {result1}\n")

# Example without duplicates
print("\nExample 2: Array without duplicates")
print("="*50)
result2 = contains_duplicate_verbose([1, 2, 3, 4])
print(f"\nResult: {result2}")
```

---

## Performance Comparison

Let's compare the performance of different approaches:

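```python
import random
import time

# Generate a large test array
large_array = list(range(10000))
large_array.append(5000)  # Add one duplicate
random.shuffle(large_array)

print(f"Testing with array of size {len(large_array)}\n")

def time_call(func, nums):
    """Return (result, elapsed seconds) for one call on a fresh copy."""
    data = nums.copy()  # copy outside the timed region; the sort mutates its input
    start = time.perf_counter()  # perf_counter is the high-resolution clock for timing
    result = func(data)
    return result, time.perf_counter() - start

approaches = [
    ("Sorting approach", contains_duplicate_sort),
    ("Hash set approach", contains_duplicate),
    ("Pythonic approach", contains_duplicate_pythonic),
]

results, timings = {}, {}
for name, func in approaches:
    results[name], timings[name] = time_call(func, large_array)

# A single run is noisy, so the winner is reported per run, not hard-coded.
fastest = min(timings, key=timings.get)
for name, elapsed in timings.items():
    label = f"{name}:"
    marker = "  ← Fastest in this run" if name == fastest else ""
    print(f"{label:<20}{elapsed:.6f}s{marker}")

print(f"\nAll methods agree: {len(set(results.values())) == 1}")
```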

---

## Edge Cases

```python
edge_cases = [
    ([1], "Single element"),
    ([1, 1], "Two identical elements"),
    ([0, 0], "Two zeros"),
    ([-1, -1], "Negative duplicates"),
    ([1, 2, 3, 4, 5], "All unique"),
    ([5, 5, 5, 5], "All same"),
    ([int(1e9), int(1e9)], "Large numbers"),
]

print("Edge Cases:")
print("="*60)
for nums, description in edge_cases:
    result = contains_duplicate(nums)
    print(f"{description:25} {str(nums):20} → {result}")
```
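Hand-picked edge cases can be complemented with a randomized cross-check against a brute-force reference. The sketch below is one way to do that, not part of the original notebook. `contains_duplicate_brute_force` and `contains_duplicate_set` are illustrative names, with the hash-set logic restated so the block runs on its own.

```python
import random
from typing import List


def contains_duplicate_brute_force(nums: List[int]) -> bool:
    """O(n^2) reference: compare every pair of positions."""
    n = len(nums)
    return any(nums[i] == nums[j] for i in range(n) for j in range(i + 1, n))


def contains_duplicate_set(nums: List[int]) -> bool:
    """Hash-set check, restated here so the block is self-contained."""
    seen = set()
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    return False


rng = random.Random(217)  # fixed seed so the run is reproducible
for _ in range(500):
    size = rng.randint(1, 12)
    nums = [rng.randint(-5, 5) for _ in range(size)]
    # Small value range makes both outcomes (duplicate / no duplicate) likely.
    assert contains_duplicate_set(nums) == contains_duplicate_brute_force(nums), nums

print("Hash set agrees with brute force on 500 random arrays")
```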

---

## Comparison of Approaches

| Approach | Time | Space | Early Exit? | Notes |
|----------|------|-------|-------------|-------|
| Sorting | O(n log n) | O(1)-O(n) | Scan only (the full sort runs first) | Good if space is constrained and the input may be reordered |
| Hash Set (Iterative) | O(n) average | O(n) | Yes | **Optimal** - Use in interviews |
| Pythonic | O(n) average | O(n) | No | Concise but always builds the full set |

## When to Use Each Approach

- **Hash Set:** Default choice - O(n) average time with early exit
- **Sorting:** When extra space is extremely limited and reordering the input is acceptable
- **Pythonic:** Code golf, or when brevity matters more than early exit

## Interview Tips

1. **Clarify constraints:** Ask about array size and space limitations
2. **Mention multiple approaches:** Shows breadth of knowledge
3. **Discuss trade-offs:** Time vs space complexity
4. **Consider early exit:** Hash set can return immediately upon finding duplicate
5. **Talk through edge cases:** Single elements, all duplicates, and empty arrays if the interviewer relaxes the `nums.length >= 1` constraint

## Key Takeaways

- Hash sets provide O(1) average-case lookup for membership checking
- Sets are perfect for duplicate detection
- Early exit optimization matters when duplicates are likely
- Python's built-in set operations are highly optimized
- Sometimes the simplest solution (hash set) is the best solution

---

## Follow-up Questions

**Q: What if the array is sorted?**  
A: You can check adjacent elements in O(n) time and O(1) space.

**Q: What if you can't use extra space?**  
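A: Sort the array in-place (if allowed) and check adjacent elements. For truly constant extra space, use an in-place algorithm such as heapsort, since Python's `list.sort()` may allocate O(n) auxiliary memory.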

**Q: What if we need to find which element is duplicated?**  
A: Modify the hash set approach to return the duplicate element instead of a boolean.
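A minimal sketch of that modification follows. The name `first_duplicate` and the choice to return `None` when every element is distinct are assumptions made for illustration.

```python
from typing import List, Optional


def first_duplicate(nums: List[int]) -> Optional[int]:
    """Return the first value whose second occurrence is reached, or None."""
    seen = set()
    for num in nums:
        if num in seen:
            return num
        seen.add(num)
    return None


print(first_duplicate([1, 2, 3, 1]))  # 1
print(first_duplicate([1, 2, 3, 4]))  # None
```

Callers should test the result with `is not None` rather than truthiness, because `0` is a valid duplicate value.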

**Q: What if we need to find all duplicates?**  
A: Use a frequency map (Counter) or track duplicates in a separate set.
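The frequency-map variant can be sketched with `collections.Counter`. The function name `all_duplicates` is illustrative. The result follows first-occurrence order because `Counter` preserves insertion order in Python 3.7+.

```python
from collections import Counter
from typing import List


def all_duplicates(nums: List[int]) -> List[int]:
    """Return each value that occurs more than once, in first-seen order."""
    counts = Counter(nums)  # maps value -> number of occurrences
    return [value for value, count in counts.items() if count > 1]


print(all_duplicates([1, 1, 1, 3, 3, 4, 3, 2, 4, 2]))  # [1, 3, 4, 2]
print(all_duplicates([1, 2, 3, 4]))  # []
```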

## Related Problems

- LeetCode 219: Contains Duplicate II (Easy) - duplicates within k distance
- LeetCode 220: Contains Duplicate III (Hard) - value difference and index difference
- LeetCode 442: Find All Duplicates in an Array (Medium)
- LeetCode 136: Single Number (Easy) - find the element that appears once