# 26. Remove Duplicates from Sorted Array

## Topic Alignment
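- Mapping each value to the index where it is written makes the dedup invariant explicit: every distinct value occupies exactly one slot of the compacted prefix, in its original order.
- The same record of seen values applies to streaming pipelines that must keep one entry per unique key.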

## Metadata Summary
- **Source**: [LeetCode](https://leetcode.com/problems/remove-duplicates-from-sorted-array/)
- **Tags**: Array, Hash Table, Two Pointers
- **Difficulty**: Easy
- **Priority**: High

## Problem Statement
Given an integer array `nums` sorted in non-decreasing order, remove the duplicates in-place so that each unique element appears only once. The relative order of the elements should be kept the same. Because it is impossible to change the length of the array in place, you must place the result in the first part of the array `nums`. More formally, if there are `k` elements after removing the duplicates, then the first `k` elements of `nums` should hold the final result. Return `k`.

## Progressive Hints
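- **Hint 1**: A hash table can track whether each value has already been written, so no value is written twice.
- **Hint 2**: Because the array is sorted, the first occurrence of each value can be written straight into the prefix region.
- **Hint 3**: Maintain a write pointer and return it as the length, while keeping the dictionary for debugging and statistics.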

## Solution Overview
Use a dictionary to record whether a value has appeared; each new value is written to the next slot, yielding an in-place unique prefix while the hash map captures value-to-position mapping.

## Detailed Explanation

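1. Initialize the write pointer `write = 0` and a dictionary `seen` that records the position each value is written to.
2. Iterate over the array: if the current value is not in `seen`, record `seen[val] = write`, write it to `nums[write]`, then do `write += 1`.
3. Because the array is sorted, duplicates are adjacent, so the first occurrence of each value lands in its final position in the prefix and later repeats are skipped.
4. After the loop, return `write` as the number of unique elements. The first `write` positions hold the result; the values in the remaining positions are left as they are and do not matter.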
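To make the steps above easier to inspect, here is a minimal sketch that follows the same loop but also returns the `seen` dictionary. The name `remove_duplicates_with_trace` is a hypothetical helper introduced only for this illustration.

```python
from typing import Dict, List, Tuple


def remove_duplicates_with_trace(nums: List[int]) -> Tuple[int, Dict[int, int]]:
    """Hypothetical helper: same steps as above, but also returns `seen`."""
    seen: Dict[int, int] = {}  # value -> index it was written to
    write = 0                  # next free slot in the unique prefix
    for val in nums:
        if val not in seen:
            seen[val] = write  # step 2: remember where the value lands
            nums[write] = val  # step 2: write it into the prefix
            write += 1
    return write, seen


nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k, seen = remove_duplicates_with_trace(nums)
print(k, nums[:k], seen)
# prints: 5 [0, 1, 2, 3, 4] {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
```

For sorted input each value's recorded position equals its rank among the distinct values, which is what makes the map convenient as a sanity check on the prefix.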

## Complexity Trade-off Table
| Approach | Time | Space | Notes |
| --- | --- | --- | --- |
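| Dedup via set/dict, then write back | O(n) | O(n) | Simple logic, but uses more extra space. |
| Hash marking + in-place write | O(n) | O(n) | Keeps the hash map for diagnostics while producing a correct output prefix. |
| Pure two pointers | O(n) | O(1) | Optimal space, but loses the value-to-position map used for debugging. |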

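```python
from typing import List


def remove_duplicates(nums: List[int]) -> int:
    """Compact the unique values of sorted `nums` into its prefix; return their count."""
    if not nums:
        return 0
    seen: dict[int, int] = {}  # value -> index where it was written
    write = 0  # next free slot in the unique prefix
    for val in nums:
        if val in seen:
            continue  # duplicate: already written to the prefix
        seen[val] = write
        nums[write] = val
        write += 1
    return write


def run_tests() -> None:
    tests = [
        ([1, 1, 2], ([1, 2], 2)),
        ([0, 0, 1, 1, 1, 2, 2, 3, 3, 4], ([0, 1, 2, 3, 4], 5)),
        ([1], ([1], 1)),
        ([2, 2, 2], ([2], 1)),  # all duplicates
        ([-1, -1, 0, 0, 3], ([-1, 0, 3], 3)),  # negatives and zero
    ]
    for nums, (expected_prefix, expected_k) in tests:
        k = remove_duplicates(nums)
        assert k == expected_k
        assert nums[:k] == expected_prefix


if __name__ == "__main__":
    run_tests()
```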

## Complexity Analysis
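- One pass over the array, with an average O(1) dictionary lookup and at most one write per element => O(n) time.
- `seen` stores the position of each unique value => O(u) extra space, where u is the number of distinct elements.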
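For comparison, the O(u) extra space can be avoided entirely. The following is a minimal sketch of the pure two-pointer alternative, assuming sorted input; the function name `remove_duplicates_two_pointers` is introduced here for illustration only.

```python
from typing import List


def remove_duplicates_two_pointers(nums: List[int]) -> int:
    """O(1)-space variant: compare each value with the last one kept."""
    if not nums:
        return 0
    write = 1  # nums[0] is always part of the unique prefix
    for read in range(1, len(nums)):
        # Sorted input: a value is new exactly when it differs from
        # the most recently kept value at nums[write - 1].
        if nums[read] != nums[write - 1]:
            nums[write] = nums[read]
            write += 1
    return write
```

This version relies on duplicates being adjacent, so unlike the dictionary-based approach it would not deduplicate unsorted input correctly.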

## Edge Cases & Pitfalls
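- When every element is the same value, return 1 and keep the first element.
- A single-element array returns 1 directly.
- Negative numbers and zero need no special handling.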

## Follow-up Variants
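- How would you extend this to allow each value to appear at most m times?
- If the input arrives in chunks, how can the result be updated incrementally while keeping the hash map?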
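One possible way to approach both variants is sketched below. `remove_duplicates_at_most` and `ChunkedDeduper` are hypothetical names, and the chunked version assumes the deduplicated output is accumulated in a separate list rather than in place.

```python
from typing import Dict, Iterable, List


def remove_duplicates_at_most(nums: List[int], m: int) -> int:
    """Keep at most m copies of each value in a sorted list, in place."""
    if m <= 0:
        return 0
    write = 0
    for val in nums:
        # Sorted input: val is a surplus copy only if the element m slots
        # back in the kept prefix already equals val.
        if write < m or nums[write - m] != val:
            nums[write] = val
            write += 1
    return write


class ChunkedDeduper:
    """Hypothetical helper: keeps `seen` alive across chunks of a stream."""

    def __init__(self) -> None:
        self.seen: Dict[int, int] = {}  # value -> position in self.output
        self.output: List[int] = []     # assumption: output kept in its own list

    def feed(self, chunk: Iterable[int]) -> int:
        """Append unseen values from chunk; return the unique count so far."""
        for val in chunk:
            if val not in self.seen:
                self.seen[val] = len(self.output)
                self.output.append(val)
        return len(self.output)


nums = [1, 1, 1, 2, 2, 3]
k = remove_duplicates_at_most(nums, 2)
print(k, nums[:k])  # prints: 5 [1, 1, 2, 2, 3]

deduper = ChunkedDeduper()
deduper.feed([0, 0, 1])
print(deduper.feed([1, 2, 2]), deduper.output)  # prints: 3 [0, 1, 2]
```

With `m = 1` the first function reduces to the original problem. The chunked sketch trades the in-place guarantee for the ability to carry the hash map across chunk boundaries.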

## Takeaways
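- A hash map helps debug and verify write positions, especially in complex data pipelines.
- Even if the final solution uses two pointers, understanding the value-to-position mapping still helps when categorizing and summarizing related problems.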

## Similar Problems
| Problem ID | Problem Title | Technique |
| --- | --- | --- |
| 80 | Remove Duplicates from Sorted Array II | Value-count tracking |
| 27 | Remove Element | Selective overwrite |
| 283 | Move Zeroes | Stable compaction |