# 645. Set Mismatch

[Link to Problem](https://leetcode.com/problems/set-mismatch/)

### Description
You have a set of integers `s`, which originally contains all the numbers from `1` to `n`. Unfortunately, due to some error, one of the numbers in `s` got duplicated to another number in the set, which results in **repetition of one number** and **loss of another number**.

You are given an integer array `nums` representing the data status of this set after the error.

Find the number that occurs twice and the number that is missing and return **them in the form of an array**.

---
**Example 1:**
```
Input: nums = [1,2,2,4]
Output: [2,3]
```

**Example 2:**
```
Input: nums = [1,1]
Output: [1,2]
```

---
**Constraints:**
- `2 <= nums.length <= 10^4`
- `1 <= nums[i] <= 10^4`

My intuition: XOR, Sum

In [1]:
# XOR approach
from typing import List

class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        missing = 0
        for num in list(set(nums))+list(range(1, len(nums)+1)):
            missing ^= num
        
        nums.append(missing)
        dup = 0
        for num in list(nums)+list(range(1, len(nums))):
            dup ^= num
        return [dup,missing]
# Time: O(n)
# Space: O(n)

In [2]:
# SUM approach
from typing import List

class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        missing = (set(range(1, len(nums)+1)) - set(nums)).pop()
        dup = sum(nums)+missing-sum(range(1, len(nums)+1))
        return [dup,missing]
# Time: O(n)
# Space: O(n)

Here’s a **code review** of your Jupyter notebook solution to **Leetcode 645: Set Mismatch**, covering both the **XOR** and **SUM** approaches:

---

## ✅ General Feedback

* ✅ Clear structure and markdown documentation.
* ✅ Two distinct algorithmic approaches implemented (XOR and SUM).
* ✅ Correct use of Python built-ins like `set()`, `sum()`.
* ✅ Good testing section with assertions.
* ✅ Time and space complexity annotated.

---

## 🔍 Detailed Review

### 1. 🧠 XOR Approach

```python
for num in list(set(nums)) + list(range(1, len(nums)+1)):
    missing ^= num
```

* 🔧 **Clarity**: It’s not immediately obvious why XOR is used here unless the reader understands XOR properties.
* ⚠️ **Side-Effect Concern**: Appending `missing` to `nums` before the second loop mutates the caller's list. If `nums` is reused later, it will contain an extra element.
* 📦 **Space Complexity**: `O(n)`, because `list(set(nums))`, `list(range(...))`, and `list(nums)` each create a new list.
* ✅ Works correctly due to XOR canceling out equal elements.
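
To make the cancellation property concrete, here is a small sketch on Example 1. It uses `functools.reduce` and `operator.xor` purely for illustration; the variable names are not from the original notebook.

```python
from functools import reduce
from operator import xor

nums = [1, 2, 2, 4]  # Example 1: duplicate is 2, missing is 3
n = len(nums)

# XOR of the distinct values with 1..n: every value that is present
# appears twice and cancels, so only the missing number survives.
missing = reduce(xor, list(set(nums)) + list(range(1, n + 1)), 0)

# XOR of all values with 1..n: the duplicate appears three times in total
# and the missing number once, so the result is dup ^ missing.
dup_xor_missing = reduce(xor, nums + list(range(1, n + 1)), 0)

dup = dup_xor_missing ^ missing
print(dup, missing)  # 2 3
```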

### ✅ Suggested Refactor (Safer):


In [42]:
# Safer XOR variant: does not mutate nums
from typing import List

class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        n = len(nums)
        xored = 0
        for num in nums:
            xored ^= num
        for i in range(1, n + 1):
            xored ^= i

        # Now xored = dup ^ missing
        # Find the duplicate by tracking the values seen so far
        seen = set()
        for num in nums:
            if num in seen:
                dup = num
                break
            seen.add(num)
        missing = dup ^ xored
        return [dup, missing]

* ✅ Avoids mutation of original input.
* ✅ Still `O(n)` time: the duplicate is found with set lookups rather than a nested scan such as `nums.count`, which would be `O(n^2)`.

---

### 2. ➕ SUM Approach

```python
missing = (set(range(1, len(nums)+1)) - set(nums)).pop()
dup = sum(nums)+missing-sum(range(1, len(nums)+1))
```

* ✅ Simple and elegant.
* ⚠️ Slight inefficiency with double `sum()` calls and multiple `set()` constructions.
* ✅ `missing` computation via set difference is clear and intuitive.

### ✅ Suggested Refactor:


In [44]:
class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        n = len(nums)
        set_nums = set(nums)
        expected_sum = n * (n + 1) // 2
        missing = expected_sum - sum(set_nums)
        actual_sum = sum(nums)
        dup = actual_sum + missing - expected_sum
        return [dup, missing]
# Time: O(n)
# Space: O(n)

## 📊 Comparison Table

| Approach | Time Complexity | Space Complexity | Pros                     | Cons                           |
| -------- | --------------- | ---------------- | ------------------------ | ------------------------------ |
| XOR      | O(n)            | O(n)             | Clever, bit manipulation | Less readable, mutates input   |
| SUM      | O(n)            | O(n)             | Clean and intuitive      | Multiple `sum` and `set` calls |

---

## ✅ Final Verdict

Both solutions are **valid and performant**, but:

* Prefer the **SUM approach** for readability and maintainability.
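* If memory usage is a concern and you're okay with a bit more complexity, **XOR** can work (with input mutation avoided), but note that the original XOR version above still builds `O(n)` temporary lists, so it does not save memory as written.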
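
For the memory-sensitive case, one possible XOR variant that uses only `O(1)` extra space is sketched below. It is a different technique from the cells above: it splits the numbers by the lowest set bit of `dup ^ missing`. The function name `find_error_nums_xor` is hypothetical and not part of the notebook.

```python
from typing import List

def find_error_nums_xor(nums: List[int]) -> List[int]:
    # Step 1: XOR every value with every index 1..n.
    # Matching pairs cancel, leaving dup ^ missing.
    xored = 0
    for i, num in enumerate(nums, start=1):
        xored ^= i ^ num

    # Step 2: dup != missing, so at least one bit differs.
    # Isolate the lowest bit where they differ.
    low_bit = xored & -xored

    # Step 3: XOR only the numbers that have this bit set.
    # Exactly one of dup/missing falls in this group, and it survives.
    candidate = 0
    for i, num in enumerate(nums, start=1):
        if i & low_bit:
            candidate ^= i
        if num & low_bit:
            candidate ^= num

    # Step 4: a linear scan decides whether the survivor is the duplicate.
    other = candidate ^ xored
    if candidate in nums:  # O(n) scan, no extra memory
        return [candidate, other]
    return [other, candidate]
```

This keeps `O(n)` time with three passes over `nums` and leaves the input untouched, at the cost of being harder to read than the SUM approach.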

Would you like to explore an **O(1) space solution using negative marking** or **counting arrays** as well?

Great! Let's explore two alternatives for the **Leetcode 645: Set Mismatch** problem that both run in **O(n) time**: negative marking, which needs only **O(1) extra space** (excluding output space), and a counting array, which needs **O(n) space**:

---

## ✅ Approach 1: **Negative Marking (In-Place Detection)**

### 🔧 Idea:

* Iterate over each number `x` in `nums`.
* For each `x`, look at the element at index `abs(x) - 1`; if it is positive, negate it to mark `abs(x)` as seen.
* If it’s already negative, `abs(x)` has been seen before, so it is the **duplicate**.
* After the loop, exactly one element is still positive; its index plus 1 is the **missing number**.

### 💡 Code:

In [3]:
from typing import List

class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        dup = -1
        for x in nums:
            if nums[abs(x) - 1] < 0:
                dup = abs(x)
            else:
                nums[abs(x) - 1] *= -1

        # Missing number: the index + 1 of the only positive number
        for i, val in enumerate(nums):
            if val > 0:
                return [dup, i + 1]

### ✅ Pros:

* No extra space (O(1)), just modifies the input.
* Efficient: O(n) time.

### ⚠️ Cons:

* Mutates `nums`, so only safe if you **don’t need to reuse it** later.
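
If the caller does need `nums` afterwards, one option is to undo the marks during the second pass. The sketch below is a variation on the cell above, not the original code; the name `find_error_nums_restoring` is hypothetical. The list is still modified temporarily while the function runs.

```python
from typing import List

def find_error_nums_restoring(nums: List[int]) -> List[int]:
    dup = missing = -1

    # Pass 1: mark each seen value by negating the element at its index.
    for x in nums:
        idx = abs(x) - 1
        if nums[idx] < 0:
            dup = abs(x)          # already marked, so this value repeats
        else:
            nums[idx] = -nums[idx]

    # Pass 2: the one positive slot gives the missing number;
    # every negative slot is flipped back to restore the input.
    for i, val in enumerate(nums):
        if val > 0:
            missing = i + 1
        else:
            nums[i] = -val
    return [dup, missing]
```

A quick check that the input survives the call:

```python
data = [3, 2, 4, 2]
print(find_error_nums_restoring(data), data)  # [2, 1] [3, 2, 4, 2]
```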

---

## ✅ Approach 2: **Counting Array**

### 🔧 Idea:

* Use a counting array of size `n + 1` (all values lie in `1..n`, and `n <= 10^4`) to count frequencies.
* The number with count 2 is duplicate.
* The number with count 0 is missing.

### 💡 Code:

In [4]:
from typing import List

class Solution:
    def findErrorNums(self, nums: List[int]) -> List[int]:
        counts = [0] * (len(nums) + 1)
        for num in nums:
            counts[num] += 1

        for i in range(1, len(counts)):
            if counts[i] == 0:
                missing = i
            elif counts[i] == 2:
                dup = i
        return [dup, missing]

### ✅ Pros:

* Easy to understand.
* Still linear in time.

### ⚠️ Cons:

* Uses O(n) space for the counting array.

---

## 📊 Final Comparison Table

| Approach             | Time | Space | Mutates Input        | Notes                                |
| -------------------- | ---- | ----- | -------------------- | ------------------------------------ |
| **Negative Marking** | O(n) | O(1)  | ✅ Yes                | Optimal space, but destructive       |
| **Counting Array**   | O(n) | O(n)  | ❌ No                 | Simple, but more space               |
| **XOR**              | O(n) | O(n)  | ✅ Yes (your version) | Bitwise clever, needs careful design |
| **SUM**              | O(n) | O(n)  | ❌ No                 | Clean and intuitive                  |

---



In [45]:
# Test
assert Solution().findErrorNums([1,2,2,4]) == [2,3]
assert Solution().findErrorNums([1,1]) == [1,2]
assert Solution().findErrorNums([3,2,4,2]) == [2,1]
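
The three assertions above are hand-picked cases. A randomized check along the following lines could complement them. This is only a sketch: `make_case` and `check` are hypothetical helpers, not part of the notebook, and `solver` is assumed to be any callable with the same signature as `findErrorNums`.

```python
import random
from typing import Callable, List, Tuple

def make_case(n: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Build a shuffled 1..n array in which `missing` is overwritten by `dup`."""
    dup, missing = rng.sample(range(1, n + 1), 2)
    nums = [dup if v == missing else v for v in range(1, n + 1)]
    rng.shuffle(nums)
    return nums, [dup, missing]

def check(solver: Callable[[List[int]], List[int]],
          trials: int = 200, max_n: int = 50, seed: int = 0) -> None:
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        nums, expected = make_case(rng.randint(2, max_n), rng)
        # Pass a copy so that solvers which mutate their input
        # cannot affect the failure message.
        assert solver(list(nums)) == expected, (nums, expected)
```

In the notebook, this would be called as `check(Solution().findErrorNums)` after whichever cell defines the version under test.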