# 36. Valid Sudoku

## Topic Alignment
- **Role Relevance**: Validating Sudoku grids mirrors feature-quality checks where categorical encodings must obey exclusivity constraints.
- **Scenario**: Ensuring no duplicated features per bucket aligns with monitoring deduplication in data ingestion for recommendation systems.

## Metadata Summary
- Source: [LeetCode - Valid Sudoku](https://leetcode.com/problems/valid-sudoku/)
- Tags: `Array`, `Hash Table`, `Matrix`
- Difficulty: Medium
- Recommended Priority: High

## Problem Statement
Determine if a 9 x 9 Sudoku board is valid. Only the filled cells need to be validated according to the following rules:

1. Each row must contain the digits 1-9 without repetition.
2. Each column must contain the digits 1-9 without repetition.
3. Each of the nine 3 x 3 sub-boxes must contain the digits 1-9 without repetition.

The board may be partially filled, where empty cells are represented by the character `.`.

## Progressive Hints
- Hint 1: Start by enforcing the three validity rules separately to avoid mixing logic.
- Hint 2: Use hash sets to record seen digits per row, column, and sub-box.
- Hint 3: Encode sub-box identity using integer division to map cells into 3 x 3 buckets.
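
To make Hint 3 concrete, the short sketch below prints the sub-box index of every cell on a 9 x 9 board. The helper name `box_index` is introduced here only for illustration; it assumes boxes are numbered 0 to 8 in row-major order, which is what the formula `(r // 3) * 3 + (c // 3)` produces.

```python
def box_index(r: int, c: int) -> int:
    """Map cell (r, c) of a 9 x 9 board to its 3 x 3 sub-box, numbered 0-8 row-major."""
    return (r // 3) * 3 + (c // 3)


# Print the box index of every cell to visualize the nine buckets.
for r in range(9):
    print(" ".join(str(box_index(r, c)) for c in range(9)))
```

Every block of three consecutive rows shares the same `r // 3`, and every block of three consecutive columns shares the same `c // 3`, so each 3 x 3 region collapses to a single integer.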

## Solution Overview
Scan every cell once while tracking the digits seen per row, per column, and per 3 x 3 sub-box using hash sets. If a digit repeats in any tracking structure, the board is invalid; otherwise it is valid.

## Detailed Explanation
1. Prepare three arrays of nine sets each: one set per row, one per column, and one per sub-box, with sub-boxes indexed by `(r // 3) * 3 + (c // 3)`.
2. Iterate over each cell. Skip empty cells marked `.` to focus on filled values.
3. When encountering a digit, check whether it already exists in the row, column, or sub-box set.
4. If the digit is already present in any of the three sets, return `False` immediately.
5. Otherwise insert the digit into the corresponding row, column, and sub-box sets.
6. Finish scanning the grid. If no duplicates were encountered, return `True`.

## Complexity Trade-off Table
| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Sets per row/column/box | O(81) | O(81) | Straightforward and readable, leveraging Python sets. Constant for a fixed 9 x 9 board. |
| Bitmask encoding | O(81) | O(1) | Replaces each set with one 9-bit integer (27 integers total). Memory efficient but requires careful bit manipulation. |

```python
from typing import List


def isValidSudoku(board: List[List[str]]) -> bool:
    """Return True if the partially filled Sudoku board is valid."""
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]

    for r in range(9):
        for c in range(9):
            value = board[r][c]
            if value == '.':
                continue  # Ignore empty cells.
            box_index = (r // 3) * 3 + (c // 3)  # Map to 3x3 sub-box.
            if value in rows[r] or value in cols[c] or value in boxes[box_index]:
                return False  # Duplicate detected in row, column, or box.
            rows[r].add(value)
            cols[c].add(value)
            boxes[box_index].add(value)
    return True
```


## Complexity Analysis
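- Time Complexity: `O(81)`, which is effectively `O(1)` because the board size is fixed; the algorithm inspects each cell in the 9 x 9 grid exactly once.
- Space Complexity: `O(81)` for storing seen digits across rows, columns, and sub-boxes. In the worst case each filled cell adds one entry to its row set, its column set, and its sub-box set, which is still constant for a fixed board.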
- Bottleneck: Hash set insertions and lookups dominate, but they are constant-time on average.
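
For comparison, the bitmask row of the trade-off table could be sketched as follows. This is a minimal illustration, not a tuned implementation: the function name `is_valid_sudoku_bitmask` is hypothetical, and it assumes every filled cell holds a single character from `'1'` to `'9'`. Each row, column, and sub-box is represented by one integer whose bit `d - 1` is set once digit `d` has been seen.

```python
from typing import List


def is_valid_sudoku_bitmask(board: List[List[str]]) -> bool:
    """Validate a 9 x 9 board using one integer bitmask per row, column, and box."""
    rows = [0] * 9
    cols = [0] * 9
    boxes = [0] * 9

    for r in range(9):
        for c in range(9):
            value = board[r][c]
            if value == '.':
                continue  # Ignore empty cells.
            bit = 1 << (int(value) - 1)  # Digit d occupies bit d - 1.
            b = (r // 3) * 3 + (c // 3)
            # A set bit in any of the three masks means the digit was already seen.
            if (rows[r] | cols[c] | boxes[b]) & bit:
                return False
            rows[r] |= bit
            cols[c] |= bit
            boxes[b] |= bit
    return True
```

The control flow matches the set-based version; only the membership test and the insertion change, from set operations to bitwise AND and OR.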

## Edge Cases & Pitfalls
- Ensure sub-box indices are computed correctly; off-by-one errors are common.
- Skip placeholder cells to avoid treating `.` as a digit.
- Validate only what is present; the input may be unsolvable yet still valid.
- Remember that the same digit may appear in many rows, columns, and boxes, as long as it is unique within each individual row, column, and box.
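
The pitfalls above can be exercised with a few small boards. The sketch below uses a compact variant of the validator that stores tagged tuples in a single set; the names `is_valid_sudoku_compact` and `blank_board` are hypothetical helpers introduced for this example, and the boards are made up to isolate one edge case each.

```python
from typing import List


def is_valid_sudoku_compact(board: List[List[str]]) -> bool:
    """Compact variant: one set of tagged keys instead of separate sets."""
    seen = set()
    for r in range(9):
        for c in range(9):
            value = board[r][c]
            if value == '.':
                continue  # Placeholders never count as digits.
            keys = [
                ('row', r, value),
                ('col', c, value),
                ('box', r // 3, c // 3, value),
            ]
            if any(key in seen for key in keys):
                return False
            seen.update(keys)
    return True


def blank_board() -> List[List[str]]:
    """Return a 9 x 9 board in which every cell is empty."""
    return [['.'] * 9 for _ in range(9)]


# Same digit in three different rows, columns, and boxes: still valid.
spread = blank_board()
spread[0][0] = spread[4][4] = spread[8][8] = '5'

# Same digit twice in one box, but in different rows and columns: invalid.
same_box = blank_board()
same_box[0][0] = same_box[1][1] = '5'

# Valid but unsolvable: row 0 needs a 9 at (0, 8), yet column 8 already holds one.
unsolvable = blank_board()
unsolvable[0][:8] = list('12345678')
unsolvable[4][8] = '9'

for name, board in [('spread', spread), ('same_box', same_box), ('unsolvable', unsolvable)]:
    print(name, is_valid_sudoku_compact(board))
```

The `same_box` board is the case that a row-and-column-only check would miss, which is why the sub-box index deserves its own test.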

## Follow-up Variants
- Extend to validate boards of size `n^2 x n^2` with dynamic box sizes.
- Use bitmasks to decrease memory footprint when integrating into performance-critical services.
- Combine with a backtracking solver to produce valid completions for data augmentation.
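
As a rough sketch of the first variant, the validator can be parameterized by the box width `n`, so the board has `n * n` rows and columns. The representation here is an assumption made for illustration: cells are integers, `0` marks an empty cell, and the function name `is_valid_generalized` is hypothetical. Range checking of cell values is deliberately left out to keep the sketch parallel to the 9 x 9 solution.

```python
from typing import List


def is_valid_generalized(board: List[List[int]], n: int) -> bool:
    """Validate an n^2 x n^2 board; 0 denotes an empty cell (assumed encoding)."""
    size = n * n
    rows = [set() for _ in range(size)]
    cols = [set() for _ in range(size)]
    boxes = [set() for _ in range(size)]

    for r in range(size):
        for c in range(size):
            value = board[r][c]
            if value == 0:
                continue  # Ignore empty cells.
            b = (r // n) * n + (c // n)  # Box index generalizes from 3 to n.
            if value in rows[r] or value in cols[c] or value in boxes[b]:
                return False
            rows[r].add(value)
            cols[c].add(value)
            boxes[b].add(value)
    return True


# A 4 x 4 board (n = 2) with a fully filled top-left box.
small = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 2],
]
print(is_valid_generalized(small, 2))
```

With `n = 3` the box formula reduces to the `(r // 3) * 3 + (c // 3)` mapping used in the main solution.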

## Takeaways
- Hash sets provide a clean way to enforce uniqueness constraints on structured data.
- Decomposing grid validation into row, column, and sub-box checks keeps logic modular.
- Early exits improve latency when invalid configurations are common in data streams.

## Similar Problems
| Problem ID | Problem Title | Technique |
| --- | --- | --- |
| 37 | Sudoku Solver | Backtracking with constraint tracking |
| 73 | Set Matrix Zeroes | Grid scanning with bookkeeping |
| 266 | Palindrome Permutation | Frequency counting via hash map |