Context
NTHOCCURRENCE currently uses SUMPRODUCT((INDEX(col, SEQUENCE(i)) = INDEX(col, i)) * 1) inside a MAP over SEQUENCE(n). This works on both ranges and in-memory arrays, but it doesn't hit Excel's native C++-level COUNTIF fast path.
Proposed change
Swap the inner aggregation to:
COUNTIF(INDEX(col, 1):INDEX(col, i), INDEX(col, i))
When col is a real worksheet range, INDEX(a):INDEX(b) yields a range reference, and COUNTIF over a range reference is one of Excel's most optimised aggregations. Expected material speedup on large ranges.
Caveats / testing required
- Must verify behaviour is identical on in-memory array inputs (e.g. when
NTHOCCURRENCE is chained after another LAMBDA that returns an array, not a range). The INDEX:INDEX trick degrades to array semantics there and may lose the fast path — but the result must still be correct.
- Benchmark against current implementation on:
- Small arrays (n ≤ 50) — expect no meaningful difference.
- Medium arrays (n ≈ 500) — expect noticeable speedup on range inputs.
- Large arrays (n ≈ 5000) — expect significant speedup on range inputs.
- Confirm existing test cases in the
.lambda file still pass.
- Also drop the redundant
* 1 that remains even under current logic if we keep a SUMPRODUCT fallback.
Do not merge without
Measured benchmark numbers (before/after) on both range and array inputs.
Context
NTHOCCURRENCEcurrently usesSUMPRODUCT((INDEX(col, SEQUENCE(i)) = INDEX(col, i)) * 1)inside aMAPoverSEQUENCE(n). This works on both ranges and in-memory arrays, but it doesn't hit Excel's native C++-levelCOUNTIFfast path.Proposed change
Swap the inner aggregation to:
When
colis a real worksheet range,INDEX(a):INDEX(b)yields a range reference, andCOUNTIFover a range reference is one of Excel's most optimised aggregations. Expected material speedup on large ranges.Caveats / testing required
NTHOCCURRENCEis chained after another LAMBDA that returns an array, not a range). TheINDEX:INDEXtrick degrades to array semantics there and may lose the fast path — but the result must still be correct..lambdafile still pass.* 1that remains even under current logic if we keep a SUMPRODUCT fallback.Do not merge without
Measured benchmark numbers (before/after) on both range and array inputs.