Skip to content

Unbounded: core/src/exchanges/kalshi/fetcher.ts — O(n²) array.concat() with MAX_PAGES=1000 #343

@realfishsam

Description

@realfishsam

Location

core/src/exchanges/kalshi/fetcher.ts:427 (also :465)

Code

const MAX_PAGES = 1000;
const BATCH_SIZE = 200;

// fetchActiveEvents (line 408)
do {
    const events = data.events || [];
    allEvents = allEvents.concat(events);  // ← creates new array every iteration
    // ...
} while (cursor && page < MAX_PAGES);

// fetchAllWithStatus (line 465) — identical pattern
allEvents = allEvents.concat(events);

Growth Pattern

Array.prototype.concat() allocates a brand-new array containing all previously accumulated elements plus the new page. It does not mutate in place. At page k, the new array contains k × BATCH_SIZE element references, all newly allocated slots. The previous array becomes garbage, but until GC collects it both arrays coexist in memory.

Total pointer slots allocated across all iterations: BATCH_SIZE × (1 + 2 + 3 + … + N) = BATCH_SIZE × N(N+1)/2

With BATCH_SIZE=200 and MAX_PAGES=1000:

  • Total pointer copies: 200 × (1000 × 1001 / 2) = 100,100,000 pointer-sized allocations
  • At 8 bytes/pointer: ~801 MB in intermediate array allocations

Even with generational GC, the largest live array grows to 200,000 pointers (1.6 MB) while the second-largest is still alive, creating a sustained dual-allocation peak.

OOM Estimate

  • Worst case (1000 pages, all full): ~800 MB of intermediate arrays in flight during a single fetchMarkets() call
  • Realistic Kalshi (50 pages × 200 events): ~200*(50*51/2) = 255,000 pointers ≈ 2 MB peak overhead — manageable, but GC pressure spikes on every call
  • If fetchAllWithStatus('settled') runs concurrently with fetchActiveEvents(), both do the same pattern simultaneously → peak memory doubles

Suggested Fix

Replace concat with push spread: allEvents.push(...events). Pre-allocating with const allEvents: KalshiRawEvent[] = [] and using in-place push eliminates all intermediate array allocations. For very large pages, use Array.prototype.push.apply(allEvents, events) to avoid spreading a large array onto the call stack.


Found by automated unbounded operations audit

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions