bug: benchmark scripts crash entirely when one engine/model fails #408

@carlos-alm

Description

Found during dogfooding v3.1.2

Severity: Medium
Command: node scripts/benchmark.js, node scripts/embedding-benchmark.js

Reproduction

node scripts/benchmark.js
# Segfaults (exit 139) during 3rd WASM 1-file rebuild iteration
# Native benchmark results are never collected

node scripts/embedding-benchmark.js
# Crashes (exit 132, illegal instruction) on nomic-v1.5 model loading
# Results from completed models (minilm, jina-small, jina-base, nomic) are lost
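The two exit codes above follow the common shell convention of reporting a signal death as 128 plus the signal number, so 139 corresponds to signal 11 and 132 to signal 4. A minimal sketch for decoding them, assuming a POSIX-like platform; `signalFromExitCode` is a hypothetical helper name, not something in the repository:

```javascript
const os = require('node:os');

// Hypothetical helper: maps a shell-style exit code (128 + signal number)
// back to a signal name using Node's own signal table.
// Returns null when the code does not correspond to a known signal.
function signalFromExitCode(code) {
  if (code <= 128) return null;
  const number = code - 128;
  const entry = Object.entries(os.constants.signals)
    .find(([, value]) => value === number);
  return entry ? entry[0] : null;
}

console.log(signalFromExitCode(139)); // SIGSEGV on Linux
console.log(signalFromExitCode(132)); // SIGILL on Linux
```

Both are signals delivered by the operating system rather than JavaScript exceptions, which matters for how the failure can be contained.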

Expected behavior

  • Each engine (WASM/native) should be wrapped in try/catch in benchmark.js so a crash in one doesn't prevent the other from running
  • Each model should be wrapped in try/catch in embedding-benchmark.js so one model's failure doesn't lose all results
  • Partial results should be reported with the failed engine/model marked as "crashed"
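A rough sketch of what the per-engine or per-model isolation could look like. `runEach` is a hypothetical helper, and `benchmarkFn` stands in for `benchmarkEngine()` or `benchmarkModel()`, whose real signatures are assumed here. Note that this only contains thrown errors and rejected promises; it cannot intercept a signal such as SIGSEGV or SIGILL.

```javascript
// Hypothetical helper: runs benchmarkFn once per name and keeps going when
// one run throws or rejects, so earlier results are never discarded.
async function runEach(names, benchmarkFn) {
  const results = {};
  for (const name of names) {
    try {
      results[name] = { status: 'ok', result: await benchmarkFn(name) };
    } catch (err) {
      // Mark the failed engine/model as "crashed" and continue.
      const message = err instanceof Error ? err.message : String(err);
      results[name] = { status: 'crashed', error: message };
    }
  }
  return results;
}
```

Under these assumptions, something like `await runEach(['wasm', 'native'], benchmarkEngine)` would return an entry for every engine, with the failed one carrying `status: 'crashed'`.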

Actual behavior

  • benchmark.js line 182: await benchmarkEngine('wasm') has no error handling, so the WASM segfault kills the process before the native engine runs
  • embedding-benchmark.js: the model loop has no try/catch, so the illegal instruction on nomic-v1.5 discards the results of all 4 previously completed models

Root cause

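Missing isolation around the engine and model benchmark loops. Neither loop has try/catch, and both run in the main process, so a segfault (SIGSEGV, exit 139) or an illegal instruction (SIGILL, exit 132) in one engine or model terminates the whole run.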

Suggested fix

  1. In benchmark.js, wrap each benchmarkEngine() call in try/catch, store null for the crashed engine, and continue
  2. In embedding-benchmark.js, wrap each benchmarkModel() call in try/catch, record the error in the results JSON, and continue to the next model
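  3. Consider running each engine/model in a child process (fork()) to isolate segfaults and illegal-instruction crashes, which try/catch cannot catch. Both crashes in the reproduction are of this kind, so steps 1 and 2 alone would not have preserved the partial results.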
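Step 3 could be sketched roughly as follows. This version uses `spawnSync` rather than `fork()` to keep the control flow synchronous and short; `fork()` with an IPC channel would follow the same idea. `runIsolated` and the worker script named in the usage comment are hypothetical, and the sketch assumes the child prints its result as a single JSON document on stdout.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: runs one benchmark in its own Node process so a
// SIGSEGV or SIGILL in that process cannot take down the parent.
function runIsolated(nodeArgs) {
  const child = spawnSync(process.execPath, nodeArgs, { encoding: 'utf8' });

  if (child.error) {
    // The process could not be started or its output could not be collected.
    return { status: 'crashed', reason: child.error.message };
  }
  if (child.signal) {
    // Killed by a signal, e.g. SIGSEGV or SIGILL.
    return { status: 'crashed', reason: `killed by ${child.signal}` };
  }
  if (child.status !== 0) {
    return { status: 'crashed', reason: `exit code ${child.status}` };
  }
  try {
    return { status: 'ok', result: JSON.parse(child.stdout) };
  } catch (err) {
    return { status: 'crashed', reason: `unparseable output: ${err.message}` };
  }
}

// Hypothetical usage: one child per engine, partial results are kept.
// const results = {};
// for (const engine of ['wasm', 'native']) {
//   results[engine] = runIsolated(['scripts/benchmark-worker.js', engine]);
// }
```

The trade-off is an extra process start per engine or model, which is small relative to a benchmark run, and timings measured inside the child are unaffected by the parent.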

Labels: bug, dogfood
