Bug: readRecentCandidates() loads entire file into memory causing OOM crash #210

@zhongliangyu

Problem

The readRecentCandidates() function in src/gep/assetStore.js reads the entire candidates.jsonl file into memory, even though it only needs the last N lines:

function readRecentCandidates(limit = 20) {
  const raw = fs.readFileSync(p, 'utf8');  // Reads ENTIRE file!
  const lines = raw.split('\n').map(l => l.trim()).filter(Boolean);
  return lines.slice(Math.max(0, lines.length - limit))...
}

Impact

When candidates.jsonl grows large (my file reached 83MB), this causes:

  • Memory spike during each evolution cycle
  • Potential OOM crash when combined with other memory usage
  • Process restarted or killed by the suicide check (EVOLVER_MAX_RSS_MB)

Reproduction

  1. Run evolver for an extended period (weeks/months)
  2. candidates.jsonl accumulates to tens of MB
  3. Observe memory usage spike or process crash
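
To reproduce without waiting weeks, the steps above can be approximated with a synthetic file. This is only a sketch: the helper names `writeSyntheticCandidates` and `fullReadStats` and the record shape are made up for illustration and are not part of evolver, and the RSS delta is a rough indicator that varies with GC timing.

```javascript
const fs = require('fs');

// Hypothetical helper: write `count` fake JSONL records to `filePath`.
// The record shape ({ id, payload }) is invented for this sketch.
function writeSyntheticCandidates(filePath, count) {
  const fd = fs.openSync(filePath, 'w');
  try {
    for (let i = 0; i < count; i++) {
      fs.writeSync(fd, JSON.stringify({ id: i, payload: 'x'.repeat(200) }) + '\n');
    }
  } finally {
    fs.closeSync(fd);
  }
}

// Hypothetical helper: repeat the full-file read pattern from the issue and
// report how many lines were materialized versus how many were needed.
function fullReadStats(filePath, limit = 20) {
  const rssBefore = process.memoryUsage().rss;
  const raw = fs.readFileSync(filePath, 'utf8'); // whole file becomes one string
  const lines = raw.split('\n').map(l => l.trim()).filter(Boolean);
  const rssAfter = process.memoryUsage().rss;
  return {
    fileBytes: fs.statSync(filePath).size,
    linesHeld: lines.length,                      // everything kept in memory
    linesNeeded: Math.min(limit, lines.length),   // what the caller wanted
    rssDeltaMB: (rssAfter - rssBefore) / (1024 * 1024), // noisy, indicative only
  };
}
```

Writing a few hundred thousand records produces a file in the tens of MB, and `fullReadStats` then shows every line being held in memory in order to return the last 20.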

Suggested Fix

Use a streaming or tail-based approach that reads only the last N lines:

function readRecentCandidates(limit = 20) {
  const { execFileSync } = require('child_process');
  try {
    // Pass arguments as an array so the path is never interpreted by a shell.
    const output = execFileSync('tail', ['-n', String(limit), p], { encoding: 'utf8', timeout: 5000 });
    return output.split('\n').filter(Boolean).map(l => {
      try { return JSON.parse(l); } catch { return null; }  // skip malformed lines
    }).filter(Boolean);
  } catch { return []; }  // missing file, no tail binary, or timeout
}

Or implement a proper backwards file reader for Windows compatibility.
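
A minimal sketch of such a backwards reader, using only Node's `fs` module so it does not depend on a `tail` binary. The names `readLastLines` and `readRecentCandidatesFrom`, the explicit `filePath` parameter, and the 64 KB default chunk size are assumptions for illustration, not existing evolver code. It reads fixed-size chunks from the end of the file until it has collected `limit` complete non-empty lines.

```javascript
const fs = require('fs');

// Hypothetical helper: return up to `limit` trailing non-empty lines of
// `filePath`, reading `chunkSize`-byte chunks backwards from the end.
function readLastLines(filePath, limit = 20, chunkSize = 64 * 1024) {
  if (limit <= 0) return [];
  const fd = fs.openSync(filePath, 'r');
  try {
    let pos = fs.fstatSync(fd).size; // start of the region read so far
    let tail = Buffer.alloc(0);      // bytes from `pos` to end of file
    let lines = [];
    while (true) {
      const pieces = tail.toString('utf8').split('\n');
      // Unless we reached the start of the file, the first piece may be a
      // partial line (or a split multi-byte character), so discard it.
      if (pos > 0) pieces.shift();
      lines = pieces.map(l => l.trim()).filter(Boolean);
      if (lines.length >= limit || pos === 0) break;
      const len = Math.min(chunkSize, pos);
      pos -= len;
      const chunk = Buffer.alloc(len);
      fs.readSync(fd, chunk, 0, len, pos);
      tail = Buffer.concat([chunk, tail]);
    }
    return lines.slice(-limit);
  } finally {
    fs.closeSync(fd);
  }
}

// Hypothetical wrapper: parse the trailing lines as JSON, skipping any
// malformed line, and return [] if the file is missing or unreadable.
function readRecentCandidatesFrom(filePath, limit = 20) {
  let lines;
  try {
    lines = readLastLines(filePath, limit);
  } catch {
    return [];
  }
  return lines.flatMap(l => {
    try { return [JSON.parse(l)]; } catch { return []; }
  });
}
```

Memory use is bounded by the size of the last `limit` lines plus one chunk, regardless of how large the file has grown. The sketch re-decodes the accumulated tail on every iteration, which is fine for a small `limit` but would be worth optimizing if very long lines are expected.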

Environment

  • evolver version: v1.27.5
  • Node.js: v20.x
  • OS: Linux
  • candidates.jsonl size: 83MB (before cleanup)

Thanks!
