
Add manifest cache to prevent memory explosion on large builds#319

Open
jornmineur wants to merge 5 commits into 11ty:main from jornmineur:no-buffer-in-memcache

Conversation

@jornmineur jornmineur commented Jan 22, 2026

PR Summary

This PR addresses issue #302.

An earlier PR which attempted to solve the problem in a different way turned out to be a dead end, hence this new PR.

Problem summary

When processing large numbers of images, memory usage explodes because:

  1. Image buffers are stored in memCache and never released
  2. The entire source image is read into memory just to compute the cache hash, even when output files already exist

Solution

Two-part fix:

1. Persistent manifest cache (src/manifest-cache.js)

  • Stores image metadata in .cache/eleventy-img-manifest.json
  • Automatically strips buffers before persisting
  • Verifies output files still exist before returning cached stats

2. No buffer loading for production local files (in queue())

  • For local files in production mode, passes file path directly to Sharp instead of reading into a buffer
  • Source image buffer is never created, so there's nothing to leak

When the optimization applies

The new code path is used when ALL of these are true:

  • Source is a local file path (not a Buffer or remote URL)
  • Not in dryRun mode
  • Not in statsOnly mode
  • Not using transformOnRequest
  • Not an SVG file

Otherwise, falls back to existing behavior.
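A minimal sketch of that eligibility check (illustrative only; the PR's actual `#canSkipBuffer` is a private class method and may differ in detail — for instance, later commits also exclude `urlFormat`):

```javascript
// Illustrative eligibility check for the no-buffer fast path.
// Function and option names mirror the description above, not exact source.
function canSkipBuffer(src, options = {}) {
  let isLocalPath = typeof src === "string" && !/^https?:\/\//.test(src);
  return (
    isLocalPath &&                        // not a Buffer or remote URL
    !options.dryRun &&                    // files must actually be written
    !options.statsOnly &&
    !options.transformOnRequest &&
    !src.toLowerCase().endsWith(".svg")   // SVGs keep the buffer path
  );
}
```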

Files changed

  • New: src/manifest-cache.js — Persistent cache implementation
  • Modified: src/image.js:
    • Added ManifestCache import and singleton instance
    • Modified queue() to check manifest first, skip buffer loading
    • Modified getHash() to not store file contents in #contents
    • Added #canSkipBuffer(), #outputFilesExist(), hasLoadedBuffer helper methods
  • Modified: test/test.js — Added test to verify buffers aren't loaded

Manifest structure

{
  "./src/images/hero.jpg::abc123hash": {
    "mtime": 1705123456789,
    "size": 245678,
    "stats": {
      "avif": [{"format": "avif", "width": 400, "url": "...", ...}],
      "jpeg": [{"format": "jpeg", "width": 400, "url": "...", ...}]
    }
  }
}

Key is sourcePath::optionsHash. Buffers are stripped before storing.
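A hedged sketch of the persist step, assuming the manifest structure shown above (helper names here are illustrative, not the PR's exact code):

```javascript
// Build the manifest key from source path and options hash.
function manifestKey(sourcePath, optionsHash) {
  return `${sourcePath}::${optionsHash}`;
}

// Drop `buffer` from every output entry before writing JSON to disk,
// keeping all other metadata (format, width, url, ...).
function stripBuffers(stats) {
  let clean = {};
  for (let [format, entries] of Object.entries(stats)) {
    clean[format] = entries.map(({ buffer, ...meta }) => meta);
  }
  return clean;
}
```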

Graceful degradation

  • If manifest is missing/corrupted: rebuilds automatically
  • If output files deleted but manifest exists: detects missing files and reprocesses
  • If image content changes: mtime/size mismatch triggers reprocessing

Performance impact

  • First build: Similar to before (slightly faster due to no buffer caching overhead)
  • Subsequent builds: Near-instant for unchanged images (just statSync + manifest lookup)
  • Memory: Constant regardless of image count (no buffers retained)

- Replace mtime+size check with content hash for reliable CI cache restoration
- Add urlFormat to #canSkipBuffer() exclusions
- Style cleanup: use let instead of const per project conventions
@jornmineur
Author

Conversation: #315

Test sequence number was missing.
@Aankhen

Aankhen commented Jan 25, 2026

Is there anything I need to do to enable the creation of the manifest with this branch? I replaced my v5 dependency, but the only behaviour I see is the increased memory usage I mentioned elsewhere, even on repeated builds, and I don’t see an eleventy-img-manifest.json file anywhere.

…iles – should be removed as a condition in #canSkipBuffer.
@jornmineur
Author

jornmineur commented Jan 25, 2026

Is there anything I need to do to enable the creation of the manifest with this branch?

Not at all, you encountered a bug that I had missed! 😄

Based on what you describe, the manifest process was completely bypassed. The most likely reason is that your build config uses the default output directory, which of course is perfectly valid. The problem is that the code was checking for the presence of an explicitly defined outputDir config. That actually made no sense at all, so I just removed the condition. (*)

If you're indeed using the default output directory (in other words, haven't defined a specific/custom output directory), then the latest commit should solve the problem.

Thanks Aankhen, your input is super helpful!


(*) The idea for the outputDir check was a backstop to make sure we only apply the optimization when writing files to disk.
The existing conditions (!dryRun, !statsOnly, !transformOnRequest, !urlFormat) are highly specific, which feels slightly brittle – against the spirit of the principle of least knowledge.

Turns out it didn't help at all, and even backfired when the default outputDir wasn't explicitly set in user options.

@Aankhen

Aankhen commented Jan 25, 2026

I’m glad to hear my blog’s idiosyncratic build process is helping. I owe you my thanks: all this experimentation helped me finally find the source of the runaway memory usage. I (re-)discovered that, for historical reasons, I wasn’t passing file paths but file contents to Image(…). Once I changed that, memory usage plummeted.

Now I could compare your branch to v6.0.4 in a slightly more useful manner. I ran each of them (in dev mode so that the server would remain) in separate Docker containers, recording Eleventy’s reported times and using a handy Python script to plot memory usage via Gnuplot. First, here’s a cold build (no existing generated images or manifest):

v6’s memory usage climbed inexorably towards 12 GB (presumably because of sharp’s allocations since Node isn’t exhausting its heap), and the build itself also took 1,360 seconds. In comparison, your branch took 1,197 seconds and memory usage leveled out at 2 GB.

I next let it build a couple of times with the manifest then measured the third run (83 seconds with your branch, 90 with v6):

Neither version used nearly as much memory as before, but v6 still used around 300+ MB more than your version. Finally, I started a Bash terminal for each and let it build, serve, settle, and exit (Ctrl+C) thrice:

You can see that, in each instance, your branch finished the build more quickly (the difference is around the same as above) and used 30–60% less memory.

All of which is to say: this is great work and I think these benchmarks clearly demonstrate the improvement! Thanks for putting the PR together.

@jornmineur
Author

These benchmarks are so valuable and insightful! Great to see the improvements.

On my end I'm seeing even more dramatic gains, but that's likely because the site has many category pages sharing the same images — lots of manifest cache hits.

Really appreciate your help with this! 🙏

@Aankhen

Aankhen commented Jan 25, 2026

Thank you for your work. And yes, that makes sense about reusing the images in your case.

By the way, I apologize for the misspellings in the legends! I’ve updated the images so they have correctly-labeled lines.

@zeroby0
Contributor

zeroby0 commented Jan 28, 2026

The results are amazing! And @Aankhen's test methodology is also really nice!

I'm just asking out of curiosity: when an image is processed and saved to disk, it's stored as an entry in the manifest cache, right? And the next time that image is processed, a manifest-cache lookup is made and the already-processed file is returned.

But since we have isCached in exists-cache & disk-cache to check whether a file exists, can we rely on that instead to see if a processed image already exists? That way we rely on a single source of truth (the filesystem) rather than maintaining a separate tracking file.

@jornmineur jornmineur marked this pull request as draft January 29, 2026 12:33
@jornmineur
Author

jornmineur commented Jan 29, 2026

The latest commit now uses isCached.

This gives us a single source of truth, with proper cache counter tracking as a bonus.

I tested from a cold start and from a full build; both worked fine, with no noticeable difference in build time, memory usage, or CPU load.
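In spirit, the check reduces to something like this sketch (`isCached` is the existing helper from exists-cache/disk-cache; its exact signature is assumed here, not taken from the source):

```javascript
// Single source of truth: trust the filesystem-backed cache, not a
// separately maintained tracking structure. `existsCache` is assumed to
// expose an isCached(path) -> boolean method, per the discussion above.
function outputsAlreadyOnDisk(outputPaths, existsCache) {
  // an empty output list means nothing was generated yet: reprocess
  return outputPaths.length > 0 && outputPaths.every((p) => existsCache.isCached(p));
}
```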

@stefan-burke

Many thanks for fixing this @jornmineur. I've got a big Eleventy site with 1,200 images and 13,000 pages, and your fix halves its build time (with no memory explosion causing swap writes).

@jornmineur jornmineur marked this pull request as ready for review January 29, 2026 16:00