Skip to content

🐛 Fixed broken oembeds on sites with many past embeds#27926

Merged
allouis merged 1 commit into
mainfrom
oembed-favicon-content-hash
May 18, 2026
Merged

🐛 Fixed broken oembeds on sites with many past embeds#27926
allouis merged 1 commit into
mainfrom
oembed-favicon-content-hash

Conversation

@allouis
Copy link
Copy Markdown
Collaborator

@allouis allouis commented May 16, 2026

ref https://linear.app/tryghost/issue/ONC-1727

Root cause

processImageFromUrl saves bookmark icons under images/icon/<basename>. On sites with many entries sharing a basename, getUniqueFileName walks the accumulated <basename>-N tail one S3 HEAD per iteration to find a free slot — linear in the accumulation, easily 60-90s per bookmark on busy publishers.

The walk was masked before 7c3ca72 (6.37.0) by a malformed-key bug in S3Storage.exists() that made every uniqueness check return false; once the key was fixed, the walk started actually running.

Fix

Content-addressing the filename removes the need for the uniqueness check entirely: identical bytes land at the same key, different bytes land at different keys, no walk in either case. It's also the right model for what these files are — a cache of an external resource, where the basename carries no user intent.

Scope

Only processImageFromUrl in the oembed service. Other image-saving paths (post uploads, media uploads, site logo, importer, etc.) pass relative targetDir already and aren't affected by the walk.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 16, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 94b16c3f-d21c-441c-bd4e-4399987a2a1d

📥 Commits

Reviewing files that changed from the base of the PR and between e290571 and 523d884.

📒 Files selected for processing (2)
  • ghost/core/core/server/services/oembed/oembed-service.js
  • ghost/core/test/unit/server/services/oembed/oembed-service.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • ghost/core/core/server/services/oembed/oembed-service.js
  • ghost/core/test/unit/server/services/oembed/oembed-service.test.js

Walkthrough

This PR replaces the oEmbed service's per-write unique filename generation with content-addressed storage. The processImageFromUrl function now computes a SHA-256 hash of the downloaded image buffer, appends the truncated hash to the sanitized filename, and uses that deterministic path to save the image via store.saveRaw. The implementation removes the dependency on the storage adapter's generateUnique method. Unit tests are updated to verify hash-based filename format, confirm deduplication behavior across repeated bookmarks, validate that different image bytes produce different filenames, and ensure the old unique-name generation path is no longer invoked.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title mentions fixing broken oembeds on sites with many past embeds, which directly relates to the main change of implementing content-hashed filenames to eliminate expensive S3 uniqueness checking.
Description check ✅ Passed The description clearly explains the root cause (S3 HEAD walk performance issue), the fix (content-addressed filenames), and the scope of changes, all related to the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch oembed-favicon-content-hash

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@allouis allouis force-pushed the oembed-favicon-content-hash branch 2 times, most recently from 5859bf2 to e290571 Compare May 16, 2026 11:12
@allouis allouis changed the title Stored bookmark icons under content-hashed filenames 🐛 Fixed broken oembeds on sites with many past embeds May 16, 2026
ref https://linear.app/tryghost/issue/ONC-1727

processImageFromUrl saves bookmark icons under `images/icon/<basename>`.
On sites with many entries sharing a basename, getUniqueFileName walks
the accumulated `<basename>-N` tail one S3 HEAD per iteration to find a
free slot — linear in the accumulation, easily 60-90s per bookmark on
busy publishers.

The walk was masked before 7c3ca72 (6.37.0) by a malformed-key bug in
S3Storage.exists() that made every uniqueness check return false; once
the key was fixed, the walk started actually running.

Content-addressing the filename removes the need for the uniqueness
check entirely: identical bytes land at the same key, different bytes
land at different keys, no walk in either case. It's also the right
model for what these files are — a cache of an external resource,
where the basename carries no user intent.
@allouis allouis force-pushed the oembed-favicon-content-hash branch from e290571 to 523d884 Compare May 18, 2026 09:49
@allouis allouis merged commit 2241565 into main May 18, 2026
76 of 85 checks passed
@allouis allouis deleted the oembed-favicon-content-hash branch May 18, 2026 10:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants