https://hl.oyasumi.dev/@ntek/019dbe77-60f6-7171-86d9-8fb369ec501d
Opening a ticket regarding this matter.
That said, we have not yet been able to investigate the issue thoroughly (for example, we do not know how to reproduce it), so below is a summary of the problem that Opus 4.7 generated from an analysis of the logs.
Summary
When persistSharingPost runs (either directly from a valid Announce
or from a retry caused by the (actor_id, sharing_id) unique-constraint
violation), it calls persistPost on the original object, and
persistPost unconditionally walks the remote replies collection and
recursively re-invokes itself on each reply — which in turn walks that
reply's replies collection. Under normal load this already produces a
lot of traffic; when multiple such activities are processed concurrently
it produces bursty, parallel fetches of the same remote URL at
millisecond intervals, severe enough to be rejected with HTTP 429 by the
origin instance.
Evidence (single offending URL hit three times in ~80 ms)
07:46:33.505 DBG Fetched document: 200 'https://fedibird.com/users/***/statuses/<status id>/replies?only_other_accounts=true&page=true'
(x-ratelimit-remaining: 0)
07:46:33.537 DBG Fetching document: 'GET' <same URL>
07:46:33.557 DBG fedify·sig·http: Failed to verify with draft-cavage-http-signatures-12 (429); retrying with rfc9421...
07:46:33.583 DBG Fetching document: 'GET' <same URL> (double-knock retry)
07:46:33.628 ERR Failed to fetch document: 429 <same URL>
07:46:33.629 ERR fedify·vocab: Failed to fetch '<same URL>': FetchError: HTTP 429
Stack trace at the 429 failure
at getRemoteDocument (.../@fedify/fedify/.../docloader-*.js)
at load (.../@fedify/fedify/.../authdocloader-*.js)
at CollectionPage.#fetchNext (.../@fedify/fedify/.../actor-*.js)
at CollectionPage.getNext (.../@fedify/fedify/.../actor-*.js)
at traverseCollection (.../@fedify/fedify/dist/vocab/mod.js)
at iterateCollection (/app/src/federation/collection.ts:19)
at persistPost (/app/src/federation/post.ts:453)
at persistSharingPost (/app/src/federation/post.ts:516)
Code path
persistPost (src/federation/post.ts) fetches the replies
collection once up-front:
const replies = await object.getReplies(options);
and later walks the entire collection, recursively persisting each
reply — which itself calls getReplies() on that reply:
if (replies != null) {
for await (const item of iterateCollection(replies, { ...options, suppressError: true })) {
if (!isPost(item)) continue;
await persistPost(db, item, baseUrl, { ...options, skipUpdate: true, replyTarget: post });
// ^^^^^^^^^^^ every recursion refetches that reply's `replies` too
}
}
persistSharingPost (same file) calls persistPost(originalObject, …)
on every invocation. It currently deduplicates only by the Announce
activity IRI, so when the same (actor, sharing) pair is announced
again (e.g. re-reblog or duplicated delivery), the insert fails against
the posts_actor_id_sharing_id_unique constraint, Fedify retries the
activity, and each retry re-enters this code path.
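The difference between deduplicating by activity IRI and by the (actor, sharing) pair can be illustrated with a small in-memory model. This is only a sketch, not Hollo's code: ToyPostsTable, ShareRow, persistShareByIri and persistShareByPair are hypothetical names, and it assumes the second Announce arrives with a different activity IRI than the first.

```typescript
// Hypothetical row shape standing in for a share stored in `posts`.
interface ShareRow {
  iri: string;
  actorId: string;
  sharingId: string;
}

// Toy table that enforces a unique (actorId, sharingId) pair,
// modelling posts_actor_id_sharing_id_unique.
class ToyPostsTable {
  private rows: ShareRow[] = [];

  findByIri(iri: string): ShareRow | undefined {
    return this.rows.find((r) => r.iri === iri);
  }

  findByPair(actorId: string, sharingId: string): ShareRow | undefined {
    return this.rows.find(
      (r) => r.actorId === actorId && r.sharingId === sharingId,
    );
  }

  insert(row: ShareRow): void {
    if (this.findByPair(row.actorId, row.sharingId) !== undefined) {
      throw new Error("unique violation: posts_actor_id_sharing_id_unique");
    }
    this.rows.push(row);
  }
}

// Dedup by IRI only: a second Announce with a new IRI for the same
// pair passes the check, reaches insert(), and throws.
function persistShareByIri(table: ToyPostsTable, row: ShareRow): string {
  if (table.findByIri(row.iri) !== undefined) return "existing";
  table.insert(row);
  return "inserted";
}

// Dedup by IRI or by pair: the duplicate is detected before insert().
function persistShareByPair(table: ToyPostsTable, row: ShareRow): string {
  if (
    table.findByIri(row.iri) !== undefined ||
    table.findByPair(row.actorId, row.sharingId) !== undefined
  ) {
    return "existing";
  }
  table.insert(row);
  return "inserted";
}
```

In this model the throw in persistShareByIri corresponds to the failure that makes Fedify retry the activity. A check-then-insert like persistShareByPair can still race when two handlers run concurrently, so the database constraint remains the final guard.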
Impact
- Concurrent inbox handling causes the same origin URL to be refetched
within tens of milliseconds.
- A single post with many replies produces a cascade: fetch root
replies → N recursive persistPost calls → N further getReplies()
fetches, etc.
- Observed outcome: 429 from the origin, further replies silently
dropped.
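To make the cascade concrete, here is a toy model that counts replies-collection fetches for a small thread. It is a sketch under simplifying assumptions: ToyPost and countRepliesFetches are hypothetical, each post costs exactly one fetch (pagination is ignored), and concurrent Announces are assumed not to share any results.

```typescript
// Hypothetical in-memory thread: each post lists its direct replies.
interface ToyPost {
  id: string;
  replies: ToyPost[];
}

// Mirrors the recursion described above: one replies fetch for the
// post itself, then a recursive visit of every reply.
function countRepliesFetches(post: ToyPost): number {
  let fetches = 1; // stands in for getReplies() on this post
  for (const reply of post.replies) {
    fetches += countRepliesFetches(reply);
  }
  return fetches;
}

function leaf(id: string): ToyPost {
  return { id, replies: [] };
}

// A root post with three replies, two of which have replies of their own.
const toyThread: ToyPost = {
  id: "root",
  replies: [
    { id: "a", replies: [leaf("a1"), leaf("a2")] },
    { id: "b", replies: [leaf("b1")] },
    leaf("c"),
  ],
};

// 7 posts in the thread, so 7 fetches for one pass over it.
const fetchesPerAnnounce = countRepliesFetches(toyThread);

// Three Announces of the same post handled at once repeat the whole walk.
const concurrentAnnounces = 3;
const totalFetches = fetchesPerAnnounce * concurrentAnnounces; // 21
```

The count grows with the total number of posts in the thread, not just the number of direct replies, and every concurrent or retried activity multiplies it again.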
Suggested fixes
- Make persistSharingPost idempotent on (actor_id, sharing_id)
before insert, so duplicate Announces do not trigger retries and
re-fetches at all:
const existingShare = await db.query.posts.findFirst({
with: {
account: { with: { owner: true } },
sharing: { with: { account: { with: { owner: true } } } },
},
where: and(
eq(posts.accountId, account.id),
eq(posts.sharingId, originalPost.id),
),
});
if (existingShare != null) return existingShare;
(or use onConflictDoNothing().returning() + lookup.)
- Reconsider whether persistPost must eagerly traverse the remote
replies collection on every call — in particular, whether it
should do so at all from persistSharingPost, and whether the
recursive descent per-reply is necessary. Skipping it, bounding it
(max depth / max items), or making it lazy would both reduce
baseline load and eliminate the worst-case burst observed above.
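One possible shape for the bounded variant is sketched below. It is not taken from Hollo or Fedify: collectRepliesBounded, FetchReplies and TraversalLimits are hypothetical names, and fetchReplies stands in for whatever call returns a post's direct replies.

```typescript
interface RemotePost {
  id: string;
}

// Hypothetical fetcher: resolves to the direct replies of a post.
type FetchReplies = (postId: string) => Promise<RemotePost[]>;

interface TraversalLimits {
  maxDepth: number; // 1 = direct replies only
  maxItems: number; // hard cap on collected replies
}

// Breadth-first walk that stops at maxDepth levels or maxItems replies,
// whichever comes first, and never visits the same post twice.
async function collectRepliesBounded(
  rootId: string,
  fetchReplies: FetchReplies,
  limits: TraversalLimits,
): Promise<string[]> {
  const seen = new Set<string>([rootId]);
  const collected: string[] = [];
  let frontier: string[] = [rootId];

  for (let depth = 0; depth < limits.maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      if (collected.length >= limits.maxItems) return collected;
      const replies = await fetchReplies(id); // one remote fetch per post
      for (const reply of replies) {
        if (seen.has(reply.id)) continue;
        if (collected.length >= limits.maxItems) return collected;
        seen.add(reply.id);
        collected.push(reply.id);
        next.push(reply.id);
      }
    }
    frontier = next;
  }
  return collected;
}
```

With maxDepth set to 1 this issues a single fetch for the root post's replies, so the number of remote requests per call has a fixed upper bound regardless of thread size.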
Environment
- Hollo 0.7.11
- PostgreSQL 18
- Affected code: src/federation/post.ts (persistSharingPost,
persistPost); src/federation/collection.ts (iterateCollection)