fix(ipni): preserve failure reason in deal-flow rethrow#490
Merged
Conversation
The rethrow at the deal-flow IPNI verification site swallowed ipniResult.failedCIDs[0].reason, so the outer ipni_tracking_failed log only showed "root CID not verified" regardless of why verification actually failed (timeout vs missing expected provider vs HTTP fetch error vs parse error). Pass the underlying reason through so SPs and operators can self-diagnose IPNI breaches without filing tickets. Closes #473
3 tasks
Contributor
There was a problem hiding this comment.
Pull request overview
Improves observability for IPNI deal-flow failures by preserving and rethrowing the underlying IPNI verification failure reason so downstream/outer logs (e.g., ipni_tracking_failed) surface actionable error details instead of the generic “root CID not verified”.
Changes:
- Include
ipniResult.failedCIDs[0]?.reasonin the thrown error message whenrootCIDVerifiedis false (with a fallback to the prior generic message).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Carry the underlying failure reason via the native Error.cause field
rather than concatenating it into the message. toStructuredError
already recurses through cause (MAX_CAUSE_DEPTH=5), so the structured
log will surface { error.message: "...root CID not verified",
error.cause.message: "IPNI verification timed out after 60000ms" }.
Cleaner separation: outer message describes the failure class,
cause carries the diagnostic detail.
SgtPooki
added a commit
that referenced
this pull request
Apr 28, 2026
…lureReason Companion to #490, which now carries the actual failure detail via Error.cause rather than concatenating it into the outer message. Promote cause.message to the top-level failureReason field when present, falling back to error.message when no cause is attached (e.g. for non-rethrown errors thrown directly inside the catch block). Result: { failureReason: "IPNI verification timed out after 60000ms" } instead of { failureReason: "IPNI verification failed for deal X: root CID not verified" }.
4 tasks
silent-cipher
approved these changes
Apr 28, 2026
SgtPooki
added a commit
that referenced
this pull request
May 4, 2026
…failed (#491) * feat(ipni): expose failureReason as top-level field on ipni_tracking_failed toStructuredError(error) only captures error.message inside a nested error.message field, which forces BS users to JSON-extract through two layers to filter or aggregate by failure type. Promote the underlying message to a top-level failureReason field so operators and SPs can group ipni_tracking_failed events by reason class without parsing the stack-bearing error blob. Companion to the fix in #490. * refactor(ipni): prefer error.cause.message over error.message for failureReason Companion to #490, which now carries the actual failure detail via Error.cause rather than concatenating it into the outer message. Promote cause.message to the top-level failureReason field when present, falling back to error.message when no cause is attached (e.g. for non-rethrown errors thrown directly inside the catch block). Result: { failureReason: "IPNI verification timed out after 60000ms" } instead of { failureReason: "IPNI verification failed for deal X: root CID not verified" }.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ipniResult.failedCIDs[0].reasonthrough to the rethrow atapps/backend/src/deal-addons/strategies/ipni.strategy.ts:248so the outeripni_tracking_failedlog carries the actual reason instead ofroot CID not verifiedipni_root_cid_verification_failedalready populates afailureReasonfield correctly. The outer log loses it during rethrow.Why
beck-8 reported that the generic message makes SP-side troubleshooting impossible. After this PR, the same failed deal will surface "IPNI verification timed out after 60000ms" / "Missing expected provider(s): ..." / "Failed to query IPNI indexer: ..." at the catch site instead of "root CID not verified".
See dealbot#473 audit comment: #473 (comment) for the full investigation including evidence that all visible IPNI failures are real SLO breaches with reasons that are already captured by the verify code path but lost at this single rethrow.
Test plan
pnpm exec vitest run src/deal-addons/strategies/ipni.strategy.spec.ts-> 11/11 pass (no test asserts on the old message string)IpniStatus.FAILED, still recordsfailure.timedoutmetric. Only the error message changes.ipni_tracking_failedevents in BS now showfailureReasoncontent matchingipni_root_cid_verification_failedfor the same deal.