Skip to content

M2-M01: fix(clearnode): alert on home channel challenge instead of auto-checkpoint#710

Merged
philanton merged 1 commit into
fix/audit-findings-r2from
fix/clearnode-m2-m01
Apr 27, 2026
Merged

M2-M01: fix(clearnode): alert on home channel challenge instead of auto-checkpoint#710
philanton merged 1 commit into
fix/audit-findings-r2from
fix/clearnode-m2-m01

Conversation

@philanton
Copy link
Copy Markdown
Contributor

@philanton philanton commented Apr 27, 2026

Summary

  • Audit finding M2-M01: HandleHomeChannelChallenged always scheduled a ScheduleCheckpoint for the latest signed state when its version exceeded the challenged version, but the Checkpoint worker only handles OPERATE, DEPOSIT, and WITHDRAW intents. Latest signed states with CLOSE, escrow initiate/finalize, or migration intents were retried as checkpoints and failed silently, leaving channels exposed to stale settlement at challenge expiry.
  • Per the audit recommendation, automatic challenge response is postponed. The handler now emits a Warn log with channel id, user wallet, blockchain id, asset, challenged state version, and challenge expiry. Operators handle the alert manually before expiry.
  • Updated unit tests: replaced the success path with TestHandleHomeChannelChallenged_LogsOnly (asserts no UpdateChannel / GetLastStateByChannelID / ScheduleCheckpoint / RefreshUserEnforcedBalance calls), plus new _ChannelNotFound and _TypeMismatch cases.
  • No external documentation referenced the prior auto-checkpoint behaviour; the handler doc comment is updated in place to record the rationale.

HandleEscrowDepositChallenged and HandleEscrowWithdrawalChallenged are untouched — their scheduled actions (ScheduleFinalizeEscrowDeposit / ScheduleFinalizeEscrowWithdrawal) are dedicated paths and out of scope for this finding.

Test plan

  • go test ./clearnode/event_handlers/...
  • go vet ./clearnode/event_handlers/...
  • go build ./clearnode/...

Summary by CodeRabbit

  • Bug Fixes

    • Modified channel challenge handling to eliminate automatic state mutations and resolution scheduling. The system now logs challenge events without performing automated updates or follow-up actions.
  • Tests

    • Updated test coverage to verify log-only behavior for channel challenges and added edge-case validation for missing or mismatched channel scenarios.

…to-checkpoint

The previous handler always scheduled a checkpoint of the latest signed state
when its version was greater than the challenged version. The Checkpoint
worker action only supports OPERATE, DEPOSIT and WITHDRAW intents, so a
latest signed state with a CLOSE, escrow initiate/finalize or migration
intent would be retried as a checkpoint and ultimately fail without
submitting the dedicated on-chain method, leaving the channel exposed to
stale settlement at challenge expiry.

Disable the automatic response: log a warning carrying the channel id,
user wallet, blockchain, asset, challenged version and challenge expiry
so operators are alerted and can submit the appropriate on-chain action
manually before the challenge window closes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 27, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4c6c5a50-5c64-4d70-9dde-55eac99df959

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/clearnode-m2-m01

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
clearnode/event_handlers/service.go (1)

117-148: ⚠️ Potential issue | 🟠 Major

Visibility regression: HandleHomeChannelChallenged no longer persists challenged status or expiry, creating inconsistency with escrow handlers and dead code in HandleHomeChannelCheckpointed.

The audit concern was specific to ScheduleCheckpoint silently failing for non-checkpointable intents (CLOSE / escrow init/finalize / migration). However, this implementation also removed the channel.Status = ChannelStatusChallenged and channel.ChallengeExpiresAt updates, which were not part of the audit scope and introduce several issues:

  • API/UI queries will report a challenged home channel as Open, hiding the on-chain challenge from end users.
  • HandleHomeChannelCheckpointed (lines 100–102) resets Challenged → Open, but nothing now sets the status to Challenged for home channels, rendering this logic dead code.
  • HandleEscrowDepositChallenged (lines 251–255) and HandleEscrowWithdrawalChallenged (lines 367–371) both persist Status=Challenged and ChallengeExpiresAt, so home and escrow challenge tracking is now inconsistent.

Status and expiry persistence poses no "impossible transaction" risk, unlike the scheduling issue. Consider keeping the status/expiry updates and only removing the ScheduleCheckpoint call and related operations.

Requires clarification: Confirm with the audit/product team whether dropping status and expiry persistence for home channels was intentional, or whether only the auto-resolution logic was meant to be removed. If downstream surfaces don't depend on these fields, this can be safely deferred.

🧹 Nitpick comments (2)
clearnode/event_handlers/service.go (1)

139-146: Recommend pairing the warning with a metric/alert hook.

Operator response to a challenged home channel is now strictly time-bounded by ChallengeExpiry. A Warn log line alone is fragile — log pipelines can drop, get filtered, or be sampled, and there’s no rate signal. Consider emitting a counter (e.g. home_channel_challenged_total{blockchain_id, asset}) or a dedicated alert hook so on-call can wire up reliable alerting independent of log scraping.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clearnode/event_handlers/service.go` around lines 139 - 146, The Warn log for
a challenged home channel (logger.Warn in service.go) should be paired with a
metric/alert emission so operators get a reliable signal; add/increment a
Prometheus counter (e.g. home_channel_challenged_total) or call the existing
alert hook when you hit the logger.Warn branch, including labels blockchain_id
(channel.BlockchainID) and asset (channel.Asset) and optionally chanID or
challengedStateVersion as additional labels; ensure the counter is
defined/registered at package init (or use the project's central metrics
registry) and increment it in the same block where logger.Warn is called so
every warning also produces a metric/alertable event.
clearnode/event_handlers/service_test.go (1)

144-192: Optional: mirror LogsOnly's defensive AssertNotCalled checks in the two short‑circuit tests.

TestHandleHomeChannelChallenged_ChannelNotFound and TestHandleHomeChannelChallenged_TypeMismatch rely on the unset mock expectations to implicitly reject mutating calls. Adding explicit AssertNotCalled for UpdateChannel / GetLastStateByChannelID / ScheduleCheckpoint / RefreshUserEnforcedBalance would make the intent of these short-circuit branches as obvious as in LogsOnly, and protect against a future regression that, say, accidentally calls UpdateChannel before the type guard.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clearnode/event_handlers/service_test.go` around lines 144 - 192, Add
explicit negative assertions in TestHandleHomeChannelChallenged_ChannelNotFound
and TestHandleHomeChannelChallenged_TypeMismatch to mirror LogsOnly: after
calling service.HandleHomeChannelChallenged, call mockStore.AssertNotCalled for
the mutating mock methods UpdateChannel, GetLastStateByChannelID,
ScheduleCheckpoint, and RefreshUserEnforcedBalance to ensure those methods are
not invoked; place these AssertNotCalled checks alongside the existing
mockStore.AssertExpectations(t) so the short‑circuit branches are explicitly
verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@clearnode/event_handlers/service_test.go`:
- Around line 144-192: Add explicit negative assertions in
TestHandleHomeChannelChallenged_ChannelNotFound and
TestHandleHomeChannelChallenged_TypeMismatch to mirror LogsOnly: after calling
service.HandleHomeChannelChallenged, call mockStore.AssertNotCalled for the
mutating mock methods UpdateChannel, GetLastStateByChannelID,
ScheduleCheckpoint, and RefreshUserEnforcedBalance to ensure those methods are
not invoked; place these AssertNotCalled checks alongside the existing
mockStore.AssertExpectations(t) so the short‑circuit branches are explicitly
verified.

In `@clearnode/event_handlers/service.go`:
- Around line 139-146: The Warn log for a challenged home channel (logger.Warn
in service.go) should be paired with a metric/alert emission so operators get a
reliable signal; add/increment a Prometheus counter (e.g.
home_channel_challenged_total) or call the existing alert hook when you hit the
logger.Warn branch, including labels blockchain_id (channel.BlockchainID) and
asset (channel.Asset) and optionally chanID or challengedStateVersion as
additional labels; ensure the counter is defined/registered at package init (or
use the project's central metrics registry) and increment it in the same block
where logger.Warn is called so every warning also produces a metric/alertable
event.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b11f24fa-b387-4af2-b2eb-e116b959a4c2

📥 Commits

Reviewing files that changed from the base of the PR and between 6a6c125 and 7443147.

📒 Files selected for processing (2)
  • clearnode/event_handlers/service.go
  • clearnode/event_handlers/service_test.go

@philanton philanton merged commit 3874587 into fix/audit-findings-r2 Apr 27, 2026
3 checks passed
@philanton philanton deleted the fix/clearnode-m2-m01 branch April 27, 2026 10:31
nksazonov added a commit that referenced this pull request Apr 28, 2026
M2-I02: fix(clearnode/api): normalize participant addresses consistently across endpoints (#712)
M2-L01: docs(protocol): clarify session-key authorization scope and metadata extensibility (#711)
M2-M01: fix(clearnode): alert on home channel challenge instead of auto-checkpoint (#710)
M2-H01: fix(contracts/ChannelHub): add finalize escrow home chain intent check (#704)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants