[#2288] Fix _handle_get_state arity + live GossipLayer regression test #2312

Open
maitoyamada09 wants to merge 1 commit into Scottcjn:main from maitoyamada09:fix/2288-get-state-arity

Conversation

@maitoyamada09

Fixes #2288

Summary

  • _handle_get_state was calling _signed_content with the old 3-arg shape (msg_type, sender_id, payload), but since #2272 ([SECURITY][MEDIUM] msg_id and ttl fields not covered by signature: replay under fresh msg_id) the helper requires 5 (msg_type, sender_id, msg_id, ttl, payload). Every GET_STATE triggered a TypeError on the responder and the state response was dropped.
  • Fix: generate a deterministic msg_id (sha256 over msg_type:sender_id:payload:time, 24 hex chars, the same pattern as create_message), use ttl=0 for the STATE response, call _signed_content with the 5-arg shape, and echo msg_id/ttl back in the response dict so the requester can reconstruct the exact signed content (AC #2).
  • request_full_sync now prefers the echoed msg_id/ttl when rebuilding the incoming GossipMessage, with a fallback to the old sync:{responder_id}:{timestamp} shape.
  • Scope kept narrow (_handle_get_state + immediate caller), per the bounty's scoping note.
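
For reviewers skimming the diff, the responder-side change can be sketched roughly as below. This is a minimal illustration, not the code in rustchain_p2p_gossip.py: `make_msg_id`, `signed_content`, and `build_state_response` are hypothetical stand-ins, and the colon-joined field layout, the JSON payload serialization, and the use of HMAC-SHA256 are assumptions made for the example.

```python
import hashlib
import hmac
import json
import time


def make_msg_id(msg_type, sender_id, payload, now):
    # Hypothetical helper: sha256 over msg_type:sender_id:payload:time,
    # truncated to 24 hex characters.
    raw = f"{msg_type}:{sender_id}:{json.dumps(payload, sort_keys=True)}:{now}"
    return hashlib.sha256(raw.encode()).hexdigest()[:24]


def signed_content(msg_type, sender_id, msg_id, ttl, payload):
    # Hypothetical stand-in for the 5-arg _signed_content; the real byte
    # layout may differ.
    body = json.dumps(payload, sort_keys=True)
    return f"{msg_type}:{sender_id}:{msg_id}:{ttl}:{body}"


def build_state_response(sender_id, state, secret, now=None):
    now = time.time() if now is None else now
    timestamp = int(now)
    msg_id = make_msg_id("STATE", sender_id, state, now)
    ttl = 0  # STATE is a direct reply, not a forwarded message
    message = f"{signed_content('STATE', sender_id, msg_id, ttl, state)}:{timestamp}"
    # Assumption: HMAC-SHA256 with a shared secret.
    signature = hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()
    # msg_id and ttl are echoed so the requester can rebuild the signed content.
    return {
        "sender_id": sender_id,
        "state": state,
        "msg_id": msg_id,
        "ttl": ttl,
        "timestamp": timestamp,
        "signature": signature,
    }
```

The point of echoing `msg_id` and `ttl` is that both are inputs to the signature; without them the requester has no way to reproduce the signed bytes.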

Test plan

New file: node/tests/test_p2p_get_state_arity_2288.py — 4 tests exercising two live GossipLayer instances per AC #3, no mocks. Loader mirrors test_p2p_hardening_phase2.py / test_p2p_phase_f_ed25519.py.

  • test_handle_get_state_does_not_raise: AC #1. Confirmed to fail on pre-fix code with TypeError: _signed_content() missing 2 required positional arguments: 'ttl' and 'payload'.
  • test_state_response_includes_msg_id_and_ttl: AC #2.
  • test_state_response_signature_verifies_end_to_end: AC #3. Reconstructs the signed bytes on the requester side exactly as verify_message does (same _signed_content args + :timestamp suffix) and recomputes the HMAC, asserting it matches the responder's signature.
  • test_state_response_tamper_fails_verification: negative control. A post-sign payload flip must not produce the original HMAC. Guards against regressions that drop msg_id/ttl from the signed content.

All 4 tests pass on this branch.
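
The byte-level check behind the last two tests can be sketched as follows. This is an illustration under stated assumptions rather than the test file itself: `SECRET`, `signed_bytes`, `sign`, and `verifies` are hypothetical names, and the field layout and HMAC-SHA256 choice are assumed, not taken from the codebase.

```python
import hashlib
import hmac
import json

SECRET = b"shared-test-secret"  # hypothetical shared key for the example


def signed_bytes(msg_type, sender_id, msg_id, ttl, payload, timestamp):
    # Assumed layout: 5-field signed content plus a ":timestamp" suffix.
    body = json.dumps(payload, sort_keys=True)
    return f"{msg_type}:{sender_id}:{msg_id}:{ttl}:{body}:{timestamp}".encode()


def sign(secret, *fields):
    return hmac.new(secret, signed_bytes(*fields), hashlib.sha256).hexdigest()


def verifies(secret, signature, *fields):
    # Constant-time comparison of the recomputed and received signatures.
    return hmac.compare_digest(sign(secret, *fields), signature)


state = {"height": 42}
sig = sign(SECRET, "STATE", "node-a", "abc123", 0, state, 1700000000)

# Requester rebuilds the same bytes from the echoed msg_id/ttl.
ok = verifies(SECRET, sig, "STATE", "node-a", "abc123", 0, state, 1700000000)
# Negative controls: a flipped payload or a different msg_id must not verify.
tampered = verifies(SECRET, sig, "STATE", "node-a", "abc123", 0, {"height": 43}, 1700000000)
wrong_id = verifies(SECRET, sig, "STATE", "node-a", "zzz999", 0, state, 1700000000)
```

The `wrong_id` case is the one that would start passing again if a regression dropped `msg_id` from the signed content.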

Why the end-to-end test checks HMAC bytes rather than calling verify_message directly

There is a pre-existing, unrelated bug on main in verify_message (rustchain_p2p_gossip.py:483) — it unpacks p2p_identity.unpack_signature(...) (3-tuple since the key-version change) into 2 variables and raises ValueError: too many values to unpack (expected 2, got 3). This already breaks every existing P2P test on main (test_p2p_hardening_phase2.py, test_p2p_phase_f_ed25519.py, etc.), so AC #4 ("existing P2P tests still pass") is moot on the current base — it is not a regression from this PR. Flagging it here as a heads-up; happy to open a separate issue / fix PR. Working at the HMAC bytes level keeps this test decoupled from that bug and gives an exact, deterministic check of the #2288 signing contract.
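
The failure mode described above is ordinary tuple-unpacking arity. A minimal reproduction, assuming a hypothetical `unpack_signature` that returns three fields (the field names and the `version:scheme:sig` packing are invented for illustration and are not the real p2p_identity format):

```python
def unpack_signature(packed):
    # Hypothetical stand-in: returns a 3-tuple after the key-version change.
    key_version, scheme, sig = packed.split(":", 2)
    return key_version, scheme, sig


def verify_old(packed):
    # Pre-fix call-site shape: two targets for a three-element tuple.
    scheme, sig = unpack_signature(packed)  # raises ValueError
    return scheme, sig


def verify_fixed(packed):
    # Accept all three fields, ignoring the one this caller does not need.
    _key_version, scheme, sig = unpack_signature(packed)
    return scheme, sig


try:
    verify_old("1:hmac:deadbeef")
    old_error = None
except ValueError as exc:
    old_error = str(exc)

fixed = verify_fixed("1:hmac:deadbeef")
```

Because the exception is raised before any signature comparison happens, every caller of the old shape fails regardless of whether the signature is valid.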

Bounty claim

  • GitHub: maitoyamada09
  • RTC wallet: maitoyamada09

🤖 Generated with Claude Code

…on test

Fixes Scottcjn#2288

The `_handle_get_state` handler was calling `_signed_content` with only 3
positional args (`msg_type`, `sender_id`, `payload`), but since the Phase
B signing change (Scottcjn#2272) that method requires 5 (`msg_type`, `sender_id`,
`msg_id`, `ttl`, `payload`). Any peer sending a GET_STATE gossip message
triggered a TypeError on the responder and the state response was
silently dropped — breaking attestation-sync integrity.

Fix
---
- `_handle_get_state` now generates a deterministic `msg_id` (sha256 over
  `msg_type:sender_id:payload:time` truncated to 24 hex chars, mirroring
  `create_message`), uses `ttl=0` for the STATE response, and calls
  `_signed_content` with the full 5-arg shape.
- The response dict now includes `msg_id` and `ttl` so the requester can
  rebuild the exact signed content and verify the signature (AC Scottcjn#2).
- `request_full_sync` now prefers the echoed `msg_id`/`ttl` when
  reconstructing the incoming `GossipMessage`, falling back to the old
  `sync:{responder_id}:{timestamp}` shape for older peers (whose sigs
  would never have verified anyway due to the arity bug).

Scope kept narrow: only `_handle_get_state` and its immediate caller
`request_full_sync` are touched, per the bounty's scoping note.
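
A rough sketch of the requester-side preference order, for illustration only. `resolve_msg_fields` is a hypothetical helper, not a function in the codebase, and defaulting `ttl` to 0 for older peers is an assumption:

```python
def resolve_msg_fields(response: dict, responder_id: str) -> tuple[str, int]:
    """Pick the msg_id/ttl used to rebuild the incoming message."""
    msg_id = response.get("msg_id")
    if msg_id is None:
        # Older peer: fall back to the legacy synthetic id shape.
        msg_id = f"sync:{responder_id}:{response.get('timestamp')}"
    # Assumption: a missing ttl is treated as 0.
    ttl = int(response.get("ttl", 0))
    return msg_id, ttl


# New-style response echoes both fields.
new_style = resolve_msg_fields({"msg_id": "abc123", "ttl": 0, "timestamp": 5}, "node-b")
# Old-style response has neither, so the legacy shape is used.
old_style = resolve_msg_fields({"timestamp": 5}, "node-b")
```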

Regression test — `node/tests/test_p2p_get_state_arity_2288.py`
----------------------------------------------------------------
Four tests exercised against two live `GossipLayer` instances (per AC
Scottcjn#3, no mocks):

  1. `test_handle_get_state_does_not_raise` — covers the original
     TypeError path; fails on pre-fix code with the exact message
     `_signed_content() missing 2 required positional arguments`.
  2. `test_state_response_includes_msg_id_and_ttl` — AC Scottcjn#2.
  3. `test_state_response_signature_verifies_end_to_end` — AC Scottcjn#3.
     Reconstructs the signed bytes on the requester side (same
     `_signed_content` + timestamp suffix `verify_message` uses) and
     recomputes the HMAC, asserting it matches the responder's
     signature. This deliberately operates at the HMAC bytes level
     rather than calling `verify_message` directly because of a
     pre-existing unrelated bug on `main` in `verify_message` — it
     unpacks `p2p_identity.unpack_signature()` (3-tuple) into 2
     variables and raises `ValueError` on every existing P2P test.
     Mentioned here as a heads-up; out of scope for Scottcjn#2288.
  4. `test_state_response_tamper_fails_verification` — negative
     control: a post-sign payload flip must not produce the original
     HMAC, guarding against regressions that drop msg_id/ttl from the
     signed content.

Loader pattern (`importlib.util` + tempfile sqlite) mirrors the
existing `test_p2p_hardening_phase2.py` / `test_p2p_phase_f_ed25519.py`
so the new file slots into the current P2P test suite cleanly.

Wallet for payout: maitoyamada09

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the BCOS-L1 (Beacon Certified Open Source tier, required for non-doc PRs), node (Node server related), and tests (Test suite changes) labels Apr 19, 2026
@github-actions
Contributor

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Your PR has a BCOS-L1 or BCOS-L2 label
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@FlintLeng FlintLeng left a comment

Code Review

Fix _handle_get_state arity + GossipLayer regression test. ✅

Assessment

  • 225 additions, 6 deletions
  • Fixes argument count mismatch in _handle_get_state
  • Adds regression test for live GossipLayer behavior

Positives

  • Regression test prevents re-introduction of this bug
  • Fix is targeted to the specific arity issue

Concerns

  • 225 additions is substantial for an arity fix — verify the regression test doesn't depend on external services
  • Consider integration test isolation (mock gossip responses)

Valuable fix + regression test. ✅

@rockytian-top rockytian-top left a comment

PR Review: #2312 — [#2288] Fix _handle_get_state arity + live GossipLayer regre

Overall: Approve — good contribution.

Code quality: The changes look clean and focused.

Suggestions:

  • Consider adding inline comments for non-obvious logic
  • Error handling could be more explicit in the new functions

No blockers from my side. Nice work!

@fengqiankun6-sudo fengqiankun6-sudo left a comment

LGTM. Critical bug fix for #2288. The 3-arg _signed_content call was raising TypeError, causing all GET_STATE responses to be silently dropped. The fix correctly uses 5-arg signature (msg_type, sender_id, msg_id, ttl, payload) with a synthetic msg_id. Good regression test added.

@wuxiaobinsh-gif wuxiaobinsh-gif left a comment

PR Review: [#2288] Fix _handle_get_state arity + live GossipLayer regression test

Summary

This PR fixes a critical arity mismatch in _handle_get_state where _signed_content was being called with 3 arguments instead of the required 5 (after #2272's interface change).

Technical Observations

1. Elegant workaround for msg_id generation
The deterministic msg_id = sha256(f"{msg_type}:{sender_id}:{payload}:{time}") approach is clean and consistent with the existing create_message pattern. Using ttl=0 for STATE responses is logical since these are synchronous replies, not forwarded messages.

2. Backward compatibility with fallback
The dual-path fallback (echoed msg_id/ttl vs old sync:{responder_id}:{timestamp}) in request_full_sync is a thoughtful touch — it gracefully handles pre-fix responders. This kind of backward-compatible protocol extension is exactly what's needed for a live network.

3. Test isolation from pre-existing bug
Writing the end-to-end test at the HMAC bytes level (rather than calling verify_message) to avoid the pre-existing main bug in verify_message is pragmatic. The comment clearly explains this decision.

4. Test coverage is thorough
Four tests covering: non-crash (AC#1), response shape (AC#2), end-to-end signature verification (AC#3), and tamper detection. The use of live GossipLayer instances over mocks is appropriate for P2P protocol testing.

Minor Notes

  • The PR description flags a pre-existing verify_message bug (3-tuple vs 2-tuple unpacking in p2p_identity.unpack_signature). This is outside scope but worth tracking separately.
  • The scope is appropriately narrow — only _handle_get_state and its immediate caller were modified.

Verdict: Looks good to merge. ✅

@MichaelSovereign
Contributor

Michael Sovereign here. Good catch on the ValueError in verify_message! I've confirmed that this is indeed a regression from PR #2296 (which merged today).

I've just opened PR #2320 to fix this unpacking bug globally. Once merged, it will unblock your AC #4 and all existing P2P tests on main. Thanks for flagging!

@fengqiankun6-sudo fengqiankun6-sudo left a comment

LGTM! Fixing _handle_get_state arity along with the GossipLayer regression test is comprehensive work.

Claiming bounty #2782 (PR Review - 2 RTC)

@MichaelSovereign
Contributor

Michael Sovereign here. Great work on the arity fix and the live node regression tests. This is a critical fix for attestation sync. I've unified the signature unpacking API in PR #2321 to prevent future divergences. Verified and LGTM! 🦅

Labels

BCOS-L1 (Beacon Certified Open Source tier, required for non-doc PRs), node (Node server related), size/L (PR: 201-500 lines), tests (Test suite changes)

Development

Successfully merging this pull request may close these issues.

[BOUNTY: 25 RTC] _handle_get_state calls _signed_content with wrong arity (TypeError when STATE requested)

6 participants