Skip to content

Refactor/post v1 cleanup#46

Merged
entrius merged 35 commits intotestfrom
refactor/post-v1-cleanup
Apr 15, 2026
Merged

Refactor/post v1 cleanup#46
entrius merged 35 commits intotestfrom
refactor/post-v1-cleanup

Conversation

@LandynDev
Copy link
Copy Markdown
Collaborator

No description provided.

Validator axon, pending-confirm processing, and miner fulfillment now
reject any source tx whose on-chain sender does not match the address
the user proved ownership of at reserve time. Without this, a user
could reserve a miner for amount X and submit any unrelated third-party
tx of amount X to the miner's deposit address, tricking the quorum into
initiating a swap they never funded themselves.

The CLI swap summary now surfaces the reserved source address and warns
that validators will reject tx from any other address.
Block hint came from the caller's view of the chain, which can race
against finality by a block or two. The previous plus or minus 2 window
occasionally missed transfers that landed one block later than the hint.
Previously the entry was removed before the contract call, so a
transient RPC failure during vote_initiate would silently lose the
queued swap — the validator would never retry and the user's swap
would sit until reservation expiry. The entry is now only dropped on
successful vote or when the contract reports the swap is already
initiated (a terminal, non-retryable outcome).
New DEFAULT_MINER_TIMEOUT_CUSHION_BLOCKS constant (5), overridable via
MINER_TIMEOUT_CUSHION_BLOCKS env var. SwapFulfiller skips fulfillment
when fewer than cushion blocks remain before timeout, trading a skipped
swap for avoided slashes when the dest chain is slow to include the
miner's transfer. Set to 0 to disable the cushion.
SwapPoller no longer keeps an in-memory _processed filter — it just
reports raw contract state. SwapFulfiller's persistent _sent cache now
records both the dest-tx outcome and whether mark_fulfilled has
succeeded, so retry polls neither re-send funds nor re-call the
contract. One source of truth for "have I handled this swap", and it
survives restarts.
SwapFulfiller.send_dest_funds used to re-read the miner's commitment
from substrate storage on every swap just to get its own from_address.
The miner already knows its own addresses — they're the ones it just
posted. Now the miner neuron loads the pair once at startup, passes a
chain→address dict to SwapFulfiller, and reloads it when the CLI
signals a new rate post via a flag file in ~/.allways/miner/.
Dropped the key= parameter from ChainProvider.send_amount. TAO provider
now accepts bt.Wallet at construction; BTC provider already reads from
env. Miner's SwapFulfiller no longer threads its wallet into per-send
conditionals, and adding a new chain no longer means reasoning about
where its signing key is plumbed from.
Replaces the mutable fee_divisor storage field with a const FEE_DIVISOR
of 100 in lib.rs. set_fee_divisor and get_fee_divisor messages are gone
— even the owner cannot change the fee. Python pins FEE_DIVISOR=100 in
constants.py and every caller (validator init, miner init, CLI quote,
CLI swap, admin command, view command) consumes the constant directly
instead of polling the contract. One source of truth, no cache to
refresh, no divergence risk between miner and validator views.

Contract metadata will be regenerated at the C9/C10 dev-env redeploy
boundary.
Previously only the timeout path checked min_collateral after slashing.
The confirm path also deducts from collateral (the 1% fee), and a
marginally-collateralized miner could drop below the minimum without
being deactivated, so validators would still credit them with
crown-time despite the miner being unable to accept new swaps.

Now confirm mirrors timeout's guard: if collateral drops below
min_collateral after fee deduction, flip miner_active to false,
record the deactivation block, and emit MinerActivated{active: false}
so event watchers can observe the state change.
_log_status, _check_state_changes, _read_current_state, _last_pair,
_last_active, _last_collateral, _last_status_step, and
MINER_STATUS_LOG_INTERVAL_STEPS are all gone. The miner forward loop
no longer polls contract state every forward step to pretty-log state
transitions — that was ~80 lines of code existing purely for logs
that nobody reads during steady state. What mattered (collateral,
active, committed pair) is visible via the on-chain dashboard and the
validator's event watcher.
_poll_commitments was a single function interleaving three concerns:
event pruning, rate refresh, and dereg purge. Split into
_prune_aged_rate_events, _refresh_miner_rates, and
_purge_deregistered_hotkeys, with _poll_commitments now a thin
orchestrator that enforces the poll cadence and calls the three in
order. No behavior change — existing tests still cover the orchestrator
via _poll_commitments.
rate_state.db and pending_confirms.db are gone. One state.db file,
one ValidatorStateStore class, one connection, one lock, four tables:
pending_confirms, rate_events, collateral_events, swap_outcomes. Every
validator consumer now accesses state via self.state_store instead of
two separate handles. Dev-env operators running multiple validators
against the same file still benefit from the busy_timeout-before-WAL
init dance that the old RateStateStore introduced.

Old rate_state.db and pending_confirms.db files from prior runs are
orphaned — operators should delete them on upgrade.
New class that replays contract events (CollateralPosted/Withdrawn/
Slashed, MinerActivated, SwapCompleted/TimedOut, ConfigUpdated) into
in-memory miner state. The decoder is ported from alw-utils' existing
event watcher (same SCALE decode path, same metadata-driven registry)
so both consumers can stay in sync as ink! events evolve.

No validator wiring yet — this commit just adds the class and 11 unit
tests that exercise every event type against a stub substrate. The
real substrate sync path will come in C14 together with removing
_poll_collaterals and _refresh_min_collateral from forward.py.
_poll_collaterals, _refresh_min_collateral, and their related constants
(COLLATERAL_POLL_INTERVAL_BLOCKS, MIN_COLLATERAL_REFRESH_INTERVAL_BLOCKS)
are gone. The validator now owns a ContractEventWatcher that replays
collateral, active-flag, min_collateral, and swap-outcome events from
the chain each forward step.

Scoring consequences:
- _crown_holders now gates on the event-sourced active_miners set in
  addition to metagraph membership — fixing a long-standing bug where
  an inactive miner could still hold the crown if they stayed in the
  metagraph with a good rate and collateral above min.
- _replay_crown_time takes the event_watcher as a new parameter and
  sources collateral history from its in-memory list. Rates still come
  from state_store (populated by commitment polling).
- min_collateral reads off the watcher's cached value.

ValidatorStateStore loses its collateral_events table and the
insert_/get_latest_/get_events_in_range_ collateral methods that go
with it. Tests migrated to seed the watcher directly via the
CollateralEvent dataclass. Dead test file test_poll_collaterals.py
removed.
Outcome persistence moved fully to ContractEventWatcher in C14 —
SwapTracker now just maintains the in-memory active set so the forward
loop knows what to verify and vote on. The store_store parameter is
gone from __init__, and the fragile 'if None infer state from
timeout_block' branch in _poll_inner is replaced with a bounded retry
counter (NULL_SWAP_RETRY_LIMIT=3). Transient RPC flakes no longer
manufacture completion or timeout outcomes; after the retry limit the
swap is silently dropped from active tracking and the event watcher's
replay fills in the real outcome.

resolve() no longer writes to the ledger either — it just drops the
swap from active and clears voted/retry state.
Function renames for readability across the validator module:
  _clear_provider_caches              → clear_provider_caches
  _process_pending_confirms           → initialize_pending_user_reservations
  _verify_fulfilled                   → confirm_miner_fulfillments
  _score_miners                       → run_scoring_pass
  _replay_crown_time                  → replay_crown_time_window
  _crown_holders                      → crown_holders_at_instant
  _timeout_expired                    → enforce_swap_timeouts
  _extend_near_timeout_fulfilled      → extend_fulfilled_near_timeout
  SwapVerifier.is_swap_complete       → verify_miner_fulfillment

No behavior change. Every caller and test now reads like the action
it's performing instead of grepping through underscored abbreviations.
Sweeps ~40 module-level private functions across validator, miner,
cli, event_watcher, axon_handlers, chain_providers/bitcoin, and
various CLI helpers. Every free function that was _foo becomes foo,
with all callers and imports updated atomically.

Class-level private methods (SwapFulfiller._sent, ValidatorStateStore
._require_connection, etc.) are intentionally left alone in this
commit — class internals need more care than a free-function sweep
and can land separately if they're worth the churn.

No behavior change. 202 tests still green.
Covers the post-refactor architecture so operators see at a glance:
- Validator state now consolidated at ~/.allways/validator/state.db
- MINER_TIMEOUT_CUSHION_BLOCKS env var and the slash-safety reasoning
- Lite-node requirement applies to event streaming as well as
  commitment polling (same connection, same reason)

Project-level CLAUDE.md (outside the allways git repo) is updated
locally with the same architectural notes — not part of this commit.
Full sweep for consistency — every column, field, property, parameter,
local variable, and test fixture that used source/dest now uses
from/to. Scoped across:

- ink! contract (lib.rs, types.rs, events.rs): SwapData fields,
  message params, internal mappings (used_from_tx, reservation_*,
  reservation_from_addr). Contract rebuilt, metadata JSON regenerated.
- Python dataclasses (classes.py): Swap, MinerPair, Commitment,
  SwapRequest, Reservation
- contract_client.py selector args and SCALE encoders
- validator module (forward, swap_tracker, chain_verification,
  axon_handlers, state_store, event_watcher)
- miner module (fulfillment, swap_poller, neurons/miner)
- CLI (swap, quote, view, post_tx, resume, status, pair, miner_commands)
- State store schema (pending_confirms columns)
- Helper functions: calculate_dest_amount → calculate_to_amount;
  sign_source_proof → sign_from_proof; verify_source_proof →
  verify_from_proof
- Synapses: source_chain / dest_chain / source_address fields →
  from_chain / to_chain / from_address
- CLI flags: --src/--dest → --from/--to;
  --source-address → --from-address; --source-tx-hash → --from-tx-hash
- Unified miner_deposit_address and miner_source_address → miner_from_address

Miner event topic hashes are unchanged because ink! signature topics
are derived from event type names, not field names. contract_client
selectors are also unchanged (4-byte hash of message name). Metadata
JSON is regenerated via `cargo contract build --release` so kwargs
dicts match the new ABI param names.

All 202 unit tests pass.
Follows C21 (module-level functions) with a deeper sweep that touches:
- All class private methods (ValidatorStateStore.require_connection,
  ContractEventWatcher.process_block/apply_event/..., SwapFulfiller
  .verify_swap_safety/load_sent_cache/save_sent_cache, SubtensorProvider
  .parse_raw_extrinsic/get_block/match_transfer, BitcoinProvider
  .rpc_call/blockstream_*, SwapVerifier.verify_tx, etc.)
- All private instance state fields (self.conn, self.lock, self.db_path,
  self.cursor, self.last_known_rates, self.sent, self.sent_cache_path,
  self.null_retry_count, self.consecutive_poll_failures, etc.)
- Test module helpers (make_store, make_validator, make_watcher,
  make_tracker, make_swap, make_pair, seed_collateral, etc.)
- contract_client static alias: compact_encode_len (was wrapped twice)

Naming conventions are now ubiquitous: no underscore prefix anywhere
except dunder methods. Every test helper and class internal reads the
same way as public API — the reader doesn't have to grep through
prefixed noise. 202 tests green.
Final cleanup pass: dest_decimals → to_decimals, source_decimals →
from_decimals in utils.rate.calculate_to_amount signature and all
callers (scoring, CLI quote, CLI swap). canon_src/canon_dest locals
→ canon_from/canon_to across CLI and utils. Keeps the from/to
naming consistent all the way down to the lowest-level rate math.
Three related fixes to the event watcher:

1. Cold start bootstrap: initialize() now takes metagraph_hotkeys and
   a contract_client, then reads current collateral / active flag /
   min_collateral from the contract for every miner before advancing
   the cursor. Previously replay started from (head - 2×window) with
   empty state, so any miner whose CollateralPosted / MinerActivated
   predated the window was silently treated as collateral=0 and
   inactive — effectively excluded from crown-time scoring until they
   emitted a fresh event. Snapshot is O(N) on startup only.

2. SET vs delta: CollateralPosted.total and CollateralWithdrawn.remaining
   are now used as authoritative post-event balances (SET semantics).
   Only CollateralSlashed still needs delta math since ink! doesn't
   emit a post-slash total. Eliminates drift when replay is missing
   history.

3. Bounded sync: sync_to caps at MAX_BLOCKS_PER_SYNC=50 blocks per
   call. A 10-minute outage used to block the forward loop while
   sync_to walked hundreds of blocks sequentially; now the cursor
   catches up incrementally across forward steps without stalling
   verification, timeouts, or scoring.

Also:
- Per-hotkey collateral events index for O(log n) latest-before
  lookups during scoring replay (was O(n²) in the window-start
  reconstruction loop).
- Warning logs on decoder failures (topic-unknown, byte parsing,
  scale decode) so a stale metadata.json doesn't silently drop every
  event.
- set_collateral / adjust_collateral split: set_collateral is the
  authoritative SET helper, adjust_collateral wraps it for delta math.
gather(return_exceptions=True) on both discovery and refresh passes —
a single flaky get_swap can no longer abort the entire forward step.
Exceptions fall through the same retry-counter path as None returns so
a transient RPC flake buys the swap a few extra attempts before being
dropped.

prune_stale_voted_ids() sweeps any voted_ids whose swap no longer
lives in active at the end of each poll. Normally these get cleaned
up alongside active.pop() but a race or exceptional path can leave
orphans; this bounds the set size.
ContractErrorKind was unused outside contract_client.py — every caller
matched on raw substrate error strings ('ContractReverted' in str(e)).
The enum added no value and the string match was duplicated across 4
call sites.

Changes:
- Delete ContractErrorKind enum and its kind= parameter from
  ContractError. The class is now a plain Exception subclass with a
  message.
- Add is_contract_rejection(e) module helper that matches both our own
  'contract rejected' messages and substrate's 'ContractReverted'
  string. One place to update if either convention changes.
- Replace 5 call sites in forward.py and axon_handlers.py with
  is_contract_rejection(e).

Getter return-type convention is already uniform after the fee
cleanup: every getter raises ContractError on failure, returns a
concrete int/bool/str/Swap. Callers that want a fallback use
try/except or `value or default` short-circuits for 0/empty.
SwapRequest, Reservation, SwapRequestStatus, and ReservationStatus were
declared in classes.py but never imported or instantiated anywhere in
the codebase. Dead code from a previous design iteration — deleted.

Also: the ValidatorStateStore.row_to_pending static helper kept an
underscore prefix from the earlier class-level sweep (explicit skip on
collision). Rename to match the rest of the codebase now that we've
audited it's safe.
Two readability/operational fixes in SwapFulfiller:

1. Rename the post-fee amount from to_amount → user_receives_amount
   through the entire send path (verify_swap_safety return, local vars
   in process_swap, send_dest_funds parameter, log messages). Previous
   naming collided with swap.to_amount which has different semantics at
   different lifecycle stages (pre-fee at initiate, post-fee after
   mark_fulfilled). The local now says what it is.

2. Hot-reload MINER_TIMEOUT_CUSHION_BLOCKS on every verify_swap_safety
   call. Operators tuning their slash exposure no longer need to
   restart the miner — edit the env and the next forward step picks
   up the new cushion. Read is cheap (one os.environ.get + int
   parse) and only runs when a swap is being evaluated.

Step 3 comment updated to explain the two-purpose sent cache (skip
re-send AND skip re-mark) since retry paths now also re-mark.
constants.py cleanup:
- Delete SWAP_FEE_PERCENT and MAX_FEE_PERCENT. FEE_DIVISOR is
  hardcoded 1% post-C8; these display helpers were either duplicative
  or stale (MAX_FEE_PERCENT referenced a divisor floor we removed).
- Delete COMMITMENT_REVEAL_BLOCKS — not referenced anywhere in the
  codebase after the V1 rewrite.
- Delete commented-out testnet CONTRACT_ADDRESS and replace with a
  doc line pointing at the env var override path.
- Retag MINER_POLL_INTERVAL_SECONDS / VALIDATOR_POLL_INTERVAL_SECONDS
  as "neuron loop heartbeat" in a comment so the name stops implying
  it's the scoring/forward cadence (which is driven by bittensor's
  base class, not this constant).
- Drop the "Must be > max swap amount" comment on MIN_COLLATERAL_TAO
  — the default values (0.1 / 0.5 TAO) violate the invariant and the
  comment was misleading.

compact_encode_len cleanup:
- Remove the `compact_encode_len = staticmethod(compact_encode_len)`
  class alias. The module-level function is already imported where
  needed (axon_handlers, tests). Internal encode_value callers now
  use the module function directly instead of self.compact_encode_len.
- Tests updated to import the module-level helper.
…eads

SwapVoter was 42 lines wrapping 3 one-line contract calls — 2 methods
caught exceptions and returned bool, 1 deliberately re-raised. Leaky
abstraction that added nothing over a module-level helper. Now there
are 3 free functions in allways/validator/voting.py:
  confirm_swap(client, wallet, swap_id) -> bool
  timeout_swap(client, wallet, swap_id) -> bool
  extend_swap_timeout(client, wallet, swap_id) -> bool
Callers in forward.py pass self.contract_client + self.wallet; the
voter parameter is gone from confirm_miner_fulfillments and
enforce_swap_timeouts. Validator init no longer builds a SwapVoter.

Also factored purge_expired out of the read path. Previously get_all,
has, and pending_size each wrote-commit-purged expired rows under the
lock on every call — a side effect hidden inside a read. Now
purge_expired_pending() is explicit and the forward loop calls it
once per tick. Reads are pure.
Adds a structured module docstring so the next person reading this
file doesn't have to walk 1k lines to understand the selector registry,
encoder, reader helpers, writer, and error flow. Also documents the
3-step recipe for adding a new contract message.
New TestSCALEDecoder class covers the decoder end-to-end without a
live substrate — hand-encoded AccountId/u64/u128/bool bytes fed into
decode_topic_fields + decode_data_fields. Protects against decoder
regressions from metadata drift or SCALE encoder changes.

Also new:
- TestBootstrap: verifies initialize() seeds collateral/active/
  min_collateral from a mocked contract_client and survives RPC
  exceptions.
- TestSetCollateral: verifies the SET semantics from R1 (total from
  CollateralPosted, remaining from CollateralWithdrawn) and the
  O(log n) bisect in get_latest_collateral_before.

Bug fix in get_latest_collateral_before: the pre-event gap case
(events exist for hotkey but none at/before block) was falling back
to the snapshot, which reflects state AFTER the events and is not
valid for those older queries. Now returns None for that gap and only
uses the snapshot when no events exist at all.
…erage

Rename the stale test_swap_tracker_outcomes.py to test_swap_tracker.py
and grow it from 7 tests to 25. Adds coverage for:

- initialize(): empty contract, backward scan, cutoff stop, terminal
  swap skipping
- RPC resilience: gather(return_exceptions=True) from R2 — a flaky
  get_swap on one swap no longer aborts the poll; exceptions count
  toward NULL_SWAP_RETRY_LIMIT; discovery-phase flakes are skipped
- prune_stale_voted_ids: orphan vote cleanup from R2
- get_fulfilled / get_near_timeout_fulfilled / get_timed_out: status
  filter, timeout-zero handling, strict-vs-inclusive boundary
- Unused tmp_path fixture params removed from resolve tests (tracker
  doesn't touch the filesystem)

Addresses review findings C6, C7, M2, m1.
Previously this file only roundtripped the third-party bitcoin_message_tool
library directly — our BitcoinProvider wrappers had zero coverage. A
regression in address type detection, WIF conversion, P2TR rejection,
or the missing-key error branch would have passed silently.

New tests exercise the provider methods end-to-end:
- P2WPKH / P2PKH / P2SH-P2WPKH sign + verify roundtrips
- P2TR rejection on both sign and verify (BIP-137 doesn't support it)
- Unknown address type rejection
- Missing WIF falls through to empty-string return
- Regtest WIF conversion path
- Malformed signature is caught and returns False, never raises
- Regtest address conversion path in verify

Addresses review finding M1.
Three kinds of fixes:

1. Delete TestEdgeCases::test_rate_precision_constant,
   test_btc_sat_constant, test_tao_rao_constant — they assert
   RATE_PRECISION == 10**18 etc., which only fails if you intentionally
   change the constant. They test nothing about calculate_to_amount's
   behavior.

2. TestEdgeCases::test_negative_rate_string → test_negative_rate_
   produces_negative_amount. The old assertion was `isinstance(result,
   int)` which passes for any integer, including wildly wrong ones.
   The new assertion locks in the actual behavior (negative input
   produces negated output) and documents that the actual guard lives
   upstream at the contract post.

3. TestFeeDeduction::test_fee_plus_user_equals_total had a weak
   inequality: `fee + user <= tao_amount`. Since
   `apply_fee_deduction(to_amount, d) = to_amount - to_amount // d`,
   the invariant is exactly `fee + user == tao_amount`. Tightened to
   `==` and two new tests cover the integer floor-division edge cases
   (99, 100, 101 rao through the fee function) that the old weak
   assertion couldn't have caught.

Addresses review finding M3.
New tests/test_fulfillment.py covers the parts of SwapFulfiller the
refactor branch changed:

- load_timeout_cushion_blocks(): default, empty string, valid int,
  zero allowed, negative clamped to 0 (sign-flip safety), invalid
  string falls back
- verify_swap_safety() cushion behavior including hot-reload mid-run
  (R5 feature) — the test patches env between two calls on the same
  fulfiller instance and asserts the second call sees the new value
  without reconstruction
- Zero cushion allows right up to timeout
- Missing rate or miner_from_address fails safety
- Returns the post-fee user_receives_amount, not the pre-fee
  to_amount — locks in the R5 naming and math

Also: test_poll_commitments.make_validator now includes a MagicMock
event_watcher defensively. poll_commitments doesn't touch it today
but a helper that does later should AttributeError loud, not
silently pass against a missing attribute.

Addresses review findings C8 (cushion hot-reload test) and M4
(poll_commitments fixture) along with the user_receives_amount
rename verification.
Three unrelated cleanups bundled:

1. test_chains.py — delete TestChainProperties. The four tests asserted
   CHAIN_BTC.decimals == 8 etc., which is the same as asserting the
   constant equals itself. They only fail if someone changes the
   constant, at which point these tests fail first and obscure any
   downstream breakage from the same change.

2. test_scale.py::make_client — add a docstring explaining why we use
   __new__ to skip __init__ (the real constructor opens a subtensor
   connection we don't need for encoder-only tests).

3. test_event_watcher.py::make_watcher — use Alice's SS58 address as
   the test contract_address instead of '5contract'. The fake string
   bypasses any future validation the watcher might add on the
   contract address; a real SS58 is forward-compatible.

Addresses review findings m2, m3, m4.
@entrius entrius merged commit 32c1cf8 into test Apr 15, 2026
2 checks passed
@entrius entrius deleted the refactor/post-v1-cleanup branch April 15, 2026 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants