Refactor/post v1 cleanup by LandynDev · Pull Request #46 · entrius/allways

LandynDev · 2026-04-15T15:05:40Z

No description provided.

Validator axon, pending-confirm processing, and miner fulfillment now reject any source tx whose on-chain sender does not match the address the user proved ownership of at reserve time. Without this, a user could reserve a miner for amount X and submit any unrelated third-party tx of amount X to the miner's deposit address, tricking the quorum into initiating a swap they never funded themselves. The CLI swap summary now surfaces the reserved source address and warns that validators will reject tx from any other address.

Block hint came from the caller's view of the chain, which can race against finality by a block or two. The previous plus or minus 2 window occasionally missed transfers that landed one block later than the hint.

Previously the entry was removed before the contract call, so a transient RPC failure during vote_initiate would silently lose the queued swap — the validator would never retry and the user's swap would sit until reservation expiry. The entry is now only dropped on successful vote or when the contract reports the swap is already initiated (a terminal, non-retryable outcome).

New DEFAULT_MINER_TIMEOUT_CUSHION_BLOCKS constant (5), overridable via MINER_TIMEOUT_CUSHION_BLOCKS env var. SwapFulfiller skips fulfillment when fewer than cushion blocks remain before timeout, trading a skipped swap for avoided slashes when the dest chain is slow to include the miner's transfer. Set to 0 to disable the cushion.

SwapPoller no longer keeps an in-memory _processed filter — it just reports raw contract state. SwapFulfiller's persistent _sent cache now records both the dest-tx outcome and whether mark_fulfilled has succeeded, so retry polls neither re-send funds nor re-call the contract. One source of truth for "have I handled this swap", and it survives restarts.

SwapFulfiller.send_dest_funds used to re-read the miner's commitment from substrate storage on every swap just to get its own from_address. The miner already knows its own addresses — they're the ones it just posted. Now the miner neuron loads the pair once at startup, passes a chain→address dict to SwapFulfiller, and reloads it when the CLI signals a new rate post via a flag file in ~/.allways/miner/.

Dropped the key= parameter from ChainProvider.send_amount. TAO provider now accepts bt.Wallet at construction; BTC provider already reads from env. Miner's SwapFulfiller no longer threads its wallet into per-send conditionals, and adding a new chain no longer means reasoning about where its signing key is plumbed from.

Replaces the mutable fee_divisor storage field with a const FEE_DIVISOR of 100 in lib.rs. set_fee_divisor and get_fee_divisor messages are gone — even the owner cannot change the fee. Python pins FEE_DIVISOR=100 in constants.py and every caller (validator init, miner init, CLI quote, CLI swap, admin command, view command) consumes the constant directly instead of polling the contract. One source of truth, no cache to refresh, no divergence risk between miner and validator views. Contract metadata will be regenerated at the C9/C10 dev-env redeploy boundary.

Previously only the timeout path checked min_collateral after slashing. The confirm path also deducts from collateral (the 1% fee), and a marginally-collateralized miner could drop below the minimum without being deactivated, so validators would still credit them with crown-time despite the miner being unable to accept new swaps. Now confirm mirrors timeout's guard: if collateral drops below min_collateral after fee deduction, flip miner_active to false, record the deactivation block, and emit MinerActivated{active: false} so event watchers can observe the state change.

_log_status, _check_state_changes, _read_current_state, _last_pair, _last_active, _last_collateral, _last_status_step, and MINER_STATUS_LOG_INTERVAL_STEPS are all gone. The miner forward loop no longer polls contract state every forward step to pretty-log state transitions — that was ~80 lines of code existing purely for logs that nobody reads during steady state. What mattered (collateral, active, committed pair) is visible via the on-chain dashboard and the validator's event watcher.

_poll_commitments was a single function interleaving three concerns: event pruning, rate refresh, and dereg purge. Split into _prune_aged_rate_events, _refresh_miner_rates, and _purge_deregistered_hotkeys, with _poll_commitments now a thin orchestrator that enforces the poll cadence and calls the three in order. No behavior change — existing tests still cover the orchestrator via _poll_commitments.

rate_state.db and pending_confirms.db are gone. One state.db file, one ValidatorStateStore class, one connection, one lock, four tables: pending_confirms, rate_events, collateral_events, swap_outcomes. Every validator consumer now accesses state via self.state_store instead of two separate handles. Dev-env operators running multiple validators against the same file still benefit from the busy_timeout-before-WAL init dance that the old RateStateStore introduced. Old rate_state.db and pending_confirms.db files from prior runs are orphaned — operators should delete them on upgrade.

New class that replays contract events (CollateralPosted/Withdrawn/ Slashed, MinerActivated, SwapCompleted/TimedOut, ConfigUpdated) into in-memory miner state. The decoder is ported from alw-utils' existing event watcher (same SCALE decode path, same metadata-driven registry) so both consumers can stay in sync as ink! events evolve. No validator wiring yet — this commit just adds the class and 11 unit tests that exercise every event type against a stub substrate. The real substrate sync path will come in C14 together with removing _poll_collaterals and _refresh_min_collateral from forward.py.

_poll_collaterals, _refresh_min_collateral, and their related constants (COLLATERAL_POLL_INTERVAL_BLOCKS, MIN_COLLATERAL_REFRESH_INTERVAL_BLOCKS) are gone. The validator now owns a ContractEventWatcher that replays collateral, active-flag, min_collateral, and swap-outcome events from the chain each forward step. Scoring consequences: - _crown_holders now gates on the event-sourced active_miners set in addition to metagraph membership — fixing a long-standing bug where an inactive miner could still hold the crown if they stayed in the metagraph with a good rate and collateral above min. - _replay_crown_time takes the event_watcher as a new parameter and sources collateral history from its in-memory list. Rates still come from state_store (populated by commitment polling). - min_collateral reads off the watcher's cached value. ValidatorStateStore loses its collateral_events table and the insert_/get_latest_/get_events_in_range_ collateral methods that go with it. Tests migrated to seed the watcher directly via the CollateralEvent dataclass. Dead test file test_poll_collaterals.py removed.

Outcome persistence moved fully to ContractEventWatcher in C14 — SwapTracker now just maintains the in-memory active set so the forward loop knows what to verify and vote on. The store_store parameter is gone from __init__, and the fragile 'if None infer state from timeout_block' branch in _poll_inner is replaced with a bounded retry counter (NULL_SWAP_RETRY_LIMIT=3). Transient RPC flakes no longer manufacture completion or timeout outcomes; after the retry limit the swap is silently dropped from active tracking and the event watcher's replay fills in the real outcome. resolve() no longer writes to the ledger either — it just drops the swap from active and clears voted/retry state.

Function renames for readability across the validator module: _clear_provider_caches → clear_provider_caches _process_pending_confirms → initialize_pending_user_reservations _verify_fulfilled → confirm_miner_fulfillments _score_miners → run_scoring_pass _replay_crown_time → replay_crown_time_window _crown_holders → crown_holders_at_instant _timeout_expired → enforce_swap_timeouts _extend_near_timeout_fulfilled → extend_fulfilled_near_timeout SwapVerifier.is_swap_complete → verify_miner_fulfillment No behavior change. Every caller and test now reads like the action it's performing instead of grepping through underscored abbreviations.

Sweeps ~40 module-level private functions across validator, miner, cli, event_watcher, axon_handlers, chain_providers/bitcoin, and various CLI helpers. Every free function that was _foo becomes foo, with all callers and imports updated atomically. Class-level private methods (SwapFulfiller._sent, ValidatorStateStore ._require_connection, etc.) are intentionally left alone in this commit — class internals need more care than a free-function sweep and can land separately if they're worth the churn. No behavior change. 202 tests still green.

Covers the post-refactor architecture so operators see at a glance: - Validator state now consolidated at ~/.allways/validator/state.db - MINER_TIMEOUT_CUSHION_BLOCKS env var and the slash-safety reasoning - Lite-node requirement applies to event streaming as well as commitment polling (same connection, same reason) Project-level CLAUDE.md (outside the allways git repo) is updated locally with the same architectural notes — not part of this commit.

Full sweep for consistency — every column, field, property, parameter, local variable, and test fixture that used source/dest now uses from/to. Scoped across: - ink! contract (lib.rs, types.rs, events.rs): SwapData fields, message params, internal mappings (used_from_tx, reservation_*, reservation_from_addr). Contract rebuilt, metadata JSON regenerated. - Python dataclasses (classes.py): Swap, MinerPair, Commitment, SwapRequest, Reservation - contract_client.py selector args and SCALE encoders - validator module (forward, swap_tracker, chain_verification, axon_handlers, state_store, event_watcher) - miner module (fulfillment, swap_poller, neurons/miner) - CLI (swap, quote, view, post_tx, resume, status, pair, miner_commands) - State store schema (pending_confirms columns) - Helper functions: calculate_dest_amount → calculate_to_amount; sign_source_proof → sign_from_proof; verify_source_proof → verify_from_proof - Synapses: source_chain / dest_chain / source_address fields → from_chain / to_chain / from_address - CLI flags: --src/--dest → --from/--to; --source-address → --from-address; --source-tx-hash → --from-tx-hash - Unified miner_deposit_address and miner_source_address → miner_from_address Miner event topic hashes are unchanged because ink! signature topics are derived from event type names, not field names. contract_client selectors are also unchanged (4-byte hash of message name). Metadata JSON is regenerated via `cargo contract build --release` so kwargs dicts match the new ABI param names. All 202 unit tests pass.

Follows C21 (module-level functions) with a deeper sweep that touches: - All class private methods (ValidatorStateStore.require_connection, ContractEventWatcher.process_block/apply_event/..., SwapFulfiller .verify_swap_safety/load_sent_cache/save_sent_cache, SubtensorProvider .parse_raw_extrinsic/get_block/match_transfer, BitcoinProvider .rpc_call/blockstream_*, SwapVerifier.verify_tx, etc.) - All private instance state fields (self.conn, self.lock, self.db_path, self.cursor, self.last_known_rates, self.sent, self.sent_cache_path, self.null_retry_count, self.consecutive_poll_failures, etc.) - Test module helpers (make_store, make_validator, make_watcher, make_tracker, make_swap, make_pair, seed_collateral, etc.) - contract_client static alias: compact_encode_len (was wrapped twice) Naming conventions are now ubiquitous: no underscore prefix anywhere except dunder methods. Every test helper and class internal reads the same way as public API — the reader doesn't have to grep through prefixed noise. 202 tests green.

Final cleanup pass: dest_decimals → to_decimals, source_decimals → from_decimals in utils.rate.calculate_to_amount signature and all callers (scoring, CLI quote, CLI swap). canon_src/canon_dest locals → canon_from/canon_to across CLI and utils. Keeps the from/to naming consistent all the way down to the lowest-level rate math.

Three related fixes to the event watcher: 1. Cold start bootstrap: initialize() now takes metagraph_hotkeys and a contract_client, then reads current collateral / active flag / min_collateral from the contract for every miner before advancing the cursor. Previously replay started from (head - 2×window) with empty state, so any miner whose CollateralPosted / MinerActivated predated the window was silently treated as collateral=0 and inactive — effectively excluded from crown-time scoring until they emitted a fresh event. Snapshot is O(N) on startup only. 2. SET vs delta: CollateralPosted.total and CollateralWithdrawn.remaining are now used as authoritative post-event balances (SET semantics). Only CollateralSlashed still needs delta math since ink! doesn't emit a post-slash total. Eliminates drift when replay is missing history. 3. Bounded sync: sync_to caps at MAX_BLOCKS_PER_SYNC=50 blocks per call. A 10-minute outage used to block the forward loop while sync_to walked hundreds of blocks sequentially; now the cursor catches up incrementally across forward steps without stalling verification, timeouts, or scoring. Also: - Per-hotkey collateral events index for O(log n) latest-before lookups during scoring replay (was O(n²) in the window-start reconstruction loop). - Warning logs on decoder failures (topic-unknown, byte parsing, scale decode) so a stale metadata.json doesn't silently drop every event. - set_collateral / adjust_collateral split: set_collateral is the authoritative SET helper, adjust_collateral wraps it for delta math.

gather(return_exceptions=True) on both discovery and refresh passes — a single flaky get_swap can no longer abort the entire forward step. Exceptions fall through the same retry-counter path as None returns so a transient RPC flake buys the swap a few extra attempts before being dropped. prune_stale_voted_ids() sweeps any voted_ids whose swap no longer lives in active at the end of each poll. Normally these get cleaned up alongside active.pop() but a race or exceptional path can leave orphans; this bounds the set size.

ContractErrorKind was unused outside contract_client.py — every caller matched on raw substrate error strings ('ContractReverted' in str(e)). The enum added no value and the string match was duplicated across 4 call sites. Changes: - Delete ContractErrorKind enum and its kind= parameter from ContractError. The class is now a plain Exception subclass with a message. - Add is_contract_rejection(e) module helper that matches both our own 'contract rejected' messages and substrate's 'ContractReverted' string. One place to update if either convention changes. - Replace 5 call sites in forward.py and axon_handlers.py with is_contract_rejection(e). Getter return-type convention is already uniform after the fee cleanup: every getter raises ContractError on failure, returns a concrete int/bool/str/Swap. Callers that want a fallback use try/except or `value or default` short-circuits for 0/empty.

SwapRequest, Reservation, SwapRequestStatus, and ReservationStatus were declared in classes.py but never imported or instantiated anywhere in the codebase. Dead code from a previous design iteration — deleted. Also: the ValidatorStateStore.row_to_pending static helper kept an underscore prefix from the earlier class-level sweep (explicit skip on collision). Rename to match the rest of the codebase now that we've audited it's safe.

Two readability/operational fixes in SwapFulfiller: 1. Rename the post-fee amount from to_amount → user_receives_amount through the entire send path (verify_swap_safety return, local vars in process_swap, send_dest_funds parameter, log messages). Previous naming collided with swap.to_amount which has different semantics at different lifecycle stages (pre-fee at initiate, post-fee after mark_fulfilled). The local now says what it is. 2. Hot-reload MINER_TIMEOUT_CUSHION_BLOCKS on every verify_swap_safety call. Operators tuning their slash exposure no longer need to restart the miner — edit the env and the next forward step picks up the new cushion. Read is cheap (one os.environ.get + int parse) and only runs when a swap is being evaluated. Step 3 comment updated to explain the two-purpose sent cache (skip re-send AND skip re-mark) since retry paths now also re-mark.

constants.py cleanup: - Delete SWAP_FEE_PERCENT and MAX_FEE_PERCENT. FEE_DIVISOR is hardcoded 1% post-C8; these display helpers were either duplicative or stale (MAX_FEE_PERCENT referenced a divisor floor we removed). - Delete COMMITMENT_REVEAL_BLOCKS — not referenced anywhere in the codebase after the V1 rewrite. - Delete commented-out testnet CONTRACT_ADDRESS and replace with a doc line pointing at the env var override path. - Retag MINER_POLL_INTERVAL_SECONDS / VALIDATOR_POLL_INTERVAL_SECONDS as "neuron loop heartbeat" in a comment so the name stops implying it's the scoring/forward cadence (which is driven by bittensor's base class, not this constant). - Drop the "Must be > max swap amount" comment on MIN_COLLATERAL_TAO — the default values (0.1 / 0.5 TAO) violate the invariant and the comment was misleading. compact_encode_len cleanup: - Remove the `compact_encode_len = staticmethod(compact_encode_len)` class alias. The module-level function is already imported where needed (axon_handlers, tests). Internal encode_value callers now use the module function directly instead of self.compact_encode_len. - Tests updated to import the module-level helper.

…eads SwapVoter was 42 lines wrapping 3 one-line contract calls — 2 methods caught exceptions and returned bool, 1 deliberately re-raised. Leaky abstraction that added nothing over a module-level helper. Now there are 3 free functions in allways/validator/voting.py: confirm_swap(client, wallet, swap_id) -> bool timeout_swap(client, wallet, swap_id) -> bool extend_swap_timeout(client, wallet, swap_id) -> bool Callers in forward.py pass self.contract_client + self.wallet; the voter parameter is gone from confirm_miner_fulfillments and enforce_swap_timeouts. Validator init no longer builds a SwapVoter. Also factored purge_expired out of the read path. Previously get_all, has, and pending_size each wrote-commit-purged expired rows under the lock on every call — a side effect hidden inside a read. Now purge_expired_pending() is explicit and the forward loop calls it once per tick. Reads are pure.

Adds a structured module docstring so the next person reading this file doesn't have to walk 1k lines to understand the selector registry, encoder, reader helpers, writer, and error flow. Also documents the 3-step recipe for adding a new contract message.

New TestSCALEDecoder class covers the decoder end-to-end without a live substrate — hand-encoded AccountId/u64/u128/bool bytes fed into decode_topic_fields + decode_data_fields. Protects against decoder regressions from metadata drift or SCALE encoder changes. Also new: - TestBootstrap: verifies initialize() seeds collateral/active/ min_collateral from a mocked contract_client and survives RPC exceptions. - TestSetCollateral: verifies the SET semantics from R1 (total from CollateralPosted, remaining from CollateralWithdrawn) and the O(log n) bisect in get_latest_collateral_before. Bug fix in get_latest_collateral_before: the pre-event gap case (events exist for hotkey but none at/before block) was falling back to the snapshot, which reflects state AFTER the events and is not valid for those older queries. Now returns None for that gap and only uses the snapshot when no events exist at all.

…erage Rename the stale test_swap_tracker_outcomes.py to test_swap_tracker.py and grow it from 7 tests to 25. Adds coverage for: - initialize(): empty contract, backward scan, cutoff stop, terminal swap skipping - RPC resilience: gather(return_exceptions=True) from R2 — a flaky get_swap on one swap no longer aborts the poll; exceptions count toward NULL_SWAP_RETRY_LIMIT; discovery-phase flakes are skipped - prune_stale_voted_ids: orphan vote cleanup from R2 - get_fulfilled / get_near_timeout_fulfilled / get_timed_out: status filter, timeout-zero handling, strict-vs-inclusive boundary - Unused tmp_path fixture params removed from resolve tests (tracker doesn't touch the filesystem) Addresses review findings C6, C7, M2, m1.

Previously this file only roundtripped the third-party bitcoin_message_tool library directly — our BitcoinProvider wrappers had zero coverage. A regression in address type detection, WIF conversion, P2TR rejection, or the missing-key error branch would have passed silently. New tests exercise the provider methods end-to-end: - P2WPKH / P2PKH / P2SH-P2WPKH sign + verify roundtrips - P2TR rejection on both sign and verify (BIP-137 doesn't support it) - Unknown address type rejection - Missing WIF falls through to empty-string return - Regtest WIF conversion path - Malformed signature is caught and returns False, never raises - Regtest address conversion path in verify Addresses review finding M1.

Three kinds of fixes: 1. Delete TestEdgeCases::test_rate_precision_constant, test_btc_sat_constant, test_tao_rao_constant — they assert RATE_PRECISION == 10**18 etc., which only fails if you intentionally change the constant. They test nothing about calculate_to_amount's behavior. 2. TestEdgeCases::test_negative_rate_string → test_negative_rate_ produces_negative_amount. The old assertion was `isinstance(result, int)` which passes for any integer, including wildly wrong ones. The new assertion locks in the actual behavior (negative input produces negated output) and documents that the actual guard lives upstream at the contract post. 3. TestFeeDeduction::test_fee_plus_user_equals_total had a weak inequality: `fee + user <= tao_amount`. Since `apply_fee_deduction(to_amount, d) = to_amount - to_amount // d`, the invariant is exactly `fee + user == tao_amount`. Tightened to `==` and two new tests cover the integer floor-division edge cases (99, 100, 101 rao through the fee function) that the old weak assertion couldn't have caught. Addresses review finding M3.

New tests/test_fulfillment.py covers the parts of SwapFulfiller the refactor branch changed: - load_timeout_cushion_blocks(): default, empty string, valid int, zero allowed, negative clamped to 0 (sign-flip safety), invalid string falls back - verify_swap_safety() cushion behavior including hot-reload mid-run (R5 feature) — the test patches env between two calls on the same fulfiller instance and asserts the second call sees the new value without reconstruction - Zero cushion allows right up to timeout - Missing rate or miner_from_address fails safety - Returns the post-fee user_receives_amount, not the pre-fee to_amount — locks in the R5 naming and math Also: test_poll_commitments.make_validator now includes a MagicMock event_watcher defensively. poll_commitments doesn't touch it today but a helper that does later should AttributeError loud, not silently pass against a missing attribute. Addresses review findings C8 (cushion hot-reload test) and M4 (poll_commitments fixture) along with the user_receives_amount rename verification.

Three unrelated cleanups bundled: 1. test_chains.py — delete TestChainProperties. The four tests asserted CHAIN_BTC.decimals == 8 etc., which is the same as asserting the constant equals itself. They only fail if someone changes the constant, at which point these tests fail first and obscure any downstream breakage from the same change. 2. test_scale.py::make_client — add a docstring explaining why we use __new__ to skip __init__ (the real constructor opens a subtensor connection we don't need for encoder-only tests). 3. test_event_watcher.py::make_watcher — use Alice's SS58 address as the test contract_address instead of '5contract'. The fake string bypasses any future validation the watcher might add on the contract address; a real SS58 is forward-compatible. Addresses review findings m2, m3, m4.

LandynDev added 30 commits April 14, 2026 17:59

Widen TAO block hint search to the hinted block plus or minus 3

a14d969

Block hint came from the caller's view of the chain, which can race against finality by a block or two. The previous plus or minus 2 window occasionally missed transfers that landed one block later than the hint.

LandynDev added 5 commits April 14, 2026 21:15

entrius approved these changes Apr 15, 2026

View reviewed changes

entrius merged commit 32c1cf8 into test Apr 15, 2026
2 checks passed

entrius deleted the refactor/post-v1-cleanup branch April 15, 2026 17:02

This was referenced Apr 17, 2026

Persist scoring window and voted swaps across validator restarts #17

Closed

refactor: consolidate duplicate block-time and fee divisor constants #51

Closed

skip extend reservation vote when reservation already expired #36

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor/post v1 cleanup#46

Refactor/post v1 cleanup#46
entrius merged 35 commits intotestfrom
refactor/post-v1-cleanup

LandynDev commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

LandynDev commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants