feat(tx): lower default gas_adjustment to 1.3, make tuning configurable, retry on OOG#283
Conversation
…OG retry with multiplicative escalation
Supernode was using gas_adjustment=1.5 + 50k padding for every outbound
tx (including MsgFinalizeAction), inflating fee spend. Default lowered
to 1.3 with an escalating OOG-retry path as a safety net, and all
gas/fee tuning knobs are now driven by optional YAML config so operators
can tune without a code change.
Changes
-------
pkg/lumera/modules/tx:
- DefaultGasAdjustment 1.5 -> 1.3.
- Add DefaultGasAdjustmentMultiplier (1.3) and
DefaultGasAdjustmentMaxAttempts (3, hard-capped at 10).
- Extend TxConfig / TxHelperConfig with GasAdjustmentMultiplier and
GasAdjustmentMaxAttempts.
- Add applyTxHelperDefaults so zero-valued TxHelperConfig fields fall
back to package defaults deterministically.
- isOutOfGas detector (matches both "out of gas" and
code=11 codespace=sdk patterns).
- executeWithOOGRetry: cloned TxConfig per attempt (no mutation of
caller state); multiplicative escalation
(gas_adjustment *= multiplier); structured Datadog-faceted log
lines with metric="tx_oog_retry" + outcome=(retrying|success|exhausted).
- Wire the OOG retry loop into TxHelper.ExecuteTransaction so all
msg types (action_msg, supernode_msg, audit_msg) benefit.
pkg/lumera (client config):
- NewConfig gains variadic TxOptions for source-level backward compat.
- Config.toTxHelperConfig materializes a normalized TxHelperConfig.
pkg/lumera/modules/{action_msg,supernode_msg,audit_msg}:
- New NewModuleWithTxHelperConfig constructor (explicit TxHelperConfig).
- Legacy NewModule(...) preserved for backward compatibility; now
delegates to the helper-config constructor with defaults.
supernode/config:
- LumeraClientConfig: add optional YAML fields
gas_adjustment, gas_adjustment_multiplier,
gas_adjustment_max_attempts, gas_padding, gas_price, fee_denom.
- Omitted / zero values preserve the default behavior.
supernode/cmd/start.go:
- initLumeraClient now propagates the new TxOptions from YAML into
lumera.NewConfig.
Tests
-----
pkg/lumera/modules/tx:
- DefaultGasAdjustment == 1.3.
- applyTxHelperDefaults: zero-valued fields use defaults, operator
values preserved, max-attempts cap honored.
- isOutOfGas: simulation and broadcast OOG patterns matched; unrelated
codes (e.g. sequence mismatch, code=11 wrong codespace) ignored.
- executeWithOOGRetry:
* success on first attempt
* escalates multiplicatively and succeeds on 3rd attempt
(1.3 -> 1.69 -> 2.197)
* non-OOG errors bail immediately (single attempt)
* max-attempts honored (stops at N, surfaces "out of gas after N attempts")
* clone isolates caller TxConfig from mutation
* nil base cfg returns error
pkg/lumera:
- NewConfig rejects invalid inputs and propagates TxOptions through
toTxHelperConfig().
supernode/config:
- LoadConfig round-trips all six new YAML keys.
- Omitted YAML keys yield zero values (defaults applied at tx layer).
Risk / rollout
--------------
Client-side only. No chain change, no migration, no upgrade handler.
Operators can override gas_adjustment at runtime via YAML if the new
default ever proves too tight; OOG retry is a safety net in the
meantime.
Refs: https://www.notion.so/pastelnetwork/SN-Gas-used-for-Finalize-tx-is-too-high-345df11fee148011bac1fe1268c99326
Both items from the previous review are now resolved in e324c6a. No new issues found.
Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues. |
Simulates the production scenario:
1.3 is rejected by the chain (code=11 codespace=sdk, out of gas)
→ retry at 1.69 → retry at 2.197 → success
Three new integration tests using a scripted tx.Module stub:
- OOG×2 → success on 3rd attempt; verifies exact
1.3 → 1.69 → 2.197 sequence; sequence advances exactly once;
base config not mutated.
- All attempts OOG → clear error, no sequence hole.
- Non-OOG error bails at first attempt, no sequence advance.
| resp, err := executeWithOOGRetry( | ||
| ctx, | ||
| h.config, | ||
| func(cfg *TxConfig) (*sdktx.BroadcastTxResponse, error) { | ||
| return h.txmod.ProcessTransaction(ctx, []types.Msg{msg}, accountInfo, cfg) | ||
| }, | ||
| ) |
There was a problem hiding this comment.
The OOG retry wrapper is wired into ExecuteTransaction here, but ExecuteTransactionWithMsgs (line 366) still calls ProcessTransaction directly without it. Any caller using the pre-built-messages path won't get the escalating gas adjustment on OOG failures. Currently there are no callers of ExecuteTransactionWithMsgs in the repo, so it's not a live bug, but it's an inconsistency in the public API that could bite external pkg/lumera consumers. Consider wrapping that call in executeWithOOGRetry as well for parity.
Fix it with Roo Code or mention @roomote and request a fix.
There was a problem hiding this comment.
Fixed in e324c6a. Wrapped ExecuteTransactionWithMsgs in executeWithOOGRetry for public-API parity with ExecuteTransaction. Sequence-mismatch retry intentionally not added here because the caller owns accountInfo (documented in the doc comment). New tests: TestExecuteTransactionWithMsgs_AppliesOOGRetry + TestExecuteTransactionWithMsgs_NonOOGBailsImmediately.
| if config.GasAdjustmentMaxAttempts > 0 { | ||
| h.config.GasAdjustmentMaxAttempts = config.GasAdjustmentMaxAttempts | ||
| } |
There was a problem hiding this comment.
UpdateConfig accepts any positive GasAdjustmentMaxAttempts without enforcing the hard cap of 10 that applyTxHelperDefaults applies. If an operator (or a runtime code path) calls UpdateConfig with a value like 50, the safety net against fee runaway is bypassed. Consider adding the same cap here.
| if config.GasAdjustmentMaxAttempts > 0 { | |
| h.config.GasAdjustmentMaxAttempts = config.GasAdjustmentMaxAttempts | |
| } | |
| if config.GasAdjustmentMaxAttempts > 0 { | |
| if config.GasAdjustmentMaxAttempts > 10 { | |
| config.GasAdjustmentMaxAttempts = 10 | |
| } | |
| h.config.GasAdjustmentMaxAttempts = config.GasAdjustmentMaxAttempts | |
| } |
Fix it with Roo Code or mention @roomote and request a fix.
There was a problem hiding this comment.
Fixed in e324c6a. Extracted the hard cap into MaxGasAdjustmentAttemptsCap = 10 as a single source of truth, and applied it in UpdateConfig mirroring applyTxHelperDefaults. New tests: TestTxHelper_UpdateConfig_CapsMaxAttempts (9999 → 10) and TestTxHelper_UpdateConfig_PreservesReasonableMaxAttempts (5 → 5).
…p ExecuteTransactionWithMsgs Two consistency gaps surfaced in code review: 1. UpdateConfig accepted any positive GasAdjustmentMaxAttempts, bypassing the 10-attempt fee-runaway safety cap enforced by applyTxHelperDefaults. A runtime-reconfig caller (SetTxHelperConfig → UpdateConfig) could set MaxAttempts=9999 and defeat the protection. Now mirrors the defaults cap via the new MaxGasAdjustmentAttemptsCap constant. 2. ExecuteTransactionWithMsgs called ProcessTransaction directly, skipping the OOG retry wrapper that ExecuteTransaction uses. Public-API split- brain — external pkg/lumera SDK consumers using the pre-built-messages path would not get OOG escalation. Now wraps in executeWithOOGRetry. Sequence-mismatch retry is intentionally NOT added here because the caller owns accountInfo (documented in doc comment). Extracted hard cap (10) into named constant MaxGasAdjustmentAttemptsCap for clarity and a single source of truth. Tests: - TestTxHelper_UpdateConfig_CapsMaxAttempts (9999 → 10) - TestTxHelper_UpdateConfig_PreservesReasonableMaxAttempts (5 → 5) - TestExecuteTransactionWithMsgs_AppliesOOGRetry (3 attempts, escalating) - TestExecuteTransactionWithMsgs_NonOOGBailsImmediately (non-OOG bubbles) go test ./pkg/... + ./supernode/config/... green. gofmt + go vet clean.
Summary
Supernode was using
gas_adjustment=1.5+50kpadding for every outbound tx (includingMsgFinalizeAction), inflating fee spend. This PR:GasAdjustmentfrom1.5→1.3.Client-side only. No chain change. No migration. No upgrade handler.
Ref: Notion — SN Gas used for Finalize tx is too high
Behavior change
Defaults
GasAdjustment1.51.3GasAdjustmentMultiplier1.3GasAdjustmentMaxAttempts3(cap 10)GasPadding5000050000GasPrice0.0250.025FeeDenomulumeulumeNew optional YAML (
supernode.ymlunderlumera:):Omitted / zero-valued fields fall back to package defaults deterministically.
OOG retry
"out of gas"simulation errors andcode=11 codespace=sdkbroadcast errors.GasAdjustment *= Multiplieron each retry. Default sequence:1.3 → 1.69 → 2.197.TxConfig.Observability
Every retry emits a structured
logtraceline that's facet-able in Datadog:Files changed
pkg/lumera/modules/tx/impl.go,helper.go,interface.go— defaults,isOutOfGas,executeWithOOGRetry,applyTxHelperDefaults,TxConfig/TxHelperConfigextensions, retry wiring inExecuteTransaction.pkg/lumera/config.go,client.go—TxOptionsvariadic onNewConfig;Config.toTxHelperConfig(); all three msg modules receive explicit helper config.pkg/lumera/modules/{action_msg,supernode_msg,audit_msg}— newNewModuleWithTxHelperConfigconstructors; legacyNewModulepreserved for backward compat.supernode/config/config.go— six new optionalLumeraClientConfigYAML fields.supernode/cmd/start.go— propagates YAML →lumera.NewConfig.Tests
16 new tests, all passing:
pkg/lumera/modules/tx/helper_test.go:1.3.applyTxHelperDefaults— zero → defaults, operator values preserved, max-attempts cap honored.isOutOfGas— 6 subcases: nil / simulation string /code=11 codespace=sdk/ raw_log-only / sequence-mismatch (false) / wrong codespace (false).executeWithOOGRetry:1.3 → 1.69 → 2.197sequence)TxConfigfrom mutationpkg/lumera/config_test.go:NewConfigrejects invalid inputs (4 subcases).TxOptionspropagate throughtoTxHelperConfig().supernode/config/config_gas_test.go:Verification:
go build ./pkg/... ./supernode/...clean.go vet ./pkg/lumera/... ./supernode/config/... ./supernode/cmd/...clean.gofmt -lclean.go testsweep overpkg/...,supernode/...,sdk/...— 0 failures.Risks
1.3 × simGas + 50kis industry-standard for Cosmos SDK; retry loop covers pathological edges. Max-attempts cap at 10 prevents runaway fee spend.TxHelperalready serializes viasync.Mutex; per-attemptTxConfigis cloned.pkg/lumeraconsumers. Mitigated via variadicTxOptionsonNewConfigand legacyNewModuleconstructors preserved.Rollout / Rollback
gas_adjustment: 1.5in YAML if required.Follow-ups (not in this PR)
lumera-tools-internalto document the new keys in sample launch-bundle configs.metric:tx_oog_retrypost-deploy.