Skip to content

backfill config loader + validation pipeline (#684)#701

Closed
karthikiyer56 wants to merge 5 commits intokarthik/backfill-impl/scaffold-subcommandfrom
karthik/backfill-impl/config-validate
Closed

backfill config loader + validation pipeline (#684)#701
karthikiyer56 wants to merge 5 commits intokarthik/backfill-impl/scaffold-subcommandfrom
karthik/backfill-impl/config-validate

Conversation

@karthikiyer56
Copy link
Copy Markdown
Contributor

@karthikiyer56 karthikiyer56 commented Apr 19, 2026

Important

Base is temporarily karthik/backfill-impl/scaffold-subcommand (head of #699), not feature/full-history. This branch stacks on #699; I'll re-target the base after #699 merges.

Closes #684.

Summary

  • Port pkg/geometry from reference commit 19a9179 — Index-only API; drops the old Range* nomenclature per the current design doc.
  • Add backfill/config/ with the UPPER_SNAKE_CASE TOML schema, plus ParseConfig + MarshalTOML round-trip.
  • Validate — required fields + path defaults under DEFAULT_DATA_DIR.
  • ApplyFlags — widens --start-ledger / --end-ledger outward to chunk boundaries; resolves the 0 = GOMAXPROCS sentinel on --workers; merges --log-level / --log-format overrides.
  • ValidateFlags + ValidateAgainstStore (CHUNKS_PER_TXHASH_INDEX immutability) + ValidateAgainstBSB (availability probe).
  • Store and BSBAvailabilityProbe interfaces with in-memory / no-op implementations. Real implementations replace these constructor calls as part of #689 and #688.
  • Fill in the subcommand body scaffolded by #699: read TOML → run the pipeline → print summary. Adds --log-level / --log-format flags.

Unit tests

  • pkg/geometry
    • Chunk + index boundary math (ChunkFirstLedger, ChunkLastLedger, IndexFirstChunk, IndexLastChunk, etc.).
    • Ledger → chunk and chunk → index round-trips.
    • Index contiguity across 5 consecutive indexes.
  • config/config_test.go
    • Full-schema TOML parse (every section + nested [BACKFILL.BSB]).
    • Default path + numeric resolution under DEFAULT_DATA_DIR.
    • MarshalTOML round-trip.
  • config/flags_test.go
    • Chunk-boundary widening — 4 cases (both mid-chunk, both aligned, start=2 + end mid-chunk, start mid-chunk + end aligned).
    • workers=0GOMAXPROCS; max-retries passes through literally (including 0).
    • Logging override merge.
    • BuildGeometry threads CHUNKS_PER_TXHASH_INDEX.
  • config/validate_test.go
    • Required-field rejections (missing DEFAULT_DATA_DIR, missing [BACKFILL.BSB], empty BUCKET_PATH).
    • ValidateFlags — start < 2, end ≤ start, workers < 0, max-retries < 0; workers=0 sentinel passes.
    • ValidateAgainstStore — absent / match / mismatch branches via mock Store; precondition error when Validate() was skipped.
    • ValidateAgainstBSB — available / insufficient / probe-error branches via mock probe.
  • backfill/cmd_test.go
    • Flag registration (all 8 flags present).
    • Golden-path summary (valid config exits 0, summary contains resolved fields).
    • 4 failure paths surfaced through cmd.Execute (start < 2, end ≤ start, missing DEFAULT_DATA_DIR, missing [BACKFILL.BSB]).
    • Missing --config file.

Ports pkg/geometry verbatim from reference commit 19a9179 (chunk/range
boundary math) and fills in the full-history-backfill subcommand body:

- Config package under cmd/stellar-rpc/internal/fullhistory/backfill/config/
  with UPPER_SNAKE_CASE TOML schema ([SERVICE], [BACKFILL], [BACKFILL.BSB],
  [IMMUTABLE_STORAGE.*], [LOGGING], [META_STORE]).
- ParseConfig / MarshalTOML round-trip via pelletier/go-toml v1.
- Validate (TOML struct-level rules + path-default resolution).
- CLIFlags + ApplyFlags (chunk-boundary widening via pkg/geometry,
  workers/max-retries/logging defaults + overrides).
- ValidateFlags (start/end/workers/max-retries sanity).
- ValidateAgainstConfigStore (CHUNKS_PER_TXHASH_INDEX immutability via
  a narrow ConfigStore interface; real RocksDB-backed impl lands in #689).
- ValidateAgainstBSB (availability probe via a narrow
  BSBAvailabilityProbe interface; real impl lands in #688).
- Subcommand runBackfill: load TOML -> Validate -> ApplyFlags ->
  ValidateFlags -> ValidateAgainstConfigStore(stub) ->
  ValidateAgainstBSB(stub) -> printSummary. Adds --log-level and
  --log-format flags alongside the six registered in #699.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the full-history backfill subcommand’s config loading + pre-DAG validation pipeline, including a new TOML-backed config schema and a ported geometry helper package used for chunk/range boundary math.

Changes:

  • Added cmd/stellar-rpc/internal/fullhistory/backfill/config/ with TOML parse/marshal, validation, CLI flag merge (including chunk-boundary expansion), and validation against stubbed ConfigStore/BSB probe interfaces.
  • Implemented full-history-backfill subcommand flow: read config → validate → apply flags → validate flags → validate against config-store stub → validate against BSB stub → print summary.
  • Ported pkg/geometry (plus tests) to support chunk/range/index boundary calculations.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
cmd/stellar-rpc/internal/fullhistory/pkg/geometry/geometry.go Adds geometry constants and parameterized boundary math helpers.
cmd/stellar-rpc/internal/fullhistory/pkg/geometry/geometry_test.go Unit tests for range/chunk/index math.
cmd/stellar-rpc/internal/fullhistory/backfill/config/config.go Defines TOML schema + ParseConfig/MarshalTOML.
cmd/stellar-rpc/internal/fullhistory/backfill/config/config_test.go Tests schema decode, defaults, and TOML round-trip.
cmd/stellar-rpc/internal/fullhistory/backfill/config/flags.go CLIFlags + ApplyFlags and BuildGeometry wiring.
cmd/stellar-rpc/internal/fullhistory/backfill/config/flags_test.go Tests chunk-boundary widening and flag-derived defaults.
cmd/stellar-rpc/internal/fullhistory/backfill/config/validate.go Config validation, flag validation, config-store immutability, and BSB availability checks.
cmd/stellar-rpc/internal/fullhistory/backfill/config/validate_test.go Tests validation rules (pass/fail) and interface-based validators.
cmd/stellar-rpc/internal/fullhistory/backfill/config/interfaces.go Defines ConfigStore/BSBAvailabilityProbe + in-memory stubs.
cmd/stellar-rpc/internal/fullhistory/backfill/cmd.go Wires the end-to-end subcommand pipeline and prints a resolved-config summary.
cmd/stellar-rpc/internal/fullhistory/backfill/cmd_test.go Subcommand-level tests for flag registration, success path summary, and failure cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cmd/stellar-rpc/internal/fullhistory/pkg/geometry/geometry.go Outdated
Comment thread cmd/stellar-rpc/internal/fullhistory/pkg/geometry/geometry.go Outdated
Comment thread cmd/stellar-rpc/internal/fullhistory/pkg/geometry/geometry.go Outdated
Comment thread cmd/stellar-rpc/internal/fullhistory/backfill/config/validate.go Outdated
Comment thread cmd/stellar-rpc/internal/fullhistory/backfill/config/config.go Outdated
Comment thread cmd/stellar-rpc/internal/fullhistory/backfill/cmd_test.go Outdated
- Rename Range-era nomenclature in pkg/geometry to Index-only per the
  current design doc. Drop the Range* constants + methods; keep Chunk*
  and Index* APIs. Add LedgerToChunkID so flags.go stops inlining the
  formula.
- Rename ConfigStore -> Store (revive: stuttering package.Type).
- Rename Stub* -> InMemory* / Nop* to describe what the impl IS rather
  than what slice it anticipates. Forward-looking comments explain why
  the in-memory Store and no-op BSB probe are the right shape until
  #689 and #688 land their RocksDB- and GCS-backed impls.
- Trim comments: keep business-logic and stub-context explanations;
  drop section dividers, restatement of code, and verbose godoc.
- Treat --workers=0 as valid (the documented GOMAXPROCS sentinel
  matches cobra's flag default). Previous ValidateFlags rejected 0
  as < 1, which contradicted the help text.
- Fix META_STORE default-path comment casing (lowercase, matching
  filepath.Join literal).
- TestNewCmd_MissingConfigFile now uses a t.TempDir()-derived path to
  remove flakiness against a pre-existing /tmp file.
- golangci-lint v2.11.3 clean: errors.New over fmt.Errorf for plain
  strings (perfsprint), range-over-int loops (intrange), long-line
  split (lll), remove duplicate package docstrings (godoclint).
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cmd/stellar-rpc/internal/fullhistory/backfill/cmd.go
Comment thread cmd/stellar-rpc/internal/fullhistory/backfill/config/validate.go
- ApplyFlags no longer silently rewrites --max-retries=0 to 3. cobra
  already defaults the flag to 3, so omission works naturally; an
  explicit --max-retries=0 is a legitimate "no retries, fail fast"
  request and must not be clobbered. Drop the defaultMaxRetries
  constant and the sentinel branch.
- ValidateAgainstStore now enforces its Validate()-ran-first
  precondition explicitly. A caller who skips Validate would otherwise
  have ChunksPerTxHashIndex == 0 and seed "0" into the store,
  permanently locking the layout to that bogus value.
package geometry

// FirstLedger is the first ledger of the Stellar network.
const FirstLedger uint32 = 2
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two concepts to internalize before reading this file — Chunk (fixed 10,000 ledgers, the atomic unit of ingestion + crash recovery) and Index (ChunksPerTxHashIndex chunks; the cadence at which one RecSplit txhash index gets built). All backfill math in this package is expressible through those two terms. FirstLedger=2 because Stellar ledger sequences start there — every boundary formula offsets by that anchor.

// non-empty values.
//
// Precondition: Validate has run, so ChunksPerTxHashIndex is populated.
func (c *Config) ApplyFlags(flags CLIFlags) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chunk-boundary expansion is outward only: start widens DOWN to the first ledger of its chunk, end widens UP to the last. Never narrows. This lets operators pass any range (e.g. --start-ledger 5_000_000 --end-ledger 56_337_842) without hand-computing chunk alignment — we always cover at least the requested range, possibly slightly more at each end. The >= FirstLedger guards are there to prevent a uint32 underflow if a bogus value reaches ApplyFlags before ValidateFlags runs; ValidateFlags is what finally rejects it.

// value of CHUNKS_PER_TXHASH_INDEX is seeded. Exported so the real
// metastore facade (#689) uses the exact same literal — the whole point
// of the immutability check is defeated if two sites disagree.
const ChunksPerTxHashIndexMetaKey = "config:chunks_per_txhash_index"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This key is exported on purpose. Once #689 lands the RocksDB-backed metastore facade, that code will read/write the exact same key. If two sites ever disagree on the literal string, the whole immutability check is defeated — a second run would see an 'absent' key and happily seed a new value next to the old one. Keeping the literal in exactly one place is the point.

// We enforce that explicitly here instead of trusting the caller — a
// stray first-run invocation without Validate would otherwise seed "0"
// into the store and permanently lock the layout to that value.
func (c *Config) ValidateAgainstStore(store Store) error {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three branches to verify below:

  1. Absent key → seed the current value and pass (first run).
  2. Stored == current → pass (subsequent run, layout unchanged).
  3. Stored != current → hard abort, with both values named in the error.

The precondition check at the top protects against a caller who forgot to run Validate() first. Without it we would silently seed "0" into the store on an un-initialized Config and permanently lock the on-disk layout to that bogus value.

// check runs against. The interface is deliberately narrow (Get + Put
// only). This slice ships an in-memory implementation; a real
// RocksDB-backed one will land as part of #689.
type Store interface {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both Store (Get + Put) and BSBAvailabilityProbe (MaxAvailableLedger) are deliberately narrow. This slice ships in-memory / no-op implementations so the subcommand can run the full pipeline end-to-end today. When the real RocksDB-backed store lands in #689 and the real GCS-backed probe in #688, those implementations drop in at the two cmd.go call sites — nothing else in this package has to change.

// validate flags → immutability check → BSB availability → summary.
// Each step wraps its error so operators see the specific rule that
// failed.
func runBackfill(cmd *cobra.Command, _ []string) error {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pipeline order is load-bearing, not cosmetic:

  1. Validate fills defaults (including CHUNKS_PER_TXHASH_INDEX=1000 when absent).
  2. ApplyFlags depends on those defaults — BuildGeometry reads ChunksPerTxHashIndex to widen the range.
  3. ValidateFlags sanity-checks the raw CLI numerics.
  4. ValidateAgainstStore locks the layout-defining CHUNKS_PER_TXHASH_INDEX value — requires Validate to have run (enforced as a precondition in validate.go).
  5. ValidateAgainstBSB checks the widened end ledger against BSB — requires ApplyFlags to have run.

Re-ordering any step breaks a downstream consumer's precondition.

Keep require only where a failure would genuinely invalidate the
subsequent check (Execute before reading out.String(), error
non-nil before reading err.Error()). All independent equality /
containment checks switched to assert so a single failing run
surfaces every problem, not just the first.
@karthikiyer56
Copy link
Copy Markdown
Contributor Author

Closing this PR (and its parent issue #684) as part of the rewrite of PRD #678.

Why

TOML config parsing is no longer a backfill concern. Under the new framing in PRD #678, the unified stellar-rpc service's func main parses the TOML config; backfill consumes an already-parsed Go Config struct. The 8-rule validation framework collapses:

  • [BACKFILL] and [BACKFILL.BSB] are obsolete — design doc has flat [BSB] at the unified-service level.
  • There are no per-run CLI flags. --start-ledger, --end-ledger, --workers, --max-retries, --verify-recsplit all gone.
  • No BSB-availability probe — design doc explicitly says "no source probe; source-coverage problems surface at runtime as task failures".
  • Chunk-boundary widening is owned by Phase 1 (catchup) in streaming-side code, not by per-call validate.
  • CHUNKS_PER_TXHASH_INDEX is being renamed to LEDGERS_PER_TX_INDEX (cleaner — domain-natural unit).
  • Service-startup validate_config (immutability check, etc.) is owned by streaming-side code.
  • Per-call validate(range_start, range_end) inside run_backfill is two asserts inline (bounds + ordering) — no slice needed.

What survives, and where it lands

The pkg/geometry port from commit 19a9179 is good work and still wanted. Land it as a tiny standalone PR if you'd like; it'll be consumed by the first slice that needs ledger/chunk/index arithmetic.

Closing without merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Backfill 3: Config + validation (TOML schema, CLI merge, all 8 validation rules, chunk-boundary expansion)

2 participants