feat: settings migrations framework#255
Conversation
Adds a forward-migration runner for the JSON settings file. Files carry a top-level schema_version; on load the runner replays any missing migrations in order, atomically rewrites the file, and keeps three rotating .bak backups. Higher-than-known versions are rejected so a downgraded build cannot drop fields silently. Seeds the registry with v0->v1 (stamp the version) and v1->v2 (consolidate api.token into api.api_key) plus golden fixtures and an authoring guide at docs/config-migrations.mdx.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a1200f7011
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub fn load_and_migrate(path: &Path) -> Result<(serde_json::Value, MigrationOutcome)> { | ||
| load_and_migrate_with(path, ®istered_migrations()) |
There was a problem hiding this comment.
Invoke migration runner from the config load path
The new migration entrypoint is never called by production config loading, so migrations and downgrade checks are effectively dead code. Config::load_inner still reads only TOML layers (config.toml / .agent/settings.toml) and does not route any settings file through this API, which means older schemas will not be upgraded at runtime despite the new framework and docs claiming load-time migration.
Useful? React with 👍 / 👎.
| let backup_path = rotate_backups_and_archive(path, &raw)?; | ||
| atomic_write_json(path, &value)?; |
There was a problem hiding this comment.
Defer backup rotation until after successful temp write
load_and_migrate_with rotates and rewrites .bak slots before attempting atomic_write_json. If serialization, temp-file write, fsync, or rename fails, the function returns an error with the live file unchanged but backups already mutated, which can drop the oldest retained recovery point during a failed migration attempt.
Useful? React with 👍 / 👎.
…ation Addresses two codex review findings on the settings-migrations framework: P1 — the migration runner was dead code. `Config::load_inner` read the TOML layers directly and never invoked the JSON-based runner, so migrations, backups, and downgrade checks never ran in production. Settings on disk are TOML; migrations are pure functions over serde_json::Value. The cleanest bridge is to convert at the file edge: add `load_and_migrate_toml(path)` that parses TOML, converts via toml::Value <-> serde_json::Value, drives the existing migration chain, and rewrites the file as TOML. `Config::load_inner` now routes every file layer (user, project, project-local) through this entry point before the existing TOML merge stage. The JSON conversion is lossy for TOML datetimes (RFC 3339 string round-trip); no shipping schema field uses one and the limitation is documented. P2 — backup rotation ran before the new file write. If serialization, the temp-file open, fsync, or rename failed, the .bak.N chain had already been mutated, potentially clobbering a backup the user depended on. Reorder so the temp-file write + atomic rename complete first; only after the live file is in place do we rotate the chain and archive the pre-migration bytes (snapshotted in memory before the write) into .bak.1. New integration tests: - TOML round-trip migration (v0 -> current) via the public load_and_migrate_toml entry point. - Production wiring: Config::load against a v0 .agent/settings.toml in a tempdir confirms the file is rewritten with schema_version stamped and api.token consolidates into api.api_key. - Atomic-write failure (read-only parent dir) leaves the prior .bak.1 untouched and never creates .bak.2 — proves rotation runs strictly after a successful write. Drive-by: localized #[allow(clippy::wrong_self_convention)] on the Migration trait's from_version accessor with a comment explaining the lint misfires on getter pairs like from_version/to_version.
|
Addressed codex review findings: wired migrations into Config::load via a new load_and_migrate_toml that crosses TOML<->JSON at the file edge (datetime conversion is lossy and documented; no shipping field uses one), and reordered backup rotation to run strictly after the atomic temp-write+rename so a failed write can't mutate the .bak.N chain. New tests cover the TOML round-trip, the production Config::load path against a planted v0 .agent/settings.toml, and a read-only-dir failure that leaves the prior backup untouched. |
…profile migrations Second-round codex findings on the settings-migrations PR: - atomic_write_bytes now writes to a randomly-named temp file via tempfile::NamedTempFile::new_in, so a pre-planted symlink at the deterministic <path>.tmp slot can no longer be truncated. - Capture the live file's permission mode before rewriting and apply it to both the rewritten file and the .bak.1 archive (defaulting to 0600 when the file does not exist yet, since these files contain API keys). Windows preserves the read-only flag only — POSIX modes do not map cleanly. Backups go through the same atomic-write helper so they inherit the same mode-preservation and platform handling. - Use NamedTempFile::persist for the rename, which on Windows uses MoveFileExW with MOVEFILE_REPLACE_EXISTING — std::fs::rename refuses to overwrite on Windows, so every migration of an existing config would have failed there before. - After a successful POSIX rename, fsync the parent directory so the rename's directory entry survives a crash. Skipped on Windows (no equivalent and the ReplaceFile path already provides durability). - Route Profile::load through migrations::load_and_migrate_toml so old profile snapshots get the same schema upgrade path as the main settings file. New unit test covers the v0 → v2 upgrade end-to-end. - Add #[cfg(unix)] tests for 0600/0644 mode preservation and for the pre-planted-symlink truncation case. - Document comment-loss in docs/config-migrations.mdx alongside the existing TOML-datetime caveat. Promotes tempfile from a dev-dep to a regular dep on agent-code-lib. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Addressed second-round codex findings: unguessable temp paths via tempfile::NamedTempFile (kills the symlink-truncation TOCTOU at |
Summary
crates/lib/src/config/migrations/with aMigrationtrait, an ordered registry, and a runner that readsschema_versionfrom the settings JSON, replays every missing migration in order, and writes the upgraded file back atomically..bak.Nbackups (3 slots, oldest dropped on each rewrite). A current-version file is a no-op fast path: no rewrite, no backup, no mtime change. A higher-than-known version is rejected with a clear error rather than dropping fields silently.v0 → v1stamps the version onto pre-versioned files, andv1 → v2consolidates a legacyapi.tokenfield intoapi.api_key.crates/lib/tests/fixtures/config_migrations/and 10 integration tests covering migration, no-op, backup rotation, parse-failure rollback, migration-failure rollback, and downgrade rejection.docs/config-migrations.mdx.Test plan
cargo build --workspacesucceeds.cargo test -p agent-code-lib --lib— all 793 unit tests green, including 17 new migration tests.cargo test --test config_migrations— all 10 integration tests green.load_and_migrateagainst a v0 fixture rewrites the file once, leaves a single.bak.1, and matches the v0→current golden..bak.Nslots, with the oldest pre-migration body dropped.