Skip to content

feat: settings migrations framework#255

Merged
emal-avala merged 3 commits into
mainfrom
feat/settings-migrations
May 1, 2026
Merged

feat: settings migrations framework#255
emal-avala merged 3 commits into
mainfrom
feat/settings-migrations

Conversation

@emal-avala
Copy link
Copy Markdown
Member

Summary

  • Adds crates/lib/src/config/migrations/ with a Migration trait, an ordered registry, and a runner that reads schema_version from the settings JSON, replays every missing migration in order, and writes the upgraded file back atomically.
  • Implements rotating .bak.N backups (3 slots, oldest dropped on each rewrite). A current-version file is a no-op fast path: no rewrite, no backup, no mtime change. A higher-than-known version is rejected with a clear error rather than dropping fields silently.
  • Seeds two example migrations to demonstrate the pattern: v0 → v1 stamps the version onto pre-versioned files, and v1 → v2 consolidates a legacy api.token field into api.api_key.
  • Ships golden fixtures under crates/lib/tests/fixtures/config_migrations/ and 10 integration tests covering migration, no-op, backup rotation, parse-failure rollback, migration-failure rollback, and downgrade rejection.
  • Adds an authoring guide at docs/config-migrations.mdx.

Test plan

  • cargo build --workspace succeeds.
  • cargo test -p agent-code-lib --lib — all 793 unit tests green, including 17 new migration tests.
  • cargo test --test config_migrations — all 10 integration tests green.
  • No references to other AI products or assistants in code, comments, or docs.
  • Running load_and_migrate against a v0 fixture rewrites the file once, leaves a single .bak.1, and matches the v0→current golden.
  • Four sequential migrations on the same file leave exactly three .bak.N slots, with the oldest pre-migration body dropped.

Adds a forward-migration runner for the JSON settings file. Files
carry a top-level schema_version; on load the runner replays any
missing migrations in order, atomically rewrites the file, and
keeps three rotating .bak backups. Higher-than-known versions are
rejected so a downgraded build cannot drop fields silently.

Seeds the registry with v0->v1 (stamp the version) and v1->v2
(consolidate api.token into api.api_key) plus golden fixtures and
an authoring guide at docs/config-migrations.mdx.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a1200f7011

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +239 to +240
pub fn load_and_migrate(path: &Path) -> Result<(serde_json::Value, MigrationOutcome)> {
load_and_migrate_with(path, &registered_migrations())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Invoke migration runner from the config load path

The new migration entrypoint is never called by production config loading, so migrations and downgrade checks are effectively dead code. Config::load_inner still reads only TOML layers (config.toml / .agent/settings.toml) and does not route any settings file through this API, which means older schemas will not be upgraded at runtime despite the new framework and docs claiming load-time migration.

Useful? React with 👍 / 👎.

Comment thread crates/lib/src/config/migrations/mod.rs Outdated
Comment on lines +267 to +268
let backup_path = rotate_backups_and_archive(path, &raw)?;
atomic_write_json(path, &value)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer backup rotation until after successful temp write

load_and_migrate_with rotates and rewrites .bak slots before attempting atomic_write_json. If serialization, temp-file write, fsync, or rename fails, the function returns an error with the live file unchanged but backups already mutated, which can drop the oldest retained recovery point during a failed migration attempt.

Useful? React with 👍 / 👎.

…ation

Addresses two codex review findings on the settings-migrations framework:

P1 — the migration runner was dead code. `Config::load_inner` read the
TOML layers directly and never invoked the JSON-based runner, so
migrations, backups, and downgrade checks never ran in production.

Settings on disk are TOML; migrations are pure functions over
serde_json::Value. The cleanest bridge is to convert at the file edge:
add `load_and_migrate_toml(path)` that parses TOML, converts via
toml::Value <-> serde_json::Value, drives the existing migration chain,
and rewrites the file as TOML. `Config::load_inner` now routes every
file layer (user, project, project-local) through this entry point
before the existing TOML merge stage. The JSON conversion is lossy for
TOML datetimes (RFC 3339 string round-trip); no shipping schema field
uses one and the limitation is documented.

P2 — backup rotation ran before the new file write. If serialization,
the temp-file open, fsync, or rename failed, the .bak.N chain had
already been mutated, potentially clobbering a backup the user
depended on. Reorder so the temp-file write + atomic rename complete
first; only after the live file is in place do we rotate the chain
and archive the pre-migration bytes (snapshotted in memory before the
write) into .bak.1.

New integration tests:
- TOML round-trip migration (v0 -> current) via the public
  load_and_migrate_toml entry point.
- Production wiring: Config::load against a v0 .agent/settings.toml in
  a tempdir confirms the file is rewritten with schema_version stamped
  and api.token consolidates into api.api_key.
- Atomic-write failure (read-only parent dir) leaves the prior .bak.1
  untouched and never creates .bak.2 — proves rotation runs strictly
  after a successful write.

Drive-by: localized #[allow(clippy::wrong_self_convention)] on the
Migration trait's from_version accessor with a comment explaining the
lint misfires on getter pairs like from_version/to_version.
@emal-avala
Copy link
Copy Markdown
Member Author

Addressed codex review findings: wired migrations into Config::load via a new load_and_migrate_toml that crosses TOML<->JSON at the file edge (datetime conversion is lossy and documented; no shipping field uses one), and reordered backup rotation to run strictly after the atomic temp-write+rename so a failed write can't mutate the .bak.N chain. New tests cover the TOML round-trip, the production Config::load path against a planted v0 .agent/settings.toml, and a read-only-dir failure that leaves the prior backup untouched.

…profile migrations

Second-round codex findings on the settings-migrations PR:

- atomic_write_bytes now writes to a randomly-named temp file via
  tempfile::NamedTempFile::new_in, so a pre-planted symlink at the
  deterministic <path>.tmp slot can no longer be truncated.
- Capture the live file's permission mode before rewriting and apply
  it to both the rewritten file and the .bak.1 archive (defaulting to
  0600 when the file does not exist yet, since these files contain
  API keys). Windows preserves the read-only flag only — POSIX modes
  do not map cleanly. Backups go through the same atomic-write helper
  so they inherit the same mode-preservation and platform handling.
- Use NamedTempFile::persist for the rename, which on Windows uses
  MoveFileExW with MOVEFILE_REPLACE_EXISTING — std::fs::rename refuses
  to overwrite on Windows, so every migration of an existing config
  would have failed there before.
- After a successful POSIX rename, fsync the parent directory so the
  rename's directory entry survives a crash. Skipped on Windows (no
  equivalent and the ReplaceFile path already provides durability).
- Route Profile::load through migrations::load_and_migrate_toml so old
  profile snapshots get the same schema upgrade path as the main
  settings file. New unit test covers the v0 → v2 upgrade end-to-end.
- Add #[cfg(unix)] tests for 0600/0644 mode preservation and for the
  pre-planted-symlink truncation case.
- Document comment-loss in docs/config-migrations.mdx alongside the
  existing TOML-datetime caveat.

Promotes tempfile from a dev-dep to a regular dep on agent-code-lib.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@emal-avala
Copy link
Copy Markdown
Member Author

Addressed second-round codex findings: unguessable temp paths via tempfile::NamedTempFile (kills the symlink-truncation TOCTOU at <settings>.tmp); preserve original file mode on rewrite + backup, defaulting to 0600 for new files (no more 0644 leak of API keys); cross-platform durable rename via NamedTempFile::persist (fixes Windows rename-over-existing); POSIX parent-dir fsync after rename; Profile::load now routes through the migration runner so old profile snapshots get upgraded; Caveats section in docs covers comment-loss + datetime round-trip. Three new #[cfg(unix)] tests for 0600/0644 preservation and planted-symlink resistance. CI gate green locally — only the bwrap sandbox tests fail in this worktree (environmental, pre-existing). Commit: e734667.

@emal-avala emal-avala merged commit c76b9e6 into main May 1, 2026
13 of 14 checks passed
@emal-avala emal-avala deleted the feat/settings-migrations branch May 1, 2026 20:45
emal-avala added a commit that referenced this pull request May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant