Skip to content

refactor(server): unify policy persistence in objects table#972

Merged
johntmyers merged 6 commits intomainfrom
refactor/os-103-unified-objects
Apr 28, 2026
Merged

refactor(server): unify policy persistence in objects table#972
johntmyers merged 6 commits intomainfrom
refactor/os-103-unified-objects

Conversation

@johntmyers
Copy link
Copy Markdown
Collaborator

@johntmyers johntmyers commented Apr 24, 2026

Summary

This PR moves sandbox policy revisions, draft policy chunks, and settings persistence onto the unified objects table for the greenfield OS-103 schema.
It replaces the old dedicated policy/chunk tables, switches singleton settings writes to name-based upserts, and adds handler-level coverage for the draft chunk lifecycle.

Related Issue

closes OS-103

Changes

  • replace the Postgres and SQLite migration baselines with a unified objects schema and remove the old 002/003 policy-specific migrations
  • add PolicyRevisionPayload and DraftChunkPayload wrappers in the proto layer and update persistence backends to store policy revisions and draft chunks inside objects
  • change settings persistence to upsert singleton records by name, update sandbox teardown cleanup, and tighten CAS retry logic to use typed unique-constraint handling
  • add handler-level draft chunk lifecycle coverage and fix the Rust/Python e2e regressions uncovered while exercising inference routing locally

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Commands run:

  • cargo fmt --all -- --check
  • RUSTC_WRAPPER= cargo test -p openshell-server persistence::tests -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server sandbox_settings_save_and_load_round_trip -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server delete_unlock_sandbox_set_succeeds_after_global_delete -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server draft_chunk_handler_lifecycle_round_trip -- --nocapture
  • RUSTC_WRAPPER= cargo test --manifest-path e2e/rust/Cargo.toml --features e2e --test host_gateway_alias -- --nocapture
  • RUSTC_WRAPPER= UV_NO_SYNC=1 uv run pytest python/openshell/sandbox_test.py -k no_verify
  • user-ran RUSTC_WRAPPER= mise run e2e:rust
  • user-ran RUSTC_WRAPPER= mise run e2e:python

mise run pre-commit was attempted but is still blocked locally by two non-branch issues: ignored architecture/plans/* files without SPDX headers and the workspace openshell-prover test link step failing with ld: library 'z3' not found.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@johntmyers johntmyers requested a review from a team as a code owner April 24, 2026 21:38
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@johntmyers johntmyers self-assigned this Apr 24, 2026
@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label Apr 24, 2026
@johntmyers
Copy link
Copy Markdown
Collaborator Author

/ok to test ed81ed9

@johntmyers johntmyers force-pushed the refactor/os-103-unified-objects branch from ed81ed9 to 031134b Compare April 27, 2026 19:12
@johntmyers
Copy link
Copy Markdown
Collaborator Author

/ok to test 031134b

@johntmyers johntmyers removed the test:e2e Requires end-to-end coverage label Apr 27, 2026
@pimlock
Copy link
Copy Markdown
Collaborator

pimlock commented Apr 27, 2026

Sandbox settings are now keyed by (object_type, name), where name is the sandbox name. The API delete path deletes settings by name, but backend-driven deletion/reconcile paths operate by sandbox id and currently remove only the sandbox record/in-memory state. If a sandbox disappears from the driver without going through the API delete flow, its sandbox_settings row can remain, and a later sandbox with the same name can inherit those stale settings.

Potential fixes: key sandbox settings by immutable sandbox id using scope = sandbox_id, restore a stable id such as settings:{sandbox_id}, or centralize cleanup so every driver/API deletion path deletes the settings row before losing the sandbox name.

Related question: should the local cleanup portion of API-driven deletes and driver/watch/reconcile-driven deletes be unified? The full flows differ because API delete still has to call the driver, but the gateway-owned cleanup probably wants one shared helper for sandbox row, sandbox settings, SSH sessions, indexes, watch/log buses, and any other sandbox-scoped state.

Comment thread crates/openshell-server/src/persistence/mod.rs Outdated
Comment thread crates/openshell-server/src/persistence/mod.rs Outdated
Comment thread crates/openshell-server/src/persistence/tests.rs
@johntmyers
Copy link
Copy Markdown
Collaborator Author

johntmyers commented Apr 28, 2026

Sandbox settings are now keyed by (object_type, name), where name is the sandbox name. The API delete path deletes settings by name, but backend-driven deletion/reconcile paths operate by sandbox id and currently remove only the sandbox record/in-memory state. If a sandbox disappears from the driver without going through the API delete flow, its sandbox_settings row can remain, and a later sandbox with the same name can inherit those stale settings.

Potential fixes: key sandbox settings by immutable sandbox id using scope = sandbox_id, restore a stable id such as settings:{sandbox_id}, or centralize cleanup so every driver/API deletion path deletes the settings row before losing the sandbox name.

Related question: should the local cleanup portion of API-driven deletes and driver/watch/reconcile-driven deletes be unified? The full flows differ because API delete still has to call the driver, but the gateway-owned cleanup probably wants one shared helper for sandbox row, sandbox settings, SSH sessions, indexes, watch/log buses, and any other sandbox-scoped state.

@pimlock

Left settings keyed by name still and added a shared helper that centralizes gateway-owned sb cleanup that deletes ssh sessions plus the settings row.

f35d668

@johntmyers johntmyers force-pushed the refactor/os-103-unified-objects branch from beb9fdd to a21ff9e Compare April 28, 2026 17:36
@johntmyers
Copy link
Copy Markdown
Collaborator Author

/ok to test a21ff9e

@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label Apr 28, 2026
@github-actions
Copy link
Copy Markdown

Label test:e2e applied for a21ff9e. Open the existing run and click Re-run all jobs to execute with the label set. The E2E Gate check on this PR will flip green automatically once the run finishes.

@johntmyers johntmyers requested review from drew and pimlock April 28, 2026 21:26
@pimlock pimlock added test:e2e Requires end-to-end coverage and removed test:e2e Requires end-to-end coverage labels Apr 28, 2026
@github-actions
Copy link
Copy Markdown

Label test:e2e applied, but pull-request/972 is at a21ff9e while the PR head is 42f6ff3. A maintainer needs to comment /ok to test 42f6ff3bd7b5652b34f4349e6c698b0153cd1e83 to refresh the mirror. Once the mirror catches up, re-run Branch E2E Checks from the Actions tab.

@pimlock
Copy link
Copy Markdown
Collaborator

pimlock commented Apr 28, 2026

/ok to test 42f6ff3

@johntmyers johntmyers merged commit d414e69 into main Apr 28, 2026
31 of 34 checks passed
@johntmyers johntmyers deleted the refactor/os-103-unified-objects branch April 28, 2026 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test:e2e Requires end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants