Feat/provisioning architecture rewrite#70

Merged
shard77 merged 16 commits into dev from
feat/provisioning-architecture-rewrite
Mar 10, 2026

shard77 commented Mar 10, 2026

No description provided.

shard77 added 16 commits March 10, 2026 09:36
- Add migration 8: images table, provision_status enum
- Extend machines table with pool persistence columns
- Add ImagesConfig struct for image store path
- Replace base_image with image in MachineDefaults
- Fix all base_image references in scheduler and CLI
- Add uuid workspace dependency to malbox-database
- Add ImageError enum with Insert/Fetch/Update/Delete variants
- Register images module in repositories.rs
- Implement insert_image, fetch_image_by_name, fetch_image_by_id,
  fetch_all_images, update_image_availability, delete_image_by_name
  using sqlx::query_as with .bind() calls
- Add ProvisionStatusDb enum for provision tracking
- Extend Machine struct with pool persistence fields
- Add fetch_pool_machines, upsert_pool_machine queries
- Add update_provision_status, remove_pool_machine
- Convert query_as! macros to runtime query_as
- Replace in-memory HashMap pool with Vec<PooledMachine> + DB backing
- Add db_id field to PooledMachine for DB row tracking
- MachinePool now accepts PgPool and provides load_from_db()
- Add Database variant to ResourceError
- PooledManager takes PgPool, passes to MachinePool
- initialize_pool() is now two-phase: reconcile + fill
- Phase 1 reconciles existing DB machines against provider
- Phase 2 provisions new machines to reach min pool size
- Add provision_new_machine() with full lifecycle tracking
- DB status transitions: unprovisioned -> provisioning -> provisioned
- On provision failure, DB status set to failed
- build_spec() helper extracts spec construction
- OnDemandManager unchanged (no DB dependency)
- POST /v1/images to register new images
- GET /v1/images to list all images
- GET /v1/images/{name} to get single image
- DELETE /v1/images/{name} to remove image
- Remove validate_base_image function
- Resolve image from DB registry for pooled manager
- Pass PgPool to PooledManager::new
- Spawn image store filesystem watcher
- Add mod image_store declaration
- Replace base_image with image in machinery defaults
- Add [images] section for managed image store
- Replace legacy columns (locked, status varchar, reserved, etc.)
  with machine_status enum and lifecycle columns
- Add current_task_id, error_message, clean_snapshot, provider fields
- Move provision_status to machine_status with 7 lifecycle states
- FK constraints for image_id and current_task_id added in migration 8
- Remove ManagerType enum (pooled/ondemand distinction gone)
- Remove PoolConfig struct
- Move clean_snapshot_name to MachineryConfig top level
- Keep MachineDefaults unchanged
- Replace ProvisionStatusDb with MachineStatusDb (7 states)
- New Machine struct: drop legacy fields, add status/current_task_id/error_message
- Atomic acquire_machine with FOR UPDATE SKIP LOCKED
- New lifecycle queries: insert, acquire, release, update_status, etc.
- Remove all legacy queries: lock/unlock, pool-specific, query builders
- NoMachineAvailable, MachineAssigned, InvalidMachineState
- SnapshotRestoreFailed for revert failures
- New MachinePool with lifecycle: create, acquire, release, revert,
  delete, retry, reconcile
- Reconciliation validates DB state against provider on startup
- Snapshot restore with graceful degradation (re-provision fallback)
- Notify mechanism for workers waiting on machine availability
- Remove MachineryManager trait, PooledManager, OnDemandManager
- Strip manager.rs to just wait_for_endpoint utility
- Replace MachineryManager trait with Arc<MachinePool>
- Worker acquires/releases machines via pool methods
- Re-enqueue task when no machine available
- Remove task_to_machine_spec helper
- Build RuntimeMachine from DbMachine for transport
- POST/GET /v1/machines for create and list
- GET/DELETE /v1/machines/:id for fetch and delete
- POST /v1/machines/:id/retry to retry failed machines
- Add malbox-resources dependency to malbox-http
- Replace PooledManager/OnDemandManager with unified MachinePool
- Remove validate_manager_compatibility function
- Call machine_pool.reconcile() on startup
- Pass machine_pool to both scheduler and HTTP server
- Update example config to reflect new architecture
- Move image store watcher module to malbox-utils
- Add malbox-database and notify deps to malbox-utils
- Remove notify dep from malbox-daemon
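
The acquire/release flow described in the commits above (atomic acquisition, release, and a notify mechanism for workers waiting on availability) can be sketched roughly as follows. This is an in-memory analogue only, not the PR's implementation: the actual pool is database-backed and acquires machines with `FOR UPDATE SKIP LOCKED`, whereas this sketch substitutes a `Mutex` and a `Condvar`. The `Status` variants, the `Pool` type, and the method names `try_acquire`, `acquire_timeout`, and `release` are hypothetical, chosen for illustration; the PR mentions 7 lifecycle states without listing them here.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical two-state subset of the machine lifecycle (assumption:
/// the real machine_status enum has 7 states that are not named here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ready,
    Busy,
}

#[derive(Debug, Clone)]
struct Machine {
    id: u32,
    status: Status,
    current_task_id: Option<u64>,
}

/// In-memory stand-in for a DB-backed pool (assumption for illustration).
struct Pool {
    machines: Mutex<Vec<Machine>>,
    available: Condvar,
}

impl Pool {
    fn new(ids: &[u32]) -> Arc<Self> {
        let machines = ids
            .iter()
            .map(|&id| Machine { id, status: Status::Ready, current_task_id: None })
            .collect();
        Arc::new(Pool { machines: Mutex::new(machines), available: Condvar::new() })
    }

    /// Mark the first ready machine as busy for `task_id`.
    /// The caller must hold the lock, so the claim is atomic.
    fn claim(machines: &mut [Machine], task_id: u64) -> Option<u32> {
        let machine = machines.iter_mut().find(|m| m.status == Status::Ready)?;
        machine.status = Status::Busy;
        machine.current_task_id = Some(task_id);
        Some(machine.id)
    }

    /// Non-blocking acquire. `None` means no machine is available,
    /// in which case a worker could re-enqueue the task.
    fn try_acquire(&self, task_id: u64) -> Option<u32> {
        let mut machines = self.machines.lock().unwrap();
        Self::claim(&mut machines, task_id)
    }

    /// Wait up to `timeout` for a release notification, then try to claim.
    fn acquire_timeout(&self, task_id: u64, timeout: Duration) -> Option<u32> {
        let guard = self.machines.lock().unwrap();
        let (mut guard, _timed_out) = self
            .available
            .wait_timeout_while(guard, timeout, |ms| {
                !ms.iter().any(|m| m.status == Status::Ready)
            })
            .unwrap();
        Self::claim(&mut guard, task_id)
    }

    /// Return a machine to the pool and wake one waiting worker.
    fn release(&self, id: u32) {
        let mut machines = self.machines.lock().unwrap();
        if let Some(machine) = machines.iter_mut().find(|m| m.id == id) {
            machine.status = Status::Ready;
            machine.current_task_id = None;
        }
        drop(machines);
        self.available.notify_one();
    }
}
```

With a single-machine pool created via `Pool::new(&[1])`, a first `try_acquire` would claim machine 1 and a second would return `None` until `release(1)` is called. In the database-backed version, the row lock taken by `FOR UPDATE SKIP LOCKED` plays the role that the mutex plays in this sketch.
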
shard77 merged commit 0c81d6a into dev Mar 10, 2026
1 check failed
shard77 deleted the feat/provisioning-architecture-rewrite branch March 10, 2026 22:39