Symptom
Invoking any cache-lease code path against a pre-existing (older-schema) fbuild cache database errors with:
```
lnk cache index error: no such column: refcount in
UPDATE leases SET refcount = refcount + 1
WHERE entry_id = ?1 AND holder_pid = ?2 AND holder_nonce = ?3
at offset 29
```
Reproduced on Windows against `~/.fbuild/prod/cache` created by an older fbuild version. Fresh `DiskCache::open_at(tempdir)` DBs work fine because their schema already includes the `leases.refcount` column. The failure is specific to long-lived production DBs that were created before the `refcount` column was added.
Impact
Any consumer of `DiskCache::lease()` is affected.
Workaround in PR #119
The `.lnk` resolver currently wraps `lease()` in a `best_effort_lease()` helper that logs a warning and returns `None` on failure. This keeps the feature functional (the cached blob is still valid and verifiable; it just isn't pinned against concurrent GC). That's good enough for single-process builds but weakens the GC-safety invariant for parallel ones.
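The degrade-to-`None` behaviour described above can be sketched roughly as follows. This is not the code from PR #119: the generic `best_effort` function and its signature are hypothetical, and `eprintln!` stands in for whatever logging the resolver actually uses.

```rust
use std::fmt::Display;

/// Hypothetical stand-in for `best_effort_lease()`: turn a failed lease into
/// `None` plus a warning, so the caller can continue without a GC pin.
pub fn best_effort<T, E: Display>(result: Result<T, E>) -> Option<T> {
    match result {
        // Lease acquired: the blob is pinned against concurrent GC.
        Ok(lease) => Some(lease),
        // Lease failed (e.g. "no such column: refcount"): warn and carry on.
        Err(err) => {
            eprintln!("warning: cache lease failed, continuing without GC pin: {err}");
            None
        }
    }
}
```

A caller would pass the result of the lease call, e.g. `best_effort(cache.lease(entry))`, and treat `None` as "unpinned but still usable".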
Proper fix
One of:
1. Schema migration on `DiskCache::open()`: detect the missing `refcount` column, then run `ALTER TABLE leases ADD COLUMN refcount INTEGER NOT NULL DEFAULT 0;` + `INSERT ... UPDATE` re-baseline.
2. Idempotent `CREATE TABLE` that always includes `refcount`, plus a versioned schema migration mechanism (`PRAGMA user_version` or a `schema_migrations` table) that applies deltas on open.
Option 2 is the longer-term clean answer; option 1 is a single-revision patch.
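A minimal sketch of how the two options could combine: a version-gated migration loop whose first step only runs the `ALTER TABLE` when the column is really missing. Everything here except the SQL statement is an assumption. The `SchemaDb` trait, `migrate`, `LATEST_SCHEMA_VERSION`, and `FakeDb` are hypothetical names, and the trait abstracts over the real SQLite connection so the logic can be shown without a database. Error handling, transactions, and the re-baseline step are left out.

```rust
/// Hypothetical abstraction over the cache index connection. A real
/// implementation would back these with `PRAGMA user_version` and
/// `PRAGMA table_info(leases)`.
pub trait SchemaDb {
    fn user_version(&self) -> u32;
    fn set_user_version(&mut self, v: u32);
    fn has_column(&self, table: &str, column: &str) -> bool;
    fn execute(&mut self, sql: &str);
}

/// Schema version this sketch migrates to (assumed numbering).
pub const LATEST_SCHEMA_VERSION: u32 = 1;

/// Apply pending schema deltas one version at a time; a no-op when current.
pub fn migrate<D: SchemaDb>(db: &mut D) {
    let mut version = db.user_version();
    while version < LATEST_SCHEMA_VERSION {
        match version {
            // 0 -> 1: add `leases.refcount` if it is missing. Fresh DBs may
            // already have the column while still reporting version 0, so
            // the step checks before altering.
            0 => {
                if !db.has_column("leases", "refcount") {
                    db.execute(
                        "ALTER TABLE leases ADD COLUMN refcount INTEGER NOT NULL DEFAULT 0",
                    );
                }
            }
            _ => unreachable!("no migration defined from version {}", version),
        }
        version += 1;
        db.set_user_version(version);
    }
}

/// In-memory fake used only to exercise the logic above.
#[derive(Default)]
pub struct FakeDb {
    pub version: u32,
    pub lease_columns: Vec<String>,
    pub executed: Vec<String>,
}

impl SchemaDb for FakeDb {
    fn user_version(&self) -> u32 {
        self.version
    }
    fn set_user_version(&mut self, v: u32) {
        self.version = v;
    }
    fn has_column(&self, table: &str, column: &str) -> bool {
        table == "leases" && self.lease_columns.iter().any(|c| c == column)
    }
    fn execute(&mut self, sql: &str) {
        // Mimic the effect of the one statement this sketch issues.
        if sql.starts_with("ALTER TABLE leases ADD COLUMN refcount") {
            self.lease_columns.push("refcount".to_string());
        }
        self.executed.push(sql.to_string());
    }
}
```

Calling `migrate` on a legacy `FakeDb` (version 0, no `refcount`) issues one `ALTER TABLE` and bumps the version. Calling it again, or on a fresh DB that already has the column, issues no SQL. In a real implementation the whole loop would likely run inside a transaction so that a failed step does not leave the version half-advanced.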
Related
- PR #119 — `.lnk` resource pointers, works around the issue with `best_effort_lease`.
Once this lands, `best_effort_lease` in `crates/fbuild-packages/src/lnk/resolver.rs` can revert to `cache.lease(entry).map_err(map_cache_err)?` and the lease will once again be a hard requirement.