cleanup_old_versions can delete files held by long-running readers #6607

@LuciferYang

Description

A Dataset handle pinned to an older version (via checkout_version) has no way to tell cleanup_old_versions that the version is still in use. If cleanup runs while a reader is mid-scan, the manifest and data files for that version can be deleted out from under it, and the scan fails with object-store NotFound errors.

Reproduce

  1. Open a dataset and checkout_version(N) where N is older than lance.auto_cleanup.older_than.
  2. Start a scan. While the scan is running, run cleanup_old_versions with a before_version that includes N (or let auto-cleanup fire on the next commit).
  3. The scan fails mid-stream with NotFound from the object store.
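
The race in the steps above can be modelled without Lance at all. This is only a toy sketch: `ToyStore`, `scan`, and the file paths are made-up stand-ins, and a plain dict plays the role of the object store. It shows the shape of the failure, not the actual code path inside the library.

```python
class ToyStore:
    """In-memory stand-in for an object store: path -> bytes (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def get(self, path):
        if path not in self.objects:
            raise FileNotFoundError(path)
        return self.objects[path]


def scan(store, paths):
    # Read one file at a time, lazily, like a streaming scan.
    for path in paths:
        yield store.get(path)


store = ToyStore()
old_version_files = ["data/0.lance", "data/1.lance", "data/2.lance"]
for p in old_version_files:
    store.objects[p] = b"rows"

reader = scan(store, old_version_files)
first = next(reader)  # the scan has started and read the first file

# Cleanup runs while the reader is mid-scan and removes the old version's files.
for p in old_version_files:
    del store.objects[p]

try:
    next(reader)  # the reader asks for the next file, which is gone
    failed = False
except FileNotFoundError:
    failed = True
```

The reader gets its first batch fine and only fails on a later read, which matches the mid-stream failure in step 3.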

Current workarounds

Create a temporary tag before the read and delete it after. This works, but every reader has to cooperate, and it gets awkward on auto-cleanup-per-commit deployments, where cleanup runs on every write.
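
For reference, the tag workaround can be wrapped in a context manager so the tag is removed even when the read raises. This is a sketch under assumptions: `pinned_version` and the `reader-pin` prefix are names made up here, and `tags` is assumed to be any object exposing `create(name, version)` and `delete(name)` (I believe `dataset.tags` in the Python bindings has that shape, but check it against your version). `FakeTags` exists only to exercise the helper without a dataset.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def pinned_version(tags, version, prefix="reader-pin"):
    """Hold a uniquely named temporary tag on `version` while a read runs."""
    name = f"{prefix}-{version}-{uuid.uuid4().hex[:8]}"
    tags.create(name, version)
    try:
        yield name
    finally:
        # Runs on normal exit and on exceptions, but not if the process dies.
        tags.delete(name)


class FakeTags:
    """Stand-in for a tags API, used only to demonstrate the helper."""

    def __init__(self):
        self.live = {}

    def create(self, name, version):
        self.live[name] = version

    def delete(self, name):
        del self.live[name]


fake = FakeTags()
with pinned_version(fake, 7) as tag_name:
    held_during_read = dict(fake.live)  # the scan would run here
held_after_read = dict(fake.live)
```

If the reader process is killed before the `finally` block runs, the tag stays behind and pins the version until someone deletes it by hand.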

What I'd want

A short-lived, renewable signal that a reader can publish for the version it's using: cleanup treats the version as retained while the signal is live, and a crashed reader doesn't pin the version forever. Something like an advisory lease with a TTL.
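
To make the idea concrete, here is a rough in-memory model of the semantics I have in mind. Everything in it is hypothetical: `LeaseRegistry`, `versions_to_delete`, and the parameter names are not existing Lance APIs, and a real design would have to persist leases next to the dataset and deal with clock skew. The clock is injectable so the example can step time by hand.

```python
import time


class LeaseRegistry:
    """Hypothetical advisory read leases: (reader_id, version) -> expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._deadline = {}

    def acquire(self, reader_id, version, ttl_seconds):
        self._deadline[(reader_id, version)] = self._clock() + ttl_seconds

    def renew(self, reader_id, version, ttl_seconds):
        # Renewing simply pushes the deadline out from "now".
        self.acquire(reader_id, version, ttl_seconds)

    def release(self, reader_id, version):
        self._deadline.pop((reader_id, version), None)

    def retained_versions(self):
        now = self._clock()
        return {v for (_, v), deadline in self._deadline.items() if deadline > now}


def versions_to_delete(candidates, registry):
    """Cleanup skips any candidate version that still has a live lease."""
    return sorted(set(candidates) - registry.retained_versions())


now = [0.0]  # fake clock, advanced manually below
registry = LeaseRegistry(clock=lambda: now[0])
registry.acquire("reader-a", 3, ttl_seconds=30)

now[0] = 10.0
during_scan = versions_to_delete([1, 2, 3], registry)   # version 3 is retained

now[0] = 25.0
registry.renew("reader-a", 3, ttl_seconds=30)           # lease now expires at 55
now[0] = 40.0
after_renew = versions_to_delete([1, 2, 3], registry)   # still retained

now[0] = 60.0                                           # reader crashed, no renewal
after_expiry = versions_to_delete([1, 2, 3], registry)  # version 3 is eligible again
```

The point of the TTL is the last step: nothing has to clean up after a dead reader, because its lease just stops counting once the deadline passes.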

Happy to put up a PR if this direction sounds reasonable.
