Data lifecycle GC scanning fails for nested/namespaced model types #330

@umag

Description

In src/domain/data/data_lifecycle_service.ts (lines 186-195), findExpiredData() iterates the .swamp/data/ directory using readDir(), which lists only a directory's immediate children, so the scanner sees just the first path segment of each type. For nested types like command/shell or @docker/host, it reads command or @docker as the type name and then treats the subdirectories (shell, host) as model IDs, so the lookups fail and the scan fails silently.

Steps to Reproduce

  1. Create a model of type command/shell or @docker/host
  2. Run the model to produce data with a lifetime (e.g., 1h)
  3. Wait for the data to expire
  4. Run garbage collection
  5. The expired data is never cleaned up

Expected Behavior

GC scanning should correctly handle nested type directory structures (type/subtype/modelId/dataName/version/).

Actual Behavior

The scanner treats the type subdirectory (shell, host) as a model UUID, resulting in failed lookups. Data for all nested/namespaced types is never garbage collected.

Summary

This affects the data lifecycle service's directory scanning logic. The fix would involve walking the directory tree to the correct depth based on the type structure, or using the model registry to enumerate known types and their directory paths rather than relying on directory listing.
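As a rough illustration of the first option (walking the tree to the correct depth), here is a minimal sketch. It is not the actual findExpiredData() code: findModelDirs, ListDir, ModelDir, and maxTypeDepth are hypothetical names, and the sketch assumes that model IDs are UUIDs and that no type segment ever looks like a UUID. Directory access is passed in as a callback so the logic stays independent of the runtime's readDir().

```typescript
// Hypothetical abstraction: returns the names of the immediate
// subdirectories of `path` (what a single readDir() pass would yield).
type ListDir = (path: string) => string[];

// Hypothetical result shape for one model directory found under the data root.
interface ModelDir {
  type: string; // full type name, e.g. "command/shell" or "@docker/host"
  modelId: string; // the UUID-named directory
  path: string; // path to the model directory
}

// Assumption: model IDs are UUIDs, and type segments never match this pattern.
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Descend through type segments until a UUID-named directory is reached.
// Everything above that directory is joined into the type name.
function findModelDirs(
  root: string,
  listDir: ListDir,
  maxTypeDepth = 4, // assumed upper bound on type nesting, to stop runaway walks
): ModelDir[] {
  const found: ModelDir[] = [];

  const walk = (dir: string, typeSegments: string[]): void => {
    for (const name of listDir(dir)) {
      const child = `${dir}/${name}`;
      if (typeSegments.length > 0 && UUID_RE.test(name)) {
        // Reached the model ID level; do not descend into dataName/version here.
        found.push({ type: typeSegments.join("/"), modelId: name, path: child });
      } else if (typeSegments.length < maxTypeDepth) {
        // Still inside the type portion of the path (e.g. "command" -> "shell").
        walk(child, [...typeSegments, name]);
      }
    }
  };

  walk(root, []);
  return found;
}
```

With this approach, .swamp/data/command/shell/<uuid> would resolve to type command/shell rather than type command with shell as the model ID. The registry-based alternative mentioned above avoids the UUID heuristic entirely, at the cost of missing data for types that are no longer registered.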

Labels

bug: Something isn't working