Archive Strategy: Compacted Task History #69

@dcramer

Description

Problem Statement

Over time, dex accumulates completed tasks that:

  • Consume memory (all loaded on every operation)
  • Slow down performance (parsing/filtering large datasets)
  • Create noise in active task lists
  • Grow the storage file indefinitely

Solution: Compacted Archive in JSONL Format

Storage Structure

.dex/
├── tasks.jsonl          # Active tasks (pending + recent completed) - JSONL format
└── archive.jsonl        # Archived tasks (compacted format) - JSONL format

Both files use JSONL (JSON Lines) format:

  • One task per line
  • Compact JSON (no pretty-printing)
  • Same format benefits: merge-safe, line-based, append-friendly
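As a rough illustration of the format described above, here is a minimal sketch of JSONL serialization and parsing. The helper names (toJsonl, fromJsonl) are hypothetical and not taken from the dex codebase.

```typescript
// Hypothetical helpers for illustration; not the actual dex implementation.
type JsonObject = Record<string, unknown>;

/** Serialize records as JSONL: one compact JSON object per line. */
function toJsonl(records: JsonObject[]): string {
  if (records.length === 0) return "";
  // JSON.stringify without an indent argument emits compact, single-line JSON.
  return records.map((record) => JSON.stringify(record)).join("\n") + "\n";
}

/** Parse JSONL text, skipping blank lines such as the one after a trailing newline. */
function fromJsonl(text: string): JsonObject[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as JsonObject);
}

const jsonlText = toJsonl([
  { id: "abc123", description: "Add Auth" },
  { id: "t1", description: "Database" },
]);
const roundTripped = fromJsonl(jsonlText);
```

Because each task occupies exactly one line, a change to one task touches one line in a diff, which is where the merge-safety comes from.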

What Gets Compacted

Full Task (tasks.jsonl):

{"id":"abc123","description":"Add Auth","context":"[50KB plan]",...,"children":["t1","t2"]}

Archived Task (archive.jsonl):

{"id":"abc123","description":"Add Auth","completed_at":"...","archived_at":"...","result":"Completed","archived_children":[{"id":"t1","description":"Database","result":"Done"}]}

Size reduction: 50-80% (mainly from dropping context field)
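A minimal sketch of the compaction step, assuming the field names shown in the two examples above. The interfaces are simplified guesses at the real schemas, and the optional `now` parameter is an addition for deterministic testing, not part of the proposed compactTask(task, children) signature.

```typescript
// Simplified, assumed shapes; the real Task and ArchivedTask schemas may differ.
interface Task {
  id: string;
  parent_id: string | null;
  description: string;
  context: string;
  result: string | null;
  completed_at: string | null;
  created_at: string;
  updated_at: string;
  priority: number;
  blockedBy: string[];
  blocks: string[];
  children: string[];
}

interface ArchivedChild {
  id: string;
  description: string;
  result: string | null;
}

interface ArchivedTask {
  id: string;
  parent_id: string | null;
  description: string;
  result: string | null;
  completed_at: string | null;
  archived_at: string;
  archived_children: ArchivedChild[];
}

/** Keep only the archive fields and roll children up into archived_children. */
function compactTask(task: Task, children: Task[], now: Date = new Date()): ArchivedTask {
  return {
    id: task.id,
    parent_id: task.parent_id,
    description: task.description,
    result: task.result,
    completed_at: task.completed_at,
    archived_at: now.toISOString(),
    archived_children: children.map((child) => ({
      id: child.id,
      description: child.description,
      result: child.result,
    })),
  };
}

const baseTask: Task = {
  id: "t1", parent_id: "abc123", description: "Database", context: "x".repeat(500),
  result: "Done", completed_at: "2024-01-01T00:00:00.000Z",
  created_at: "2023-12-01T00:00:00.000Z", updated_at: "2024-01-01T00:00:00.000Z",
  priority: 1, blockedBy: [], blocks: [], children: [],
};
const epic: Task = {
  ...baseTask, id: "abc123", parent_id: null, description: "Add Auth",
  context: "x".repeat(5000), result: "Completed", children: ["t1"],
};

const archived = compactTask(epic, [baseTask], new Date("2024-06-01T00:00:00Z"));
const sizeBefore = JSON.stringify(epic).length + JSON.stringify(baseTask).length;
const sizeAfter = JSON.stringify(archived).length;
```

Comparing sizeBefore and sizeAfter gives the size reduction for a lineage. The actual percentage depends almost entirely on how large the context fields are.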

Key Principles

  1. JSONL everywhere: Both active and archive use same format
  2. Compacted fields: Archive drops context, blockedBy, blocks, children, and the created_at/updated_at timestamps
  3. Rolled-up children: Epic's children compacted into archived_children array
  4. Complete lineages only: Children of active parents stay in tasks.jsonl
  5. Time + Count criteria: Archive if completed more than 90 days ago AND not among the 50 most recently completed tasks

Archive Operations

Manual: dex archive <task-id>

  • Archives task + all descendants
  • Validates all completed, no active ancestors
  • Compacts and appends to archive.jsonl
  • Removes from tasks.jsonl

Auto-archive (future):

  • Runs during write operations
  • Time (90 days) + Count (recent 50) based
  • Only complete lineages
  • Off by default initially

Query:

  • dex list → only tasks.jsonl (fast)
  • dex list --archived → only archive.jsonl
  • dex list --all → both files
  • dex show <id> → auto-checks both files

Implementation Phases

See subtasks for detailed breakdown.

Task Tree

  • Add manual archive command 2uvg80ks
  • Implement auto-archive functionality j8osnde9
  • Add CLI flags for archive operations kotl63oy
  • Create Archive Storage Layer qi9iakzt
  • Implement Compaction Logic qzx9knp8
  • Add comprehensive archive tests v771l2ru
  • Integrate archive into query operations xe0pfmym

Task Details

[x] Add manual archive command 2uvg80ks

Description

Implement 'dex archive' CLI command in src/cli/commands.ts.

Command: dex archive <task-id>

Functionality:

  • Validate task + all descendants are completed
  • Validate no active ancestors
  • Collect task + descendants using collectArchivableTasks()
  • Compact using compactTask()
  • Clean up blocking references in active tasks
  • Move to archive.jsonl
  • Remove from tasks.jsonl
  • Display summary with size reduction

Error cases:

  • Pending descendants: Show list of incomplete subtasks
  • Active blockers: Show tasks this would leave without blockers
  • Not found: Clear error message
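The "clean up blocking references" step listed under Functionality could look roughly like the sketch below. The function name removeBlockingReferences and the BlockableTask shape are hypothetical; only the blockedBy and blocks field names come from this issue.

```typescript
// Assumed minimal shape of an active task; the real Task type has more fields.
interface BlockableTask {
  id: string;
  blockedBy: string[];
  blocks: string[];
}

/**
 * Return copies of the active tasks with every reference to an archived
 * task removed from blockedBy and blocks. The input is not mutated.
 */
function removeBlockingReferences<T extends BlockableTask>(
  activeTasks: T[],
  archivedIds: ReadonlySet<string>,
): T[] {
  return activeTasks.map((task) => ({
    ...task,
    blockedBy: task.blockedBy.filter((id) => !archivedIds.has(id)),
    blocks: task.blocks.filter((id) => !archivedIds.has(id)),
  }));
}

const activeTasks: BlockableTask[] = [
  { id: "a", blockedBy: ["abc123", "b"], blocks: [] },
  { id: "b", blockedBy: [], blocks: ["a", "t1"] },
];
const cleaned = removeBlockingReferences(activeTasks, new Set(["abc123", "t1"]));
```

Returning new objects rather than mutating in place keeps the cleanup easy to combine with an atomic rewrite of tasks.jsonl: nothing changes until the cleaned list is written.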

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Implemented 'dex archive <task-id>' CLI command in src/cli/archive.ts.

Features:

  • Archives completed task + all descendants to archive.jsonl
  • Validates task and descendants are completed
  • Validates no incomplete ancestors
  • Compacts tasks (drops blockedBy, blocks, children, timestamps, priority)
  • Preserves id, parent_id, name, description, result, completed_at, metadata
  • Cleans up blocking references in remaining active tasks
  • Shows size reduction percentage

Added 11 tests in archive.test.ts covering:

  • Help display, validation errors
  • Single task archiving
  • Parent + subtask archiving
  • Incomplete subtask/ancestor rejection
  • Blocking reference cleanup
  • Description preservation

Updated: help.ts, args.ts (commands list), docs/src/pages/cli.astro

[x] Implement auto-archive functionality j8osnde9

Description

Add automatic archival in src/core/auto-archive.ts.

Implementation:

  • Run during write operations (storage.writeAsync)
  • Time + count criteria: completed more than 90 days ago AND not among the 50 most recently completed tasks
  • Only archive complete lineages (no active ancestors)
  • Only archive top-level completed tasks automatically
  • Silent operation with logging to .dex/archive.log
  • Configurable via .dex/config.json (archive.auto, archive.age_days, archive.keep_recent)

Algorithm:

  1. Filter completed top-level tasks (no parent_id)
  2. Check canAutoArchive() for each
  3. Archive eligible tasks using existing archive logic
  4. Log actions for transparency

Start with auto-archive OFF by default (opt-in).
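One way to read the time + count criteria is sketched below. It assumes "recent 50" means the 50 most recently completed tasks, and it omits the complete-lineage check for brevity. The names findAutoArchivable, AutoArchiveConfig, ageDays, and keepRecent are illustrative, not the actual dex API.

```typescript
// Assumed shapes for illustration only.
interface CompletedTask {
  id: string;
  parent_id: string | null;
  completed_at: string; // ISO 8601 timestamp
}

interface AutoArchiveConfig {
  auto: boolean;
  ageDays: number;
  keepRecent: number;
}

// Defaults from the proposal: opt-in, 90 days, keep the most recent 50.
const DEFAULT_AUTO_ARCHIVE: AutoArchiveConfig = { auto: false, ageDays: 90, keepRecent: 50 };
const DAY_MS = 24 * 60 * 60 * 1000;

/** Top-level completed tasks that are old enough AND outside the recent window. */
function findAutoArchivable(
  completed: CompletedTask[],
  config: AutoArchiveConfig,
  now: Date,
): CompletedTask[] {
  if (!config.auto) return [];
  const cutoff = now.getTime() - config.ageDays * DAY_MS;
  // Sort a copy, most recently completed first, and protect the first keepRecent.
  const byRecency = [...completed].sort(
    (a, b) => Date.parse(b.completed_at) - Date.parse(a.completed_at),
  );
  const protectedIds = new Set(byRecency.slice(0, config.keepRecent).map((t) => t.id));
  return completed.filter(
    (t) =>
      t.parent_id === null &&
      Date.parse(t.completed_at) < cutoff &&
      !protectedIds.has(t.id),
  );
}

const referenceNow = new Date("2025-01-01T00:00:00Z");
const daysAgo = (n: number) => new Date(referenceNow.getTime() - n * DAY_MS).toISOString();
const completedTasks: CompletedTask[] = [
  { id: "old1", parent_id: null, completed_at: daysAgo(200) },
  { id: "old2", parent_id: null, completed_at: daysAgo(100) },
  { id: "recent", parent_id: null, completed_at: daysAgo(1) },
  { id: "child", parent_id: "old1", completed_at: daysAgo(150) },
];
const eligibleIds = findAutoArchivable(
  completedTasks,
  { auto: true, ageDays: 90, keepRecent: 2 },
  referenceNow,
).map((t) => t.id);
```

With keepRecent set to 2, "recent" and "old2" are protected by the count rule even though "old2" is past the age cutoff, so only "old1" is eligible. "child" is skipped because children are archived with their parent.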

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Implemented auto-archive functionality in src/core/auto-archive.ts. Features:

  • Runs during writeAsync() operations in JsonlStorage
  • Configurable via .dex/config.toml: archive.auto (default: false), archive.age_days (default: 90), archive.keep_recent (default: 50)
  • Only archives root-level completed tasks with complete lineages
  • Logs archive events to .dex/archive.log
  • 14 passing tests in auto-archive.test.ts

Auto-archive is opt-in (disabled by default) per the task requirements.

[x] Add CLI flags for archive operations kotl63oy

Description

Update CLI commands to expose archive functionality.

Add flags to existing commands:

  • dex list --archived: List only archived tasks
  • dex list --all: Include both active and archived
  • dex show <id>: Auto-detect archive (no flag needed)
  • dex search --archived: Search archived tasks

Add new bulk operations:

  • dex archive --older-than 60d: Archive all tasks completed more than 60 days ago
  • dex archive --completed: Archive ALL completed tasks
  • dex archive --except id1,id2: Archive all except specified
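A sketch of how the bulk flags might select tasks. Only the `60d` form is shown in this issue, so the parser below accepts whole days only; other units would be an extension. parseOlderThan, selectForBulkArchive, and BulkArchiveOptions are hypothetical names.

```typescript
// Assumed shapes for illustration only.
interface CompletedTask {
  id: string;
  completed_at: string; // ISO 8601 timestamp
}

interface BulkArchiveOptions {
  olderThanDays?: number; // from --older-than; omitted for --completed
  except?: string[];      // from --except id1,id2
}

const DAY_MS = 24 * 60 * 60 * 1000;

/** Parse a value such as "60d" into a number of days. */
function parseOlderThan(value: string): number {
  const match = /^(\d+)d$/.exec(value.trim());
  if (!match) {
    throw new Error(`Invalid --older-than value "${value}"; expected a form like 60d`);
  }
  return Number(match[1]);
}

/** Select completed tasks for bulk archiving, honoring the age and exclusion flags. */
function selectForBulkArchive(
  completed: CompletedTask[],
  options: BulkArchiveOptions,
  now: Date,
): CompletedTask[] {
  const excluded = new Set(options.except ?? []);
  // With no age limit (the --completed case) every completed task qualifies.
  const cutoff =
    options.olderThanDays === undefined
      ? Infinity
      : now.getTime() - options.olderThanDays * DAY_MS;
  return completed.filter(
    (t) => !excluded.has(t.id) && Date.parse(t.completed_at) < cutoff,
  );
}

const bulkNow = new Date("2025-01-01T00:00:00Z");
const bulkTasks: CompletedTask[] = [
  { id: "a", completed_at: "2024-01-01T00:00:00Z" },
  { id: "b", completed_at: "2024-12-01T00:00:00Z" },
  { id: "c", completed_at: "2024-06-01T00:00:00Z" },
];
const selectedIds = selectForBulkArchive(
  bulkTasks,
  { olderThanDays: parseOlderThan("60d"), except: ["c"] },
  bulkNow,
).map((t) => t.id);
```

A --dry-run mode would fit naturally here: print the selection and stop before anything is written.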

Update display format:

  • Show (ARCHIVED) badge on archived tasks
  • Note when context is removed
  • Display archived_children in tree format
  • Show archive count in list summary

Update help text and documentation.

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Implemented CLI flags for archive operations:

  • Added --archived flag to 'dex list' to view archived tasks
  • Added bulk archive flags: --older-than, --completed, --except, --dry-run
  • Updated 'dex show' to auto-detect and display archived tasks
  • Added formatArchivedTask and formatArchivedTaskShow helpers
  • Code simplified by extracting compactCollection and removeArchivedTasks helpers
  • Updated help.ts documentation

All 707 tests pass. Committed to main.

[x] Create Archive Storage Layer qi9iakzt

Description

Implement archive storage infrastructure in src/core/storage/archive-storage.ts.

IMPORTANT: archive.jsonl uses JSONL format (same as tasks.jsonl):

  • One archived task per line
  • Compact JSON (no pretty-printing)
  • Line-based for git merge-safety
  • Append-only operations

Create ArchiveStorage class with methods:

  • readArchive(): Read archive.jsonl, split by newlines, parse each line as ArchivedTask
  • appendArchive(tasks): Append tasks to archive.jsonl (one per line, compact JSON)
  • searchArchive(query): Search archived tasks by description/result
  • getArchived(id): Get specific archived task by ID

Add ArchivedTaskSchema to src/types.ts with compacted fields:

  • Keep: id, parent_id, description, completed_at, archived_at, result, metadata.github
  • Drop: context, blockedBy, blocks, children, created_at, updated_at, priority
  • Add: archived_children array for rolled-up subtasks

File operations:

  • Read: Load entire file, split by \n, parse each line
  • Write: Serialize to compact JSON (JSON.stringify, no formatting), join with \n
  • Atomic writes: temp file + rename (same as JsonlStorage)
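The read, write, and atomic-rename operations above might be sketched as follows using Node's built-in fs module. The function names are hypothetical, the temp-file naming is an assumption, and whether append rewrites the whole file (as here) or appends in place is a design choice this sketch does not settle.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/** Read a JSONL file: split by newline, skip blank lines, parse each line. */
function readJsonl<T>(path: string): T[] {
  if (!existsSync(path)) return [];
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as T);
}

/** Write JSONL atomically: write a temp file next to the target, then rename. */
function writeJsonlAtomic(path: string, records: object[]): void {
  const body = records.map((record) => JSON.stringify(record)).join("\n");
  // The temp file lives in the same directory so the rename stays on one filesystem.
  const tempPath = `${path}.tmp-${process.pid}`;
  writeFileSync(tempPath, body.length > 0 ? body + "\n" : "", "utf8");
  renameSync(tempPath, path);
}

/** Append by reading the existing records and rewriting the file atomically. */
function appendArchive<T extends object>(path: string, tasks: T[]): void {
  writeJsonlAtomic(path, [...readJsonl<T>(path), ...tasks]);
}

// Demo in a throwaway temp directory rather than a real .dex/ folder.
const demoDir = mkdtempSync(join(tmpdir(), "dex-archive-"));
const archivePath = join(demoDir, "archive.jsonl");
appendArchive(archivePath, [{ id: "abc123", description: "Add Auth", result: "Completed" }]);
appendArchive(archivePath, [{ id: "t9", description: "Docs", result: "Done" }]);
const stored = readJsonl<{ id: string }>(archivePath);
```

Readers never observe a half-written archive with this approach, because the rename replaces the old file in a single step on POSIX filesystems.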

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Created ArchiveStorage class in src/core/storage/archive-storage.ts with methods: readArchive(), writeArchive(), appendArchive(), searchArchive(), getArchived(), removeArchived().

  • Added ArchivedTask, ArchivedChild, and ArchiveStore schemas to types.ts
  • Exported from storage/index.ts
  • Added 28 tests in archive-storage.test.ts covering all operations, including JSONL parsing, atomic writes, search, and round-trip preservation

All 639 tests pass.

[x] Implement Compaction Logic qzx9knp8

Description

Create compaction utilities in src/core/archive-compactor.ts.

Implement functions:

  • compactTask(task, children): Convert full Task to compacted ArchivedTask

    • Strip context, blockedBy, blocks, children, created_at, updated_at, priority
    • Roll up children into archived_children array
    • Preserve GitHub metadata if present
    • Add archived_at timestamp
  • collectArchivableTasks(taskId, allTasks): Gather task + descendants

    • Validate all are completed
    • Validate no active ancestors
    • Return {root, descendants} for archival
  • canAutoArchive(task, allTasks, config): Check if task meets auto-archive criteria

    • Time + count based (90 days + not in recent 50)
    • No active ancestors
    • Complete lineage only
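The lineage validation in collectArchivableTasks could be sketched like this. The TaskNode shape, including the boolean `completed` field, is an assumption made for illustration, and the sketch assumes parent_id links form a tree (no cycles among descendants).

```typescript
// Assumed minimal shape; the real Task type tracks completion differently, perhaps.
interface TaskNode {
  id: string;
  parent_id: string | null;
  completed: boolean;
}

interface ArchivableSet {
  root: TaskNode;
  descendants: TaskNode[];
}

/** Gather a task and its descendants, or return null if the lineage is not archivable. */
function collectArchivableTasks(taskId: string, allTasks: TaskNode[]): ArchivableSet | null {
  const byId = new Map<string, TaskNode>();
  for (const task of allTasks) byId.set(task.id, task);

  const root = byId.get(taskId);
  if (!root || !root.completed) return null;

  // Walk up: any incomplete ancestor still in the active store blocks archiving.
  const visited = new Set<string>([root.id]);
  let ancestorId = root.parent_id;
  while (ancestorId !== null) {
    const ancestor = byId.get(ancestorId);
    if (!ancestor || visited.has(ancestor.id)) break; // missing parent or cycle guard
    if (!ancestor.completed) return null;
    visited.add(ancestor.id);
    ancestorId = ancestor.parent_id;
  }

  // Walk down breadth-first: every descendant must be completed.
  const descendants: TaskNode[] = [];
  const queue: string[] = [root.id];
  while (queue.length > 0) {
    const currentId = queue.shift()!;
    for (const task of allTasks) {
      if (task.parent_id !== currentId) continue;
      if (!task.completed) return null;
      descendants.push(task);
      queue.push(task.id);
    }
  }
  return { root, descendants };
}

const tree: TaskNode[] = [
  { id: "epic", parent_id: null, completed: true },
  { id: "t1", parent_id: "epic", completed: true },
  { id: "t2", parent_id: "t1", completed: true },
  { id: "other", parent_id: null, completed: false },
  { id: "o1", parent_id: "other", completed: true },
];
const epicSet = collectArchivableTasks("epic", tree);
const blockedSet = collectArchivableTasks("o1", tree);
```

Here the "epic" lineage is fully completed and can be archived, while "o1" is rejected because its parent "other" is still active.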

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Implemented compaction utilities in src/core/archive-compactor.ts:

  • compactTask(task, children): Converts Task to ArchivedTask, stripping context, blockedBy, blocks, children, created_at, updated_at, priority. Preserves id, parent_id, description, result, completed_at, metadata.github, metadata.commit. Adds archived_at and rolls up children into archived_children.

  • collectArchivableTasks(taskId, allTasks): Gathers task + descendants, validates all completed with no active ancestors. Returns {root, descendants} or null if invalid.

  • canAutoArchive(task, allTasks, config): Checks auto-archive criteria (minAgeDays: 90, keepRecentCount: 50 defaults). Validates time-based and count-based thresholds, plus complete lineage.

  • findAutoArchivableTasks(allTasks, config): Finds root-level tasks eligible for auto-archiving (children archived with parents).

Tests: 26 passing tests covering all functions, edge cases, and configuration options.

Commit: 31007c0

[x] Add comprehensive archive tests v771l2ru

Description

Create test suite for archive functionality.

Test files:

  • src/core/storage/archive-storage.test.ts: Archive storage operations
  • src/core/archive-compactor.test.ts: Compaction logic
  • src/core/auto-archive.test.ts: Auto-archive criteria

Test cases:

  1. Compaction: Full task → compacted size reduction
  2. Roll-up: Epic with children → archived_children format
  3. Auto-archive: Time + count criteria validation
  4. Active ancestors: Cannot archive with active parent
  5. Complete lineages: Only archive complete branches
  6. Manual archive: Command validation and error cases
  7. Query integration: List/show/search with archives
  8. Blocking cleanup: Remove archived task from blockedBy arrays
  9. GitHub metadata: Preserve issue links in archive
  10. Performance: Large dataset (1000 active + 5000 archived)

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Added comprehensive archive integration tests:

  • list --archived flag tests (filtering, JSON output, empty state)
  • show command tests for archived tasks (details, GitHub metadata, archived children, JSON)
  • Performance tests for large datasets (5000 tasks read/write/search)
  • Extracted createArchivedTask helper to test-helpers.ts

All 722 tests pass. Commit: 562a457

[x] Integrate archive into query operations xe0pfmym

Description

Update TaskService in src/core/task-service.ts to support archive queries.

Update methods:

  • list(): Add 'archived' parameter to list only archived tasks

    • Default: only active tasks
    • archived=true: only archived tasks
    • all=true: both active and archived
    • Show count: 'Showing X tasks (Y archived)'
  • get(): Auto-check archive if task not in active store

    • First check tasks.jsonl
    • If not found, check archive.jsonl
    • Return Task | ArchivedTask | null
  • search(): Add 'includeArchive' parameter

    • Default: only active tasks
    • includeArchive=true: search both files
    • Return Array<Task | ArchivedTask>

Handle ArchivedTask display (note missing context, show archived_children).
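A small sketch of the lookup order and list modes described above, operating on in-memory arrays rather than the real stores. The simplified Task and ArchivedTask shapes are assumptions, as are the function names and the use of archived_at as the discriminating field.

```typescript
// Simplified, assumed shapes; only the fields needed for the sketch.
interface Task {
  id: string;
  description: string;
  context: string;
}

interface ArchivedTask {
  id: string;
  description: string;
  archived_at: string;
}

/** Type guard: archived tasks carry archived_at, active tasks do not. */
function isArchivedTask(task: Task | ArchivedTask): task is ArchivedTask {
  return "archived_at" in task;
}

/** Check the active store first, then fall back to the archive. */
function getWithArchive(
  id: string,
  active: Task[],
  archive: ArchivedTask[],
): Task | ArchivedTask | null {
  return active.find((t) => t.id === id) ?? archive.find((t) => t.id === id) ?? null;
}

/** Default: active only. archived=true: archive only. all=true: both. */
function listTasks(
  active: Task[],
  archive: ArchivedTask[],
  options: { archived?: boolean; all?: boolean } = {},
): Array<Task | ArchivedTask> {
  if (options.all) return [...active, ...archive];
  return options.archived ? archive : active;
}

const activeStore: Task[] = [{ id: "live1", description: "Write docs", context: "long plan" }];
const archiveStore: ArchivedTask[] = [
  { id: "abc123", description: "Add Auth", archived_at: "2024-06-01T00:00:00.000Z" },
];

const found = getWithArchive("abc123", activeStore, archiveStore);
const label =
  found === null ? "not found" : isArchivedTask(found) ? "(ARCHIVED)" : "(active)";
const everything = listTasks(activeStore, archiveStore, { all: true });
const archivedCount = everything.filter(isArchivedTask).length;
```

The type guard lets display code branch once and then rely on the narrowed type, for example to note the missing context on archived tasks.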

This is part of 'Archive Strategy: Compacted Task History' (task t64hfub3).

Result

Integrated archive queries into TaskService:

  • Added getWithArchive() method that auto-checks archive if task not found in active store
  • Added listArchived() method to list archived tasks with optional query filter
  • Added search() method with includeArchive option to search both active and archived tasks
  • Added isArchivedTask() type guard for distinguishing Task from ArchivedTask
  • Refactored CLI list and show commands to use new TaskService methods instead of direct ArchiveStorage access
  • Added 'archived' field to ListTasksInput type

All 720 tests pass.
