Feature: Task TTL / automatic expiry #17

@deepjoy

Summary

Allow tasks to declare a time-to-live (TTL) after which they are automatically expired and removed from the queue without executing.

Motivation

In a continuous sync engine, tasks represent actions against a point-in-time snapshot of the world. If a task sits in the queue long enough, the world may have changed:

  • The source file was deleted before the upload task ran
  • A newer version was synced via a different code path (event-driven vs. poll-driven)
  • The conflict that generated a resolution task was already resolved manually

Executing stale tasks wastes bandwidth, can produce incorrect results, and may conflict with newer state. Today, consumers must build their own staleness checks inside every TaskExecutor::execute. A TTL at the scheduler level would handle this generically.

Proposed Behavior

  • TaskSubmission gains a TTL:
    TaskSubmission::new("file-transfer")
        .ttl(Duration::from_secs(3600))  // expire if not started within 1 hour
        .payload_json(&plan)?
  • TTL semantics:
    • The TTL clock starts at submission time (not when the task becomes runnable)
    • If the task hasn't started executing by the time the TTL expires, it is moved to history with HistoryStatus::Expired
    • A task that has already started is not affected by TTL (it runs to completion or failure)
    • Children inherit the parent's remaining TTL by default, but can override it with their own
  • Expired tasks trigger a SchedulerEvent::TaskExpired { task_id, dedup_key, age } event
  • A global default TTL can be set on the scheduler:
    Scheduler::builder()
        .default_ttl(Duration::from_secs(7200))  // 2h default
        .build();
  • Per-task-type TTL override at registration:
    registry.register::<CleanupExecutor>(TaskTypeConfig {
        default_ttl: Some(Duration::from_secs(86400)),  // cleanup tasks can wait longer
        ..Default::default()
    });
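
The semantics above can be sketched as a few pure functions. This is only an illustration, not the scheduler's actual API: TtlSources, effective, is_expired, and child_ttl are hypothetical names, and the precedence order (per-submission TTL, then per-task-type default, then scheduler default) is an assumption, since the proposal does not spell it out.

```rust
use std::time::{Duration, Instant};

/// Hypothetical: the three places a TTL can come from in this proposal.
#[derive(Debug, Clone, Copy, Default)]
pub struct TtlSources {
    pub submission: Option<Duration>, // TaskSubmission::ttl(..)
    pub task_type: Option<Duration>,  // TaskTypeConfig::default_ttl
    pub scheduler: Option<Duration>,  // Scheduler::builder().default_ttl(..)
}

impl TtlSources {
    /// Assumed precedence: most specific setting wins.
    pub fn effective(&self) -> Option<Duration> {
        self.submission.or(self.task_type).or(self.scheduler)
    }
}

/// A task expires once `ttl` has elapsed since submission,
/// but only if it has not started executing yet.
pub fn is_expired(
    submitted_at: Instant,
    ttl: Option<Duration>,
    started: bool,
    now: Instant,
) -> bool {
    match ttl {
        Some(ttl) if !started => now.saturating_duration_since(submitted_at) >= ttl,
        _ => false, // no TTL configured, or the task is already running
    }
}

/// A child uses its own TTL if it sets one; otherwise it inherits
/// whatever is left of the parent's TTL at the time the child is created.
pub fn child_ttl(
    parent_submitted_at: Instant,
    parent_ttl: Option<Duration>,
    child_override: Option<Duration>,
    now: Instant,
) -> Option<Duration> {
    child_override.or_else(|| {
        let elapsed = now.saturating_duration_since(parent_submitted_at);
        parent_ttl.map(|ttl| ttl.saturating_sub(elapsed))
    })
}
```

Under these assumptions, a child spawned 10 minutes into a parent's 1-hour TTL would get the remaining 50 minutes unless it sets its own value.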

Example: Watch Mode Staleness

// During watch, submit transfers with a TTL derived from the poll interval.
// If the next poll cycle runs before this task executes, the differ will
// produce a fresh action anyway, so there is no point running the stale one.
scheduler.submit(
    TaskSubmission::new("file-transfer")
        .dedup_key(&format!("transfer:{profile}:{key}"))
        .ttl(poll_interval * 2)  // generous buffer
        .payload_json(&plan)?
).await?;

Design Considerations

  • TTL expiry checks should be efficient: the scheduler shouldn't scan all tasks on every tick. A sorted index on the expiry time (submitted_at + ttl) in SQLite would let a periodic sweep (e.g. every 30s) find expired tasks without a full scan
  • Expiring a parent task should expire all its pending children. Whether children that are already running are allowed to complete or are cancelled should be configurable
  • TTL should interact correctly with retry: if a task fails and is requeued for retry, does the original TTL still apply, or does the retry get a fresh TTL? (Recommend: original TTL still applies — retries don't extend the window)
  • Consider a ttl_from option: Submission (default) vs. FirstAttempt for cases where queue wait time shouldn't count against execution attempts
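
To make the sweep and ttl_from ideas concrete, here is a minimal in-memory sketch. It uses a BTreeSet ordered by deadline as a stand-in for the SQLite index, so the sweep only touches entries that have actually expired. ExpiryIndex, TaskId, TtlFrom, and deadline are hypothetical names, and the FirstAttempt interpretation (no deadline until the first attempt, after which the clock runs through all retries) is an assumption rather than something the proposal specifies.

```rust
use std::collections::BTreeSet;
use std::time::{Duration, Instant};

pub type TaskId = u64;

/// Hypothetical in-memory stand-in for a sorted index on the expiry time.
#[derive(Default)]
pub struct ExpiryIndex {
    // Ordered by (deadline, id); the id keeps equal deadlines distinct.
    pending: BTreeSet<(Instant, TaskId)>,
}

impl ExpiryIndex {
    /// Track a pending task whose deadline is `submitted_at + ttl`.
    pub fn insert(&mut self, id: TaskId, submitted_at: Instant, ttl: Duration) {
        self.pending.insert((submitted_at + ttl, id));
    }

    /// Stop tracking a task (e.g. when it starts executing).
    /// Returns false if the task was not in the index.
    pub fn remove(&mut self, id: TaskId, submitted_at: Instant, ttl: Duration) -> bool {
        self.pending.remove(&(submitted_at + ttl, id))
    }

    /// Remove and return every task whose deadline is at or before `now`.
    /// Work is proportional to the number of expired tasks, not the queue size.
    pub fn sweep(&mut self, now: Instant) -> Vec<TaskId> {
        let mut expired = Vec::new();
        while let Some(&(deadline, id)) = self.pending.first() {
            if deadline > now {
                break; // everything after this entry expires later
            }
            self.pending.pop_first();
            expired.push(id);
        }
        expired
    }
}

/// Hypothetical mirror of the proposed ttl_from option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TtlFrom {
    Submission,
    FirstAttempt,
}

/// Deadline for a task, including its retries. Retries never get a fresh
/// TTL here, matching the recommendation that they don't extend the window.
pub fn deadline(
    ttl_from: TtlFrom,
    submitted_at: Instant,
    first_attempt_at: Option<Instant>,
    ttl: Duration,
) -> Option<Instant> {
    match ttl_from {
        TtlFrom::Submission => Some(submitted_at + ttl),
        // Assumed: queue wait is free; the clock starts at the first attempt.
        TtlFrom::FirstAttempt => first_attempt_at.map(|t| t + ttl),
    }
}
```

In a SQLite-backed scheduler the same shape would presumably become a range query on an indexed expiry column, run by the periodic sweep.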
