Skip to content

taquba-bulk-v0.1.0

Choose a tag to compare

@micllam micllam released this 30 May 12:23
· 65 commits to master since this release
bb3fd01

Initial release. Per-batch orchestrator that runs one pipeline over many
inputs in a single process on top of taquba-workflow.

Added

  • Pipeline: the per-item contract (typed Input / Output, an Error
    that converts into a StepError, and an async run). Each input item
    becomes one taquba-workflow run whose single step invokes run; the
    pipeline's own logical steps live inside run as BulkCtx::memoized
    calls.
  • BulkCtx<T>: per-item execution context. Carries the typed input,
    run_id, and submitter headers; exposes memoized (durable per-step
    result caching so an at-least-once retry replays cached results instead of
    repeating a paid call), record_cost, and cancel_token.
  • CostReport: generic named-metric accumulator (token counts, paid-API
    units, compute-seconds, dollars). Interior-mutable while a step runs and
    serializable for the per-item envelope and the batch rollup.
  • Bulk / BulkBuilder: the runner. Submits N runs, drives the worker pool,
    streams output as items complete, and aggregates progress and cost.
    Builder options: output, key_fn, headers, max_concurrent,
    poll_interval, queue_name, memo_prefix, fail_threshold. run
    executes to completion; run_with_shutdown drains in-flight items on a
    shutdown signal (e.g. spot preemption).
  • ProgressSnapshot: point-in-time counts, rate, estimated time remaining,
    and cost rollup, returned by Bulk::progress.
  • BulkReport: final counts, elapsed time, cost rollup, and
    failed_run_ids (re-submitting those ids resumes from cached memo state).
  • OutputSink with JsonlSink (one JSON record per line) and NullSink
    (discards records, for side-effecting pipelines); read_jsonl for
    line-delimited JSON input.
  • Error / Result: crate error type, including
    Error::FailureThresholdExceeded when the share of failed items crosses
    the configured threshold.
  • Re-exports StepError and StepErrorKind from taquba-workflow for the
    Pipeline::Error type.