Add bounded batch parity for embeddings and transcription #3

Merged: fcogidi merged 5 commits into main from feat/endpoint_parity on Apr 17, 2026

Conversation

@fcogidi (Collaborator) commented Apr 17, 2026

Summary

  • Introduce batch transcription APIs and resilient embedding micro-batching with failure isolation
  • Add lazy, bounded batch execution for embedding and transcription paths with ordered results and per-item callbacks
  • Split batch orchestration into private helpers, keep public LMClient methods explicit, and update docs plus benchmark accounting
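The "lazy, bounded batch execution ... with ordered results and per-item callbacks" pattern can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `run_bounded`, `max_in_flight`, and the `(index, value, error)` callback shape are hypothetical names chosen for the example. Inputs are pulled from the iterable only when a slot frees up, failures are stored in place of the result instead of aborting the batch, and the output list follows input order.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable
from typing import Any, Optional

# Hypothetical callback shape: (input index, value or None, error or None).
OnResult = Callable[[int, Any, Optional[BaseException]], None]


async def run_bounded(
    items: Iterable[Any],
    worker: Callable[[Any], Awaitable[Any]],
    *,
    max_in_flight: int = 4,
    on_result: Optional[OnResult] = None,
) -> list[Any]:
    """Run `worker` over `items` with at most `max_in_flight` live tasks."""
    iterator = iter(enumerate(items))  # consumed lazily, one slot at a time
    pending: dict[asyncio.Future, int] = {}
    outcomes: dict[int, Any] = {}
    exhausted = False

    while True:
        # Admit new work only while there is a free slot.
        while not exhausted and len(pending) < max_in_flight:
            try:
                index, item = next(iterator)
            except StopIteration:
                exhausted = True
                break
            pending[asyncio.ensure_future(worker(item))] = index

        if not pending:
            break

        done, _ = await asyncio.wait(
            set(pending), return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            index = pending.pop(task)
            error = task.exception()
            value = None if error is not None else task.result()
            # Keep the exception in the item's slot so one failure
            # does not discard the rest of the batch.
            outcomes[index] = error if error is not None else value
            if on_result is not None:
                on_result(index, value, error)

    # Completion order is arbitrary; the returned list follows input order.
    return [outcomes[i] for i in range(len(outcomes))]
```

The callback fires in completion order while the returned list stays in input order, which is one plausible reading of "ordered results and per-item callbacks".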

Testing

  • uv run pytest -q
  • uv run pre-commit run -a

@fcogidi requested a review from Copilot April 17, 2026 20:05
@fcogidi marked this pull request as ready for review April 17, 2026 20:05

Copilot AI left a comment

Pull request overview

This PR adds bounded/lazy batch execution parity across embeddings and transcription, including resilient embedding micro-batching with per-item failure isolation and per-item callbacks, plus updates to docs and CLI benchmark accounting to match the new behavior.

Changes:

  • Refactors embedding + transcription request paths into private helpers and adds bounded batch runners with ordered results and per-item callbacks.
  • Introduces transcribe_batch / atranscribe_batch APIs and extends embedding batch APIs with micro_batch_size, on_progress, and on_result.
  • Updates CLI embed benchmarking to account for micro-batching/partial failures per vector, and adds tests/docs for the new batch behavior.
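As a rough illustration of what "per vector" accounting could mean for the embed benchmark: with micro-batching, one request can yield a mix of vectors and errors, so counters are kept per input rather than per request. The sketch below assumes a batch result is a list holding either a vector or an exception for each input; `EmbedTally` and `tally_per_vector` are hypothetical names, not identifiers from `_cli_bench.py`.

```python
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class EmbedTally:
    """Hypothetical per-vector counters for an embed benchmark run."""

    submitted: int = 0
    succeeded: int = 0
    failed: int = 0


def tally_per_vector(outcomes: Iterable[Any]) -> EmbedTally:
    """Count each input once, whether it produced a vector or an error."""
    tally = EmbedTally()
    for outcome in outcomes:
        tally.submitted += 1
        if isinstance(outcome, BaseException):
            tally.failed += 1
        else:
            tally.succeeded += 1
    return tally
```

Counting this way keeps the success rate meaningful when only part of a micro-batch fails.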

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/test_client_limits.py | Adds coverage ensuring embedding queue-wait metrics include scheduler delay under bounded admission. |
| tests/test_client_batch.py | Expands batch tests for embedding failure isolation, callbacks, strict-mode cancellation, and transcription batch behavior. |
| tests/test_cli_bench.py | Adds benchmark tests for micro-batch sizing propagation and partial-failure accounting. |
| tests/fakes.py | Extends fake CLI client to record micro_batch_size for embed batch calls. |
| src/infermesh/types.py | Updates batch-result docs and introduces callback type aliases for batch methods. |
| src/infermesh/client.py | Wires embedding/transcription through new helpers; adds transcription batch APIs and embedding micro-batch/callback options. |
| src/infermesh/_transcription.py | New bounded/lazy async transcription batch runner + single-item helper. |
| src/infermesh/_generation.py | Switches to shared task-cancellation helper for bounded generation batches. |
| src/infermesh/_embedding.py | New resilient embedding micro-batch runner with recursive failure isolation and per-item callbacks. |
| src/infermesh/_cli_bench.py | Adjusts embed benchmark accounting to count per-vector submissions/successes/failures; dedupes per-request stats. |
| src/infermesh/_batch_utils.py | Adds shared cancel_tasks helper for internal batch runners. |
| docs/guide.md | Documents callback contract parity and new embed/transcribe batch behavior. |
| README.md | Updates examples and guidance for micro-batched embedding and transcription batch usage/callback parity. |
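For readers unfamiliar with the shared task-cancellation pattern mentioned for `_batch_utils.py`, a helper of this kind is often written along the following lines. Only the name `cancel_tasks` comes from the file summary; the signature and body here are assumptions, not the merged implementation.

```python
import asyncio
from collections.abc import Iterable


async def cancel_tasks(tasks: Iterable[asyncio.Task]) -> None:
    """Cancel unfinished tasks and wait until the cancellations land.

    Assumed behavior: finished tasks are left untouched, and errors from
    the cancelled tasks are swallowed rather than re-raised.
    """
    unfinished = [task for task in tasks if not task.done()]
    for task in unfinished:
        task.cancel()
    # return_exceptions=True collects CancelledError (and anything else)
    # as values, so awaiting here cannot raise on behalf of a cancelled task.
    await asyncio.gather(*unfinished, return_exceptions=True)
```

Awaiting the cancelled tasks, instead of only calling `cancel()`, avoids leaving in-flight requests running after a strict-mode batch has already failed.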

Comment thread src/infermesh/_transcription.py Outdated
Comment thread src/infermesh/_embedding.py
Comment thread src/infermesh/types.py
…sult typing

- Skip recursive micro-batch split on retryable errors in embed_batch
- Use bare raise in transcription batch error handler
- Define OnBatchResult as PEP 695 generic type alias
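The first item in the commit message can be pictured with a small sketch of recursive failure isolation. It assumes a synchronous `embed_many` callable and a hypothetical `RetryableError` marker; neither is taken from `_embedding.py`. A failing micro-batch is halved until the offending input is isolated, except when the error looks transient, in which case splitting would only multiply requests against a struggling backend. What the real code does with a retryable failure is not stated in the PR; here the whole micro-batch is simply marked as failed.

```python
from collections.abc import Callable, Sequence
from typing import Union


class RetryableError(Exception):
    """Hypothetical marker for transient failures (timeouts, rate limits)."""


Vector = list[float]
Outcome = Union[Vector, Exception]
EmbedMany = Callable[[Sequence[str]], list[Vector]]


def embed_isolating(texts: Sequence[str], embed_many: EmbedMany) -> list[Outcome]:
    """Embed one micro-batch, halving it on failure to isolate bad inputs."""
    try:
        return list(embed_many(texts))
    except RetryableError as exc:
        # Assumption: a transient error is not caused by any single input,
        # so no split is attempted and every item reports the same error.
        return [exc for _ in texts]
    except Exception as exc:
        if len(texts) <= 1:
            return [exc for _ in texts]
        mid = len(texts) // 2
        return embed_isolating(texts[:mid], embed_many) + embed_isolating(
            texts[mid:], embed_many
        )


def embed_micro_batched(
    texts: Sequence[str], embed_many: EmbedMany, micro_batch_size: int = 32
) -> list[Outcome]:
    """Split the input into micro-batches and keep results in input order."""
    outcomes: list[Outcome] = []
    for start in range(0, len(texts), micro_batch_size):
        chunk = texts[start : start + micro_batch_size]
        outcomes.extend(embed_isolating(chunk, embed_many))
    return outcomes
```

With a micro-batch of n inputs and a single bad item, the split costs roughly 2·log2(n) extra requests, which is why skipping it for transient errors matters under load.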

Made-with: Cursor
@fcogidi force-pushed the feat/endpoint_parity branch from 446a356 to e6cf0c1 April 17, 2026 20:28
@fcogidi merged commit 105af70 into main Apr 17, 2026
8 checks passed
@fcogidi deleted the feat/endpoint_parity branch April 17, 2026 20:29