RD-623: Argmax OSS engine and pipelines (replace legacy WhisperKit transcription)#99

Merged: EduardoPach merged 2 commits into main from eduardo/rd-623 on May 12, 2026

@EduardoPach
Collaborator

What does this PR do?

This change integrates the Argmax open-source Swift CLI (argmax-cli from argmaxinc/argmax-oss-swift) into OpenBench as a first-class engine and pipeline set, and retires the old Python WhisperKit transcription module that wrapped a different code path.

Engine

  • Adds ArgmaxOpenSourceEngine (argmax_oss_engine.py). It resolves the CLI from an optional cli_path; otherwise it clones the repository and runs swift build -c release --product argmax-cli under ARGMAX_OSS_CACHE_DIR (default ~/.cache/openbench/argmax-oss), optionally pinned to a commit_hash.
  • Exposes transcribe and diarize helpers that shell out to argmax-cli with caller-supplied flag lists.
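
The resolution order described above can be sketched roughly as follows. This is an illustration only, not the code in argmax_oss_engine.py: the function name plan_cli_resolution, the checkout directory name, the HTTPS form of the repository URL, and the .build/release output location (SwiftPM's usual layout) are all assumptions. The sketch only plans the commands and does not run them.

```python
import os
from pathlib import Path
from typing import List, Optional, Tuple

# Assumed HTTPS form of the argmaxinc/argmax-oss-swift repository named in the PR.
REPO_URL = "https://github.com/argmaxinc/argmax-oss-swift.git"
DEFAULT_CACHE_DIR = "~/.cache/openbench/argmax-oss"


def cache_dir() -> Path:
    """Cache root: ARGMAX_OSS_CACHE_DIR if set, else the documented default."""
    return Path(os.environ.get("ARGMAX_OSS_CACHE_DIR", DEFAULT_CACHE_DIR)).expanduser()


def plan_cli_resolution(
    cli_path: Optional[str] = None,
    commit_hash: Optional[str] = None,
) -> Tuple[Path, List[List[str]]]:
    """Return (expected binary path, commands that would produce it).

    Hypothetical helper: it only builds the command lists so the order of
    operations is visible; a real engine would run them with subprocess.
    """
    if cli_path:
        # An explicit binary wins; nothing to clone or build.
        return Path(cli_path), []

    repo = cache_dir() / "argmax-oss-swift"  # checkout directory name is an assumption
    commands: List[List[str]] = []
    if not repo.exists():
        commands.append(["git", "clone", REPO_URL, str(repo)])
    if commit_hash:
        # Optional pin to a specific commit.
        commands.append(["git", "-C", str(repo), "checkout", commit_hash])
    commands.append(
        ["swift", "build", "-c", "release", "--product", "argmax-cli",
         "--package-path", str(repo)]
    )
    # SwiftPM conventionally exposes release products under .build/release.
    return repo / ".build" / "release" / "argmax-cli", commands
```

Calling plan_cli_resolution(cli_path="/opt/bin/argmax-cli") returns that path with an empty command list, which mirrors the "optional cli_path" short-circuit in the description.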

Pipelines

  • Transcription: ArgmaxOpenSourceTranscriptionPipeline runs transcribe and parses segments and words from the JSON report (including per-word timings when present).
  • Diarization: ArgmaxOpenSourceDiarizationPipeline runs diarize and produces RTTM compatible with the existing annotation loading.
  • Orchestration: ArgmaxOpenSourceOrchestrationPipeline runs diarize, then transcribe, and merges speaker labels into the transcript using word timings.
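
One common way to merge speaker labels by word timing is to give each word the speaker whose diarization turn overlaps it the most. The sketch below shows that idea under stated assumptions: the Word and Turn types and the assign_speakers function are hypothetical, and the PR does not say which merge rule the orchestration pipeline actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """A transcribed word with start/end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float
    speaker: Optional[str] = None


# A diarization turn as (start, end, speaker label), e.g. one RTTM row.
Turn = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Word]:
    """Label each word with the speaker of the turn it overlaps most.

    Words that overlap no turn keep speaker=None.
    """
    labeled = []
    for w in words:
        best = max(
            turns,
            key=lambda t: overlap(w.start, w.end, t[0], t[1]),
            default=None,
        )
        speaker = None
        if best is not None and overlap(w.start, w.end, best[0], best[1]) > 0:
            speaker = best[2]
        labeled.append(Word(w.text, w.start, w.end, speaker))
    return labeled
```

A word that straddles a turn boundary goes to whichever side covers more of it, which is why per-word timings matter here; with segment-level timings only, a whole segment would have to take a single label.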

Aliases and removals

  • Legacy whisperkit.py transcription pipeline is removed.
  • Existing whisperkit-* transcription aliases (e.g. tiny, large-v3, large-v3-turbo) now target the OSS transcription pipeline and configs (same alias names for benchmark continuity).
  • New aliases: argmax-oss-diarization, argmax-oss-orchestration-tiny (plus orchestration config pattern consistent with other orchestration entries).

Testing

  • Run openbench-cli evaluate (or a pipeline smoke test) with a whisperkit-* or argmax-oss-* alias, after making sure argmax-cli is built or cli_path is set.

Introduce argmax_oss_engine and wire transcription, diarization, and
orchestration pipelines plus pipeline aliases. Remove legacy whisperkit
transcription module in favor of Argmax OSS.

Made-with: Cursor
@EduardoPach EduardoPach requested review from arda-argmax and dbrkn May 1, 2026 13:43
    args.extend(["--model-path", self.model_path])
if self.model_repo:
    args.extend(["--model-repo", self.model_repo])
if self.model_token:
Contributor

Asked Claude for a review, and it pointed out the following potential token leak issue:

model_token is described as being “scrubbed from errors”, but:

  • logger.debug("Argmax OSS transcribe: %s", cmd) logs the full command, including --model-token
  • RuntimeError(f"argmax-cli ... failed: {e.stderr}") may also expose the token

We should either redact tokens from logs/exceptions or pass --model-token via environment variables or stdin instead of CLI arguments.
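
A minimal sketch of the redaction option, assuming the command is built as a list of strings as in the snippet under review. The helper names redact_command and scrub_text are hypothetical and not part of the PR. Whether argmax-cli can read the token from an environment variable or stdin is not stated here, so that alternative is not shown.

```python
from typing import Iterable, List, Sequence

# Flags whose following value must never reach logs (assumed set; extend as needed).
SECRET_FLAGS = frozenset({"--model-token"})


def redact_command(
    cmd: Sequence[str],
    secret_flags: Iterable[str] = SECRET_FLAGS,
    mask: str = "***",
) -> List[str]:
    """Return a copy of cmd that is safe to log.

    Handles both "--flag value" and "--flag=value" forms.
    """
    flags = set(secret_flags)
    redacted: List[str] = []
    hide_next = False
    for arg in cmd:
        if hide_next:
            redacted.append(mask)
            hide_next = False
        elif arg in flags:
            redacted.append(arg)
            hide_next = True
        elif any(arg.startswith(flag + "=") for flag in flags):
            redacted.append(arg.split("=", 1)[0] + "=" + mask)
        else:
            redacted.append(arg)
    return redacted


def scrub_text(text: str, secrets: Iterable[str], mask: str = "***") -> str:
    """Mask every occurrence of each non-empty secret in text (e.g. captured stderr)."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, mask)
    return text
```

With helpers like these, the debug call would log redact_command(cmd) and the RuntimeError message would carry scrub_text(e.stderr, [self.model_token]). Redaction only protects logs and exceptions; a token passed as a CLI argument is still visible in the process list while the command runs.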

Contributor

@dbrkn left a comment

lgtm

@EduardoPach EduardoPach merged commit 9966861 into main May 12, 2026
2 checks passed
@EduardoPach EduardoPach deleted the eduardo/rd-623 branch May 12, 2026 18:59
EduardoPach added a commit that referenced this pull request May 22, 2026
…line

Extend the engine with a tts() method mirroring transcribe()/diarize(),
add a new ArgmaxOpenSourceSpeechGenerationPipeline that uses it, and
retire the old WhisperKit-based speech-gen pipeline (whisperkit-cli no
longer exists since #99).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>