parakeet-cli: support stdin / pipe input (--input -) #38

Description

@HaleTom

Problem

parakeet-cli transcribe --input only accepts a file path. There is no way to pipe audio from stdin (e.g. from ffmpeg or curl), forcing a temp-file workaround.

Repro

model=~/.cache/huggingface/models/kashif3314/nemotron-3.5-asr-streaming-0.6b-gguf/nemotron-3.5-asr-streaming-0.6b-q4_k.gguf
url='https://raw.githubusercontent.com/yaph/tts-samples/main/mp3/English/en-CA-ClaraNeural.mp3'

# Attempt 1: process substitution — hangs / extremely slow
curl -fsSL "$url" \
  | ffmpeg -i pipe:0 -codec:a pcm_f32le -ar 16000 -ac 1 -f wav pipe:1 \
  | parakeet-cli transcribe --model "${model?}" --input <(cat)
# → speed=0.0569x, had to Ctrl-C

# Workaround: temp file — works fine (near-realtime)
curl -fsSL "$url" \
  | ffmpeg -i pipe:0 -codec:a pcm_f32le -ar 16000 -ac 1 -f wav pipe:1 > test.wav
parakeet-cli transcribe --model "${model?}" --input test.wav
# → duration 0.8 sec (audio is about 5 seconds long)

Root cause

audio_io.cpp uses drwav_open_file_and_read_pcm_frames_f32(path), which opens the file via fopen. dr_wav seeks within the file to parse the WAV header and locate the data chunk. A pipe (named or anonymous) is not seekable, so seeking either fails or degrades to a slow read-all path.
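
To illustrate the seek limitation on its own, here is a minimal POSIX sketch. The helper name is_seekable is hypothetical and is not part of parakeet.cpp or dr_wav. It only shows that the kernel refuses to seek on a pipe (lseek fails with ESPIPE) and makes no claim about how dr_wav reacts internally.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: true when the descriptor supports seeking.
// Regular files do; pipes and FIFOs make lseek fail with errno == ESPIPE.
bool is_seekable(int fd) {
    return lseek(fd, 0, SEEK_CUR) != static_cast<off_t>(-1);
}
```

With --input <(cat), the shell hands the program a path that refers to a pipe, so a check like this would report false for it, whereas test.wav on disk would report true.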

Suggested fix

Support --input - to read WAV from stdin. The implementation would:

  1. Read stdin into a std::vector<uint8_t> buffer
  2. Call drwav_open_memory_and_read_pcm_frames_f32() on the buffer

Decoding from memory avoids the seek problem entirely, and - for stdin is the standard convention used by ffmpeg, whisper.cpp, and most CLI tools.

Environment

  • parakeet.cpp built from master
  • Linux, Vulkan backend (AMD Radeon 880M)
  • ffmpeg n8.1.1
