No audio splitting for batch ASR providers (e.g. GLM-ASR 30s limit) #508

@dempsey-wen

Problem

OpenLess sends the entire recording as a single WAV file to the ASR provider. There is no audio splitting or chunking logic; incoming PCM is appended to one buffer, which is then uploaded in a single request:

fn consume_pcm_chunk(&self, pcm: &[u8]) {
    self.buffer.lock().extend_from_slice(pcm);
}

This works fine for streaming providers (volcengine WebSocket, Qwen realtime), but fails for batch/file-upload ASR providers that have a per-request duration cap.

Affected providers

| Provider | Interface | Max duration |
|---|---|---|
| zhipu GLM-ASR | HTTP batch | 30 seconds |
| volcengine | WebSocket streaming | No hard limit |
| Qwen ASR (realtime) | WebSocket streaming | No hard limit |
| Local Qwen3-ASR | Local inference | No limit |

With GLM-ASR, when a recording exceeds 30 seconds, the API rejects the request and transcription fails silently: no error is shown in the UI.

Suggested fix

  • Detect the recording duration before sending.
  • If the duration exceeds the provider's limit, split the audio into chunks (e.g. at silence boundaries), send each chunk separately, and concatenate the results.
  • Alternatively, show a warning or auto-stop recording when approaching the limit for batch providers.
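
The first two steps could look roughly like the sketch below. It is not taken from the OpenLess codebase: `PcmFormat`, `duration_of`, and `split_pcm` are hypothetical names, and the 16 kHz mono 16-bit layout used in the usage note is an assumption about the recording format. It splits at fixed frame-aligned offsets; choosing split points at silence boundaries would be a refinement on top of this.

```rust
use std::time::Duration;

/// Hypothetical description of the raw PCM layout (an assumption;
/// the real OpenLess recording format may differ).
/// `channels` and `bytes_per_sample` must be non-zero.
#[derive(Clone, Copy)]
pub struct PcmFormat {
    pub sample_rate: u32,      // frames per second
    pub channels: u16,         // 1 = mono
    pub bytes_per_sample: u16, // 2 = 16-bit PCM
}

impl PcmFormat {
    /// Bytes occupied by one frame (one sample for every channel).
    pub fn frame_bytes(&self) -> usize {
        self.channels as usize * self.bytes_per_sample as usize
    }

    /// Duration of a raw PCM buffer, ignoring any trailing partial frame.
    pub fn duration_of(&self, pcm: &[u8]) -> Duration {
        let frames = pcm.len() / self.frame_bytes();
        Duration::from_secs_f64(frames as f64 / self.sample_rate as f64)
    }
}

/// Split `pcm` into consecutive slices, each at most `max` long.
/// Chunk boundaries always fall on whole frames, so no sample is cut in half.
pub fn split_pcm<'a>(pcm: &'a [u8], fmt: PcmFormat, max: Duration) -> Vec<&'a [u8]> {
    let max_frames = (max.as_secs_f64() * fmt.sample_rate as f64).floor() as usize;
    // Guard against a zero-length chunk size, which `chunks` rejects.
    let chunk_bytes = max_frames.max(1) * fmt.frame_bytes();
    pcm.chunks(chunk_bytes).collect()
}
```

Under those assumptions, a 65-second recording split with a 30-second cap yields three slices of 30 s, 30 s, and 5 s. Each slice would still need its own WAV header before upload, and the per-chunk transcripts would be joined in order. Using a cap slightly below the provider's limit may be safer, since the exact way the provider measures duration is not specified here.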

Environment

  • OpenLess version: 1.3.2
  • OS: Linux (Ubuntu, kernel 6.8.0)

Labels: bug