[improvement]: Add playback progress acknowledgment for WebSocket media (per-chunk or byte-level acknowledgment) #1574

@gauravs456

Description

Is your feature or improvement request related to a problem? Please describe.

Asterisk’s WebSocket media driver (chan_websocket) lets external applications (such as AI voice bots) stream binary audio into a channel, but it provides no mechanism for the application to learn when a specific portion of that audio has actually been played.

This causes synchronization issues for AI-driven real-time streaming systems that generate audio dynamically (e.g., Text-to-Speech or conversational AI). Without playback progress acknowledgment, the application has no way to determine when the audio it sent has finished playing.

This leads to two major problems:

  1. Over-buffering: the application keeps sending new audio before earlier audio has finished playing, which increases latency.
  2. Under-buffering: the application waits too long to send the next chunk, which causes gaps in playback.

Describe the solution you'd like

Introduce a playback progress acknowledgment mechanism for the WebSocket media driver.

Possible designs:

  1. Mark-based acknowledgment

    • Allow clients to send a MARK id=<uuid> text control command that sets a logical boundary in the playback queue.
    • When Asterisk finishes playing all media queued before that mark, it responds with MARK_PLAYED id=<uuid>.

    Example:
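A possible exchange (illustrative only; the exact wire format does not exist yet and would be settled during implementation):

```
App → Asterisk   <binary frames: audio for sentence 1>
App → Asterisk   MARK id=<uuid-1>              (text frame)
App → Asterisk   <binary frames: audio for sentence 2>
Asterisk → App   MARK_PLAYED id=<uuid-1>       (sentence 1 has fully played)
```

And a minimal client sketch in Python, using the third-party `websockets` package. The `MARK`/`MARK_PLAYED` strings are the proposed (not yet existing) messages, and the URL and TTS source are placeholders:

```python
import asyncio
import uuid

import websockets  # third-party: pip install websockets

ASTERISK_WS = "ws://asterisk.example.com:8088/media"  # placeholder URL


def tts_segments():
    """Stand-in for a streaming TTS engine; yields raw mu-law chunks."""
    yield b"\xff" * 1600  # roughly 200 ms of mu-law silence at 8 kHz


async def play_and_wait(ws, audio: bytes) -> None:
    """Send one audio segment followed by a MARK, then block until the
    matching MARK_PLAYED arrives (proposed semantics, per this issue)."""
    mark_id = str(uuid.uuid4())
    await ws.send(audio)                 # binary frame: media toward the channel
    await ws.send(f"MARK id={mark_id}")  # text frame: proposed control message
    while True:
        msg = await ws.recv()
        # Skip binary frames (inbound call audio) and unrelated control text.
        if isinstance(msg, str) and msg == f"MARK_PLAYED id={mark_id}":
            return


async def main():
    async with websockets.connect(ASTERISK_WS) as ws:
        for segment in tts_segments():
            await play_and_wait(ws, segment)


asyncio.run(main())
```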

Describe alternatives you've considered

We explored existing Asterisk WebSocket control messages:

  • REPORT_QUEUE_DRAINED: Notifies only when the entire queue is empty, so it cannot acknowledge partial playback (the closest workaround today; see the sketch after this list).
  • FLUSH_MEDIA: Clears buffered audio but provides no confirmation of playback.
  • MEDIA_XOFF / MEDIA_XON: Flow control only, unrelated to playback timing.
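With today's messages, the closest approximation is whole-queue granularity. A rough sketch, assuming the notification arrives as a text frame containing the literal string `REPORT_QUEUE_DRAINED` (the exact framing is whatever chan_websocket defines):

```python
async def send_and_drain(ws, audio: bytes) -> None:
    """All-or-nothing synchronization: wait until Asterisk reports the
    entire queue empty. The client cannot overlap generating segment
    N+1 with playback of segment N, which is exactly the limitation
    this feature request targets."""
    await ws.send(audio)
    while True:
        msg = await ws.recv()
        if isinstance(msg, str) and "REPORT_QUEUE_DRAINED" in msg:
            return
```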

We also tried estimating playback timing locally from real-time audio consumption (1 ms per 8 bytes for μ-law), but this is only an approximation and cannot confirm actual playback progress inside Asterisk.
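For reference, the arithmetic behind that estimate: μ-law telephony audio is 8000 one-byte samples per second, i.e. 8 bytes per millisecond, so a 1600-byte chunk represents 200 ms of speech. A local pacing loop built on that looks roughly like the sketch below (illustrative only; it cannot observe Asterisk's real playout position, network delay, or jitter buffering):

```python
import asyncio
import time

ULAW_BYTES_PER_MS = 8  # 8000 samples/s * 1 byte/sample = 8 bytes/ms


async def paced_send(ws, chunks, lead_ms: float = 200.0) -> None:
    """Throttle sending so the estimated send position never runs more
    than lead_ms ahead of wall-clock time. Only an estimate: the true
    playout position inside Asterisk can drift, which is why real
    acknowledgments are being requested."""
    start = time.monotonic()
    sent_ms = 0.0
    for chunk in chunks:
        elapsed_ms = (time.monotonic() - start) * 1000.0
        ahead_ms = sent_ms - elapsed_ms
        if ahead_ms > lead_ms:
            await asyncio.sleep((ahead_ms - lead_ms) / 1000.0)
        await ws.send(chunk)
        sent_ms += len(chunk) / ULAW_BYTES_PER_MS
```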

Additional context

Use case: AI-driven voice bots and real-time speech generation systems streaming audio to Asterisk over WebSocket.

Modern AI engines (like the OpenAI Realtime API, ElevenLabs, or custom TTS models) generate audio in small, variable-sized chunks. These systems need to know when specific chunks have been played so they can dynamically:

  • Generate the next segment of speech
  • Handle interruptions or “barge-in” events (see the sketch after this list)
  • Avoid excessive buffering and latency
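For the barge-in case, the existing FLUSH_MEDIA message could be combined with the proposed marks. A hypothetical sketch, where `pending_marks` is a client-side set of MARK ids that have not yet been acknowledged:

```python
async def handle_barge_in(ws, pending_marks: set) -> None:
    """Caller started speaking: discard queued bot audio and forget
    marks that will never fire for the flushed audio."""
    await ws.send("FLUSH_MEDIA")  # existing control message: clears buffered audio
    pending_marks.clear()         # proposed MARKs covering flushed audio won't be played
```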

Adding per-chunk or progress-based acknowledgment would significantly improve synchronization for real-time applications and make Asterisk more compatible with emerging AI voice technologies.

Proposed area: WebSocket Media Driver (chan_websocket)
