Skip to content

fix(audio): add preferredFormat() fallback for WASAPI Float32 output devices#3231

Merged
jensenpat merged 1 commit into
aethersdr:mainfrom
NF0T:fix/vudu-wasapi-float-fallback
May 28, 2026
Merged

fix(audio): add preferredFormat() fallback for WASAPI Float32 output devices#3231
jensenpat merged 1 commit into
aethersdr:mainfrom
NF0T:fix/vudu-wasapi-float-fallback

Conversation

@NF0T
Copy link
Copy Markdown
Collaborator

@NF0T NF0T commented May 27, 2026

Summary

ClientPuduMonitor::startPlayback() accepted only Int16 stereo at 24 kHz or 48 kHz. On Windows 11 with WASAPI shared mode, the audio engine mixer runs internally in Float32. When the shared pipeline is in Float32 mode, IAudioClient::IsFormatSupported() (surfaced via QAudioDevice::isFormatSupported()) rejects Int16 regardless of hardware capability — causing startPlayback() to bail at format negotiation on any Windows machine whose WASAPI mix format is not Int16.

Root cause confirmed: the same C-Media USB device opens successfully in AudioEngine as Float32 stereo 48 kHz — the hardware is not the constraint, the WASAPI shared pipeline is. This is the same class of format-negotiation problem already solved in QuindarLocalSink.cpp:61 and AudioEngine.cpp:306 via dev.preferredFormat().

Fix

When both Int16 @ 24 kHz and Int16 @ 48 kHz are rejected, query dev.preferredFormat(), enforce stereo channel count, and accept it if isFormatSupported() confirms it. After preparePlaybackPcm() produces the resampled Int16 buffer, convert Int16 → Float32 in-place when the negotiated sink format is Float32 (trivial per-sample multiply; the r8brain resampler already handles any rate conversion).

The accepted format and fallback reason are recorded in the AudioSummaryLogger open-success summary, making the path inspectable from future support bundles.

Files changed

File Change
src/core/ClientPuduMonitor.cpp Add preferredFormat() as third format candidate; add Int16→Float32 conversion post-resample when sink format is Float32

Test plan

  • Windows 11 with C-Media USB (or any WASAPI Float32 device): record audio → confirm actual audio playback through device, not just a recovered-but-silent RX stream
  • Confirm Audio Auxiliary sink summary log line now appears for "Aetherial monitor playback" with fallback reason showing preferredFormat path
  • Linux build: record and play back → confirm Int16 24 kHz fast path taken; preferredFormat branch unreached
  • macOS: confirm Int16 48 kHz path taken on standard CoreAudio devices
  • CI check-windows, check-macos, check-paths pass

Constitution

Principle VIII — Evidence Over Assertion: the WASAPI Float32 behavior is confirmed by the log in #3222 showing format=Float for AudioEngine's successful open of the same device that ClientPuduMonitor fails on. The fix follows QuindarLocalSink's established preferredFormat() pattern rather than asserting a novel approach.

Fixes #3229
Refs #3222

🤖 Generated with Claude Code

@NF0T NF0T requested a review from a team as a code owner May 27, 2026 23:52
@jensenpat
Copy link
Copy Markdown
Collaborator

@AetherClaude does this touch the main audio sink or just the pudu recorder?

@aethersdr-agent
Copy link
Copy Markdown
Contributor

@jensenpat Just the pudu recorder playback path — the change is scoped to ClientPuduMonitor::startPlayback() in src/core/ClientPuduMonitor.cpp (the local recap playback of the recorded snippet). The main RX audio sink in AudioEngine is not touched; it already has its own preferredFormat() handling at AudioEngine.cpp:306 (and QuindarLocalSink.cpp:61 for the Quindar tone sink). This PR brings ClientPuduMonitor in line with that established pattern rather than altering the main pipeline.

Concretely, the current code at ClientPuduMonitor.cpp:202-231 only tries Int16 stereo at 24 kHz then 48 kHz, then gives up — which is what fails when WASAPI shared mode is in Float32. The fix adds a third candidate via dev.preferredFormat() (forcing stereo), and when the negotiated format comes back as Float32 it converts the resampled Int16 buffer to Float32 in-place after preparePlaybackPcm(). No behaviour change on Linux (24 kHz Int16 fast path still wins) or macOS (48 kHz Int16 still wins on CoreAudio).

73, Jeremy KK7GWY & Claude (AI dev partner)


🤖 aethersdr-agent · cost: $1.5956 · model: claude-opus-4-7

@NF0T NF0T mentioned this pull request May 28, 2026
2 tasks
@jensenpat
Copy link
Copy Markdown
Collaborator

@AetherClaude should we move audio sink negotiation to it's own factory, so that we are not replicating sample rate issues in 3 places? what is the current number of sample rate issues and PRs? build a history table.

@aethersdr-agent
Copy link
Copy Markdown
Contributor

@jensenpat Yes — a factory is the right call, and PR #3231 is the third instance that proves it. Grepping isFormatSupported / preferredFormat across src/core shows the same negotiation logic, with slightly different rate ladders and sample-format preferences, scattered across six sinks:

# Sink File:line Format preference preferredFormat() fallback?
1 RX engine / mic source AudioEngine.cpp:306,1125-1151,4263 native rate first, then 48k/44k/24k yes
2 Quindar local tone sink QuindarLocalSink.cpp:58-65 Float 48k stereo yes
3 CW sidetone CwSidetoneQAudioSink.cpp:53-59 Float→Int16, multi-rate ladder no (relies on Int16 fallback)
4 QSO recorder playback QsoRecorder.cpp:377-380 Int16 24k → 48k, give up no — same bug as Pudu pre-#3231
5 Pudu monitor playback ClientPuduMonitor.cpp:212-231 Int16 24k → 48k, give up adds it in #3231
6 Device diagnostics probe DeviceDiagnostics.cpp:63,102 informational only n/a

Sample-rate / format-negotiation PR + issue history

# Title State Sink touched
#2629 (referenced in CwSidetone code) Int16-only VB-Audio Virtual Cable closed CW sidetone
#2668 / #2669 fix(cw): Int16 fallback for sidetone sink on Int16-only WASAPI devices merged CW sidetone
#2694 CWX sidetone latency / distortion on Windows FLEX-8400 closed CW sidetone
#2295 / #2487 fix(audio): replace QsoRecorder QTimer playback with QAudioSink pull mode merged QSO recorder
#2403 / #2459 fix(audio): restore L-R pan on Windows 48kHz resampler path merged RX engine
#2929 / #3036 fix(audio): WASAPI mono-only USB PnP mic silent-open recovery merged mic source
#2930 fix(audio): silent SSB/voice TX on macOS — honour mic native rate merged mic source
#2971 / #2973 Default consolidated audio sink / sample-rate summary logging merged (cross-cutting log)
#3222 VUDU TX record/playback mutes rx open RX engine + Pudu
#3229 VUDU monitor playback silently fails on WASAPI Float32 open Pudu
#3231 This PR — preferredFormat() fallback for Pudu open Pudu

That's 11 PRs/issues across 6 sinks, every one of them re-discovering some variant of "Qt's isFormatSupported() lies, and the negotiation rule depends on what the WASAPI shared-mode mixer happens to be in today." The next one to break is almost certainly QsoRecorder::startPlayback() at QsoRecorder.cpp:377-380 — it has the identical Int16-24k-then-48k-then-give-up shape that this PR is fixing in Pudu, and will fail the same way on the same WASAPI Float32 endpoint that motivated #3229.

Proposed factory shape

Something like AudioSinkNegotiator in src/core/audio/:

// src/core/audio/AudioSinkNegotiator.h
namespace aether::audio {

struct NegotiatedSink {
    QAudioFormat   format;              // actually accepted by the device
    int            actualRate = 0;      // == format.sampleRate()
    bool           fallbackOccurred = false;
    QStringList    fallbackReasons;     // human-readable, for AudioSummaryLogger
    QStringList    attemptedFormats;    // ordered probe trail
};

enum class FormatPreference {
    Int16First,   // QSO recorder, Pudu monitor: source PCM is Int16
    FloatFirst,   // Quindar, AudioEngine RX: source is Float32 already
    NativeFirst,  // CW sidetone: prefer whatever the device wants
};

struct NegotiationRequest {
    int  desiredRateHz = 48000;
    int  channels      = 2;
    FormatPreference preference = FormatPreference::Int16First;
    QList<int> rateCandidates;  // empty => {desired, 48000, 44100, 24000}
    QString    pathLabel;       // for the AudioSummaryLogger summary line
};

// Negotiates an output format, including preferredFormat() as the final
// candidate. Always logs the open-success/open-failure summary via
// AudioSummaryLogger so the path is inspectable from support bundles.
std::optional<NegotiatedSink>
negotiateOutput(const QAudioDevice& dev, const NegotiationRequest& req);

}  // namespace aether::audio

Callers shrink from ~25 lines of probe-ladder + summary-logger plumbing to:

auto sink = aether::audio::negotiateOutput(dev, {
    .desiredRateHz = kSampleRate,
    .channels      = kChannels,
    .preference    = aether::audio::FormatPreference::Int16First,
    .pathLabel     = QStringLiteral("Aetherial monitor playback"),
});
if (!sink) return;  // factory already logged the open-failure summary

…and the Int16→Float32 conversion-on-demand logic from #3231 belongs alongside it (a convertPcmToSinkFormat(QByteArray int16Pcm, const NegotiatedSink&) helper), so the next sink that needs it gets it for free.

Recommended sequencing

  1. Land fix(audio): add preferredFormat() fallback for WASAPI Float32 output devices #3231 as-is — it's narrowly scoped, unblocks VUDU monitor playback silently fails on Windows WASAPI Float32 output devices #3229, and matches the established QuindarLocalSink pattern. Don't expand its scope.
  2. Open a follow-up issue ("refactor: centralize audio sink format negotiation in AudioSinkNegotiator") that migrates all six call sites behind the factory, with QsoRecorder's missing-preferredFormat path as the motivating bug.
  3. The follow-up touches files already on the check-windows / check-macos CI gates (added in CI: expand check-windows path filter to hot platform-guarded files #3052/ci: add AudioEngine to check-windows path filter (#3052) #3210/ci: add check-macos build gate for hot platform-guarded files #3223), so the refactor lands with platform coverage from day one.

The risk of doing it inside #3231 is that the refactor changes behaviour for CW sidetone and the main RX engine (different probe order, different log lines), and any regression there is a much louder failure than a silent Pudu monitor — keep the blast radius small.

A maintainer can authorize the orchestrator to land #3231's fix by adding the aetherclaude-eligible label (not currently on the issue). The factory refactor would be a separate, larger change worth scoping in its own issue first.

73, Jeremy KK7GWY & Claude (AI dev partner)


🤖 aethersdr-agent · cost: $4.1747 · model: claude-opus-4-7

Copy link
Copy Markdown
Contributor

@aethersdr-agent aethersdr-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix, @NF0T — the precedent referenced (QuindarLocalSink.cpp:61 and the preferredFormat() use in AudioEngine.cpp:306) is the right pattern to follow here, and the root-cause analysis (WASAPI shared mode running a Float32 mix on a hardware device that itself accepts Int16) lines up with what the diagnostics in DeviceDiagnostics.cpp already track.

A few small things worth a second look before merge:

1. Sample-format scope of the preferredFormat branch (src/core/ClientPuduMonitor.cpp:235-249 in the new code)

The conversion block only handles QAudioFormat::Float:

if (fmt.sampleFormat() == QAudioFormat::Float) { /* int16 → float */ }

…but the acceptance check earlier accepts any format the device reports as supported:

if (preferred.isValid() && dev.isFormatSupported(preferred)) { fmt = preferred; ... }

The PR comment explicitly notes Float is the only format observed in practice on WASAPI/CoreAudio. That's a reasonable assumption — but the code doesn't enforce it. If a device's preferredFormat() ever returns Int32, UInt8, etc., the Int16 payload will be sent to a sink expecting a different layout and you'll get garbage rather than a clean failure.

Suggestion: gate the preferred acceptance on sampleFormat() == Int16 || sampleFormat() == Float (Int16 needs no conversion, Float uses the existing branch), and otherwise fall through to the existing failure log. Single extra condition, removes a latent footgun.

2. Interaction with #3230 (the bail/mute-lift fix)

The new preferredFormat-rejection failure path (the new else branch that logs "output device supports neither Int16 stereo nor its own preferredFormat") is a 7th early-return path. Once #3230 lands with its bail() helper, this branch will also need to call it or it'll re-introduce the stranded-mute bug on this specific (rare) failure mode. Worth flagging in whichever PR merges second so it doesn't slip.

3. Minor: summary.fallbackReason already covers it

The post-success summary will include "48000Hz Int16 unsupported -> preferredFormat (…)" in the fallback reason and the negotiated sampleFormat field on the summary — so the path is already inspectable from a support bundle as the test plan expects. Nothing to change, just confirming it works.

Otherwise the conversion math (/ 32768.0f) matches preparePlaybackPcm()'s existing scaling at ClientPuduMonitor.cpp:150-151, the buffer-size arithmetic is correct (frames * kChannels * sizeof(float)), and the scope is appropriately tight — single file, single function, mirrors prior art.


🤖 aethersdr-agent · cost: $6.9063 · model: claude-opus-4-7

Copy link
Copy Markdown
Contributor

@aethersdr-agent aethersdr-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @NF0T — the diagnosis (WASAPI shared mixer locking out Int16 despite hardware capability) is solid, and reusing the preferredFormat() pattern from QuindarLocalSink / AudioEngine is the right move. The Int16→Float32 conversion is straightforward and the m_playPcm = std::move(floatPcm) happens before m_playBuffer.setBuffer(&m_playPcm), so the QBuffer pointer is fine.

Two things worth tightening before merge:

1. Narrow what preferredFormat() is allowed to return

The post-resample conversion only handles QAudioFormat::Float:

if (fmt.sampleFormat() == QAudioFormat::Float) {
    // Int16 → Float32 conversion
}

…but the acceptance check above will accept anything isFormatSupported() confirms — Int32, UInt8, future formats, etc. If preferredFormat ever returns one of those (uncommon but not impossible on exotic CoreAudio aggregates or pro-audio drivers), the Int16 payload would be played through a sink expecting a different layout → garbage. The comment claims Float32 is the only realistic case, but the code doesn't enforce that claim. Suggest gating acceptance to Int16 or Float only:

const auto sf = preferred.sampleFormat();
const bool acceptable = (sf == QAudioFormat::Int16 || sf == QAudioFormat::Float);
if (preferred.isValid() && acceptable && dev.isFormatSupported(preferred)) {
    ...
}

That keeps the diagnostic path (logOpenFailure) honest for the rare unsupported case and prevents silent corruption.

2. Mute lifting on the new bail path

The new else branch logs the failure and returns without emitting muteRxRequested(false) — same class of bug as #3230. If #3230 lands first you'll need to wire the new path through its bail() lambda; if this lands first the same fix needs to be repeated here. Worth a coordinating note across the two PRs or rebasing to use the lambda once both are aligned.

Otherwise looks good — minor enough that I'd be happy to see it land once #2 in particular is resolved (either via #3230 or inline here).


🤖 aethersdr-agent · cost: $5.1402 · model: claude-opus-4-7

jensenpat
jensenpat previously approved these changes May 28, 2026
Copy link
Copy Markdown
Collaborator

@jensenpat jensenpat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fix Ryan

@jensenpat jensenpat self-assigned this May 28, 2026
jensenpat pushed a commit to NF0T/AetherSDR that referenced this pull request May 28, 2026
…devices (aethersdr#3231) - Principle VIII.

ClientPuduMonitor::startPlayback() accepted only Int16 stereo at 24 kHz or 48 kHz. On Windows WASAPI shared mode, Float32 mix pipelines can reject both Int16 candidates even when the device works through AudioEngine.

When both Int16 candidates fail, query preferredFormat(), force stereo, and accept it if supported. Convert the prepared Int16 payload to Float32 when the sink negotiated Float32 so playback can start on WASAPI Float32 devices.

The fallback is logged through AudioSummaryLogger, and the failure path preserves an inspectable reason if preferredFormat() is rejected.

Fixes aethersdr#3229
Refs aethersdr#3222

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: jensenpat <patjensen@gmail.com>
@jensenpat jensenpat force-pushed the fix/vudu-wasapi-float-fallback branch from a25aae5 to 7de3d5e Compare May 28, 2026 05:06
…devices (aethersdr#3231) - Principle VIII.

ClientPuduMonitor::startPlayback() accepted only Int16 stereo at 24 kHz or 48 kHz. On Windows WASAPI shared mode, Float32 mix pipelines can reject both Int16 candidates even when the device works through AudioEngine.

When both Int16 candidates fail, query preferredFormat(), force stereo, and accept it if supported. Convert the prepared Int16 payload to Float32 when the sink negotiated Float32 so playback can start on WASAPI Float32 devices. This keeps the existing bail() paths from PR 3230 so RX mute is still lifted on every playback-start failure.

The fallback is logged through AudioSummaryLogger, and the failure path preserves an inspectable reason if preferredFormat() is rejected.

Fixes aethersdr#3229
Refs aethersdr#3222

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: jensenpat <patjensen@gmail.com>
@jensenpat jensenpat force-pushed the fix/vudu-wasapi-float-fallback branch from 7de3d5e to aaf85cf Compare May 28, 2026 05:08
Copy link
Copy Markdown
Collaborator

@jensenpat jensenpat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved refreshed maintainer-signed squash. The conflict with #3230 is resolved by preserving the mute-release bail() paths while adding the preferredFormat()/Float32 fallback from this PR.

@jensenpat jensenpat merged commit 51f275d into aethersdr:main May 28, 2026
6 checks passed
@NF0T NF0T deleted the fix/vudu-wasapi-float-fallback branch May 29, 2026 15:41
ten9876 pushed a commit that referenced this pull request Jun 6, 2026
…3397)

## What

First, **pure-addition** step of the consolidated audio sink factory
(#3306): a single shared, OS-parameterized format/rate **negotiation
policy** that every audio sink and source will migrate onto, replacing
the ~9 divergent per-sink fallback ladders and per-OS `#ifdef` branches
that are the root of a recurring class of platform audio bugs.

Zero behaviour change — this is the maintainer-endorsed *"land the
helper first, migrate sinks incrementally"* approach (#3306, and the
#3194 thread's "sensitive audio-stack changes must be separate PRs with
soak time").

## Why this shape

The historical bugs escaped CI because the per-OS rate ladders were
compiled behind `Q_OS_*`, so a Linux runner could only ever exercise the
Linux path. The new policy is a **pure function over an injected
`DeviceCaps` snapshot with `TargetOs` as a parameter, not an `#ifdef`**
— so one headless, hardware-free test binary exercises the Windows,
macOS *and* Linux ladders on any runner.

## Contents

- **`src/core/AudioFormatNegotiator.{h,cpp}`** — dependency-free policy
(links only `Qt6::Core`). One ladder owns:
- Windows force-48k + r8brain (#2120/#2123), macOS 48k/A2DP (#1705),
Linux native-24k — divergences preserved as **data**, not `#ifdef`.
- The **universal 44.1 kHz rung** that RX/Quindar/QSO are missing today
(#3385).
- `Int16`↔`Float32` + `preferredFormat()` catch-all (#2669 / #1090 /
#3231).
- macOS mic preferred-rate-first (#2930) and Bluetooth-HFP native rate
(#2615); Windows probe-at-open (#2929).
- The two stereo resampler strategies — **PreservePan** (RX/QSO) vs
**MonoCollapse** (TCI-TX/RADE) — kept deliberately distinct
(#2403/#2459).
- **`tests/audio_format_negotiation_test.cpp`** — table-driven golden
matrix, registered under CTest. **25/25 pass**, including the
44.1k-only-device regression guard.
- **`docs/audio-sink-factory.md`** — the full three-layer design (pure
policy → live Qt wrapper → device-ownership router), the
device-following **"uncoupling"** fix (CW / RADE / Aetherial-Audio
"Pudu" recorder following the selected output), the regression-guard
invariants, and the one-sink-per-PR migration plan.

## Test

```
$ ./build/audio_format_negotiation_test
...
25/25 checks passed
```

## Follow-ups (per the migration plan in the doc)

Live Qt wrapper → migrate RX speaker → migrate PC mic →
`AudioOutputRouter` (closes the uncoupling class) → TCI-TX client-rate →
PipeWire DAX shared resampler. Each a separate, soakable PR.

Refs #3306

💻 Generated with Claude Code (Opus 4.8) with architecture by @jensenpat

Co-authored-by: Codex <noreply@openai.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

VUDU monitor playback silently fails on Windows WASAPI Float32 output devices

2 participants