Controller Audio Internals

hifihedgehog edited this page Jun 15, 2026 · 1 revision

How AudioPassthroughService drives the DualSense and DualShock 4 speaker: the per-device sink model, the WASAPI loopback mirror, the USB and Bluetooth transports, and the Opus and SBC encoders.

This is the developer-side companion to Controller Audio (the user guide) and issue #83.


Files

| File | Role |
| --- | --- |
| PadForge.App/Common/Input/AudioPassthroughService.cs | The core. Sinks, captures, the worker and Bluetooth threads, USB and BT transports, Opus encode, remote-audio hooks. internal static class. |
| PadForge.App/Common/Input/Ds4SbcEncoder.cs | Clean-room SBC encoder for DualShock 4 Bluetooth audio. |
| PadForge.App/Common/Input/SoundMacroService.cs | Macro sound decode and playback. Feeds the per-slot sink mixers. |
| PadForge.App/Common/Input/UserEffectsDispatcher.cs | Asserts the DualSense firmware speaker path and the master-volume byte on each output report. |
| PadForge.App/Common/Input/RemoteLinkOutputRouter.cs | The ShipAudio producer side of the [[Remote Link Internals |
| PadForge.App/Services/InputService.cs | Wires the config provider and the SendAudio / OnRemoteAudioReceived delegates. |

Dependencies: Concentus (pure-C# Opus) and NAudio.Wasapi (loopback capture, WasapiOut, mixers, Media Foundation decode).

Identity constants at the top of the service: Rate = 48000, SonyVid = 0x054C, Ds5Pids = {0x0CE6, 0x0DF2} (DualSense, DualSense Edge), Ds4Pids = {0x05C4, 0x09CC, 0x0BA0} (DualShock 4 v1, v2, and the Sony USB wireless adaptor).


The sink model

One Sink exists per speaker-capable pad, held in Dictionary<Guid, Sink> _sinks keyed by device InstanceGuid under _lock. Keying on the device, not the slot, is deliberate: a slot holding two DualSense pads gets two independent sinks with independent Opus sequence counters. Per-device runtime state is never static. That rule is what fixed the garbled-audio bug that appeared when two DualSense pads shared a counter.
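A minimal sketch of the per-device keying, written in Python purely for illustration (the service itself is C#). The Sink class, sink_for helper, and field names here are hypothetical stand-ins; only the idea of one counter per device GUID comes from the text above.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Sink:
    """Hypothetical stand-in for the per-device Sink; only the counter is modeled."""
    ds5_seq: int = 0  # sequence nibble, owned by this sink alone

    def next_seq(self) -> int:
        current = self.ds5_seq
        self.ds5_seq = (self.ds5_seq + 1) % 16  # the nibble wraps mod 16
        return current


# Keyed by device instance GUID, not by slot.
sinks: dict[uuid.UUID, Sink] = {}


def sink_for(instance_guid: uuid.UUID) -> Sink:
    """Return the sink for a device, creating it on first use."""
    if instance_guid not in sinks:
        sinks[instance_guid] = Sink()
    return sinks[instance_guid]


# Two pads on the same slot still advance independent counters.
pad_a, pad_b = uuid.uuid4(), uuid.uuid4()
seq_a = [sink_for(pad_a).next_seq() for _ in range(3)]
seq_b = [sink_for(pad_b).next_seq() for _ in range(2)]
```

Because each counter lives on the sink instance, interleaving sends to two pads cannot make either pad see a skipped sequence value.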

Each sink carries its audio graph (MacroMixer, a 48 kHz stereo MixingSampleProvider, plus Source, the mixer-and-loopback combiner), its transport (IWavePlayer Player for USB, or IntPtr BtHandle plus a BtWritePool Tx for Bluetooth), and its per-lane encoder state. The DualSense lane holds Ds5OpusEncoder, Ds5Seq (the report sequence nibble), and Ds5PktCounter. The DualShock 4 lane holds Ds4Sbc, the pending-sample buffer, the SBC frame queue, and the resampler carry.

BtWritePool is a fixed 8-slot overlapped non-blocking HID write pool (WriteFile with FILE_FLAG_OVERLAPPED). TrySend returns immediately. Saturation drops the frame; an I/O error flags a hard fail. The kernel HID IRP queue is the jitter buffer, roughly 80 ms in flight across the eight slots.
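The drop-on-saturation behaviour can be modeled without any real HID I/O. The sketch below is a toy Python model, not the C# pool: WritePool, try_send, and complete_one are hypothetical names, and the depth estimate assumes one report per 10.667 ms tick.

```python
class WritePool:
    """Toy model of a fixed-size, non-blocking write pool (no real device I/O)."""

    def __init__(self, slots: int = 8):
        self.slots = slots
        self.in_flight = 0
        self.dropped = 0

    def try_send(self, report: bytes) -> bool:
        """Queue a write if a slot is free; otherwise drop it and return at once."""
        if self.in_flight >= self.slots:
            self.dropped += 1  # saturated: drop the frame, never block the caller
            return False
        self.in_flight += 1
        return True

    def complete_one(self) -> None:
        """Simulate the kernel completing one pending write."""
        if self.in_flight:
            self.in_flight -= 1


pool = WritePool()
results = [pool.try_send(b"\x00") for _ in range(10)]  # 8 accepted, 2 dropped
pool.complete_one()
after_completion = pool.try_send(b"\x00")  # a freed slot accepts again

# Assumed: one report per tick, so eight full slots hold about 85 ms of audio.
depth_ms = pool.slots * (10 + 2 / 3)
```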


Threads

Two threads, started by EnsureThreads_NoLock:

  1. WorkerThreadMain ("PadForge.AudioWorker") is the single owner of all device I/O. It waits on _workSignal or wakes every five seconds and runs ReconcileOnWorker. Sink build and teardown, capture start and stop, and the CreateFile on Bluetooth handles all happen here and only here.
  2. BtThreadMain ("PadForge.BtAudio", highest priority) runs the Bluetooth cadence loop. It never does transport I/O of its own beyond Tx.TrySend. On a transport error it sets TransportFailed and lets the worker rebuild.

The worker reconcile runs five phases to keep I/O outside the lock: build the desired-sink list (phase 1, no lock), diff against _sinks (phase 2, under lock, no I/O), build and dispose transports (phase 3, unlocked), reconcile captures (phase 4), then notify SoundMacroService and sweep expired vendor-audio tests (phase 5). The desired-sink list walks the 16 slots through EnumerateAssignedSonyPads and FindOnlineSonyDevice, which gates on IsOnline, a non-empty DevicePath, the Sony VID, and a DualSense or DualShock 4 PID. Bluetooth is detected by the HID-over-BT service GUID in the device path. A wired DualShock 4 is skipped because it has no USB audio interface.
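Phase 2 is the only step that needs the lock, and it reduces to set arithmetic. A sketch in Python under that reading; diff_sinks and the string device keys are hypothetical, and the real code keys on GUIDs.

```python
import threading


def diff_sinks(desired: set[str], current: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_build, to_dispose). Pure set math, no I/O, safe under a lock."""
    return desired - current, current - desired


lock = threading.Lock()
current = {"pad-a", "pad-b"}  # sinks that exist now
desired = {"pad-b", "pad-c"}  # phase 1 result, built without the lock

with lock:
    # Phase 2: decide what changes while holding the lock, touching no device.
    to_build, to_dispose = diff_sinks(desired, current)

# Phase 3 would open and close transports here, after the lock is released.
```

Keeping the locked region free of I/O means a slow CreateFile or endpoint open cannot stall callers that only need to read the sink table.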

BtThreadMain runs at CadenceMs = 10 + 2/3 (10.667 ms) and pulls 512 frames per tick. It applies a small ring-cushion drift trim, runs a two-second idle gate, and dispatches Ds4BtTick or SendDs5BtFrame. Timing uses timeBeginPeriod(1) and a high-resolution waitable timer. A late tick is skipped, never repaid, because the firmware drops back-to-back bursts.
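The cadence is simply 512 frames at 48 kHz, and the skip rule can be expressed as a deadline that always jumps forward past the present. A Python sketch; next_deadline is a hypothetical helper and the exact skip policy in the service may differ.

```python
RATE = 48_000
FRAMES_PER_TICK = 512
CADENCE_MS = 10 + 2 / 3


def next_deadline(deadline_ms: float, now_ms: float) -> tuple[float, int]:
    """Advance to the first deadline after now. Returns (deadline, ticks skipped)."""
    deadline_ms += CADENCE_MS
    skipped = 0
    while deadline_ms <= now_ms:
        # We are late: skip this tick outright instead of sending a catch-up burst.
        deadline_ms += CADENCE_MS
        skipped += 1
    return deadline_ms, skipped


tick_ms = FRAMES_PER_TICK / RATE * 1000  # 512 frames at 48 kHz
on_time = next_deadline(0.0, 5.0)        # next deadline still in the future
late = next_deadline(0.0, 35.0)          # three deadlines already passed
```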


System-audio mirror (WASAPI loopback)

CaptureEntry wraps one WasapiLoopbackCapture plus a half-second ring, stored in Dictionary<string, CaptureEntry> _captures keyed by endpoint ID. ReconcileCapturesOnWorker resolves the default render endpoint once, maps each passthrough-on sink's MirrorSourceId (empty string means follow the default, re-resolved every pass) to a wanted endpoint, starts the missing captures, and points each sink at its CaptureEntry. A pad's own endpoint is excluded so the mirror never feeds the pad back into itself.

StartCaptureEntry opens the device, rejects a non-active endpoint, and on each DataAvailable converts any source rate and format to 48 kHz stereo float into the ring. SinkSource.Read holds a per-sink cursor over the shared ring, resyncs only on a genuine stall, and adds the loopback samples on top of the macro mix at full scale. Volume lives in the firmware byte, not the samples.
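One way to picture a shared ring with per-sink cursors is below. This is a simplified Python sketch: Ring, Reader, and read_into are hypothetical names, and the resync policy (jump to the newest samples once the writer has lapped the cursor) is an assumption about what a genuine stall means.

```python
class Ring:
    """Shared capture ring; one writer, any number of independent readers."""

    def __init__(self, capacity: int):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.written = 0  # total samples ever written

    def write(self, samples: list[float]) -> None:
        for s in samples:
            self.buf[self.written % self.capacity] = s
            self.written += 1


class Reader:
    """Per-sink cursor over the shared ring."""

    def __init__(self, ring: Ring):
        self.ring = ring
        self.cursor = ring.written

    def read_into(self, mix: list[float]) -> int:
        """Add loopback samples on top of an existing mix, in place."""
        ring = self.ring
        if ring.written - self.cursor > ring.capacity:
            # Assumed stall rule: the writer lapped us, so jump to recent data.
            self.cursor = ring.written - min(len(mix), ring.capacity)
        available = min(len(mix), ring.written - self.cursor)
        for i in range(available):
            mix[i] += ring.buf[(self.cursor + i) % ring.capacity]
        self.cursor += available
        return available


ring = Ring(8)
a, b = Reader(ring), Reader(ring)
ring.write([0.25, 0.5, -0.25, 0.125])
mix_a = [0.5, 0.5]  # pretend macro mix already present
a.read_into(mix_a)
mix_b = [0.0] * 4
b.read_into(mix_b)
```

Each reader advances on its own, so a sink that pulls two samples does not consume data another sink still needs.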


Macro sounds

SoundMacroService decodes WAV, MP3, M4A, AAC, WMA, and FLAC through MediaFoundationReader (or a streaming reader for pfsound:// package refs), resamples to 48 kHz stereo, and caches the PCM under a 128 MB LRU cap.

StartPlacements is the routing core. It calls AudioPassthroughService.GetSlotSinkMixers(slot, out pendingActivation, deviceFilter), which returns the MacroMixer of every live sink on the slot (optionally filtered to one device GUID) and marks _macroDemand so the sinks persist. When the slot has eligible speaker pads whose sinks are not live yet, pendingActivation is true and the caller drops the sound rather than leak it to the PC speakers. When the slot has no controller sink and no device filter, it falls back to a per-slot system-default WasapiOut, where the volume is applied in the sample domain instead.

PlayTestBeep(slot, deviceGuid) routes an 880 Hz, 200 ms tone through StartPlacements with a device filter, so the Audio-tab test hits only the selected device and never fans out to the slot's other pads.
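For reference, a tone of that shape at the service rate is 9600 stereo frames. The Python sketch below generates one; make_beep and the 0.5 amplitude are illustrative assumptions, since the source only gives the frequency and duration.

```python
import math

RATE = 48_000


def make_beep(freq_hz: float = 880.0, duration_ms: int = 200,
              amplitude: float = 0.5) -> list[float]:
    """Return interleaved stereo float samples for a sine tone."""
    frames = RATE * duration_ms // 1000
    out: list[float] = []
    for n in range(frames):
        s = amplitude * math.sin(2 * math.pi * freq_hz * n / RATE)
        out.extend((s, s))  # same sample on left and right
    return out


beep = make_beep()
```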


USB transport

The USB branch of BuildTransportOnWorker reads the HID interface's PnP container ID through cfgmgr32, then matches a render MMDevice whose container ID is the same. If that endpoint is disabled in Windows it re-enables it through IPolicyConfig, then restores the prior default if Windows promoted the pad to default. UsbFrameProvider shapes each frame so channel 0 is silent, channel 1 carries the mono program mix (L+R) * 0.5 (the firmware speaker tap), and any further channels are zero. Playback is WasapiOut in shared mode with event sync at 30 ms latency.
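The channel shaping is small enough to show per frame. A Python sketch; shape_usb_frame is a hypothetical name, and the channel count would come from the endpoint's mix format.

```python
def shape_usb_frame(left: float, right: float, channels: int) -> list[float]:
    """Build one output frame: channel 1 carries the mono mix, the rest stay silent."""
    frame = [0.0] * channels
    if channels > 1:
        frame[1] = (left + right) * 0.5  # mono program mix on the speaker tap
    return frame


frame = shape_usb_frame(0.5, 0.25, 4)
```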


DualSense Bluetooth (Opus over report 0x35)

Constants: 480 samples per Opus frame (10 ms at 48 kHz), 200 bytes per frame (hard CBR at 160 kbps), and a 334-byte report. CreateDs5OpusEncoder builds a Concentus encoder configured OPUS_APPLICATION_AUDIO, 160000 bps, CBR, pre-created at transport build so the first frame does not pay construction cost.
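These constants are mutually consistent, which a few lines of arithmetic confirm. The variable names below are illustrative only.

```python
RATE = 48_000
SAMPLES_PER_FRAME = 480
BITRATE_BPS = 160_000

frame_ms = SAMPLES_PER_FRAME * 1000 / RATE                       # frame duration
bytes_per_frame = BITRATE_BPS * SAMPLES_PER_FRAME // RATE // 8   # CBR payload size
frames_per_second = RATE // SAMPLES_PER_FRAME
```

At constant bitrate every 10 ms frame is exactly 200 bytes, so the payload slot in the report can be a fixed size.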

Each tick time-compresses the 512-frame pull to 480 samples, then SendDs5BtFrame encodes it and assembles the 0x35 report: byte 0 is the report ID, byte 1 carries the sequence nibble (Ds5Seq advances mod 16), a 0x11 session-header packet carries Ds5PktCounter, and a 0x13 speaker-lane packet carries the 200-byte Opus payload. The last four bytes are a reflected CRC32 pre-seeded with the 0xA2 Sony Bluetooth output prefix. The firmware contract from the issue #83 hardware experiment is one report per tick, never bursts.
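The CRC trailer can be sketched with the standard reflected CRC32 from zlib. This Python example only covers the trailer: the positions of the 0x11 and 0x13 packets inside the report are not modeled, seal_report and crc_ok are hypothetical helpers, and the little-endian byte order of the trailer is an assumption based on how other Sony Bluetooth output reports are commonly described.

```python
import struct
import zlib

REPORT_LEN = 334
BT_OUTPUT_PREFIX = b"\xA2"


def seal_report(report: bytearray) -> None:
    """Write the CRC32 of prefix + body into the last four bytes."""
    crc = zlib.crc32(BT_OUTPUT_PREFIX + bytes(report[:-4]))
    struct.pack_into("<I", report, len(report) - 4, crc)  # assumed little-endian


def crc_ok(report: bytes) -> bool:
    """Recompute the CRC and compare it with the stored trailer."""
    expected = zlib.crc32(BT_OUTPUT_PREFIX + bytes(report[:-4]))
    return struct.unpack_from("<I", report, len(report) - 4)[0] == expected


report = bytearray(REPORT_LEN)
report[0] = 0x35  # report ID
seal_report(report)
```

Prepending the prefix byte to the data is equivalent to seeding the running CRC with the CRC of that byte, which is what "pre-seeded" amounts to here.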


DualShock 4 Bluetooth (SBC over report 0x17)

Ds4SbcEncoder is a clean-room SBC encoder written from the A2DP specification Appendix B, with no libsbc or GPL lineage. Fixed config is 32 kHz, 8 subbands, 16 blocks, joint stereo, SNR allocation, bitpool 48, producing 109-byte frames. The spec's segment-folded analysis window is un-folded at the analysis matrix, and the filterbank conventions were pinned against ffmpeg's decoder at roughly 76 and 67 dB round-trip SNR.
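The 109-byte figure follows from the SBC frame-length formula for joint stereo in the A2DP specification: a 4-byte header, 4 bits of scale factor per subband per channel, then the join flags and quantized samples rounded up to whole bytes. A Python check; the function name is illustrative.

```python
import math


def sbc_joint_stereo_frame_len(subbands: int, blocks: int, bitpool: int) -> int:
    """Frame length in bytes for a joint-stereo SBC frame."""
    channels = 2
    header = 4
    scale_factors = (4 * subbands * channels) // 8
    # One join bit per subband plus bitpool bits per block, rounded up to bytes.
    samples = math.ceil((subbands + blocks * bitpool) / 8)
    return header + scale_factors + samples


frame_len = sbc_joint_stereo_frame_len(subbands=8, blocks=16, bitpool=48)
frame_ms = 8 * 16 / 32_000 * 1000   # samples per channel per frame / rate
bitrate_bps = frame_len * 8 * 32_000 // (8 * 16)
```

With this configuration each frame covers 4 ms of audio per channel, and the stream runs at 218 kbps.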

At transport build the DualShock 4 branch sends a one-shot 0x11 control report to enable the audio path and set the headphone and speaker volume bytes. Ds4BtTick resamples 48 kHz to 32 kHz with a persistent phase carried across ticks, encodes each complete 256-sample block to an SBC frame into a bounded queue, and drains the queue, preferring the four-frame 0x17 report and falling back to the two-frame 0x14. Unlike the DualSense cadence, the DualShock 4 path is availability-driven and allows bursts. Byte 5 of the report selects the internal speaker.
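The drain preference can be sketched as a loop over the frame queue. This Python version is an illustration only: drain is a hypothetical name, and draining everything available in one call is an assumption consistent with the path allowing bursts.

```python
from collections import deque


def drain(queue: deque) -> list[tuple[int, list[bytes]]]:
    """Pop frames into reports, four per 0x17 report when possible, else two per 0x14."""
    reports = []
    while len(queue) >= 2:
        count = 4 if len(queue) >= 4 else 2
        report_id = 0x17 if count == 4 else 0x14
        frames = [queue.popleft() for _ in range(count)]
        reports.append((report_id, frames))
    return reports  # a single leftover frame waits for the next tick


queue = deque(bytes([i]) * 109 for i in range(7))  # seven dummy 109-byte frames
reports = drain(queue)
```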


Master volume as the firmware speaker byte

Master volume is never scaled into the samples on the controller path. For the DualSense it is asserted on every output report in UserEffectsDispatcher. When the device wants the speaker path, the dispatcher sets the valid flags, writes speakerVolume mapped from the 0-100% master onto the firmware's 0x3D to 0x64 window (0 mutes), and sets the audio-control flags to the internal-speaker path. When the sink tears down, a one-shot restores the headphone path. SetSlotVolume retunes live because the byte is read per report. The DualShock 4 path has no per-tick volume byte. Its volumes are fixed in the one-shot enable report.
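A possible shape for the percentage-to-byte mapping is below. The source gives only the endpoints and the mute rule, so the linear interpolation and rounding in this Python sketch are assumptions, and speaker_volume_byte is a hypothetical name.

```python
VOL_MIN = 0x3D  # quietest audible firmware value
VOL_MAX = 0x64  # loudest firmware value


def speaker_volume_byte(master_pct: int) -> int:
    """Map 0-100 % master volume to the firmware byte; 0 % mutes."""
    pct = max(0, min(100, master_pct))
    if pct == 0:
        return 0
    span = VOL_MAX - VOL_MIN
    # Assumed linear mapping, rounded to nearest with integer math.
    return VOL_MIN + (pct * span + 50) // 100


levels = [speaker_volume_byte(p) for p in (0, 1, 50, 100)]
```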


Remote Link audio

On the receive (owner) side, InputService subscribes to the link server's audio event, resolves the exposed slot to a local source device, and calls AudioPassthroughService.FeedRemoteAudio. That writes the incoming s16 48 kHz stereo PCM into a RemoteAudioRing and marks the device's remote-audio demand. The worker then builds a real Bluetooth or USB sink for it, and SinkSource.Read drains the ring instead of loopback and macros, so the relayed audio reaches the physical pad speaker.

On the produce (consumer) side, a peer:// pad becomes a sink with no local transport. The Bluetooth thread's peer lane calls ShipPeerAudioTick, which pulls the same per-pad mix, converts to s16, and flushes exact 1024-byte blocks through RemoteLinkOutputRouter.ShipAudio. The PCM is shipped full-scale. The volume is the owner's firmware byte. See Remote Link Internals for the wire side.
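The exact-block flush amounts to accumulating converted bytes and shipping only whole 1024-byte slices. A Python sketch; BlockShipper is a hypothetical name, and the 32767 scale factor and little-endian sample order are assumptions.

```python
import struct

BLOCK_BYTES = 1024  # 256 stereo frames of s16


class BlockShipper:
    """Convert float samples to s16 and ship them in exact fixed-size blocks."""

    def __init__(self, ship):
        self.pending = bytearray()
        self.ship = ship  # callback taking one block of bytes

    def push(self, samples: list[float]) -> None:
        for s in samples:
            clamped = max(-1.0, min(1.0, s))
            self.pending += struct.pack("<h", int(clamped * 32767))
        while len(self.pending) >= BLOCK_BYTES:
            self.ship(bytes(self.pending[:BLOCK_BYTES]))
            del self.pending[:BLOCK_BYTES]  # the remainder waits for the next push


shipped: list[bytes] = []
shipper = BlockShipper(shipped.append)
shipper.push([0.0] * 600)  # 1200 bytes: one block out, 176 bytes held back
shipper.push([1.0] * 500)  # 1000 more: one more block, 152 bytes held back
```

A full 512-frame stereo tick is 2048 bytes of s16, so it ships as exactly two blocks with nothing left over.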

