-
Notifications
You must be signed in to change notification settings - Fork 902
Fix: Support multi-channel audio recording for USB audio interfaces #1176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Support multi-channel audio recording for USB audio interfaces #1176
Conversation
Investigation identified root cause of audio slowdown/pitch-down bug: - Export hardcoded to 48 kHz (crates/audio/src/audio_data.rs:18) - Microphone device filter conflicts with max sample rate selection - System audio devices at 48 kHz but bug persists (metadata issue) Documentation added: - docs/hardware-specs.md: Peripheral hardware specifications - docs/audio-device-status.md: Current system audio configuration - docs/investigation-findings.md: Complete technical investigation report Updated .gitignore to exclude working files and memory-bank folder. Next: Create test recording to verify hypothesis with ffprobe analysis. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Changed filter to accept devices that support 48kHz (not reject those with higher max rates) - Explicitly configure device to use 48kHz instead of maximum sample rate - This ensures both metadata and actual audio data are at 48kHz Fixes audio slowdown/pitch drop issue with high sample rate audio interfaces like BlackStar Polar 2
Fixes audio corruption bug when recording with multi-channel USB audio interfaces like BlackStar Polar 2 (4 channels). Previously, AudioInfo clamped all audio devices to a maximum of 2 channels (stereo), causing 4-channel interleaved audio data to be misinterpreted as 2-channel data. This resulted in severe audio corruption that manifested as pitched-down playback. Changes: - Increased MAX_AUDIO_CHANNELS from 2 to 8 to support professional audio interfaces - Removed channel clamping in from_stream_config() to preserve actual channel count - Added support for multi-channel layouts: SURROUND (3ch), QUAD (4ch), 5.0, 5.1, 6.1, 7.1 Tested with BlackStar Polar 2 (4-channel USB audio interface) on macOS. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Adds comprehensive debug logging to help diagnose audio device issues: - Log all available device configurations - Log selected configuration - Log final configuration after sample rate adjustment - Log device usability status This logging was instrumental in diagnosing the multi-channel audio bug and will help with future audio device troubleshooting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
WalkthroughIncreases AudioInfo.MAX_AUDIO_CHANNELS from 2 to 8 and expands channel-layout recognition to support 3–8 channels. Changes microphone device/config selection to prefer configs with max_sample_rate ≥ 48000, enforce a 48000 Hz final stream rate, and adds verbose logging around device selection, chosen config, and first callback. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant App as App
participant Mic as MicrophoneFeed
participant Dev as Audio Device
participant Stream as Input Stream
participant MI as media-info
App->>Mic: start microphone feed
Mic->>Dev: enumerate devices + configs
Dev-->>Mic: device list + configs
Mic->>Mic: filter configs (max_sample_rate ≥ 48000)
Mic->>Mic: sort configs (sample_size, max_rate) and choose
Mic->>Mic: finalize config -> set sample_rate=48000
Mic->>MI: request channel validation (raw_channels)
MI-->>Mic: channel layout or None (clamp to [1,8] if needed)
Mic->>Dev: build input stream (rate=48000, channels=n, format)
Dev-->>Mic: stream created
Mic->>Stream: start stream
note right of Mic: logs: device name, configs, chosen params
Stream-->>Mic: audio callbacks (first callback logged once)
Mic-->>App: forward MicrophoneSamples
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (4)crates/**/src/**/*.rs📄 CodeRabbit inference engine (CLAUDE.md)
Files:
**/*.{ts,tsx,js,jsx,rs}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
**/*.rs📄 CodeRabbit inference engine (AGENTS.md)
Files:
crates/*/src/**/*📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧬 Code graph analysis (1)crates/recording/src/feeds/microphone.rs (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/media-info/src/lib.rs (1)
65-75
: Guard against unsupported channel counts (>8) to avoid runtime panic.from_stream_config sets channels from cpal without validation, but channel_layout() later unwraps and will panic for counts not in 1..=8. Clamp or validate here.
- let channels = config.channels(); + let raw_channels = config.channels(); + let channels = if Self::channel_layout_raw(raw_channels).is_some() { + raw_channels + } else { + raw_channels.min(Self::MAX_AUDIO_CHANNELS).max(1) + }; Self { sample_format, sample_rate: config.sample_rate().0, // we do this here and only here bc we know it's cpal-related channels: channels.into(), time_base: FFRational(1, 1_000_000), buffer_size, }
🧹 Nitpick comments (5)
.gitignore (1)
51-53
: Also ignore the root-level memory bank directory.Docs reference “.memory-bank/” at repo root; it’s not ignored here. Add it to prevent accidental commits.
# Audio bug fix working files docs/memory-bank/ claude-audio-fix.md +/.memory-bank/
crates/recording/src/feeds/microphone.rs (1)
144-156
: Avoid new inline comments in Rust sources.Per coding guidelines, remove inline comments; rely on logs/clear code instead.
docs/hardware-specs.md (1)
20-31
: Align channel count with observed system config.System profile shows 4 input channels; update “Type/Input Channels” here to avoid confusion.
docs/investigation-findings.md (2)
92-103
: Add language to fenced code block (markdownlint MD040).Specify a language for this block.
-``` +```text Polar 2 (BlackStar): - Default Input Device: Yes - Current Sample Rate: 48000 Hz - Input Channels: 4 - Transport: USB Insta360 Link 2: - Current Sample Rate: 48000 Hz - Input Channels: 1 - Transport: USB--- `250-253`: **Avoid PII and add language to fenced block.** Replace absolute user path with a repo‑relative path and add language. ```diff -``` -/Users/mattdreier/Desktop/CAP/.memory-bank/ -``` +```text +.memory-bank/ +```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.gitignore
(1 hunks)crates/media-info/src/lib.rs
(3 hunks)crates/recording/src/feeds/microphone.rs
(2 hunks)docs/audio-device-status.md
(1 hunks)docs/hardware-specs.md
(1 hunks)docs/investigation-findings.md
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
crates/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
For desktop IPC, use tauri_specta derive/macros on Rust types/events; do not hand-roll bindings
Files:
crates/media-info/src/lib.rs
crates/recording/src/feeds/microphone.rs
**/*.{ts,tsx,js,jsx,rs}
📄 CodeRabbit inference engine (CLAUDE.md)
Do not add inline, block, or docstring comments in any language; code must be self-explanatory
Files:
crates/media-info/src/lib.rs
crates/recording/src/feeds/microphone.rs
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs
: Format Rust code usingrustfmt
and ensure all Rust code passes workspace-level clippy lints.
Rust modules should be named with snake_case, and crate directories should be in kebab-case.
Files:
crates/media-info/src/lib.rs
crates/recording/src/feeds/microphone.rs
crates/*/src/**/*
📄 CodeRabbit inference engine (AGENTS.md)
Rust crates should place tests within the
src/
and/or a siblingtests/
directory for each crate insidecrates/*
.
Files:
crates/media-info/src/lib.rs
crates/recording/src/feeds/microphone.rs
🧬 Code graph analysis (2)
crates/media-info/src/lib.rs (1)
crates/audio/src/audio_data.rs (1)
channels
(162-164)
crates/recording/src/feeds/microphone.rs (1)
crates/media-info/src/lib.rs (1)
ffmpeg_sample_format_for
(272-282)
🪛 markdownlint-cli2 (0.18.1)
docs/investigation-findings.md
92-92: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
251-251: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (9)
crates/media-info/src/lib.rs (2)
27-27
: MAX_AUDIO_CHANNELS increase looks good.Expands support to 8 channels as intended.
95-101
: Expanded layouts (3–8 channels) look correct.Mappings align with common FFmpeg channel layouts.
crates/recording/src/feeds/microphone.rs (7)
129-137
: Error logging addition is fine.Captures device context when enumerating configs.
157-163
: Sort behavior OK.Prefers larger sample size; secondary key by max rate. Fine given the 48 kHz filtering below.
169-178
: Selected-config logging is helpful.Retain; aids debugging.
183-191
: Usable/rejected device logs are helpful.Clear visibility on selection outcome.
193-207
: Forcing final sample rate to 48 kHz is consistent with the current pipeline.Good: prior filter guarantees 48 kHz support before setting it.
If the long-term direction is “record at device rate,” we’ll need a resampler or to thread actual rate through export; confirm this is intentional for now.
343-351
: Build-stream log is fine.Includes device name, rate, channels, and format.
357-366
: First-callback visibility is fine.One-time info log; low overhead.
let selected = configs | ||
.into_iter() | ||
.filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 <= 48000) | ||
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some()) | ||
}) | ||
.and_then(|config| { | ||
device | ||
.name() | ||
.ok() | ||
.map(|name| (name, device, config.with_max_sample_rate())) | ||
}) | ||
.filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 >= 48000) | ||
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Filter out unsupported channel counts (align with AudioInfo::MAX_AUDIO_CHANNELS).
Prevents selecting configs that would later panic when building AudioInfo.
- let selected = configs
+ let selected = configs
.into_iter()
- .filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 >= 48000)
+ .filter(|c| {
+ c.min_sample_rate().0 <= 48_000
+ && c.max_sample_rate().0 >= 48_000
+ && c.channels() >= 1
+ && c.channels() <= AudioInfo::MAX_AUDIO_CHANNELS
+ })
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some());
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
let selected = configs | |
.into_iter() | |
.filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 <= 48000) | |
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some()) | |
}) | |
.and_then(|config| { | |
device | |
.name() | |
.ok() | |
.map(|name| (name, device, config.with_max_sample_rate())) | |
}) | |
.filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 >= 48000) | |
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some()); | |
let selected = configs | |
.into_iter() | |
.filter(|c| { | |
c.min_sample_rate().0 <= 48_000 | |
&& c.max_sample_rate().0 >= 48_000 | |
&& c.channels() >= 1 | |
&& c.channels() <= AudioInfo::MAX_AUDIO_CHANNELS | |
}) | |
.find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some()); |
🤖 Prompt for AI Agents
In crates/recording/src/feeds/microphone.rs around lines 164 to 168, the config
selection currently only checks sample rate and format but may pick configs with
too many channels that later panic when building AudioInfo; add a filter step
before the find to exclude configs whose channel_count() is 0 or greater than
AudioInfo::MAX_AUDIO_CHANNELS (use AudioInfo::MAX_AUDIO_CHANNELS) so only
supported channel counts are considered.
Hey @MattDreier thanks so much for this! Would you be able to get the code rabbit suggestions done/looked at, I'll then test, merge and push a new release once confirmed :) |
I've addressed the critical CodeRabbit suggestion by adding channel validation in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/media-info/src/lib.rs (1)
100-105
: LGTM!The channel layout mappings are correct and use standard FFmpeg channel configurations for 3–8 channels.
Consider adding tests to verify the new channel layouts are correctly mapped, especially for the multi-channel use case this PR addresses.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/media-info/src/lib.rs
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
crates/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
For desktop IPC, use tauri_specta derive/macros on Rust types/events; do not hand-roll bindings
Files:
crates/media-info/src/lib.rs
**/*.{ts,tsx,js,jsx,rs}
📄 CodeRabbit inference engine (CLAUDE.md)
Do not add inline, block, or docstring comments in any language; code must be self-explanatory
Files:
crates/media-info/src/lib.rs
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs
: Format Rust code usingrustfmt
and ensure all Rust code passes workspace-level clippy lints.
Rust modules should be named with snake_case, and crate directories should be in kebab-case.
Files:
crates/media-info/src/lib.rs
crates/*/src/**/*
📄 CodeRabbit inference engine (AGENTS.md)
Rust crates should place tests within the
src/
and/or a siblingtests/
directory for each crate insidecrates/*
.
Files:
crates/media-info/src/lib.rs
🧬 Code graph analysis (1)
crates/media-info/src/lib.rs (4)
crates/recording/src/feeds/microphone.rs (1)
config
(219-221)crates/scap-cpal/src/lib.rs (1)
config
(73-75)crates/recording/src/sources/screen_capture/mod.rs (1)
config
(412-414)crates/audio/src/audio_data.rs (1)
channels
(162-164)
🔇 Additional comments (1)
crates/media-info/src/lib.rs (1)
27-27
: LGTM!The constant increase from 2 to 8 channels properly supports multi-channel USB audio interfaces.
let raw_channels = config.channels(); | ||
let channels = if Self::channel_layout_raw(raw_channels).is_some() { | ||
raw_channels | ||
} else { | ||
raw_channels.min(Self::MAX_AUDIO_CHANNELS).max(1) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Verify clamping behavior is logged for devices with >8 channels.
The clamping logic prevents panics but will silently misinterpret audio data from devices reporting more than 8 channels. For example, a 12-channel device clamped to 8 will have incorrect channel boundaries in the interleaved data, causing corruption.
Ensure this clamping scenario is logged (e.g., in microphone.rs
device selection) so users understand why unsupported devices produce incorrect audio.
Run this script to check if clamping is logged:
🏁 Script executed:
#!/bin/bash
# Check if channel count clamping/validation is logged in microphone feed
rg -n -C5 'channel|clamp|MAX_AUDIO_CHANNELS' crates/recording/src/feeds/microphone.rs
Length of output: 2301
Add warning when clamping channel count
Raw channel counts outside [1, MAX_AUDIO_CHANNELS] are clamped in crates/media-info/src/lib.rs
(lines 65–70) but no log records the original value. Insert a warning log there to report the original raw_channels
and the clamped result.
🤖 Prompt for AI Agents
In crates/media-info/src/lib.rs around lines 65–70, when raw_channels is outside
[1, MAX_AUDIO_CHANNELS] it gets clamped but the original value isn't logged; add
a warning log that reports both the original raw_channels and the clamped
channels value before returning/using the clamped result. Use the crate’s
logging facility (e.g., log::warn! or tracing::warn!) and ensure the appropriate
macro is imported so the warning reads something like "clamping audio channel
count: raw=X -> clamped=Y" with raw and clamped values interpolated.
Ahhh looks like the real problem here is that I wasn't filtering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/recording/src/feeds/microphone.rs (1)
151-155
: Filter out unsupported channel counts (align with AudioInfo::MAX_AUDIO_CHANNELS).The filter at lines 153-154 still does not validate channel counts against
AudioInfo::MAX_AUDIO_CHANNELS
. Although the PR comments mention that channel validation was added infrom_stream_config()
(in crates/media-info/src/lib.rs), that validation only clamps values at AudioInfo construction time. Selecting a config with >8 channels here will still cause issues downstream. Add a channel count filter before thefind
to exclude configs with unsupported channel counts.Apply this diff:
let selected = configs .into_iter() - .filter(|c| c.min_sample_rate().0 <= 48000 && c.max_sample_rate().0 >= 48000) + .filter(|c| { + c.min_sample_rate().0 <= 48_000 + && c.max_sample_rate().0 >= 48_000 + && c.channels() >= 1 + && c.channels() <= AudioInfo::MAX_AUDIO_CHANNELS + }) .find(|c| ffmpeg_sample_format_for(c.sample_format()).is_some())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/recording/src/feeds/microphone.rs
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
crates/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
For desktop IPC, use tauri_specta derive/macros on Rust types/events; do not hand-roll bindings
Files:
crates/recording/src/feeds/microphone.rs
**/*.{ts,tsx,js,jsx,rs}
📄 CodeRabbit inference engine (CLAUDE.md)
Do not add inline, block, or docstring comments in any language; code must be self-explanatory
Files:
crates/recording/src/feeds/microphone.rs
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs
: Format Rust code usingrustfmt
and ensure all Rust code passes workspace-level clippy lints.
Rust modules should be named with snake_case, and crate directories should be in kebab-case.
Files:
crates/recording/src/feeds/microphone.rs
crates/*/src/**/*
📄 CodeRabbit inference engine (AGENTS.md)
Rust crates should place tests within the
src/
and/or a siblingtests/
directory for each crate insidecrates/*
.
Files:
crates/recording/src/feeds/microphone.rs
🧬 Code graph analysis (1)
crates/recording/src/feeds/microphone.rs (1)
crates/media-info/src/lib.rs (2)
ffmpeg_sample_format_for
(277-287)sample_size
(114-116)
🔇 Additional comments (3)
crates/recording/src/feeds/microphone.rs (3)
157-160
: LGTM! Sample rate enforcement is correct.The change to explicitly set 48000 Hz using
with_sample_rate()
instead of using the device's maximum rate aligns with the PR objective to "enforce 48000 Hz for the final stream." This ensures consistent sample rates across different devices.
296-318
: LGTM! Comprehensive device enumeration logging.The added logging for available configs and selected stream parameters is valuable for debugging multi-channel audio configuration issues. The info level is appropriate for setup diagnostics.
325-334
: LGTM! First callback logging is helpful.Logging the first audio callback with data size and format helps verify that the stream started correctly and is receiving data in the expected format. Limiting this to the first callback prevents log spam.
cec7bcf
to
1e79835
Compare
Fix: Support multi-channel audio recording for USB audio interfaces
Summary
Fixes a critical audio bug where recordings with multi-channel USB audio interfaces (4+ channels) produced severely corrupted audio that manifested as pitched-down/slowed playback.
Problem
The
AudioInfo
struct was hardcoded to clamp all audio devices to a maximum of 2 channels (stereo). When users recorded with professional audio interfaces like the BlackStar Polar 2 (4 channels), the 4-channel interleaved audio data was misinterpreted as 2-channel data, causing severe audio corruption.User Impact
Solution
MAX_AUDIO_CHANNELS
from 2 to 8 to support professional audio interfacesclamp(1, Self::MAX_AUDIO_CHANNELS)
to preserve actual device channel countChanges
Core Fix (
crates/media-info/src/lib.rs
)MAX_AUDIO_CHANNELS
constant from 2 → 8from_stream_config()
Debugging Enhancement (
crates/recording/src/feeds/microphone.rs
)Testing
Tested extensively with:
Test Logs Evidence
Before fix (Test Cycle 7):
After fix (Test Cycle 8):
Compatibility
Related Issues
This fix resolves audio corruption issues reported by users with multi-channel USB audio interfaces. The bug manifested as:
Technical Details
The bug occurred because:
[L1][R1][C1][D1][L2][R2][C2][D2]...
wrap_frame()
interpreted the data as:[L1][R1][L2][R2]...
(treating C1/D1 as L2/R2)The fix ensures the actual channel count is preserved throughout the pipeline.
Summary by CodeRabbit