Conversation
📝 Walkthrough

This PR introduces in-app screen-share audio capture and mixing across Android and iOS. On Android, a callback mechanism supplies screen audio bytes for mixing into the microphone capture. On iOS, a new ReplayKit-based screen capturer, together with audio converter and mixer classes, integrates into the audio pipeline via a delegation protocol.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant RPScreenRecorder as RPScreenRecorder<br/>(ReplayKit)
    participant InAppScreenCapturer as InAppScreenCapturer
    participant ScreenShareAudioConverter as ScreenShareAudioConverter
    participant ScreenShareAudioMixer as ScreenShareAudioMixer
    participant AudioDeviceModule as AudioDeviceModule<br/>(AVAudioEngine)
    participant MicrophoneInput as Microphone<br/>Input
    RPScreenRecorder->>InAppScreenCapturer: CMSampleBuffer (video+audio)
    InAppScreenCapturer->>InAppScreenCapturer: Convert video to RTCVideoFrame
    InAppScreenCapturer->>InAppScreenCapturer: Forward audio to audioBufferHandler
    InAppScreenCapturer->>ScreenShareAudioMixer: enqueue(CMSampleBuffer)
    ScreenShareAudioMixer->>ScreenShareAudioConverter: pcmBuffer(from sampleBuffer)
    ScreenShareAudioConverter-->>ScreenShareAudioMixer: AVAudioPCMBuffer
    ScreenShareAudioMixer->>ScreenShareAudioMixer: Check if silent
    alt not silent
        ScreenShareAudioMixer->>ScreenShareAudioMixer: convertIfRequired(to format)
        ScreenShareAudioMixer->>AudioDeviceModule: Schedule on AVAudioPlayerNode
    end
    MicrophoneInput->>AudioDeviceModule: Microphone audio stream
    AudioDeviceModule->>AudioDeviceModule: Mix screen audio into engine graph
    AudioDeviceModule-->>AudioDeviceModule: Combined audio output
```
```mermaid
sequenceDiagram
    participant WebRTCModule as WebRTCModule
    participant GetUserMediaImpl as GetUserMediaImpl
    participant AudioDeviceModule as createAudioDeviceModule()
    participant Microphone as Microphone<br/>Buffer Callback
    participant ScreenAudioProvider as ScreenAudioBytesProvider
    WebRTCModule->>AudioDeviceModule: Create with callback
    AudioDeviceModule->>Microphone: Register AudioBufferCallback
    Microphone->>Microphone: Capture PCM (bytesRead)
    alt bytesRead > 0
        Microphone->>ScreenAudioProvider: getScreenAudioBytes(bytesRead)
        ScreenAudioProvider-->>Microphone: ByteBuffer (screen audio)
        alt buffer non-null & has data
            Microphone->>Microphone: mixScreenAudioIntoBuffer()<br/>(16-bit additive)
        end
    end
    Microphone-->>AudioDeviceModule: Return captureTimeNs
```
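The "Check if silent" step in the first diagram gates whether screen audio is scheduled into the engine graph at all. A minimal, hypothetical sketch of such a gate over 16-bit PCM samples (the threshold value and buffer layout are assumptions for illustration, not taken from the PR):

```java
// Hypothetical illustration of a silence gate: a buffer is treated as silent
// when every 16-bit sample's magnitude stays below a small threshold.
public class SilenceCheckDemo {
    // Threshold is an assumption; a real mixer might use RMS or an exact-zero test.
    static final int SILENCE_THRESHOLD = 8;

    static boolean isSilent(short[] samples) {
        for (short s : samples) {
            if (Math.abs(s) > SILENCE_THRESHOLD) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        short[] quiet = {0, 3, -5, 2};
        short[] speech = {0, 3, -5, 1200};
        System.out.println(isSilent(quiet));   // true
        System.out.println(isSilent(speech));  // false
    }
}
```

Skipping silent buffers avoids scheduling useless work on the player node when the shared app is not producing audio.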
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning) | ✅ 2 passed
Actionable comments posted: 6
🧹 Nitpick comments (4)
android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java (1)
42-42: Visibility change exposes MediaProjection permission data to external packages.

Making `GetUserMediaImpl` public and adding the `getMediaProjectionPermissionResultData()` getter enables external code to access the MediaProjection permission Intent. This is required for the audio capture feature, but consider:
- Documenting this class/method as internal API to discourage misuse
- The returned Intent could be used to create additional MediaProjections without user consent (though Android should enforce single use)

Also applies to: 65-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java` at line 42, the class GetUserMediaImpl and its getter getMediaProjectionPermissionResultData() expose the MediaProjection permission Intent publicly; restrict access by making the class and method package-private (remove public) or annotate them as internal (e.g., @RestrictTo(RestrictTo.Scope.LIBRARY)) and add a Javadoc comment marking them as internal API; additionally, avoid returning the raw Intent instance: document that callers must not reuse it and consider returning a safe copy or nulling/clearing the stored Intent after it has been consumed in GetUserMediaImpl to prevent external code from creating additional MediaProjections without consent.

ios/RCTWebRTC/InAppScreenCapturer.m (1)
125-135: Observer removal may execute on a different thread.

Observers are added on the main queue, but `unregisterAppStateObservers` may be called from any thread. While `NSNotificationCenter` is thread-safe, there's a potential issue: if removal happens before the async `registerAppStateObservers` dispatch executes, the flag will be cleared but observers will still be added afterward.

Consider dispatching removal to the main queue as well, or using the fix suggested for `registerAppStateObservers`, which handles this case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ios/RCTWebRTC/InAppScreenCapturer.m` around lines 125-135, unregisterAppStateObservers can be called off-main and races with the async registerAppStateObservers dispatch; ensure removal runs on the main queue and mirrors the register guard: dispatch the removal block to the main queue, check and set the _observingAppState flag inside that main-queue block, and then call [[NSNotificationCenter defaultCenter] removeObserver:...] for UIApplicationDidBecomeActiveNotification and UIApplicationWillResignActiveNotification; alternatively, adopt the same post-dispatch re-check pattern used in registerAppStateObservers so observers aren't added after you clear _observingAppState.

ios/RCTWebRTC/InAppScreenCaptureController.m (1)
33-39: Minor: redundant deviceId fallback.

`deviceId` is set to `@"in-app-screen-capture"` in `initWithCapturer:` (Line 16), so the nil-coalescing operator on Line 35 is unnecessary: `self.deviceId` will never be nil after initialization.

♻️ Suggested simplification

```diff
 - (NSDictionary *)getSettings {
     return @{
-        @"deviceId": self.deviceId ?: @"in-app-screen-capture",
+        @"deviceId": self.deviceId,
         @"groupId": @"",
         @"frameRate": @(30)
     };
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ios/RCTWebRTC/InAppScreenCaptureController.m` around lines 33-39, the getSettings method uses a redundant nil-coalescing fallback for deviceId even though initWithCapturer: already sets self.deviceId to @"in-app-screen-capture"; update -[InAppScreenCaptureController getSettings] to use self.deviceId directly (remove the ?: @"in-app-screen-capture" fallback) so the dictionary simply assigns @"deviceId": self.deviceId, keeping the rest of the keys unchanged.

ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioMixer.swift (1)
108-137: Consider extracting format comparison to avoid duplication.

Lines 119-122 duplicate the format comparison logic from `ScreenShareAudioConverter.formatsMatch()`. Consider exposing that method, or using `convertIfRequired` directly (it already has the identity optimization).

Simplified approach

```diff
-        // 3. Convert to graph format (e.g. 48 kHz / 1 ch / float32)
-        let buffer: AVAudioPCMBuffer
-        if pcm.format.sampleRate != targetFormat.sampleRate
-            || pcm.format.channelCount != targetFormat.channelCount
-            || pcm.format.commonFormat != targetFormat.commonFormat
-            || pcm.format.isInterleaved != targetFormat.isInterleaved {
-            guard let converted = audioConverter.convertIfRequired(pcm, to: targetFormat) else { return }
-            buffer = converted
-        } else {
-            buffer = pcm
-        }
+        // 3. Convert to graph format if needed (converter handles identity case)
+        guard let buffer = audioConverter.convertIfRequired(pcm, to: targetFormat) else { return }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioMixer.swift` around lines 108-137, the format comparison in enqueue(_:) duplicates logic in ScreenShareAudioConverter.formatsMatch(); change enqueue to avoid manual field-by-field checks by either calling ScreenShareAudioConverter.formatsMatch(pcm.format, targetFormat) or simply always calling audioConverter.convertIfRequired(pcm, to: targetFormat) and using the returned buffer (it already returns the original when no conversion is needed), then proceed to schedule and play the buffer; update references in enqueue to use audioConverter.pcmBuffer(from:), audioConverter.convertIfRequired(...), and playerNode.scheduleBuffer(...) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java`:
- Around line 367-370: Retainment of mediaProjectionPermissionResultData can
leak a stale Intent; update GetUserMediaImpl to clear
mediaProjectionPermissionResultData when the MediaProjection is revoked or when
ScreenCapturerAndroid.onStop() is called and also null out displayMediaPromise
there; implement this by hooking into the MediaProjection callback/stop path
(where the MediaProjection is released) and calling
mediaProjectionPermissionResultData = null (and displayMediaPromise = null if
applicable), and add a freshness check inside
getMediaProjectionPermissionResultData() to validate the Intent (e.g., ensure a
live MediaProjection or timestamp/flag) and return null or throw if it is stale
so external callers cannot reuse an invalid Intent.
In `@android/src/main/java/com/oney/WebRTCModule/WebRTCModule.java`:
- Around line 166-185: The mixing loop can write past the mic buffer and perform
pointless writes; change the logic so it never writes beyond micShorts' valid
range by bounding the loop to the mic buffer capacity: compute micSamples =
Math.min(bytesRead / 2, micShorts.remaining()), keep screenSamples =
screenShorts.remaining(), then iterate i from 0 to micSamples (not totalSamples)
and for each i if i >= screenSamples copy screenShorts.get(i) into micShorts
only when i < micShorts.remaining(); otherwise add micShorts.get(i) +
screenShorts.get(i) when both exist, clamp to Short range, and write with
micShorts.put(i). Ensure no breaks that skip processing and avoid writing when
micShorts has no capacity.
In `@ios/RCTWebRTC/InAppScreenCapturer.m`:
- Around line 109-123: The race occurs because _observingAppState is set to YES
before observers are actually added on the main queue in
registerAppStateObservers, allowing unregisterAppStateObservers to run and skip
removal; fix by moving the state change and the addObserver calls together on
the main queue: in registerAppStateObservers dispatch to the main queue and
inside that block check _observingAppState, set _observingAppState = YES, then
call addObserver for appDidBecomeActive and appWillResignActive (or
alternatively use dispatch_sync to ensure immediate registration), and ensure
unregisterAppStateObservers likewise removes observers on the main queue and
sets _observingAppState = NO to keep the flag consistent with actual observer
registration.
In `@ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioConverter.swift`:
- Around line 90-95: In ScreenShareAudioConverter.swift the int16 branch that
checks pcmBuffer.int16ChannelData always performs a single interleaved memcpy
which breaks non-interleaved input when avFormat.isInterleaved is false; update
the int16 path to mirror the float handling: when avFormat.isInterleaved is true
do the single memcpy using bytesPerFrame and frameCount, otherwise loop
per-channel (using avFormat.formatDescription.mChannelsPerFrame / channelCount),
compute per-channel byte stride (bytesPerFrame / channelCount), and memcpy each
pcmBuffer.int16ChannelData[channel] from the proper offset in dataPointer for
frameCount bytes (respecting totalLength), similar to how the float
non-interleaved case is handled.
In `@ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioMixer.swift`:
- Around line 85-93: startMixing() currently only sets isMixing and does not
perform the ADM reconfiguration described in the comment; update startMixing()
to, after setting isMixing = true, either (A) request an ADM reconfigure so
onConfigureInputFromSource runs again (call the ADM reconfiguration/reset
method) or (B) immediately wire the mixer nodes using the ADM's cached context:
grab AudioDeviceModule.lastInputSource, lastInputDestination and lastInputFormat
and run the same node-attachment/configuration logic used in
onConfigureInputFromSource to attach your mixer nodes; reference startMixing(),
onConfigureInputFromSource, and
AudioDeviceModule.lastInputSource/lastInputDestination/lastInputFormat when
making the change.
In `@ios/RCTWebRTC/WebRTCModule+RTCMediaStream.m`:
- Around line 217-234: The screen-share audio path is not wired: when
options.useInAppScreenCapture is true you must set
InAppScreenCapturer.audioBufferHandler and, if options.includeScreenShareAudio
is enabled, create/retain a ScreenShareAudioMixer and call [mixer startMixing]
before starting capture; attach the handler to forward audio into the mixer and
store the mixer reference (e.g., on the module or options) so you can call
[mixer stopMixing] from mediaStreamTrackRelease; update the block that creates
InAppScreenCapturer (and options.activeInAppScreenCapturer) to set
audioBufferHandler, check options.includeScreenShareAudio, instantiate and start
the mixer accordingly, and ensure mediaStreamTrackRelease stops and releases the
mixer.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c683d9e-b75e-43f1-baac-1cd400cea64f
📒 Files selected for processing (14)
- android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java
- android/src/main/java/com/oney/WebRTCModule/WebRTCModule.java
- android/src/main/java/com/oney/WebRTCModule/WebRTCModuleOptions.java
- ios/RCTWebRTC/InAppScreenCaptureController.h
- ios/RCTWebRTC/InAppScreenCaptureController.m
- ios/RCTWebRTC/InAppScreenCapturer.h
- ios/RCTWebRTC/InAppScreenCapturer.m
- ios/RCTWebRTC/Utils/AudioDeviceModule/AudioDeviceModule.swift
- ios/RCTWebRTC/Utils/AudioDeviceModule/AudioGraphConfigurationDelegate.swift
- ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioConverter.swift
- ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioMixer.swift
- ios/RCTWebRTC/WebRTCModule+RTCMediaStream.m
- ios/RCTWebRTC/WebRTCModule.m
- ios/RCTWebRTC/WebRTCModuleOptions.h
```java
// Note: mediaProjectionPermissionResultData is intentionally NOT nulled here.
// It is retained so it can be reused to create a MediaProjection for
// screen share audio capture (AudioPlaybackCaptureConfiguration).
displayMediaPromise = null;
```
🧩 Analysis chain
🏁 Scripts executed (against GetStream/react-native-webrtc):

```shell
#!/bin/bash
# Check how/where the MediaProjection permission data is used for audio capture
# and whether there's lifecycle management
rg -n "getMediaProjectionPermissionResultData|AudioPlaybackCapture" --type java

# Understand the context around GetUserMediaImpl.java
wc -l android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '60,80p'

# Check if there's any cleanup/reset of mediaProjectionPermissionResultData
rg -n "mediaProjectionPermissionResultData\s*=" android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java

# Find all usages of getMediaProjectionPermissionResultData() across the codebase
rg -n "getMediaProjectionPermissionResultData\(\)" --type java -A 3 -B 1

# Check for any lifecycle-related cleanup (e.g., onDestroy, cleanup, clear methods)
rg -n "onDestroy|cleanup|clear.*Intent|reset.*MediaProjection" android/src/main/java/com/oney/WebRTCModule/ --type java

# Get more context around where mediaProjectionPermissionResultData is set and used
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '95,120p'
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '350,380p'

# Look for stop/end/cleanup methods related to screen sharing
rg -n "stopScreenCapture|endScreenShare|stopDisplayMedia" --type java
rg -n "displayMediaPromise.*null|mediaProjection.*null|unbind.*Service" android/src/main/java/com/oney/WebRTCModule/ --type java -B 2 -A 2

# Check how ScreenCaptureController uses the permission result data
fd -e java -x grep -l "ScreenCaptureController" {} \;
rg -n "getMediaProjectionPermissionResultData" --type java -B 3 -A 3
cat -n android/src/main/java/com/oney/WebRTCModule/ScreenCaptureController.java | head -100
rg -n "new ScreenCaptureController|getMediaProjectionPermissionResultData" android/src/main/java/com/oney/WebRTCModule/ --type java -B 2 -A 2

# Check for any reset of the Intent and for AudioPlaybackCaptureConfiguration usage
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '1,50p'
rg -n "AudioPlaybackCaptureConfiguration" --type java -B 3 -A 3
rg -n "getMediaProjectionPermissionResultData" . --type java --type js --type ts -B 2 -A 2

# Look for MediaProjection creation/reuse patterns and lifecycle cleanup when sharing ends
rg -n "createMediaProjection|getMediaProjection" --type java -B 2 -A 2
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '400,450p'
rg -n "stale|lifecycle|MediaProjection.*reuse|AudioPlaybackCapture" --type java -B 2 -A 2

# Check the public surface and documented intent of the getter
rg -n "public|protected" android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | head -30
cat -n android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java | sed -n '65,112p'
rg -n "ScreenCapturerAndroid|createMediaProjection" . --type java -B 3 -A 3 | head -60

# Check JS/TS consumers and MediaProjectionService lifecycle management
find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.tsx" \) -exec grep -l "getMediaProjectionPermissionResultData\|AudioPlaybackCapture" {} \;
find . -name "MediaProjectionService.java" -type f
cat -n android/src/main/java/com/oney/WebRTCModule/MediaProjectionService.java | head -100
```
Add lifecycle management to clear the retained Intent when screen sharing ends.

The `mediaProjectionPermissionResultData` Intent is retained for reuse in `AudioPlaybackCaptureConfiguration`, but there's no mechanism to clear it when:
- The MediaProjection is revoked (e.g., the user stops screen sharing)
- The `ScreenCapturerAndroid.onStop()` callback fires

Since `getMediaProjectionPermissionResultData()` is a public API, external consumers could retrieve and use a stale Intent for audio capture, which would fail. Either clear the Intent explicitly when the MediaProjection ends, or add validation to ensure freshness before use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@android/src/main/java/com/oney/WebRTCModule/GetUserMediaImpl.java` around
lines 367 - 370, Retainment of mediaProjectionPermissionResultData can leak a
stale Intent; update GetUserMediaImpl to clear
mediaProjectionPermissionResultData when the MediaProjection is revoked or when
ScreenCapturerAndroid.onStop() is called and also null out displayMediaPromise
there; implement this by hooking into the MediaProjection callback/stop path
(where the MediaProjection is released) and calling
mediaProjectionPermissionResultData = null (and displayMediaPromise = null if
applicable), and add a freshness check inside
getMediaProjectionPermissionResultData() to validate the Intent (e.g., ensure a
live MediaProjection or timestamp/flag) and return null or throw if it is stale
so external callers cannot reuse an invalid Intent.
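As a plain-Java illustration of the consume-once lifecycle this comment asks for (all names here are hypothetical; the real fix would operate on the Android `Intent` inside `GetUserMediaImpl`), the retained value can be cleared when the capture session stops so late callers observe null rather than a stale handle:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the consume-once pattern: the stored permission data
// is invalidated when the capture session stops, so callers that arrive late
// get null instead of a stale handle they could misuse.
public class PermissionDataHolder {
    private final AtomicReference<String> permissionData = new AtomicReference<>();

    void onPermissionGranted(String data) {
        permissionData.set(data);
    }

    // Mirrors hooking MediaProjection's stop callback: clear the retained data.
    void onCaptureStopped() {
        permissionData.set(null);
    }

    // Returns null once the session has ended, instead of a stale value.
    String getPermissionData() {
        return permissionData.get();
    }

    public static void main(String[] args) {
        PermissionDataHolder holder = new PermissionDataHolder();
        holder.onPermissionGranted("projection-token");
        System.out.println(holder.getPermissionData()); // projection-token
        holder.onCaptureStopped();
        System.out.println(holder.getPermissionData()); // null
    }
}
```

`AtomicReference` keeps the grant/clear/read operations safe even if they happen on different threads, which matches the multi-threaded environment the review describes.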
```java
int micSamples = Math.min(bytesRead / 2, micShorts.remaining());
int screenSamples = screenShorts.remaining();
int totalSamples = Math.max(micSamples, screenSamples);

for (int i = 0; i < totalSamples; i++) {
    int sum;
    if (i >= micSamples) {
        // Screen-only: mic buffer is shorter — write screen sample directly
        sum = screenShorts.get(i);
    } else if (i >= screenSamples) {
        // Mic-only: screen buffer is shorter — keep mic sample as-is
        break;
    } else {
        // Both buffers have data — add samples
        sum = micShorts.get(i) + screenShorts.get(i);
    }
    if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;
    if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
    micShorts.put(i, (short) sum);
}
```
Buffer overflow risk when screen buffer exceeds mic buffer capacity.

The mixing logic has two issues:
- `IndexOutOfBoundsException`: if `screenSamples > micShorts.limit()`, the loop will throw when calling `micShorts.put(i, ...)` for `i >= micShorts.limit()`.
- Pointless writes: the "screen-only" branch (lines 172-174) writes beyond `bytesRead`, but WebRTC will only process `bytesRead` bytes from the mic buffer, so this extra data is ignored.

The mixing should be bounded to the valid mic buffer range:
🐛 Proposed fix

```diff
-    int micSamples = Math.min(bytesRead / 2, micShorts.remaining());
-    int screenSamples = screenShorts.remaining();
-    int totalSamples = Math.max(micSamples, screenSamples);
+    int micSamples = Math.min(bytesRead / 2, micShorts.remaining());
+    int screenSamples = screenShorts.remaining();
+    // Only mix within the valid mic buffer range - we can't extend beyond bytesRead
+    int samplesToMix = Math.min(micSamples, screenSamples);

-    for (int i = 0; i < totalSamples; i++) {
-        int sum;
-        if (i >= micSamples) {
-            // Screen-only: mic buffer is shorter — write screen sample directly
-            sum = screenShorts.get(i);
-        } else if (i >= screenSamples) {
-            // Mic-only: screen buffer is shorter — keep mic sample as-is
-            break;
-        } else {
-            // Both buffers have data — add samples
-            sum = micShorts.get(i) + screenShorts.get(i);
-        }
+    for (int i = 0; i < samplesToMix; i++) {
+        int sum = micShorts.get(i) + screenShorts.get(i);
         if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;
         if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
         micShorts.put(i, (short) sum);
     }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement:

```java
int micSamples = Math.min(bytesRead / 2, micShorts.remaining());
int screenSamples = screenShorts.remaining();
// Only mix within the valid mic buffer range - we can't extend beyond bytesRead
int samplesToMix = Math.min(micSamples, screenSamples);

for (int i = 0; i < samplesToMix; i++) {
    int sum = micShorts.get(i) + screenShorts.get(i);
    if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;
    if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
    micShorts.put(i, (short) sum);
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@android/src/main/java/com/oney/WebRTCModule/WebRTCModule.java` around lines
166 - 185, The mixing loop can write past the mic buffer and perform pointless
writes; change the logic so it never writes beyond micShorts' valid range by
bounding the loop to the mic buffer capacity: compute micSamples =
Math.min(bytesRead / 2, micShorts.remaining()), keep screenSamples =
screenShorts.remaining(), then iterate i from 0 to micSamples (not totalSamples)
and for each i if i >= screenSamples copy screenShorts.get(i) into micShorts
only when i < micShorts.remaining(); otherwise add micShorts.get(i) +
screenShorts.get(i) when both exist, clamp to Short range, and write with
micShorts.put(i). Ensure no breaks that skip processing and avoid writing when
micShorts has no capacity.
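To see the bounded, clamped mix behave on concrete data, here is a standalone sketch of 16-bit additive mixing with saturation (buffer contents are invented for the demo; the real code operates on `ShortBuffer` views rather than arrays):

```java
public class SaturatingMixDemo {
    // Additive 16-bit mix bounded to the shorter buffer and clamped to the
    // short range, matching the bounded approach in the suggested fix.
    static short[] mix(short[] mic, short[] screen) {
        int samplesToMix = Math.min(mic.length, screen.length);
        short[] out = mic.clone(); // mic samples beyond the screen buffer stay as-is
        for (int i = 0; i < samplesToMix; i++) {
            int sum = mic[i] + screen[i];
            if (sum > Short.MAX_VALUE) sum = Short.MAX_VALUE;
            if (sum < Short.MIN_VALUE) sum = Short.MIN_VALUE;
            out[i] = (short) sum;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] mic = {1000, 30000, -30000, 500};
        short[] screen = {200, 10000, -10000};
        short[] mixed = mix(mic, screen);
        System.out.println(mixed[0]); // 1200
        System.out.println(mixed[1]); // 32767 (clamped)
        System.out.println(mixed[2]); // -32768 (clamped)
        System.out.println(mixed[3]); // 500 (beyond screen buffer: mic kept as-is)
    }
}
```

The clamping matters because summing two near-full-scale 16-bit samples overflows the short range; without it, the cast would wrap around and produce loud artifacts instead of soft clipping.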
```objc
- (void)registerAppStateObservers {
    if (_observingAppState) return;
    _observingAppState = YES;

    dispatch_async(dispatch_get_main_queue(), ^{
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(appDidBecomeActive)
                                                     name:UIApplicationDidBecomeActiveNotification
                                                   object:nil];
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(appWillResignActive)
                                                     name:UIApplicationWillResignActiveNotification
                                                   object:nil];
    });
}
```
Potential race condition in observer registration.

`_observingAppState` is set to YES synchronously on Line 111, but the actual observer registration is dispatched asynchronously to the main queue. If `unregisterAppStateObservers` is called before the dispatch block executes, it will skip removal (since `_observingAppState` has been reset to NO), and the dispatch block will then execute and add observers that are never cleaned up.

Consider moving the flag assignment inside the dispatch block, or using `dispatch_sync` if you need immediate registration:
🛠️ Suggested fix

```diff
 - (void)registerAppStateObservers {
     if (_observingAppState) return;
-    _observingAppState = YES;

     dispatch_async(dispatch_get_main_queue(), ^{
+        if (self->_observingAppState) return;
+        self->_observingAppState = YES;
+
         [[NSNotificationCenter defaultCenter] addObserver:self
                                                  selector:@selector(appDidBecomeActive)
                                                      name:UIApplicationDidBecomeActiveNotification
                                                    object:nil];
         [[NSNotificationCenter defaultCenter] addObserver:self
                                                  selector:@selector(appWillResignActive)
                                                      name:UIApplicationWillResignActiveNotification
                                                    object:nil];
     });
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```objc
- (void)registerAppStateObservers {
    if (_observingAppState) return;
    dispatch_async(dispatch_get_main_queue(), ^{
        if (self->_observingAppState) return;
        self->_observingAppState = YES;
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(appDidBecomeActive)
                                                     name:UIApplicationDidBecomeActiveNotification
                                                   object:nil];
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(appWillResignActive)
                                                     name:UIApplicationWillResignActiveNotification
                                                   object:nil];
    });
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `ios/RCTWebRTC/InAppScreenCapturer.m` around lines 109 - 123, the race occurs
because _observingAppState is set to YES before observers are actually added on
the main queue in registerAppStateObservers, allowing
unregisterAppStateObservers to run and skip removal; fix by moving the state
change and the addObserver calls together on the main queue: in
registerAppStateObservers dispatch to the main queue and inside that block check
_observingAppState, set _observingAppState = YES, then call addObserver for
appDidBecomeActive and appWillResignActive (or alternatively use dispatch_sync
to ensure immediate registration), and ensure unregisterAppStateObservers
likewise removes observers on the main queue and sets _observingAppState = NO to
keep the flag consistent with actual observer registration.
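For readers less familiar with the check-then-act hazard, here is a hypothetical Java analog of the fixed pattern: the flag flip and the (un)registration happen together on the serial queue, so they can never disagree. All names are invented for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a single-threaded executor stands in for the main dispatch queue.
public class ObserverRace {
    private final AtomicBoolean observing = new AtomicBoolean(false);
    public final AtomicInteger registered = new AtomicInteger(0);
    private final ExecutorService mainQueue = Executors.newSingleThreadExecutor();

    /** Fixed version: flag and registration change together on the queue. */
    public void register() {
        mainQueue.execute(() -> {
            if (observing.compareAndSet(false, true)) {
                registered.incrementAndGet();  // stands in for addObserver
            }
        });
    }

    public void unregister() {
        mainQueue.execute(() -> {
            if (observing.compareAndSet(true, false)) {
                registered.decrementAndGet();  // stands in for removeObserver
            }
        });
    }

    /** Waits for queued work to finish (test helper). */
    public void drain() {
        mainQueue.shutdown();
        try {
            mainQueue.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because both operations run on the same serial queue, an unregister enqueued after a register always observes the registration it needs to undo, so no observer can leak.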
```swift
} else if let int16Data = pcmBuffer.int16ChannelData {
    let bytesPerFrame = Int(avFormat.streamDescription.pointee.mBytesPerFrame)
    memcpy(int16Data[0], dataPointer, min(totalLength, Int(frameCount) * bytesPerFrame))
} else {
    return nil
}
```
Int16 non-interleaved format handling is incomplete.
The float branch at lines 80-89 correctly handles both interleaved and non-interleaved layouts, but the int16 branch always performs an interleaved copy. If avFormat.isInterleaved is false for int16 data, this will produce incorrect output.
While RPScreenRecorder typically provides float32 buffers, the file header claims support for "int16, interleaved, non-interleaved" layouts.
Proposed fix to handle non-interleaved int16
```diff
 } else if let int16Data = pcmBuffer.int16ChannelData {
     let bytesPerFrame = Int(avFormat.streamDescription.pointee.mBytesPerFrame)
-    memcpy(int16Data[0], dataPointer, min(totalLength, Int(frameCount) * bytesPerFrame))
+    if avFormat.isInterleaved {
+        memcpy(int16Data[0], dataPointer, min(totalLength, Int(frameCount) * bytesPerFrame))
+    } else {
+        let channelCount = Int(avFormat.channelCount)
+        let framesSize = Int(frameCount) * MemoryLayout<Int16>.size
+        for ch in 0..<channelCount {
+            memcpy(int16Data[ch], dataPointer.advanced(by: ch * framesSize), framesSize)
+        }
+    }
 } else {
```

📝 Committable suggestion
```swift
} else if let int16Data = pcmBuffer.int16ChannelData {
    let bytesPerFrame = Int(avFormat.streamDescription.pointee.mBytesPerFrame)
    if avFormat.isInterleaved {
        memcpy(int16Data[0], dataPointer, min(totalLength, Int(frameCount) * bytesPerFrame))
    } else {
        let channelCount = Int(avFormat.channelCount)
        let framesSize = Int(frameCount) * MemoryLayout<Int16>.size
        for ch in 0..<channelCount {
            memcpy(int16Data[ch], dataPointer.advanced(by: ch * framesSize), framesSize)
        }
    }
} else {
    return nil
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioConverter.swift` around
lines 90 - 95, In ScreenShareAudioConverter.swift the int16 branch that checks
pcmBuffer.int16ChannelData always performs a single interleaved memcpy which
breaks non-interleaved input when avFormat.isInterleaved is false; update the
int16 path to mirror the float handling: when avFormat.isInterleaved is true do
the single memcpy using bytesPerFrame and frameCount, otherwise loop per-channel
(using avFormat.formatDescription.mChannelsPerFrame / channelCount), compute
per-channel byte stride (bytesPerFrame / channelCount), and memcpy each
pcmBuffer.int16ChannelData[channel] from the proper offset in dataPointer for
frameCount bytes (respecting totalLength), similar to how the float
non-interleaved case is handled.
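As background for the fix, the non-interleaved (planar) layout places each channel in one contiguous run: channel `ch` starts at offset `ch * frames`, which is what `dataPointer.advanced(by: ch * framesSize)` indexes in the Swift diff. A minimal Java sketch of the same per-channel copy (the helper name is hypothetical):

```java
// Sketch: copy a planar int16 block into per-channel planes. In a planar
// buffer of N channels and F frames, channel ch occupies indices
// [ch * F, (ch + 1) * F), so each channel is a single contiguous copy.
public class PlanarCopy {
    public static short[][] planes(short[] planarBlock, int channels) {
        int frames = planarBlock.length / channels;
        short[][] out = new short[channels][frames];
        for (int ch = 0; ch < channels; ch++) {
            // One contiguous copy per channel, mirroring the per-channel memcpy.
            System.arraycopy(planarBlock, ch * frames, out[ch], 0, frames);
        }
        return out;
    }
}
```

An interleaved buffer, by contrast, alternates samples (L R L R ...), which is why a single bulk copy only works in the interleaved branch.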
```swift
/// Enable audio mixing. Call when screen share with audio starts.
///
/// If the engine is already running (i.e., `onConfigureInputFromSource` has
/// already fired), this triggers an ADM reconfiguration so the graph gets
/// rewired with our nodes.
@objc public func startMixing() {
    guard !isMixing else { return }
    isMixing = true
}
```
Critical: startMixing() does not trigger graph reconfiguration as documented.
The comment states this "triggers an ADM reconfiguration so the graph gets rewired with our nodes," but the implementation only sets isMixing = true.
If startMixing() is called after onConfigureInputFromSource has already fired (common case: user starts screen share after call is established), the mixer nodes will never be attached because the callback has already passed.
The mixer needs to either:
- Trigger an ADM reset/reconfiguration to force a new `onConfigureInputFromSource` callback, or
- Use the cached input context from `AudioDeviceModule` to wire nodes immediately
Proposed fix using cached input context from AudioDeviceModule
The `AudioDeviceModule` now exposes `lastInputSource`, `lastInputDestination`, and `lastInputFormat`. The mixer could store a reference to the ADM and use these to wire nodes on demand:

```diff
+    /// Reference to the ADM for accessing cached input context.
+    private weak var audioDeviceModule: AudioDeviceModule?
+
+    @objc public init(audioDeviceModule: AudioDeviceModule) {
+        self.audioDeviceModule = audioDeviceModule
+        super.init()
+    }
+
     @objc public func startMixing() {
         guard !isMixing else { return }
         isMixing = true
+
+        // If the engine is already running, wire nodes immediately using cached context
+        if let engine = currentEngine,
+           let format = graphFormat ?? audioDeviceModule?.lastInputFormat,
+           let destination = audioDeviceModule?.lastInputDestination {
+            let source = audioDeviceModule?.lastInputSource
+            attachAndWireNodes(engine: engine, source: source, destination: destination, format: format)
+        }
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `ios/RCTWebRTC/Utils/AudioDeviceModule/ScreenShareAudioMixer.swift` around
lines 85 - 93, startMixing() currently only sets isMixing and does not perform
the ADM reconfiguration described in the comment; update startMixing() to, after
setting isMixing = true, either (A) request an ADM reconfigure so
onConfigureInputFromSource runs again (call the ADM reconfiguration/reset
method) or (B) immediately wire the mixer nodes using the ADM's cached context:
grab AudioDeviceModule.lastInputSource, lastInputDestination and lastInputFormat
and run the same node-attachment/configuration logic used in
onConfigureInputFromSource to attach your mixer nodes; reference startMixing(),
onConfigureInputFromSource, and
AudioDeviceModule.lastInputSource/lastInputDestination/lastInputFormat when
making the change.
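The cache-and-wire-on-demand pattern suggested above can be sketched in a hypothetical Java analog: the mixer remembers the last graph configuration it saw, so enabling mixing after the configuration callback has already fired still attaches the nodes. All names are invented for illustration.

```java
// Sketch: whichever event arrives second (configuration callback or
// startMixing) performs the wiring, so ordering no longer matters.
public class LazyMixer {
    private boolean mixing = false;
    private String cachedFormat;   // stands in for lastInputFormat etc.
    public boolean wired = false;

    /** Called by the audio engine when the input graph is (re)built. */
    public void onConfigure(String format) {
        cachedFormat = format;
        if (mixing) wire();        // mixing already requested: wire now
    }

    /** Enable mixing; wires immediately if a configuration was already seen. */
    public void startMixing() {
        if (mixing) return;
        mixing = true;
        if (cachedFormat != null) wire();  // late start: use cached context
    }

    private void wire() { wired = true; } // stands in for attaching nodes
}
```

This is the common-case ordering the review flags: the call is established (configuration fires) first, and screen share starts later.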
```objc
if (options.useInAppScreenCapture) {
    // Clear the flag so subsequent getDisplayMedia calls use broadcast by default
    options.useInAppScreenCapture = NO;

    InAppScreenCapturer *capturer = [[InAppScreenCapturer alloc] initWithDelegate:videoSource];
    InAppScreenCaptureController *controller = [[InAppScreenCaptureController alloc] initWithCapturer:capturer];

    // Store weak reference for audio mixing wiring
    options.activeInAppScreenCapturer = capturer;

    captureController = controller;
} else {
    // Existing broadcast extension path
    ScreenCapturer *screenCapturer = [[ScreenCapturer alloc] initWithDelegate:videoSource];
    ScreenCaptureController *screenCaptureController =
        [[ScreenCaptureController alloc] initWithCapturer:screenCapturer];
    captureController = screenCaptureController;
}
```
Critical: Screen share audio path is not wired.
The audioBufferHandler on InAppScreenCapturer is never set, and ScreenShareAudioMixer.startMixing is never called. According to the context snippets:
- `InAppScreenCapturer.audioBufferHandler` must be set before `startCapture` for audio mixing to work (context snippet 1)
- If the handler is nil, audio samples are silently dropped (context snippet 2)
- The mixer "stays dormant until `startMixing` is called" (context snippet 3)
The includeScreenShareAudio flag added to WebRTCModuleOptions is also never checked here.
🐛 Proposed fix to wire the audio path
```diff
 if (options.useInAppScreenCapture) {
     // Clear the flag so subsequent getDisplayMedia calls use broadcast by default
     options.useInAppScreenCapture = NO;

     InAppScreenCapturer *capturer = [[InAppScreenCapturer alloc] initWithDelegate:videoSource];
     InAppScreenCaptureController *controller = [[InAppScreenCaptureController alloc] initWithCapturer:capturer];

     // Store weak reference for audio mixing wiring
     options.activeInAppScreenCapturer = capturer;

+    // Wire audio mixing if requested
+    if (options.includeScreenShareAudio && options.screenShareAudioMixer) {
+        ScreenShareAudioMixer *mixer = options.screenShareAudioMixer;
+        capturer.audioBufferHandler = ^(CMSampleBufferRef sampleBuffer) {
+            [mixer enqueueSampleBuffer:sampleBuffer];
+        };
+        [mixer startMixing];
+    }
+
     captureController = controller;
 } else {
```

Note: You'll also need to call `[mixer stopMixing]` when the track is released in `mediaStreamTrackRelease:`.
📝 Committable suggestion
```objc
if (options.useInAppScreenCapture) {
    // Clear the flag so subsequent getDisplayMedia calls use broadcast by default
    options.useInAppScreenCapture = NO;

    InAppScreenCapturer *capturer = [[InAppScreenCapturer alloc] initWithDelegate:videoSource];
    InAppScreenCaptureController *controller = [[InAppScreenCaptureController alloc] initWithCapturer:capturer];

    // Store weak reference for audio mixing wiring
    options.activeInAppScreenCapturer = capturer;

    // Wire audio mixing if requested
    if (options.includeScreenShareAudio && options.screenShareAudioMixer) {
        ScreenShareAudioMixer *mixer = options.screenShareAudioMixer;
        capturer.audioBufferHandler = ^(CMSampleBufferRef sampleBuffer) {
            [mixer enqueueSampleBuffer:sampleBuffer];
        };
        [mixer startMixing];
    }

    captureController = controller;
} else {
    // Existing broadcast extension path
    ScreenCapturer *screenCapturer = [[ScreenCapturer alloc] initWithDelegate:videoSource];
    ScreenCaptureController *screenCaptureController =
        [[ScreenCaptureController alloc] initWithCapturer:screenCapturer];
    captureController = screenCaptureController;
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `ios/RCTWebRTC/WebRTCModule+RTCMediaStream.m` around lines 217 - 234, the
screen-share audio path is not wired: when options.useInAppScreenCapture is true
you must set InAppScreenCapturer.audioBufferHandler and, if
options.includeScreenShareAudio is enabled, create/retain a
ScreenShareAudioMixer and call [mixer startMixing] before starting capture;
attach the handler to forward audio into the mixer and store the mixer reference
(e.g., on the module or options) so you can call [mixer stopMixing] from
mediaStreamTrackRelease; update the block that creates InAppScreenCapturer (and
options.activeInAppScreenCapturer) to set audioBufferHandler, check
options.includeScreenShareAudio, instantiate and start the mixer accordingly,
and ensure mediaStreamTrackRelease stops and releases the mixer.
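The underlying contract — samples are silently dropped until the handler is wired — can be illustrated with a minimal, hypothetical Java sketch (names do not come from the module):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: a capturer whose audio handler must be assigned before capture
// starts; deliveries made while the handler is null are silently lost.
public class SketchCapturer {
    public Consumer<Integer> audioBufferHandler;  // null => samples dropped

    public void deliver(int sample) {
        if (audioBufferHandler != null) {
            audioBufferHandler.accept(sample);
        }
        // else: silently dropped, which is the bug mode the review describes
    }
}
```

This is why the proposed fix assigns `audioBufferHandler` and calls `startMixing` at creation time, before capture begins, rather than after the first frames arrive.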