Question about voice-activity-detection #368

Closed
lewiszlw opened this issue Dec 16, 2022 · 2 comments

@lewiszlw

Hi team,

Does this project support VAD? I didn't find any VAD example. I searched the code and found only one place that mentions it:

/// AnswerOptions structure describes the options used to control the answer
/// creation process.
#[derive(Default, Debug, PartialEq, Eq, Copy, Clone)]
pub struct RTCAnswerOptions {
    /// voice_activity_detection allows the application to provide information
    /// about whether it wishes voice detection feature to be enabled or disabled.
    pub voice_activity_detection: bool,
}

/// OfferOptions structure describes the options used to control the offer
/// creation process
#[derive(Default, Debug, PartialEq, Eq, Copy, Clone)]
pub struct RTCOfferOptions {
    /// voice_activity_detection allows the application to provide information
    /// about whether it wishes voice detection feature to be enabled or disabled.
    pub voice_activity_detection: bool,

    /// ice_restart forces the underlying ice gathering process to be restarted.
    /// When this value is true, the generated description will have ICE
    /// credentials that are different from the current credentials
    pub ice_restart: bool,
}

If it is supported, could you share some docs or examples I can learn from?

@k0nserv
Member

k0nserv commented Dec 16, 2022

Hey, this flag seems to have been obsoleted by the specification and doesn't do anything currently.

One way to do voice activity detection is to register the audio level header extension so that it gets negotiated:

    media_engine
        .register_header_extension(
            webrtc::RTCRtpHeaderExtensionCapability {
                uri: String::from("urn:ietf:params:rtp-hdrext:ssrc-audio-level"),
            },
            webrtc::RTPCodecType::Audio,
            None,
        )

With this, the other side will send audio level data in every audio RTP packet, which you can extract to distinguish silence from audio. In browsers you can use RTCRtpReceiver.getSynchronizationSources()[0].audioLevel to get the audio level at an instant in time (getContributingSources() reports the per-CSRC levels when the stream comes from a mixer).
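For reference, here is a minimal sketch of how the extension payload could be decoded once you have its bytes. It assumes the one-byte layout from RFC 6464 (top bit is the voice activity flag "V", low 7 bits are the level in -dBov, where 0 is loudest and 127 is digital silence). The names `AudioLevel`, `parse_audio_level` and `is_speech`, and the threshold value, are hypothetical helpers for illustration and not part of the webrtc crate; fetching the payload from the RTP header is left out.

```rust
/// Decoded payload of the `ssrc-audio-level` header extension (RFC 6464).
/// Hypothetical helper type, not part of the webrtc crate.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct AudioLevel {
    /// The "V" bit: set when the sender believes the packet contains voice.
    pub voice_activity: bool,
    /// Level in -dBov: 0 is the loudest possible level, 127 is digital silence.
    pub level: u8,
}

/// Parses the extension payload. Returns `None` if the payload is empty.
/// Only the first byte is read; any padding after it is ignored.
pub fn parse_audio_level(payload: &[u8]) -> Option<AudioLevel> {
    let byte = *payload.first()?;
    Some(AudioLevel {
        voice_activity: (byte & 0x80) != 0,
        level: byte & 0x7F,
    })
}

/// Naive speech check (an assumption, tune for your use case): treat a packet
/// as speech when it is louder than `silence_threshold`. Because the unit is
/// -dBov, "louder" means a numerically smaller value.
pub fn is_speech(audio: AudioLevel, silence_threshold: u8) -> bool {
    audio.level < silence_threshold
}
```

For example, a payload byte of `0x8A` decodes to `voice_activity: true` and `level: 10` (that is, -10 dBov). Some senders do not set the V bit reliably, so thresholding on the level, ideally smoothed over several packets, tends to be the more robust signal.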

@lewiszlw
Author

Thanks for the quick answer. Sorry, I realize my own requirements are still unclear in places, so I'm closing this for now.
