audio.vadwebrtc

This repository contains an R package which is an Rcpp wrapper around the webrtc Voice Activity Detection module.

example-vad.mp4

The package was created with as main goal to remove non-speech audio segments before doing an automatic transcription using audio.whisper to avoid transcription hallucinations. It contains

functions to detect the location of voice in audio using a Gaussian Mixture Model implemented in webrtc
functions to extract audio where there is voice / silence in a new audio file
functionality to rewrite the timepoints of transcribed sentences where specific sections with non-audio are removed to make sure the timepoints of the transcriptions without silences align with the original audio signal

Installation

The package is currently not on CRAN
For the development version of this package: remotes::install_github("bnosac/audio.vadwebrtc")

Look to the documentation of the functions: help(package = "audio.vadwebrtc")

Example

Get a audio file in 16 bit with mono PCM samples (pcm_s16le codec) with a sampling rate of either 8Khz, 16KHz or 32Khz

library(audio.vadwebrtc)
file <- system.file(package = "audio.vadwebrtc", "extdata", "test_wav.wav")
vad  <- VAD(file, mode = "normal")
vad
Voice Activity Detection 
  - file: D:/Jan/R/win-library/4.1/audio.vadwebrtc/extdata/test_wav.wav 
  - sample rate: 16000 
  - VAD type: webrtc-gmm, VAD mode: normal, VAD by milliseconds: 10, VAD frame_length: 160
    - Percent of audio containing a voiced signal: 90.2% 
    - Seconds voiced: 6.3 
    - Seconds unvoiced: 0.7
vad$vad_segments
 vad_segment start  end has_voice
           1  0.00 0.08     FALSE
           2  0.09 3.30      TRUE
           3  3.31 3.71     FALSE
           4  3.72 6.78      TRUE
           5  6.79 6.99     FALSE

Example of a simple plot of these audio and voice segments

library(av)
x <- read_audio_bin(file)
plot(seq_along(x) / 16000, x, type = "l", xlab = "Seconds", ylab = "Signal")
abline(v = vad$vad_segments$start, col = "red", lwd = 2)
abline(v = vad$vad_segments$end, col = "blue", lwd = 2)

Or show it interactively alongside R package wavesurfer: wavesurfer

library(wavesurfer)
library(shiny)
file <- system.file(package = "audio.vadwebrtc", "extdata", "test_wav.wav")
vad  <- VAD(file, mode = "lowbitrate")
anno <- data.frame(audio_id = vad$file, 
                   region_id = vad$vad_segments$vad_segment, 
                   start = vad$vad_segments$start, 
                   end = vad$vad_segments$end, 
                   label = ifelse(vad$vad_segments$has_voice, "Voiced", "Silent"))
anno <- subset(anno, label %in% "Silent")
  
wavs_folder <- system.file(package = "audio.vadwebrtc", "extdata")
shiny::addResourcePath("wav", wavs_folder)
ui <- fluidPage(
  wavesurferOutput("my_ws", height = "128px"),
  tags$p("Press spacebar to toggle play/pause."),
)
server <- function(input, output, session) {
  output$my_ws <- renderWavesurfer({
    wavesurfer(audio = paste0("wav/", "test_wav.wav"), annotations = anno) %>%
      ws_set_wave_color('#5511aa') %>%
      ws_cursor()
  })
}
shinyApp(ui = ui, server = server)

Support in text mining

Need support in text mining? Contact BNOSAC: http://www.bnosac.be

Name		Name	Last commit message	Last commit date
Latest commit History 56 Commits
.github/workflows		.github/workflows
R		R
inst		inst
man		man
src		src
tools		tools
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.note		LICENSE.note
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

audio.vadwebrtc

Installation

Example

Support in text mining

About

Licenses found

Releases 2

Packages

Languages

License

Licenses found

bnosac/audio.vadwebrtc

Folders and files

Latest commit

History

Repository files navigation

audio.vadwebrtc

Installation

Example

Support in text mining

About

Resources

License

Licenses found

Stars

Watchers

Forks

Releases 2

Packages 0

Languages

Packages