Speaker Separation with Labels v1.0
Date completed | November 14th, 2023 |
Release where first appeared | OpenWillis v1.6 |
Researcher / Developer | Vijay Yadav |
import openwillis as ow
speaker_dict = ow.speaker_separation_labels(filepath = '', transcript_json = '')
To use the speaker_dict output to save the separated audio files:
ow.to_audio(filepath = '', speaker_dict = '', output_dir = '')
This function splits an audio file with multiple speakers into individual files for each speaker, assuming prior speaker labeling during transcription. It is meant to work alongside the Speech transcription with AWS and Speech transcription with Whisper functions in OpenWillis.
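As a rough illustration of the idea only, and not OpenWillis's actual implementation, the sketch below groups audio samples by speaker label. The segment format (`speaker`, `start`, `end` in seconds) and the helper name `split_by_speaker` are hypothetical; the transcript JSON produced by the AWS or Whisper functions may be structured differently.

```python
from collections import defaultdict


def split_by_speaker(samples, sample_rate, segments):
    """Concatenate each speaker's labeled segments into one list of samples.

    samples: sequence of mono audio samples.
    sample_rate: samples per second.
    segments: HYPOTHETICAL list of dicts with 'speaker', 'start', 'end' (seconds).
    """
    per_speaker = defaultdict(list)
    # Process segments in time order so each speaker's audio stays chronological.
    for seg in sorted(segments, key=lambda s: s["start"]):
        first = max(0, int(seg["start"] * sample_rate))
        last = min(len(samples), int(seg["end"] * sample_rate))
        per_speaker[seg["speaker"]].extend(samples[first:last])
    return dict(per_speaker)


# Toy input: 10 "samples" at 1 sample per second, two alternating speakers.
samples = list(range(10))
segments = [
    {"speaker": "speaker0", "start": 0, "end": 3},
    {"speaker": "speaker1", "start": 3, "end": 7},
    {"speaker": "speaker0", "start": 7, "end": 10},
]
toy_speaker_dict = split_by_speaker(samples, 1, segments)
```

The result maps each label to that speaker's concatenated samples, which mirrors the shape of the speaker dictionary described below.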
The user must provide the source audio file that was originally transcribed and the JSON output from the speech transcription function they used. In return, the function will output a speaker dictionary, named speaker_dict.
This dictionary can then be fed into the to_audio function to save the separated audio files in the specified output directory output_dir. The naming convention for these files is filename_speakerlabel, with filename indicating the original audio source's name and speakerlabel denoting the respective speaker's label from the JSON output.
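The naming convention can be sketched with the standard library alone. This is not the to_audio implementation: the helper name `save_speakers`, the `.wav` extension, the 16-bit mono format, and the default sample rate are all assumptions made for the example.

```python
import sys
import wave
from array import array
from pathlib import Path


def save_speakers(filepath, speaker_dict, output_dir, sample_rate=16000):
    """Write one 16-bit mono WAV per speaker, named <filename>_<speakerlabel>.wav.

    HYPOTHETICAL helper; assumes every sample fits in the int16 range.
    """
    stem = Path(filepath).stem  # original file name without its extension
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for label, signal in speaker_dict.items():
        out_path = out_dir / f"{stem}_{label}.wav"
        pcm = array("h", (int(s) for s in signal))
        if sys.byteorder == "big":
            pcm.byteswap()  # WAV data is little-endian
        with wave.open(str(out_path), "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
            wav.setframerate(sample_rate)
            wav.writeframes(pcm.tobytes())
        written.append(out_path)
    return written
```

With a source file named `interview.wav` and labels `speaker0` and `speaker1`, this sketch would write `interview_speaker0.wav` and `interview_speaker1.wav` into the output directory.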
Input parameters
filepath
Type | String |
Description | Path to audio file to be separated |
transcript_json
Type | JSON |
Description | JSON output from the Speech transcription with AWS/Whisper functions |
Output
speaker_dict
Type | Dictionary |
Description | A dictionary with the speaker label as the key and the audio signal numpy array as the value. |
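Because the dictionary maps labels to audio signals, it can be inspected before any files are written. The sketch below reports how much audio each speaker has. The helper name `speaking_time` is hypothetical, each signal is assumed to be one-dimensional (mono), and the sample rate must be supplied separately since the dictionary as described holds only the signals.

```python
def speaking_time(speaker_dict, sample_rate):
    """Return the seconds of audio held for each speaker label.

    Works with any 1-D sequence that supports len(), including numpy arrays.
    """
    return {
        label: len(signal) / sample_rate
        for label, signal in speaker_dict.items()
    }


# Toy dictionary with silent placeholder signals at an assumed 16 kHz rate.
toy = {"speaker0": [0] * 32000, "speaker1": [0] * 8000}
durations = speaking_time(toy, 16000)
```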
OpenWillis was developed by a small team of clinicians, scientists, and engineers based in Brooklyn, NY.