Speaker Separation with Labels v1.0

Date completed	November 14th, 2023
Release where first appeared	OpenWillis v1.6
Researcher / Developer	Vijay Yadav

1 – Use

import openwillis as ow

speaker_dict = ow.speaker_separation_labels(filepath = '', transcript_json = '')

To use the speaker_dict output to save the separated audio files:

ow.to_audio(filepath = '', speaker_dict = speaker_dict, output_dir = '')

2 – Methods

This function splits an audio file containing multiple speakers into a separate audio signal for each speaker. It relies on the speaker labels assigned during transcription, so the audio must already have been transcribed with speaker labeling. It is meant to work alongside the Speech transcription with AWS and Speech transcription with Whisper functions in OpenWillis.

The user must provide the source audio file that was originally transcribed and the JSON output from the speech transcription function they used. In return, the function will output a speaker dictionary, named speaker_dict.

This dictionary can then be fed into the to_audio function to save the separated audio files in the specified output directory output_dir. The naming convention for these files is filename_speakerlabel, with filename indicating the original audio source's name and speakerlabel denoting the respective speaker's label from the JSON output.
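The naming convention above can be illustrated with a minimal sketch. It does not call OpenWillis; the helper name separated_audio_paths, the example speaker labels, and the .wav extension are assumptions made for illustration, since the source only specifies the filename_speakerlabel pattern.

```python
from pathlib import Path


def separated_audio_paths(filepath, speaker_labels, output_dir, extension=".wav"):
    """Build one output path per speaker as <filename>_<speakerlabel><extension>.

    Hypothetical helper: the extension is an assumption, not taken from OpenWillis.
    """
    stem = Path(filepath).stem  # original audio file name without its extension
    return {
        label: Path(output_dir) / f"{stem}_{label}{extension}"
        for label in speaker_labels
    }


# Example with made-up speaker labels
paths = separated_audio_paths(
    "recordings/interview.wav", ["speaker0", "speaker1"], "separated"
)
for label, path in paths.items():
    print(label, path.name)
```

The actual labels depend on what the transcription JSON contains, so the ones shown here are placeholders only.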

3 – Inputs

3.1 – `filepath`

Type	String
Description	Path to audio file to be separated

3.2 – `transcript_json`

Type	JSON
Description	JSON output from the Speech transcription with AWS/Whisper functions

4 – Outputs

4.1 – `speaker_dict`

Type	Dictionary
Description	A dictionary with the speaker label as the key and that speaker's audio signal, stored as a NumPy array, as the value.
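As a rough sketch of how a dictionary shaped like speaker_dict might be inspected, the snippet below reports how much audio each speaker contributed. The helper speaker_durations, the speaker labels, and the 16000 Hz sample rate are assumptions; the source does not state how the sample rate is exposed, and the signals are assumed to be one-dimensional (mono).

```python
def speaker_durations(speaker_dict, sample_rate):
    """Return seconds of audio per speaker label, assuming 1-D (mono) signals."""
    return {
        label: len(signal) / sample_rate
        for label, signal in speaker_dict.items()
    }


# Stand-in data: plain lists of samples. len() gives the sample count
# in the same way for 1-D NumPy arrays.
example_dict = {"speaker0": [0.0] * 32000, "speaker1": [0.0] * 8000}

durations = speaker_durations(example_dict, sample_rate=16000)  # assumed rate
for label, seconds in durations.items():
    print(f"{label}: {seconds:.2f} s")
```

A check like this can be a quick way to confirm that every labeled speaker received some audio before saving files.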

OpenWillis was developed by a small team of clinicians, scientists, and engineers based in Brooklyn, NY.
