Dynamic Mixing For Speech Processing (mix-on-the-fly)

khanld/Dynamic-Mixing

DYNAMIC MIXING FOR SPEECH PROCESSING (MIX-ON-THE-FLY)

Documentation

Easy-to-use Python code for dynamic mixing in speech processing tasks such as speech enhancement, speech source separation, target speech extraction, and speech augmentation.

Installation

pip install -r requirements.txt

Usage

Read through the DynamicMixing arguments before using the class. You must provide bg_noise_dataset, bb_noise_dataset, or both.
Inline Python usage:

from DynamicMixing import DynamicMixing

mixer = DynamicMixing(bg_noise_dataset = 'audios/bg_noise.txt',
                      bb_noise_dataset = 'audios/bb_noise.txt',
                      rir_dataset = 'audios/rir.txt',
                      snr_range = [-5, 25],
                      sir_range = [-5, 25],
                      sr = 16000,
                      max_bg_noise_to_mix = 3,
                      max_speakers_to_mix = 3,
                      reverb_proportion = 0.5,
                      target_level = -25,
                      target_level_floating_value = 10,
                      allowed_overlapped_bg_noise = True,
                      silence_length = 0.2,
                      saved_dir = 'audios/noisy')

clean_path = 'audios/clean/book_00000_chp_0009_reader_06709_2.wav'
output = mixer.generate(clean_path, save_to_dir = True)

# output is a dictionary; see the DynamicMixing code for its keys
print("Output: ", output)

# get the noisy data
noisy_y = output['noisy']
print("Noisy data: ", noisy_y)
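
To make the snr_range argument more concrete, here is a minimal sketch of what mixing at a sampled signal-to-noise ratio usually means. It is not taken from the DynamicMixing source: the helper names `rms` and `mix_at_snr` are hypothetical, the signals are synthetic, and the sketch assumes the common RMS-based definition of SNR. It uses only the standard library so that it runs without any of the project's requirements.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean/noise RMS ratio matches `snr_db`, then add it.

    Hypothetical helper for illustration; DynamicMixing may do this differently.
    """
    # SNR_dB = 20 * log10(rms_clean / rms_noise), solved for the noise RMS.
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    return [c + gain * n for c, n in zip(clean, noise)]


rng = random.Random(0)
sr = 16000
clean = [math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]  # 1 s test tone
noise = [rng.gauss(0.0, 1.0) for _ in range(sr)]                   # white noise

snr_db = rng.uniform(-5, 25)  # one draw from snr_range = [-5, 25]
noisy = mix_at_snr(clean, noise, snr_db)

# Sanity check: the added component (noisy - clean) sits at the requested SNR.
residual = [y - c for y, c in zip(noisy, clean)]
achieved = 20 * math.log10(rms(clean) / rms(residual))
print(f"requested {snr_db:.2f} dB, achieved {achieved:.2f} dB")
```

Drawing a fresh SNR for every call is what makes the mixing "on the fly": the same clean file yields a different noisy version each time it is requested.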

Generate noisy audio files and save them from the command line:

python generate.py \
    --clean_dataset=audios/clean.txt \
    --bg_noise_dataset=audios/bg_noise.txt \
    --bb_noise_dataset=audios/bb_noise.txt \
    --rir_dataset=audios/rir.txt \
    --snr_range=-5,10 \
    --sir_range=-5,25 \
    --max_bg_noise_to_mix=3 \
    --max_speakers_to_mix=3 \
    --reverb_proportion=0.5 \
    --target_level=-25 \
    --target_level_floating_value=10 \
    --allowed_overlapped_bg_noise=true \
    --silence_length=0.2 \
    --saved_dir=audios/noisy
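
The range flags above take comma-separated pairs such as `-5,10`, and the boolean flag takes the string `true`. The sketch below shows one way such values could be parsed with `argparse`. It is an assumption for illustration, not the actual code in generate.py; the helpers `float_pair` and `str_to_bool` are hypothetical names.

```python
import argparse


def float_pair(text):
    """Parse 'low,high' into [low, high], e.g. '-5,10' -> [-5.0, 10.0]."""
    parts = text.split(",")
    if len(parts) != 2:
        raise argparse.ArgumentTypeError(f"expected 'low,high', got {text!r}")
    try:
        low, high = (float(p) for p in parts)
    except ValueError:
        raise argparse.ArgumentTypeError(f"non-numeric bound in {text!r}")
    if low > high:
        raise argparse.ArgumentTypeError(f"low bound exceeds high bound in {text!r}")
    return [low, high]


def str_to_bool(text):
    """Parse 'true'/'false' explicitly; bool('false') would wrongly be True."""
    lowered = text.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {text!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--snr_range", type=float_pair, default=[-5.0, 25.0])
parser.add_argument("--sir_range", type=float_pair, default=[-5.0, 25.0])
parser.add_argument("--allowed_overlapped_bg_noise", type=str_to_bool, default=True)

# Parse the same values used in the command above.
args = parser.parse_args([
    "--snr_range=-5,10",
    "--sir_range=-5,25",
    "--allowed_overlapped_bg_noise=true",
])
print(args.snr_range, args.sir_range, args.allowed_overlapped_bg_noise)
# prints: [-5.0, 10.0] [-5.0, 25.0] True
```

Writing the flags in the `--flag=value` form, as the command above does, keeps a leading minus sign in the value from being mistaken for another option.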