How should I train audio with 24k and 44.1k sampling rates? #2

accum-dai · 2024-05-07T16:10:26Z

It seems that the model upsamples based on hop_size, so training on 24 kHz or 44.1 kHz audio should work the same way as training on 22050 Hz audio: I only need to change the sampling rate settings in params.py. However, I'm not sure about one thing: do I need to modify this function?

import torchaudio

def remove_cutoff_frequency(signal):
    # Both filters are configured with half of the hard-coded 22050 Hz rate.
    signal = torchaudio.functional.highpass_biquad(
        signal, sample_rate=22050 // 2, cutoff_freq=15
    )
    signal = torchaudio.functional.lowpass_biquad(
        signal, sample_rate=22050 // 2, cutoff_freq=5500
    )
    return signal

signofthefour · 2024-06-24T05:48:38Z

@accum-dai Sorry for the late reply!
This function removes aliasing artifacts around the cut-off frequency. It is only used for post-processing, not for training, but I recommend keeping it because it slightly improves the output quality.
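
The reply does not say whether the hard-coded 22050 should change for other sampling rates. As a rough sketch only, the function could take the rate as an argument. It assumes the filter rate stays at half the audio rate and that the low-pass cutoff keeps the same ratio to it as in the original (5500 / 11025). Neither assumption is confirmed in this thread. The names `filter_settings`, `remove_cutoff_frequency_at`, and `REFERENCE_RATE` are hypothetical and not part of the repository.

```python
REFERENCE_RATE = 22050  # rate hard-coded in the original function


def filter_settings(sample_rate):
    """Return filter parameters scaled from the 22050 Hz defaults (assumption)."""
    scale = sample_rate / REFERENCE_RATE
    return {
        # Mirrors `22050 // 2` in the original function.
        "filter_rate": sample_rate // 2,
        # Assumption: the 15 Hz high-pass only removes DC/rumble, so it stays fixed.
        "highpass_cutoff": 15.0,
        # Assumption: keep the low-pass cutoff at the same fraction of the rate.
        "lowpass_cutoff": 5500.0 * scale,
    }


def remove_cutoff_frequency_at(signal, sample_rate=REFERENCE_RATE):
    """Hypothetical rate-aware variant of remove_cutoff_frequency."""
    # Imported here so filter_settings can be used without torchaudio installed.
    import torchaudio

    s = filter_settings(sample_rate)
    signal = torchaudio.functional.highpass_biquad(
        signal, sample_rate=s["filter_rate"], cutoff_freq=s["highpass_cutoff"]
    )
    signal = torchaudio.functional.lowpass_biquad(
        signal, sample_rate=s["filter_rate"], cutoff_freq=s["lowpass_cutoff"]
    )
    return signal


if __name__ == "__main__":
    for rate in (22050, 24000, 44100):
        print(rate, filter_settings(rate))
```

With the default rate this reproduces the constants from the original function, so it would only behave differently when a different `sample_rate` is passed. Since the filter is applied in post-processing, it can be compared by ear against the unmodified version on generated samples.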
