
preprocessing_mel question #18

Closed
Kerry0123 opened this issue Sep 15, 2020 · 6 comments

Kerry0123 commented Sep 15, 2020

Hi, I have a doubt about the preprocessing_mel function. I use the following preprocessing method, but the generated audio file is silent.

```python
def melspectrogram(wav, hparams):
    D = _stft(preemphasis(wav, hparams.preemphasis, hparams.preemphasize), hparams)
    S = _amp_to_db(_linear_to_mel(np.abs(D), hparams), hparams) - hparams.ref_level_db

    if hparams.signal_normalization:
        return _normalize(S, hparams)
    return S

def _stft(y, hparams):
    if hparams.use_lws:  # use_lws is False in my setup
        return _lws_processor(hparams).stft(y).T
    else:
        return librosa.stft(y=y, n_fft=hparams.n_fft, hop_length=get_hop_size(hparams), win_length=hparams.win_size)

def _linear_to_mel(spectrogram, hparams):
    global _mel_basis
    if _mel_basis is None:
        _mel_basis = _build_mel_basis(hparams)
    return np.dot(_mel_basis, spectrogram)

def _amp_to_db(x, hparams):
    # with min_level_db = -100 this is min_level = 10 ** (-100 / 20) = 1e-5
    min_level = np.exp(hparams.min_level_db / 20 * np.log(10))
    return 20 * np.log10(np.maximum(min_level, x))

def _normalize(S, hparams):
    if hparams.allow_clipping_in_normalization:  # True in my setup
        if hparams.symmetric_mels:  # True in my setup
            return np.clip((2 * hparams.max_abs_value) * ((S - hparams.min_level_db) / (-hparams.min_level_db)) - hparams.max_abs_value,
                           -hparams.max_abs_value, hparams.max_abs_value)
        else:
            return np.clip(hparams.max_abs_value * ((S - hparams.min_level_db) / (-hparams.min_level_db)), 0, hparams.max_abs_value)
```
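As a side note on `_amp_to_db`: assuming `min_level_db = -100` (the common Tacotron default, not stated explicitly above), the exponential form of the floor and the plain power-of-ten form are the same number, so any amplitude below 1e-5 is floored to -100 dB:

```python
import numpy as np

# exp((min_level_db / 20) * ln(10)) is just 10 ** (min_level_db / 20)
min_level_exp = np.exp(-100 / 20 * np.log(10))
min_level_pow = 10 ** (-100 / 20)
print(min_level_exp, min_level_pow)  # both 1e-05

# so any amplitude at or below 1e-5 is floored to -100 dB
floor_db = 20 * np.log10(np.maximum(min_level_pow, 0.0))
print(floor_db)  # -100.0
```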

The main differences are `S = _amp_to_db(_linear_to_mel(np.abs(D), hparams), hparams) - hparams.ref_level_db` and `_normalize`, with `hparams.ref_level_db = 20` and `hparams.max_abs_value = 4`.
My data is in [-4, 4], while your preprocessing produces data in [0, 1]. Does the data range have a great influence on the model? I don't understand; I am asking for your help. Thank you.
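To make the range question concrete: both conventions are affine maps of the same dB-scale values, so either range can work in principle as long as training and inference use the same one. A minimal sketch, assuming `min_level_db = -100` and `max_abs_value = 4` (my hparams, not necessarily the repo's):

```python
import numpy as np

min_level_db = -100.0
max_abs_value = 4.0

def normalize_symmetric(S):
    # maps [min_level_db, 0] dB onto [-4, 4], as in the tacotron2-style _normalize above
    return np.clip((2 * max_abs_value) * ((S - min_level_db) / (-min_level_db)) - max_abs_value,
                   -max_abs_value, max_abs_value)

def normalize_unit(S):
    # maps [min_level_db, 0] dB onto [0, 1]
    return np.clip((S - min_level_db) / (-min_level_db), 0.0, 1.0)

S = np.array([-100.0, -50.0, 0.0])
print(normalize_symmetric(S))  # [-4.  0.  4.]
print(normalize_unit(S))       # [0.   0.5  1. ]
```

Since one range is just a scale-and-shift of the other, a mismatch only breaks things when the vocoder was trained on one convention and fed spectrograms in the other.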

@Kerry0123

I am asking for your help. Thank you.


bshall commented Sep 15, 2020

Hi @Kerry0123,

Did you retrain the model with your preprocessing steps or did you feed your spectrograms directly to the pretrained model?

@Kerry0123

I retrained the model with my preprocessing steps. The loss at epoch 1 is 0.66, and the loss drops to 0. I am asking for your help. Thank you.


bshall commented Sep 16, 2020

@Kerry0123, something weird is going on because that loss is very low. What dataset are you using? The ZeroSpeech one? Also, could you share an example spectrogram so I can check if anything is odd?

@Kerry0123

The dataset is BZNSYP (a Chinese dataset). To align the output of the synthesizer with the input of the vocoder, I use the preprocessing of the tacotron2 synthesizer. Its GitHub link: https://github.com/cnlinxi/style-token_tacotron2

python preprocess.py --dataset=biaobei --base_dir=/tmp-data/data/ --output=/nfs/volume-340-1/tts_data_preprocess/training_data_biaobe.

Is it convenient to tell me your email address? I will send you the mel file. I am asking for your help. Thank you.


bshall commented Sep 16, 2020

Sure, you can send it to benjamin.l.van.niekerk@gmail.com

Just to check, you kept all the other preprocessing the same e.g. mu-law encoding and all the padding stuff here?
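For reference, the mu-law encoding mentioned here can be sketched as generic 8-bit mu-law companding; this is an illustrative version, not necessarily the repo's exact implementation:

```python
import numpy as np

def mulaw_encode(x, mu=255):
    # mu-law compand waveform samples in [-1, 1], then quantize to {0, ..., mu}
    fx = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return ((fx + 1) / 2 * mu + 0.5).astype(np.int64)

def mulaw_decode(y, mu=255):
    # map class indices back to [-1, 1] and invert the companding
    fx = 2 * y.astype(np.float64) / mu - 1
    return np.sign(fx) * np.expm1(np.abs(fx) * np.log1p(mu)) / mu
```

If the vocoder expects mu-law class indices as targets but receives raw (or differently encoded) audio, the output will be badly distorted even when the spectrograms are fine, which is why it is worth double-checking this step along with the padding.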
