How to speed up synthesis? #134

OnceJune · 2021-12-22T01:07:45Z

Hi, I tried using WORLD for synthesis on mobile phones. The audio quality is good, but the speed is not fast enough. Is there any way to speed up synthesis? I called synthesisrealtime with a very small FFT length, and I noticed that there are 7 forward/inverse FFTs when processing a single frame. Is it possible to reduce that number? Thanks in advance.

mmorise · 2021-12-22T07:33:16Z

It isn't easy to speed up the synthesis with the implemented algorithm. To make it faster, you would need to implement a different algorithm, and I have proposed one for this purpose. Since that algorithm has not been released yet, you would have to implement it yourself if you need it.
https://ieeexplore.ieee.org/document/9023206

Another approach is to reduce the sampling frequency. Sampling at 24 kHz (or 22.05 kHz) is a reasonable choice that does not degrade the sound quality, and it is straightforward.

OnceJune · 2021-12-22T07:56:23Z

@mmorise Thanks. Currently I'm synthesizing at 16 kHz with MGC order 59. I've tried an FFT length of 256, which gives good audio quality. When I decrease the FFT length to 128, the quality gets worse. If I use MGC order 23, do you think the quality will be good with an FFT length of 128?

OnceJune · 2021-12-22T08:40:39Z

https://ieeexplore.ieee.org/document/9023206

I read it but didn't really understand it, lol.

mmorise · 2021-12-23T03:08:00Z

I think the appropriate FFT length depends on the F0 of the input signal, and the MGC order would not affect the best FFT length.
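To make the dependence on F0 concrete, here is a minimal sketch of the heuristic that, as far as I understand it, WORLD's CheapTrick analysis uses to pick an FFT size from the sampling rate and the F0 floor (GetFFTSizeForCheapTrick). The helper `lowest_f0_for_fft_size` is a hypothetical name introduced here, derived by inverting that formula. This is an analysis-side rule; it is only an assumption that it is a useful guide for the synthesis FFT length as well.

```python
import math


def cheaptrick_fft_size(fs, f0_floor):
    """FFT size heuristic: 2 ** (1 + floor(log2(3 * fs / f0_floor + 1)))."""
    return 2 ** (1 + int(math.log2(3.0 * fs / f0_floor + 1)))


def lowest_f0_for_fft_size(fs, fft_size):
    """Hypothetical helper: F0 must be strictly above this bound (in Hz)
    for the heuristic above to return at most `fft_size`."""
    return 3.0 * fs / (fft_size - 1)


if __name__ == "__main__":
    fs = 16000
    for fft_size in (128, 256, 512, 1024):
        bound = lowest_f0_for_fft_size(fs, fft_size)
        print(f"fft_size={fft_size:5d}: heuristic satisfied for F0 > {bound:.1f} Hz")
```

Under this heuristic, an FFT length of 128 at 16 kHz would only be adequate for F0 values well above typical speech, which is consistent with the quality drop reported above. The MGC order does not appear in the formula at all.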

OnceJune · 2021-12-23T09:39:26Z

https://ieeexplore.ieee.org/document/9023206

Do I understand it correctly? (Please delete this comment if I shouldn't write it here, since your paper is not released yet. :))

1. Prepare 7 band-pass filters;
2. Prepare MVN;
3. Prepare Pulse (Is it minimum phase using sp?);
4. Multiply 1 & 3;
5. Conv 2 & 3;
6. Multiply each subband from 4 by 1-ap, then sum together;
7. Multiply each subband from 5 by interpolated ap, then sum together;
8. Add 6 & 7.

Thank you again.

mmorise · 2021-12-24T08:36:42Z

There are several tunings for 16-kHz speech synthesis. For example, the number of band-pass filters is three. Fig. 1 in the paper shows how to generate the excitation signal. After that, the algorithm processes the excitation signal with a simple overlap-add (OLA) algorithm. This idea is similar to mixed excitation.

Prepare Pulse (Is it minimum phase using sp?);

No. This algorithm uses a zero-phase spectrum to compensate for the original signal completely.
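For readers unfamiliar with overlap-add, the following is a generic sketch of frame-wise filtering with OLA, not the paper's exact procedure. It assumes 50%-overlapping periodic Hann windows (which sum to one) and one filter spectrum per frame; the function name `ola_filter` and its parameters are hypothetical.

```python
import numpy as np


def ola_filter(excitation, frame_filters, hop):
    """Filter `excitation` frame by frame and overlap-add the results.

    frame_filters: array of shape (n_frames, fft_size // 2 + 1), one
    (real or complex) spectrum per frame. fft_size should be at least
    2 * hop plus the filter's impulse response length minus one, so the
    circular convolution does not wrap around.
    """
    win_len = 2 * hop
    fft_size = 2 * (frame_filters.shape[1] - 1)
    # Periodic Hann window: window[n] + window[n + hop] == 1.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(win_len) / win_len)
    out = np.zeros(len(excitation) + fft_size)
    for i, spectrum in enumerate(frame_filters):
        start = i * hop
        segment = excitation[start:start + win_len]
        if len(segment) < win_len:
            break  # skip an incomplete final frame
        seg_spec = np.fft.rfft(segment * window, fft_size)
        out[start:start + fft_size] += np.fft.irfft(seg_spec * spectrum, fft_size)
    return out[:len(excitation)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1024)
    identity = np.ones((15, 129))  # all-pass filter for every frame
    y = ola_filter(x, identity, hop=64)
    print(np.allclose(y[64:960], x[64:960]))
```

With an all-pass filter the interior of the signal is reconstructed unchanged, which is a quick sanity check that the windows overlap correctly before plugging in real per-frame spectra.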

OnceJune · 2022-01-07T09:38:51Z

@mmorise Many thanks for your answer. I found the minimum-phase code in WORLD, but how can I get the zero-phase spectrum?

mmorise · 2022-01-07T13:23:22Z

The zero-phase spectrum of a spectrum X[k] is defined as |X[k]|. In this synthesis, we use zero phase as the phase spectrum of the excitation signals. After the excitation signal is generated, the minimum-phase spectrum generated from the spectral envelope is used.
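The difference between the two phase choices can be sketched with NumPy. The zero-phase response is simply the inverse FFT of the magnitude, while the minimum-phase spectrum below is built with the standard cepstral folding method; I believe WORLD's minimum-phase code works in a similar spirit, but that is an assumption. The function names and the toy magnitude envelope are hypothetical.

```python
import numpy as np


def zero_phase_response(magnitude):
    """Impulse response whose spectrum is |X[k]| with zero phase.

    magnitude: |X[k]| for k = 0 .. N/2 (one-sided, strictly positive).
    The result is real and symmetric around n = 0 (circularly).
    """
    return np.fft.irfft(magnitude)


def minimum_phase_spectrum(magnitude):
    """One-sided minimum-phase spectrum with the same magnitude,
    obtained by folding the real cepstrum onto positive quefrencies."""
    n_fft = 2 * (len(magnitude) - 1)
    cepstrum = np.fft.irfft(np.log(magnitude))
    folded = np.zeros(n_fft)
    folded[0] = cepstrum[0]
    folded[1:n_fft // 2] = 2.0 * cepstrum[1:n_fft // 2]
    folded[n_fft // 2] = cepstrum[n_fft // 2]
    return np.exp(np.fft.rfft(folded))


if __name__ == "__main__":
    omega = np.linspace(0.0, np.pi, 129)
    magnitude = 1.0 + 0.5 * np.cos(omega)  # toy envelope, always > 0
    h_zero = zero_phase_response(magnitude)
    x_min = minimum_phase_spectrum(magnitude)
    print(np.allclose(h_zero[1:], h_zero[1:][::-1]))  # symmetric response
    print(np.allclose(np.abs(x_min), magnitude))      # same magnitude
```

Both spectra share the same magnitude; only the phase, and therefore the shape of the impulse response in time, differs.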

bfs18 · 2023-11-02T09:34:01Z

Hi @mmorise, how is the pulse generated? Is it generated from the pitch with logic similar to GetPulseLocationsForTimeBase in the WORLD code?

mmorise · 2023-11-02T15:51:18Z

Yes, the pulse is generated from the temporal positions of the vocal cord vibrations calculated by GetPulseLocationsForTimeBase in the synthesis function. In detail, an amplitude of 1 is given at these positions.
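As a rough illustration of that idea, the sketch below accumulates the phase of a per-sample F0 contour and places a unit pulse wherever the wrapped phase jumps back, i.e. once per pitch period. It is a simplified stand-in, not a port of the WORLD function: the name `pulse_train` is hypothetical, F0 is assumed to be already interpolated to one value per sample, and unvoiced samples (F0 = 0) simply produce no pulses.

```python
import math


def pulse_train(f0_per_sample, fs):
    """Return (pulses, locations): a signal with amplitude 1 at each
    pitch mark and the sample indices of those marks."""
    pulses = [0.0] * len(f0_per_sample)
    locations = []
    phase = 0.0
    prev_wrapped = 0.0
    for i, f0 in enumerate(f0_per_sample):
        phase += 2.0 * math.pi * f0 / fs
        wrapped = math.fmod(phase, 2.0 * math.pi)
        # A jump larger than pi means the phase wrapped past 2 * pi.
        if abs(wrapped - prev_wrapped) > math.pi:
            pulses[i] = 1.0
            locations.append(i)
        prev_wrapped = wrapped
    return pulses, locations


if __name__ == "__main__":
    fs = 16000
    pulses, locations = pulse_train([100.0] * fs, fs)
    print(len(locations), locations[:3])
```

For a constant 100 Hz contour at 16 kHz the pulses land roughly every 160 samples; the exact index can shift by one sample because of floating-point rounding in the accumulated phase.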

bfs18 · 2023-11-03T05:04:29Z

Hi @mmorise, thanks for your kind reply. My savior is online now. xD
I'm implementing the algorithm, but due to my limited knowledge of audio signal processing, I have some questions about the details. Besides, this post is sort of misleading.

I annotated the questions in the figure.

1. Is the filter applied via sliding-window multiplication and summation (temporal convolution)?
2. Does the * symbol indicate temporal convolution? And is the temporal convolution implemented frame by frame via FFT? If so, this part runs the FFT N times, which is time-consuming.
3. Is Ap the AperiodicRatio in the WORLD code? Does the ▷ symbol indicate scalar multiplication?
4. c = sqrt(number of samples in frame)?
5. Is envelope shaping implemented by multiplying the temporal signal by the interpolated AperiodicRatio?
6. Is step 2 of the algorithm the same as I depicted? The spectrum is first transformed into a minimum-phase spectrum, which is then multiplied by the FFT of the excitation signal of the corresponding frame, and finally the IFFT is performed.
7. What is the number of taps of the filters used in 1?
8. How is v/uv used in this algorithm?
9. How do I "calculate the filter and the convolution in advance" as mentioned in Section III?

I'm sorry for so many questions and I look forward to your replies.

mmorise · 2023-11-03T05:43:37Z

Sorry, I misunderstood.
Do you have a MATLAB license? If yes, you can download a MATLAB implementation (please see TestWORLDRequiem.m for the usage).
https://www.isc.meiji.ac.jp/~mmorise/world/english/download.html

If you don't have it, I'll explain it again, but please give me some time because I have forgotten the details.

I didn't implement a C++ version, which would be helpful for practical use, because it may be close to a patent held by another company. This is a precaution to avoid patent trouble. I think it is unlikely to cause patent trouble, but please use it at your own risk if you implement this program in C++.

bfs18 · 2023-11-03T06:09:12Z

Thank you for your quick reply!!! Great, the MATLAB code is open-sourced. I'll dive into the MATLAB code first.

bfs18 · 2023-11-04T09:58:50Z

Hi @mmorise, the MATLAB code is concise and clear. Now I grasp the idea and implementation details of the paper. Thank you!!

mmorise closed this as completed Feb 6, 2022

mmorise reopened this Nov 3, 2023

mmorise closed this as completed Jan 25, 2024
