
(suggestion) Ability to preface Gaussian Diffusion with a user-selectable acoustic model #198

SouperDuper opened this issue Dec 10, 2023

Vaguely inspired by DiffSinger's Shallow Diffusion mechanism.

Gaussian Diffusion has a tendency to become "lost" at times, producing undesirable results.

The ability to (separately?) train and use a lower-quality acoustic model before diffusion could provide scaffolding that improves model stability, at the cost of additional processing overhead.
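
To make the idea concrete, here is a rough sketch of what such two-stage inference might look like, assuming a PyTorch setup. It follows the standard DDPM forward process to noise the auxiliary model's output to an intermediate step `k`, then denoises only from `k` down to 0 instead of starting from pure noise. `aux_model`, `diffusion`, `p_sample`, and `alphas_cumprod` are hypothetical placeholders, not existing NNSVS interfaces:

```python
import torch

def shallow_diffusion_infer(aux_model, diffusion, alphas_cumprod, lin_feats, k=40):
    # 1) Coarse prediction from the cheaper auxiliary acoustic model.
    coarse_mel = aux_model(lin_feats)                    # (B, frames, n_mels)

    # 2) Forward-diffuse the coarse prediction to intermediate step k
    #    (standard q(x_k | x_0)), rather than sampling pure noise at step T.
    noise = torch.randn_like(coarse_mel)
    a_bar = alphas_cumprod[k]
    x_k = a_bar.sqrt() * coarse_mel + (1.0 - a_bar).sqrt() * noise

    # 3) Denoise only from step k down to 0, conditioned on the same
    #    linguistic features; p_sample stands in for one reverse step.
    x = x_k
    for t in reversed(range(k)):
        x = diffusion.p_sample(x, t, cond=lin_feats)
    return x
```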

In addition, the suggestion is to let users select which acoustic model is used, so they can weigh the pros and cons of each option, since the choice may influence the final result.
(Such as FFConvLSTM → GaussianDiffusion compared to BiLSTMResF0NonAttentiveDecoder → GaussianDiffusion.)
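
For illustration only, one way this selection could be exposed is by instantiating the pre-diffusion model from a dotted class path given in the config. The helper below is a sketch; the exact module paths for FFConvLSTM / BiLSTMResF0NonAttentiveDecoder depend on the installed NNSVS version:

```python
import importlib

def build_aux_model(class_path: str, **kwargs):
    # e.g. class_path = "nnsvs.model.FFConvLSTM" (module path shown as an
    # example; check where the class lives in your NNSVS version).
    module_name, class_name = class_path.rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)
```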
