Implement melody encoder and support glide input #143

Merged: 16 commits, Oct 8, 2023

@yqzhishen (Member) commented Sep 15, 2023

Implementation of #142.

Experiment results

To test the effect of the melody encoder, we trained pitch predictors on three single-singer datasets:

  • Female#1, with a plain pitch style.
  • Female#2, with an extremely expressive pitch style.
  • Male#1, with an expressive pitch style.

The maximum RPA (raw pitch accuracy, with a tolerance of 50 cents) reached by each configuration after convergence (>150k steps) is compared below.

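| w/ base pitch | w/ melody encoder | w/ glide embedding | Female#1 | Female#2 | Male#1 |
|:---:|:---:|:---:|:---:|:---:|:---:|
| ✓ | × | N/A | 0.8613 | 0.6128 | 0.6073 |
| ✓ | ✓ | × | 0.8744 | 0.6575 | 0.6276 |
| × | ✓ | × | 0.8629 | 0.6879 | 0.6461 |
| × | ✓ | ✓ | - | 0.6961 | - |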
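
For readers unfamiliar with the metric, the following is a minimal sketch of how RPA with a 50-cent tolerance can be computed. It is not taken from this pull request: the function name `raw_pitch_accuracy`, the convention that a value of 0 Hz marks an unvoiced frame, and the choice to score only frames that are voiced in the reference are all assumptions made for illustration.

```python
import math


def raw_pitch_accuracy(pred_hz, ref_hz, tolerance_cents=50.0):
    """Fraction of voiced reference frames whose predicted pitch lies
    within `tolerance_cents` of the reference pitch.

    Assumption: a value <= 0 marks an unvoiced frame.
    """
    if len(pred_hz) != len(ref_hz):
        raise ValueError("pred_hz and ref_hz must have the same length")

    voiced = 0
    correct = 0
    for pred, ref in zip(pred_hz, ref_hz):
        if ref <= 0:
            continue  # skip frames that are unvoiced in the reference
        voiced += 1
        if pred <= 0:
            continue  # a voiced frame predicted as unvoiced counts as a miss
        # Pitch distance in cents: 1200 cents per octave.
        deviation = abs(1200.0 * math.log2(pred / ref))
        if deviation <= tolerance_cents:
            correct += 1
    return correct / voiced if voiced else 0.0


# Toy example: one exact hit, one frame 60 cents sharp (miss),
# one unvoiced reference frame (ignored), one frame 40 cents flat (hit).
reference = [440.0, 440.0, 0.0, 220.0]
predicted = [440.0, 440.0 * 2 ** (60 / 1200), 300.0, 220.0 * 2 ** (-40 / 1200)]
rpa = raw_pitch_accuracy(predicted, reference)
```

In this toy example two of the three voiced frames fall within the tolerance, so `rpa` is 2/3.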

The results show that the melody encoder is better suited than base pitch for carrying music score information, especially on expressive datasets. On TensorBoard, we also observed significant improvements on short slurs and long vibratos. In our internal tests, pitch predictors with the melody encoder also outperformed the old method on out-of-range notes and remained responsive to the score even when it was far above the normal range (e.g. over C7 for a male singer). [Demo]

Additional experiments on ornaments: the glides

With the melody encoder modeling the note sequence, we were able to introduce ornament flags into the architecture of the variance model. This time we tested glides, where the pitch rises smoothly at the beginning of a note or drops at its end. We labeled 31 notes that glide up and 75 notes that glide down in 71 minutes of data from Female#1 and left everything else unchanged. The experiment showed a slightly higher RPA with glide type embedding than the baseline. In further tests, the melody encoder with glide type embedding produced accurate and natural glides from simple glide flags, without the manual pitch curve drawing that was needed before. [Demo]
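
As a rough illustration of what "glide type embedding" could mean at the note level, the sketch below maps per-note glide flags to integer IDs and adds a looked-up vector to each note's feature vector. This is only one plausible reading, not the code in this pull request: the vocabulary `GLIDE_TYPES`, the helper names, the element-wise addition, and the hand-written toy lookup table (a trained model would learn these vectors) are all hypothetical.

```python
# Hypothetical glide vocabulary; index 0 is reserved for notes without a glide.
GLIDE_TYPES = ("none", "up", "down")
GLIDE_TO_ID = {name: idx for idx, name in enumerate(GLIDE_TYPES)}


def encode_glide_flags(flags):
    """Map per-note glide labels to integer IDs for an embedding lookup."""
    try:
        return [GLIDE_TO_ID[flag] for flag in flags]
    except KeyError as err:
        raise ValueError(f"unknown glide type: {err.args[0]!r}") from None


def add_glide_embedding(note_features, glide_ids, glide_table):
    """Add each note's glide vector to its feature vector, element-wise."""
    return [
        [x + g for x, g in zip(features, glide_table[glide_id])]
        for features, glide_id in zip(note_features, glide_ids)
    ]


# Toy example: three notes with 2-dimensional features.
note_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# Hand-written stand-in for a learned embedding table (one row per glide type).
glide_table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

glide_ids = encode_glide_flags(["none", "up", "down"])
mixed = add_glide_embedding(note_features, glide_ids, glide_table)
```

Under this reading, a user only supplies one flag per note, and notes without a flag keep their features unchanged because the "none" row is all zeros in this toy table.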

@yqzhishen linked an issue Sep 15, 2023 that may be closed by this pull request
@yqzhishen marked this pull request as ready for review October 7, 2023 16:17
@yqzhishen merged commit 3ae0e7d into openvpi:main Oct 8, 2023
Development

Successfully merging this pull request may close these issues.

Melody encoder and ornaments modeling