[teleMelody] How to import lyric to the generated midi sample? #78

Closed
elricwan opened this issue Oct 17, 2022 · 9 comments

Comments

@elricwan

Hi @jzq2000, I have a quick question about inference. In the example link https://ai-muzic.github.io/telemelody/, I notice that each sample contains both media and lyrics. From the code, we can generate the MIDI file given the lyrics. So how do we match them together? Thank you!

@jzq2000
Collaborator

jzq2000 commented Oct 18, 2022

To match them, we add each lyric with a timestamp to the MIDI file; please refer to the code. The music scores in the example link are exported from the MuseScore app.
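
As an illustration of what "a lyric with a timestamp" means inside a MIDI file, here is a minimal sketch that writes lyric meta events (type 0x05 in the Standard MIDI File format) at the same tick as each note onset. It uses only the Python standard library and is not the TeleMelody code; the helper names (`vlq`, `lyric_event`, `build_track`, `build_midi`), the fixed velocity, and the one-syllable-per-note layout are assumptions made for this example.

```python
import struct
from typing import List, Tuple


def vlq(value: int) -> bytes:
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def lyric_event(delta: int, text: str) -> bytes:
    """Lyric meta event (0xFF 0x05), `delta` ticks after the previous event."""
    data = text.encode("utf-8")
    return vlq(delta) + b"\xFF\x05" + vlq(len(data)) + data


def build_track(notes: List[Tuple[int, int, str]]) -> bytes:
    """Build one track from (pitch, duration_in_ticks, syllable) tuples.

    Notes are played back to back; each syllable is placed on its note onset.
    """
    body = b""
    for pitch, duration, syllable in notes:
        body += lyric_event(0, syllable)                  # lyric at the onset
        body += vlq(0) + bytes([0x90, pitch, 80])         # note on, same tick
        body += vlq(duration) + bytes([0x80, pitch, 0])   # note off
    body += vlq(0) + b"\xFF\x2F\x00"                      # end of track
    return b"MTrk" + struct.pack(">I", len(body)) + body


def build_midi(notes: List[Tuple[int, int, str]], ticks_per_beat: int = 480) -> bytes:
    """Return a format-0 MIDI file (single track) as bytes."""
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + build_track(notes)
```

Writing `build_midi([(60, 480, "cra"), (62, 480, "zy")])` to a `.mid` file gives two quarter notes, each carrying one syllable as a lyric event. Whether a given application displays such events depends on its lyric support.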

Feel free to tell me if I have misunderstood your question.

@elricwan
Author

Thank you for the quick response. It looks like the generated MIDI object already contains the lyrics. However, when I open the generated MIDI file (with GarageBand), I can only listen to it and cannot find any lyrics. How can I transform the MIDI file into a sample in your style:

Screen Shot 2022-10-18 at 12 14 35 PM

BTW, I have uploaded a MIDI file generated by the inference code. Thank you!
0.mid.zip

@jzq2000
Collaborator

jzq2000 commented Oct 19, 2022

You only need to open the MIDI file with an app that supports lyrics (e.g., MuseScore).

image

@jzq2000
Collaborator

jzq2000 commented Oct 19, 2022

BTW, it seems that the lyric-melody alignment goes wrong in your MIDI file. You may want to check the input format against the test set.

@elricwan
Author

elricwan commented Oct 19, 2022

I see, I think I found the problem.
First, I did not separate the words in the lyric. The word "little" is split into "lit -tle", as in "crazy lit -tle thing called love [sep]". How do you decide which words to split?
More importantly, the syllables generated from the lyric are different from the syllables in phone_test. The function I use to generate the syllables is:

from phonemizer import phonemize
from phonemizer.separator import Separator

phn = phonemize(
    text,
    language='en-us',
    backend='festival',
    separator=Separator(phone='_', word=' ', syllable=' @@'),
    strip=True,
    preserve_punctuation=True,
    njobs=4
)

The text input I used is:

crazy lit -tle thing called love [sep] this thing called love [sep] it cries in a cra -dle night [sep] it swings it jives [sep] it shakes o -ver like a jel -ly fish [sep] i kind -a like it [sep] cra -zy lit -tle thing called love [sep] there goes my ba -by [sep] she knows how to rock roll [sep] she drives me cra -zy [sep] she gives me hot and cold fe -ver [sep] then she me [sep] get hip [sep] get on my [sep] take a back seat [sep] and take a long ride on my mo -tor bike [sep]

And the syllables I get are:

k_r_ey @@z_iy l_ih_t t_l th_ih_ng k_ao_l_d l_ah_v l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ih_t dh_ax_s th_ih_ng k_ao_l_d l_ah_v l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ih_t ih_t k_r_ay_z ih_n ax k_r_aa d_l n_ay_t l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t ih_t s_w_ih_ng_z ih_t jh_ay_v_z l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t ih_t sh_ey_k_s ow v_er l_ay_k ax jh_eh_l l_ay f_ih_sh l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t ay k_ay_n_d ey l_ay_k ih_t l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t k_r_aa z_ay l_ih_t t_l th_ih_ng k_ao_l_d l_ah_v l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t dh_eh_r g_ow_z m_ay b_aa b_ay l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t sh_iy n_ow_z hh_aw t_ax r_aa_k r_ow_l l_ax_f_t @@b_r_ax_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t sh_iy d_r_ay_v_z m_iy k_r_aa z_ay l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t sh_iy g_ih_v_z m_iy hh_aa_t ae_n_d k_ow_l_d f_ey v_er l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t dh_eh_n sh_iy m_iy l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t g_eh_t hh_ih_p l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t g_eh_t aa_n m_ay l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t t_ey_k ax b_ae_k s_iy_t l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t ae_n_d t_ey_k ax l_ao_ng r_ay_d aa_n m_ay m_ow t_ao_r b_ay_k l_eh_f_t @@b_r_ae_k @@ax_t s_ax_p @@t_eh_m_b @@er r_ay_t @@b_r_ae_k @@ax_t

Could you tell me how to fix it? Thank you!

@jzq2000
Collaborator

jzq2000 commented Oct 20, 2022

  1. The text input should be a word sequence without '-' separation.
  2. '[sep]' should not be converted to syllables; it should stay as '[sep]'.
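
The two points above can be sketched as a small preprocessing step: rejoin pieces such as "lit -tle" into whole words, phonemize each sentence on its own, and put "[sep]" back afterwards so it never reaches the phonemizer. This is only a sketch under assumptions; the helper names (`merge_split_words`, `split_sentences`, `lyric_to_phones`, `festival_phones`) are hypothetical and not part of TeleMelody, and the `phonemize` settings are copied from the call posted earlier in this thread.

```python
import re
from typing import Callable, List

SEP = "[sep]"


def merge_split_words(line: str) -> str:
    """Rejoin syllable pieces such as 'lit -tle' into 'little'."""
    # Removes whitespace plus a '-' that directly precedes more letters.
    return re.sub(r"\s+-(?=\S)", "", line)


def split_sentences(lyric: str) -> List[str]:
    """Split a lyric on the '[sep]' marker, dropping empty segments."""
    return [seg.strip() for seg in lyric.split(SEP) if seg.strip()]


def lyric_to_phones(lyric: str, to_phones: Callable[[str], str]) -> str:
    """Phonemize each sentence separately, then re-append '[sep]' to each."""
    sentences = split_sentences(merge_split_words(lyric))
    return " ".join(f"{to_phones(s)} {SEP}" for s in sentences)


def festival_phones(text: str) -> str:
    """Phonemize one sentence with the settings used earlier in this thread."""
    # Imported lazily so the helpers above work without phonemizer installed.
    from phonemizer import phonemize
    from phonemizer.separator import Separator

    return phonemize(
        text,
        language="en-us",
        backend="festival",
        separator=Separator(phone="_", word=" ", syllable=" @@"),
        strip=True,
        preserve_punctuation=True,
    )
```

With phonemizer and festival installed, `lyric_to_phones(text, festival_phones)` would produce one phone sequence per sentence, each followed by a literal "[sep]".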

@elricwan
Author

elricwan commented Oct 20, 2022

Thank you for the clarification! I appreciate your time. But there is still a slight difference between my phones and the phones in the test file. Take the first sentence, for instance:

this thing called love and i just [sep] handle it [sep] this thing called love and i must get [sep] round to it [sep] i ready [sep]
dh_ax_s th_ih_ng k_ao_l_d l_ah_v ae_n_d ay jh_ah_s_t [sep] hh_ae_n_d @@ax_l ih_t [sep] dh_ax_s th_ih_ng k_ao_l_d l_ah_v ae_n_d ay m_ah_s_t g_eh_t [sep] r_aw_n_d t_ax ih_t [sep] ay r_eh_d @@iy [sep]

From the test file, we notice that

handle it

corresponds to the phones

hh_ae_n_d @@ax_l ih_t

However, when I pass the same input to the function:

from phonemizer import phonemize
from phonemizer.separator import Separator

text = 'handle it'
phn = phonemize(
    text,
    language='en-us',
    backend='festival',
    separator=Separator(phone='_', word=' ', syllable=' @@'),
    strip=True,
    preserve_punctuation=True,
    njobs=4
)

print(phn)

I got:

hh_ae_n @@d_ax_l ih_t

The position of d is different. How can I fix it? Thank you.
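
To see exactly how two phonemizations differ, a small comparison helper can check whether they contain the same phones and the same number of syllables per word, so that only the syllable boundary has moved. This is a hedged sketch, not part of TeleMelody; the function names are hypothetical, and it assumes the separators used in this thread ('_' between phones, ' @@' before a non-initial syllable, ' ' between words).

```python
from typing import List


def words(phn: str) -> List[List[str]]:
    """Group phonemizer output into words, each a list of syllables."""
    result: List[List[str]] = []
    for token in phn.split():
        if token.startswith("@@") and result:
            result[-1].append(token[2:])   # continuation syllable of the word
        else:
            result.append([token])         # first syllable of a new word
    return result


def phones(phn: str) -> List[str]:
    """Flatten the output to a plain phone sequence, ignoring boundaries."""
    return [p for word in words(phn) for syl in word for p in syl.split("_")]


def syllable_counts(phn: str) -> List[int]:
    """Number of syllables in each word."""
    return [len(word) for word in words(phn)]


expected = "hh_ae_n_d @@ax_l ih_t"   # from the test file
actual = "hh_ae_n @@d_ax_l ih_t"     # from the phonemize call above

same_phones = phones(expected) == phones(actual)
same_counts = syllable_counts(expected) == syllable_counts(actual)
```

For the "handle it" example, both checks hold: the phone sequence and the per-word syllable counts match, and only the syllable that "d" is attached to differs.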

@jzq2000
Collaborator

jzq2000 commented Oct 27, 2022

I'm guessing it's related to the version, but I cannot check because I no longer have my original environment.

BTW, I think both your code and your result are correct and do not need a fix.

@elricwan
Author

I see, thank you for solving my problem!
