I just ran the provided samples and played with the settings, but had no luck getting even close to the advertised quality. The sound is good, but the music is very strange.
I tried changing the params, but that made it worse.
model.set_generation_params(
    use_sampling=True,
    top_k=0,
    top_p=0.9,
    temperature=3.0,
    max_cfg_coef=10.0,
    min_cfg_coef=1.0,
    decoding_steps=[int(20 * model.lm.cfg.dataset.segment_duration // 10), 10, 10, 10],
    span_arrangement='stride1'
)
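As a side note on the decoding_steps expression above: its first entry scales with the model's segment duration, while the other three stay at 10. The sketch below only reproduces that arithmetic so the resulting values are easy to check. The helper name decoding_steps_for is hypothetical (it is not part of the library), and it assumes segment_duration is a number of seconds.

```python
def decoding_steps_for(segment_duration):
    """Reproduce the decoding_steps expression from the call above.

    `decoding_steps_for` is a hypothetical helper for illustration only;
    `segment_duration` is assumed to be a duration in seconds.
    """
    # First entry: 20 steps per 10 seconds of audio, floored to an integer.
    first_stage_steps = int(20 * segment_duration // 10)
    # The remaining three entries are fixed at 10 in the original call.
    return [first_stage_steps, 10, 10, 10]


if __name__ == "__main__":
    for duration in (10, 30):
        print(duration, decoding_steps_for(duration))
```

So a 10-second model would get 20 steps in the first entry and a 30-second model would get 60, under the assumption stated above.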
The medium model is a little better, but still not close to anything I have heard from other models.
I had the same impression of MAGNeT, and found that one of the authors has explained why MAGNeT's output is worse than the audio samples on the demo page:
[About Magnet‘s performance] : #395
As the author mentioned, the "rescoring technique using MusicGen" (which is not provided in this repository) should improve performance at inference time. What we get now is the non-rescoring version from Table 3 of the paper.
In addition, based on the paper, I think the following:
MAGNeT is faster, but its performance is still worse than MusicGen's to begin with (Table 1).
The current metrics for music generation models (FAD, CLAP score, etc.) are not perfect. It seems they can be gamed easily, and the metric values don't directly reflect generation quality.
I love how quickly it creates the music.
What am I doing wrong?