Open
Description
Fantastic work! I have been evaluating the model on sound files of different lengths. For clips shorter than the 2-second clips used for training (500 ms in this example), I get the following warning:
WARNING:root:Large gap between audio n_frames(48) and target_length (204). Is the audio_target_length setting correct?
My question is: how do sound clips of varying length affect the embedding output? In other words, can I still use embeddings from shorter clips, or should I repeat a shorter sound until it approximates the 2 seconds expected by the model?
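To make the question concrete, here is a minimal sketch of the duplication I have in mind, together with a rough frame count. It is not the library's preprocessing code: `frame_count` and `tile_to_duration` are hypothetical helpers, and the 16 kHz sample rate, 25 ms window, and 10 ms shift are assumptions on my part, picked because they give 48 frames for a 500 ms clip, the same number the warning reports.

```python
import numpy as np


def frame_count(num_samples, sample_rate=16000, window_ms=25.0, shift_ms=10.0):
    """Estimate the number of spectrogram frames for a clip.

    Assumed framing (not confirmed by the library): 25 ms window,
    10 ms shift, no padding at the clip edges.
    """
    window = int(sample_rate * window_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    if num_samples < window:
        return 0
    return 1 + (num_samples - window) // shift


def tile_to_duration(waveform, sample_rate, target_seconds=2.0):
    """Repeat a short clip along its last axis until it spans target_seconds.

    Clips that are already long enough are truncated to the target length.
    """
    waveform = np.asarray(waveform)
    num_samples = waveform.shape[-1]
    if num_samples == 0:
        raise ValueError("cannot tile an empty waveform")
    target = int(round(sample_rate * target_seconds))
    if num_samples >= target:
        return waveform[..., :target]
    reps = -(-target // num_samples)  # ceiling division
    # A scalar reps tiles the last (time) axis for 1-D and 2-D inputs.
    return np.tile(waveform, reps)[..., :target]


sample_rate = 16000  # assumed sample rate
clip = np.random.default_rng(0).standard_normal(sample_rate // 2)  # 500 ms of noise
tiled = tile_to_duration(clip, sample_rate)

print(frame_count(clip.shape[-1]))   # 48 under the assumed framing
print(frame_count(tiled.shape[-1]))  # 198 under the assumed framing
```

Tiling brings the clip much closer to the 204-frame target than the original 48 frames, but I do not know whether the resulting embedding is more meaningful than the one from the short clip, which is what I am asking about.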