I've been digging into this utility to prepare datasets for RAVE.
I've found a few things that I'd like to share with you so that we can improve it:
Just using resample without any data augmentation raises the level by approximately 3.8 dB. I measured this with a phase cancellation (null) test: aligning the original and the resampled signals, inverting the polarity of one of them, and adjusting the level until they cancel each other. This might not be a big deal for many audio files, but if we are using mastered files, or any file peaking above -3 dB, we will end up with a clipped signal.
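For anyone who wants to reproduce the measurement without a DAW, the null test can be approximated numerically. This is only a minimal sketch, assuming NumPy and two mono float signals at the same sample rate; `estimate_gain_db` is a hypothetical helper name, not part of RAVE.

```python
import numpy as np

def estimate_gain_db(original, processed):
    """Estimate the gain (dB) of `processed` relative to `original`.

    Returns (gain_db, residual_db), where residual_db is the level of what
    is left after the best-fit cancellation, relative to `processed`.
    """
    original = np.asarray(original, dtype=np.float64)
    processed = np.asarray(processed, dtype=np.float64)

    # Align the two signals: pick the lag with the strongest cross-correlation.
    # Direct correlation is slow on long files; use short excerpts.
    corr = np.correlate(processed, original, mode="full")
    lag = int(np.argmax(np.abs(corr))) - (len(original) - 1)
    if lag > 0:
        processed = processed[lag:]
    elif lag < 0:
        original = original[-lag:]
    n = min(len(original), len(processed))
    original, processed = original[:n], processed[:n]

    # Least-squares gain g that minimizes ||processed - g * original||.
    gain = np.dot(processed, original) / np.dot(original, original)
    residual = processed - gain * original

    eps = 1e-12  # avoids log10(0) when the cancellation is perfect
    gain_db = 20.0 * np.log10(abs(gain) + eps)
    residual_db = 20.0 * np.log10(
        np.sqrt(np.mean(residual ** 2)) / (np.sqrt(np.mean(processed ** 2)) + eps) + eps
    )
    return gain_db, residual_db
```

A strongly negative `residual_db` means the two signals cancel well and the reported gain can be trusted. Adding the estimated gain to a file's peak level shows whether it will clip: a file peaking at -3 dBFS with a 3.8 dB boost would reach +0.8 dBFS.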
The audio compression of the augmented signal is a nice addition. However, it also pumps up the level of passages with soft background noise a lot. When playing live with RAVE I've found that there are plenty of passages like these, which is very annoying. The augmented version also fades in and out at the start and end of the audio file, which is unexpected, though not problematic.
The following image shows the same snippet of audio. The one on top is the original, the one in the middle is the resampled version, and the bottom one is the augmented one.
The silent zone in the one on top peaks at -41 dB, while the augmented version peaks at -7 dB.
Don't you think this type of compression is a bit aggressive? Perhaps we could also benefit from implementing expansion for signals like these?
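To make the expansion idea a bit more concrete, here is a rough sketch of a static downward expander, which lowers whatever falls below a threshold instead of raising it. It assumes NumPy and a mono float signal; the function name, the threshold, the ratio, and the window length are all hypothetical values chosen for illustration, not anything RAVE currently does.

```python
import numpy as np

def downward_expand(signal, sample_rate, threshold_db=-30.0, ratio=2.0, window_ms=10.0):
    """Attenuate passages whose short-term RMS level is below `threshold_db`.

    With ratio=2, a passage 10 dB under the threshold ends up 20 dB under it.
    Passages above the threshold are left unchanged.
    """
    signal = np.asarray(signal, dtype=np.float64)

    # Short-term RMS envelope from a moving average of the squared signal.
    win = max(1, int(sample_rate * window_ms / 1000.0))
    kernel = np.ones(win) / win
    envelope = np.sqrt(np.convolve(signal ** 2, kernel, mode="same"))
    level_db = 20.0 * np.log10(np.maximum(envelope, 1e-10))

    # Distance below the threshold (0 dB for anything above it).
    below_db = np.minimum(level_db - threshold_db, 0.0)

    # Extra attenuation so each dB below the threshold becomes `ratio` dB.
    gain_db = below_db * (ratio - 1.0)
    return signal * 10.0 ** (gain_db / 20.0)
```

This has no attack or release smoothing, so it is only meant to show the gain curve. A usable version would smooth `gain_db` over time to avoid audible artifacts at the threshold crossings.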
Thank you so much,
Gabriel
This behavior may be desirable in some cases but not in others. It could be controlled by a flag, especially considering that it happens just by using the resample command.