
Passing data or a Processor into RNNDownBeatProcessor instead of a filename #357

Closed
robclouth opened this issue Apr 4, 2018 · 14 comments
@robclouth

Hey!
I need to trim the audio before passing it into the RNNDownBeatProcessor, but I can't figure out how to pass data instead of a filename. Any ideas?

Thanks

@superbock
Collaborator

You have a couple of options to accomplish that:

  • trim the audio beforehand and pass it, together with the correct sample_rate, to the processor
  • simply add start and stop positions in seconds as processing arguments (when calling the processor)

HTH

@robclouth
Author

audio, sr = load_audio_file(audio_file)
audio = trim(audio)
act = RNNBeatProcessor()(audio, sample_rate=sr)

Like this?

@superbock
Collaborator

Yes, but the above only works if the original sample rate is 44.1kHz. Sorry for the misleading first answer. To work with any sample rate, do the following:

audio, _ = load_audio_file(audio_file, 44100)  # resamples the signal to 44.1kHz
audio = trim(audio)
act = RNNBeatProcessor()(audio)

If the sample rate is already 44.1kHz, you can omit the first line.

There should be no need to trim the signal, though. I assumed that you wanted to 'trim' the signal as in skipping the first N seconds or extracting snippets of a certain length; in that case act = RNNBeatProcessor()(audio_file, start=1, stop=10) would have been the easiest.

Some background information: RNNBeatProcessor requires audio sampled at 44.1 kHz in order to apply the same audio pre-processing (i.e. filtering) as when the neural network was trained. Thus, if the sample rate differs from the default 44.1 kHz, the signal needs to be resampled to exactly this rate.

@robclouth
Author

Great, thanks for that! I'm noticing that the returned beats are offset by around 40ms, arriving earlier than they should. The trimming was to see if the silence at the start was throwing it off. Do you know where that offset might be coming from? If you think it'll be consistently 40ms, I can manually adjust for it.

@superbock
Collaborator

Could you please check whether this is a constant offset (i.e. throughout the piece) or whether it affects only a couple of beats (e.g. the first ones)? If it is the former, I'd be very interested in checking it. If it is the latter, I fear there's not much you can do about it, since this is just what the network predicts.

I doubt that it has anything to do with leading or trailing silence in the files, but I cannot rule it out.

A question though: what are you doing with these raw activations? Usually, the final decision about the beats is done in a second step, e.g. by DBNBeatTrackingProcessor.

@bzvew

bzvew commented May 9, 2018

I cannot run RNNDownBeatProcessor on audio read by madmom. The error is as follows:

melody, sr = madmom.audio.signal.load_audio_file(file_name, sample_rate=44100, dtype='float')

print(melody.shape)
(9192960, 2)

RNNDownBeatProcessor()(melody)

~/Downloads/madmom/madmom/audio/stft.py in process(self, data, **kwargs)
475 circular_shift=self.circular_shift,
476 include_nyquist=self.include_nyquist,
--> 477 fft_window=self.fft_window, **kwargs)
478 # cache the window used for FFT
479 # Note: depending on the signal this may be scaled already

~/Downloads/madmom/madmom/audio/stft.py in new(cls, frames, window, fft_size, circular_shift, include_nyquist, fft_window, **kwargs)
334 data = stft(frames, fft_window, fft_size=fft_size,
335 circular_shift=circular_shift,
--> 336 include_nyquist=include_nyquist)
337
338 # cast as ShortTimeFourierTransform

~/Downloads/madmom/madmom/audio/stft.py in stft(frames, window, fft_size, circular_shift, include_nyquist)
73 # TODO: add multi-channel support
74 raise ValueError('frames must be a 2D array or iterable, got %s with '
---> 75 'shape %s.' % (type(frames), frames.shape))
76
77 # shape of the frames

ValueError: frames must be a 2D array or iterable, got <class 'madmom.audio.signal.FramedSignal'> with shape (20846, 1024, 2).

@superbock
Collaborator

RNNDownBeatProcessor expects a mono Signal, but there's no need to load the audio manually beforehand; simply pass file_name.

@bzvew

bzvew commented May 9, 2018

I want to combine several tracks together and use that mix as input to RNNDownBeatProcessor. So I have to write the mix to a file first, then read that file, right? Also, I want to confirm whether I should use the madmom.audio.signal.rescale function to make sure the audio amplitude is not too large.

@superbock
Collaborator

No, there's no need to save to a file first; you only have to downmix it to mono. I included the information about the filename since your example showed that you were reading a file.

You did not state how you combined the tracks, so I can only guess. If you used an external tool, reading from file is the easiest. If you did it in Python and the signal is available as an ndarray, you can use the remix function to downmix it and then instantiate a Signal. I hoped that for the latter case it would also work by passing num_channels to Signal, but this currently does not work, see #367. I'll submit a fix soon; then you can do:

s = Signal(audio_array, sample_rate, num_channels=1)

@bzvew

bzvew commented May 9, 2018

Sorry for missing that info. Below is what I do now:

melody, sr = madmom.audio.signal.load_audio_file(os.path.join(dir_name, melody_name), sample_rate=44100, dtype='float')

bass, _ = madmom.audio.signal.load_audio_file(os.path.join(dir_name, bass_name), sample_rate=44100, dtype='float')

mix = madmom.audio.signal.rescale(bass + melody)
RNNDownBeatProcessor()(mix)

@superbock
Collaborator

mono = madmom.audio.signal.remix(mix, 1)
RNNDownBeatProcessor()(mono, sample_rate=sr)

should do the trick.

@superbock
Collaborator

I merged #368; with the current master you can try whether it works like in your last comment (not entirely sure if it does).

@bzvew

bzvew commented May 9, 2018

I added the load_audio_file parameter num_channels=1, and that solves the problem!

@superbock
Collaborator

Of course, this solves it as well.
