Use MUDA without audio file #38

Closed · mcartwright opened this issue Nov 4, 2016 · 7 comments

@mcartwright (Contributor)

In my current project, my jams files completely specify my audio, and at training time the audio can be synthesized from the jams file. I'd ideally augment these jams files with muda without having to synthesize them first, and then I'd simply save the muda-augmented jams files. At training time, I'd synthesize the audio and process the muda deformations.

Is it possible to use muda without passing the initial audio file? It seems that right now, if I don't use muda.load_jam_audio() to process my jams file (it just adds the empty history and library versions to the jam?), it errors when I call the transform method of my pipeline.

Is there a reason muda needs the audio file before actually processing the audio?
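
For context, a minimal sketch of the coupled workflow in question, using muda's standard entry points (load_jam_audio, Pipeline.transform, save); the filenames and the specific deformer are placeholders:

import muda

# A small pipeline with one stock deformer (a fixed pitch shift)
pipeline = muda.Pipeline(steps=[('pitch', muda.deformers.PitchShift(n_semitones=2))])

# The jams file and its audio are loaded together, so the audio must already
# exist before transform() can run; this is the coupling in question.
jam = muda.load_jam_audio('track.jams', 'track.ogg')  # placeholder filenames

for i, jam_out in enumerate(pipeline.transform(jam)):
    # Each output carries both the deformed annotations and the deformed audio
    muda.save('track.{:02d}.ogg'.format(i), 'track.{:02d}.jams'.format(i), jam_out)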

@bmcfee (Owner) commented Nov 4, 2016

I think I don't 100% understand your use case here. Is the problem that the audio does not live on disk?

If so, it wouldn't be hard to modify muda to support in-memory audio data that's dynamically generated. Everything after that should be pretty easy.

@mcartwright (Contributor, Author) commented Nov 8, 2016

No, it's not that the audio doesn't live on disk but rather that it doesn't exist yet. Regardless of this point... the main thing is that I think it could be useful to decouple the transformation of the jams files from the transformation of the audio... and then be able to load the transformed jams file at a later point in time and process the audio then.

This could be useful if you are "statically" augmenting a dataset (rather than dynamically at training time), where you want to save the augmented dataset to use for training with another learner but don't want to save all of the processed audio files because of disk constraints.

Once these two are decoupled, it also doesn't seem necessary to provide the original audio file or signal in order to simply transform the jams.

@bmcfee (Owner) commented Nov 8, 2016

Okay, thanks for clarifying. I think I understand your point now. (I think loading from in-memory audio is a good idea too though, and should be a separate issue.)

I get a little nervous about decoupling things in this way, but I do like the idea of delayed audio deformation. The current design makes this almost possible, since all transformation state is logged in the jams sandbox. Something like the following:

  • Add a delayed flag to BaseTransformer.transform, Pipeline.transform, and _transform. When this is true, the audio step is skipped in _transform.
  • Add a rebuild method that reconstitutes the audio from a jams with a muda history entry.

Then you should be able to do something like the following:

>>> # Build a muda pipeline
>>> for jam_out in pipeline.transform(jam_in, delayed=True):
...     jam_out.save(SOMEWHERE)
>>> # Some time later
>>> jam_reload = jams.load(SOMEWHERE)
>>> audio_out = muda.rebuild(jam_reload, audio_in)

muda.rebuild would need to do the following:

  • inspect jam_reload for muda history
  • reconstruct all transformers in the history
  • apply transformers to audio_in in order, with the corresponding state variables

Sound reasonable?
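
A rough sketch of what such a rebuild might look like, assuming the history entries in the jams sandbox record a serialized transformer together with the state it was applied with, and that each deformer exposes an audio hook that can be re-run against a freshly packed signal. The key names ('transformer', 'state'), the audio() call, and the _audio payload below are assumptions, not confirmed muda API:

import muda

def rebuild(jam_reload, y, sr):
    """Hypothetical sketch: re-apply the logged deformations to new audio."""
    # Attach the freshly synthesized signal, the way muda.load_jam_audio would
    muda.jam_pack(jam_reload, _audio=dict(y=y, sr=sr))
    mudabox = jam_reload.sandbox.muda

    for step in mudabox.history:
        # Rebuild the deformer from its logged record (assumed deserializable)
        deformer = muda.deserialize(step['transformer'])
        # Re-run only its audio deformation with the logged state; the
        # annotations were already deformed when the jams file was saved.
        deformer.audio(mudabox, step['state'])

    return mudabox._audio['y'], mudabox._audio['sr']

With something along those lines, the delayed workflow above becomes: save the jams with delayed=True, synthesize the audio later, and call rebuild on the reloaded jams plus the new signal.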

@bmcfee added this to the 0.1.2 milestone Nov 8, 2016
@mcartwright (Contributor, Author)

Yeah, that sounds perfect.

@bmcfee (Owner) commented Nov 8, 2016

While reviewing #40, I realized a slight snag in this plan. The transformer in question needs an audio buffer in place to calculate noise clip durations and generate the state object.

This might require some careful thought to get right.

@bmcfee modified the milestones: 0.1.2, 0.1.3 Mar 1, 2017
@bmcfee (Owner) commented Sep 1, 2017

Coming back to this after implementing a prototype of the rebuild op in #62 and reviewing the current suite of deformer ops. TL;DR: I think this isn't going to work in general, but you might be able to hack around it to get something going.

As mentioned above, many of the deformers require access to the audio signal to determine their deformation states: notably, pitch-shifting requires a tuning estimate, and background-noise requires a sample-accurate duration count. These things really do need to be pre-computed so that the deformations stay a deterministic, self-contained set of functions of the state variables. I don't see a good way around that (e.g., by delayed computation).

@mcartwright if you're still interested in this, how hacky would it be to pre-compute a dummy signal for generating the deformation jamses, and then use the rebuild functionality afterward on real data?
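
A rough sketch of that hack, assuming muda.jam_pack can attach an _audio payload the way load_jam_audio does; the helper name and paths are hypothetical. The caveat above still applies: any state that depends on audio content, e.g. the tuning estimate for pitch shifting, would be computed from silence.

import numpy as np
import jams
import muda

def transform_with_dummy_audio(jam_file, duration, pipeline, sr=22050):
    """Hypothetical helper: sample and log deformation states against a silent
    buffer of the target duration, keeping only the jams outputs."""
    jam = jams.load(jam_file)
    y_dummy = np.zeros(int(duration * sr))             # silence, correct length
    muda.jam_pack(jam, _audio=dict(y=y_dummy, sr=sr))  # assumed to mirror load_jam_audio
    for i, jam_out in enumerate(pipeline.transform(jam)):
        # Discard the (silent) deformed audio; keep only the deformation records
        jam_out.save('aug.{:02d}.jams'.format(i))       # placeholder output path

The real synthesized audio would then go through the rebuild/replay step against those saved jams files.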

@bmcfee (Owner) commented Aug 21, 2019

I think this is as fixed as it's going to get, having merged the replay() function.

@bmcfee closed this as completed Aug 21, 2019