Merged
32 changes: 15 additions & 17 deletions doc/BufNMFCross.rst
@@ -3,53 +3,51 @@
:sc-categories: FluidManipulation
:sc-related: Classes/FluidBufNMF, Classes/FluidNMFMatch, Classes/FluidNMFFilter
:see-also:
:description: Reconstructs the sound in the target buffer using components learned from the source buffer using an NMF decomposition
:description: Uses NMF decomposition to reconstruct a target sound using components learned from a source sound
:discussion:
The process works by attempting to reconstruct components of the ``target`` sound using the timbre of the ``source`` sound, learned through a Nonnegative Matrix Factorisation. The result is a hybrid whose character depends on how well the target can be represented by the source's spectral frames.

In contrast to :fluid-obj:`BufNMF`, the size and content of the bases dictionary are fixed in this application to be the spectrogram of the ``source``. Each spectral frame of ``source`` is a template: be aware that NMF is O(N^2) in the number of templates, so longer ``source`` buffers will take dramatically longer to process.

See Driedger, J., Prätzlich, T., & Müller, M. (2015). Let it Bee-Towards NMF-Inspired Audio Mosaicing. ISMIR, 350–356. http://ismir2015.uma.es/articles/13_Paper.pdf

The process works by attempting to reconstruct components of the ``target`` sound using the timbre (i.e., spectra) of the ``source`` sound, learned through a Non-negative Matrix Factorisation. The result is a hybrid whose character depends on how well the target can be represented by the source's spectral frames.

In contrast to :fluid-obj:`BufNMF`, each spectral frame of ``source`` is a spectral template. Be aware that NMF is O(N^2) in the number of templates, so longer ``source`` buffers will take dramatically longer to process.

See Driedger, J., Prätzlich, T., & Müller, M. (2015). Let it Bee-Towards NMF-Inspired Audio Mosaicing. ISMIR, 350–356. http://ismir2015.uma.es/articles/13_Paper.pdf
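
The fixed-dictionary idea described above can be sketched in a few lines. This is a minimal illustration only, not the FluCoMa implementation: it uses plain Python lists, a Euclidean multiplicative update (swapped in for simplicity, and not necessarily the cost function the object or the cited paper uses), and omits the ``timeSparsity``, ``polyphony`` and ``continuity`` constraints. The names ``fit_activations`` and ``squared_error`` are hypothetical. ``W`` stands for the ``source`` spectrogram (one column per template, never updated) and ``V`` for the ``target`` spectrogram.

```python
def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-rows matrices."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def fit_activations(W, V, iterations=50, eps=1e-9):
    """Learn activations H so that W @ H approximates V, keeping W fixed.

    W: bins x templates (the source frames, used as a fixed dictionary)
    V: bins x frames    (the target spectrogram)
    Returns H: templates x frames.
    """
    Wt = transpose(W)
    WtV = matmul(Wt, V)  # templates x frames
    WtW = matmul(Wt, W)  # templates x templates: one entry per pair of templates
    H = [[1.0] * len(V[0]) for _ in range(len(W[0]))]
    for _ in range(iterations):
        WtWH = matmul(WtW, H)
        # Multiplicative update: non-negative values stay non-negative.
        H = [[h * num / (den + eps) for h, num, den in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(H, WtV, WtWH)]
    return H


def squared_error(W, H, V):
    """Sum of squared differences between W @ H and V."""
    approx = matmul(W, H)
    return sum((a - v) ** 2 for a_row, v_row in zip(approx, V)
               for a, v in zip(a_row, v_row))


# Toy data: 3 bins, 2 source templates, 3 target frames.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[2.0, 0.0, 1.0], [0.0, 3.0, 1.0], [2.0, 3.0, 2.0]]
H_start = [[1.0] * 3 for _ in range(2)]
H = fit_activations(W, V)
print(squared_error(W, H_start, V), squared_error(W, H, V))
```

In this sketch the ``WtW`` term alone holds one value per pair of templates, which gives a feel for why the cost grows quickly as the ``source`` gets longer.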

:control source:

A buffer whose content will supply the spectral bases used in the hybrid
A buffer whose content will supply the spectral bases used in the hybrid. The result will use the spectral frames from this buffer.

:control target:

A buffer whose content will supply the temporal activations used in the hybrid
A buffer whose content will supply the temporal activations used in the hybrid. The process aims to "sound like" this buffer using spectra from ``source``.

:control output:

A buffer to contain the new sound
A buffer to write the new sound to.

:control timeSparsity:

Control the repetition of source templates in the reconstruction by specifying a number of frames within which a template should not be re-used. Units are spectral frames.
Control the repetition of source templates in the reconstruction by specifying a number of frames within which a template should not be re-used. Units are spectral frames. The default is 7.

:control polyphony:

Control the spectral density of the output sound by restricting the number of simultaneous templates that can be used. Units are spectral bins.
Control the spectral density of the output sound by restricting the number of simultaneous templates that can be used. Units are spectral bins. The default is 10.

:control continuity:

Promote the use of N successive source frames, giving greater continuity in the result. This can not be bigger than the sizes of the input buffers, but useful values tend to be much lower (in the tens).
Promote the use of N successive source frames, giving greater continuity in the result. This cannot be larger than the size of the ``source`` buffer, but useful values tend to be much lower (in the tens). The default is 7.

:control iterations:

How many iterations of NMF to run
How many iterations of NMF to run. The default is 50.

:control windowSize:

The analysis window size in samples
The analysis window size in samples. The default is 1024.

:control hopSize:

The analysis hop size in samples (default winSize / 2)
The analysis hop size in samples. The default of -1 indicates half the ``windowSize``.

:control fftSize:

The analysis FFT size in samples (default = winSize)

The analysis FFT size in samples. The default of -1 indicates ``fftSize`` = ``windowSize``.
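
As a rough illustration of how the ``-1`` defaults and the ``source`` length interact, the sketch below resolves the analysis settings as documented above and estimates how many templates a ``source`` buffer yields. The helper names are hypothetical, and the frame-count formula is an assumption that ignores any padding the real object applies. The quadratic scaling is taken from the O(N^2) statement in the discussion.

```python
import math


def resolve_fft_settings(window_size=1024, hop_size=-1, fft_size=-1):
    """Apply the documented defaults: -1 means windowSize / 2 (hop) or windowSize (FFT)."""
    hop = window_size // 2 if hop_size == -1 else hop_size
    fft = window_size if fft_size == -1 else fft_size
    return window_size, hop, fft


def approx_template_count(num_samples, hop_size):
    """Assumed estimate: about one spectral frame (template) per hop of source audio."""
    return math.ceil(num_samples / hop_size)


def relative_cost(templates_a, templates_b):
    """Cost ratio between two template counts under the stated O(N^2) behaviour."""
    return (templates_b / templates_a) ** 2


window, hop, fft = resolve_fft_settings()
one_second = approx_template_count(44100, hop)
two_seconds = approx_template_count(88200, hop)
print(window, hop, fft, one_second, two_seconds, relative_cost(100, 200))
```

Under these assumptions, doubling the number of templates roughly quadruples the work, so trimming the ``source`` or increasing ``hopSize`` are the obvious ways to shorten processing time.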
37 changes: 20 additions & 17 deletions example-code/sc/BufNMFCross.scd
@@ -1,27 +1,30 @@

code::

~path = FluidFilesPath()
b = Buffer.read(s,~path+/+"Nicol-LoopE-M.wav")
t = Buffer.read(s,~path+/+"Tremblay-SA-UprightPianoPedalWide.wav")
o = Buffer.new
FluidBufNMFCross.process(s,t,b,o,windowSize: 2048, action:{"Ding".postln})
//wait for it to be done. It can take a while, depending on the length of your source.
o.play
(
~target = Buffer.readChannel(s,FluidFilesPath("Nicol-LoopE-M.wav"),channels:[0]);
~source = Buffer.readChannel(s,FluidFilesPath("Tremblay-SA-UprightPianoPedalWide.wav"),channels:[0]);
~output = Buffer(s);
)

//The result of the cross synthesis is a hybrid of the source and target sounds. The algorithm tries to match the target spectrum over time using components learned from the source. These parameters affect the reconstruction:
~sparsity = 4; //Avoid reusing a component from the source for this number of time frames
~polyphony = 3; //Avoid overlapping more than this number of source components at the same time
~continuity = 20; //Encourage the reconstruction to use this many temporally consecutive frames from the source
FluidBufNMFCross.processBlocking(s,~source,~target,~output,action:{"done".postln});
//wait for it to be done. It can take a while, depending on the length of your source.
~output.play;

//Using the UGen to run the process can be useful to monitor its progress
(
Routine{
~cross = FluidBufNMFCross.process(s,t,b,o,timeSparsity: ~sparsity, polyphony: ~polyphony, continuity: ~continuity, windowSize: 2048);
defer{{FreeSelfWhenDone.kr(~cross.kr).poll}.play;};
~cross.wait;
\Done.postln;
~cross = FluidBufNMFCross.process(s,~source,~target,~output,timeSparsity: 4, polyphony: 3, continuity: 20, windowSize: 2048);

{
FreeSelfWhenDone.kr(~cross.kr).poll;
}.play;

~cross.wait;

"done".postln;

}.play;
)
o.play

~output.play;
::