41 changes: 41 additions & 0 deletions doc/AmpFeature.rst
@@ -0,0 +1,41 @@
:digest: Realtime Amplitude Differential Feature
:species: transformer
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: AmpGate, AmpSlice, OnsetFeature, NoveltyFeature
:description: Calculate the amplitude differential feature in realtime.
:discussion:
:fluid-obj:`AmpSlice` uses the differential between a fast and a slow envelope follower to determine changes in amplitude. This object calculates that amplitude differential and outputs it as a control-rate signal.
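
As a rough illustration of the fast/slow differential described above, here is a minimal Python sketch. It is not the FluCoMa implementation: the one-pole smoothing, the mapping from ramp length to smoothing coefficient, and the names ``follow`` and ``amp_differential`` are assumptions made for this example, and the high-pass filter and floor are left out.

```python
def follow(samples, ramp_up, ramp_down):
    """Envelope follower sketch; ramp lengths are given in samples.

    Assumption: a ramp of N samples is modelled as a one-pole smoother
    with coefficient 1/N. The real object may shape its ramps differently.
    """
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Pick the rising or falling ramp depending on the input direction.
        ramp = ramp_up if level > env else ramp_down
        coeff = 1.0 / max(1, ramp)
        env += coeff * (level - env)
        out.append(env)
    return out


def amp_differential(samples, fast_up, fast_down, slow_up, slow_down):
    """Difference between a fast and a slow envelope follower."""
    fast = follow(samples, fast_up, fast_down)
    slow = follow(samples, slow_up, slow_down)
    return [f - s for f, s in zip(fast, slow)]


# Silence followed by a sudden step in amplitude.
signal = [0.0] * 50 + [1.0] * 50
feature = amp_differential(signal, fast_up=2, fast_down=20,
                           slow_up=40, slow_down=40)
peak_index = feature.index(max(feature))
```

In this sketch the feature stays at zero during silence and peaks a few samples after the step, because the fast follower reaches the new level well before the slow one does.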

:process: The audio rate in, control rate out version of the object.
:output: A KR signal of the feature.

:control in:

The audio to be processed.

:control fastRampUp:

The number of samples the relative envelope follower will take to reach the next value when rising. Typically, this will be faster than slowRampUp.

:control fastRampDown:

The number of samples the relative envelope follower will take to reach the next value when falling. Typically, this will be faster than slowRampDown.

:control slowRampUp:

The number of samples the absolute envelope follower will take to reach the next value when rising.

:control slowRampDown:

The number of samples the absolute envelope follower will take to reach the next value when falling.

:control floor:

The level, in dB, that the slow envelope follower must be above for a detected difference to be considered valid. This allows changes within the noise floor to be ignored.

:control highPassFreq:

The frequency of the fourth-order Linkwitz–Riley high-pass filter (https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter). This is done first on the signal to minimise low frequency intermodulation with very fast ramp lengths. A frequency of 0 bypasses the filter.


67 changes: 67 additions & 0 deletions doc/BufAmpFeature.rst
@@ -0,0 +1,67 @@
:digest: Buffer-Based Amplitude Differential Feature
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: BufAmpSlice, BufNoveltyFeature, BufOnsetFeature
:description: Calculate the amplitude differential feature used by :fluid-obj:`BufAmpSlice`.
:discussion:
:fluid-obj:`BufAmpSlice` uses the differential between a fast and a slow envelope follower to determine changes in amplitude. This object calculates the amplitude differential and copies it to an output buffer.

:output: Nothing, as the destination buffer is declared in the function call.


:control source:

The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be summed.

:control startFrame:

Where in the source buffer the analysis should start, in samples.

:control numFrames:

How many frames should be processed.

:control startChan:

For multichannel sources, which channel should be processed.

:control numChans:

For multichannel sources, how many channels should be summed.

:control features:

The index of the buffer where the amplitude differential feature will be copied to.

:control fastRampUp:

The number of samples the relative envelope follower will take to reach the next value when rising. Typically, this will be faster than slowRampUp.

:control fastRampDown:

The number of samples the relative envelope follower will take to reach the next value when falling. Typically, this will be faster than slowRampDown.

:control slowRampUp:

The number of samples the absolute envelope follower will take to reach the next value when rising.

:control slowRampDown:

The number of samples the absolute envelope follower will take to reach the next value when falling.

:control floor:

The level, in dB, that the slow envelope follower must be above for a detected difference to be considered valid. This allows changes within the noise floor to be ignored.

:control highPassFreq:

The frequency of the fourth-order Linkwitz–Riley high-pass filter (https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter). This is done first on the signal to minimise low frequency intermodulation with very fast ramp lengths. A frequency of 0 bypasses the filter.

:control padding:

Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centering the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.

:control action:

A Function to be evaluated once the offline process has finished and the features instance variables have been updated on the client side. The function will be passed features as an argument.
101 changes: 101 additions & 0 deletions doc/BufNoveltyFeature.rst
@@ -0,0 +1,101 @@
:digest: Buffer-Based Novelty Feature
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: BufNoveltySlice, BufAmpFeature, BufOnsetFeature
:description: Calculates the novelty feature of audio stored in a buffer.
:discussion:
Calculate novelty of audio stored in a buffer, the feature used by :fluid-obj:`BufNoveltySlice` to perform segmentation.

Novelty is derived by running a kernel across the diagonal of the similarity matrix. It implements the seminal results published in 'Automatic Audio Segmentation Using a Measure of Audio Novelty' by J. Foote.

The process will return a buffer containing a time series that describes the novelty feature changing over time in the source buffer.
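
To make the kernel-on-the-diagonal idea concrete, here is a small Python sketch of a Foote-style novelty curve. It is only an illustration under stated assumptions: it uses cosine similarity, a plain (untapered) checkerboard kernel, and a kernel size counted in analysis frames. The function names are hypothetical and the object's actual kernel and similarity measure may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is silent)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def novelty_curve(frames, kernel_size):
    """Slide a checkerboard kernel along the diagonal of the similarity matrix.

    frames: list of feature vectors, one per analysis frame.
    kernel_size: even kernel width, in frames (an assumption of this sketch).
    """
    n = len(frames)
    half = kernel_size // 2
    sim = [[cosine_similarity(frames[i], frames[j]) for j in range(n)]
           for i in range(n)]
    curve = [0.0] * n
    for t in range(half, n - half):
        total = 0.0
        for i in range(-half, half):
            for j in range(-half, half):
                # +1 inside the past/past and future/future quadrants,
                # -1 in the two cross quadrants.
                sign = 1.0 if (i < 0) == (j < 0) else -1.0
                total += sign * sim[t + i][t + j]
        curve[t] = total
    return curve


# Eight frames of one "timbre" followed by eight frames of another.
frames = [[1.0, 0.0]] * 8 + [[0.0, 1.0]] * 8
curve = novelty_curve(frames, kernel_size=4)
boundary = curve.index(max(curve))
```

With this toy input the curve is zero inside each homogeneous region and peaks at the frame where the material changes.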

:process: This is the method that calls for the novelty feature to be calculated on a given source buffer.
:output: Nothing, as the various destination buffers are declared in the function call.

:control source:

The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be summed.

:control startFrame:

Where in the source buffer the analysis should start, in samples.

:control numFrames:

How many frames should be processed.

:control startChan:

For multichannel sources, which channel should be processed.

:control numChans:

For multichannel sources, how many channels should be summed.

:control features:

The index of the buffer where the novelty feature will be written.

:control algorithm:

The feature on which novelty is computed.

:enum:

:0:
Spectrum – The magnitude of the full spectrum.

:1:
MFCC – 13 Mel-Frequency Cepstrum Coefficients.

:2:
Chroma – The contour of a 12-band chromagram.

:3:
Pitch – The pitch and its confidence.

:4:
Loudness – The true peak and loudness.

:control kernelSize:

The granularity of the window in which the algorithm looks for change, in samples. A small number will be sensitive to short term changes, and a large number should look for long term changes.

:control filterSize:

The size of a smoothing filter that is applied to the novelty curve. A larger filter size allows for cleaner cuts on very sharp changes.

:control windowSize:

The window size. As novelty estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally.

:control hopSize:

The window hop size. As novelty estimation relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts.

:control fftSize:

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.

:control maxFFTSize:

The maximum FFT size, which determines how much memory is allocated at instantiation time. This cannot be modulated.

:control maxKernelSize:

The maximum value that kernelSize can be set to. This cannot be modulated.

:control maxFilterSize:

The maximum value that filterSize can be set to. This cannot be modulated.

:control padding:

Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centering the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.
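
The three padding modes above can be summarised in a short Python sketch. The helper names ``padding_amount`` and ``pad_segment`` are hypothetical, and treating "half the window size" as an integer division is an assumption of this example.

```python
def padding_amount(mode, window_size, hop_size):
    """Number of zero samples added at each end for a given padding mode."""
    if mode == 0:
        return 0                        # no padding
    if mode == 1:
        return window_size // 2         # half the window size (default)
    if mode == 2:
        return window_size - hop_size   # window size - hop size
    raise ValueError("padding mode must be 0, 1 or 2")


def pad_segment(samples, mode, window_size, hop_size):
    """Return the segment with zeros added to both ends."""
    zeros = [0.0] * padding_amount(mode, window_size, hop_size)
    return zeros + list(samples) + zeros


# With a window of 1024 and a hop of 256 (overlap factor of 4):
half_window = padding_amount(1, 1024, 256)
full_overlap = padding_amount(2, 1024, 256)
```

With an overlap factor of 4, mode 2 adds more zeros than mode 1, which is what lets the first and last input samples be covered by as many analysis frames as the rest of the material.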

:control action:

A Function to be evaluated once the offline process has finished and the features instance variables have been updated on the client side. The function will be passed features as an argument.

2 changes: 1 addition & 1 deletion doc/BufNoveltySlice.rst
@@ -37,7 +37,7 @@

The index of the buffer where the indices (in sample) of the estimated starting points of slices will be written. The first and last points are always the boundary points of the analysis.

-:control feature:
+:control algorithm:

The feature on which novelty is computed.

106 changes: 106 additions & 0 deletions doc/BufOnsetFeature.rst
@@ -0,0 +1,106 @@
:digest: Buffer-Based Spectral Difference Feature
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: BufOnsetSlice, BufNoveltyFeature, BufAmpFeature
:description: Calculate the spectral difference feature used by :fluid-obj:`BufOnsetSlice`.
:discussion:
Given a source buffer, calculates the feature used by :fluid-obj:`BufOnsetSlice` and copies it to another buffer.

The metric for calculating difference can be chosen from a curated selection, lending the algorithm toward slicing a broad range of musical materials.

:process: This is the method that calls for the spectral difference feature to be calculated on a given source buffer.
:output: Nothing, as the various destination buffers are declared in the function call.

:control source:

The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be summed.

:control startFrame:

Where in the source buffer the analysis should start, in samples.

:control numFrames:

How many frames should be processed.

:control startChan:

For multichannel sources, which channel should be processed.

:control numChans:

For multichannel sources, how many channels should be summed.

:control features:

The index of the buffer where the onset features will be written to.

:control metric:

The metric used to derive a difference curve between spectral frames. It can be any of the following:

:enum:

:0:
**Energy** thresholds on (sum of squares of magnitudes / nBins) (like Onsets \power)

:1:
**HFC** thresholds on (sum of (squared magnitudes * binNum) / nBins)

:2:
**SpectralFlux** thresholds on (difference in magnitude between consecutive frames, half rectified)

:3:
**MKL** thresholds on (sum of log of magnitude ratio per bin) (or equivalently, sum of difference of the log magnitude per bin) (like Onsets \mkl)

:4:
**IS** (WILL PROBABLY BE REMOVED) Itakura–Saito divergence (see literature)

:5:
**Cosine** thresholds on (cosine distance between comparison frames)

:6:
**PhaseDev** takes the past two frames, projects them to the current frame as anticipated for a steady state, then computes the sum of the differences, on which it thresholds (like Onsets \phase)

:7:
**WPhaseDev** same as PhaseDev, but weighted by the magnitude in order to remove the chaotic noise floor (like Onsets \wphase)

:8:
**ComplexDev** same as PhaseDev, but in the complex domain - the anticipated amp is considered steady, and the phase is projected, then a complex subtraction is done with the actual present frame. The sum of magnitudes is used to threshold (like Onsets \complex)

:9:
**RComplexDev** same as above, but rectified (like Onsets \rcomplex)
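
The first three metrics in the list above can be written directly from their descriptions. The Python sketch below is an illustration only: it returns the raw per-frame value (no thresholding), it assumes zero-based bin numbers for HFC, and it assumes the half-rectified flux is summed across bins. The function names are hypothetical.

```python
def energy(mags):
    """Energy: sum of squares of magnitudes / nBins."""
    return sum(m * m for m in mags) / len(mags)


def hfc(mags):
    """HFC: sum of (squared magnitudes * binNum) / nBins.

    Assumption: bins are numbered from 0.
    """
    return sum(m * m * k for k, m in enumerate(mags)) / len(mags)


def spectral_flux(prev_mags, mags):
    """SpectralFlux: half-rectified magnitude difference between two frames.

    Assumption: the per-bin differences are summed into a single value.
    """
    return sum(max(0.0, m - p) for p, m in zip(prev_mags, mags))


# Two toy magnitude frames with two bins each.
previous = [1.0, 3.0]
current = [2.0, 1.0]
values = (energy(current), hfc(current), spectral_flux(previous, current))
```

Only the bin that increased contributes to the flux here, since the decrease in the second bin is discarded by the half rectification.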

:control filterSize:

The size of a smoothing filter that is applied to the feature curve. A larger filter size allows for cleaner cuts on very sharp changes.

:control frameDelta:

For certain metrics (HFC, SpectralFlux, MKL, Cosine), the distance does not have to be computed between consecutive frames. By default (0) it is; otherwise, this sets the distance between the comparison windows, in samples.

:control windowSize:

The window size. As spectral differencing relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty

:control hopSize:

The window hop size. As spectral differencing relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts. The -1 default value will default to half of windowSize (overlap of 2).

:control fftSize:

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The -1 default value will default to windowSize.
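
The -1 defaults described for hopSize and fftSize, together with the constraints on fftSize, can be sketched in Python as follows. The function name ``resolve_fft_settings`` is hypothetical, and rounding a defaulted FFT size up to the next power of two when windowSize is not itself a power of two is an assumption of this example, not something the text above states.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def resolve_fft_settings(window_size, hop_size=-1, fft_size=-1):
    """Resolve the -1 defaults and check the fftSize constraints."""
    if hop_size == -1:
        hop_size = window_size // 2          # overlap of 2
    if fft_size == -1:
        # Assumption: snap up so that the power-of-two rule still holds.
        fft_size = max(4, next_power_of_two(window_size))
    if fft_size < 4 or fft_size < window_size:
        raise ValueError("fftSize must be at least 4 and at least windowSize")
    if fft_size & (fft_size - 1) != 0:
        raise ValueError("fftSize must be a power of 2")
    return window_size, hop_size, fft_size


defaults = resolve_fft_settings(1024)
oversampled = resolve_fft_settings(1024, hop_size=256, fft_size=2048)
```

Passing an fftSize larger than the window, as in the second call, corresponds to the spectral oversampling mentioned above.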

:control maxFFTSize:

The maximum FFT size, which determines how much memory is allocated at instantiation time. This cannot be modulated.

:control padding:

Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centering the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.

:control action:

A Function to be evaluated once the offline process has finished and the features instance variables have been updated on the client side. The function will be passed features as an argument.

66 changes: 66 additions & 0 deletions doc/NoveltyFeature.rst
@@ -0,0 +1,66 @@
:digest: Realtime Novelty Feature
:species: descriptor
:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: NoveltySlice, AmpFeature, OnsetFeature
:description: Calculates the novelty feature of audio in realtime.
:discussion:
Calculate novelty in realtime, the feature used by :fluid-obj:`NoveltySlice` to perform segmentation.

Novelty is derived by running a kernel across the diagonal of the similarity matrix. It implements the seminal results published in 'Automatic Audio Segmentation Using a Measure of Audio Novelty' by J. Foote.

:process: The audio rate version of the object.
:output: A KR signal of the feature.

:control algorithm:

The feature on which novelty is computed.

:enum:

:0:
Spectrum – The magnitude of the full spectrum.

:1:
MFCC – 13 Mel-Frequency Cepstrum Coefficients.

:2:
Chroma – The contour of a 12-band chromagram.

:3:
Pitch – The pitch and its confidence.

:4:
Loudness – The true peak and loudness.

:control kernelSize:

The granularity of the window in which the algorithm looks for change, in samples. A small number will be sensitive to short term changes, and a large number should look for long term changes.

:control filterSize:

The size of a smoothing filter that is applied to the novelty curve. A larger filter size allows for cleaner cuts on very sharp changes.

:control windowSize:

The window size. As novelty estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally.

:control hopSize:

The window hop size. As novelty estimation relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts.

:control fftSize:

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.

:control maxFFTSize:

The maximum FFT size, which determines how much memory is allocated at instantiation time. This cannot be modulated.

:control maxKernelSize:

The maximum value that kernelSize can be set to. This cannot be modulated.

:control maxFilterSize:

The maximum value that filterSize can be set to. This cannot be modulated.