Merge pull request #404 from librosa/cache-levels

Cache levels
librosa · Aug 17, 2016 · 397c724
2 parents 98469b4 + 1b28275

Showing 15 changed files with 347 additions and 182 deletions.
diff --git a/docs/cache.rst b/docs/cache.rst
@@ -37,61 +37,93 @@ The default configuration can be overridden by setting the following environment
   - `LIBROSA_CACHE_MMAP` : optional memory mapping mode `{None, 'r+', 'r', 'w+', 'c'}`
   - `LIBROSA_CACHE_COMPRESS` : flag to enable compression of data on disk `{0, 1}`
   - `LIBROSA_CACHE_VERBOSE` : controls how much debug info is displayed. `{int, non-negative}`
+  - `LIBROSA_CACHE_LEVEL` : controls the caching level: the larger this value, the more data is cached. `{int}`
 
 Please refer to the `joblib.Memory` `documentation
 <https://pythonhosted.org/joblib/memory.html#memory-reference>`_ for a detailed explanation of these
 parameters.
 
+
+Cache levels
+------------
+
+Cache levels operate in a fashion similar to logging levels.
+For small values of `LIBROSA_CACHE_LEVEL`, only the most important (frequently used) data are cached.
+As the cache level increases, broader classes of functions are cached.
+As a result, application code may run faster at the expense of larger disk usage.
+
+The caching levels are described as follows:
+
+    - 10: filter bases, independent of audio data (dct, mel, chroma, constant-q)
+    - 20: low-level features (cqt, stft, zero-crossings, etc.)
+    - 30: high-level features (tempo, beats, decomposition, recurrence, etc.)
+    - 40: post-processing (delta, stack_memory, normalize, sync)
+
+The default cache level is 10.
+
+
 Example
 -------
 To demonstrate how to use the cache, we'll first call an example script twice without caching::
 
-    [~/git/librosa/examples]$ time ./estimate_tuning.py ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3 
-    Loading  ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3
+    $ time -p ./estimate_tuning.py ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg 
+    Loading  ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg
     Separating harmonic component ... 
     Estimating tuning ... 
-    +6.00 cents
-    
-    real    0m4.369s
-    user    0m4.065s
-    sys     0m0.350s
-    
-    [~/git/librosa/examples]$ time ./estimate_tuning.py ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3 
-    Loading  ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3
+    +9.00 cents
+    real 6.74
+    user 6.03
+    sys 1.09
+
+    $ time -p ./estimate_tuning.py ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg 
+    Loading  ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg
     Separating harmonic component ... 
     Estimating tuning ... 
-    +6.00 cents
-    
-    real    0m4.414s
-    user    0m4.013s
-    sys     0m0.440s
-    
+    +9.00 cents
+    real 6.68
+    user 6.04
+    sys 1.05
+
 
 Next, we'll enable caching to `/tmp/librosa`::
 
-    [~/git/librosa/examples]$ export LIBROSA_CACHE_DIR=/tmp/librosa
+    $ export LIBROSA_CACHE_DIR=/tmp/librosa
+
+and set the cache level to 50::
+
+    $ export LIBROSA_CACHE_LEVEL=50
 
 And now we'll re-run the example script twice.  The first time, there will be no cached values, so the time
 should be similar to running without cache.  The second time, we'll be able to reuse intermediate values, so
 it should be significantly faster::
 
-    [~/git/librosa/examples]$ time ./estimate_tuning.py ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3 
-    Loading  ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3
+    $ time -p ./estimate_tuning.py ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg 
+    Loading  ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg
     Separating harmonic component ... 
     Estimating tuning ... 
-    +6.00 cents
-    
-    real    0m4.859s
-    user    0m4.471s
-    sys     0m0.429s
-    
-    [~/git/librosa/examples]$ time ./estimate_tuning.py ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3 
-    Loading  ../librosa/example_data/Kevin_MacLeod_-_Vibe_Ace.mp3
+    +9.00 cents
+    real 7.60
+    user 6.79
+    sys 1.15
+
+    $ time -p ./estimate_tuning.py ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg 
+    Loading  ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg
     Separating harmonic component ... 
     Estimating tuning ... 
-    +6.00 cents
-    
-    real    0m0.931s
-    user    0m0.862s
-    sys     0m0.112s
+    +9.00 cents
+    real 1.64
+    user 1.30
+    sys 0.74
 
+Reducing the cache level to 20 yields an intermediate acceleration::
+
+    $ export LIBROSA_CACHE_LEVEL=20
+
+    $ time -p ./estimate_tuning.py ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg 
+    Loading  ../librosa/util/example_data/Kevin_MacLeod_-_Vibe_Ace.ogg
+    Separating harmonic component ... 
+    Estimating tuning ... 
+    +9.00 cents
+    real 4.98
+    user 4.17
+    sys 1.22
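
To illustrate the level table documented in docs/cache.rst, the following minimal sketch lists which classes of functions would be cached for a given `LIBROSA_CACHE_LEVEL`. The `CACHE_CLASSES` table and the `cached_classes` helper are hypothetical names introduced only for this illustration and are not part of librosa. The comparison mirrors the `self.level >= level` check in librosa/cache.py.

```python
# Hypothetical table mirroring the levels listed in docs/cache.rst.
CACHE_CLASSES = {
    10: "filter bases",
    20: "low-level features",
    30: "high-level features",
    40: "post-processing",
}


def cached_classes(configured_level):
    """Return the function classes cached at the configured level."""
    # A function decorated with @cache(level=L) is cached only when
    # configured_level >= L, so larger settings cache more classes.
    return [name for level, name in sorted(CACHE_CLASSES.items())
            if configured_level >= level]


print(cached_classes(10))  # default level: filter bases only
print(cached_classes(20))  # adds low-level features
print(cached_classes(50))  # every documented class
```

Under this reading, a setting of 50 behaves the same as 40, since no documented class sits above level 40.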
diff --git a/librosa/beat.py b/librosa/beat.py
@@ -182,7 +182,7 @@ def beat_track(y=None, sr=22050, onset_envelope=None, hop_length=512,
     return (bpm, beats)
 
 
-@cache
+@cache(level=30)
 def estimate_tempo(onset_envelope, sr=22050, hop_length=512, start_bpm=120,
                    std_bpm=1.0, ac_size=4.0, duration=90.0, offset=0.0):
     """Estimate the tempo (beats per minute) from an onset envelope
@@ -225,6 +225,9 @@ def estimate_tempo(onset_envelope, sr=22050, hop_length=512, start_bpm=120,
     --------
     librosa.onset.onset_strength
 
+    Notes
+    -----
+    This function caches at level 30.
 
     Examples
     --------
@@ -297,7 +300,6 @@ def estimate_tempo(onset_envelope, sr=22050, hop_length=512, start_bpm=120,
     return start_bpm
 
 
-@cache
 def __beat_tracker(onset_envelope, bpm, fft_res, tightness, trim):
     """Internal function that tracks beats in an onset strength envelope.
 

diff --git a/librosa/cache.py b/librosa/cache.py
@@ -15,35 +15,51 @@ class CacheManager(Memory):
     field, thereby allowing librosa.cache to act as a decorator function.
     '''
 
-    def __call__(self, function):
-        '''Decorator function.  Adds an input/output cache to
-        the specified function.'''
+    def __init__(self, cachedir, level=10, **kwargs):
+        super(CacheManager, self).__init__(cachedir, **kwargs)
+        # The level parameter controls which data we cache
+        # smaller numbers mean less caching
+        self.level = level
 
-        from decorator import FunctionMaker
+    def __call__(self, level):
+        '''Example usage:
 
-        def decorator_apply(dec, func):
-            """Decorate a function by preserving the signature even if dec
-            is not a signature-preserving decorator.
+        @cache(level=20)
+        def semi_important_function(some_arguments):
+            ...
+        '''
+        def wrapper(function):
+            '''Decorator function.  Adds an input/output cache to
+            the specified function.'''
 
-            This recipe is derived from
-            http://micheles.googlecode.com/hg/decorator/documentation.html#id14
-            """
+            from decorator import FunctionMaker
 
-            return FunctionMaker.create(
-                func, 'return decorated(%(signature)s)',
-                dict(decorated=dec(func)), __wrapped__=func)
+            def decorator_apply(dec, func):
+                """Decorate a function by preserving the signature even if dec
+                is not a signature-preserving decorator.
 
-        if self.cachedir is not None:
-            return decorator_apply(self.cache, function)
+                This recipe is derived from
+                http://micheles.googlecode.com/hg/decorator/documentation.html#id14
+                """
+
+                return FunctionMaker.create(
+                    func, 'return decorated(%(signature)s)',
+                    dict(decorated=dec(func)), __wrapped__=func)
+
+            if self.cachedir is not None and self.level >= level:
+                return decorator_apply(self.cache, function)
+
+            else:
+                return function
+        return wrapper
 
-        else:
-            return function
 
 # Instantiate the cache from the environment
 CACHE = CacheManager(os.environ.get('LIBROSA_CACHE_DIR', None),
                      mmap_mode=os.environ.get('LIBROSA_CACHE_MMAP', None),
                      compress=os.environ.get('LIBROSA_CACHE_COMPRESS', False),
-                     verbose=int(os.environ.get('LIBROSA_CACHE_VERBOSE', 0)))
+                     verbose=int(os.environ.get('LIBROSA_CACHE_VERBOSE', 0)),
+                     level=int(os.environ.get('LIBROSA_CACHE_LEVEL', 10)))
 
 # Override the module's __call__ attribute
 sys.modules[__name__] = CACHE
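
The level gate in `CacheManager.__call__` can be tried out in isolation with a small stand-in. This is a sketch under stated assumptions, not the library's implementation: `LevelCache` is a hypothetical class that swaps `joblib.Memory` for a plain in-memory dictionary and uses `functools.wraps` in place of the `decorator.FunctionMaker` recipe. Only the gating condition is taken from the diff.

```python
import functools


class LevelCache(object):
    """Hypothetical in-memory stand-in for CacheManager (no joblib)."""

    def __init__(self, enabled, level=10):
        self.enabled = enabled  # plays the role of `cachedir is not None`
        self.level = level      # configured cache level
        self.store = {}         # assumption: dict instead of on-disk cache

    def __call__(self, level):
        def wrapper(function):
            # Same gate as the diff: cache only when caching is enabled
            # and the configured level is at least the function's level.
            if not self.enabled or self.level < level:
                return function

            @functools.wraps(function)
            def cached(*args):
                key = (function.__name__, args)
                if key not in self.store:
                    self.store[key] = function(*args)
                return self.store[key]
            return cached
        return wrapper


cache = LevelCache(enabled=True, level=20)
calls = []


@cache(level=20)
def low_level(x):
    calls.append('low')
    return x * 2


@cache(level=30)
def high_level(x):
    calls.append('high')
    return x + 1


low_level(3)
low_level(3)   # served from the store, so the body does not run again
high_level(3)
high_level(3)  # level 30 > 20: never wrapped, so the body runs each time
print(calls)
```

Note that the gate is evaluated once, at decoration time, so in this sketch changing `cache.level` afterwards does not affect functions that were already decorated.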
diff --git a/librosa/core/audio.py b/librosa/core/audio.py
@@ -152,7 +152,7 @@ def load(path, sr=22050, mono=True, offset=0.0, duration=None,
     return (y, sr)
 
 
-@cache
+@cache(level=20)
 def to_mono(y):
     '''Force an audio signal down to mono.
 
@@ -166,6 +166,10 @@ def to_mono(y):
     y_mono : np.ndarray [shape=(n,)]
         `y` as a monophonic time-series
 
+    Notes
+    -----
+    This function caches at level 20.
+
     Examples
     --------
     >>> y, sr = librosa.load(librosa.util.example_audio_file(), mono=False)
@@ -186,7 +190,7 @@ def to_mono(y):
     return y
 
 
-@cache
+@cache(level=20)
 def resample(y, orig_sr, target_sr, res_type='kaiser_best', fix=True, scale=False, **kwargs):
     """Resample a time series from orig_sr to target_sr
 
@@ -233,6 +237,10 @@ def resample(y, orig_sr, target_sr, res_type='kaiser_best', fix=True, scale=Fals
     scipy.signal.resample
     resampy.resample
 
+    Notes
+    -----
+    This function caches at level 20.
+
     Examples
     --------
     Downsample from 22 KHz to 8 KHz
@@ -355,7 +363,7 @@ def get_duration(y=None, sr=22050, S=None, n_fft=2048, hop_length=512,
     return float(n_samples) / sr
 
 
-@cache
+@cache(level=20)
 def autocorrelate(y, max_size=None, axis=-1):
     """Bounded auto-correlation
 
@@ -379,6 +387,10 @@ def autocorrelate(y, max_size=None, axis=-1):
         If `max_size` is specified, then `z.shape[axis]` is bounded
         to `max_size`.
 
+    Notes
+    -----
+    This function caches at level 20.
+
     Examples
     --------
     Compute full autocorrelation of y
@@ -422,7 +434,7 @@ def autocorrelate(y, max_size=None, axis=-1):
     return autocorr
 
 
-@cache
+@cache(level=20)
 def zero_crossings(y, threshold=1e-10, ref_magnitude=None, pad=True,
                    zero_pos=True, axis=-1):
     '''Find the zero-crossings of a signal `y`: indices `i` such that
@@ -431,6 +443,42 @@ def zero_crossings(y, threshold=1e-10, ref_magnitude=None, pad=True,
     If `y` is multi-dimensional, then zero-crossings are computed along
     the specified `axis`.
 
+
+    Parameters
+    ----------
+    y : np.ndarray
+        The input array
+
+    threshold : float > 0 or None
+        If specified, values where `-threshold <= y <= threshold` are
+        clipped to 0.
+
+    ref_magnitude : float > 0 or callable
+        If numeric, the threshold is scaled relative to `ref_magnitude`.
+
+        If callable, the threshold is scaled relative to
+        `ref_magnitude(np.abs(y))`.
+
+    pad : boolean
+        If `True`, then `y[0]` is considered a valid zero-crossing.
+
+    zero_pos : boolean
+        If `True` then the value 0 is interpreted as having positive sign.
+
+        If `False`, then 0, -1, and +1 all have distinct signs.
+
+    axis : int
+        Axis along which to compute zero-crossings.
+
+    Returns
+    -------
+    zero_crossings : np.ndarray [shape=y.shape, dtype=boolean]
+        Indicator array of zero-crossings in `y` along the selected axis.
+
+    Notes
+    -----
+    This function caches at level 20.
+
     Examples
     --------
     >>> # Generate a time-series
@@ -472,38 +520,6 @@ def zero_crossings(y, threshold=1e-10, ref_magnitude=None, pad=True,
     >>> # Find the indices of zero-crossings
     >>> np.nonzero(z)
     (array([ 0,  3,  5,  8, 10, 12, 15, 17, 19]),)
-
-
-    Parameters
-    ----------
-    y : np.ndarray
-        The input array
-
-    threshold : float > 0 or None
-        If specified, values where `-threshold <= y <= threshold` are
-        clipped to 0.
-
-    ref_magnitude : float > 0 or callable
-        If numeric, the threshold is scaled relative to `ref_magnitude`.
-
-        If callable, the threshold is scaled relative to
-        `ref_magnitude(np.abs(y))`.
-
-    pad : boolean
-        If `True`, then `y[0]` is considered a valid zero-crossing.
-
-    zero_pos : boolean
-        If `True` then the value 0 is interpreted as having positive sign.
-
-        If `False`, then 0, -1, and +1 all have distinct signs.
-
-    axis : int
-        Axis along which to compute zero-crossings.
-
-    Returns
-    -------
-    zero_crossings : np.ndarray [shape=y.shape, dtype=boolean]
-        Indicator array of zero-crossings in `y` along the selected axis.
     '''
 
     # Clip within the threshold
@@ -543,7 +559,6 @@ def zero_crossings(y, threshold=1e-10, ref_magnitude=None, pad=True,
                   constant_values=pad)
 
 
-@cache
 def clicks(times=None, frames=None, sr=22050, hop_length=512,
            click_freq=1000.0, click_duration=0.1, click=None, length=None):
     """Returns a signal with the signal `click` placed at each specified time