# Processing long files with streams

This notebook analyzes a collection of long recordings for quality-related audio features. It computes essentially the same features as the previous notebook (04 - Measuring Audio Quality), but summarizes each feature by its statistics over time.

The dataset that we analyze here consists of many long recordings (about 1 hour each), so loading an entire recording into memory for feature extraction would take more memory than a typical laptop has. Instead, librosa lets us work on one *block* of audio at a time using the `librosa.stream()` function. We compute the spectral roll-off and contrast features for each block, and then summarize the block-wise features to get a description of the entire recording. *Please note that the audio files we are using here are not available as part of this notebook.*
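
To get a feel for how much audio one block covers, here is a small arithmetic sketch. It assumes a 44.1 kHz file (a hypothetical sampling rate, chosen only for illustration) and the analysis parameters used later in this notebook: 2048 frames per block, with frame and hop lengths rescaled from 2048 and 512 samples at 22050 Hz. It also assumes uncentered frames, so that `n` frames span `(n - 1) * hop_length + frame_length` samples. The helper `block_samples` is not part of librosa.

```python
def block_samples(block_length, frame_length, hop_length):
    """Samples spanned by `block_length` uncentered, overlapping frames."""
    return (block_length - 1) * hop_length + frame_length


sr = 44100  # hypothetical sampling rate, for illustration only

# Rescale the 22050 Hz reference parameters to this sampling rate
frame_length = (2048 * sr) // 22050
hop_length = (512 * sr) // 22050

n_samples = block_samples(2048, frame_length, hop_length)
block_seconds = n_samples / sr

print(f"{n_samples} samples per block, about {block_seconds:.1f} seconds")
```

Under these assumptions each block holds well under a minute of audio, so only a small fraction of an hour-long recording is in memory at any time.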

The notebook collects the results in a pandas data frame and saves them to a CSV file that can be loaded later for analysis. This CSV file powers the visualization code in the next notebook.

**Features**:
- spectral roll-off: the 95% roll-off frequency, summarized by its [5, 25, 50, 75, 95] percentiles over time
- spectral contrast: averaged across 5 octave bands starting at 80 Hz, summarized by the same percentiles over time
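
As a rough illustration of the contrast summary, the sketch below averages a band-by-frame contrast matrix across bands and then takes percentiles over time. The input is random, made-up data standing in for real spectral contrast values, and the shape (5 bands by 1000 frames) is an assumption chosen for the example.

```python
import numpy as np

# Made-up stand-in for spectral contrast: 5 octave bands x 1000 frames (in dB)
rng = np.random.default_rng(0)
contrast = rng.uniform(10.0, 30.0, size=(5, 1000))

# Average across bands to get one contrast value per frame
mean_contrast = contrast.mean(axis=0)

# Summarize the frame-wise values by their percentiles over time
quantiles = [5, 25, 50, 75, 95]
values = np.percentile(mean_contrast, quantiles)
summary = {f"contrast_{q:02d}": float(v) for q, v in zip(quantiles, values)}

print(summary)
```

The roll-off summary follows the same pattern, except that roll-off is already a single value per frame and needs no averaging across bands.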

In [None]:
import librosa
import numpy as np
import pandas as pd
import os

In [None]:
def analyze_file(filename):
    
    # We need to know the sampling rate of the file in advance when streaming
    sr = librosa.get_samplerate(filename)
    
    # These are our analysis parameters, rescaled to match the sampling rate of the file in question
    frame_length = (2048 * sr) // 22050
    hop_length = (512 * sr) // 22050
    
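    # Set up the stream for the file.  We'll look at blocks of 2048 frames at a time.
    # The parameters are passed by keyword so the call stays unambiguous.
    stream = librosa.stream(filename,
                            block_length=2048,
                            frame_length=frame_length,
                            hop_length=hop_length)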
    
    # These lists will contain our extracted features
    rolloff = []
    contrasts = []
    
    # y here is one block's worth of audio, rather than the entire signal
    for y in stream:
        # Our analysis uses uncentered frames when streaming.  This avoids introducing artifacts at the block boundaries.
        S = np.abs(librosa.stft(y, n_fft=frame_length, hop_length=hop_length, center=False))
        
        # Compute the frame-wise roll-off and add it to the `rolloff` list
        rolloff.extend(librosa.feature.spectral_rolloff(S=S, sr=sr, roll_percent=0.95)[0])
        # Same for contrast; [1:] drops the first row, which covers frequencies below fmin
        contrasts.append(librosa.feature.spectral_contrast(S=S, sr=sr, fmin=80, n_bands=5)[1:])
    
    # Tidy up after ourselves: the stream is finished
    stream.close()
    
    # Now compute the statistics of the features, and collect them in a dict
    # (one row of the final pandas dataframe)
    contrasts = np.concatenate(contrasts, axis=1)
    mean_contrast = np.mean(contrasts, axis=0)
    
    data = dict(filename=os.path.basename(filename))
    quantiles = [5, 25, 50, 75, 95]
    R = np.percentile(rolloff, quantiles)
    C = np.percentile(mean_contrast, quantiles)
    for q, r, c in zip(quantiles, R, C):
        data['rolloff_{:02d}'.format(q)] = r
        data['contrast_{:02d}'.format(q)] = c
    
    # Return the summary dict; the caller assembles the dataframe
    return data

In [None]:
# Get all the files that have been ogg-encoded
files = librosa.util.find_files('swdata/', ext='ogg')

In [None]:
len(files)

In [None]:
df = pd.DataFrame.from_records([analyze_file(f) for f in files], index='filename')

In [None]:
df

In [None]:
df.to_csv('audio_quality.csv')
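
The saved table can later be read back with `pd.read_csv`. The sketch below is a minimal round trip that assumes made-up file names and feature values, and uses an in-memory buffer in place of `audio_quality.csv`. It is only meant to show how the `filename` index can be restored on load.

```python
import io

import pandas as pd

# Made-up records standing in for the output of analyze_file
records = [
    {"filename": "a.ogg", "rolloff_50": 3000.0, "contrast_50": 20.0},
    {"filename": "b.ogg", "rolloff_50": 4500.0, "contrast_50": 18.5},
]
df_example = pd.DataFrame.from_records(records, index="filename")

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df_example.to_csv(buffer)
buffer.seek(0)

# index_col restores the filename column as the index
restored = pd.read_csv(buffer, index_col="filename")
print(restored)
```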