# FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.

## Features

The notebook generates:
* `features.csv`: common features extracted with librosa.
* `spotify.json`: audio features provided by Spotify (formerly The Echo Nest). Not implemented yet, see section 5.

TODO:
* extract the features that are given for the Million Song Dataset (MSD)

All features are extracted using [librosa](https://github.com/librosa/librosa). Alternatives:
* [MARSYAS](https://github.com/marsyas/marsyas) (C++ with Python bindings)
* [RP extract](http://www.ifs.tuwien.ac.at/mir/downloads.html) (Matlab, Java, Python)
* [jMIR jAudio](http://jmir.sourceforge.net) (Java)
* [MIRtoolbox](https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox) (Matlab)

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

import utils
from tqdm import tqdm_notebook
import librosa
import pandas as pd
import numpy as np
from scipy import stats
import os
import ast

In [None]:
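# Root directory of the dataset; raises KeyError early if DATA_DIR is not set.
DATA_DIR = os.environ['DATA_DIR']
# The genres column stores Python list literals, so parse it back into lists.
tracks = pd.read_csv(os.path.join(DATA_DIR, 'tracks.csv'), index_col=0, converters={'genres': ast.literal_eval})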
#tracks = pd.read_json(os.path.join(DATA_DIR, 'tracks.json'), orient='split')
path = utils.build_path(tracks, os.path.join(DATA_DIR, 'fma_small'))
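
The `converters` argument matters because a CSV file stores the `genres` lists as plain strings. The following is a minimal sketch on a made-up two-row table; the track IDs, titles and genre IDs are placeholders, not values from the dataset.

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for tracks.csv; all values are made up for illustration.
demo_csv = io.StringIO(
    'track_id,title,genres\n'
    '2,Track A,"[21]"\n'
    '5,Track B,"[21, 10]"\n'
)

# Without the converter, the column would hold the strings '[21]' and '[21, 10]'.
demo_tracks = pd.read_csv(demo_csv, index_col=0, converters={'genres': ast.literal_eval})

print(type(demo_tracks.loc[5, 'genres']))  # a Python list rather than a str
print(demo_tracks.loc[5, 'genres'])
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for this kind of parsing.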

In [None]:
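# Number of rows (coefficients) that each librosa feature returns per frame.
# spectral_contrast yields n_bands + 1 = 7 rows for n_bands=6.
feature_sizes = dict(chroma_stft=12, chroma_cqt=12, chroma_cens=12, tonnetz=6, mfcc=13, rmse=1, zcr=1,
                     spectral_centroid=1, spectral_bandwidth=1, spectral_contrast=7, spectral_rolloff=1)
columns = []
# One column per (feature, statistic, coefficient number), numbered from '01'.
for name, size in feature_sizes.items():
    for moment in ('mean', 'std', 'skew', 'kurtosis'):
        columns.extend((name, moment, '{:02d}'.format(i+1)) for i in range(size))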

columns = pd.MultiIndex.from_tuples(columns, names=('feature', 'statistics', 'number'))

features = pd.DataFrame(index=tracks.index, columns=columns, dtype=np.float32)

# More performant to slice if indexes are sorted.
features.sort_index(axis=0, inplace=True)
features.sort_index(axis=1, inplace=True)
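
To illustrate the resulting layout, the sketch below builds the same three-level column index for two features only and slices it. The track IDs are placeholders; the real index comes from `tracks.index`.

```python
import numpy as np
import pandas as pd

# Reduced version of the table above: two features instead of eleven.
demo_sizes = dict(mfcc=13, zcr=1)
demo_columns = []
for name, size in demo_sizes.items():
    for moment in ('mean', 'std', 'skew', 'kurtosis'):
        demo_columns.extend((name, moment, '{:02d}'.format(i + 1)) for i in range(size))
demo_columns = pd.MultiIndex.from_tuples(demo_columns, names=('feature', 'statistics', 'number'))

# Placeholder track IDs, filled with zeros.
demo = pd.DataFrame(0.0, index=[2, 5, 10], columns=demo_columns, dtype=np.float32)
demo = demo.sort_index(axis=1)

print(demo.shape)                          # 3 tracks x (13 + 1) * 4 columns
print(demo['mfcc', 'mean'].shape)          # the 13 MFCC means of every track
print(demo.loc[5, ('zcr', 'std')].shape)   # one track, one statistic, one coefficient
```

Selecting with a partial key such as `('mfcc', 'mean')` returns all coefficients of that feature and statistic, which is what the extraction loop relies on when it assigns a whole vector at once.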

## 1 Segmentation

## 2 Low-level features

* Timbre (short-term): ZCR, SC, SR, SF, MFCC, DWCH
* Temporal: SM, ARM, FP, AM

TODO:
* parallel implementation
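
One possible shape for a parallel version is sketched here, under assumptions: `compute_row` is a hypothetical placeholder that draws random frames instead of loading audio, and the track IDs are made up. Each worker returns its own row, so no shared DataFrame is written to concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd

demo_columns = pd.MultiIndex.from_tuples(
    [('mfcc', 'mean', '{:02d}'.format(i + 1)) for i in range(13)],
    names=('feature', 'statistics', 'number'))


def compute_row(tid):
    """Return the 13 per-coefficient means of one track (placeholder computation)."""
    # Stand-in for loading the audio and extracting MFCC frames with librosa.
    rng = np.random.default_rng(tid)
    frames = rng.normal(size=(13, 100))  # 13 coefficients over 100 frames
    return frames.mean(axis=1)


track_ids = [2, 5, 10, 140]  # hypothetical track IDs

# pool.map preserves the order of track_ids, so rows line up with the index.
with ThreadPoolExecutor(max_workers=2) as pool:
    rows = list(pool.map(compute_row, track_ids))

demo = pd.DataFrame(np.vstack(rows), index=track_ids, columns=demo_columns, dtype=np.float32)
print(demo.shape)
```

Threads keep the sketch runnable inside a notebook. For CPU-bound feature extraction a process pool (`concurrent.futures.ProcessPoolExecutor`) is likely the better fit, provided the worker function lives in an importable module.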

In [None]:
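def feature_stats(name, values):
    """Store the mean, std, skewness and kurtosis over time of each row of `values`.

    Relies on the global `tid`, set by the loop below, to select the track row.
    """
    features.loc[tid, (name, 'mean')] = values.mean(axis=1)
    features.loc[tid, (name, 'std')] = values.std(axis=1)
    features.loc[tid, (name, 'skew')] = stats.skew(values, axis=1)
    features.loc[tid, (name, 'kurtosis')] = stats.kurtosis(values, axis=1)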

for i, tid in enumerate(tqdm_notebook(tracks.index)):
    # Load at the native sampling rate (sr=None) and downmix to mono.
    x, sr = librosa.load(path(i), sr=None, mono=True)  # res_type='kaiser_fast'
    # Magnitude spectrograms shared by the features below.
    stft = np.abs(librosa.stft(x, n_fft=2048, hop_length=512))
    cqt = np.abs(librosa.cqt(x, sr=sr, hop_length=512, bins_per_octave=12, n_bins=7*12, tuning=None))

    # Pass sr explicitly: the audio is at its native rate, not librosa's 22050 Hz default.
    c = librosa.feature.chroma_stft(sr=sr, n_chroma=12, S=stft**2)
    feature_stats('chroma_stft', c)
    c = librosa.feature.chroma_cqt(C=cqt, n_chroma=12, n_octaves=7)
    feature_stats('chroma_cqt', c)
    c = librosa.feature.chroma_cens(C=cqt, n_chroma=12, n_octaves=7)
    feature_stats('chroma_cens', c)
    t = librosa.feature.tonnetz(chroma=c)
    feature_stats('tonnetz', t)

    mel = librosa.feature.melspectrogram(sr=sr, S=stft**2)
    m = librosa.feature.mfcc(n_mfcc=13, S=librosa.power_to_db(mel))
    feature_stats('mfcc', m)

    rmse = librosa.feature.rmse(S=stft)  # renamed to librosa.feature.rms in newer librosa releases
    feature_stats('rmse', rmse)
    zcr = librosa.feature.zero_crossing_rate(x, frame_length=2048, hop_length=512)
    feature_stats('zcr', zcr)

    # Pass sr explicitly: these features map FFT bins to Hz, and the audio was
    # loaded at its native rate rather than librosa's 22050 Hz default.
    s = librosa.feature.spectral_centroid(S=stft, sr=sr)
    feature_stats('spectral_centroid', s)
    s = librosa.feature.spectral_bandwidth(S=stft, sr=sr)
    feature_stats('spectral_bandwidth', s)
    s = librosa.feature.spectral_contrast(S=stft, sr=sr, n_bands=6)
    feature_stats('spectral_contrast', s)
    s = librosa.feature.spectral_rolloff(S=stft, sr=sr)
    feature_stats('spectral_rolloff', s)
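
To make the four statistics concrete, here is a small sketch on a synthetic matrix instead of a real librosa feature: two coefficients (rows) observed over five frames (columns). The numbers are made up so that the results can be checked by hand.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for a feature matrix: 2 coefficients x 5 frames.
demo_values = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                        [0.0, 0.0, 0.0, 0.0, 10.0]])

# Each statistic is taken along the time axis, giving one value per coefficient.
demo_summary = {
    'mean': demo_values.mean(axis=1),
    'std': demo_values.std(axis=1),                   # population std (ddof=0)
    'skew': stats.skew(demo_values, axis=1),
    'kurtosis': stats.kurtosis(demo_values, axis=1),  # excess kurtosis (Fisher definition)
}

for name, value in demo_summary.items():
    print(name, value)
```

The first row is symmetric, so its skewness is zero. The second row has a single large frame, which shows up as a positive skewness.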

## 3 High-level features

* Pitch: PH/PCP, EPCP
* Rhythm: BH, BPM
* Harmony: CP, CH

## 4 Store features

In [None]:
assert not features.isnull().values.any()

# Digits after the decimal point in scientific notation; reused as tolerance when reading back.
ndigits = 10
filename = os.path.join(DATA_DIR, 'features.csv')
features.to_csv(filename, float_format='%.{}e'.format(ndigits))

#features.to_json(os.path.join(DATA_DIR, 'features.json'), orient='split')
#features.to_hdf('features.hdf', 'features')
#features.to_hdf('features_zlib.hdf', 'features', complevel=9, complib='zlib')
#features.to_hdf('features_bzip2.hdf', 'features', complevel=9, complib='bzip2')
#features.to_hdf('features_lzo.hdf', 'features', complevel=9, complib='lzo')
#features.to_hdf('features_blosc.hdf', 'features', complevel=9, complib='blosc')

In [None]:
tmp = pd.read_csv(filename, index_col=0, header=[0, 1, 2])
np.testing.assert_allclose(tmp.values, features.values, rtol=10**-ndigits)
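
The same round trip can be tried without touching the dataset. The sketch below writes a tiny table with made-up values to an in-memory buffer and reads it back; the three header rows correspond to the three column levels.

```python
import io

import numpy as np
import pandas as pd

demo_columns = pd.MultiIndex.from_tuples(
    [('zcr', 'mean', '01'), ('zcr', 'std', '01')],
    names=('feature', 'statistics', 'number'))

# Random placeholder values stored as float32, like the real feature table.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((3, 2)).astype(np.float32), index=[2, 5, 10], columns=demo_columns)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
demo.to_csv(buffer, float_format='%.10e')
buffer.seek(0)

# header=[0, 1, 2] rebuilds the three column levels from the three header rows.
restored = pd.read_csv(buffer, index_col=0, header=[0, 1, 2])
np.testing.assert_allclose(restored.values, demo.values, rtol=1e-10)
print(restored.columns.names)
```

Values come back as float64, so a cast to `np.float32` is needed if the original dtype matters downstream.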

## 5 Spotify features

TODO: fetch the audio features through the Spotify API (formerly The Echo Nest).