# FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.

## Usage

1. Download the dataset from <https://github.com/mdeff/fma>.
2. Uncompress the archive, e.g. with `unzip fma_small.zip`.
3. Load and play with the data in this notebook.

In [None]:
%matplotlib inline

import utils
import librosa
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import IPython.display as ipd
import os.path
import ast

In [None]:
DATA_DIR = os.environ.get('DATA_DIR')

# Load meta-data.
tracks = pd.read_csv(os.path.join(DATA_DIR, 'tracks.csv'), index_col=0, converters={'genres': ast.literal_eval})
features = pd.read_csv(os.path.join(DATA_DIR, 'features.csv'), index_col=0, header=[0, 1, 2])
echonest = pd.read_csv(os.path.join(DATA_DIR, 'echonest.csv'), index_col=0, header=[0, 1, 2])

# Construct a path() function which will retrieve the audio file of any track_id.
path = utils.build_path(tracks, os.path.join(DATA_DIR, 'fma_small'))

## 1 Metadata

The metadata table, a CSV file (`tracks.csv`) in the root directory of the archive, is composed of many columns:
1. The index is the ID of the song, taken from the FMA and used as the name of the audio file.
2. Metadata from the Free Music Archive website (4 columns).
3. A column to indicate if the song is part of the training or testing set.
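
The split column can be used directly as a boolean mask. The following is a minimal sketch on a toy table, not on the real data: the track IDs and genres are invented, and only the column names `train` and `top_genre` are taken from the cells further down in this notebook.

```python
import pandas as pd

# Toy stand-in for the metadata table; IDs and genres are invented.
toy_tracks = pd.DataFrame(
    {
        'top_genre': ['Rock', 'Pop', 'Rock', 'Folk'],
        'train': [True, True, False, True],
    },
    index=pd.Index([2, 5, 10, 140], name='track_id'),
)

# The boolean column selects the training rows; its negation selects the test rows.
is_train = toy_tracks['train']
n_train = int(is_train.sum())
n_test = int((~is_train).sum())
print('{} training tracks, {} testing tracks'.format(n_train, n_test))

# The same mask filters any column that shares the index.
train_genres = toy_tracks.loc[is_train, 'top_genre']
```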


In [None]:
ipd.display(tracks.head())

## 2 Features

1. Features extracted from the audio for all tracks.
    1. MFCCs
2. For the `small` and `medium` datasets, data collected from the [Echonest](http://the.echonest.com/) API:
    1. Meta-data (5 columns).
    2. Temporal features: one vector per song.
    3. Audio features (8 columns).
    4. Social features (5 columns).
    5. Rankings (5 columns).
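
The feature tables are read with a three-level column header (`header=[0, 1, 2]`), so a group of columns is selected by giving the first one or two levels. Here is a small sketch of that indexing on a toy frame. The leaf column names (`tempo`, `energy`, `artist_rank`) and all values are assumptions made for illustration, not the actual Echonest fields.

```python
import pandas as pd

# Toy frame with three column levels, mimicking the layout of the feature tables.
toy_columns = pd.MultiIndex.from_tuples([
    ('echonest', 'audio_features', 'tempo'),
    ('echonest', 'audio_features', 'energy'),
    ('echonest', 'ranks', 'artist_rank'),
])
toy_echonest = pd.DataFrame(
    [[120.0, 0.8, 15.0], [95.5, 0.4, 230.0]],
    index=pd.Index([2, 5], name='track_id'),
    columns=toy_columns,
)

# Giving the first two levels returns a frame whose columns are the third level.
toy_audio = toy_echonest['echonest', 'audio_features']
print(list(toy_audio.columns))

# Giving all three levels returns a single column as a Series.
toy_tempo = toy_echonest['echonest', 'audio_features', 'tempo']
```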

In [None]:
ipd.display(features['mfcc'].head())

In [None]:
ipd.display(echonest['echonest', 'metadata'].head())
ipd.display(echonest['echonest', 'audio_features'].head())
ipd.display(echonest['echonest', 'social_features'].head())
ipd.display(echonest['echonest', 'ranks'].head())

In [None]:
ipd.display(echonest['echonest', 'temporal_features'].head())
x = echonest.loc[10060, ('echonest', 'temporal_features')]
plt.figure(figsize=(15, 5))
plt.plot(x);

## 3 Audio

You can listen to an audio excerpt with the code below.

In [None]:
filename = path(0)
print('File: {}'.format(filename))
ipd.Audio(filename)

You can also use [librosa](https://github.com/librosa/librosa) to extract the raw waveform and compute audio features.

In [None]:
x, sr = librosa.load(filename)
print('Duration: {:.2f}s, {} samples'.format(x.shape[0] / sr, x.size))
ipd.display(ipd.Audio(data=x, rate=sr))

plt.figure(figsize=(15, 5))
plt.plot(x)

plt.figure(figsize=(15, 5))
S, freqs, bins, im = plt.specgram(x, NFFT=1024, Fs=sr, noverlap=512)
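
To make the spectrogram parameters concrete, here is a rough NumPy sketch of a short-time Fourier transform with the same frame length (1024) and overlap (512) as the `specgram` call. It runs on a synthetic 440 Hz tone rather than on a track, uses a Hann window, and matches `specgram` only up to scaling and detrending. The names `tone`, `toy_sr`, `power` and `freq_axis` are introduced here for illustration.

```python
import numpy as np

toy_sr = 22050                      # sampling rate in Hz (librosa.load's default)
t = np.arange(toy_sr) / toy_sr      # one second of time stamps
tone = np.sin(2 * np.pi * 440.0 * t)

n_fft = 1024                        # frame length, as NFFT above
hop = n_fft - 512                   # noverlap=512 means frames start every 512 samples
window = np.hanning(n_fft)

# Cut the signal into overlapping windowed frames (incomplete last frame dropped).
n_frames = 1 + (tone.size - n_fft) // hop
frames = np.stack([tone[i * hop:i * hop + n_fft] * window for i in range(n_frames)])

# Power spectrum of each frame: shape (n_frames, n_fft // 2 + 1).
power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
freq_axis = np.fft.rfftfreq(n_fft, d=1.0 / toy_sr)

# The strongest bin should sit within one bin width of 440 Hz.
peak_hz = freq_axis[power.mean(axis=0).argmax()]
print('Peak near {:.0f} Hz'.format(peak_hz))
```

The frequency resolution is `toy_sr / n_fft`, roughly 21.5 Hz per bin, so a longer frame gives finer frequency detail at the cost of coarser time resolution.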

## 4 Genre classification

### 4.1 From features

In [None]:
# Make sure that both tables describe the same tracks, in the same order.
np.testing.assert_array_equal(tracks.index, features.index)

# DataFrame.as_matrix() was removed from pandas; to_numpy() is its replacement.
X = features['mfcc'].to_numpy()
y = tracks['top_genre'].to_numpy()

In [None]:
train = tracks['train'] == True

y_train = y[train]
y_test = y[~train]
X_train = X[train]
X_test = X[~train]

print('{} training examples, {} testing examples'.format(y_train.size, y_test.size))
print('{} features, {} classes'.format(X_train.shape[1], np.unique(y_train).size))

In [None]:
# Standardize features by removing the mean and scaling to unit variance.
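# The statistics are estimated on the training set only, then applied to both sets.
# Assign the results: in-place scaling with copy=False is not guaranteed.
scaler = StandardScaler(copy=False)
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)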

# Support vector classification.
clf = SVC()
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)
print('Accuracy: {:.2f} %'.format(score*100))
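
As an alternative to scaling by hand, the two steps can be chained so that the scaler is always fitted on the training data only. The sketch below uses scikit-learn's `make_pipeline` on synthetic two-class data, not on the FMA features. The data, the class labels and the names `X_toy`, `y_toy` and `toy_model` are made up for the example, and the resulting score says nothing about genre classification.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Two well-separated synthetic classes; the second feature has a much larger scale.
n = 100
X_toy = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(n, 2)),
]) * np.array([1.0, 1000.0])
y_toy = np.array(['a'] * n + ['b'] * n)

# Random 150/50 split into training and testing examples.
order = rng.permutation(2 * n)
train_idx, test_idx = order[:150], order[150:]

# fit() standardizes with training statistics; score() reuses them on the test set.
toy_model = make_pipeline(StandardScaler(), SVC())
toy_model.fit(X_toy[train_idx], y_toy[train_idx])
toy_score = toy_model.score(X_toy[test_idx], y_toy[test_idx])
print('Toy accuracy: {:.2f} %'.format(toy_score * 100))
```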

### 4.2 From audio