# Research on audio matching

Steps:
1. Investigate available metadata
2. Investigate MP3 information compression
3. Check correlation between metadata & MP3
4. Attempt to predict metadata from MP3

## Setup

In [18]:
# IMPORTS
import os

import pandas as pd

import FMA.utils

In [10]:
# PATHS
class paths():
    # Folders
    DATA_F = '../../data/'
    FMA_SMALL_F = DATA_F + 'fma_small/'
    FMA_METADATA_F = DATA_F + 'fma_metadata/'

    # Input files

## Investigation

Ideas:
* Use `language_code` to give preference to songs in the same language
  * Find a way to detect the language from audio
* Use similarity first to search for exact matches
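The two ideas above can be combined in a small ranking routine. The following is only a sketch under stated assumptions: each track is a plain dict with hypothetical keys `track_id`, `features` (a fixed-length numeric vector) and `language_code`, and the `language_bonus` and `exact_threshold` values are arbitrary placeholders rather than tuned settings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # A zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates, language_bonus=0.05, exact_threshold=0.999):
    """Rank candidates: near-exact matches first, then by similarity plus language bonus.

    The dict keys and both numeric parameters are illustrative assumptions.
    """
    ranked = []
    for cand in candidates:
        sim = cosine_similarity(query["features"], cand["features"])
        same_language = (
            query.get("language_code") is not None
            and query.get("language_code") == cand.get("language_code")
        )
        ranked.append({
            "track_id": cand["track_id"],
            "similarity": sim,
            "score": sim + (language_bonus if same_language else 0.0),
            "exact_match": sim >= exact_threshold,
        })
    # Exact matches sort ahead of everything else; ties are broken by score
    ranked.sort(key=lambda r: (r["exact_match"], r["score"]), reverse=True)
    return ranked
```

With this ordering, a track whose features are almost identical to the query is listed first even when its language differs, and the language bonus only reorders the remaining candidates.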

In [None]:
# PATH TO AN EXAMPLE MP3 FILE
# FMA file names use zero-padded six-digit track IDs (see FMA.utils.get_audio_path)
mp3_ex_path = paths.FMA_SMALL_F + '000/000002.mp3'

## FMA metadata

In [19]:
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

# Load the audio file (track ID 2 from fma_small)
filename = FMA.utils.get_audio_path(paths.FMA_SMALL_F, 2)
y, sr = librosa.load(filename, sr=None)  # sr=None keeps the original sampling rate

# Extract Features
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # MFCCs
chroma = librosa.feature.chroma_stft(y=y, sr=sr)  # Chroma Features
spectral_contrast = librosa.feature.spectral_contrast(y=y, sr=sr)  # Spectral Contrast
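tempo, _ = librosa.beat.beat_track(y=y, sr=sr)  # Tempo (BPM)
tempo = float(np.atleast_1d(tempo)[0])  # Some librosa versions return tempo as a 1-element array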
zero_crossing_rate = librosa.feature.zero_crossing_rate(y)  # Zero-Crossing Rate

# Print extracted features
print(f"MFCCs Shape: {mfccs.shape}")
print(f"Chroma Features Shape: {chroma.shape}")
print(f"Spectral Contrast Shape: {spectral_contrast.shape}")
print(f"Tempo: {tempo}")
print(f"Zero-Crossing Rate Shape: {zero_crossing_rate.shape}")

# Plot the extracted features
plt.figure(figsize=(12, 8))

plt.subplot(5, 1, 1)
librosa.display.specshow(mfccs, sr=sr, x_axis="time")
plt.colorbar()
plt.title("MFCCs")

plt.subplot(5, 1, 2)
librosa.display.specshow(chroma, sr=sr, x_axis="time", cmap="coolwarm")
plt.colorbar()
plt.title("Chroma Features")

plt.subplot(5, 1, 3)
librosa.display.specshow(spectral_contrast, sr=sr, x_axis="time", cmap="magma")
plt.colorbar()
plt.title("Spectral Contrast")

plt.subplot(5, 1, 4)
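# Pass sr explicitly: times_like assumes 22050 Hz by default, but the file was loaded with sr=None
plt.plot(librosa.times_like(zero_crossing_rate, sr=sr), zero_crossing_rate[0])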
plt.title("Zero-Crossing Rate")
plt.xlabel("Time (s)")

plt.subplot(5, 1, 5)
plt.bar(["Tempo"], [tempo])
plt.title("Tempo (BPM)")

plt.tight_layout()
plt.show()


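AttributeError: module 'numpy' has no attribute '_no_nep50_warning'

Note: the cell above currently fails with this error. It most likely points to a version mismatch between NumPy and Numba (a librosa dependency), so aligning the two package versions and restarting the kernel is the first thing to try.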
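The feature matrices extracted above have one column per frame, so their width depends on track length. To compare tracks, one common option is to collapse each matrix into per-row statistics. The sketch below assumes mean and standard deviation are sufficient summaries; `summarize_features` is a hypothetical helper, not part of librosa or the FMA utilities.

```python
import numpy as np


def summarize_features(*feature_matrices):
    """Collapse (n_features, n_frames) matrices into one fixed-length vector.

    For each matrix, the per-row mean and standard deviation over time are
    concatenated, so the output length does not depend on the track duration.
    """
    parts = []
    for matrix in feature_matrices:
        matrix = np.atleast_2d(np.asarray(matrix, dtype=float))
        parts.append(matrix.mean(axis=1))  # Average value of each coefficient
        parts.append(matrix.std(axis=1))   # Variation of each coefficient over time
    return np.concatenate(parts)
```

Once the extraction cell runs, a call such as `summarize_features(mfccs, chroma, spectral_contrast, zero_crossing_rate)` would give one vector per track, which can then be compared with cosine similarity or used as model input.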

## Further possible investigation

* **[MillionSongDataset](http://millionsongdataset.com/):** 
  * The Million Song Dataset is a freely-available collection of audio features and metadata for a million contemporary popular music tracks.
  * The dataset does not include any audio, only the derived features. Sample audio can, however, be fetched from services like [7digital](http://www.7digital.com/) using [code](https://github.com/tb2332/MSongsDB/tree/master/Tasks_Demos/Preview7digital) provided by the dataset authors.
* **Essentia:** https://essentia.upf.edu/documentation.html
* Check out https://echoprint.me/server/ for fingerprint identification
* Music similarity on GitHub: https://github.com/CDrummond/music-similarity

**Investigate:**
**Similar webpages:**
* https://www.chosic.com/
* https://tunebat.com/

**Counterpoints:**
* [This article](https://neurosciencenews.com/ai-music-recommendations-18153/): AI algorithms used by music streaming services are better at providing accurate recommendations for those who enjoy mainstream music. However, the algorithms often miss the mark when it comes to recommendations for those who listen to non-mainstream musical genres like hip-hop or heavy metal.

**Other things:**
* The Million Song Dataset has a [song similarity dataset](http://millionsongdataset.com/lastfm/). Check whether songs from the Million Song Dataset Last.fm data are also in the FMA.
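One way to check the overlap between the two datasets is to match tracks on normalized artist and title strings. This is a rough sketch only: it assumes both sources can be reduced to `(id, artist, title)` tuples, all function names are hypothetical, and exact matching on normalized text will miss alternate spellings or featured-artist variants.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and reduce punctuation/whitespace to single spaces."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def match_key(artist, title):
    """Build a comparison key from artist and title."""
    return (normalize(artist), normalize(title))


def find_overlap(fma_tracks, msd_tracks):
    """Map FMA track IDs to MSD track IDs that share the same normalized key.

    Both inputs are iterables of (track_id, artist, title) tuples (an assumed layout).
    """
    msd_index = {match_key(artist, title): msd_id for msd_id, artist, title in msd_tracks}
    overlap = {}
    for fma_id, artist, title in fma_tracks:
        key = match_key(artist, title)
        if key in msd_index:
            overlap[fma_id] = msd_index[key]
    return overlap
```

If the overlap turns out to be large enough, the Last.fm similarity pairs could serve as ground truth for evaluating audio-based similarity on FMA tracks.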

## References

* [FMA: A Dataset for Music Analysis](https://github.com/mdeff/fma)
* [Audio Data Processing in Python](https://www.youtube.com/watch?v=ZqpSb5p1xQo)