<a href="https://colab.research.google.com/github/jdavibedoya/essentia-models_mtg-jamendo/blob/master/Predictions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook computes predictions for 11 classification tasks using two source tasks, MUSICNN_MSD and VGGish_AudioSet. More detailed information on the models can be found here: [A collection of TensorFlow models for Essentia](https://mtg.github.io/essentia-labs//news/2020/01/16/tensorflow-models-released/)

In [0]:
# installing packages and downloading models

# install essentia-tensorflow
%pip install essentia-tensorflow -f https://essentia.upf.edu/python-wheels/

# download models
#!wget -N -P drive/Shared\ drives/AMPLAB\ Project/models/ -i drive/Shared\ drives/AMPLAB\ Project/models/Models.txt

In [0]:
# imports
import os
import json
import numpy as np
from essentia.standard import *

from google.colab import drive
drive.mount('/content/drive/', force_remount=True)
main_dir = "drive/Shared drives/AMPLAB Project/"

This cell computes the predictions for the source task `musicnn_msd`. These predictions are stored in the `predictions/musicnn_msd/` folder in a format similar to that used for the annotations. Here, however, two values $[a_0, a_1]$ are stored per target task, because the output layer of the transfer learning classifiers consists of 2 units with sigmoid activations. The frame-level outputs of each classifier are averaged over time before being stored.
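
As a minimal sketch of this storage format, the snippet below averages frame-level activations over time and serializes the result. The activation values and the choice of `mood_happy` are made up for illustration; only the shape (frames by 2 units) and the use of `np.mean(..., axis=0)` mirror the notebook code.

```python
import json
import numpy as np

# Made-up frame-level activations for one classifier: shape (n_frames, 2)
frame_activations = np.array([[0.9, 0.1],
                              [0.7, 0.3],
                              [0.8, 0.2]])

# Average over frames (axis 0) to get one [a_0, a_1] pair per track
track_prediction = np.mean(frame_activations, axis=0)

# One entry per target task, serialized like the prediction files
predictions_mood = {"mood_happy": track_prediction.tolist()}
serialized = json.dumps(predictions_mood)
print(serialized)
```

Each stored file therefore maps a target task name to a two-element list.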

In [0]:
# computing predictions - musicnn_msd

# directories
main_dir = "drive/Shared drives/AMPLAB Project/"
models_dir = main_dir + "models/"
predictions_dir = main_dir + "predictions/musicnn_msd/"
annotations_dir = main_dir + "annotations/"

# target tasks by group
mood = ["mood_acoustic", "mood_electronic", "mood_aggressive", "mood_relaxed", "mood_happy", "mood_sad", "mood_party"]
miscellaneous = ["tonal_atonal", "danceability", "voice_instrumental", "gender"]

# set with the names of the annotation files
file_names = set()
for root, dirs, files in os.walk(annotations_dir):
    for file in files:
        if file.endswith('.json'):
            file_name = "-".join( file.split('-')[1:] )
            file_names.add(file_name)

# compute and store predictions for each audio sample
for file_name in file_names:
    if os.path.exists(predictions_dir + 'mood-' + file_name) and os.path.exists(predictions_dir + 'miscellaneous-' + file_name): # avoid recomputing
        continue
    # string manipulations to locate the audio file
    file_name_dir = file_name.split('-')[-1].split('_')[0]
    file_name_audio = file_name.split('_')[-1].split('.json')[0] 
    file_name_path = main_dir + "17/" + file_name_dir + "/" + file_name_audio

    audio = EasyLoader(filename = file_name_path, sampleRate = 16000, endTime = 180)() # load audio

    predictions_mood = {}
    predictions_miscellaneous = {}
    for root, dirs, files in os.walk(models_dir): # get model names
        for file in files:
            if file.endswith('musicnn-msd.pb'): # select source task
                model_name = os.path.join(root, file)
                model = file.split('-')[0]
                prediction = TensorflowPredictMusiCNN(graphFilename = model_name)(audio) # compute prediction using MusiCNN model
                prediction = np.mean(prediction, axis=0)
                # store predictions by groups
                if model in mood:
                    predictions_mood[model] = prediction.tolist()
                elif model in miscellaneous:
                    predictions_miscellaneous[model] = prediction.tolist()
                else:
                    raise Exception("Invalid model")

    # create json files with predictions
    with open(predictions_dir + 'mood-' + file_name, 'w') as json_file_prediction:
        json.dump(predictions_mood, json_file_prediction)
    with open(predictions_dir + 'miscellaneous-' + file_name, 'w') as json_file_prediction: 
        json.dump(predictions_miscellaneous, json_file_prediction)
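
The string manipulations that locate the audio file can be traced on a single example. The annotation file name below is hypothetical and the real naming scheme in the shared drive may differ; the split logic itself is copied from the cell above.

```python
# Hypothetical annotation file name; the real naming scheme may differ
annotation_file = "mood-annotator1-05_1234.mp3.json"

# Drop the group prefix ("mood" or "miscellaneous")
file_name = "-".join(annotation_file.split("-")[1:])

# Sub-directory: text between the last '-' and the first '_' after it
file_name_dir = file_name.split("-")[-1].split("_")[0]

# Audio file: text after the last '_', without the '.json' suffix
file_name_audio = file_name.split("_")[-1].split(".json")[0]

file_name_path = "17/" + file_name_dir + "/" + file_name_audio
print(file_name_path)  # prints 17/05/1234.mp3
```

Note that this parsing assumes the audio file name itself contains no underscore, since only the text after the last `_` is kept.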

The next cell indicates the number of predictions that have been computed and stored so far.

In [3]:
# number of musicnn_msd prediction files per target task group
predictions_dir = main_dir + "predictions/musicnn_msd/"
print( '{} musicnn_msd predictions stored'.format( len(os.listdir(predictions_dir))//2 ) )

565 musicnn_msd predictions stored
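
The count above divides the number of files by two, which assumes every track has both a `mood-` and a `miscellaneous-` file. A possible cross-check, sketched here with a hypothetical helper `count_predictions` and made-up file names in a temporary folder, is to count each prefix separately.

```python
import os
import tempfile

def count_predictions(predictions_dir):
    """Count prediction files per group by file-name prefix."""
    names = os.listdir(predictions_dir)
    return {
        "mood": sum(n.startswith("mood-") for n in names),
        "miscellaneous": sum(n.startswith("miscellaneous-") for n in names),
    }

# Demo on a temporary folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ["mood-a.json", "miscellaneous-a.json", "mood-b.json"]:
        open(os.path.join(tmp, name), "w").close()
    counts = count_predictions(tmp)

print(counts)
```

If the two counts differ, some track was only partially processed and the `// 2` figure would be misleading.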


This cell computes the predictions for the source task `vggish_audioset`.
These predictions are stored in the `predictions/vggish_audioset/` folder, in the same format as the `musicnn_msd` predictions.

In [0]:
# computing predictions - vggish_audioset

# directories
main_dir = "drive/Shared drives/AMPLAB Project/"
models_dir = main_dir + "models/"
predictions_dir = main_dir + "predictions/vggish_audioset/"
annotations_dir = main_dir + "annotations/"

# target tasks by group
mood = ["mood_acoustic", "mood_electronic", "mood_aggressive", "mood_relaxed", "mood_happy", "mood_sad", "mood_party"]
miscellaneous = ["tonal_atonal", "danceability", "voice_instrumental", "gender"]

# set with the names of the annotation files
file_names = set()
for root, dirs, files in os.walk(annotations_dir):
    for file in files:
        if file.endswith('.json'):
            file_name = "-".join( file.split('-')[1:] )
            file_names.add(file_name)

# compute and store predictions for each audio sample
for file_name in file_names:
    if os.path.exists(predictions_dir + 'mood-' + file_name) and os.path.exists(predictions_dir + 'miscellaneous-' + file_name): # avoid recomputing
        continue
    # string manipulations to locate the audio file
    file_name_dir = file_name.split('-')[-1].split('_')[0]
    file_name_audio = file_name.split('_')[-1].split('.json')[0] 
    file_name_path = main_dir + "17/" + file_name_dir + "/" + file_name_audio

    audio = EasyLoader(filename = file_name_path, sampleRate = 16000, endTime = 180)() # load audio

    predictions_mood = {}
    predictions_miscellaneous = {}
    for root, dirs, files in os.walk(models_dir): # get model names
        for file in files:
            if file.endswith('vggish-audioset.pb'): # select source task
                model_name = os.path.join(root, file)
                model = file.split('-')[0]
                prediction = TensorflowPredictVGGish(graphFilename = model_name)(audio) # compute prediction using VGGish model
                prediction = np.mean(prediction, axis=0)
                # store predictions by groups
                if model in mood:
                    predictions_mood[model] = prediction.tolist()
                elif model in miscellaneous:
                    predictions_miscellaneous[model] = prediction.tolist()
                else:
                    raise Exception("Invalid model")
    
    # create json files with predictions
    with open(predictions_dir + 'mood-' + file_name, 'w') as json_file_prediction:
        json.dump(predictions_mood, json_file_prediction)
    with open(predictions_dir + 'miscellaneous-' + file_name, 'w') as json_file_prediction: 
        json.dump(predictions_miscellaneous, json_file_prediction)

The next cell indicates the number of predictions that have been computed and stored so far.

In [4]:
# number of vggish_audioset prediction files per target task group
predictions_dir = main_dir + "predictions/vggish_audioset/"
print( '{} vggish_audioset predictions stored'.format( len(os.listdir(predictions_dir))//2 ) )

565 vggish_audioset predictions stored
