# Genre Prediction Tutorial

This notebook shows how to use a pretrained SVM to predict the genre of an audio file and then run queries against the dataset.

In [1]:
import sys
sys.path.insert(1, '../musical_robots')

from spectrogram_dataset import AudioFeature, create_audio_feature_dataset, create_dataframes
import torch
import numpy as np
import torch.nn as nn
import matplotlib.pyplot as plt
from svm_prediction import svm_prediction
from dataset_queries import return_similar_genres, return_most_popular_song, \
    play_random_song_from_genre, play_song_from_filename

First, create dataframes from the corresponding files.

In [2]:
file_path_df, track_df, relevant_genre_df, genre_df = create_dataframes(
    file_paths_path='../musical_robots/data/all_data_paths.txt',
    tracks_csv_path='../musical_robots/data/fma_metadata/tracks.csv',
    genre_csv_path='../musical_robots/data/fma_metadata/genres.csv')

file_path_df contains the relative file paths of all the audio files in the 'data/fma_small' folder.

In [3]:
file_path_df.head(1)

Unnamed: 0,file_path
0,000/000002.mp3


The track_df contains information about each audio file, including song title, album title, artist name, number of track listens, and top genre. 


In [4]:
track_df.head(1)

Unnamed: 0,track_id,album_id,album_listens,album_title,artist_name,track_favorites,track_genre_top,track_genres,track_interest,track_listens,track_title
0,2,1,6073,AWOL - A Way Of Life,AWOL,2,Hip-Hop,[21],4656,1293,Food
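
The first rows shown above suggest how the two tables line up: the file `000/000002.mp3` corresponds to `track_id` 2, so the file stem appears to be the zero-padded track ID. A minimal sketch of that mapping, assuming every path follows this pattern (the helper name `track_id_from_path` is hypothetical and not part of `musical_robots`):

```python
from pathlib import PurePosixPath


def track_id_from_path(file_path: str) -> int:
    """Hypothetical helper: recover the integer track ID from an FMA-style path."""
    # The stem of '000/000002.mp3' is '000002'; int() drops the zero padding.
    return int(PurePosixPath(file_path).stem)


print(track_id_from_path('000/000002.mp3'))
```

The resulting ID could then be used to look up the matching row in track_df, for example with `track_df[track_df['track_id'] == track_id]`.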


relevant_genre_df contains the genres that are identified as the top genre of at least one track in the fma_small dataset. genre_df contains all of the genres in track_df.

In [5]:
relevant_genre_df.head(1)

Unnamed: 0,index,genre_id,title
0,20,21,Hip-Hop
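
To make the relationship between the two genre tables concrete, here is a rough sketch of how a table like relevant_genre_df could be derived. It is an assumption about the approach, not the actual `create_dataframes` implementation. The toy data reuses the Hip-Hop row from the output above; the other rows and their IDs are placeholders invented for illustration.

```python
import pandas as pd

# Toy stand-ins for genre_df and track_df. Only the Hip-Hop row (genre_id 21)
# comes from the tutorial output; the other rows and IDs are placeholders.
toy_genre_df = pd.DataFrame({
    'genre_id': [21, 9001, 9002],
    'title': ['Hip-Hop', 'Placeholder Subgenre', 'Rock'],
})
toy_track_df = pd.DataFrame({
    'track_title': ['Food', 'Example Track'],
    'track_genre_top': ['Hip-Hop', 'Rock'],
})

# Keep only the genres that occur as some track's top genre.
top_genres = toy_track_df['track_genre_top'].dropna().unique()
toy_relevant_genre_df = (
    toy_genre_df[toy_genre_df['title'].isin(top_genres)]
    .reset_index()  # keeps the original row position in an 'index' column
)

print(toy_relevant_genre_df)
```

Calling `reset_index()` without `drop=True` is one way a table could end up with both an `index` and a `genre_id` column, as in the output above.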


Build the path to an MP3 file; here, the first file listed in file_path_df.

In [11]:
filename = '../musical_robots/data/fma_small/' + file_path_df.iloc[0]['file_path']
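
String concatenation works here because the base folder ends with a slash. A slightly more defensive sketch, assuming the same folder layout, builds the path with `pathlib` and checks that the file exists before it is passed on (the `resolve_audio_path` name is illustrative, not part of `musical_robots`):

```python
from pathlib import Path


def resolve_audio_path(data_dir: str, relative_path: str) -> str:
    """Illustrative helper: join the data folder and a relative path, then verify it."""
    path = Path(data_dir) / relative_path
    if not path.is_file():
        # Fail early with a clear message instead of deep inside the audio loader.
        raise FileNotFoundError(f"Audio file not found: {path}")
    return str(path)
```

In this notebook it could be called as `resolve_audio_path('../musical_robots/data/fma_small/', file_path_df.iloc[0]['file_path'])`.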

Play a song from its filename.

In [12]:
audio = play_song_from_filename(filename=filename)
audio

Predict the clip's genre. Note that svm_prediction() takes relevant_genre_df as an argument.

In [15]:
genre = svm_prediction(filename, relevant_genre_df, model_filename='../musical_robots/svm_model.pkl')
print('The predicted genre is: ', genre)

The predicted genre is:  Hip-Hop


Return the most similar genres. Note that the dataset queries take genre_df, not relevant_genre_df, as an argument.

In [16]:
similar_genres = return_similar_genres(genre=genre, genre_df=genre_df, track_df=track_df, k=10)

print('The most similar genres to ', genre, ' are : ', similar_genres)

The most similar genres to  Hip-Hop  are :  ['Rap', 'Alternative Hip-Hop', 'Hip-Hop Beats', 'Nerdcore', 'Breakbeat']


Return the most popular song in the predicted genre.

In [17]:
most_popular_ids, most_popular_songs, artists, albums = return_most_popular_song(
    genre=genre, genre_df=genre_df, track_df=track_df)

print('The most popular song in the genre ', genre, 'is ', most_popular_songs, 
      ' by ', artists, ' from the album ', albums)


The most popular song in the genre  Hip-Hop is  ['Fater Lee']  by  ['Black Ant']  from the album  ['Free Beats Sel. 3']
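
For readers curious about what such a query involves, the sketch below ranks tracks by `track_listens` within one top genre, using the column names shown in track_df earlier. It is an assumption about the general idea, not the actual `return_most_popular_song` implementation, which returns lists and also takes genre_df. The first toy row comes from the tutorial output; the other rows are invented for illustration.

```python
import pandas as pd

# Toy stand-in for track_df. The first row is from the tutorial output;
# the remaining rows are invented purely for illustration.
toy_track_df = pd.DataFrame({
    'track_title': ['Food', 'Example Track A', 'Example Track B'],
    'artist_name': ['AWOL', 'Example Artist A', 'Example Artist B'],
    'album_title': ['AWOL - A Way Of Life', 'Example Album A', 'Example Album B'],
    'track_genre_top': ['Hip-Hop', 'Hip-Hop', 'Rock'],
    'track_listens': [1293, 5000, 9000],
})


def most_popular_in_genre(track_df, genre):
    """Hypothetical helper: return the row with the most listens in a top genre."""
    in_genre = track_df[track_df['track_genre_top'] == genre]
    if in_genre.empty:
        raise ValueError(f"No tracks found for genre {genre!r}")
    # idxmax() gives the index label of the first row with the highest listen count.
    return in_genre.loc[in_genre['track_listens'].idxmax()]


top = most_popular_in_genre(toy_track_df, 'Hip-Hop')
print(top['track_title'], 'by', top['artist_name'])
```

This version returns a single row even when several tracks tie for the highest listen count.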


Play a random song from the genre. 

In [21]:
audio, title, artist, album = play_random_song_from_genre(
    genre=genre,
    genre_df=genre_df,
    track_df=track_df,
    path_df=file_path_df,
    path_to_data='../musical_robots/data/fma_small/')

print("Here's a random song from ", genre, ": ", title, ' by ', artist, ' from the album ', album)
audio

Here's a random song from  Hip-Hop :  About myself (outro)  by  Sun Sunych  from the album  Hybris
