# Modelling music genres with Convolutional Neural Networks

The previous notebook was mostly about **data processing**:
- Download the initial data,
- Reduce it,
- Make it more readable.

Here, we build on the output of that notebook.

## Importing the data we need

Note: the Google Drive download **should work** regardless of which Google account is used for authentication.

In [0]:
import pandas as pd
import numpy as np
import os
import time

In [0]:
# Install the PyDrive wrapper & import libraries.
# This only needs to be done once per notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download a file based on its file ID.
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz

zip_id = '1n-sRLDAAZfdibGQXPYYo8v0s0rcDGJC6'
downloaded = drive.CreateFile({'id': zip_id})
downloaded.GetContentFile('zip_spectrogram.zip')

small_id = "1maJ4o_aHSy-ZLHUzOYuQ0QnRbS3rIciX"
downloaded = drive.CreateFile({'id': small_id})
downloaded.GetContentFile('small_tracks.csv')


tracks_id = "1me5bv76Fd9mFyJd25TSzOrl6SrcxsFFk"
downloaded = drive.CreateFile({'id': tracks_id})
downloaded.GetContentFile('tracks.csv')


#print('Downloaded content "{}"'.format(downloaded.GetContentString()))

Unzipping the spectrograms and then loading them into memory still takes quite a while, though much less than the original step did.
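
The archive can also be extracted from Python rather than from the shell, which avoids printing one line per extracted file. The following is a minimal sketch using the standard-library `zipfile` module; the helper name `extract_archive` is hypothetical and not part of the original notebook.

```python
import zipfile


def extract_archive(zip_path, destination='.'):
    """Extract every member of `zip_path` into `destination`.

    Returns the number of members listed in the archive
    (directory entries are counted as members too).
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
        return len(archive.namelist())
```

Calling `extract_archive('zip_spectrogram.zip')` would be expected to create the same `spectrogram/` folder as the shell command, assuming the archive downloaded correctly.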

In [3]:
!unzip zip_spectrogram.zip

Archive:  zip_spectrogram.zip
   creating: spectrogram/
  inflating: spectrogram/109905.txt  
  inflating: spectrogram/126187.txt  
  inflating: spectrogram/122809.txt  
  inflating: spectrogram/116704.txt  
  inflating: spectrogram/003832.txt  
  inflating: spectrogram/127299.txt  
  inflating: spectrogram/152545.txt  
  inflating: spectrogram/004233.txt  
  inflating: spectrogram/108014.txt  
  inflating: spectrogram/031887.txt  
  inflating: spectrogram/063804.txt  
  inflating: spectrogram/042046.txt  
  inflating: spectrogram/145710.txt  
  inflating: spectrogram/043842.txt  
  inflating: spectrogram/114279.txt  
  inflating: spectrogram/006394.txt  
  inflating: spectrogram/004070.txt  
  inflating: spectrogram/092129.txt  
  inflating: spectrogram/092951.txt  
  inflating: spectrogram/048367.txt  
  inflating: spectrogram/006342.txt  
  inflating: spectrogram/081555.txt  
  inflating: spectrogram/143216.txt  
  inflating: spectrogram/044796.txt  
  inflating: spectrogram/126018.

In [4]:
len(os.listdir('spectrogram'))

1001

In [11]:
os.listdir('spectrogram')[0][:-4]

'141290'

One extra file appeared during the process, and we could not get rid of it earlier, so let's remove it now.

In [0]:
for file_name in os.listdir('spectrogram'):
  # Remove the one file that does not have a .txt extension
  if not file_name.endswith('.txt'):
    os.remove(os.path.join('spectrogram', file_name))
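
Since the removal is irreversible, it can be worth listing which entries fall outside the `.txt` pattern before deleting them. A minimal sketch follows; the helper name `find_unexpected_entries` is an illustration, not something defined elsewhere in this notebook.

```python
import os


def find_unexpected_entries(folder, extension='.txt'):
    """Return the sorted names in `folder` that do not end with `extension`."""
    return sorted(name for name in os.listdir(folder)
                  if not name.endswith(extension))
```

Running `find_unexpected_entries('spectrogram')` before the deletion loop should show the single stray entry, and an empty list afterwards.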

In [7]:
len(os.listdir('spectrogram'))

1000

Everything is in order: we can now load every text file into NumPy again.

In [15]:
import time

start_time = time.time()

file_dict = {} # A dictionary again! Maps each track ID (file name without '.txt') to its spectrogram array

for file_name in os.listdir('spectrogram'):
  file_dict[file_name[:-4]] = np.loadtxt('spectrogram/%s' % file_name)
  
print('Time lapsed: %f' % (time.time() - start_time))

Time lapsed: 149.939054
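
Most of that time goes into parsing text, so one option is to parse the files once and keep a binary copy for later sessions. The sketch below assumes a hypothetical helper `load_spectrograms` and a cache file whose name ends in `.npz`; neither is part of the original notebook. It relies on `np.savez` and `np.load`, which store and restore arrays in NumPy's binary format.

```python
import os

import numpy as np


def load_spectrograms(folder, cache_path):
    """Load every .txt spectrogram in `folder` into a dict keyed by track ID.

    If `cache_path` (assumed to end in '.npz') already exists, the arrays
    are read from it instead of re-parsing the text files.
    """
    if os.path.exists(cache_path):
        # np.load returns a lazy archive for .npz files; copy it into a dict.
        with np.load(cache_path) as archive:
            return {key: archive[key] for key in archive.files}

    spectrograms = {}
    for file_name in sorted(os.listdir(folder)):
        if file_name.endswith('.txt'):
            track_id = os.path.splitext(file_name)[0]
            spectrograms[track_id] = np.loadtxt(os.path.join(folder, file_name))

    # Store all arrays in one binary archive, one entry per track ID.
    np.savez(cache_path, **spectrograms)
    return spectrograms
```

With this helper, `file_dict = load_spectrograms('spectrogram', 'spectrograms.npz')` would parse the text files on the first call only. How much time the cache saves has not been measured here.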


In [14]:
file_dict

{'014579': array([[1.57374129e-02, 2.47489284e-01, 1.84735169e-01, ...,
         1.94785701e-01, 1.50717733e-01, 1.76884645e-01],
        [4.75279647e-01, 6.64805623e-01, 5.39150310e-01, ...,
         1.62074048e-01, 5.81101488e-02, 1.77214669e-01],
        [2.39118200e+00, 6.19566867e-01, 1.56863328e+00, ...,
         5.72528022e-02, 1.39914750e-01, 3.43807693e-01],
        ...,
        [9.72771484e-06, 1.60575027e-07, 3.91538698e-08, ...,
         2.82191136e-05, 1.81553838e-05, 8.45021968e-06],
        [1.25022242e-05, 5.77653859e-08, 4.29462962e-08, ...,
         7.91627637e-07, 5.35027973e-07, 6.15797591e-07],
        [1.72560889e-05, 4.59501906e-08, 4.78152878e-08, ...,
         3.65111682e-06, 2.39277743e-06, 1.32599591e-06]]),
 '064989': array([[1.20959374e-02, 4.92232181e+00, 3.77782997e+00, ...,
         3.20983011e-01, 3.67875918e-01, 3.64809515e-01],
        [2.19370261e-03, 8.91565666e+00, 7.43959707e+00, ...,
         3.21002092e-01, 2.78513667e-01, 3.35052619e-01],
     