Merge branch 'master' of https://github.com/claytonblythe/neuralMusic

claytonblythe · Nov 22, 2017 · a31e348
2 parents 4ef2c11 + 41ea8c5
commit a31e348
Showing 14 changed files with 9,562 additions and 8 deletions.
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 
 This is a project called neuralMusic created by Clayton Blythe on 2017/09/29
 
-I aim to use the Free Music Archive (FMA) along with Convolutional Neural Networks to do genre classification from 
+I aim to use the Free Music Archive (FMA) along with Convolutional Neural Networks in PyTorch to do genre classification from 
 short snippets of different songs (~6 seconds of audio). I am working on taking random 6 second samples from 100,000 songs to train a classifier. 
 
 Here is an example of what a spectrogram looks like, it is kind of like a "fingerprint" for a song, representing how different frequencies of sound evolve over time. 
@@ -13,6 +13,11 @@ For example, here is a six second snippet from "Lose Yourself To Dance" by Daft
 ![Alt Test](https://github.com/claytonblythe/neuralMusic/blob/master/data/spectrograms/lose_yourself_to_dance.png)
 
 
-Convolutional Neural Networks have contributed to amazing advancements in image recognition, and this dataset is fairly large, so I am looking to see how good they are at converting the visual representation of a snippet of audio into genre predictions. I imagine it could be a cool app where you can get classification of a genre from a very short recording of audio
+Convolutional Neural Networks have contributed to amazing advancements in image recognition, and this dataset is fairly large, so I am looking to see how good they are at converting the visual representation of a snippet of audio into genre predictions. I imagine it could be a cool app where you can get classification of a genre from a very short recording of audio.
+
+
+With an initial test on 5.94 second length samples, training on ~5500 samples and testing on ~2200 validation examples, I achieved an accuracy of 45.7% at classifying a song's membership to one of eight genre classes. I think these results are pretty decent considering the small size of data I used, and for such a short time snippet. I used a model of five hidden layers inspired by the vgg net, employing convolutions, batch normalizations and ReLU at each layer.  
+
+I plan on continuing this project, looking at larger amounts of training examples and incorporating data augmentation as well as different types of model architectures like squeeze net. 
 
 
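To put the 45.7% figure from the README change in context, here is a minimal sketch comparing it with a uniform-guessing baseline. It assumes the eight genre classes are roughly balanced in the validation set, which the README does not state, and it treats the "~2200" validation count as exactly 2200 for the arithmetic. The names `NUM_GENRES`, `VALIDATION_EXAMPLES`, and `chance_accuracy` are illustrative and not part of the repository.

```python
# Figures quoted in the README text; class balance is an assumption.
NUM_GENRES = 8
VALIDATION_EXAMPLES = 2200   # "~2200" in the README, so only approximate
REPORTED_ACCURACY = 0.457


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


baseline = chance_accuracy(NUM_GENRES)
lift = REPORTED_ACCURACY / baseline
approx_correct = round(REPORTED_ACCURACY * VALIDATION_EXAMPLES)

print(f"chance baseline: {baseline:.1%}")
print(f"lift over chance: {lift:.2f}x")
print(f"approx. correct validation clips: {approx_correct}")
```

If the genres are imbalanced, a majority-class baseline would be the fairer comparison, and it could be noticeably higher than the uniform one.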
diff --git a/nmutils.py b/nmutils.py
@@ -26,11 +26,16 @@ def save_random_clips(base_path, save_path, snip_length):
     for directory in tqdm(directories):
         filenames = iter(f for f in os.listdir(base_path + directory + '/'))
         for filename in filenames:
-            y, sr = librosa.load(base_path + directory + '/' + filename, mono=True,  sr=None)
-            song_duration = librosa.core.get_duration(y, sr)
-            random_offset = random.uniform(0,song_duration - 5.96)
-            y, sr = librosa.load(base_path + directory + '/' + filename, mono=True,  offset=random_offset, duration= 5.94, sr=None)
-            librosa.output.write_wav(y=y, sr=sr, path=save_path + filename[:-4] + '.wav')
+            try:
+
+                y, sr = librosa.load(base_path + directory + '/' + filename, mono=True,  sr=None)
+                song_duration = librosa.core.get_duration(y, sr)
+                random_offset = random.uniform(0,song_duration - 5.96)
+                y, sr = librosa.load(base_path + directory + '/' + filename, mono=True,  offset=random_offset, duration= 5.94, sr=None)
+
+                librosa.output.write_wav(y=y, sr=sr, path=save_path + filename[:-4] + '.wav')
+            except:
+                pass
 
 # Save melspectrogram tensors for every file in some base_path directory to some save_path
 ## Note: this creates 512 bins (128*4) for the frequency component
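In the `save_random_clips` hunk above, `random.uniform(0, song_duration - 5.96)` receives a negative upper bound whenever a track is shorter than the clip, and the new bare `except: pass` silently hides that case along with every other error. A minimal sketch of one alternative is to compute the offset in a small helper that reports "too short" explicitly. The helper name `pick_random_offset` and its parameters are hypothetical, and the sketch uses a single `snip_length` for both the margin and the duration, whereas the diff uses 5.96 and 5.94 respectively.

```python
import random
from typing import Optional

SNIP_LENGTH = 5.94  # seconds; the clip duration used in the diff


def pick_random_offset(song_duration: float,
                       snip_length: float = SNIP_LENGTH,
                       rng: Optional[random.Random] = None) -> Optional[float]:
    """Return a start offset such that [offset, offset + snip_length] fits
    inside the track, or None when the track is shorter than the clip."""
    if song_duration < snip_length:
        return None
    rng = rng or random  # fall back to the module-level generator
    return rng.uniform(0.0, song_duration - snip_length)


# Example: a seeded generator makes the sampled offsets reproducible.
offset = pick_random_offset(30.0, rng=random.Random(0))
too_short = pick_random_offset(3.0)
```

A caller could then skip a file when `None` comes back and log the filename, and catch `Exception` (or something narrower) around the load and write calls rather than using a bare `except`, so that `KeyboardInterrupt` still stops a long run. Note also that `librosa.output.write_wav` was removed in librosa 0.8, so this hunk only runs as written on older librosa versions.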

diff --git a/nohup.out b/nohup.out
diff --git a/notebooks/nohup.out b/notebooks/nohup.out
diff --git a/notebooks/nohup2.out b/notebooks/nohup2.out