claytonblythe committed Nov 22, 2017
2 parents 4ef2c11 + 41ea8c5 commit a31e348
Showing 14 changed files with 9,562 additions and 8 deletions.
9 changes: 7 additions & 2 deletions README.md 100644 → 100755
@@ -3,7 +3,7 @@

This is a project called neuralMusic created by Clayton Blythe on 2017/09/29

-I aim to use the Free Music Archive (FMA) along with Convolutional Neural Networks to do genre classification from
+I aim to use the Free Music Archive (FMA) along with Convolutional Neural Networks in PyTorch to do genre classification from
short snippets of different songs (~6 seconds of audio). I am working on taking random 6 second samples from 100,000 songs to train a classifier.

Here is an example of what a spectrogram looks like. It is a kind of "fingerprint" for a song, representing how different frequencies of sound evolve over time.
@@ -13,6 +13,11 @@ For example, here is a six second snippet from "Lose Yourself To Dance" by Daft
![Alt Test](https://github.com/claytonblythe/neuralMusic/blob/master/data/spectrograms/lose_yourself_to_dance.png)


-Convolutional Neural Networks have contributed to amazing advancements in image recognition, and this dataset is fairly large, so I am looking to see how good they are at converting the visual representation of a snippet of audio into genre predictions. I imagine it could be a cool app where you can get classification of a genre from a very short recording of audio
+Convolutional Neural Networks have contributed to amazing advancements in image recognition, and this dataset is fairly large, so I am looking to see how good they are at converting the visual representation of a snippet of audio into genre predictions. I imagine it could be a cool app where you can get classification of a genre from a very short recording of audio.


With an initial test on 5.94-second samples, training on ~5500 samples and testing on ~2200 validation examples, I achieved an accuracy of 45.7% at classifying a song into one of eight genre classes. I think these results are pretty decent considering the small amount of data I used and the short length of each snippet. I used a model with five hidden layers inspired by VGG, employing convolution, batch normalization, and ReLU at each layer.
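
To put the 45.7% figure in context, here is a rough sketch that compares it with a uniform-guess baseline. It assumes the eight genres are roughly balanced in the validation set, which the README does not state, and it uses the approximate validation count (~2200) quoted above, so the numbers are only indicative.

```python
# Rough context for the reported accuracy.
# Assumption (not stated in the README): the eight genres are balanced.
NUM_CLASSES = 8            # eight genre classes, from the README
REPORTED_ACCURACY = 0.457  # validation accuracy, from the README
NUM_VALIDATION = 2200      # approximate ("~2200"), from the README

# Accuracy expected from guessing uniformly at random among balanced classes.
chance_accuracy = 1 / NUM_CLASSES
# How many times better than chance the reported accuracy is.
improvement = REPORTED_ACCURACY / chance_accuracy
# Approximate number of validation clips classified correctly.
approx_correct = round(REPORTED_ACCURACY * NUM_VALIDATION)

print(f"chance baseline: {chance_accuracy:.1%}")
print(f"reported / chance: {improvement:.2f}x")
print(f"approx. correct predictions: {approx_correct}")
```

Under that assumption the classifier is well above chance, though most clips are still misclassified.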

I plan on continuing this project, looking at more training examples and incorporating data augmentation as well as different model architectures such as SqueezeNet.


15 changes: 10 additions & 5 deletions nmutils.py 100644 → 100755
@@ -26,11 +26,16 @@ def save_random_clips(base_path, save_path, snip_length):
     for directory in tqdm(directories):
         filenames = iter(f for f in os.listdir(base_path + directory + '/'))
         for filename in filenames:
-            y, sr = librosa.load(base_path + directory + '/' + filename, mono=True, sr=None)
-            song_duration = librosa.core.get_duration(y, sr)
-            random_offset = random.uniform(0,song_duration - 5.96)
-            y, sr = librosa.load(base_path + directory + '/' + filename, mono=True, offset=random_offset, duration= 5.94, sr=None)
-            librosa.output.write_wav(y=y, sr=sr, path=save_path + filename[:-4] + '.wav')
+            try:
+
+                y, sr = librosa.load(base_path + directory + '/' + filename, mono=True, sr=None)
+                song_duration = librosa.core.get_duration(y, sr)
+                random_offset = random.uniform(0,song_duration - 5.96)
+                y, sr = librosa.load(base_path + directory + '/' + filename, mono=True, offset=random_offset, duration= 5.94, sr=None)
+
+                librosa.output.write_wav(y=y, sr=sr, path=save_path + filename[:-4] + '.wav')
+            except:
+                pass

# Save melspectrogram tensors for every file in some base_path directory to some save_path
## Note: this creates 512 bins (128*4) for the frequency component
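
The commit does not say why the `try`/`except` was added. One plausible cause (an assumption, not confirmed by the source) is that tracks shorter than the clip length make `song_duration - 5.96` negative, so the random offset is no longer a valid start time. A minimal sketch of handling that case explicitly, and of recording failures rather than silently discarding them, might look like this. The helpers `pick_clip_offset` and `process_files` are hypothetical names, not part of nmutils.py, and use only the standard library.

```python
import random
from typing import Callable, Iterable, List, Optional, Tuple


def pick_clip_offset(song_duration: float, clip_length: float,
                     rng: Optional[random.Random] = None) -> Optional[float]:
    """Return a random start time such that the clip fits inside the song.

    Returns None when the song is shorter than the requested clip.
    """
    if song_duration < clip_length:
        return None
    source = rng if rng is not None else random
    return source.uniform(0.0, song_duration - clip_length)


def process_files(filenames: Iterable[str],
                  handler: Callable[[str], object]) -> List[Tuple[str, str]]:
    """Call handler on each filename and collect failures instead of hiding them."""
    failures = []
    for filename in filenames:
        try:
            handler(filename)
        except Exception as exc:  # narrower than a bare except: Ctrl-C still works
            failures.append((filename, repr(exc)))
    return failures
```

In such a setup the handler would wrap the load, trim, and write steps for one file, and the returned list shows which tracks were skipped and why.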
29 changes: 29 additions & 0 deletions nohup.out

Large diffs are not rendered by default.

2,006 changes: 2,006 additions & 0 deletions notebooks/nohup.out

Large diffs are not rendered by default.

2,006 changes: 2,006 additions & 0 deletions notebooks/nohup2.out

Large diffs are not rendered by default.
