GitHub - demunger/songs_som: Implementation of a SOM Visualization

Analysis of Million Songs Dataset: SOM Implementation

This repository contains an implementation of a Self-Organizing Map (SOM) used to aggregate the million-song data, enabling comparison of song trends across time and genre.

Algorithm Overview

A SOM is a data visualization technique built on a self-organizing neural network. In brief, the map projects vector data onto a two-dimensional grid; each node is then tuned according to the input features, creating a topologically ordered map.

The basic implementation strategy was as follows. First, we cleaned and standardized the numeric song data. Our program then constructs a grid of nodes according to user-passed size variables. Each node represents a vector of length n, where n is the number of features in the sample data, and is initialized to a set of random values.
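The setup steps above can be sketched as follows. This is a minimal illustration, not the code in `SOM.py` or `Node.py`: the function names `standardize` and `init_grid`, the use of NumPy, and the uniform `[0, 1)` initialization range are all assumptions made for the example.

```python
import numpy as np


def standardize(data):
    """Scale each feature column to zero mean and unit variance."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0] = 1.0  # constant columns are centered but not rescaled
    return (data - mean) / std


def init_grid(rows, cols, n_features, seed=None):
    """Return a (rows, cols, n_features) array of random node weights.

    Hypothetical helper: each grid cell holds one weight vector of
    length n_features, drawn uniformly from [0, 1).
    """
    rng = np.random.default_rng(seed)
    return rng.random((rows, cols, n_features))


# Toy data: three songs with two numeric features each (made-up values).
songs = standardize([[120.0, 0.5], [90.0, 0.8], [150.0, 0.2]])
grid = init_grid(rows=4, cols=5, n_features=songs.shape[1], seed=0)
```

Storing the whole grid in one array is a design choice for the sketch; a per-node class would work equally well.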

For each complete pass over the song data (the number of passes is, again, specified by the user), we first compute a best matching unit (BMU) c for each song vector, defined as the grid node with the shortest Euclidean distance to the input vector. Then, the grid weight vectors for each node k are updated for each input vector t according to the equation¹:

[equation: weight update for node k]

where the extent of a vector's weight response is controlled by the Gaussian neighborhood function:

[equation: Gaussian neighborhood function of c and k]

The neighborhood function depends on the Cartesian coordinates of c and k and on a term defined by the exponential decay function, with constants assigned by the user:

[equation: exponential decay function]
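For reference, the three relations described above can be written in one standard form. The batch variant is assumed here because the description computes every BMU before updating the weights. The symbols $w_k$, $x_t$, $c_t$, $r_c$, $r_k$, $\sigma$, $\sigma_0$, $\lambda$ and the pass index $s$ are illustrative choices, not notation confirmed by this repository or by the cited paper.

$$
w_k = \frac{\sum_t h_{c_t k}\, x_t}{\sum_t h_{c_t k}}, \qquad
h_{c k} = \exp\!\left(-\frac{\lVert r_c - r_k \rVert^2}{2\,\sigma(s)^2}\right), \qquad
\sigma(s) = \sigma_0 \exp\!\left(-\frac{s}{\lambda}\right)
$$

Here $c_t$ is the BMU of input vector $x_t$, $r_c$ and $r_k$ are the grid coordinates of nodes c and k, and $\sigma_0$ and $\lambda$ stand in for the user-assigned constants.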

¹Christian Weichel, “Adapting Self-Organizing Maps to the MapReduce Programming Paradigm” (paper presented at Software-Technologien und Prozesse, Furtwangen, Germany, May 6, 2010).
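The training pass described in the Algorithm Overview might look like the sketch below. It assumes a batch-style update (each node becomes the neighborhood-weighted mean of the inputs), a Gaussian neighborhood over grid coordinates, and an exponentially decaying width; the names `find_bmus`, `sigma_at`, `batch_pass`, `train`, `sigma0` and `decay` are hypothetical and do not come from `SOM.py` or `SOMMapper.py`.

```python
import numpy as np


def find_bmus(grid, data):
    """Return the (row, col) grid coordinates of the BMU of each input vector."""
    rows, cols, n = grid.shape
    flat = grid.reshape(rows * cols, n)
    # Squared Euclidean distance from every input vector to every node.
    dists = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    idx = dists.argmin(axis=1)
    return np.column_stack(np.unravel_index(idx, (rows, cols)))


def sigma_at(epoch, sigma0, decay):
    """Assumed decay schedule: sigma0 * exp(-epoch / decay)."""
    return sigma0 * np.exp(-epoch / decay)


def batch_pass(grid, data, sigma):
    """One full pass: find all BMUs, then recompute every node's weights."""
    rows, cols, n = grid.shape
    bmus = find_bmus(grid, data)                               # shape (T, 2)
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    ).reshape(-1, 2)                                           # shape (K, 2)
    # Squared grid distance between each input's BMU and each node.
    sq = ((bmus[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    h = np.exp(-sq / (2.0 * sigma ** 2))                       # shape (T, K)
    # Each node becomes the neighborhood-weighted mean of the inputs.
    new_flat = (h.T @ data) / h.sum(axis=0)[:, None]
    return new_flat.reshape(rows, cols, n)


def train(grid, data, epochs, sigma0=2.0, decay=5.0):
    """Run the requested number of passes, shrinking the neighborhood each time."""
    for epoch in range(epochs):
        grid = batch_pass(grid, data, sigma_at(epoch, sigma0, decay))
    return grid


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))          # stand-in for standardized song features
trained = train(rng.random((4, 4, 3)), data, epochs=10)
```

Because each updated weight is a weighted average of input vectors, the trained weights stay within the range of the data. A very small `sigma` can make the neighborhood weights underflow, so the defaults here are kept moderate.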

This repository represents part of a final project for Spring 2016 Computer Science with Applications III.

Folders and files

cleaned_test_data
images
presentation_stuff
test_data
.gitignore
Anna_exp.ipynb
DataCleaning.ipynb
Node.py
README.md
SOM.py
SOMMapper.py
UMatrixMapper.py
attribute overview.xlsx
clean_data.py
