css
data
js
Makefile
README.md
process_isomap.py

README.md

A visualization of the metadata on many of the books in the Toronto Public Library's Catalogue. Specifically, from the Catalogue data (which was generously converted from XML to JSON by Alex Volkov), I extracted the records for English-language print books that contained subject metadata and attempted to cluster related subjects. Following an approach inspired by Nicolas Kruchten, the subject co-occurrence matrix was first reduced using singular value decomposition (SVD), and then the high-dimensional subject vectors were embedded in the low-dimensional visualization space using the Isomap technique. Finally, clusters of closely related subjects were identified and highlighted using the K-Means clustering algorithm. The SVD, Isomap, and K-Means implementations are from scikit-learn, and the visualization uses D3.js.
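
The pipeline described above can be sketched roughly as follows. This is a minimal illustration on made-up records, not the contents of `process_isomap.py`: the helper names (`cooccurrence_matrix`, `embed_and_cluster`), the toy subjects, and all parameter values are assumptions, as is the choice to run K-Means on the 2D Isomap coordinates rather than on the SVD vectors.

```python
from itertools import combinations

import numpy as np
from scipy.sparse import coo_matrix
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import Isomap


def cooccurrence_matrix(records):
    """Count how often each pair of subjects appears on the same record."""
    subjects = sorted({s for rec in records for s in rec})
    index = {s: i for i, s in enumerate(subjects)}
    rows, cols = [], []
    for rec in records:
        for a, b in combinations(sorted(set(rec)), 2):
            # Record the pair in both directions so the matrix is symmetric.
            rows += [index[a], index[b]]
            cols += [index[b], index[a]]
    n = len(subjects)
    # Duplicate (row, col) entries are summed when converting to CSR.
    matrix = coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n)).tocsr()
    return subjects, matrix


def embed_and_cluster(matrix, n_svd=3, n_neighbors=4, n_clusters=2, seed=0):
    """SVD reduction, then a 2D Isomap embedding, then K-Means labels."""
    reduced = TruncatedSVD(n_components=n_svd, random_state=seed).fit_transform(matrix)
    coords = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(reduced)
    # Assumption: cluster in the 2D embedding space.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(coords)
    return coords, labels


# Hypothetical records: each one is the list of subjects attached to a book.
records = [
    ["Cooking", "Baking", "Bread"],
    ["Cooking", "Baking", "Desserts"],
    ["Baking", "Bread", "Desserts"],
    ["Cooking", "Gardening", "Vegetables"],
    ["Gardening", "Vegetables", "Flowers"],
    ["Gardening", "Flowers", "Landscaping"],
    ["Vegetables", "Landscaping", "Gardening"],
]

subjects, matrix = cooccurrence_matrix(records)
coords, labels = embed_and_cluster(matrix)
for subject, (x, y), label in zip(subjects, coords, labels):
    print(f"{subject:12s} cluster={label} x={x:.2f} y={y:.2f}")
```

On the real catalogue the matrix would be far larger, so the number of SVD components, the Isomap neighbourhood size, and the number of clusters would all need tuning. The resulting coordinates and labels are the kind of output a D3.js front end could read from a JSON file.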