Collect, clean, and cluster song lyrics with Doc2Vec and t-SNE

Lyrics, Pt. 3: Rap Song Clustering with Doc2Vec

This project creates embeddings of song lyrics with Doc2Vec, reduces the embeddings to two dimensions with t-SNE, and compares the resulting song clusters artist by artist.

A full description of the project can be found at

Getting started

Prerequisite software

  • Python

Prerequisite libraries

  • Python:
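contractions, collections, gensim, nltk, pandas, re, sklearn, string (install any missing libraries with `!pip install [library name]`; collections, re, and string ship with Python's standard library and need no installation, and sklearn is installed under the package name scikit-learn)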
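One way to find out which of the third-party libraries above still need installing is to check whether each one can be imported. The sketch below uses only the standard library; the `PIP_NAMES` mapping and the `missing_packages` helper are names invented for this example.

```python
import importlib.util

# Import name -> pip package name for the third-party libraries listed above.
# (collections, re, and string are standard-library modules and are omitted.)
PIP_NAMES = {
    "contractions": "contractions",
    "gensim": "gensim",
    "nltk": "nltk",
    "pandas": "pandas",
    "sklearn": "scikit-learn",
}


def missing_packages(import_to_pip=PIP_NAMES):
    """Return the pip names of libraries that cannot be found for import."""
    return sorted(
        pip_name
        for module_name, pip_name in import_to_pip.items()
        if importlib.util.find_spec(module_name) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-run install command for whatever is missing.
        print("pip install " + " ".join(missing))
    else:
        print("All required libraries are importable.")
```

In a notebook, the printed command can be run with a leading `!`, matching the install hint above.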

Instructions for use

  • Change paths in as appropriate.

  • Update artist dictionary in as appropriate.

  • Change working directory in as appropriate, and run entire file. Note that additional parameters are available for many LyricsAnalyzer methods; see for further details on available options.



This project is licensed under the MIT License - see the file for details.

