This code was written to analyze how the informational complexity of songs in the Million Song Dataset (MSD) [1] has evolved over time, using the conditional entropy of codewords [2] as the measure of complexity. The complexity of the MSD as a whole was compared with that of the songs on the Billboard Hot 100, drawn from [3].
The corresponding paper was accepted to the ISMIR 2019 conference and is available at https://arxiv.org/abs/1907.04292 [4].
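
As a rough illustration of the complexity measure mentioned above, the following is a minimal sketch of estimating conditional entropy from a sequence of codewords. It is not taken from this repository. It assumes a song has already been reduced to a sequence of hashable codewords and that each codeword is conditioned on the one immediately before it; the function name `conditional_entropy` and both of these choices are assumptions made for illustration, and the paper's exact definition may differ.

```python
from collections import Counter
from math import log2


def conditional_entropy(codewords):
    """Estimate H(next | current) in bits from consecutive codeword pairs.

    Hypothetical helper: assumes first-order conditioning on the
    immediately preceding codeword, using empirical frequencies.
    """
    pairs = list(zip(codewords, codewords[1:]))
    if not pairs:
        return 0.0  # fewer than two codewords: nothing to condition on

    pair_counts = Counter(pairs)                    # counts of (current, next)
    current_counts = Counter(x for x, _ in pairs)   # counts of current codeword
    n = len(pairs)

    entropy = 0.0
    for (current, _), count in pair_counts.items():
        p_pair = count / n                          # p(current, next)
        p_next_given_current = count / current_counts[current]
        entropy -= p_pair * log2(p_next_given_current)
    return entropy


if __name__ == "__main__":
    # A fully predictable sequence has zero conditional entropy.
    print(conditional_entropy(list("abababab")))
    # A less predictable sequence yields a larger value.
    print(conditional_entropy(list("aabbabba")))
```

Under these assumptions, a lower value means the next codeword is more predictable from the current one, which corresponds to lower complexity in the sense used here.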
[1] Thierry Bertin-Mahieux, Daniel P.W. Ellis, Brian Whitman, and Paul Lamere. The million song dataset. In Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011), 2011.
[2] Joan Serrà, Álvaro Corral, Marián Boguñá, Martín Haro, and Josep Ll. Arcos. Measuring the evolution of contemporary western popular music. Scientific Reports, 2, 2012.
[3] Matthias Mauch, Robert M MacCallum, Mark Levy, and Armand M Leroi. The evolution of popular music: USA 1960-2010. arXiv preprint arXiv:1502.05417, 2015.
[4] Thomas Parmer and Yong-Yeol Ahn. Evolution of the informational complexity of contemporary western music. arXiv preprint arXiv:1907.04292, 2019.