Music Segmentation Estimation using Self-similarity Matrix

Generate SSM

First, convert the audio into chroma vectors, then compute the Self-Similarity Matrix (SSM) of the chroma vectors using cosine similarity.
The following picture shows the SSM of the chroma.
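
As a rough illustration of this step, the sketch below builds a cosine SSM with NumPy. It is not taken from ssm.py: the function name `cosine_ssm` and the synthetic chroma input are assumptions made for the example. With real audio, the chroma matrix could come from a feature extractor such as `librosa.feature.chroma_stft`, which is likewise an assumption about tooling rather than something the text above states.

```python
import numpy as np

def cosine_ssm(chroma):
    """Cosine self-similarity matrix of a (12, T) chroma array; returns (T, T)."""
    chroma = np.asarray(chroma, dtype=float)
    norms = np.linalg.norm(chroma, axis=0, keepdims=True)
    unit = chroma / np.maximum(norms, 1e-12)  # guard against all-zero (silent) frames
    return unit.T @ unit  # entry (i, j) is the cosine similarity of frames i and j

# Synthetic stand-in for real chroma (hypothetical data):
# 20 frames of a C major triad followed by 20 frames of an F major triad.
c_major = np.zeros(12)
c_major[[0, 4, 7]] = 1.0
f_major = np.zeros(12)
f_major[[5, 9, 0]] = 1.0
chroma = np.column_stack([c_major] * 20 + [f_major] * 20)  # shape (12, 40)

ssm = cosine_ssm(chroma)
print(ssm.shape)
```

Frames within the same chord are fully similar, while frames from different chords share only one pitch class, so the matrix shows two bright square blocks along the diagonal.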

Find the novelty

Next, we need to locate the boundaries between adjacent blocks. These boundaries are called novelty. To find them, we scan the SSM with a simple kernel from bottom-left to top-right.
Then, we plot the kernel output at each position as the novelty curve. The following picture shows the novelty curve.
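
The text does not say which kernel is used, so the sketch below assumes a checkerboard kernel slid along the main diagonal of the SSM, which is a common choice for this kind of novelty detection. The names `checkerboard_kernel`, `novelty_curve`, and `half_width` are hypothetical, and the block SSM is synthetic.

```python
import numpy as np

def checkerboard_kernel(half_width):
    """(2L x 2L) kernel: +1 on the two within-block quadrants, -1 on the cross quadrants."""
    sign = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(sign, sign)

def novelty_curve(ssm, half_width=8):
    """Slide the kernel along the main diagonal of the SSM; one score per frame."""
    n = ssm.shape[0]
    kernel = checkerboard_kernel(half_width)
    padded = np.pad(ssm, half_width, mode="constant")  # zero-pad so edge frames get a window
    novelty = np.zeros(n)
    for i in range(n):
        # Window covering original frames i - half_width .. i + half_width - 1.
        window = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        novelty[i] = np.sum(window * kernel)
    return novelty

# Synthetic SSM with three blocks (A, B, A), each 30 frames long.
labels = np.repeat([0, 1, 0], [30, 30, 30])
ssm = (labels[:, None] == labels[None, :]).astype(float)

novelty = novelty_curve(ssm, half_width=8)
print(int(np.argmax(novelty)))
```

The score is high where the frames before and after the current position are each self-similar but differ from each other, which is the case at frames 30 and 60 in this toy matrix. Because of the zero padding, the first and last few frames also receive a non-zero score.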

Post-process

Finally, we post-process the novelty curve with the following rules.

Minimum distance between peaks: 2 sec
Discard peaks with low novelty (the lowest 15%)
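
The two rules can be sketched in plain Python as follows. This is one possible reading, not the code in ssm.py: the 15% rule is interpreted here as dropping the weakest 15% of peaks by rank, and the names `pick_boundaries`, `frame_rate`, `min_gap_sec`, and `drop_fraction` are hypothetical.

```python
def pick_boundaries(novelty, frame_rate, min_gap_sec=2.0, drop_fraction=0.15):
    """Return frame indices of the novelty peaks kept after both rules."""
    min_gap = int(round(min_gap_sec * frame_rate))
    # Local maxima: strictly above the left neighbour, at least as high as the right one.
    candidates = [i for i in range(1, len(novelty) - 1)
                  if novelty[i - 1] < novelty[i] >= novelty[i + 1]]
    # Rule 1: visit candidates from strongest to weakest and keep a peak only if
    # no already-kept peak lies closer than min_gap frames.
    kept = []
    for i in sorted(candidates, key=lambda k: novelty[k], reverse=True):
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(i)
    # Rule 2: drop the weakest drop_fraction of the surviving peaks (by rank).
    n_drop = int(len(kept) * drop_fraction)
    if n_drop:
        kept = kept[:-n_drop]  # kept is ordered from strongest to weakest
    return sorted(kept)

# Synthetic novelty curve (hypothetical data) at 10 frames per second.
frame_rate = 10
novelty = [0.0] * 200
for frame, height in [(10, 1.0), (15, 0.6), (40, 0.9), (70, 0.8),
                      (100, 0.7), (130, 0.75), (160, 0.85), (190, 0.05)]:
    novelty[frame] = height

boundaries = pick_boundaries(novelty, frame_rate)
times_sec = [i / frame_rate for i in boundaries]
print(times_sec)
```

In this toy curve the peak at frame 15 is removed because it lies within 2 seconds of the stronger peak at frame 10, and the very weak peak at frame 190 is removed by the 15% rule.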

Result

https://github.com/CodeGoood/Music-Segmentation/blob/master/pic/output.m4a
Each peak sound in the audio marks a phrase position that we predict.

Challenges

A phrase may consist of multiple chords
A phrase may begin/end within a chord
More information than chroma is needed (e.g. vocal onset/offset)
