A simple tool that takes an artist's name and outputs the average number of words in their songs.
Uses the MusicBrainz API to get the artist's albums and tracks, and the LyricsOvh API to get the lyrics for each track.
See docs for documentation.
Install dependencies with:
pip install -r requirements.txt
Use with:
python3 avglyriccounter <artist_name> <options>
Multi-word artist names must be wrapped in double quotes.
For example:
python3 avglyriccounter "iron maiden"
The following optional flags are available:
-h --help Display usage
-v Increase log level to INFO
-vv Increase log level to DEBUG
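As an illustration of how the -v and -vv flags could map to log levels, here is a minimal sketch using argparse. It is not taken from the tool's source: the function names build_parser and log_level are hypothetical, and the WARNING default for runs without -v is an assumption.

```python
import argparse
import logging


def build_parser():
    # Hypothetical parser; the real tool's argument handling may differ.
    parser = argparse.ArgumentParser(prog="avglyriccounter")
    parser.add_argument("artist_name",
                        help="artist to look up; quote multi-word names")
    # action="count" yields 1 for -v and 2 for -vv.
    parser.add_argument("-v", dest="verbosity", action="count", default=0,
                        help="-v for INFO, -vv for DEBUG")
    return parser


def log_level(verbosity):
    # Assumption: WARNING is the level when no -v flag is given.
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING
```

A caller would pass the parsed verbosity to logging.basicConfig(level=log_level(args.verbosity)) before doing any work. argparse adds -h and --help on its own.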
Run unit tests with:
python3 -m unittest discover test
Getting the results takes a long time, mainly because the MusicBrainz API has a rate limit of one request per second. The number of MusicBrainz requests made per artist lookup is 2 + number_of_albums, so for artists with dozens of albums the requests take a long time to complete.
In addition to the requests made to MusicBrainz, each unique track's lyrics are requested separately from LyricsOvh, which can raise the total number of requests into the hundreds. There is no rate limit for LyricsOvh, but the server responses do take a while.
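The request arithmetic above can be sketched in a few lines of Python. The function names are hypothetical and the figures in the usage note are made-up inputs, not measurements. The time estimate is only a lower bound, since it ignores LyricsOvh response times.

```python
def musicbrainz_request_count(number_of_albums):
    # 2 fixed requests plus one request per album, as described above.
    return 2 + number_of_albums


def minimum_musicbrainz_seconds(number_of_albums, seconds_per_request=1.0):
    # Lower bound imposed by the one-request-per-second rate limit.
    return musicbrainz_request_count(number_of_albums) * seconds_per_request


def total_request_count(number_of_albums, unique_tracks):
    # One additional LyricsOvh request per unique track.
    return musicbrainz_request_count(number_of_albums) + unique_tracks
```

For a hypothetical artist with 40 albums and 400 unique tracks, this gives 42 MusicBrainz requests (at least 42 seconds) and 442 requests in total.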
- Improve lyrics parsing to ignore non-word strings and to understand special cases
  - There doesn't seem to be a standardized format for the lyrics, but judging from the results, educated guesses can be made to improve accuracy, e.g.
    - Treat markers such as '(2x)' as repeats of the marked lines to get an accurate word count
    - Remove non-alphabetical characters, for example '...'
    - Ignore anything in angle brackets, which is often used to indicate instrumental songs, lyric credits or writer credits
    - Many lyrics start with "Paroles de la chanson [track] par [artist]", which is not part of the lyrics and can be removed to increase accuracy
- Write unit tests for AvgLyricCounter, MusicBrainzClient and LyricsOvhClient
- Consider automated end-to-end testing
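The lyrics parsing improvements listed above could be prototyped roughly as follows. This is a sketch under stated assumptions, not the tool's current behaviour: the function name count_words is hypothetical, the "Paroles de la chanson" header is assumed to occupy the first line, and a '(2x)' style marker is assumed to repeat only the line it appears on.

```python
import re

# Anything in angle brackets, e.g. "<Instrumental>".
ANGLE_RE = re.compile(r"<[^<>]*>")
# Repeat markers such as "(2x)" or "(3 x)".
REPEAT_RE = re.compile(r"\(\s*(\d+)\s*x\s*\)", re.IGNORECASE)
# Runs of letters, optionally joined by apostrophes ("don't").
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def count_words(lyrics):
    lines = lyrics.splitlines()
    # Assumption: the header, when present, is the whole first line.
    if lines and lines[0].strip().lower().startswith("paroles de la chanson"):
        lines = lines[1:]
    total = 0
    for line in lines:
        line = ANGLE_RE.sub(" ", line)
        repeat = 1
        match = REPEAT_RE.search(line)
        if match:
            # Assumption: the marker repeats only this line.
            repeat = int(match.group(1))
            line = REPEAT_RE.sub(" ", line)
        # Punctuation such as "..." never matches WORD_RE, so it is skipped.
        total += repeat * len(WORD_RE.findall(line))
    return total
```

Because the heuristics are guesses about an unstandardized format, each one is a good candidate for its own unit test with real response samples.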