marcderbauer/songcrawler

SongCrawler

Crawl Spotify and Genius for all the songs of your favourite artist!

❗ Requirements

This program was built using Python 3.10.
Older versions may work, but are currently not supported.
So far I have only tested this on macOS. I plan to test it on Windows in the future, but it may not work there yet.

You will also need to set up access to the Spotify and Genius APIs. More on that below.

😮 About

This program is a tool for data scientists and NLP enthusiasts who would like to gather data on music. It gathers two types of data: Spotify audio features and lyrics.

Spotify audio features are metrics about songs that are generated by Spotify and accessible through its API. As the name implies, the features are audio related and include some interesting measurements, such as danceability or acousticness. A full list of the features the API provides can be found in the Spotify Web API documentation.

The lyrics are gathered through the Genius API.

🤖 Setup

1. Install the Required Dependencies

pip install -r requirements.txt

2. Set Up the Spotify API

This repository uses Spotipy to access the Spotify API. If you're interested in the statistics gathered from Spotify, you might want to check out the Spotipy documentation. To query data from the API, you first need to create an app as a Spotify developer. This gives you a client_id and a client_secret.

You need to export the credentials as environment variables SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET respectively.

Here is a little tutorial on how to get the credentials: https://cran.r-project.org/web/packages/spotidy/vignettes/Connecting-with-the-Spotify-API.html
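As a rough illustration of how these credentials might be used, the sketch below reads the two variables named above and passes them to Spotipy explicitly. The helper names load_spotify_credentials and make_spotify_client are hypothetical and not taken from this repository. Spotipy's own default variable names are SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET, so passing the SPOTIFY_* values explicitly is assumed to be necessary here.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) from the SPOTIFY_* variables."""
    try:
        return env["SPOTIFY_CLIENT_ID"], env["SPOTIFY_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


def make_spotify_client(env=os.environ):
    """Build a Spotipy client using the client-credentials flow.

    Hypothetical helper: the repository may construct its client differently.
    """
    # Imported lazily so the credential helper works without Spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    client_id, client_secret = load_spotify_credentials(env)
    auth = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
    return spotipy.Spotify(auth_manager=auth)
```

Calling make_spotify_client() requires Spotipy to be installed and valid credentials to be exported.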

3. Set Up the Genius API

Lyrics are gathered using LyricsGenius to access the Genius.com API.

The first step to access the API is to create a Genius account. Once you have the account, you can use it to generate a client access token. You need to export this as the environment variable GENIUS_ACCESS_TOKEN.

After setting up both the Spotify and Genius APIs, you are ready to go.
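Before running the program, it can help to confirm that all three variables are actually visible to Python. The following is a minimal sketch of such a check. The function name missing_credentials is hypothetical and not part of this repository.

```python
import os

# The three environment variables named in the setup steps above.
REQUIRED_VARIABLES = (
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
    "GENIUS_ACCESS_TOKEN",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All credentials are set.")
```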

🎶 How It Works

There are two main ways to use this repo:

  1. As a command-line interface (CLI)
  2. As a Python module

SongCrawler was designed primarily as a CLI. It is written so that it should also be usable as a Python module, but that functionality is secondary.

Command-Line Interface

The CLI functionality is provided in main.py. You can use it as follows:

python3 main.py query

query is a placeholder that can take one of the following forms:

  1. A Spotify URI (e.g. spotify:track:2Ud3deeqLAG988pfW0Kwcl)
  2. A Genius ID (e.g. 8150537)
  3. A freetext query
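One way to tell the three forms apart is sketched below. This is an illustration only, not the dispatch logic in main.py. The function name classify_query and the returned labels are hypothetical, and the sketch assumes that a Genius ID consists of digits only and that a Spotify URI has the shape spotify:<type>:<id>.

```python
import re

# Assumed shape of a Spotify URI, based on the examples in this README.
_SPOTIFY_URI = re.compile(r"spotify:(track|album|playlist|artist):[A-Za-z0-9]+")
_GENIUS_ID = re.compile(r"[0-9]+")


def classify_query(query):
    """Return 'spotify_uri', 'genius_id' or 'freetext' for a query string."""
    query = query.strip()
    if _SPOTIFY_URI.fullmatch(query):
        return "spotify_uri"
    if _GENIUS_ID.fullmatch(query):
        return "genius_id"
    # Anything else is treated as a freetext search.
    return "freetext"
```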

There are many additional parameters and flags to adjust the program's behaviour. To list them all you can use:

python3 main.py --help

Spotify URI

Spotify URIs are the main way of using this program. They are quite flexible, as they represent not only songs, but also entire albums, playlists or even artists.

You can find the URIs in the share menu. For songs, open it by right-clicking the song; for all other resources, use the three-dot button at the top. The URIs are currently hidden within the share option. To reveal them, hold down the Option key on Mac or the Ctrl key on Windows.

Accessing the URIs
Credit: MattSuda in the Spotify Community

With the Spotify URI, you can use the CLI like this:

python3 main.py spotify:album:1R8kkopLT4IAxzMMkjic6X
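Because a URI encodes both the resource type and its ID, it can be split into those two parts before deciding what to fetch. The sketch below shows one possible way to do that. The names SpotifyUri and parse_spotify_uri are hypothetical, and the set of accepted types is limited to the four resource kinds mentioned above.

```python
from typing import NamedTuple

# Resource kinds this README mentions for Spotify URIs.
RESOURCE_TYPES = ("track", "album", "playlist", "artist")


class SpotifyUri(NamedTuple):
    resource_type: str
    resource_id: str


def parse_spotify_uri(uri):
    """Split 'spotify:<type>:<id>' into its type and ID, or raise ValueError."""
    parts = uri.strip().split(":")
    if len(parts) != 3 or parts[0] != "spotify":
        raise ValueError(f"Not a Spotify URI: {uri!r}")
    _, resource_type, resource_id = parts
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"Unsupported resource type: {resource_type!r}")
    if not (resource_id.isascii() and resource_id.isalnum()):
        raise ValueError(f"Malformed resource ID: {resource_id!r}")
    return SpotifyUri(resource_type, resource_id)
```

For the album URI used in the command above, parse_spotify_uri returns the type "album" together with the ID 1R8kkopLT4IAxzMMkjic6X.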

Genius IDs

Sometimes artists have different songs with the same name (looking at you, The 1975...). Other times, the Genius API may just have difficulties finding the correct song. In this case you can query the lyrics directly using the Genius ID.

This works the same way as with Spotify URIs:

python3 main.py 8150537

This is only supported for single songs. There is currently no easy way to look up a Genius ID; see issue #29.

Freetext Query

My personal favourite feature is the freetext search. It lets you use keywords to narrow down your search.

python3 main.py "artist:LCD Soundsystem album:This is Happening"

Accepted keywords are artist:, album:, playlist: and track:. Songcrawler will always look for the most specific keyword given. In the query above it will request an album. If no keywords are given, it will search for a track.
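The keyword handling described above could be sketched roughly as follows. This is not the repository's actual parser. The function names are hypothetical, and the specificity order track, album, playlist, artist is an assumption: the text above only confirms that album is more specific than artist and that track is the default.

```python
import re

# Matches any of the four accepted keywords, including the trailing colon.
_KEYWORD = re.compile(r"\b(artist|album|playlist|track):")

# ASSUMPTION: most specific first. Only "album beats artist" is documented.
SPECIFICITY = ("track", "album", "playlist", "artist")


def parse_keywords(query):
    """Map each keyword in the query to the text that follows it."""
    matches = list(_KEYWORD.finditer(query))
    fields = {}
    for i, match in enumerate(matches):
        # A value runs until the next keyword or the end of the query.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(query)
        fields[match.group(1)] = query[match.end():end].strip()
    return fields


def target_type(query):
    """Return the most specific keyword present, or 'track' if none is given."""
    fields = parse_keywords(query)
    for keyword in SPECIFICITY:
        if keyword in fields:
            return keyword
    return "track"
```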

Because requests can be complex and it is not always clear which result is best, the search mode is interactive. Songcrawler will return a table of 15 results; you can select the correct one by typing its index. Alternatively, you can page through further results using '+'/'-' or (p)revious/(n)ext.
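To make the paging behaviour concrete, here is a minimal sketch of how such input might be interpreted. It is an illustration under assumptions, not the program's actual loop. The function handle_input is hypothetical, '+' is assumed to mean next and '-' previous, and indices are assumed to be zero-based and relative to the page currently shown.

```python
PAGE_SIZE = 15  # Number of results per table, as stated above.


def handle_input(command, offset, total, page_size=PAGE_SIZE):
    """Interpret one line of user input.

    Returns (new_offset, selected), where selected is the absolute position
    of the chosen result, or None if the user only changed pages.
    """
    command = command.strip().lower()
    if command in ("+", "n", "next"):
        # Only advance if another page exists.
        if offset + page_size < total:
            offset += page_size
        return offset, None
    if command in ("-", "p", "previous"):
        return max(offset - page_size, 0), None
    if command.isascii() and command.isdigit():
        index = int(command)
        rows_on_page = min(page_size, total - offset)
        if index < rows_on_page:
            return offset, offset + index
    raise ValueError(f"Unrecognised input: {command!r}")
```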

Example Query

Python Module

Contribute

I'm happy to receive any feedback or contributions.
