# Data Collection

This notebook outlines the process of collecting playlist data from the Spotify API, focusing on playlists associated with specific times of the day: "morning," "afternoon," "evening," and "night." The contents of these playlists often reflect mood, activity, or genre preferences tied to these time periods.

The objective of this data collection is to compile information about playlists that fit these time periods, extracting key details such as playlist ID, number of followers, tracks, track genres, and additional metadata. The approach includes various filters and requirements to ensure the data is relevant and accurate.

## 1. Preparing the Environment

- **Libraries:** Core libraries like `os`, `json`, and `time` are loaded, along with `dotenv` for managing API credentials securely.
- **Custom Functions:** Functions from `Functions.py` are imported to handle Spotify API interactions and data processing.
- **Directories:** The `data/raw/` folder is set up to store raw JSON outputs from the API.
- **Credentials:** Spotify API credentials are loaded securely from a `.env` file.

These steps ensure the environment is ready for efficient data collection.

In [1]:
# library loading
import os
import json
import time
from dotenv import load_dotenv

# custom function loading
from Functions import *

In [2]:
raw_data_dir = "data/raw/"
os.makedirs(raw_data_dir, exist_ok=True)

In [3]:
load_dotenv()

SPOTIFY_CLIENT_ID = os.getenv("SPOTIFY_CLIENT_ID")
SPOTIFY_CLIENT_SECRET = os.getenv("SPOTIFY_CLIENT_SECRET")
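
`os.getenv` returns `None` when a variable is absent, so a missing `.env` entry would only surface later as a failed token request. The sketch below shows one way to fail early. The helper name `require_env` is hypothetical and is not part of `Functions.py`.

```python
import os


def require_env(names, environ=os.environ):
    """Return the requested variables, raising if any is missing or empty.

    Hypothetical helper for illustration; not part of Functions.py.
    """
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


# Example with a stand-in mapping instead of the real environment
example_env = {"SPOTIFY_CLIENT_ID": "example-id", "SPOTIFY_CLIENT_SECRET": "example-secret"}
credentials = require_env(["SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET"], example_env)
```

In the notebook, calling `require_env([...])` without the second argument right after `load_dotenv()` would check the real environment.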

### Generating Spotify API Access Token

To access the Spotify API, the `get_spotify_token` function from `Functions.py` is used. This function utilizes the Client ID and Client Secret stored in a secure `.env` file. The function generates an access token, and an exception is raised if the token generation fails. This token is essential for authenticating API requests throughout the data collection process.
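
The implementation of `get_spotify_token` is not shown in this notebook. As a rough illustration only, a token request using Spotify's Client Credentials flow might look like the sketch below. The function names `build_token_request` and `fetch_token` are hypothetical, and the sketch uses only the standard library, which may differ from what `Functions.py` actually does.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(client_id, client_secret):
    """Build (but do not send) a Client Credentials token request."""
    # The client ID and secret are sent as an HTTP Basic credential
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_token(client_id, client_secret, timeout=10):
    """Send the request and return the access token, raising on failure."""
    request = build_token_request(client_id, client_secret)
    # urlopen raises urllib.error.HTTPError for non-2xx responses
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    token = payload.get("access_token")
    if not token:
        raise RuntimeError("Token response did not contain an access_token")
    return token
```

Separating request construction from sending keeps the credential encoding easy to inspect without making a network call.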

In [4]:
try:
    token = get_spotify_token(SPOTIFY_CLIENT_ID, SPOTIFY_CLIENT_SECRET)
    print("Access Token Generated")
except Exception as e:
    print(f"Error fetching Spotify token: {e}")
    raise  # stop here: later cells need a valid token

Access Token Generated


### Defining Time Periods for Data Collection

The next step is collecting playlist data categorized by four main time periods: **Morning**, **Afternoon**, **Evening**, and **Night**. These categories help capture playlists that are thematically aligned with specific times of the day, reflecting mood, activity, or genre preferences.

In [5]:
time_periods = ["morning", "afternoon", "evening", "night"]
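
Each keyword is later passed to `search_playlists`, whose implementation is not shown here. As a hedged illustration, the sketch below builds a playlist search URL for each time period against Spotify's `/v1/search` endpoint. The helper `build_search_url` is hypothetical, and the `limit` of 10 is an arbitrary value chosen for the example.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.spotify.com/v1/search"


def build_search_url(keyword, limit=10, offset=0):
    """Return a playlist search URL for one keyword (hypothetical helper)."""
    params = {"q": keyword, "type": "playlist", "limit": limit, "offset": offset}
    return f"{SEARCH_URL}?{urlencode(params)}"


time_periods = ["morning", "afternoon", "evening", "night"]
search_urls = {period: build_search_url(period) for period in time_periods}
```

A request to any of these URLs would also need an `Authorization: Bearer <token>` header carrying the access token generated earlier.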

## 2. Data Collection
- **Search Criteria:** Playlists must include the time period keyword in either the title or description.
- **Inclusion Filters:**
  - The playlist must be user-generated (not created by Spotify's algorithms).
  - It must have at least 50 followers.
  - It must contain a minimum of 10 tracks.
  - It must have a valid playlist ID.
  - Only the first 200 tracks of each playlist are considered, so that unusually long (outlier) playlists do not bias the results.
- **Data Extraction:** The collected data includes:
  - Playlist ID
  - Tracks and associated metadata
  - Artist genres for each track
  - Additional playlist details, such as follower count
- **Output:** The data is saved as individual JSON files for each playlist, named using the associated time period and playlist ID. These files serve as raw data for further analysis and visualization.
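
The criteria above can be sketched as a single predicate plus a track cap. This is an illustration under stated assumptions, not the code in `Functions.py`: it assumes each playlist is a dictionary shaped like Spotify's playlist object (`id`, `name`, `description`, `owner.id`, `followers.total`, `tracks.total`), and it treats an owner ID of `"spotify"` as the marker for Spotify-created playlists. The names `passes_filters` and `cap_tracks` are hypothetical.

```python
MIN_FOLLOWERS = 50
MIN_TRACKS = 10
MAX_TRACKS = 200


def passes_filters(playlist, keyword):
    """Apply the inclusion filters to one playlist dictionary."""
    # A valid playlist ID is required
    if not playlist or not playlist.get("id"):
        return False
    # The keyword must appear in the title or the description
    text = f"{playlist.get('name') or ''} {playlist.get('description') or ''}".lower()
    if keyword.lower() not in text:
        return False
    # Assumption: Spotify-created playlists have the owner ID "spotify"
    owner_id = (playlist.get("owner") or {}).get("id")
    if owner_id == "spotify":
        return False
    followers = (playlist.get("followers") or {}).get("total") or 0
    track_total = (playlist.get("tracks") or {}).get("total") or 0
    return followers >= MIN_FOLLOWERS and track_total >= MIN_TRACKS


def cap_tracks(tracks, limit=MAX_TRACKS):
    """Keep only the first `limit` tracks of a playlist."""
    return tracks[:limit]
```

Keeping the thresholds as module-level constants makes it easy to rerun the collection with different cutoffs.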

In [None]:
for time_period in time_periods:
    print(f"Collecting data for {time_period} playlists...")

    time.sleep(60)  # Pause before each search to avoid rate limiting
    
    # Search playlists by time period
    playlist_ids = search_playlists(time_period, token)

    if not playlist_ids:
        print(f"No playlists found for {time_period}")
        continue
    
    # Initialize playlist data to store all playlist details
    playlist_data = []
    
    for playlist_id in playlist_ids:
        time.sleep(7.5)  # Pause between playlists to avoid rate limiting
        print(f"Fetching details for playlist ID: {playlist_id}")
        
        # Get playlist details; skip the playlist if none are returned
        playlist_details = get_playlist_details(playlist_id, token)
        if not playlist_details:
            print(f"Details not found for playlist ID: {playlist_id}")
            continue

        # Get playlist tracks; skip the playlist if it has none
        playlist_tracks = get_playlist_tracks(playlist_id, token)
        if not playlist_tracks:
            print(f"No tracks found for playlist ID: {playlist_id}")
            continue

        # Get track genres using the playlist ID
        track_genres = get_track_genres(playlist_id, token)

        # Compile data
        playlist_info = {
            "playlist_id": playlist_id,
            "playlist_details": playlist_details,
            "playlist_tracks": playlist_tracks,  # For database purposes
            "track_genres": track_genres,
        }
        playlist_data.append(playlist_info)

        # Save JSON file for the playlist
        save_json(playlist_info, f"{time_period}_playlist_{playlist_id}")

    print(f"Data collection for {time_period} playlists completed.")

Collecting data for morning playlists...
Filtered and Sorted Playlists:
Café Music 2025 ☕ Chill Vibes -  Coffee Lounge  - Followers: 576112 - Tracks: 295
Café Music 2024 ☕️ Coffee Shop Vibes for Good Morning!  - Followers: 233131 - Tracks: 381
Sunday Chill  ☕ Morning Playlist :-) - Followers: 94687 - Tracks: 70
Morning Chill 🥞 Relax Breakfast - Followers: 82610 - Tracks: 200
productive work lofi 🍉 - Followers: 43286 - Tracks: 597
Morning Vibes 2024 🌞 - Followers: 38932 - Tracks: 205
Chill Morning 🥐☕ Breakfast Music 2024 - Followers: 28727 - Tracks: 258
Good Morning Playlist || Best Day Ever 🌤  - Followers: 25229 - Tracks: 44
Sunday Morning Vibes ⛅️ - Followers: 14999 - Tracks: 245
Happy Chill Morning - Followers: 12804 - Tracks: 217
it girl morning 🧖‍♀️🛁🌟 - Followers: 11715 - Tracks: 117
happy morning vibes☀️🐥 - Followers: 6214 - Tracks: 327
Morning Music (Breakfast / Wake-Up / Coffee) - Followers: 5524 - Tracks: 219
Morning  - Followers: 429 - Tracks: 217
Morning playlist - Followers: