# **Project: Are Top Spotify Songs Getting Shorter Over Time?**

#### **Objective:**

To determine whether the most popular songs released each year (from 2014 to 2024) are getting shorter in duration over time.

#### **Challenge:**

The official Spotify API does not provide the total number of streams per song. Instead, it gives a popularity score (from 0 to 100) which reflects recent activity, not total plays.

#### **Strategy:**

**We will:**

1. Search for the most popular songs released in each year using the Spotify API.
2. Record their duration, popularity score, and release date.
3. Analyze if the average duration of top songs has decreased over the years.
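
Step 3 above can be sketched in pandas before any data is fetched. This is a minimal illustration, not the project's final analysis: the `sample` rows and the helper `average_duration_by_year` are hypothetical, and the column names `Year` and `Duration (min)` are assumed to match the CSV produced later in this notebook.

```python
import pandas as pd

# Hypothetical sample rows shaped like the dataset this project builds;
# the durations are made up for illustration only.
sample = pd.DataFrame({
    "Year": [2014, 2014, 2015, 2015],
    "Duration (min)": [4.0, 3.0, 3.2, 2.8],
})

def average_duration_by_year(df):
    """Return the mean track duration (in minutes) for each release year."""
    return df.groupby("Year")["Duration (min)"].mean()

avg_by_year = average_duration_by_year(sample)
print(avg_by_year)
```

If the yearly averages fall as the year increases, that would be consistent with top songs getting shorter.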

#### **Import Libraries Needed**

In [1]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd
import time

#### **Python Script to Fetch Most Popular Songs Per Year**

In [5]:
# Read your Spotify API credentials from environment variables
# (never hardcode or commit real credentials in a notebook)
import os

CLIENT_ID = os.environ['SPOTIPY_CLIENT_ID']
CLIENT_SECRET = os.environ['SPOTIPY_CLIENT_SECRET']

auth_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
sp = spotipy.Spotify(auth_manager=auth_manager)

def get_top_tracks_by_year(year, limit=50):
    query = f'year:{year}'
    tracks = []
    offset = 0
    fetched = 0
    max_fetch = 1000  # max search results
    batch_size = 50

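    retries = 0
    max_retries = 3  # give up after repeated API errors instead of looping forever
    while fetched < max_fetch and len(tracks) < limit:
        try:
            results = sp.search(q=query, type='track', limit=batch_size, offset=offset)
        except spotipy.exceptions.SpotifyException as e:
            retries += 1
            if retries > max_retries:
                print(f"Spotify API error: {e}. Giving up on {year}.")
                break
            print(f"Spotify API error: {e}. Retrying in 5 seconds...")
            time.sleep(5)
            continue
        retries = 0  # reset the counter after a successful request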
        
        items = results['tracks']['items']
        if not items:
            break
        
        tracks.extend(items)
        fetched += len(items)
        offset += batch_size
        
        time.sleep(0.1)
    
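    # Sort by popularity descending.
    # Note: the loop above stops as soon as `limit` tracks are collected, so with the
    # defaults this ranks only the first page of search results, not all matches.
    tracks_sorted = sorted(tracks, key=lambda t: t['popularity'], reverse=True)
    return tracks_sorted[:limit]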

def main():
    all_data = []

    for year in range(2014, 2025):
        print(f"Fetching top tracks for {year}...")
        top_tracks = get_top_tracks_by_year(year)

        for track in top_tracks:
            duration_ms = track['duration_ms']
            duration_min = round(duration_ms / 60000, 2)  # convert to minutes with 2 decimals

            all_data.append({
                'Year': year,
                'Track Name': track['name'],
                'Artists': ", ".join([artist['name'] for artist in track['artists']]),
                'Popularity': track['popularity'],
                'Duration (min)': duration_min,
                'Spotify URL': track['external_urls']['spotify']
            })
        
        print(f"Collected {len(top_tracks)} tracks for {year}")
    
    df = pd.DataFrame(all_data)
    df.to_csv('top_tracks_by_year.csv', index=False, encoding='utf-8')
    print("Data saved to top_tracks_by_year.csv")

if __name__ == "__main__":
    main()

df = pd.read_csv('top_tracks_by_year.csv')
df.head()

Fetching top tracks for 2014...
Collected 50 tracks for 2014
Fetching top tracks for 2015...
Collected 50 tracks for 2015
Fetching top tracks for 2016...
Collected 50 tracks for 2016
Fetching top tracks for 2017...
Collected 50 tracks for 2017
Fetching top tracks for 2018...
Collected 50 tracks for 2018
Fetching top tracks for 2019...
Collected 50 tracks for 2019
Fetching top tracks for 2020...
Collected 50 tracks for 2020
Fetching top tracks for 2021...
Collected 50 tracks for 2021
Fetching top tracks for 2022...
Collected 50 tracks for 2022
Fetching top tracks for 2023...
Collected 50 tracks for 2023
Fetching top tracks for 2024...
Collected 50 tracks for 2024
Data saved to top_tracks_by_year.csv


| | Year | Track Name | Artists | Popularity | Duration (min) | Spotify URL |
|---|---|---|---|---|---|---|
| 0 | 2014 | A Sky Full of Stars | Coldplay | 90 | 4.46 | https://open.spotify.com/track/0FDzzruyVECATHX... |
| 1 | 2014 | I Love You So | The Walters | 89 | 2.67 | https://open.spotify.com/track/4SqWKzw0CbA05TG... |
| 2 | 2014 | Summer | Calvin Harris | 88 | 3.71 | https://open.spotify.com/track/6YUTL4dYpB9xZO5... |
| 3 | 2014 | Outside (feat. Ellie Goulding) | Calvin Harris, Ellie Goulding | 88 | 3.79 | https://open.spotify.com/track/7MmG8p0F9N3C4AX... |
| 4 | 2014 | Lovers Rock | TV Girl | 88 | 3.57 | https://open.spotify.com/track/6dBUzqjtbnIa1Tw... |


#### **Python Script to Fetch the Image of Each Artist**

In [3]:
auth_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
sp = spotipy.Spotify(auth_manager=auth_manager)

# STEP 1: Load the top tracks dataset
df_tracks = pd.read_csv('top_tracks_by_year.csv')

# STEP 2: Extract the main artist (first listed)
df_tracks['Main Artist'] = df_tracks['Artists'].apply(lambda x: x.split(',')[0].strip())

# STEP 3: Get unique artist names
unique_artists = df_tracks['Main Artist'].drop_duplicates().tolist()

# STEP 4: Function to fetch artist image
def fetch_artist_image(artist_name):
    try:
        result = sp.search(q=artist_name, type='artist', limit=1)
        items = result['artists']['items']
        if items and 'images' in items[0] and items[0]['images']:
            return items[0]['images'][0]['url']
    except Exception as e:
        print(f"Error fetching image for {artist_name}: {e}")
    return None

# STEP 5: Build artist image mapping
artist_data = []
for name in unique_artists:
    print(f"Fetching image for: {name}")
    image_url = fetch_artist_image(name)
    artist_data.append({'Artist': name, 'Image URL': image_url})
    time.sleep(0.1)  # Be nice to the API

# STEP 6: Save to artists.csv
df_artists = pd.DataFrame(artist_data)
df_artists.to_csv('artists.csv', index=False, encoding='utf-8')
print("Saved artist images to artists.csv")

Fetching image for: Coldplay
Fetching image for: The Walters
Fetching image for: Calvin Harris
Fetching image for: TV Girl
Fetching image for: One Direction
Fetching image for: J. Cole
Fetching image for: Justin Bieber
Fetching image for: Avicii
Fetching image for: Pitbull
Fetching image for: MAGIC!
Fetching image for: Maroon 5
Fetching image for: WALK THE MOON
Fetching image for: Vance Joy
Fetching image for: Ariana Grande
Fetching image for: Hozier
Fetching image for: Sam Smith
Fetching image for: Lana Del Rey
Fetching image for: Taylor Swift
Fetching image for: Ed Sheeran
Fetching image for: Mr. Probz
Fetching image for: Pharrell Williams
Fetching image for: George Ezra
Fetching image for: Sia
Fetching image for: Nico & Vinz
Fetching image for: Ty Dolla $ign
Fetching image for: Route 94
Fetching image for: Clean Bandit
Fetching image for: Lilly Wood and The Prick
Fetching image for: Henrique & Juliano
Fetching image for: Badoxa
Fetching image for: Anselmo Ralph
Fetching image for: D
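
The two CSV files share the main artist's name, so the image URLs can be joined back onto the tracks. The sketch below shows one way to do that with a left merge. The small `tracks` and `artists` frames are hypothetical stand-ins for `top_tracks_by_year.csv` and `artists.csv`, and the `example.com` image URLs are placeholders, not real Spotify image links.

```python
import pandas as pd

# Hypothetical stand-in for top_tracks_by_year.csv (only the columns needed here).
tracks = pd.DataFrame({
    "Track Name": ["A Sky Full of Stars", "Summer", "Outside (feat. Ellie Goulding)"],
    "Artists": ["Coldplay", "Calvin Harris", "Calvin Harris, Ellie Goulding"],
})

# Hypothetical stand-in for artists.csv; the URLs are placeholders.
artists = pd.DataFrame({
    "Artist": ["Coldplay", "Calvin Harris"],
    "Image URL": [
        "https://example.com/coldplay.jpg",
        "https://example.com/calvin-harris.jpg",
    ],
})

# Same rule as the script above: the main artist is the first name listed.
tracks["Main Artist"] = tracks["Artists"].str.split(",").str[0].str.strip()

# A left merge keeps every track, even if no image was found for its artist.
merged = tracks.merge(artists, left_on="Main Artist", right_on="Artist", how="left")
print(merged[["Track Name", "Main Artist", "Image URL"]])
```

A left join is used so that tracks whose artist lookup returned no image still appear, with a missing `Image URL`.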

#### **Next Steps**

Import the CSV files into Power BI and analyze whether the most popular songs released each year (from 2014 to 2024) are getting shorter over time.
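
As a quick check alongside the Power BI work, the trend can also be estimated directly in pandas by fitting a straight line to the yearly average durations. This is a rough sketch under stated assumptions: the helper `duration_trend` and the `demo` data are hypothetical, the column names are assumed to match `top_tracks_by_year.csv`, and a simple linear fit is only one of several reasonable ways to measure a trend.

```python
import numpy as np
import pandas as pd

def duration_trend(df):
    """Fit a line to the yearly mean duration and return its slope (minutes per year)."""
    yearly = df.groupby("Year")["Duration (min)"].mean()
    years = yearly.index.to_numpy(dtype=float)
    # Centering the years keeps the fit numerically well behaved; the slope is unchanged.
    slope, _intercept = np.polyfit(years - years.mean(), yearly.to_numpy(), deg=1)
    return slope

# Hypothetical data for illustration; the real input would be
# pd.read_csv('top_tracks_by_year.csv').
demo = pd.DataFrame({
    "Year": [2014, 2015, 2016],
    "Duration (min)": [4.0, 3.8, 3.6],
})

slope = duration_trend(demo)
print(f"Average duration changes by {slope:.2f} minutes per year")
```

A negative slope would suggest that top songs are getting shorter; with only eleven yearly averages, the result should be read as indicative rather than conclusive.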