# Analyzing the ASOT Top 1000
> Celebrating 1,000 episodes of A State of Trance.

- toc: true 
- badges: true
- comments: false
- categories: [asot, bpm, artist, year]
- image: images/annual-avg-bpm.png

In [8]:
#hide
%pip install spotipy pyyaml altair



In [9]:
#hide
import os
import yaml
import spotipy
import json
import altair as alt
import numpy as np
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials())

## Introduction

To celebrate the 1,000th episode of A State of Trance, the radio show invited listeners to vote for their all-time favorite trance tracks, and the resulting list was broadcast as [ASOT 1000](https://www.astateoftrance.com/episodes/asot1000/).

In this post we'll analyze the top 1,000.

## Some Housekeeping

As with previous posts here, we'll be pulling data from Spotify and graphing the results. While there is [an official "ASOT Top 1000"](https://open.spotify.com/playlist/5QafFMGgQKGwqgV7k3qHy6) playlist on Spotify, I'm opting to instead use the "[ASOT TOP 1000 Countdown Extended](https://open.spotify.com/playlist/5DCcjCLMlPjTwKLCcYyzIj)" playlist [compiled by reddit user turbodevin](https://www.reddit.com/r/trance/comments/l2ae9y/relive_the_asot_top_1000_countdown_in_your_own/). As Devin writes,
> MISSING
>
>    531 || Sean Callery - The Longest Day (Armin van Buuren Remix)
>
> REMIX NOT AVAILABLE
>
>    414 || Faithless - Insomnia (Andrew Rayel Remix)
>
>    520 || Safri Duo - Played A Live (The Bongo Song) [NWYR & Willem de Roo Remix]
>
>    530 || Kensington - Sorry (Armin van Buuren Remix)
>
>    635 || Ilse de Lange - The Great Escape (Armin van Buuren Remix)
>
>    661 || Zedd feat. Foxes - Clarity (Andrew Rayel Remix)

While the playlist may not be complete, I'd still consider it the most complete playlist available on Spotify, and its extended mixes are certainly preferable to the official playlist's radio mixes.

Remember, all data here is pulled directly from Spotify's API without any modification from my end*. See the post on [Methodology](https://scottbrenner.github.io/asot-jupyter/asot/bpm/2020/04/27/methodology.html) for details on what data we can pull from Spotify, and how.

\*[Spotify's API for "Get a Playlist's Items" limits us to getting 100 tracks at a time](https://developer.spotify.com/documentation/web-api/reference/#category-playlists). Let's make 10 API calls for 100 tracks each, incrementing `offset` each time, and save the results.

In [47]:
"""
User: https://open.spotify.com/user/113444659
Playlist: ASOT TOP 1000 Countdown Extended
Playlist link: https://open.spotify.com/playlist/5DCcjCLMlPjTwKLCcYyzIj
Playlist ID: 5DCcjCLMlPjTwKLCcYyzIj
"""
top_1000_playlist = '5DCcjCLMlPjTwKLCcYyzIj'

top_1000_tracks = []

# Get full details of the tracks and episodes of a playlist
# https://spotipy.readthedocs.io/en/2.16.1/#spotipy.client.Spotify.playlist_items
# Each call returns at most 100 items, so step `offset` from 0 to 900.
for offset in range(0, 1000, 100):
    page = sp.playlist_items(top_1000_playlist, offset=offset)
    top_1000_tracks.extend(page['items'])
print(len(top_1000_tracks))

1000
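
The cell above relies on knowing in advance that the playlist holds 1,000 tracks. As a minimal sketch, assuming a page-fetching callable (the hypothetical `fetch_page` below, which is not part of spotipy), the same offset pagination can stop on its own once a page comes back short. A fake in-memory playlist stands in for the Spotify call so the example runs offline:

```python
from typing import Callable, Dict, List

PAGE_SIZE = 100  # Spotify returns at most 100 playlist items per request


def fetch_all_items(fetch_page: Callable[[int, int], Dict],
                    page_size: int = PAGE_SIZE) -> List[dict]:
    """Collect every item by advancing `offset` until a short page is returned."""
    items: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)["items"]
        items.extend(page)
        if len(page) < page_size:  # a short (or empty) page means we've reached the end
            return items
        offset += page_size


# Stand-in for the Spotify call (hypothetical data, not the real playlist).
FAKE_PLAYLIST = [{"track": {"name": f"Track {i}"}} for i in range(250)]


def fake_fetch_page(offset: int, limit: int) -> Dict:
    return {"items": FAKE_PLAYLIST[offset:offset + limit]}


all_items = fetch_all_items(fake_fetch_page)
print(len(all_items))
```

In the notebook, `fetch_page` could plausibly be `lambda offset, limit: sp.playlist_items(top_1000_playlist, offset=offset, limit=limit)`.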


In [63]:
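# What's #1? The playlist runs in countdown order, so #1 is the last item.
# Note: this prints the first credited artist and the album name.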
print(top_1000_tracks[999]['track']['artists'][0]['name'], ' - ', top_1000_tracks[999]['track']['album']['name'])

Armin van Buuren  -  Shivers


## Artists

Let's begin by looking at the artists who made the top 1000 - how many unique artists were featured?

In [66]:
unique_artists = set()

for track in top_1000_tracks:
    for artist in track['track']['artists']:
        unique_artists.add(artist['name'])

print(len(unique_artists))

639


Which artists were featured the most?

In [67]:
from collections import defaultdict

artist_counter = defaultdict(int)

for track in top_1000_tracks:
    for artist in track['track']['artists']:
        artist_counter[artist['name']] += 1

top_artists = sorted(artist_counter.items(), key=lambda k_v: k_v[1], reverse=True)
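
As an aside, the standard library's `collections.Counter` can do the same tally in fewer lines. A minimal sketch, using a few made-up playlist items shaped like Spotify's response (the artist names are placeholders, not data from the playlist):

```python
from collections import Counter

# Hypothetical items mimicking the structure of Spotify playlist items.
sample_items = [
    {"track": {"artists": [{"name": "Artist A"}, {"name": "Artist B"}]}},
    {"track": {"artists": [{"name": "Artist A"}]}},
    {"track": {"artists": [{"name": "Artist C"}]}},
]

# Count one appearance per artist credit on each track.
counts = Counter(
    artist["name"]
    for item in sample_items
    for artist in item["track"]["artists"]
)

print(len(counts))            # number of unique artists
print(counts.most_common(1))  # (name, count) pairs, highest count first
print(sorted(name for name, n in counts.items() if n == 1))  # credited exactly once
```

`Counter.most_common()` returns the same kind of `(name, count)` list as `top_artists`, so the rest of the notebook would not need to change.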

Alright, let's see the top 25 in a graph.

In [69]:
# Name the columns explicitly; Altair encodings refer to columns by name.
source = pd.DataFrame(top_artists[:25], columns=['Artist', 'Tracks'])

bars = alt.Chart(source).mark_bar().encode(
    x=alt.X('Tracks:Q', title='Plays'),
    y=alt.Y('Artist:N', sort='-x', title='Artist')
).properties(
    title="ASOT Top 1000 - Most-played artists",
    width=600
)

text = bars.mark_text(
    align='left',
    baseline='middle',
    dx=3  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='Tracks:Q'
)

bars + text

No surprise as to _who_ is #1, but the sheer number of their tracks featured is pretty impressive: Armin van Buuren is credited on over 10% of the ASOT Top 1000, more than twice as many tracks as the second-most featured artist!
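
To make that comparison concrete, here is a small sketch: given a ranked list of `(artist, count)` pairs like `top_artists`, it computes the leader's share of the list and its ratio to the runner-up. The `leader_stats` helper is hypothetical, and the counts below are placeholders for illustration, not the real figures.

```python
def leader_stats(ranked, total_tracks):
    """Return (share of total, ratio to runner-up) for the first entry in `ranked`."""
    (_, first), (_, second) = ranked[0], ranked[1]
    return first / total_tracks, first / second


# Hypothetical counts, sorted from most to fewest tracks.
example_ranked = [("Artist A", 120), ("Artist B", 50), ("Artist C", 30)]

share, ratio = leader_stats(example_ranked, total_tracks=1000)
print(f"{share:.1%} of tracks, {ratio:.1f}x the runner-up")
```

In the notebook this could be called as `leader_stats(top_artists, len(top_1000_tracks))`.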

Which artists were featured exactly once?

In [78]:
for artist in top_artists:
    if artist[1] == 1:
        print(artist[0])

ATN
Late Night Alumni
Lumïsade
Ron van den Beuken
Greg Downey
M.I.K.E.
Stoneface
Terminal
Adam Nickey
Salt Tank
Michael Wood
Sassot
Waio
A Force
Donata
Surpresa
Tom Colontonio
Molly
Myon & Shane 54
Labworks
Mona Moua
Neptune Project
Micky Vi
Ava Mea
Rodg
Midway
Ramin Djawadi
Filterheadz
Jan Burton
Jody Wisternoff
Sian Evans
Marc Marberg
Probspot
Kyler England
Simon Lebon
Selu Vibra
Monoverse
Anton Sonin
Ernesto vs. Bastian
Sensation
Space RockerZ
Tania Zygar
Jean Marie
Matt Lange
Kerry Leva
Meighan Nealon
Sunset Bros
Mark McCabe
Maxi Jazz
Avao
Tammy
Elysian
Lemon
Einar K
Luigi Lusini
Thomas Bronzwaer
Shane
Niels Van Gogh
Alissa Feudo
Swedish House Mafia
Monogato
Eximinds
Whiteout
Andrea Mazza
Ronski Speed
NK
Full Tilt
Carl B.
Tinlicker
Cass
Slide
Bjorn Akesson
Frank T.R.A.X.
Radion6
Easton
Arisen Flame
Astrix
Sodality
Gid Sedgwick
Hazem Beltagui
Allan V.
David Gravell
3LAU
XIRA
The Space Brothers
Bagga Bownz
Ferry Corsten's Countdown
Simon O'Shine & Adam Navel
Humate
A & Z
Leolani
Abst

## Calculating

Let's crunch some numbers.

What is the annual average episode BPM?

In [39]:
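# `episodes` is assumed to be the list of ASOT episode albums loaded in
# earlier posts; it is not defined in this notebook.
annual_total_bpm = defaultdict(float)
annual_avg_bpm = defaultdict(float)
episodes_counter = defaultdict(int)  # episodes counted per year

for episode in episodes:
    try:
        episode_bpm = 0
        tracks_counted = 0
        for track in sp.album_tracks(episode['uri'])['items']:
            # Skip tracks named after the show itself and interview segments
            name = track['name'].lower()
            if "a state of trance" in name or "- interview" in name:
                continue
            episode_bpm += sp.audio_features(track['uri'])[0]['tempo']
            tracks_counted += 1
        year = episode['release_date'][:4]
        annual_total_bpm[year] += episode_bpm / tracks_counted
        episodes_counter[year] += 1
    except Exception:
        # Skip episodes whose tracks or audio features can't be retrieved
        continue

for year, total in annual_total_bpm.items():
    annual_avg_bpm[year] = total / episodes_counter[year]

print(dict(annual_avg_bpm))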

{'2001': 137.34653042328043, '2002': 138.77037672527473, '2003': 137.87648184593183, '2004': 136.8379514048452, '2005': 135.9946882254463, '2006': 136.21986194104406, '2007': 135.31834901771512, '2008': 134.5739768988312, '2009': 134.84276103847634, '2010': 134.2873634011065, '2011': 133.38769823050973, '2012': 133.77641000308137, '2013': 134.89472557606408, '2014': 134.62926964078503, '2015': 133.21748777702336, '2016': 133.8357659060619, '2017': 133.83226986060524, '2018': 132.8617358100079, '2019': 133.39046580493476, '2020': 133.46879858547982}
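
Note that the figure computed here is an average of per-episode averages: each episode counts equally within its year, regardless of how many tracks it has. A minimal offline sketch of that aggregation, using made-up tempos (the `episodes_sample` data and the `annual_average_bpm` helper are illustrative, not part of the notebook):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical episodes: a release date plus the tempo (BPM) of each track.
episodes_sample = [
    {"release_date": "2001-06-01", "tempos": [136.0, 138.0, 140.0]},
    {"release_date": "2001-06-08", "tempos": [134.0, 136.0]},
    {"release_date": "2002-01-03", "tempos": [138.0, 140.0]},
]


def annual_average_bpm(episodes):
    """Average the per-episode mean tempo within each release year."""
    per_year = defaultdict(list)
    for episode in episodes:
        if not episode["tempos"]:  # skip episodes with no usable tracks
            continue
        year = episode["release_date"][:4]
        per_year[year].append(mean(episode["tempos"]))
    return {year: mean(avgs) for year, avgs in per_year.items()}


print(annual_average_bpm(episodes_sample))
```

Averaging over all tracks in a year instead would weight longer episodes more heavily, so the two approaches can give slightly different numbers.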


## Results

Let's see what we've got!

In [109]:
source = pd.DataFrame([(k, v) for k, v in annual_avg_bpm.items()], 
                   columns=['Year', 'Average Episode BPM'])
source['138'] = 138

base = alt.Chart(source).mark_line().encode(
    x=alt.X('Year'),
    y=alt.Y('Average Episode BPM', scale=alt.Scale(domain=(130, 140))),
).properties(
    title="A State of Trance - Annual Average BPM of Episode",
    width=600
)

rule = alt.Chart(source).mark_rule(color='red').encode(
    y='138'
)

base + rule

Straightforward enough. In the coming posts we'll do something similar, looking at the most-played artists and tracks each year.