# Yearly Breakdown
> Top artists and tracks of the year, along with average episode BPM.

- toc: true 
- badges: true
- comments: false
- categories: [asot, tracks, arists, bpm]
- image: images/yearly.png

In [1]:
#hide
import os
import yaml
import spotipy
import json
import altair as alt
import numpy as np
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials

with open('spotipy_credentials.yaml', 'r') as spotipy_credentials_file:
    credentials = yaml.safe_load(spotipy_credentials_file)
    os.environ["SPOTIPY_CLIENT_ID"] = credentials['spotipy_credentials']['spotipy_client_id']
    os.environ["SPOTIPY_CLIENT_SECRET"] = credentials['spotipy_credentials']['spotipi_client_seret']

sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials())

asot_radio_id = '25mFVpuABa9GkGcj9eOPce'

albums = []
results = sp.artist_albums(asot_radio_id, album_type='album')
albums.extend(results['items'])
while results['next']:
    results = sp.next(results)
    albums.extend(results['items'])
seen = set()  # to avoid dups
for album in albums:
    name = album['name']
    if name not in seen:
        seen.add(name)

singles = []
results = sp.artist_albums(asot_radio_id, album_type='single')
singles.extend(results['items'])
while results['next']:
    results = sp.next(results)
    singles.extend(results['items'])
seen = set()  # to avoid dups
for single in singles:
    name = single['name']
    if name not in seen:
        seen.add(name)

episodes = singles + albums

episodes.sort(key=lambda x: x['release_date']) # Sort by release date

## Introduction

We've previously looked at A State of Trance's [average episode BPM](https://scottbrenner.github.io/asot-jupyter/asot/bpm/2020/04/28/avg-bpm.html), [most-played artists](https://scottbrenner.github.io/asot-jupyter/asot/artists/2020/05/02/artist-plays.html) and [most-played tracks](https://scottbrenner.github.io/asot-jupyter/asot/tracks/2020/05/16/track-plays.html) _overall_.

In this post, we'll do the same - but looking at how it changes from _year-to-year_.

## Getting Started

[The first episode of A State of Trance aired in 2001](https://en.wikipedia.org/wiki/A_State_of_Trance#History). Since then, the show has seen

In [2]:
len(episodes)

963

episodes of across its nearly 20-year run. _As of writing, according to Spotify, etc .._

As a weekly radio show, I'd expect to see about 52 episodes air each year. Is that correct?

Fortunately Spotify can tell us when an episode aired:

In [8]:
# Air dates for the first 10 episodes
for episode in episodes[:10]:
    print(episode['name'], episode['release_date'])

A State Of Trance Episode 000 2001-05-17
A State Of Trance Episode 001 2001-05-31
A State Of Trance Episode 002 2001-06-07
A State Of Trance Episode 003 2001-06-14
A State Of Trance Episode 004 2001-06-21
A State Of Trance Episode 005 2001-06-28
A State Of Trance Episode 006 2001-07-19
A State Of Trance Episode 007 2001-07-26
A State Of Trance Episode 008 2001-08-02
A State Of Trance Episode 009 2001-08-09


So we can keep a running tally for each year, then print the result:

In [34]:
from collections import defaultdict

episodes_counter = defaultdict(int)

for episode in episodes:
    episodes_counter[episode['release_date'][:4]] += 1

print(dict(episodes_counter))

{'2001': 27, '2002': 50, '2003': 50, '2004': 52, '2005': 48, '2006': 50, '2007': 51, '2008': 51, '2009': 51, '2010': 50, '2011': 52, '2012': 51, '2013': 51, '2014': 51, '2015': 52, '2016': 50, '2017': 51, '2018': 52, '2019': 52, '2020': 21}


Seems reasonable enough!

## Calculating

Let's crunch some numbers.

What is the annual average episode BPM?

In [39]:
annual_total_bpm = defaultdict(int)
annual_avg_bpm = defaultdict(int)

for episode in episodes:
    try:
        episode_bpm = 0
        tracks_counted = 0
        for track in sp.album_tracks(episode['uri'])['items']:
            if "a state of trance" in track['name'].lower() or "- interview" in track['name'].lower():
                continue
            else:
                episode_bpm += sp.audio_features(track['uri'])[0]['tempo']
                tracks_counted += 1
        episodes_counted += 1
        avg = episode_bpm/tracks_counted
        annual_total_bpm[episode['release_date'][:4]] += avg
    except:
        pass

for year, avg in annual_total_bpm.items():
    annual_avg_bpm[year] = avg / episodes_counter[year]

print(dict(annual_avg_bpm))

{'2001': 137.34653042328043, '2002': 138.77037672527473, '2003': 137.87648184593183, '2004': 136.8379514048452, '2005': 135.9946882254463, '2006': 136.21986194104406, '2007': 135.31834901771512, '2008': 134.5739768988312, '2009': 134.84276103847634, '2010': 134.2873634011065, '2011': 133.38769823050973, '2012': 133.77641000308137, '2013': 134.89472557606408, '2014': 134.62926964078503, '2015': 133.21748777702336, '2016': 133.8357659060619, '2017': 133.83226986060524, '2018': 132.8617358100079, '2019': 133.39046580493476, '2020': 133.46879858547982}


## Results

Let's see what we've got!

In [109]:
source = pd.DataFrame([(k, v) for k, v in annual_avg_bpm.items()], 
                   columns=['Year', 'Average Episode BPM'])
source['138'] = 138

base = alt.Chart(source).mark_line().encode(
    x=alt.X('Year'),
    y=alt.Y('Average Episode BPM', scale=alt.Scale(domain=(130, 140))),
).properties(
    title="A State of Trance - Annual Average BPM of Episode",
    width=600
)

rule = alt.Chart(source).mark_rule(color='red').encode(
    y='138'
)

base + rule