# AOTY

Sometimes I've felt compelled to list my favorite albums of the year on my blog. I've been using LastFM for 16 years, and it only just occurred to me that I could use their API to look at my listening history for the year and pick out the albums that were released this year. To use the API you need a key. I've put my key in a `.env` file that my notebook has access to. If you want to use this yourself, you could instead assign your key to `LASTFM_KEY` directly in the cell below.

In [205]:
import os
import dotenv

dotenv.load_dotenv()

LASTFM_KEY = os.environ['LASTFM_KEY']

## Get Listening Data

Over in [another notebook](https://github.com/edsu/notebooks/blob/master/LastFM%20Dissertation%20Highlights.ipynb) I was looking at my listening habits while working on my dissertation. I can reuse the `get_tracks()` and `get_tracks_df()` functions from there to get my listening history for 2020 as a Pandas DataFrame.

`get_tracks()` takes a LastFM user plus `start` and `end` datetime objects. Given a time period, `start` refers to the beginning of the period and `end` to the end of the period.

In [234]:
import pandas
import requests

from datetime import datetime

def get_tracks(user, start=None, end=None):
    params = {
        "method": "user.getrecenttracks",
        "user": user,
        "api_key": LASTFM_KEY,
        "format": "json",
        "limit": 200,
    }
    
    if start:
        params['from'] = int(start.timestamp())
    if end:
        params['to'] = int(end.timestamp())
        
    print('fetching {} to {}'.format(start, end))
                
    response = requests.get('http://ws.audioscrobbler.com/2.0/', params=params)
    
    earliest = None

    # raising a plain string is a TypeError in Python 3, so raise an Exception
    if response.status_code != 200:
        raise Exception('Uhoh {}'.format(response.status_code))
        
    results = response.json()
    if 'recenttracks' not in results:
        return
        
    for track in results['recenttracks']['track']:
        # currently playing tracks don't have a date yet 
        if track.get("@attr") and track["@attr"].get('nowplaying'):
            continue
        t = datetime.utcfromtimestamp(int(track['date']['uts']))
        if start and t < start:
            break
        if earliest is None or t < earliest:
            earliest = t
        yield track
        
    if start and earliest and earliest != end:
        yield from get_tracks(user, start, earliest)
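
To make the paging logic easier to see, here is a minimal offline sketch of the same idea. `SCROBBLES`, `fake_fetch()` and `walk_back()` are made up for illustration and stand in for the LastFM API; only the windowing mirrors `get_tracks()`: fetch the newest page before `end`, remember the earliest timestamp seen, then ask again with that timestamp as the new `end`. The sketch assumes `end` is exclusive, which may not match how the real API treats `to`.

```python
# Hypothetical stand-in for the API: timestamps of 10 plays, newest first.
SCROBBLES = list(range(100, 0, -10))   # 100, 90, ..., 10
PAGE_SIZE = 3                          # the real call uses limit=200

def fake_fetch(start, end):
    """Return up to PAGE_SIZE timestamps with start <= t < end, newest first."""
    window = [t for t in SCROBBLES if start <= t < end]
    return window[:PAGE_SIZE]

def walk_back(start, end):
    """Yield every timestamp between start and end, one page at a time."""
    earliest = None
    for t in fake_fetch(start, end):
        if earliest is None or t < earliest:
            earliest = t
        yield t
    # an empty page means the start of the period has been reached
    if earliest is not None:
        yield from walk_back(start, earliest)

plays = list(walk_back(start=30, end=95))
print(plays)
```

Each recursive call narrows the window from the top, which is why the `fetching ... to ...` log lines keep the same start and an ever earlier end.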

`get_tracks_df()` takes the same arguments as `get_tracks()` but instead of returning an iterator it iterates through all the results and builds up a big Pandas DataFrame for all the tracks listened to.

In [235]:
def get_tracks_df(user, start, end):
    results = []
    for track in get_tracks(user, start, end):
        results.append([
            datetime.utcfromtimestamp(int(track['date']['uts'])),
            track['mbid'],
            track['name'],
            track['artist']['#text'],
            track['album']['#text'],
            track['image'][-1]['#text']
        ])
    return pandas.DataFrame(results, columns=[
        'timestamp',
        'track_id',
        'track_name',
        'artist',
        'album',
        'image'
    ])   
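
The `uts` values in each track are Unix timestamps in seconds, delivered as strings. Here is a small sketch of how they map to datetimes using timezone-aware objects. This is an alternative to the naive datetimes used in this notebook, not what the functions here do, and the `uts` value is just an illustrative one. One reason to consider it: calling `.timestamp()` on a naive datetime interprets it as local time, so period boundaries can shift by the local UTC offset.

```python
from datetime import datetime, timezone

# illustrative value, shaped like what the API returns (a string of seconds)
uts = '1607207013'

# Unix seconds -> timezone-aware datetime in UTC
aware = datetime.fromtimestamp(int(uts), tz=timezone.utc)
print(aware.isoformat())

# going the other way: a period boundary -> the integer used for from/to
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(int(start.timestamp()))
```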

Now I can get all the tracks I listened to in 2020 into a DataFrame!

In [236]:
df = get_tracks_df('inkdroid', start=datetime(2020, 1, 1), end=datetime(2020, 12, 6))

fetching 2020-01-01 00:00:00 to 2020-12-06 00:00:00
fetching 2020-01-01 00:00:00 to 2020-11-23 21:56:42
fetching 2020-01-01 00:00:00 to 2020-11-20 13:11:19
fetching 2020-01-01 00:00:00 to 2020-10-24 00:16:40
fetching 2020-01-01 00:00:00 to 2020-09-26 01:06:25
fetching 2020-01-01 00:00:00 to 2020-09-04 23:30:32
fetching 2020-01-01 00:00:00 to 2020-08-10 21:11:06
fetching 2020-01-01 00:00:00 to 2020-07-25 01:40:11
fetching 2020-01-01 00:00:00 to 2020-07-10 18:49:20
fetching 2020-01-01 00:00:00 to 2020-06-25 00:44:12
fetching 2020-01-01 00:00:00 to 2020-06-15 12:30:52
fetching 2020-01-01 00:00:00 to 2020-06-02 19:00:59
fetching 2020-01-01 00:00:00 to 2020-05-22 20:58:50
fetching 2020-01-01 00:00:00 to 2020-05-10 00:04:50
fetching 2020-01-01 00:00:00 to 2020-05-01 01:40:20
fetching 2020-01-01 00:00:00 to 2020-04-22 13:27:27
fetching 2020-01-01 00:00:00 to 2020-04-12 22:00:49
fetching 2020-01-01 00:00:00 to 2020-04-01 00:17:42
fetching 2020-01-01 00:00:00 to 2020-03-20 21:05:47
fetching 202

In [283]:
df

Unnamed: 0,timestamp,track_id,track_name,artist,album,image
0,2020-12-05 22:23:33,,Suññatā,JAMES MURRAY & Mike Lazarev,Suññatā,https://lastfm.freetls.fastly.net/i/u/300x300/...
1,2020-12-05 22:20:32,,Ānimitta,JAMES MURRAY & Mike Lazarev,Suññatā,https://lastfm.freetls.fastly.net/i/u/300x300/...
2,2020-12-05 22:17:19,,Asāraka,JAMES MURRAY & Mike Lazarev,Suññatā,https://lastfm.freetls.fastly.net/i/u/300x300/...
3,2020-12-05 22:14:30,,Tucchaka,JAMES MURRAY & Mike Lazarev,Suññatā,https://lastfm.freetls.fastly.net/i/u/300x300/...
4,2020-12-05 22:11:20,,Rittaka,JAMES MURRAY & Mike Lazarev,Suññatā,https://lastfm.freetls.fastly.net/i/u/300x300/...
...,...,...,...,...,...,...
5181,2020-01-01 20:39:33,,Estante,Edu Comelles,aquí llega el invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...
5182,2020-01-01 20:33:02,,Deshielo,Edu Comelles,aquí llega el invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...
5183,2020-01-01 20:30:11,,Todos Quietos,Edu Comelles,aquí llega el invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...
5184,2020-01-01 20:23:30,,aquí llega el invierno,Edu Comelles,aquí llega el invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...


## Count the Albums

Now we can get the counts per artist and album.

In [285]:
counts = (
    df.groupby(['artist', 'album', 'image'])
        .agg(count=('timestamp', 'count'))
        .sort_values('count', ascending=False)
)
counts

artist,album,image,count
Monster Movie,Last Night Something Happened (Expanded),https://lastfm.freetls.fastly.net/i/u/300x300/4087724b9f08f744234d1c8c5caf5627.jpg,78
Seabuckthorn,Through A Vulnerable Occur,https://lastfm.freetls.fastly.net/i/u/300x300/7bea369e5f6851fbb4d7249bded305ac.jpg,68
Federico Durand,Jardín de invierno,https://lastfm.freetls.fastly.net/i/u/300x300/8fc6a8114d30b042937106cb5817b81e.jpg,59
David Newlyn,Apparitions I and II,https://lastfm.freetls.fastly.net/i/u/300x300/2a96cbd8b46e442fc41c2b86b821562f.png,59
Warmth,Life,https://lastfm.freetls.fastly.net/i/u/300x300/023734f04bea5da47946ed34dddd42b8.jpg,54
...,...,...,...
Eluder,The Most Beautiful Blue,https://lastfm.freetls.fastly.net/i/u/300x300/224881e4fa6a4103a510daccff391c82.jpg,1
Elskavon,Movements In Season,https://lastfm.freetls.fastly.net/i/u/300x300/57bb8f75aba14b78cb572956e38646c0.jpg,1
Elephant Stone,,,1
Elegi,Gap In Time (charity compilation),https://lastfm.freetls.fastly.net/i/u/300x300/f2568e2f9f25220f665e1e8367e2cd88.jpg,1


Unfortunately the [user.getRecentTracks](https://www.last.fm/api/show/user.getRecentTracks) API call doesn't tell us what year the album is from, so we need to go through these and add a year. And while the documentation for the [album.getInfo](https://www.last.fm/api/show/album.getInfo) API endpoint indicates that it returns `releasedate`, I didn't find this in either the XML or JSON response...

In [241]:
def get_year(artist, album):
    params = {
        "method": "album.getInfo",
        "artist": artist,
        "album": album,
        "format": "json",
        "api_key": LASTFM_KEY
    }
    response = requests.get('http://ws.audioscrobbler.com/2.0/', params=params)
    return response.json()

get_year('Seabuckthorn', 'Through A Vulnerable Occur')    

{'album': {'name': 'Through A Vulnerable Occur',
  'artist': 'Seabuckthorn',
  'url': 'https://www.last.fm/music/Seabuckthorn/Through+A+Vulnerable+Occur',
  'image': [{'#text': 'https://lastfm.freetls.fastly.net/i/u/34s/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': 'small'},
   {'#text': 'https://lastfm.freetls.fastly.net/i/u/64s/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': 'medium'},
   {'#text': 'https://lastfm.freetls.fastly.net/i/u/174s/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': 'large'},
   {'#text': 'https://lastfm.freetls.fastly.net/i/u/300x300/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': 'extralarge'},
   {'#text': 'https://lastfm.freetls.fastly.net/i/u/300x300/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': 'mega'},
   {'#text': 'https://lastfm.freetls.fastly.net/i/u/300x300/7bea369e5f6851fbb4d7249bded305ac.png',
    'size': ''}],
  'listeners': '307',
  'playcount': '3729',
  'tracks': {'track': [{'name': 'Toward The Warmth',
     'url': 'https://w

## Discogs

Instead, let's try searching for the album on the [Discogs API](https://www.discogs.com/developers/) using their [search](https://www.discogs.com/developers/#page:database,header:database-search) endpoint.

In [242]:
def get_album(artist, album):
    params = {
        "artist": artist,
        "release_title": album,
        "token": os.environ.get('DISCOGS_KEY')
    }
    response = requests.get('https://api.discogs.com/database/search', params=params)
    if response.status_code != 200:
        raise Exception("Unexpected response from Discogs: {}".format(response.status_code))
    results = response.json()['results']
    if len(results) > 0:
        return results[0]
    else:
        print('no match for {} - {}'.format(artist, album))
        return None

get_album('Seabuckthorn', 'Through A Vulnerable Occur')
    

{'country': 'France',
 'year': '2020',
 'format': ['CD', 'Album', 'Limited Edition', 'Numbered'],
 'label': ['IIKKI', 'wmfono'],
 'type': 'release',
 'genre': ['Electronic', 'Classical'],
 'style': ['Ambient', 'Classical', 'Contemporary', 'Experimental'],
 'id': 15140844,
 'barcode': ['3770007319210', '1409820702', 'IFPI LT57', 'IFPI Z953'],
 'user_data': {'in_wantlist': False, 'in_collection': False},
 'master_id': 1720606,
 'master_url': 'https://api.discogs.com/masters/1720606',
 'uri': '/Seabuckthorn-Through-A-Vulnerable-Occur/release/15140844',
 'catno': 'IIKKI011',
 'title': 'Seabuckthorn - Through A Vulnerable Occur',
 'thumb': 'https://img.discogs.com/iCRFnlO7ukSXb4CIhX33hB2pSxA=/fit-in/150x150/filters:strip_icc():format(jpeg):mode_rgb():quality(40)/discogs-images/R-15140844-1587215757-8424.jpeg.jpg',
 'cover_image': 'https://img.discogs.com/lciGRcdUxsYVQxpzZza6ifPz-lU=/fit-in/600x600/filters:strip_icc():format(jpeg):mode_rgb():quality(90)/discogs-images/R-15140844-1587215757-8

Nice, there's a `year` to use! So we can create a little function to get the year for an album.

In [243]:
def get_year(artist, album):
    album = get_album(artist, album)
    if album and album.get('year'):
        return int(album.get('year'))
    else:
        return None
    
get_year('Seabuckthorn', 'Through A Vulnerable Occur')


2020

One little wrinkle is that if the album is a re-release or remastered version and has that information in parentheses, this throws off the Discogs search. For example:

In [244]:
get_album("New Order", "Brotherhood (Collector's Edition)")

no match for New Order - Brotherhood (Collector's Edition)


vs

In [245]:
get_album("New Order", "Brotherhood")

{'country': 'UK',
 'year': '1986',
 'format': ['Vinyl', 'LP', 'Album'],
 'label': ['Factory',
  'Jam Studios',
  'Windmill Lane Studios',
  'Amazon Studios',
  'The Town House',
  'Be Music',
  'Warner Bros. Music',
  'DFI'],
 'type': 'master',
 'genre': ['Electronic', 'Rock'],
 'style': ['Alternative Rock', 'Electro'],
 'id': 3699,
 'barcode': ['FACT 150 A1 SEE AN OLD SOLDIER RIGHT TOWNHOUSE     M̶P̶O̶       86-9   ⩓',
  'FACT 150 B1 MORE JUICE PLEASE TOWNHOUSE      M̶P̶O̶       86-9  ⩓',
  'FACT 150 A1X SEE AN OLD SOLDIER RIGHT TOWNHOUSE   M̶P̶O̶    86-9   ⩓',
  'FACT 150 B1 MORE JUICE PLEASE TOWNHOUSE   M̶P̶O̶    86-9   ⩓',
  'FACT 150 A1 SEE AN OLD SOLDIER RIGHT TOWNHOUSE   M̶P̶O̶    86-9   ⩓',
  'FACT 150 B1X MORE JUICE PLEASE TOWNHOUSE   M̶P̶O̶    86-9   ⩓',
  'FACT 150 A1X SEE AN OLD SOLDIER RIGHT TOWNHOUSE   M̶P̶O̶    86-9   ⩓',
  'FACT 150 B1X MORE JUICE PLEASE TOWNHOUSE   M̶P̶O̶    86-9   ⩓'],
 'user_data': {'in_wantlist': False, 'in_collection': False},
 'master_id': 3699,
 

The same is true for hyphenated titles like this:

In [246]:
get_album('R.E.M.', 'Murmur - Deluxe Edition')

no match for R.E.M. - Murmur - Deluxe Edition


vs

In [247]:
get_album('R.E.M.', 'Murmur')

{'country': 'US',
 'year': '1983',
 'format': ['Vinyl', 'LP', 'Album'],
 'label': ['I.R.S. Records',
  'I.R.S. Records',
  'I.R.S. Records',
  'A&M Records, Inc.',
  'A&M Records, Inc.',
  'I.R.S., Inc.',
  'I.R.S., Inc.',
  'Electrosound Group Midwest, Inc.',
  'Night Garden Music',
  'Unichappell Music, Inc.',
  'Reflection Sound Studios',
  'Sterling Sound'],
 'type': 'master',
 'genre': ['Rock'],
 'style': ['Indie Rock'],
 'id': 14515,
 'barcode': ['SP-070604-A',
  'SP-070604-B',
  'SP0 70604 A ES1 EMW',
  'SP0 70604-B-ES1 EMW',
  'SP0 70604-B-ES2 EMW',
  'STERLING',
  '(B)',
  'BMI'],
 'user_data': {'in_wantlist': False, 'in_collection': False},
 'master_id': 14515,
 'master_url': 'https://api.discogs.com/masters/14515',
 'uri': '/REM-Murmur/master/14515',
 'catno': 'SP 70604',
 'title': 'R.E.M. - Murmur',
 'thumb': 'https://discogs-images.imgix.net/R-414122-1459975774-1411.jpeg?auto=compress&blur=0&fit=max&fm=jpg&h=150&q=40&w=150&s=52b867c541b102b5c8bcf5accae025e0',
 'cover_image

So the `get_album()` function can be adjusted to strip out this extra information and search again when no results are found.

In [359]:
import re

def get_album(artist, album):
    params = {
        "artist": artist,
        "release_title": album,
        "token": os.environ['DISCOGS_KEY']

    }
    response = requests.get('https://api.discogs.com/database/search', params=params)
    if response.status_code != 200:
        raise Exception("Unexpected response from Discogs: {}".format(response.status_code))
    results = response.json()['results']
    if len(results) > 0:
        return results[0]
    
    norm_album = re.sub(r' \(.+\)$', '', album)
    norm_album = re.sub(r' -.+$', '', norm_album)
        
    if norm_album != album:
        return get_album(artist, norm_album)
    else:
        print('No results for {} - {}'.format(artist, album))
        return None

In [360]:
get_year('R.E.M.', 'Murmur - Deluxe Edition')

1983

In [361]:
get_year("New Order", "Brotherhood (Collector's Edition)")

1986
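
To see what the two substitutions do on their own, here is a small standalone sketch. The helper name `normalize_title()` is hypothetical and only wraps the same two regular expressions so they can be tried without calling Discogs.

```python
import re

def normalize_title(album):
    """Strip a trailing parenthetical, then anything from ' -' onwards."""
    album = re.sub(r' \(.+\)$', '', album)
    album = re.sub(r' -.+$', '', album)
    return album

for title in [
    "Brotherhood (Collector's Edition)",
    "Murmur - Deluxe Edition",
    "Through A Vulnerable Occur",
    "Below OST - Volume II",
]:
    print(repr(title), '->', repr(normalize_title(title)))
```

The last title shows the trade-off: a hyphen that is really part of the name gets cut too, which is a good reason to only fall back to the shortened title after the full one has failed to match.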

Now we can look up all of them! But another little wrinkle is that Discogs [rate limits](https://www.discogs.com/developers/#page:home,header:home-rate-limiting) requests to its API. So we should build in a bit of logic to look for the `X-Discogs-Ratelimit` header in the response and sleep the appropriate amount of time.

In [357]:
import time

def get_album(artist, album):
    params = {
        "artist": artist,
        "release_title": album,
        "token": os.environ['DISCOGS_KEY']

    }
    response = requests.get('https://api.discogs.com/database/search', params=params)
    
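    # the header gives the allowed requests per minute; fall back to 60
    # (an assumed default) if it is missing so int() doesn't get None
    req_per_minute = int(response.headers.get('X-Discogs-Ratelimit', 60))
    time.sleep(60 / req_per_minute)
    
    if response.status_code != 200:
        raise Exception("Unexpected response from Discogs: {}".format(response.status_code))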
    results = response.json()['results']
    for result in results:
        if result.get('year'):
            return result
    
    norm_album = re.sub(r' \(.+\)$', '', album)
    norm_album = re.sub(r' -.+$', '', norm_album)
        
    if norm_album != album:
        return get_album(artist, norm_album)
    else:
        return None

In [358]:
get_year('Seabuckthorn', 'Through A Vulnerable Occur')

2020
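
The delay calculation can also be pulled out into a small function that is easy to try without making any requests. This is only a sketch: `seconds_to_wait()` is a hypothetical helper, the default of 60 requests per minute is an assumption, and waiting a full minute when `X-Discogs-Ratelimit-Remaining` reaches zero is a conservative guess rather than documented behaviour. It takes a plain dict here, whereas the headers on a `requests` response are case-insensitive.

```python
def seconds_to_wait(headers, default_per_minute=60):
    """Work out a polite delay from Discogs-style rate limit headers."""
    per_minute = int(headers.get('X-Discogs-Ratelimit', default_per_minute))
    remaining = headers.get('X-Discogs-Ratelimit-Remaining')
    # nothing left in this window: wait out a whole minute (assumption)
    if remaining is not None and int(remaining) == 0:
        return 60.0
    # otherwise spread requests evenly across the minute
    return 60 / per_minute

examples = [
    {'X-Discogs-Ratelimit': '60', 'X-Discogs-Ratelimit-Remaining': '42'},
    {'X-Discogs-Ratelimit': '25'},
    {'X-Discogs-Ratelimit': '60', 'X-Discogs-Ratelimit-Remaining': '0'},
    {},
]
for headers in examples:
    print(headers, '->', seconds_to_wait(headers))
```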

Now we can apply the `get_year` function to every row in the `counts` dataframe. It's going to take a bit of time since there are about 1300 rows and it will do 1 request per second. So it'll be a little over 20 minutes.

First let's reset the index from being (artist, album, image) to just being a number so we can access the values easily.

In [286]:
counts = counts.reset_index()
counts

Unnamed: 0,artist,album,image,count
0,Monster Movie,Last Night Something Happened (Expanded),https://lastfm.freetls.fastly.net/i/u/300x300/...,78
1,Seabuckthorn,Through A Vulnerable Occur,https://lastfm.freetls.fastly.net/i/u/300x300/...,68
2,Federico Durand,Jardín de invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...,59
3,David Newlyn,Apparitions I and II,https://lastfm.freetls.fastly.net/i/u/300x300/...,59
4,Warmth,Life,https://lastfm.freetls.fastly.net/i/u/300x300/...,54
...,...,...,...,...
1309,Eluder,The Most Beautiful Blue,https://lastfm.freetls.fastly.net/i/u/300x300/...,1
1310,Elskavon,Movements In Season,https://lastfm.freetls.fastly.net/i/u/300x300/...,1
1311,Elephant Stone,,,1
1312,Elegi,Gap In Time (charity compilation),https://lastfm.freetls.fastly.net/i/u/300x300/...,1


In [287]:
counts['year'] = counts.apply(lambda r: get_year(r['artist'], r['album']), axis=1)

In [288]:
counts

Unnamed: 0,artist,album,image,count,year
0,Monster Movie,Last Night Something Happened (Expanded),https://lastfm.freetls.fastly.net/i/u/300x300/...,78,2019.0
1,Seabuckthorn,Through A Vulnerable Occur,https://lastfm.freetls.fastly.net/i/u/300x300/...,68,2020.0
2,Federico Durand,Jardín de invierno,https://lastfm.freetls.fastly.net/i/u/300x300/...,59,2016.0
3,David Newlyn,Apparitions I and II,https://lastfm.freetls.fastly.net/i/u/300x300/...,59,2020.0
4,Warmth,Life,https://lastfm.freetls.fastly.net/i/u/300x300/...,54,2020.0
...,...,...,...,...,...
1309,Eluder,The Most Beautiful Blue,https://lastfm.freetls.fastly.net/i/u/300x300/...,1,2008.0
1310,Elskavon,Movements In Season,https://lastfm.freetls.fastly.net/i/u/300x300/...,1,2012.0
1311,Elephant Stone,,,1,2013.0
1312,Elegi,Gap In Time (charity compilation),https://lastfm.freetls.fastly.net/i/u/300x300/...,1,


In [219]:
counts.sort_values(['year', 'timestamp'], ascending=False)

Unnamed: 0,level_0,index,artist,album,image,timestamp,track_id,track_name,year
1,1,1,Seabuckthorn,Through A Vulnerable Occur,https://lastfm.freetls.fastly.net/i/u/300x300/...,68,68,68,2020.0
2,2,2,David Newlyn,Apparitions I and II,https://lastfm.freetls.fastly.net/i/u/300x300/...,59,59,59,2020.0
4,4,4,Warmth,Life,https://lastfm.freetls.fastly.net/i/u/300x300/...,54,54,54,2020.0
5,5,5,R Beny,natural fiction,https://lastfm.freetls.fastly.net/i/u/300x300/...,51,51,51,2020.0
9,9,9,Hazel English,Wake Up!,https://lastfm.freetls.fastly.net/i/u/300x300/...,44,44,44,2020.0
...,...,...,...,...,...,...,...,...,...
1282,1282,1282,R Beny,Solstice,https://lastfm.freetls.fastly.net/i/u/300x300/...,1,1,1,
1297,1297,1297,Emanuele Errante,Sleeplaboratory2.0,https://lastfm.freetls.fastly.net/i/u/300x300/...,1,1,1,
1299,1299,1299,Pleq & Hakobune,Home Normal,https://lastfm.freetls.fastly.net/i/u/300x300/...,1,1,1,
1307,1307,1307,Pretenders,Pretenders [Reissue],https://lastfm.freetls.fastly.net/i/u/300x300/...,1,1,1,


We can now see what my top 25 albums were for 2020.

In [290]:
counts.sort_values(['year', 'count'], ascending=False)[0:25]

Unnamed: 0,artist,album,image,count,year
1,Seabuckthorn,Through A Vulnerable Occur,https://lastfm.freetls.fastly.net/i/u/300x300/...,68,2020.0
3,David Newlyn,Apparitions I and II,https://lastfm.freetls.fastly.net/i/u/300x300/...,59,2020.0
4,Warmth,Life,https://lastfm.freetls.fastly.net/i/u/300x300/...,54,2020.0
5,R Beny,natural fiction,https://lastfm.freetls.fastly.net/i/u/300x300/...,51,2020.0
10,Hazel English,Wake Up!,https://lastfm.freetls.fastly.net/i/u/300x300/...,44,2020.0
11,Halftribe,Archipelago,https://lastfm.freetls.fastly.net/i/u/300x300/...,42,2020.0
14,Norken & Nyquist,Synchronized Minds,https://lastfm.freetls.fastly.net/i/u/300x300/...,37,2020.0
24,Andrew Weathers,Dreams and Visions from the Llano Estacado,https://lastfm.freetls.fastly.net/i/u/300x300/...,27,2020.0
25,koji itoyama,I Know,https://lastfm.freetls.fastly.net/i/u/300x300/...,27,2020.0
27,Taylor Swift,folklore (deluxe version),https://lastfm.freetls.fastly.net/i/u/300x300/...,26,2020.0


## Bandcamp

It is easy to see which artist/album combinations failed to return any results at Discogs:

In [291]:
missing = counts[counts['year'].isnull()]
len(missing)

295

That's quite a few. The question is, which ones have I listened to at least 13 times (which is the cutoff for the 25th album)? I could go through them by hand but it might be easier to go look over at Bandcamp where I listen to some things. This is where knowing the provenance of the listening data helps a lot.

Unfortunately the [Bandcamp API](https://bandcamp.com/developer) is only for publishers, and you have to apply for a key. I doubt this would qualify so I'm going to try to write a little function that scrapes the year. 

In [319]:
import requests_html

http = requests_html.HTMLSession()

def normal(s):
    return re.sub(r'[^\w]', '', s.lower())

def get_year_bandcamp(artist, album):
    params = {
        "q": album
    }
    resp = http.get('https://bandcamp.com/search', params=params)
    for info in resp.html.find('.album .result-info'):
        link = info.find('.heading a', first=True)
        album_found = link.text
        url = link.attrs['href']
        # strip('by ') would remove any of the characters b, y and space from both ends
        artist_found = re.sub(r'^by\s+', '', info.find('.subhead', first=True).text.strip())
        if normal(album_found) == normal(album) and normal(artist_found) == normal(artist):
            resp = http.get(url)
            info = resp.html.find('.tralbum-credits', first=True)
            if info:
                m = re.match(r'released .+? (\d\d\d\d)\n', info.text)
                if m:
                    return int(m.group(1))
                
    return None

get_year_bandcamp('Jim Guthrie', 'Below OST - Volume II')

2020
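
The string handling in the scraper can be tried out on its own. Below is a minimal sketch under a few assumptions: `strip_by()` and `release_year()` are hypothetical helpers, and the sample credits text is made up to resemble a `released <date>` line rather than copied from Bandcamp. It also shows something to keep in mind when trimming a prefix like `by `: `str.strip()` takes a set of characters, not a prefix.

```python
import re

def normal(s):
    """Lowercase and drop everything that isn't a word character."""
    return re.sub(r'[^\w]', '', s.lower())

def strip_by(subhead):
    """Remove a leading 'by ' without touching the rest of the name."""
    return re.sub(r'^by\s+', '', subhead.strip())

def release_year(credits):
    """Pull the year out of text shaped like 'released March 20, 2020'."""
    m = re.match(r'released .+? (\d{4})\b', credits)
    return int(m.group(1)) if m else None

# punctuation and case differences disappear after normalizing
print(normal('Below OST - Volume II') == normal('below ost: volume ii'))

# strip() with an argument removes characters, so it can eat into the name
print(repr('by Toby'.strip('by ')), 'vs', repr(strip_by('by Toby')))

# made-up sample text in the assumed shape
print(release_year('released March 20, 2020\nall rights reserved'))
```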

Now I just need to go through the missing years and look them up, but just for the ones that I've listened to at least 13 times since that is the lower bound of my top 25.

In [334]:
missing = counts[counts['year'].isnull() & (counts['count'] >= 13)]
for i, row in missing.iterrows():
    counts.loc[counts.index == i, 'year'] = get_year_bandcamp(row['artist'], row['album'])


Ok, how many are missing now?

In [335]:
counts[counts['year'].isnull() & (counts['count'] >= 13)]

Unnamed: 0,artist,album,image,count,year
8,Josh Alexander,Hiraeth,https://lastfm.freetls.fastly.net/i/u/300x300/...,44,
16,Father John Misty,Fear Fun,https://lastfm.freetls.fastly.net/i/u/300x300/...,33,
19,The Guy Who Sings Songs About Cities & Towns,"Maryland Songs for Maryland People, Md",https://lastfm.freetls.fastly.net/i/u/300x300/...,30,
20,Hirotaka Shirotsubaki,fragment 2011-2017,https://lastfm.freetls.fastly.net/i/u/300x300/...,30,
59,Wilco,Summerteeth,https://lastfm.freetls.fastly.net/i/u/300x300/...,18,
83,Deradoorian,Cosmic Garden EP,https://lastfm.freetls.fastly.net/i/u/300x300/...,14,


## HTML

Here is a bit of code to generate some HTML to paste into my blog.

In [356]:
from urllib.parse import quote

top_25 = counts.sort_values(['year', 'count'], ascending=False)[0:25]
top_25 = top_25.sort_values('count')

count = 25
for i, row in top_25.iterrows():
    album = {
        'artist': row['artist'],
        'artist_url': 'https://www.last.fm/music/' + quote(row['artist']),
        'album': row['album'],
        'album_url': 'https://www.last.fm/music/' + quote(row['artist']) + '/' + quote(row['album']),
        'image': row['image'], 
        'count': row['count']
    }
        
    print("""
<div class="album">
  <h2>{1}</h2>
  <a href="{0[album_url]}"><img title="{0[count]} plays" src="{0[image]}"></a>
  <a href="{0[artist_url]}">{0[artist]}</a> / 
  <a href="{0[album_url]}">{0[album]}</a>
</div>
    """.format(album, count))
    count -= 1



<div class="album">
  <h2>25</h2>
  <a href="https://www.last.fm/music/Perfume%20Genius/Set%20My%20Heart%20On%20Fire%20Immediately"><img title="17 plays" src="https://lastfm.freetls.fastly.net/i/u/300x300/ca79d5a2dd935979e8c849c159bbdb13.jpg"></a>
  <a href="https://www.last.fm/music/Perfume%20Genius">Perfume Genius</a> / 
  <a href="https://www.last.fm/music/Perfume%20Genius/Set%20My%20Heart%20On%20Fire%20Immediately">Set My Heart On Fire Immediately</a>
</div>
    

<div class="album">
  <h2>24</h2>
  <a href="https://www.last.fm/music/Roger%20Eno/Mixing%20Colours"><img title="18 plays" src="https://lastfm.freetls.fastly.net/i/u/300x300/89ec402783d88c631c6858ff4295fb9b.jpg"></a>
  <a href="https://www.last.fm/music/Roger%20Eno">Roger Eno</a> / 
  <a href="https://www.last.fm/music/Roger%20Eno/Mixing%20Colours">Mixing Colours</a>
</div>
    

<div class="album">
  <h2>23</h2>
  <a href="https://www.last.fm/music/Blochemy/nebe"><img title="18 plays" src="https://lastfm.freetls.fastly.