<a href="https://colab.research.google.com/github/JonathanYon/BINGO-Board/blob/main/6-top-songs/albums-lab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Albums and Songs Lab

### Introduction

In this lesson, we'll use the skills we have learned over the past several lessons to answer questions about the top songs, artists and albums over the past fifty years.

### Working with Albums

Let's start by working with data on the top 500 albums according to Rolling Stone magazine.

In [None]:
import pandas as pd
url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/data.csv"
df = pd.read_csv(url)
albums = df.to_dict('records')

In [None]:
albums[:2]

[{'album': "Sgt. Pepper's Lonely Hearts Club Band",
  'artist': 'The Beatles',
  'genre': 'Rock',
  'number': 1,
  'subgenre': 'Rock & Roll, Psychedelic Rock',
  'year': 1967},
 {'album': 'Pet Sounds',
  'artist': 'The Beach Boys',
  'genre': 'Rock',
  'number': 2,
  'subgenre': 'Pop Rock, Psychedelic Rock',
  'year': 1966}]

In [None]:
len(albums)

478

In [None]:
def all_albums(albums):
  return [album['album'] for album in albums]

> Note that `len(albums)` is 478, so this dataset does not include all 500 albums.

In [None]:
albums[0]

{'album': "Sgt. Pepper's Lonely Hearts Club Band",
 'artist': 'The Beatles',
 'genre': 'Rock',
 'number': 1,
 'subgenre': 'Rock & Roll, Psychedelic Rock',
 'year': 1967}

In [None]:
all_albums(albums[3:6])

['Highway 61 Revisited', 'Rubber Soul', "What's Going On"]

In [None]:
def all_artists(albums):
  return list(set([album['artist'] for album in albums]))

In [None]:
all_artists(albums[:8])

['Bob Dylan',
 'The Rolling Stones',
 'The Clash',
 'The Beatles',
 'Marvin Gaye',
 'The Beach Boys']
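
Because `all_artists` builds its result from a `set`, the order of the returned names is not guaranteed to follow the ranking order. If that order matters, one possible variant is sketched below. It is not part of the original lab: `all_artists_ordered` and `sample_albums` are hypothetical names used only for illustration. It relies on `dict.fromkeys`, which keeps the first occurrence of each key in insertion order.

```python
# Hypothetical sample data in the same shape as the lab's album dictionaries.
sample_albums = [
  {'album': "Sgt. Pepper's Lonely Hearts Club Band", 'artist': 'The Beatles'},
  {'album': 'Pet Sounds', 'artist': 'The Beach Boys'},
  {'album': 'Revolver', 'artist': 'The Beatles'},
]

def all_artists_ordered(albums):
  # Dictionary keys are unique and keep insertion order (Python 3.7+),
  # so repeated artists are dropped while the first-seen order survives.
  return list(dict.fromkeys(album['artist'] for album in albums))

unique_artists = all_artists_ordered(sample_albums)
print(unique_artists)
```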

Let's write some functions to help us better explore the data.

In [None]:
albums[0]

{'album': "Sgt. Pepper's Lonely Hearts Club Band",
 'artist': 'The Beatles',
 'genre': 'Rock',
 'number': 1,
 'subgenre': 'Rock & Roll, Psychedelic Rock',
 'year': 1967}

In [None]:
def find_by_name(albums, album_name):
  for album in albums:
    if album['album'] == album_name:
      return album
  # No album matched, so fall through the loop and return None explicitly.
  return None


In [None]:
find_by_name(albums, 'Rubber Soul')

In [None]:
def find_by_ranks(albums, begin_rank=1, end_rank=500):
  ranked_albums = []
  for album in albums:
    if album['number'] >= begin_rank and album['number'] <= end_rank:
      ranked_albums.append(album)
  return ranked_albums


In [None]:
find_by_ranks(albums, 1, 2)

[{'album': "Sgt. Pepper's Lonely Hearts Club Band",
  'artist': 'The Beatles',
  'genre': 'Rock',
  'number': 1,
  'subgenre': 'Rock & Roll, Psychedelic Rock',
  'year': 1967},
 {'album': 'Pet Sounds',
  'artist': 'The Beach Boys',
  'genre': 'Rock',
  'number': 2,
  'subgenre': 'Pop Rock, Psychedelic Rock',
  'year': 1966}]

In [None]:
def find_by_years(begin_year=1900, end_year=2022):
  # Note: unlike the functions above, this one reads the global `albums` list.
  albums_in_range = []
  for album in albums:
    if begin_year <= album['year'] <= end_year:
      albums_in_range.append(album)
  return albums_in_range


In [None]:
find_by_years(1999, 2001)

[{'album': 'The Anthology',
  'artist': 'Muddy Waters',
  'genre': 'Folk, World, & Country',
  'number': 38,
  'subgenre': 'Folk',
  'year': 2001},
 {'album': 'Kid A',
  'artist': 'Radiohead',
  'genre': 'Electronic, Rock',
  'number': 67,
  'subgenre': 'Alternative Rock, IDM, Experimental',
  'year': 2000},
 {'album': 'The Definitive Collection',
  'artist': 'ABBA',
  'genre': 'Electronic, Pop',
  'number': 179,
  'subgenre': 'Europop, Synth-pop, Disco',
  'year': 2001},
 {'album': 'Is This It',
  'artist': 'The Strokes',
  'genre': 'Rock',
  'number': 199,
  'subgenre': 'Indie Rock',
  'year': 2001},
 {'album': 'The Neil Diamond Collection',
  'artist': 'Neil Diamond',
  'genre': 'Rock, Pop',
  'number': 224,
  'subgenre': 'Soft Rock, Ballad',
  'year': 1999},
 {'album': 'The Ultimate Collection',
  'artist': 'Patsy Cline',
  'genre': 'Folk, World, & Country',
  'number': 235,
  'subgenre': 'None',
  'year': 2000},
 {'album': 'The Marshall Mathers LP',
  'artist': 'Eminem',
  'genre'

* `all_albums` - Takes a list of albums and returns a list of album names.

* `all_artists` - Takes a list of albums and returns a list of all artists (each element is a string), with no artist repeated.

* `find_by_name` - Takes the list of albums and an `album_name`. Returns the dictionary for the matching album, or `None` if no album is found.

* `find_by_ranks` - Takes the list of albums plus `begin_rank` and `end_rank`. It can also be called with only `begin_rank` or only `end_rank`. If neither is provided, the entire list of albums is returned.

* `find_by_years` - Takes `begin_year` and `end_year` as arguments and returns a list of dictionaries for albums released between those years (inclusive). It can also be called with only `begin_year` or only `end_year`.
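
The default arguments described above can be exercised by passing only one bound as a keyword argument. The following is a minimal sketch on made-up data: `sample_albums` is hypothetical, and the function body is a compact restatement of this lab's `find_by_ranks` rather than a replacement for it.

```python
# Hypothetical records with only the keys this example needs.
sample_albums = [
  {'album': 'Album A', 'number': 1, 'year': 1967},
  {'album': 'Album B', 'number': 2, 'year': 1966},
  {'album': 'Album C', 'number': 3, 'year': 1966},
]

def find_by_ranks(albums, begin_rank=1, end_rank=500):
  # Keep albums whose rank falls inside the inclusive range.
  return [album for album in albums
          if begin_rank <= album['number'] <= end_rank]

top_two = find_by_ranks(sample_albums, end_rank=2)     # only the upper bound
from_two = find_by_ranks(sample_albums, begin_rank=2)  # only the lower bound
everything = find_by_ranks(sample_albums)              # both defaults

print([album['album'] for album in top_two])
print([album['album'] for album in from_two])
print(len(everything))
```

Passing the bound by keyword (`end_rank=2`) matters here: a bare positional `2` would be read as `begin_rank`.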

### Working with Songs

Next, let's load up data related to songs, and data that connects albums and songs.

In [1]:
import pandas as pd
songs_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/top-500-songs.txt"
songs_df = pd.read_csv(songs_url, sep='\t', header = None, names = ['rank', 'song', 'artist', 'year'])
songs = songs_df.to_dict('records')

track_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/track_data.json"
albums_and_tracks = pd.read_json(track_url)
albums_tracks = albums_and_tracks.to_dict('records')

In [2]:
songs[:5]

[{'artist': 'Bob Dylan',
  'rank': 1,
  'song': 'Like a Rolling Stone',
  'year': 1965},
 {'artist': 'The Rolling Stones',
  'rank': 2,
  'song': 'Satisfaction',
  'year': 1965},
 {'artist': 'John Lennon', 'rank': 3, 'song': 'Imagine', 'year': 1971},
 {'artist': 'Marvin Gaye', 'rank': 4, 'song': "What's Going On", 'year': 1971},
 {'artist': 'Aretha Franklin', 'rank': 5, 'song': 'Respect', 'year': 1967}]

In [3]:

for song in songs:
  if song['song'] == 'Like a Rolling Stone':
    print(song)

{'rank': 1, 'song': 'Like a Rolling Stone', 'artist': 'Bob Dylan', 'year': 1965}


In [40]:
albums_tracks[3]

{'album': 'Highway 61 Revisited',
 'artist': 'Bob Dylan',
 'tracks': ['Like a Rolling Stone',
  'Tombstone Blues',
  'It Takes a Lot to Laugh, It Takes a Train to Cry',
  'From a Buick 6',
  'Ballad of a Thin Man',
  'Queen Jane Approximately',
  'Highway 61 Revisited',
  "Just Like Tom Thumb's Blues",
  'Desolation Row']}

In [30]:
albums_tracks[3]['tracks']

['Like a Rolling Stone',
 'Tombstone Blues',
 'It Takes a Lot to Laugh, It Takes a Train to Cry',
 'From a Buick 6',
 'Ballad of a Thin Man',
 'Queen Jane Approximately',
 'Highway 61 Revisited',
 "Just Like Tom Thumb's Blues",
 'Desolation Row']

In [32]:
for track in albums_tracks[3]['tracks']:
  if track == 'Like a Rolling Stone':
    print(track)

Like a Rolling Stone


In [54]:
best_albums = []
counter = 0
for album in albums_tracks:
  for track in album['tracks']:
    # Start with a single known title before matching against every top song.
    if track == 'Like a Rolling Stone':
      counter += 1
      print(album['album'], counter)
      album_num = {'album': album['album'], 'num_on_500': counter}
      best_albums.append(album_num)
best_albums

Highway 61 Revisited 1


[{'album': 'Highway 61 Revisited', 'num_on_500': 1}]

In [83]:
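top_500_albums = []
counter = 0
for best_album in best_albums:
  # The earlier check compared each album name with itself (always True), so it is dropped.
  counter += 1
  # Store a dictionary rather than a set so the name and count stay paired.
  top_500_albums.append({'album': best_album['album'], 'num_on_500': counter})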


In [None]:
len(songs)

500

In [98]:
def album_most_top_songs():
  # Collect an album name once for every track that matches a top-500 song title.
  best_albums = []
  for album in albums_tracks:
    for track in album['tracks']:
      for song in songs:
        if track == song['song']:
          best_albums.append(album['album'])

  # Pair each album with its number of matches, then return the highest count.
  final_album = [[x, best_albums.count(x)] for x in set(best_albums)]
  return sorted(final_album, key=lambda x: x[1], reverse=True)[0]

* Write functions that perform the following: 

In [99]:
album_most_top_songs()

['Elvis Presley', 8]


* `album_most_top_songs` - Returns the name of the artist and album that has the most songs featured on the top 500 songs list.

* `top_ten_albums_by_songs` - Returns a dictionary of the 10 albums with the most songs on the top 500 songs list. The album names are the keys, and the values are the number of each album's songs that appear on the list.
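
One possible way to approach `top_ten_albums_by_songs` is sketched below using `collections.Counter`. This is an illustration under stated assumptions, not the lab's reference solution. The function takes its data as parameters, which the description above does not specify. `sample_albums_tracks` and `sample_songs` are made-up stand-ins for `albums_tracks` and `songs`. A track is matched to a song by exact title only, so different songs that share a title would be conflated.

```python
from collections import Counter

# Hypothetical stand-ins for the lab's `albums_tracks` and `songs` lists.
sample_albums_tracks = [
  {'album': 'Album A', 'artist': 'Artist 1', 'tracks': ['Song 1', 'Song 2', 'Filler']},
  {'album': 'Album B', 'artist': 'Artist 2', 'tracks': ['Song 3', 'Other']},
  {'album': 'Album C', 'artist': 'Artist 3', 'tracks': ['Unranked']},
]
sample_songs = [
  {'rank': 1, 'song': 'Song 1', 'artist': 'Artist 1', 'year': 1965},
  {'rank': 2, 'song': 'Song 2', 'artist': 'Artist 1', 'year': 1965},
  {'rank': 3, 'song': 'Song 3', 'artist': 'Artist 2', 'year': 1971},
]

def top_ten_albums_by_songs(albums_tracks, songs):
  # A set makes each "is this track a top song?" check a constant-time lookup.
  top_titles = {song['song'] for song in songs}
  counts = Counter()
  for album in albums_tracks:
    matches = sum(1 for track in album['tracks'] if track in top_titles)
    if matches:
      counts[album['album']] += matches
  # most_common(10) returns (album, count) pairs sorted by count, highest first.
  return dict(counts.most_common(10))

result = top_ten_albums_by_songs(sample_albums_tracks, sample_songs)
print(result)
```

Albums with no matching tracks are left out of the result, and each matching track is counted once even if its title appears more than once in the songs list.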