# Albums and Songs Lab

### Introduction

In this lab, we'll use the skills from the past several lessons to answer questions about the top songs, artists, and albums of the past fifty years.

### Working with Albums

Let's start by working with data on the top 500 albums according to Rolling Stone magazine.

In [3]:
import pandas as pd
url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/data.csv"
df = pd.read_csv(url)
albums = df.to_dict('records')

In [4]:
# albums[:2]

In [5]:
len(albums)

478

> The list is billed as the top 500, but this dataset contains 478 albums.

Let's write some functions to help us better explore the data.

* `all_albums` - Takes a list of albums and returns a list of the album names.

In [8]:
def all_albums(albums):
    pass

In [None]:
all_albums(albums)[:4]
# ["Sgt. Pepper's Lonely Hearts Club Band",
 # 'Pet Sounds',
 # 'Revolver',
 # 'Highway 61 Revisited']

* `all_artists` - Takes a list of albums and returns a list of all artists (each element is a string), with no artist repeated.



In [9]:
def all_artists(albums):
    pass

In [None]:
sorted(all_artists(albums))[:4]

# ['A Tribe Called Quest', 'ABBA', 'AC/DC', 'Aerosmith']
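
One way to drop repeats is to collect the values in a set, which keeps only distinct items. The sketch below runs on a small hand-made list (`sample_albums` is made up for illustration and is not the lab data), and `unique_values` is a hypothetical helper name, so treat it as one possible approach rather than the intended solution.

```python
# Hand-made sample records; the real `albums` list uses the same keys.
sample_albums = [
    {'album': 'Revolver', 'artist': 'The Beatles'},
    {'album': 'Abbey Road', 'artist': 'The Beatles'},
    {'album': 'Pet Sounds', 'artist': 'The Beach Boys'},
]

def unique_values(records, key):
    """Return the distinct values stored under `key`, as a list."""
    # A set comprehension discards duplicates; list() converts it back.
    return list({record[key] for record in records})

print(sorted(unique_values(sample_albums, 'artist')))
# ['The Beach Boys', 'The Beatles']
```

A set does not preserve order, which is why the lab's own check wraps the result in `sorted`.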

* `find_by_name` - Takes one argument, `album_name`, and returns the dictionary for the matching album, or `None` if no album is found.


In [11]:
def find_by_name(album_name):
    pass

In [None]:
find_by_name('Pet Sounds')
# {'number': 2,
#  'year': 1966,
#  'album': 'Pet Sounds',
#  'artist': 'The Beach Boys',
#  'genre': 'Rock',
#  'subgenre': 'Pop Rock, Psychedelic Rock'}

find_by_name('foobar') is None
# True
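
A common pattern for "return the match, or `None` if there isn't one" is `next()` with a default value. The sketch below assumes a hand-made `sample_albums` list and a hypothetical helper `first_match` that takes the records explicitly. The lab's `find_by_name` takes only the album name, so this is an illustration of the pattern, not the required signature.

```python
# Hand-made sample records for illustration only.
sample_albums = [
    {'number': 2, 'album': 'Pet Sounds', 'artist': 'The Beach Boys'},
    {'number': 3, 'album': 'Revolver', 'artist': 'The Beatles'},
]

def first_match(records, key, value):
    """Return the first record whose `key` equals `value`, else None."""
    # next() returns its second argument when the generator yields nothing.
    return next((r for r in records if r[key] == value), None)

print(first_match(sample_albums, 'album', 'Pet Sounds')['artist'])  # The Beach Boys
print(first_match(sample_albums, 'album', 'foobar') is None)        # True
```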

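* `find_by_ranks` - Takes `begin_rank` and `end_rank` as arguments and returns a list of dictionaries for the albums ranked between them, inclusive. The function can also be called with just `begin_rank` or just `end_rank`; with only `begin_rank`, every album from that rank onward is returned. If no arguments are provided, the entire list of albums is returned.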



In [None]:
find_by_ranks(10, 12)
# [{'number': 10,
#   'year': 1968,
#   'album': 'The Beatles ("The White Album")',
#   'artist': 'The Beatles',
#   'genre': 'Rock',
#   'subgenre': 'Rock & Roll, Pop Rock, Psychedelic Rock, Experimental'},
#  {'number': 11,
#   'year': 1976,
#   'album': 'The Sun Sessions',
#   'artist': 'Elvis Presley',
#   'genre': 'Rock',
#   'subgenre': 'Rock & Roll'},
#  {'number': 12,
#   'year': 1959,
#   'album': 'Kind of Blue',
#   'artist': 'Miles Davis',
#   'genre': 'Jazz',
#   'subgenre': 'Modal'}]

In [None]:
find_by_ranks(498)

# [{'number': 498,
#   'year': 1989,
#   'album': 'The Stone Roses',
#   'artist': 'The Stone Roses',
#   'genre': 'Rock',
#   'subgenre': 'Indie Rock'},
#  {'number': 499,
#   'year': 1971,
#   'album': 'Live in Cook County Jail',
#   'artist': 'B.B. King',
#   'genre': 'Blues',
#   'subgenre': 'Electric Blues'},
#  {'number': 500,
#   'year': 1998,
#   'album': 'Aquemini',
#   'artist': 'OutKast',
#   'genre': 'Hip Hop',
#   'subgenre': 'Reggae, Gangsta, Soul, Conscious'}]

In [None]:
len(find_by_ranks())
# 478
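
Optional bounds like these are often handled with default arguments of `None`, where a missing bound simply means "no limit on that side". The sketch below is a minimal illustration under that assumption. `in_rank_range` and `sample_albums` are hypothetical names, and unlike the lab's function this helper receives the records as its first argument.

```python
# Hand-made records that carry only the 'number' (rank) key.
sample_albums = [{'number': n} for n in (1, 2, 3, 5, 8)]

def in_rank_range(records, begin_rank=None, end_rank=None):
    """Return records whose 'number' lies in [begin_rank, end_rank]."""
    return [
        r for r in records
        # A bound of None is treated as "no limit" on that side.
        if (begin_rank is None or r['number'] >= begin_rank)
        and (end_rank is None or r['number'] <= end_rank)
    ]

print([r['number'] for r in in_rank_range(sample_albums, 2, 5)])        # [2, 3, 5]
print([r['number'] for r in in_rank_range(sample_albums, 5)])           # [5, 8]
print([r['number'] for r in in_rank_range(sample_albums, end_rank=2)])  # [1, 2]
print(len(in_rank_range(sample_albums)))                                # 5
```

The same shape carries over to filtering by year: only the key being compared changes.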

* `find_by_years` - Takes `begin_year` and `end_year` as arguments and returns a list of dictionaries for the albums released between those years, inclusive. The function can also be called with just `begin_year` or just `end_year`. If no arguments are provided, the entire list of albums is returned.

In [None]:
find_by_years(2008, 2009)

# [{'number': 437,
#   'year': 2008,
#   'album': 'Tha Carter III',
#   'artist': 'Lil Wayne',
#   'genre': 'Hip Hop, Funk / Soul',
#   'subgenre': 'RnB/Swing, Screw, Pop Rap, Thug Rap'}]

In [None]:
find_by_years(2009)

# [{'number': 353,
#   'year': 2010,
#   'album': 'My Beautiful Dark Twisted Fantasy',
#   'artist': 'Kanye West',
#   'genre': 'Hip Hop',
#   'subgenre': 'None'},
#  {'number': 381,
#   'year': 2011,
#   'album': 'The Smile Sessions',
#   'artist': 'The Beach Boys',
#   'genre': 'Rock',
#   'subgenre': 'Pop Rock, Psychedelic Rock'}]

In [None]:
len(find_by_years())
# 478

### Working with Songs

Next, let's load up data related to songs, and data that connects albums and songs.

In [12]:
import pandas as pd
songs_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/top-500-songs.txt"
songs_df = pd.read_csv(songs_url, sep='\t', header = None, names = ['rank', 'song', 'artist', 'year'])
top_songs = songs_df.to_dict('records')

track_url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/6-top-songs/track_data.json"
albums_and_tracks = pd.read_json(track_url)
albums_tracks = albums_and_tracks.to_dict('records')

We can see that `top_songs` holds the top 500 songs.

In [13]:
len(top_songs)

500

In [6]:
top_songs[:5]

[{'rank': 1,
  'song': 'Like a Rolling Stone',
  'artist': 'Bob Dylan',
  'year': 1965},
 {'rank': 2,
  'song': 'Satisfaction',
  'artist': 'The Rolling Stones',
  'year': 1965},
 {'rank': 3, 'song': 'Imagine', 'artist': 'John Lennon', 'year': 1971},
 {'rank': 4, 'song': "What's Going On", 'artist': 'Marvin Gaye', 'year': 1971},
 {'rank': 5, 'song': 'Respect', 'artist': 'Aretha Franklin', 'year': 1967}]

And `albums_tracks` has information that connects each song (a track) to a specific album.

In [4]:
albums_tracks[0]

{'artist': 'The Beatles',
 'album': "Sgt. Pepper's Lonely Hearts Club Band",
 'tracks': ["Sgt. Pepper's Lonely Hearts Club Band - Remix",
  'With A Little Help From My Friends - Remix',
  'Lucy In The Sky With Diamonds - Remix',
  'Getting Better - Remix',
  'Fixing A Hole - Remix',
  "She's Leaving Home - Remix",
  'Being For The Benefit Of Mr. Kite! - Remix',
  'Within You Without You - Remix',
  "When I'm Sixty-Four - Remix",
  'Lovely Rita - Remix',
  'Good Morning Good Morning - Remix',
  "Sgt. Pepper's Lonely Hearts Club Band (Reprise) - Remix",
  'A Day In The Life - Remix',
  "Sgt. Pepper's Lonely Hearts Club Band - Take 9 And Speech",
  'With A Little Help From My Friends - Take 1 / False Start And Take 2 / Instrumental',
  'Lucy In The Sky With Diamonds - Take 1',
  'Getting Better - Take 1 / Instrumental And Speech At The End',
  'Fixing A Hole - Speech And Take 3',
  "She's Leaving Home - Take 1 / Instrumental",
  'Being For The Benefit Of Mr. Kite! - Take 4',
  'Within You

Write functions that perform the following:

* `top_ten_artists_by_songs` - Returns a dictionary of the 10 artists with the most songs in the top songs list.

* `album_most_top_songs` - Returns the names of the ten *albums* with the most songs on the top 500 songs list, along with the number of songs each album has on the list.
```
[('Elvis Presley', 8),
 ('The Sun Records Collection', 6),
 ('Greatest Hits', 6),
 ('The Great Twenty Eight', 5),
 ('Bringing It All Back Home', 4),
 ('Are You Experienced', 4),
 ('Portrait of a Legend 1951-1964', 4),
 ('Abbey Road', 3),
 ('Pet Sounds', 3),
 ('Wheels of Fire', 3)]
```
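
Both tasks come down to counting, and `collections.Counter` from the standard library is one convenient tool for that. The sketch below uses made-up placeholder data (`sample_songs` and `sample_tracks` mirror the keys of `top_songs` and `albums_tracks` but contain no real songs), so it only illustrates the counting pattern and is not the intended solution.

```python
from collections import Counter

# Placeholder data with the same keys as `top_songs` and `albums_tracks`.
sample_songs = [
    {'song': 'Song A', 'artist': 'Artist X'},
    {'song': 'Song B', 'artist': 'Artist Y'},
    {'song': 'Song C', 'artist': 'Artist X'},
    {'song': 'Song D', 'artist': 'Artist X'},
]
sample_tracks = [
    {'album': 'Album 1', 'tracks': ['Song A', 'Song C', 'Filler']},
    {'album': 'Album 2', 'tracks': ['Song B', 'Other']},
]

# Count songs per artist; most_common(n) returns (value, count) pairs,
# ordered from the highest count down.
artist_counts = Counter(song['artist'] for song in sample_songs)
print(dict(artist_counts.most_common(2)))
# {'Artist X': 3, 'Artist Y': 1}

# Count how many of each album's tracks appear among the top song titles.
top_titles = {song['song'] for song in sample_songs}
album_counts = Counter({
    entry['album']: sum(track in top_titles for track in entry['tracks'])
    for entry in sample_tracks
})
print(album_counts.most_common(2))
# [('Album 1', 2), ('Album 2', 1)]
```

This sketch matches titles exactly. The real track names can carry suffixes such as `- Remix`, as in the `albums_tracks[0]` output above, so the actual data may need some title cleanup before matching.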
