# Web Scraping - Part I 

## Scrape Billboard 100 Hot Songs

Create a function that scrapes the Billboard Hot 100 songs and builds a local dataframe of them, including:

- Song’s name
- Song’s artist
- Song’s album
- Song’s release year

## Libraries 

In [3]:
from urllib.request import urlopen 
from bs4 import BeautifulSoup
import requests
import pandas as pd

In [4]:
# 2. find url and store it in a variable
#url = "https://www.billboard.com/charts/decade-end/hot-100"
#url = "http://www.discjockey.org/top-100-songs-of-the-1950s/"
url = "https://www.billboard.com/charts/year-end/2020/hot-100-songs"


In [5]:
# 3. download html with a get request
response = requests.get(url)

In [6]:
response.status_code # 200 status code means OK!

200

In [7]:
# 4.1. parse html (create the 'soup')
soup = BeautifulSoup(response.content, "html.parser")

# 4.2. check that the html code looks like it should
soup

<!DOCTYPE html>

<html class="" lang="">
<head>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta content="width=device-width, initial-scale=1, user-scalable=no" name="viewport"/>
<title>Hot 100 Songs - Year-End | Billboard</title>
<meta content="Hot 100 Songs - Year-End" name="title" property="title">
<meta content="See Billboard's rankings of this year's most popular songs, albums, and artists." name="description" property="description">
<meta content="https://www.billboard.com/assets/1631720915/images/ye-charts/charts-ye-share-fb.jpg?ef1b383147295d08313f" name="og:image" property="og:image">
<meta content="https://www.billboard.com/assets/1631720915/images/ye-charts/charts-ye-share-twitter.jpg?ef1b383147295d08313f" name="twitter:image" property="twitter:image"/>
<meta content="@billboard" name="twitter:site"/>
<meta content="Billboard" property="og:site_name">
<meta content="article" property="og:type"/>
<script async="async" data-cfasync="false" s

In [8]:
# pretty-print the html to make it easier to read, if necessary
print(soup.prettify())

<!DOCTYPE html>
<html class="" lang="">
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <meta content="width=device-width, initial-scale=1, user-scalable=no" name="viewport"/>
  <title>
   Hot 100 Songs - Year-End | Billboard
  </title>
  <meta content="Hot 100 Songs - Year-End" name="title" property="title">
   <meta content="See Billboard's rankings of this year's most popular songs, albums, and artists." name="description" property="description">
    <meta content="https://www.billboard.com/assets/1631720915/images/ye-charts/charts-ye-share-fb.jpg?ef1b383147295d08313f" name="og:image" property="og:image">
     <meta content="https://www.billboard.com/assets/1631720915/images/ye-charts/charts-ye-share-twitter.jpg?ef1b383147295d08313f" name="twitter:image" property="twitter:image"/>
     <meta content="@billboard" name="twitter:site"/>
     <meta content="Billboard" property="og:site_name">
      <meta content="article" property="og:type"/>

In [None]:
# 5. retrieve/extract the desired info (here, you'll paste the "Selector" you copied from the browser's inspector to get the element that holds the top song)
#titles = soup.select("li button span span.chart-element__information__song.text--truncate.color--primary")
#print(titles)

##musicTable > tbody > tr:nth-child(1) > td:nth-child(2)
##musicTable > tbody

In [9]:
#or use find_all function
#titles = soup.find_all("span", class_="chart-element__information__song")
titles = soup.find_all("div", class_="ye-chart-item__title")
titles

[<div class="ye-chart-item__title">
 Blinding Lights
 </div>,
 <div class="ye-chart-item__title">
 Circles
 </div>,
 <div class="ye-chart-item__title">
 The Box
 </div>,
 <div class="ye-chart-item__title">
 Don't Start Now
 </div>,
 <div class="ye-chart-item__title">
 Rockstar
 </div>,
 <div class="ye-chart-item__title">
 Adore You
 </div>,
 <div class="ye-chart-item__title">
 Life Is Good
 </div>,
 <div class="ye-chart-item__title">
 Memories
 </div>,
 <div class="ye-chart-item__title">
 The Bones
 </div>,
 <div class="ye-chart-item__title">
 Someone You Loved
 </div>,
 <div class="ye-chart-item__title">
 Say So
 </div>,
 <div class="ye-chart-item__title">
 I Hope
 </div>,
 <div class="ye-chart-item__title">
 Whats Poppin
 </div>,
 <div class="ye-chart-item__title">
 Dance Monkey
 </div>,
 <div class="ye-chart-item__title">
 Savage
 </div>,
 <div class="ye-chart-item__title">
 Roxanne
 </div>,
 <div class="ye-chart-item__title">
 Intentions
 </div>,
 <div class="ye-chart-item__title">

In [10]:
song_title = [song.getText() for song in titles]
song_title

['\nBlinding Lights\n',
 '\nCircles\n',
 '\nThe Box\n',
 "\nDon't Start Now\n",
 '\nRockstar\n',
 '\nAdore You\n',
 '\nLife Is Good\n',
 '\nMemories\n',
 '\nThe Bones\n',
 '\nSomeone You Loved\n',
 '\nSay So\n',
 '\nI Hope\n',
 '\nWhats Poppin\n',
 '\nDance Monkey\n',
 '\nSavage\n',
 '\nRoxanne\n',
 '\nIntentions\n',
 '\nEverything I Wanted\n',
 '\nRoses\n',
 '\nWatermelon Sugar\n',
 '\nBefore You Go\n',
 '\nFalling\n',
 '\n10,000 Hours\n',
 '\nWAP\n',
 "\nBallin'\n",
 '\nHot Girl Bummer\n',
 '\nBlueberry Faygo\n',
 '\nHeartless\n',
 '\nBOP\n',
 '\nLose You To Love Me\n',
 '\nGood As Hell\n',
 '\nToosie Slide\n',
 '\nBreak My Heart\n',
 "\nChasin' You\n",
 '\nSavage Love (Laxed - Siren Beat)\n',
 '\nNo Guidance\n',
 '\nMy Oh My\n',
 '\nDynamite\n',
 '\nGo Crazy\n',
 '\nHigh Fashion\n',
 '\nLaugh Now Cry Later\n',
 '\nWoah\n',
 '\nDeath Bed\n',
 '\nSenorita\n',
 '\nHIGHEST IN THE ROOM\n',
 '\nBad Guy\n',
 '\nMood\n',
 '\nRain On Me\n',
 '\nFor The Night\n',
 '\nRITMO (Bad Boys For Life)

In [11]:
song_title_converted = []

for e in song_title:
    song_title_converted.append(e.strip())

print(song_title_converted)

['Blinding Lights', 'Circles', 'The Box', "Don't Start Now", 'Rockstar', 'Adore You', 'Life Is Good', 'Memories', 'The Bones', 'Someone You Loved', 'Say So', 'I Hope', 'Whats Poppin', 'Dance Monkey', 'Savage', 'Roxanne', 'Intentions', 'Everything I Wanted', 'Roses', 'Watermelon Sugar', 'Before You Go', 'Falling', '10,000 Hours', 'WAP', "Ballin'", 'Hot Girl Bummer', 'Blueberry Faygo', 'Heartless', 'BOP', 'Lose You To Love Me', 'Good As Hell', 'Toosie Slide', 'Break My Heart', "Chasin' You", 'Savage Love (Laxed - Siren Beat)', 'No Guidance', 'My Oh My', 'Dynamite', 'Go Crazy', 'High Fashion', 'Laugh Now Cry Later', 'Woah', 'Death Bed', 'Senorita', 'HIGHEST IN THE ROOM', 'Bad Guy', 'Mood', 'Rain On Me', 'For The Night', 'RITMO (Bad Boys For Life)', 'Heart On Ice', 'Nobody But You', 'Trampoline', 'Come & Go', 'Truth Hurts', 'If The World Was Ending', 'We Paid', 'Yummy', 'One Man Band', 'Got What I Got', 'Sunday Best', 'Godzilla', 'Bandit', 'Party Girl', 'Die From A Broken Heart', 'Popstar'
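The extract-then-strip steps above can also be done in one pass. This is a minimal sketch, assuming the same `ye-chart-item__title` markup shown in the output; the `html` string is a hand-written stand-in for the downloaded page, not real scraped data.

```python
from bs4 import BeautifulSoup

# Stand-in for one chart entry, modelled on the markup printed above
html = '<div class="ye-chart-item__title">\nBlinding Lights\n</div>'
soup = BeautifulSoup(html, "html.parser")

# get_text(strip=True) removes the surrounding newlines while extracting
titles = [div.get_text(strip=True)
          for div in soup.find_all("div", class_="ye-chart-item__title")]
print(titles)  # ['Blinding Lights']
```

The same comprehension works for the artist and rank elements, which removes the need for the separate `*_converted` lists.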

In [62]:
# now we are doing the same with the artist
#artist_name = soup.find_all("span", class_="chart-element__information__artist")
artist_name = soup.find_all("div", class_="ye-chart-item__artist")
artist_name


[<div class="ye-chart-item__artist">
 Mark Ronson Featuring Bruno Mars
 </div>,
 <div class="ye-chart-item__artist">
 LMFAO Featuring Lauren Bennett &amp; GoonRock
 </div>,
 <div class="ye-chart-item__artist">
 <a href="/music/ed-sheeran">
 Ed Sheeran
 </a>
 </div>,
 <div class="ye-chart-item__artist">
 The Chainsmokers Featuring Halsey
 </div>,
 <div class="ye-chart-item__artist">
 Maroon 5 Featuring Cardi B
 </div>,
 <div class="ye-chart-item__artist">
 Rihanna Featuring Calvin Harris
 </div>,
 <div class="ye-chart-item__artist">
 Lil Nas X Featuring Billy Ray Cyrus
 </div>,
 <div class="ye-chart-item__artist">
 Gotye Featuring Kimbra
 </div>,
 <div class="ye-chart-item__artist">
 Luis Fonsi &amp; Daddy Yankee Featuring Justin Bieber
 </div>,
 <div class="ye-chart-item__artist">
 <a href="/music/adele">
 Adele
 </a>
 </div>,
 <div class="ye-chart-item__artist">
 <a href="/music/post-malone-swae-lee">
 Post Malone &amp; Swae Lee
 </a>
 </div>,
 <div class="ye-chart-item__artist">
 <a 

In [63]:
artist = [a.getText() for a in artist_name]
artist

['\nMark Ronson Featuring Bruno Mars\n',
 '\nLMFAO Featuring Lauren Bennett & GoonRock\n',
 '\n\nEd Sheeran\n\n',
 '\nThe Chainsmokers Featuring Halsey\n',
 '\nMaroon 5 Featuring Cardi B\n',
 '\nRihanna Featuring Calvin Harris\n',
 '\nLil Nas X Featuring Billy Ray Cyrus\n',
 '\nGotye Featuring Kimbra\n',
 '\nLuis Fonsi & Daddy Yankee Featuring Justin Bieber\n',
 '\n\nAdele\n\n',
 '\n\nPost Malone & Swae Lee\n\n',
 '\n\nHalsey\n\n',
 '\n\nCarly Rae Jepsen\n\n',
 '\nRobin Thicke Featuring T.I. + Pharrell\n',
 '\n\nEd Sheeran\n\n',
 '\n\nTravis Scott\n\n',
 '\n\nMeghan Trainor\n\n',
 '\n\nLorde\n\n',
 '\n\nDrake\n\n',
 '\nMaroon 5 Featuring Christina Aguilera\n',
 '\n\nPharrell Williams\n\n',
 '\n\nBruno Mars\n\n',
 '\n\nPost Malone Featuring 21 Savage\n\n',
 '\n\nKe$ha\n\n',
 '\nWiz Khalifa Featuring Charlie Puth\n',
 '\nKaty Perry Featuring Juicy J\n',
 '\nMacklemore & Ryan Lewis Featuring Wanz\n',
 '\n\nMaroon 5\n\n',
 '\nfun. Featuring Janelle Monae\n',
 '\n\nBruno Mars\n\n',
 '\n\nTh

In [64]:
# remove the \n (newline) characters from the list
artist_converted = []

for e in artist:
    artist_converted.append(e.strip())

print(artist_converted)

['Mark Ronson Featuring Bruno Mars', 'LMFAO Featuring Lauren Bennett & GoonRock', 'Ed Sheeran', 'The Chainsmokers Featuring Halsey', 'Maroon 5 Featuring Cardi B', 'Rihanna Featuring Calvin Harris', 'Lil Nas X Featuring Billy Ray Cyrus', 'Gotye Featuring Kimbra', 'Luis Fonsi & Daddy Yankee Featuring Justin Bieber', 'Adele', 'Post Malone & Swae Lee', 'Halsey', 'Carly Rae Jepsen', 'Robin Thicke Featuring T.I. + Pharrell', 'Ed Sheeran', 'Travis Scott', 'Meghan Trainor', 'Lorde', 'Drake', 'Maroon 5 Featuring Christina Aguilera', 'Pharrell Williams', 'Bruno Mars', 'Post Malone Featuring 21 Savage', 'Ke$ha', 'Wiz Khalifa Featuring Charlie Puth', 'Katy Perry Featuring Juicy J', 'Macklemore & Ryan Lewis Featuring Wanz', 'Maroon 5', 'fun. Featuring Janelle Monae', 'Bruno Mars', 'The Weeknd', 'John Legend', 'Marshmello & Bastille', 'Taylor Swift', 'Drake Featuring WizKid & Kyla', 'Imagine Dragons', 'LMFAO', 'Adele', 'OneRepublic', 'Katy Perry Featuring Kanye West', 'Fetty Wap', 'Justin Bieber', '

In [65]:
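# note: this class comes from the decade-end chart markup, so `url` must point to the decade-end page
year = soup.find_all("span", class_="decade-end-chart-item__peak-info-date")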
year

[<span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2015-01-17</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2011-07-16</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2017-01-28</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2016-09-03</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2018-09-29</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2011-11-12</span> Peak Date</span>,
 <span class="decade-end-chart-item__peak-info-date"><span class="decade-end-chart-item__peak-info-data">2019-04-13</span> Peak Date</span>,
 <span class=

In [66]:
year = [y.getText() for y in year]
year

['2015-01-17 Peak Date',
 '2011-07-16 Peak Date',
 '2017-01-28 Peak Date',
 '2016-09-03 Peak Date',
 '2018-09-29 Peak Date',
 '2011-11-12 Peak Date',
 '2019-04-13 Peak Date',
 '2012-04-28 Peak Date',
 '2017-05-27 Peak Date',
 '2011-05-21 Peak Date',
 '2019-01-19 Peak Date',
 '2019-01-12 Peak Date',
 '2012-06-23 Peak Date',
 '2013-06-22 Peak Date',
 '2017-12-23 Peak Date',
 '2018-12-08 Peak Date',
 '2014-09-20 Peak Date',
 '2013-10-12 Peak Date',
 '2018-02-03 Peak Date',
 '2011-09-10 Peak Date',
 '2014-03-08 Peak Date',
 '2010-10-02 Peak Date',
 '2017-10-28 Peak Date',
 '2010-01-02 Peak Date',
 '2015-04-25 Peak Date',
 '2014-02-08 Peak Date',
 '2013-02-02 Peak Date',
 '2012-09-29 Peak Date',
 '2012-03-17 Peak Date',
 '2017-05-13 Peak Date',
 '2015-10-03 Peak Date',
 '2014-05-17 Peak Date',
 '2019-02-16 Peak Date',
 '2014-09-06 Peak Date',
 '2016-05-21 Peak Date',
 '2013-07-06 Peak Date',
 '2012-01-07 Peak Date',
 '2011-09-17 Peak Date',
 '2014-01-18 Peak Date',
 '2011-04-09 Peak Date',
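The strings above still carry the full date and the trailing "Peak Date" label. A small sketch of how the year could be pulled out, assuming every entry keeps the `YYYY-MM-DD Peak Date` shape shown in the output (the helper name `peak_year` is made up for this example). Note that this is the year the song peaked on the chart, which may differ from its release year.

```python
# Two entries copied from the output above
peak_dates = ["2015-01-17 Peak Date", "2011-07-16 Peak Date"]

def peak_year(text):
    """Return the four-digit year at the start of a 'YYYY-MM-DD Peak Date' string."""
    date_part = text.split()[0]          # e.g. '2015-01-17'
    return int(date_part.split("-")[0])  # e.g. 2015

years = [peak_year(p) for p in peak_dates]
print(years)  # [2015, 2011]
```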


In [67]:
rank = soup.find_all("div", class_="ye-chart-item__rank")
rank

[<div class="ye-chart-item__rank">
 1
 </div>,
 <div class="ye-chart-item__rank">
 2
 </div>,
 <div class="ye-chart-item__rank">
 3
 </div>,
 <div class="ye-chart-item__rank">
 4
 </div>,
 <div class="ye-chart-item__rank">
 5
 </div>,
 <div class="ye-chart-item__rank">
 6
 </div>,
 <div class="ye-chart-item__rank">
 7
 </div>,
 <div class="ye-chart-item__rank">
 8
 </div>,
 <div class="ye-chart-item__rank">
 9
 </div>,
 <div class="ye-chart-item__rank">
 10
 </div>,
 <div class="ye-chart-item__rank">
 11
 </div>,
 <div class="ye-chart-item__rank">
 12
 </div>,
 <div class="ye-chart-item__rank">
 13
 </div>,
 <div class="ye-chart-item__rank">
 14
 </div>,
 <div class="ye-chart-item__rank">
 15
 </div>,
 <div class="ye-chart-item__rank">
 16
 </div>,
 <div class="ye-chart-item__rank">
 17
 </div>,
 <div class="ye-chart-item__rank">
 18
 </div>,
 <div class="ye-chart-item__rank">
 19
 </div>,
 <div class="ye-chart-item__rank">
 20
 </div>,
 <div class="ye-chart-item__rank">
 21
 </div>,
 

In [68]:
rank = [r.getText() for r in rank]
rank

['\n1\n',
 '\n2\n',
 '\n3\n',
 '\n4\n',
 '\n5\n',
 '\n6\n',
 '\n7\n',
 '\n8\n',
 '\n9\n',
 '\n10\n',
 '\n11\n',
 '\n12\n',
 '\n13\n',
 '\n14\n',
 '\n15\n',
 '\n16\n',
 '\n17\n',
 '\n18\n',
 '\n19\n',
 '\n20\n',
 '\n21\n',
 '\n22\n',
 '\n23\n',
 '\n24\n',
 '\n25\n',
 '\n26\n',
 '\n27\n',
 '\n28\n',
 '\n29\n',
 '\n30\n',
 '\n31\n',
 '\n32\n',
 '\n33\n',
 '\n34\n',
 '\n35\n',
 '\n36\n',
 '\n37\n',
 '\n38\n',
 '\n39\n',
 '\n40\n',
 '\n41\n',
 '\n42\n',
 '\n43\n',
 '\n44\n',
 '\n45\n',
 '\n46\n',
 '\n47\n',
 '\n48\n',
 '\n49\n',
 '\n50\n',
 '\n51\n',
 '\n52\n',
 '\n53\n',
 '\n54\n',
 '\n55\n',
 '\n56\n',
 '\n57\n',
 '\n58\n',
 '\n59\n',
 '\n60\n',
 '\n61\n',
 '\n62\n',
 '\n63\n',
 '\n64\n',
 '\n65\n',
 '\n66\n',
 '\n67\n',
 '\n68\n',
 '\n69\n',
 '\n70\n',
 '\n71\n',
 '\n72\n',
 '\n73\n',
 '\n 74\n',
 '\n75\n',
 '\n76\n',
 '\n77\n',
 '\n78\n',
 '\n79\n',
 '\n80\n',
 '\n81\n',
 '\n82\n',
 '\n83\n',
 '\n84\n',
 '\n85\n',
 '\n86\n',
 '\n87\n',
 '\n88\n',
 '\n89\n',
 '\n90\n',
 '\n91\n',
 '\n92\

In [69]:
#rank = ["a", "b\n", "c\n"]
rank_converted = []

for e in rank:
    rank_converted.append(e.strip())

print(rank_converted)

['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '100']


In [76]:
#df = pd.DataFrame(list(zip(artist, song_title)),
 #              columns =['artist_name', 'title'])
df = pd.DataFrame(zip(rank_converted, song_title_converted, artist_converted, year), columns = ["rank", "song_title", "artist", "year"])
df

Unnamed: 0,rank,song_title,artist,year
0,1,Uptown Funk!,Mark Ronson Featuring Bruno Mars,2015-01-17 Peak Date
1,2,Party Rock Anthem,LMFAO Featuring Lauren Bennett & GoonRock,2011-07-16 Peak Date
2,3,Shape Of You,Ed Sheeran,2017-01-28 Peak Date
3,4,Closer,The Chainsmokers Featuring Halsey,2016-09-03 Peak Date
4,5,Girls Like You,Maroon 5 Featuring Cardi B,2018-09-29 Peak Date
...,...,...,...,...
95,96,Panda,Desiigner,2016-05-07 Peak Date
96,97,Break Your Heart,Taio Cruz Featuring Ludacris,2010-03-20 Peak Date
97,98,In My Feelings,Drake,2018-07-21 Peak Date
98,99,Wrecking Ball,Miley Cyrus,2013-09-28 Peak Date


In [78]:
df.to_csv("top100_2010_")
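The task at the top asks for a function, while the cells above run the steps one by one. The following is a rough sketch of how the parsing part could be wrapped up, assuming the `ye-chart-item__*` class names seen in the outputs. The name `parse_chart` and the sample HTML are invented for illustration, and album and release year are left out because the cells above do not locate them on the chart page.

```python
from bs4 import BeautifulSoup
import pandas as pd

def parse_chart(html):
    """Parse chart HTML into a dataframe with rank, song title and artist."""
    soup = BeautifulSoup(html, "html.parser")
    # column name -> CSS class of the <div> that holds the value
    columns = {
        "rank": "ye-chart-item__rank",
        "song_title": "ye-chart-item__title",
        "artist": "ye-chart-item__artist",
    }
    data = {
        name: [tag.get_text(strip=True) for tag in soup.find_all("div", class_=css)]
        for name, css in columns.items()
    }
    return pd.DataFrame(data)

# Hand-written sample entry, not real scraped data
sample_html = """
<div class="ye-chart-item__rank">
1
</div>
<div class="ye-chart-item__title">
Blinding Lights
</div>
<div class="ye-chart-item__artist">
<a href="/music/the-weeknd">
The Weeknd
</a>
</div>
"""
chart = parse_chart(sample_html)
print(chart)
```

In the notebook this would be called as `parse_chart(response.content)`. Building the dataframe from a dict raises an error when the lists have different lengths, whereas `zip` silently truncates to the shortest list.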

# Part II 

Steps: 

1. User input: song title
2. Check if the song is currently "hot"
    - 2a. If yes, recommend another hot song
    - 2b. If no, get the audio features of the song and recommend a song that sounds similar

In [133]:
def user_input_song():
    # ask the user for a song title and drop surrounding whitespace
    song = input("What's the title of your favourite song? ")
    return song.strip()

In [134]:
def is_it_hot():
    # compare against the song_title column only, ignoring case
    song = user_input_song().lower()
    return song in df["song_title"].str.lower().values

In [135]:
is_it_hot()

What's the title of your favourite song? repeat it


False

# Part III

In [2]:
#1. Get Top 100 songs from each decade
#2. Filter per decade, genre, length 

In [2]:
def get_decade():
    decade = input("Which decade do you want to hear? (1950s, 1960s, ..., 2020s) ")
    return decade
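`get_decade` returns whatever the user types, so a typo would only surface later when filtering. A possible validation step is sketched below, assuming the decades named in the prompt are the only valid answers; `parse_decade` and `ask_decade` are made-up names, and the `ask` parameter exists only so the function can be tried without typing.

```python
# Decades listed in the prompt: 1950s, 1960s, ..., 2020s
VALID_DECADES = [f"{year}s" for year in range(1950, 2030, 10)]

def parse_decade(text):
    """Return the normalised decade, or None if it is not a known decade."""
    cleaned = text.strip().lower()
    return cleaned if cleaned in VALID_DECADES else None

def ask_decade(ask=input):
    """Keep asking until the answer is one of VALID_DECADES."""
    while True:
        decade = parse_decade(ask("Which decade do you want to hear? "))
        if decade is not None:
            return decade
        print("Please choose one of:", ", ".join(VALID_DECADES))
```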

In [3]:
#get_decade()

In [4]:
def get_genre(genres):
    # list the available genres first, then ask the user
    print("Here's a list of genres you can choose from:", list(genres))
    genre = input("What mood are you in? ")
    return genre

In [None]:
#get_genre(df_all["Genre"].dropna().unique())

In [1]:
import inquirer
questions = [
  inquirer.List('size',
                message="What size do you need?",
                choices=['Jumbo', 'Large', 'Standard', 'Medium', 'Small', 'Micro'],
            ),
]
answers = inquirer.prompt(questions)
print(answers["size"])

ModuleNotFoundError: No module named 'inquirer'
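The error above means `inquirer` is not installed in this environment; installing it with pip is the direct fix. As a fallback that needs no extra package, a numbered menu can be built on plain `input`. This is a sketch only: `choose` is a hypothetical helper, and `ask` is injectable so it can be exercised without a keyboard.

```python
def choose(message, choices, ask=input):
    """Print a numbered menu and return the selected choice."""
    print(message)
    for number, choice in enumerate(choices, start=1):
        print(f"  {number}. {choice}")
    while True:
        answer = ask("Enter a number: ").strip()
        # accept only whole numbers within the menu range
        if answer.isdecimal() and 1 <= int(answer) <= len(choices):
            return choices[int(answer) - 1]
        print("Not a valid option, try again.")
```

Calling `choose("What size do you need?", ['Jumbo', 'Large', 'Standard'])` would prompt until a number from 1 to 3 is entered.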

In [None]:
questions

# Scrape Top 100 Songs per Decade

In [73]:
df_50 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-1950s/")
df_60 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-1960s/")
df_70 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-1970s/")
df_80 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-1980s/")
df_90 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-1990s/")
df_00 = pd.read_html("http://www.discjockey.org/top-100-songs-of-the-2000s/")
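The six calls above differ only in the decade, so they could be generated in a loop. A sketch under the assumption that every page follows the same URL pattern; `decade_urls` and `load_first_tables` are invented names, and the reader is passed in so the example itself does not touch the network.

```python
BASE_URL = "http://www.discjockey.org/top-100-songs-of-the-{decade}/"
DECADES = ["1950s", "1960s", "1970s", "1980s", "1990s", "2000s"]

def decade_urls():
    """Map each decade label to its page URL."""
    return {decade: BASE_URL.format(decade=decade) for decade in DECADES}

def load_first_tables(reader, urls=None):
    """Call reader(url) for each decade and keep the first table it returns."""
    urls = decade_urls() if urls is None else urls
    return {decade: reader(url)[0] for decade, url in urls.items()}

print(decade_urls()["1950s"])
```

In the notebook, `tables = load_first_tables(pd.read_html)` would give a dict keyed by decade in place of `df_50` to `df_00`.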

In [74]:
#print(df_50[0].tail(15))

df1 = df_50[0]
#df1 = df.drop(index = 100)
df1.drop(df1.tail(1).index,inplace=True)
df1

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,That's Amore,Dean Martin,1953,Oldies
1,2,Come Fly With Me,Frank Sinatra,1958,Oldies
2,3,Jailhouse Rock,Elvis Presley,1957,Oldies
3,4,I Walk The Line,Johnny Cash,1956,Country
4,5,I've Got You Under My Skin,Frank Sinatra,1953,Oldies
...,...,...,...,...,...
95,96,Loving You,Elvis Presley,1957,Ballad
96,97,My Prayer,Platters,1956,Oldies
97,98,Sincerely,McGuire Sisters,1955,Oldies
98,99,Cherry Pink And Apple Blossom White,Perez Prado,1955,Oldies


In [88]:
df1["Genre"].value_counts()

Oldies            73
Ballad            17
Country            6
Easy Listening     1
Swing              1
Jazz               1
Blues              1
Name: Genre, dtype: int64

In [75]:
df2 = df_60[0]
df2.drop(df2.tail(1).index,inplace=True)
df2

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,Sweet Caroline (Good Times Never Seemed So Good),Neil Diamond,1969,Oldies
1,2,Shout,Otis Day And The Knights/Isley Brothers,1967,Oldies
2,3,Brown Eyed Girl,Van Morrison,1967,Oldies
3,4,The Way You Look Tonight,Frank Sinatra,1964,Ballad
4,5,Twist And Shout,Beatles,1963,Oldies
...,...,...,...,...,...
95,96,Born To Be Wild,Steppenwolf,1968,Oldies
96,97,Down On The Corner/Fortunate Son,Creedence Clearwater Revival,1969,Oldies
97,98,This Magic Moment,Drifters,1960,Oldies
98,99,My Cherie Amour,Stevie Wonder,1969,Oldies


In [76]:
df3 = df_70[0]
df3.drop(df3.tail(1).index,inplace=True)
df3

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,Wonderful Tonight,Eric Clapton,1978,Ballad
1,2,YMCA,Village People,1975,Disco
2,3,Sweet Home Alabama,Lynyrd Skynyrd,1974,Rock
3,4,We Are Family,Sister Sledge,1979,Popular
4,5,Old Time Rock & Roll,Bob Seger & The Silver Bullet Band,1978,Rock
...,...,...,...,...,...
95,96,Soul Man,Blues Brothers,1979,Rock
96,97,Born To Run,Bruce Springsteen,1975,Rock
97,98,My Sharona,Knack,1979,Oldies
98,99,Gimme Three Steps,Lynyrd Skynyrd,1973,Rock


In [77]:
df4 = df_80[0]
df4.drop(df4.tail(1).index,inplace=True)
df4

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,Don't Stop Believin',Journey,1981,Rock
1,2,You Shook Me All Night Long,AC/DC,1980,Rock
2,3,Love Shack,B-52's,1989,Popular
3,4,Livin' On A Prayer,Bon Jovi,1986,Rock
4,5,Pour Some Sugar On Me,Def Leppard,1987,Rock
...,...,...,...,...,...
95,96,Heaven,Bryan Adams,1985,Oldies
96,97,Here And Now,Luther Vandross,1989,Ballad
97,98,Nothin' But A Good Time,Poison,1988,Rock
98,99,Its Raining Men,Weather Girls,1983,Funk


In [78]:
df5 = df_90[0]
df5.drop(df5.tail(1).index,inplace=True)
df5

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,Electric Slide,Marcia Griffiths,1990,Popular
1,2,Baby Got Back,Sir Mix-A-Lot,1992,Popular
2,3,Friends In Low Places,Garth Brooks,1990,Country
3,4,Cotton Eye Joe,Rednex,1994,Country
4,5,Macarena,Los Del Rio,1995,Popular
...,...,...,...,...,...
95,96,Barbie Girl,Aqua,1997,Popular
96,97,Whatta Man,Salt 'N Pepa,1994,Hip Hop
97,98,I Like It I Love It,Tim McGraw,1995,Country
98,99,Somewhere over the Rainbow,Israel Kamakawiwow'ole,1993,Easy Listening


In [79]:
df6 = df_00[0]
df6.drop(df6.tail(1).index,inplace=True)
df6

Unnamed: 0,Rank,Song Title,Song Artist,Year,Genre
0,1,Cupid Shuffle,Cupid,2007,Popular
1,2,Cha-Cha Slide,Mr. C The Slide Man,2000,Popular
2,3,I Gotta Feeling,Black Eyed Peas,2009,Popular
3,4,Single Ladies (Put A Ring On It),Beyonce,2008,Popular
4,5,Wobble,V.I.C.,2008,Popular
...,...,...,...,...,...
95,96,The Way You Move,Outkast Featuring Sleepy Brown,2003,Hip Hop
96,97,Big Green Tractor,Jason Aldean,2009,Country
97,98,Remember When,Alan Jackson,2001,Ballad
98,99,Shake It,Metro Station,2008,Popular


In [80]:
#combine all dataframes 
frames = [df1, df2, df3, df4, df5, df6]

In [81]:
df_all = pd.concat(frames)

df_all.count()

Rank           600
Song Title     600
Song Artist    599
Year           599
Genre          598
dtype: int64
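The counts above show that a few rows lack an artist, year or genre. One way to find them is sketched here on toy data (the two small frames are placeholders, not the scraped tables). `ignore_index=True` is an optional extra that renumbers the rows, since a plain `pd.concat` keeps each frame's own 0 to 99 index.

```python
import pandas as pd

# Toy stand-ins for two decade tables
first = pd.DataFrame({"Song Title": ["A", "B"], "Genre": ["Rock", None]})
second = pd.DataFrame({"Song Title": ["C"], "Genre": ["Popular"]})

# Renumber rows 0..n-1 instead of repeating each frame's index
combined = pd.concat([first, second], ignore_index=True)

# Keep only the rows that have at least one missing value
incomplete = combined[combined.isna().any(axis=1)]
print(incomplete)
```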

In [82]:
df_all.to_csv("top100_all")

In [89]:
df_all["Genre"].value_counts()

Oldies            204
Popular           160
Ballad             63
Country            52
Rock               38
Alternative        26
R&B                12
Disco              12
Hip Hop             8
Easy Listening      4
Reggae              4
Rap                 3
Latin               2
Funk                2
Dance               1
Big Band            1
Blues               1
Polka               1
Other               1
Swing               1
None                1
Jazz                1
Name: Genre, dtype: int64

In [103]:
#combine R&B + HipHop + Rap
#Dance + Disco 
#Jazz + Swing + Blues + 
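The groupings noted above could be applied with a lookup dict. This is a sketch: the combined labels such as "R&B/Hip Hop/Rap" are placeholder names chosen for the example, and genres missing from the dict are left unchanged.

```python
import pandas as pd

# Original genre -> combined group (group labels are hypothetical)
GENRE_GROUPS = {
    "R&B": "R&B/Hip Hop/Rap",
    "Hip Hop": "R&B/Hip Hop/Rap",
    "Rap": "R&B/Hip Hop/Rap",
    "Dance": "Dance/Disco",
    "Disco": "Dance/Disco",
    "Jazz": "Jazz/Swing/Blues",
    "Swing": "Jazz/Swing/Blues",
    "Blues": "Jazz/Swing/Blues",
}

genres = pd.Series(["Rap", "Disco", "Rock", "Jazz"])
grouped = genres.replace(GENRE_GROUPS)
print(grouped.tolist())
```

On the real data this would be `df_all["Genre"].replace(GENRE_GROUPS)`.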

In [104]:
!pip install spotipy



In [105]:
#Client ID 56e8c847e9974517952ef0d040f2a78b
#Client Secret 5622c0f1f94243ffa051cfbdf4734578 

import sys
!{sys.executable} -m pip install spotipy

Defaulting to user installation because normal site-packages is not writeable


In [109]:
#initialize Spotipy with our credentials
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

Client_ID = "56e8c847e9974517952ef0d040f2a78b"
Client_Secret = "5622c0f1f94243ffa051cfbdf4734578"
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=Client_ID,
                                                           client_secret=Client_Secret))
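Credentials typed into a notebook travel with every copy of it. A common alternative is to read them from environment variables, sketched below. The variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` follow what I understand to be spotipy's convention and should be treated as an assumption; `load_spotify_credentials` is a made-up helper.

```python
import os

def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) from the environment, or raise."""
    try:
        return env["SPOTIPY_CLIENT_ID"], env["SPOTIPY_CLIENT_SECRET"]
    except KeyError as missing:
        # name the variable that is not set, without printing any secret
        raise RuntimeError(f"missing environment variable: {missing}") from None
```

The returned pair can then be passed as `client_id` and `client_secret` to `SpotifyClientCredentials`.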