# Instructions - Scraping popular songs
Your product takes a song as input from the user and outputs another song (the recommendation). In most cases the recommended song should be similar to the input song, but the CTO thinks that if the input song is currently on the top charts, the user will enjoy a recommendation that is also currently popular. <br>
<br>
You need to find data on the internet about currently popular songs. Billboard maintains a weekly Top 100 of "hot" songs here: https://www.billboard.com/charts/hot-100.<br>
<br>
It's a good place to start! Scrape the current top 100 songs and their respective artists, and put the information into a pandas dataframe.
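
The CTO's rule can be sketched as a small function. This is only an illustration of the idea, not the product's actual logic: `normalize` and `recommend_hot_song` are hypothetical helper names, and picking a random other chart entry is an assumption about what "a song that's also popular" means.

```python
import random


def normalize(text):
    """Collapse whitespace and ignore case so 'last  night' matches 'Last Night'."""
    return " ".join(text.split()).casefold()


def recommend_hot_song(user_title, hot_songs, rng=random):
    """Return a different (title, artist) pair from the chart if user_title
    is on the chart; return None so the caller can fall back to a
    similarity-based recommendation otherwise."""
    wanted = normalize(user_title)
    on_chart = any(normalize(title) == wanted for title, _ in hot_songs)
    if not on_chart:
        return None
    others = [song for song in hot_songs if normalize(song[0]) != wanted]
    return rng.choice(others) if others else None


# A few chart entries, used here only as sample data
hot_songs = [
    ("Last Night", "Morgan Wallen"),
    ("Flowers", "Miley Cyrus"),
    ("Fast Car", "Luke Combs"),
]

print(recommend_hot_song("flowers", hot_songs))
print(recommend_hot_song("Some Unlisted Song", hot_songs))
```

Matching on the title alone is a simplification; two different songs can share a title, so a real version would probably compare the artist as well.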

In [1]:
from bs4 import BeautifulSoup
import requests
import pandas as pd
import re

In [2]:
url = "https://www.billboard.com/charts/hot-100"
response = requests.get(url)
response.status_code

200
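
A status code of 200 is not guaranteed on every run. A slightly more defensive request might look like the sketch below. The `fetch_html` helper and the User-Agent string are hypothetical, and whether Billboard needs a custom User-Agent at all is an assumption.

```python
import requests

HOT_100_URL = "https://www.billboard.com/charts/hot-100"


def fetch_html(url=HOT_100_URL, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    # Hypothetical User-Agent; some sites reject the default one sent by requests.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; hot100-notebook/0.1)"}
    # timeout keeps the notebook from hanging if the server never answers
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx instead of silently continuing
    response.raise_for_status()
    return response.text
```

The returned string can be passed to `BeautifulSoup(..., "html.parser")` in the same way as `response.content`.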

In [3]:
soup = BeautifulSoup(response.content, "html.parser")
# print(soup.prettify())

In [4]:
# soup.find_all("div", attrs={"class": "o-chart-results-list-row-container"}) 

In [5]:
# The first song in the list has a different format than the rest

cls = "c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021 u-font-size-23@tablet " \
      "lrv-u-font-size-16 u-line-height-125 u-line-height-normal@mobile-max a-truncate-ellipsis " \
      "u-max-width-245 u-max-width-230@tablet-only u-letter-spacing-0028@tablet"

soup.find_all("h3", attrs={"class": cls}) 

[<h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021 u-font-size-23@tablet lrv-u-font-size-16 u-line-height-125 u-line-height-normal@mobile-max a-truncate-ellipsis u-max-width-245 u-max-width-230@tablet-only u-letter-spacing-0028@tablet" id="title-of-a-story">
 
 	
 	
 		
 					Last Night		
 	
 </h3>]

In [6]:
titles = [soup.find("h3", attrs={"class": cls}).get_text()]
titles

['\n\n\t\n\t\n\t\t\n\t\t\t\t\tLast Night\t\t\n\t\n']

In [7]:
# Rest of the song titles

cls = "c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021 lrv-u-font-size-18@tablet " \
      "lrv-u-font-size-16 u-line-height-125 u-line-height-normal@mobile-max a-truncate-ellipsis " \
      "u-max-width-330 u-max-width-230@tablet-only"

# soup.find_all("h3", attrs={"class": cls}) 

In [8]:
# Append the remaining titles, querying the parsed page only once
titles.extend(tag.get_text() for tag in soup.find_all("h3", attrs={"class": cls}))

# titles

In [9]:
# Shorten the list to the first 100 and remove extra characters 

titles = titles[:100]

titles = [re.sub(r'[\r\n\t]', '', x) for x in titles]
# titles 
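
One caveat about deleting `\r`, `\n` and `\t` outright: if a title ever contained a line break between two words, the words would be glued together. The sketch below shows an alternative that collapses any run of whitespace into a single space. The `clean` helper is a hypothetical name, and the wrapped title is made-up input used only to show the difference.

```python
import re


def clean(text):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(text.split())


# Raw text as returned by get_text() for the first title
raw = "\n\n\t\n\t\n\t\t\n\t\t\t\t\tLast Night\t\t\n\t\n"
print(repr(clean(raw)))  # 'Last Night'

# Hypothetical title that wraps across lines inside the tag
wrapped = "Fast\n\tCar"
print(re.sub(r"[\r\n\t]", "", wrapped))  # FastCar
print(clean(wrapped))                    # Fast Car
```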

In [10]:
len(titles)

100

In [11]:
# First artist

cls = "c-label a-no-trucate a-font-primary-s lrv-u-font-size-14@mobile-max u-line-height-normal@mobile-max " \
      "u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 " \
      "u-max-width-230@tablet-only u-font-size-20@tablet"

soup.find_all("span", attrs={"class": cls}) 

[<span class="c-label a-no-trucate a-font-primary-s lrv-u-font-size-14@mobile-max u-line-height-normal@mobile-max u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 u-max-width-230@tablet-only u-font-size-20@tablet">
 	
 	Morgan Wallen
 </span>]

In [12]:
artists = [soup.find("span", attrs={"class": cls}).get_text()]
artists

['\n\t\n\tMorgan Wallen\n']

In [13]:
# Rest of the artists

cls = "c-label a-no-trucate a-font-primary-s lrv-u-font-size-14@mobile-max u-line-height-normal@mobile-max " \
      "u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 " \
      "u-max-width-230@tablet-only"

# Append the remaining artists, querying the parsed page only once
artists.extend(tag.get_text() for tag in soup.find_all("span", attrs={"class": cls}))

In [14]:
# Shorten the list to the first 100 and remove extra characters 

artists = artists[:100]

artists = [re.sub(r'[\r\n\t]', '', x) for x in artists]
# artists 

In [15]:
len(artists)

100
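
Matching on the full class string is brittle: it breaks as soon as the site changes a single utility class, and collecting titles and artists in two separate passes relies on both lists staying in the same order. A possible alternative is to walk the chart row by row and match on one or two classes only. The markup below is a simplified, hypothetical stand-in for the real page; the assumption that every row uses `o-chart-results-list-row-container`, `c-title` and `c-label` in this arrangement should be checked against the live HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified version of two chart rows
sample_html = """
<div class="o-chart-results-list-row-container">
  <h3 id="title-of-a-story" class="c-title a-font-primary-bold-s">
      Last Night
  </h3>
  <span class="c-label a-font-primary-s">
      Morgan Wallen
  </span>
</div>
<div class="o-chart-results-list-row-container">
  <h3 id="title-of-a-story" class="c-title a-font-primary-bold-s">
      Flowers
  </h3>
  <span class="c-label a-font-primary-s">
      Miley Cyrus
  </span>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for row in sample_soup.select("div.o-chart-results-list-row-container"):
    # CSS class selectors match a tag that has the class among others
    title_tag = row.select_one("h3.c-title")
    # "+" selects the span that immediately follows the title element
    artist_tag = row.select_one("h3.c-title + span.c-label")
    if title_tag and artist_tag:
        # strip=True removes the surrounding tabs and newlines
        rows.append((title_tag.get_text(strip=True), artist_tag.get_text(strip=True)))

print(rows)
```

Because each title is read together with the artist from the same row, a missing element skips that row instead of shifting every later artist by one position.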

In [16]:
# Create dataframe

songs = pd.DataFrame({"title": titles, "artist": artists})
songs

,title,artist
0,Last Night,Morgan Wallen
1,Flowers,Miley Cyrus
2,Fast Car,Luke Combs
3,Calm Down,Rema & Selena Gomez
4,All My Life,Lil Durk Featuring J. Cole
...,...,...
95,Save Me,Jelly Roll With Lainey Wilson
96,Yandel 150,Yandel & Feid
97,Beso,Rosalia & Rauw Alejandro
98,I Wrote The Book,Morgan Wallen
