Webscraping Bonus
1. Navigate to https://www.billboard.com/charts/hot-100/. Using BeautifulSoup, extract the This Week, Artist, Song, Last Week, Peak Position, and Weeks on Chart values into a pandas DataFrame. Hint: The HTML for the number-one song is slightly different from that of the other songs.

In [1]:
# Import libraries
import requests
from bs4 import BeautifulSoup as BS
import pandas as pd

In [2]:
# Get html of billboard hot 100 page
url = "https://www.billboard.com/charts/hot-100/"

# Make a request to get the page content
response = requests.get(url)
html = response.text

# Parse the HTML with BeautifulSoup
soup = BS(html, "html.parser")

print(type(response))
print(response.status_code)

<class 'requests.models.Response'>
200


In [3]:
# Billboard structures #1 differently
first_rank = soup.find('li', class_='o-chart-results-list__item').text.strip()
first_song = soup.find_all('h3', id='title-of-a-story')[2].text.strip()
first_artist = soup.find_all('span', class_='c-label')[1].text.strip()

# Stats for #1
stats = soup.find_all('span', class_='c-label')
lastweek = stats[2].text.strip()
peak = stats[3].text.strip()
weeks_chart = stats[4].text.strip()

# Store in lists
ranks = [first_rank]
songs = [first_song]
artists = [first_artist]
last_week = [lastweek]
peak_pos = [peak]
weeks_on_chart = [weeks_chart]

# Print to check
print("This Week:", ranks[0])
print("Song:", songs[0])
print("Artist:", artists[0])
print("Last Week:", last_week[0])
print("Peak Position:", peak_pos[0])
print("Weeks on Chart:", weeks_on_chart[0])

This Week: 1
Song: Golden
Artist: HUNTR/X: EJAE, Audrey Nuna & REI AMI
Last Week: 1
Peak Position: 1
Weeks on Chart: 15
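
The lookups above rely on how BeautifulSoup matches `class_`, which is why the results have to be picked out by index. Below is a minimal offline sketch of that matching behavior. The markup is made up for illustration and is not copied from the Billboard page; only the class names `c-label` and `a-no-trucate` come from this notebook.

```python
from bs4 import BeautifulSoup as BS

# Made-up markup for illustration; the real Billboard HTML is more deeply nested.
html = """
<ul class="row">
  <li><span class="c-label">1</span></li>
  <li><span class="c-label a-no-trucate">Some Artist</span></li>
  <li><span class="c-label extra">15</span></li>
</ul>
"""
demo_soup = BS(html, "html.parser")

# A single class name matches every tag whose class list contains it
all_labels = [s.text for s in demo_soup.find_all("span", class_="c-label")]

# A multi-class string must match the class attribute exactly, in the same order
exact = demo_soup.find("span", class_="c-label a-no-trucate")
reordered = demo_soup.find("span", class_="a-no-trucate c-label")

# A CSS selector matches the classes regardless of their order
selected = demo_soup.select_one("span.a-no-trucate.c-label")

print(all_labels)
print(exact.text, reordered, selected.text)
```

If the order of classes on the page ever changes, a CSS selector such as `select_one("span.c-label.a-no-trucate")` would likely be the more forgiving choice.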


In [5]:
data = []

# Find all song rows (each <ul> = one song entry)
chart_rows = soup.find_all("ul", class_="o-chart-results-list-row")

for row in chart_rows:
    # Song Rank
    rank_tag = row.find("span", class_="c-label")
    rank = rank_tag.text.strip() if rank_tag else "N/A"

    # Song title
    song_tag = row.find("h3", class_="c-title")
    song = song_tag.text.strip() if song_tag else "N/A"

    # Artist name
    artist_tag = row.find("span", class_="c-label a-no-trucate")
    if artist_tag:
        artist = artist_tag.text.strip()
    else:
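        backup = row.find("ul", class_="lrv-a-unstyle-list")
        # Guard against a missing <span> as well as a missing <ul>
        backup_span = backup.find("span") if backup else None
        artist = backup_span.text.strip() if backup_span else "N/A"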

    # Song stats (last week, peak, weeks on chart)
    stats = row.find_all("span", class_="c-label")
    if len(stats) >= 3:
        last_week = stats[-3].text.strip()
        peak = stats[-2].text.strip()
        weeks = stats[-1].text.strip()
    else:
        last_week = peak = weeks = "N/A"

    # Save into data list
    data.append({
        "This Week": rank,
        "Song": song,
        "Artist": artist,
        "Last Week": last_week,
        "Peak Position": peak,
        "Weeks on Chart": weeks
    })

# Build DataFrame
df = pd.DataFrame(data)

# Show summary info
print("Hot", len(df), "chart")
print("\nShape:", df.shape)
df.head(10)

Hot 100 chart

Shape: (100, 6)


Unnamed: 0,This Week,Song,Artist,Last Week,Peak Position,Weeks on Chart
0,1,Golden,"HUNTR/X: EJAE, Audrey Nuna & REI AMI",1,1,15
1,2,Ordinary,Alex Warren,2,1,34
2,3,Tit For Tat,Tate McRae,-,3,1
3,4,What I Want,Morgan Wallen Featuring Tate McRae,4,1,20
4,5,Daisies,Justin Bieber,8,2,12
5,6,Lose Control,Teddy Swims,7,1,111
6,7,Soda Pop,"Saja Boys: Andrew Choi, Neckwav, Danny Chung, ...",3,3,14
7,8,I Got Better,Morgan Wallen,11,7,20
8,9,Love Me Not,Ravyn Lenae,9,5,27
9,10,Your Idol,"Saja Boys: Andrew Choi, Neckwav, Danny Chung, ...",5,4,15
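
Every value scraped above is a string, and a new entry shows "-" in the Last Week column (see Tit For Tat). Below is a minimal sketch of converting the numeric columns so they can be sorted and aggregated. The sample rows are copied from the output above, and `to_numeric_chart` is a hypothetical helper name, not part of the original notebook.

```python
import pandas as pd

# Sample rows copied from the output above; values are strings, as scraped.
sample = pd.DataFrame({
    "This Week": ["1", "2", "3"],
    "Song": ["Golden", "Ordinary", "Tit For Tat"],
    "Last Week": ["1", "2", "-"],
    "Peak Position": ["1", "1", "3"],
    "Weeks on Chart": ["15", "34", "1"],
})

NUMERIC_COLS = ("This Week", "Last Week", "Peak Position", "Weeks on Chart")

def to_numeric_chart(df, numeric_cols=NUMERIC_COLS):
    """Return a copy of df with the numeric columns stored as nullable integers."""
    out = df.copy()
    for col in numeric_cols:
        # errors="coerce" turns non-numeric text such as "-" or "N/A" into NaN;
        # the nullable "Int64" dtype keeps the rest as integers instead of floats
        out[col] = pd.to_numeric(out[col], errors="coerce").astype("Int64")
    return out

clean = to_numeric_chart(sample)
print(clean.dtypes)
```

With this conversion a missing Last Week value becomes `<NA>` rather than "-", so it should be handled explicitly before doing arithmetic on that column.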


2. After getting the code working for the current chart, navigate to last week's chart. Notice how the URL for the page changes. Write a function that, given a date, returns a pandas DataFrame containing the Billboard chart data for that date.

In [21]:
def get_billboard_chart(date):
    """Given a date in 'YYYY-MM-DD' format, return a pandas DataFrame
    containing the Billboard Hot 100 chart for that week."""
    # Billboard chart URL pattern
    url = f"https://www.billboard.com/charts/hot-100/{date}/"

    # Fetch and parse the page
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/140.0.0.0 Safari/537.36"
    }
    response = requests.get(url, headers=headers, timeout=30)  # avoid hanging indefinitely
    response.raise_for_status()  # ensure request succeeded
    soup = BS(response.text, "html.parser")

    data = []

    # Find all song rows
    chart_rows = soup.find_all("ul", class_="o-chart-results-list-row")

    for row in chart_rows:
        # Rank
        rank_tag = row.find("span", class_="c-label")
        rank = rank_tag.text.strip() if rank_tag else "N/A"

        # Song Title
        song_tag = row.find("h3", class_="c-title")
        song = song_tag.text.strip() if song_tag else "N/A"

        # Artist
        artist_tag = row.find("span", class_="c-label a-no-trucate")
        if artist_tag:
            artist = artist_tag.text.strip()
        else:
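            backup = row.find("ul", class_="lrv-a-unstyle-list")
            # Guard against a missing <span> as well as a missing <ul>
            backup_span = backup.find("span") if backup else None
            artist = backup_span.text.strip() if backup_span else "N/A"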

        # Song stats (last week, peak, weeks on chart)
        stats = row.find_all("span", class_="c-label")
        if len(stats) >= 3:
            last_week = stats[-3].text.strip()
            peak = stats[-2].text.strip()
            weeks = stats[-1].text.strip()
        else:
            last_week = peak = weeks = "N/A"

        # Save row
        data.append({
            "This Week": rank,
            "Song": song,
            "Artist": artist,
            "Last Week": last_week,
            "Peak Position": peak,
            "Weeks on Chart": weeks
        })

    # Convert to DataFrame
    df = pd.DataFrame(data)
    return df

# Example: Get chart for October 5, 2024
df = get_billboard_chart("2024-10-05")
df.head(10)

Unnamed: 0,This Week,Song,Artist,Last Week,Peak Position,Weeks on Chart
0,1,A Bar Song (Tipsy),Shaboozey,1,1,24
1,2,I Had Some Help,Post Malone Featuring Morgan Wallen,2,1,20
2,3,Espresso,Sabrina Carpenter,3,3,24
3,4,"Good Luck, Babe!",Chappell Roan,4,4,25
4,5,Die With A Smile,Lady Gaga & Bruno Mars,5,3,6
5,6,Birds Of A Feather,Billie Eilish,6,5,19
6,7,Lose Control,Teddy Swims,7,1,59
7,8,Please Please Please,Sabrina Carpenter,9,1,16
8,9,Taste,Sabrina Carpenter,8,2,5
9,10,Not Like Us,Kendrick Lamar,10,1,21
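
The function above interpolates `date` directly into the URL, so a malformed string is only noticed once the request has been made. Below is a small sketch of validating the date first. `chart_url` and `BASE_URL` are hypothetical names; the URL pattern is the one already used in `get_billboard_chart`.

```python
from datetime import date, datetime

BASE_URL = "https://www.billboard.com/charts/hot-100/"

def chart_url(chart_date):
    """Build the chart URL from a datetime.date, a datetime, or a 'YYYY-MM-DD' string."""
    if isinstance(chart_date, str):
        # strptime raises ValueError if the text is not a valid year-month-day date
        chart_date = datetime.strptime(chart_date, "%Y-%m-%d").date()
    elif isinstance(chart_date, datetime):
        # Drop the time component so only the calendar date ends up in the URL
        chart_date = chart_date.date()
    return f"{BASE_URL}{chart_date.isoformat()}/"

print(chart_url("2024-10-05"))
# prints https://www.billboard.com/charts/hot-100/2024-10-05/
```

Passing something like "10/05/2024" raises a `ValueError` before any network call is made.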


3. Write a loop to retrieve the Billboard chart data for the last 10 weeks.

In [22]:
from datetime import datetime, timedelta

# Using function get_billboard_chart(date) from above

# Billboard chart dates fall on Saturdays
today = datetime.today()

# Find the most recent Saturday
last_saturday = today - timedelta(days=(today.weekday() + 2) % 7)

# Create a list of the last 10 Saturdays
dates = [(last_saturday - timedelta(weeks=i)).strftime("%Y-%m-%d") for i in range(10)]

# Retrieve the chart data for each week
charts = []
for d in dates:
    print(f"Fetching chart for {d}...")
    df = get_billboard_chart(d)
    df["Chart Date"] = d  # Add date column
    charts.append(df)

# Combine all 10 weeks into one DataFrame
all_charts = pd.concat(charts, ignore_index=True)

# Preview the result
print(all_charts.head())

Fetching chart for 2025-10-04...
Fetching chart for 2025-09-27...
Fetching chart for 2025-09-20...
Fetching chart for 2025-09-13...
Fetching chart for 2025-09-06...
Fetching chart for 2025-08-30...
Fetching chart for 2025-08-23...
Fetching chart for 2025-08-16...
Fetching chart for 2025-08-09...
Fetching chart for 2025-08-02...
  This Week         Song                                             Artist  \
0         1       Golden               HUNTR/X: EJAE, Audrey Nuna & REI AMI   
1         2     Ordinary                                        Alex Warren   
2         3     Soda Pop  Saja Boys: Andrew Choi, Neckwav, Danny Chung, ...   
3         4  What I Want                 Morgan Wallen Featuring Tate McRae   
4         5    Your Idol  Saja Boys: Andrew Choi, Neckwav, Danny Chung, ...   

  Last Week Peak Position Weeks on Chart  Chart Date  
0         1             1             14  2025-10-04  
1         2             1             33  2025-10-04  
2         5             3     
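
The Saturday arithmetic in the cell above relies on `weekday()` returning 0 for Monday and 5 for Saturday. Below is a sketch that wraps it in a function taking the reference day as a parameter, so the logic can be checked against known dates instead of depending on `datetime.today()`. `last_n_saturdays` is an illustrative name, not part of the original notebook.

```python
from datetime import date, timedelta

def last_n_saturdays(today, n=10):
    """Return the n most recent Saturdays on or before `today` as 'YYYY-MM-DD' strings."""
    # weekday(): Monday=0 ... Saturday=5, Sunday=6, so this is 0 on a Saturday
    days_since_saturday = (today.weekday() + 2) % 7
    last_saturday = today - timedelta(days=days_since_saturday)
    return [(last_saturday - timedelta(weeks=i)).isoformat() for i in range(n)]

# 2025-10-08 was a Wednesday, so the most recent Saturday is 2025-10-04
saturdays = last_n_saturdays(date(2025, 10, 8))
print(saturdays[0], saturdays[-1])
# prints 2025-10-04 2025-08-02
```

When the resulting dates are passed to `get_billboard_chart` in a loop, a short `time.sleep` between requests is a common courtesy to the server.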

Finished the bonus questions, but the code isn't all my own.