# Scraping to Plotting Lab

### Introduction

Ok, now it's time to work through the full process on our own, from scraping data to plotting it.

### Collecting our Data

Let's gather a list of popular songs.  Wikipedia maintains a list of the most-streamed songs on Spotify.  It's located at the following URL.

In [None]:
url = 'https://en.wikipedia.org/wiki/List_of_most-streamed_songs_on_Spotify'

Ok, now let's use pandas to read the list of tables from this URL.  Store the list of tables in the variable `tables`.

In [None]:
import pandas as pd

tables = pd.read_html(url)

Now that we have our list of tables, find the element that has the large table of songs, and store it as the variable `songs_table`.

In [13]:
songs_table = tables[0]
print(songs_table)

                      Rank                    Song  \
0                        1       "Blinding Lights"   
1                        2          "Shape of You"   
2                        3     "Someone You Loved"   
3                        4          "Dance Monkey"   
4                        5             "Sunflower"   
..                     ...                     ...   
96                      97  "Just The Way You Are"   
97                      98         "See You Again"   
98                      99             "I'm Yours"   
99                     100     "Despacito (remix)"   
100  As of August 8th 2023   As of August 8th 2023   

                                             Artist(s)     Streams (billions)  \
0                                           The Weeknd                  3.730   
1                                           Ed Sheeran                  3.580   
2                                        Lewis Capaldi                  2.911   
3                          
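If the songs table does not sit at index 0, one way to locate it is to check each table's column names. The sketch below is only illustrative: the `tables` list is a small hand-built stand-in for the result of `pd.read_html(url)`, and `find_table_with_columns` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Hand-built stand-in for the list returned by pd.read_html(url).
# The real page has more tables and more columns.
tables = [
    pd.DataFrame({'Year': ['2019'], 'Note': ['unrelated table']}),
    pd.DataFrame({
        'Rank': ['1', '2'],
        'Song': ['"Blinding Lights"', '"Shape of You"'],
        'Streams (billions)': ['3.730', '3.580'],
    }),
]

def find_table_with_columns(tables, required_columns):
    """Return the index of the first table that has every required column, else None."""
    for index, table in enumerate(tables):
        if set(required_columns).issubset(table.columns):
            return index
    return None

index = find_table_with_columns(tables, ['Song', 'Streams (billions)'])
songs_table = tables[index]
print(index)
```

Searching by column name is a little more robust than hard-coding an index, since the order of tables on the page can change.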

Once stored, we can convert our table, which is a pandas dataframe, to a list of dictionaries.  We can do this with the line `songs_table.to_dict('records')`.  Assign the result to the variable `songs`.

In [15]:
songs = songs_table.to_dict('records')
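To see what `to_dict('records')` produces, here is a minimal sketch on a two-row table. The `sample_table` name is made up for illustration, and its values are copied from the first two rows printed above.

```python
import pandas as pd

# Two rows copied from the printed table, with only two of its columns.
sample_table = pd.DataFrame({
    'Song': ['"Blinding Lights"', '"Shape of You"'],
    'Streams (billions)': ['3.730', '3.580'],
})

# orient='records' yields one dictionary per row, keyed by column name.
sample_songs = sample_table.to_dict('records')
print(sample_songs)
```

Each row becomes its own dictionary, which is why we can later loop over the list and look up `song['Song']`.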

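Check that your result has the same structure as the commented out data in the cell.  Wikipedia is edited regularly, so the keys and values you see may differ from the comment, as they do in the output below.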

In [16]:
songs[:2]

# [{'Song': '"Flowers"',
#   'Artist(s)': 'Miley Cyrus',
#   'Weeksat No. 1[152]': '1',
#   'Average streams(millions)': '96.0',
#   'Date published': '13 January 2023',
#   'Date achieved': '19 January 2023',
#   'Ref.': '[153]'},
#  {'Song': '"Kill Bill"',
#   'Artist(s)': 'SZA',
#   'Weeksat No. 1[152]': '2',
#   'Average streams(millions)': '44.4',
#   'Date published': '9 December 2022',
#   'Date achieved': '5 January 2023',
#   'Ref.': '[154]'}]

[{'Rank': '1',
  'Song': '"Blinding Lights"',
  'Artist(s)': 'The Weeknd',
  'Streams (billions)': '3.730',
  'Release date': 'November 29, 2019',
  'Ref.': '[3][4]'},
 {'Rank': '2',
  'Song': '"Shape of You"',
  'Artist(s)': 'Ed Sheeran',
  'Streams (billions)': '3.580',
  'Release date': 'January 6, 2017',
  'Ref.': '[5]'}]

### Converting our list of dictionaries

Ok, so now we have a list of dictionaries, and we would like to have two lists: one with the names of the top songs, and another with the number of streams for each of those songs.

First, use a for loop to create a list of the songs.  Store that as the variable `song_names`.

In [17]:
song_names = []
for song in songs:
    song_names.append(song['Song'])

In [18]:
song_names[:3]

# ['"Flowers"', '"Kill Bill"', '"Anti-Hero"']

['"Blinding Lights"', '"Shape of You"', '"Someone You Loved"']
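The same list can also be built with a list comprehension. This is just an alternative sketch, using two sample records shaped like the entries in `songs`.

```python
# Two records shaped like the entries in `songs` (values taken from the output above).
songs = [
    {'Song': '"Blinding Lights"', 'Streams (billions)': '3.730'},
    {'Song': '"Shape of You"', 'Streams (billions)': '3.580'},
]

# Equivalent to the for loop: build the list in a single expression.
song_names = [song['Song'] for song in songs]
print(song_names)
```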

Next, we need a list of the number of streams for each song.  Store it as the variable `streams`.

In [19]:
streams = []
for song in songs:
    streams.append(song['Streams (billions)'])

In [20]:
streams[:3]
# ['96.0', '44.4', '64.0']

['3.730', '3.580', '2.911']
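Notice that the stream counts are strings. If you want numbers instead, something like the following sketch could work. It assumes the footer row seen at the bottom of the printed table ("As of August 8th 2023") also shows up in the streams column; `raw_streams` and `to_float_or_none` are hypothetical names used only here.

```python
# Sample values shaped like `streams`; the last entry mimics the footer row.
raw_streams = ['3.730', '3.580', '2.911', 'As of August 8th 2023']

def to_float_or_none(text):
    """Return text as a float, or None if it is not a number."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None

# Convert every entry, then keep only the ones that parsed as numbers.
converted = [to_float_or_none(value) for value in raw_streams]
numeric_streams = [value for value in converted if value is not None]
print(numeric_streams)
```

If you filter `streams` this way, filter `song_names` with the same condition so that the two lists stay aligned.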

### Plotting our Data

Ok, now it's time for Plotly. We start by importing the `plotly.graph_objects` module under the alias `go`.

In [21]:
import plotly.graph_objects as go

Now we want to create a figure, and inside of the figure, place a trace.  Set up the trace so that it displays the correct information: song names along the x-axis and stream counts along the y-axis.  Remember our two lists are `song_names` and `streams`.

In [25]:
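import plotly.graph_objects as go

# Song names go on the x-axis and stream counts on the y-axis.
# hovertext shows the song name when hovering over a marker.
scatter = go.Scatter(x = song_names, y = streams, mode = 'markers', hovertext = song_names)

go.Figure(scatter)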

### Summary

Well, that's a job well done.  If you started these lessons without ever having coded before, please go give yourself a well deserved reward.  You earned it.

And if you'd like to keep going, check out one of our other courses or workshops.  You won't be disappointed :)