<a name='1'></a>
#1 - Install spotify sdk and pymongo in your Google Colab env

In [None]:
!pip install spotipy pymongo --upgrade



<a name='2'></a>
#2 - Create an Atlas Client on MongoDB

To sign up for a free MongoDB account, go to https://mongodb.com and create a new free account. Once your account is set up, you will be taken to the screen to create your cluster. Use the default settings for the free Atlas cluster (M0, as MongoDB refers to it) and click Create Cluster to get started. This takes you to the Clusters page while your new cluster is being created, which takes several minutes.

###Create your Database User and whitelist your IP address
Next, in the Atlas tab Security Quickstart, you will need to complete additional steps to get up and running:

*   Add your username and password, then click Create User: this enables you to log into your cluster.
*   Keep My Local Environment: this means adding your network IP addresses to the IP Access List. This can be modified at any time.
*   Click on Add My Current IP Address: this is a security measure that ensures only the IP addresses you verify are allowed to interact with your cluster. To connect to this cluster from multiple locations (school, home, work, etc.), you will need to whitelist each IP address from which you intend to connect.

Finally, click on Finish and Close.

###Connect to your Cluster

Go to Databases and click Connect. Connecting to a MongoDB Atlas database from Python requires a connection string. To get yours, click **Connect Your Application**. In **Select your driver and version**, choose Python 3.6 or later. Your connection string will appear below in **Add your connection string into your application code**. Click COPY to copy the string, then paste it into the `keys.py` file as the value of `mongo_connection_string`. Replace `<PASSWORD>` in the connection string with your password, and replace the database name `myFirstDatabase` with `mySpotifyDatabase`, which will be the database name in this assignment. At the bottom of the Connect to YourClusterName dialog, click Close. You are now ready to interact with your Atlas cluster.
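If your password contains characters such as `@` or `/`, they need to be percent-encoded before they go into the connection string. Here is a minimal sketch of assembling the string by hand. The helper name `build_connection_string`, the user, the password, and the host are all placeholders made up for illustration; use the values Atlas shows for your own cluster.

```python
from urllib.parse import quote_plus


def build_connection_string(user, password, host, database):
    """Assemble a mongodb+srv URI, percent-encoding the credentials."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?retryWrites=true&w=majority"
    )


# Placeholder values for illustration only; replace them with your own.
mongo_connection_string = build_connection_string(
    "ana", "p@ss/word", "cluster0.example.mongodb.net", "mySpotifyDatabase"
)
print(mongo_connection_string)
```

Keeping real credentials out of the notebook itself (for example in a separate file that you do not share) avoids publishing them by accident.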


In [None]:
MONGO_STRING = 'mongodb+srv://<USERNAME>:<PASSWORD>@cluster0.0fn4k.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0' #Include your mongo connection string here

In [None]:
from pymongo import MongoClient
#START YOUR CODE HERE
atlas_client = MongoClient(MONGO_STRING)   #Pass your cluster connection string to the client method
#END YOUR CODE HERE

In [None]:
#START YOUR CODE HERE
database = atlas_client["MySpotifydatabase"]                            #Create a database object and name it for your atlas_client
featured_albums_collection = database["Canada_Hits"]      #Select a name for your collection
#END YOUR CODE HERE

<a name='3'></a>
#3 - Create a Spotify APP

To get access to Spotify's API resources, you need to create a Spotify account if you don't already have one. A trial account will be enough to complete this lab.

1. Go to https://developer.spotify.com/, create an account and log in.
2. Click on the account name in the top-right corner and then click on **Dashboard**.
3. Create a new APP using the following details:
   - App name: You can choose the name; make sure you use only an alphanumeric string without special characters
   - App description: `DBMS test API application`
   - Website: leave empty
   - Redirect URIs: `http://localhost:6000`
   - API to use: select `Web API`
4. Click the **Save** button. If you get an error message saying that your account is not ready, log out, wait a few minutes, and then repeat steps 2-4.
5. On the App Home page, click on **Settings** and reveal `Client ID` and `Client secret`. Make sure you copy those and save them in a separate file!


Here's the link to [the Spotify API documentation](https://developer.spotify.com/documentation/web-api/tutorials/getting-started) that you can refer to while you're working on this assignment.

<a name='4'></a>
#4 - Create a Spotify ClientCredential object using the SDK

The Spotipy SDK is a Python client for interacting with Spotify’s Web API. It provides a range of functions to access and manage data related to artists, albums, tracks, playlists, and user profiles. Here’s an overview of some key capabilities that you will explore in this assignment:





*   **Accessing Artist Information**: With Spotipy, you can retrieve detailed information about artists, including their name, genres, popularity score, and followers. The SDK also allows access to an artist's top tracks and related artists, which can help students explore music trends and build up artist profiles for batch storage in MongoDB.

*   **Track and Album Metadata**: Spotipy enables access to metadata for tracks and albums, such as track name, album name, release date, and track popularity. Additionally, you can retrieve audio features like tempo, danceability, and energy, which provide in-depth details about the music and are valuable for data analysis.

*   **Searching for Content**: Using Spotipy’s search functionality, you can query the Spotify catalog by keywords for artists, albums, playlists, or tracks. This can be instrumental in batch processing, as users can search for multiple artists or songs and gather relevant data in one go.

*   **User Profile and Playlist Management**: Spotipy also supports accessing Spotify user profiles and playlists, though this is less relevant for the assignment. However, this feature could provide additional context or personalization if students wanted to explore user-based music preferences.


*   **Authorization and Access Control**: Spotipy handles authorization with Spotify’s OAuth, ensuring that only authenticated requests are made. This allows students to securely access data and manage the rate limits associated with the Spotify API.

In [None]:
import spotipy
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials

The first step in working with an API is understanding its authentication process. For Spotify, this involves using a Client ID and Client Secret generated by the Spotify app to obtain an access token. The access token is a string containing the credentials and permissions required to access specific resources. For more information, refer to the Spotify [API documentation](https://developer.spotify.com/documentation/web-api/concepts/access-token).

Since each API is designed with unique features, it’s essential to review its documentation thoroughly to access data responsibly. Throughout this lab, you’ll find links to documentation; it’s recommended to review these during and after the session as needed.

Now, let’s create variables to store the client_id and client_secret values.

In [None]:
CLIENT_ID = '<YOUR_CLIENT_ID>'          #Include your client ID here
CLIENT_SECRET = '<YOUR_CLIENT_SECRET>'  #Include your client Secret here

In [None]:
credentials = SpotifyClientCredentials(
        client_id=CLIENT_ID,
        client_secret=CLIENT_SECRET
    )

spotify = spotipy.Spotify(client_credentials_manager=credentials, language='en')  #You can change this if you want to get data in a different language

When working with the Spotify API, you'll receive a temporary access token, with its validity period specified in the `expires_in` field (in seconds). Once this token expires, any subsequent requests will fail and return an error with a status code of 401, indicating that the request is unauthorized.

For each API request you send to Spotify, the access token must be included in the request’s authorization header. The `get_auth_header` function is provided to streamline this when you build requests by hand: it takes the access token as input and returns a properly formatted authorization header. The `spotify` client created above attaches this header for you on every call.

**If you get a 401 response, please make sure to create your access token again by executing the code below!**

In [None]:
credentials.get_access_token()


{'access_token': 'BQBwxafUTW8xOrWqM5dspolNJt1wVKWlDOl-cEbJXPPwKPNX41Vosoq-vwankAdJfXTEQ4M3QfVZaB01KFCFve3h3gmuld1e6JxyEcXUOzgNMsy4dzU',
 'token_type': 'Bearer',
 'expires_in': 3600,
 'expires_at': 1732603853}

The response above includes the token's lifetime in seconds (`expires_in`) and the Unix timestamp at which it expires (`expires_at`). Once the token expires, you will need to create a new one.
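To make the expiry fields concrete, here is a small sketch that decides whether a token dictionary like the one above is still usable. The function name `token_is_expired` and the 60-second safety margin are assumptions chosen for illustration, and the token string is shortened to a dummy value.

```python
import time


def token_is_expired(token_info, now=None, margin_seconds=60):
    """Return True when the token has expired or is about to expire."""
    current = time.time() if now is None else now
    return token_info["expires_at"] - current < margin_seconds


# Sample values shaped like the dictionary shown above (dummy token string).
sample_token = {
    "access_token": "example",
    "token_type": "Bearer",
    "expires_in": 3600,
    "expires_at": 1732603853,
}

issued_at = sample_token["expires_at"] - sample_token["expires_in"]
print(token_is_expired(sample_token, now=issued_at))         # freshly issued
print(token_is_expired(sample_token, now=issued_at + 3590))  # 10 seconds left
```

Passing `now` explicitly keeps the check easy to try out; in the notebook you would call `token_is_expired(token)` with no extra arguments.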

<a name='5'></a>
#5 - Get new releases data from Spotify API

Select one of the following country codes to fetch data from Spotify based on the country of your choice:

* AU: Australia
* AT: Austria
* BE: Belgium
* BO: Bolivia
* BR: Brazil
* BG: Bulgaria
* CA: Canada
* CL: Chile
* CO: Colombia
* CR: Costa Rica
* CY: Cyprus
* DO: Dominican Republic
* FI: Finland
* FR: France
* DE: Germany
* GT: Guatemala
* HN: Honduras
* HK: Hong Kong
* IE: Ireland
* IT: Italy
* JP: Japan
* LV: Latvia
* LU: Luxembourg
* MY: Malaysia
* MT: Malta
* MX: Mexico
* MC: Monaco
* NL: Netherlands
* NZ: New Zealand
* NI: Nicaragua
* PY: Paraguay
* PE: Peru
* PH: Philippines
* PL: Poland
* PT: Portugal
* SG: Singapore
* ES: Spain
* SK: Slovakia
* SE: Sweden
* CH: Switzerland
* TW: Taiwan
* TR: Turkey
* GB: United Kingdom
* US: United States
* UY: Uruguay


**Your task**:
*   Select one country from the list above for which you will retrieve data from the Spotify API.
*   Use the limit parameter to specify the number of records you want to retrieve. Default: 20. Minimum: 1. Maximum: 50
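
Before sending the request, you can check that your chosen limit falls inside the allowed range. This is only a sketch: `validate_limit` is a hypothetical helper, not part of Spotipy, and the bounds are the ones listed above.

```python
def validate_limit(limit, minimum=1, maximum=50):
    """Return the limit unchanged if it is an integer between minimum and maximum."""
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise TypeError("limit must be an integer")
    if not minimum <= limit <= maximum:
        raise ValueError(f"limit must be between {minimum} and {maximum}, got {limit}")
    return limit


print(validate_limit(20))
```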


In [None]:
#START YOUR CODE HERE
COUNTRY_CODE = 'CA'
LIMIT = 20
#END YOUR CODE HERE

Now, let's use the token to perform a request to access the first resource, which is the [new_releases](https://spotipy.readthedocs.io/en/2.22.1/?highlight=featured_playlists#spotipy.client.Spotify.new_releases).

**Your tasks**:


1.   Follow the link above, call the correct Spotipy method to get the new releases, and store the response in `featured_albums`.
2.   Loop through the response and store each record (album) in the MongoDB collection you created above. HINT: Explore the `featured_albums` response to understand how it is structured, and check the [mongodb doc](https://www.mongodb.com/docs/manual/reference/method/db.collection.insertOne/?msockid=2c010af6d0b963db3ebe1e3ed1496248)



In [None]:
#START YOUR CODE HERE
featured_albums = spotify.new_releases(country=COUNTRY_CODE, limit=LIMIT)

for album in featured_albums['albums']['items']:
    featured_albums_collection.insert_one(album)
#END YOUR CODE HERE
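
Running the cell above more than once stores every album again. One possible way to keep the collection free of duplicates is to replace documents by their Spotify `id` and insert them only when they are missing, using PyMongo's `replace_one` with `upsert=True`. The helper name `upsert_albums` is hypothetical, and the sketch assumes each album is a fresh dictionary from the API response.

```python
def upsert_albums(collection, albums):
    """Store each album, overwriting any stored copy that has the same Spotify id."""
    for album in albums:
        collection.replace_one({"id": album["id"]}, album, upsert=True)
    return len(albums)


# In the notebook this could be called as:
# upsert_albums(featured_albums_collection, featured_albums['albums']['items'])
```

The trade-off is one round trip per album, which is fine for the small `limit` values used here.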

<a name='6'></a>
#6 - Explore your MongoDB collection

This script will connect to the MongoDB collection, query for specific fields (artist ID, name, and URI), and load the data into a list.

**Your tasks:**


1.   Check the following [link](https://www.mongodb.com/docs/manual/reference/method/db.collection.find/) to explore how to use the find() method to query specific fields from your collection. Ensure that your query retrieves only each artist's id, name and uri from your collection. Make sure to read the documentation.
2.   Once you get the data from your query, combine all records into the `artists_data` list and create a pandas DataFrame from it.



In [None]:
artists_data = []
#START YOUR CODE HERE
for album in featured_albums_collection.find({},
    {   "artists.id": 1,
        "artists.name": 1,
        "artists.uri": 1,
        "_id": 0 }):
    for artist in album.get("artists", []):
        artist_info = {
            "artist_id": artist["id"],
            "artist_name": artist["name"],
            "artist_uri": artist["uri"]
        }
        artists_data.append(artist_info)
#END YOUR CODE HERE

# Convert the list of artist data into a pandas DataFrame
artists_df = pd.DataFrame(artists_data)
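
The same artist can appear on several new releases, so `artists_data` may contain repeated entries. A minimal sketch of removing them while keeping the original order follows; `unique_artists` is a hypothetical helper and the sample ids are made up.

```python
def unique_artists(artists):
    """Keep the first occurrence of each artist_id, preserving order."""
    seen = set()
    unique = []
    for artist in artists:
        if artist["artist_id"] not in seen:
            seen.add(artist["artist_id"])
            unique.append(artist)
    return unique


# Made-up sample records with the same keys as artists_data.
sample_artists = [
    {"artist_id": "id1", "artist_name": "Artist A", "artist_uri": "spotify:artist:id1"},
    {"artist_id": "id2", "artist_name": "Artist B", "artist_uri": "spotify:artist:id2"},
    {"artist_id": "id1", "artist_name": "Artist A", "artist_uri": "spotify:artist:id1"},
]
print([a["artist_name"] for a in unique_artists(sample_artists)])
```

With pandas, `artists_df.drop_duplicates(subset="artist_id")` gives the same result on the DataFrame.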

<a name='7'></a>
#7 - Get all albums from the featured Artists

When we used the `new_releases` method, we queried all new releases (based on the parameters you picked) from the Spotify API. This allowed us to save multiple documents (each being a single release) and to collect their artists in an object called `artists_data`. Now your job is to retrieve every album from the list of artists you got from the `new_releases` method.

Your tasks:


1.   Loop through the `artists_data` object and get the `artist_uri` for each artist. This value is required to call the `artist_albums` method and get all `albums` for that `artist_uri`. Learn more about [artist_albums](https://spotipy.readthedocs.io/en/2.22.1/?highlight=featured_playlists#spotipy.client.Spotify.artist_albums) and [artist_uri](https://spotipy.readthedocs.io/en/2.22.1/?highlight=featured_playlists#ids-uris-and-urls) by following the links.
2.   Create a temporary variable to store the results from the `artist_albums` method. Store only the `items` key from the results in a variable called `albums`.
3.   We want to build a new list with all the albums that an artist has. To do this, first add a new key named `artist_name` to each album, containing the `artist_name` that you got from `artists_data`.
4.   Append each album to the `artists_albums` list.
5.   The Spotify API uses "pagination": when more results exist than fit in one response, the response's `next` key holds the URL of the next set of results, which lets us make consecutive requests for the same resource. Your job is to use the `next` [method](https://spotipy.readthedocs.io/en/2.22.1/?highlight=featured_playlists#spotipy.client.Spotify.next) to get the next albums for a given artist. Do not forget to include the `artist_name`, just as you did in step 3.



In [None]:
artists_albums = []

#START YOUR CODE HERE
for artist in artists_data:
    results = spotify.artist_albums(artist['artist_uri'], album_type='album')
    albums = results['items']

    # Add artist's name to each album in the initial results
    for album in albums:
        album['artist_name'] = artist['artist_name']
        artists_albums.append(album)

    # Loop through paginated results, adding artist's name
    while results['next']:
        results = spotify.next(results)
        for album in results['items']:
            album['artist_name'] = artist['artist_name']
            artists_albums.append(album)
#END YOUR CODE HERE
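
The pagination loop can also be written once as a generator and reused for any paged response. The sketch below relies only on the `items` and `next` keys and on a client with a `next` method, as described in step 5. `iter_all_items` and `FakePagedClient` are hypothetical names; the fake client stands in for `spotify` so the loop can be tried without network access.

```python
def iter_all_items(client, first_page):
    """Yield every item from a paged response, following the `next` key."""
    page = first_page
    while page:
        yield from page["items"]
        page = client.next(page) if page["next"] else None


class FakePagedClient:
    """Stand-in for the Spotipy client, used only to demonstrate the loop."""

    def __init__(self, pages):
        self.pages = pages

    def next(self, page):
        return self.pages[page["next"]]


pages = {"page2": {"items": ["c"], "next": None}}
first = {"items": ["a", "b"], "next": "page2"}
print(list(iter_all_items(FakePagedClient(pages), first)))
```

In the notebook, `iter_all_items(spotify, results)` would replace the separate handling of the first page and the following pages.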

<a name='8'></a>
#8 - Create New MongoDB collection

Now that you have your new object with all artists' albums, you will need to create a new collection in your MongoDB cluster and store that data in it.

Remember to look at this [documentation](https://www.mongodb.com/docs/manual/tutorial/insert-documents/#:~:text=Collection.-,insertOne()%20inserts%20a%20single%20document%20into%20a%20collection.,value%20to%20the%20new%20document) to learn more about MongoDB. Also, DO NOT FORGET to go and verify that the data is in your MongoDB cluster.

In [None]:
#START YOUR CODE HERE
database = atlas_client["Canada_new_Database"]       #Select the name of your database
albums_collection = database["Trending_Hits"]  #Select the name of your collection

for album in artists_albums:
    albums_collection.insert_one(album)     #Insert the data into MongoDB
#END YOUR CODE HERE
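
With several hundred albums, one `insert_one` call per document means several hundred round trips to the cluster. A possible alternative is to send the documents in batches with PyMongo's `insert_many`. The helper names `chunked` and `insert_in_batches`, and the batch size of 100, are assumptions for this sketch.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_in_batches(collection, documents, batch_size=100):
    """Insert documents batch by batch and return how many were sent."""
    inserted = 0
    for batch in chunked(documents, batch_size):
        collection.insert_many(batch)  # each batch is never empty
        inserted += len(batch)
    return inserted


# In the notebook this could be called as:
# insert_in_batches(albums_collection, artists_albums)
```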

<a name='9'></a>
#9 - Explore your data!
You have now collected all albums from artists with new releases. Your next task is to explore and analyze this data using Python and MongoDB.

Answer the following questions based on the data in your collection:


1.   How many albums are stored in the collection?
2.   Which artist has the most albums in the collection?
3.   Which artist has the least albums in the collection?
4.   What is the average number of tracks per album? (*Include the Artist Name*)
5.   How many albums are available in each market?
6.   What is the release date of the oldest album? (*Include the Artist Name*)
7.   What are the top 5 albums with the most tracks? (*Include the Artist Name*)
8.   Which albums are available in more than 60 markets? (*Include the Artist Name*)
9.   How many albums does each artist have, and what is the average number of tracks per album for each artist?
10.  Which albums have the word "Deluxe" in their title? (*Include the Artist Name*)

For your reference, here are the MongoDB commands that will be useful for these tasks:

[Aggregate](https://www.mongodb.com/docs/manual/reference/command/aggregate/)

[Find](https://www.mongodb.com/docs/manual/reference/command/find/)


In [None]:
# Question 1:
total_albums_count = albums_collection.count_documents({})
print(f"Total Albums: {total_albums_count}")

Total Albums: 636


In [None]:
# Question 2:
artist_with_most_albums = albums_collection.aggregate([
    {"$group": {"_id": "$artist_name", "album_count": {"$sum": 1}}},
    {"$sort": {"album_count": -1}},
    {"$limit": 1}
])
print(f"Artist with most albums: {list(artist_with_most_albums)}")

Artist with most albums: [{'_id': 'Taylor Swift', 'album_count': 87}]


In [None]:
# Question 3:
artist_with_least_albums = albums_collection.aggregate([
    {"$group": {"_id": "$artist_name", "least_album_count": {"$sum": 1}}},
    {"$sort": {"least_album_count": 1}},
    {"$limit": 1}
])

print(f"Artist with least albums: {list(artist_with_least_albums)}")

Artist with least albums: [{'_id': 'Elvie Shane', 'least_album_count': 6}]
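
To cross-check the answers to Questions 2 and 3, the same counts can be computed in plain Python. This sketch mirrors the `$group` with `$sum: 1` stage using `collections.Counter`; `album_counts` is a hypothetical helper and the sample records are made up.

```python
from collections import Counter


def album_counts(albums):
    """Count how many album documents carry each artist_name."""
    return Counter(album["artist_name"] for album in albums)


# Made-up sample documents with the key used by the pipelines above.
sample_albums = [
    {"artist_name": "Artist A"}, {"artist_name": "Artist A"}, {"artist_name": "Artist A"},
    {"artist_name": "Artist B"}, {"artist_name": "Artist B"},
    {"artist_name": "Artist C"},
]

counts = album_counts(sample_albums)
most = counts.most_common(1)[0]
least = min(counts.items(), key=lambda pair: pair[1])
print(most, least)
```

Note that `{"$limit": 1}` returns a single artist even when several are tied; inspecting the full `counts` object shows any ties.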


In [None]:
# Question 4
average_track_per_album = albums_collection.aggregate([
    {"$match": {"album_type": "album"}},
    {"$unwind": "$artists"},  # Unwind the artists array to get each artist separately
    {"$group": {"_id": "$artists.name", "average_track": {"$avg": "$total_tracks"}}},
    {"$sort": {"average_track": -1}}
])
for a in average_track_per_album:
    artist_name = a["_id"]
    average_track = round(a["average_track"], 2)
    print(f"Artist: {artist_name} has an average of {average_track} tracks per album.")

Artist: DJ Snake has an average of 21.0 tracks per album.
Artist: Taylor Swift has an average of 19.9 tracks per album.
Artist: YUNGMORPHEUS has an average of 18.0 tracks per album.
Artist: Honey Dijon has an average of 17.5 tracks per album.
Artist: Eyedress has an average of 17.45 tracks per album.
Artist: DJ Dan has an average of 17.0 tracks per album.
Artist: Bumpy Knuckles has an average of 17.0 tracks per album.
Artist: Sonny Fodera has an average of 17.0 tracks per album.
Artist: Anne Wilson has an average of 16.75 tracks per album.
Artist: Hozier has an average of 16.67 tracks per album.
Artist: Nicky Jam has an average of 16.29 tracks per album.
Artist: Tee Grizzley has an average of 16.15 tracks per album.
Artist: Olamide has an average of 15.45 tracks per album.
Artist: The Lumineers has an average of 15.29 tracks per album.
Artist: Lucky Daye has an average of 15.17 tracks per album.
Artist: The Game has an average of 15.0 tracks per album.
Artist: Blaq Poet has an average 

In [None]:
# Question 5
albums_available_in_market = albums_collection.aggregate([
    {"$unwind": "$available_markets"},
    {"$group": {"_id": "$available_markets", "album_count": {"$sum": 1}}},
    {"$sort": {"album_count": -1}}
])
print(f"Albums per market: {list(albums_available_in_market)}")

Albums per market: [{'_id': 'US', 'album_count': 636}, {'_id': 'CA', 'album_count': 633}, {'_id': 'MX', 'album_count': 576}, {'_id': 'LU', 'album_count': 570}, {'_id': 'SL', 'album_count': 570}, {'_id': 'SN', 'album_count': 570}, {'_id': 'ZA', 'album_count': 570}, {'_id': 'BH', 'album_count': 570}, {'_id': 'VN', 'album_count': 570}, {'_id': 'IN', 'album_count': 570}, {'_id': 'NO', 'album_count': 570}, {'_id': 'UY', 'album_count': 570}, {'_id': 'EG', 'album_count': 570}, {'_id': 'RO', 'album_count': 570}, {'_id': 'MW', 'album_count': 570}, {'_id': 'OM', 'album_count': 570}, {'_id': 'GR', 'album_count': 570}, {'_id': 'CR', 'album_count': 570}, {'_id': 'LV', 'album_count': 570}, {'_id': 'CV', 'album_count': 570}, {'_id': 'CY', 'album_count': 570}, {'_id': 'SK', 'album_count': 570}, {'_id': 'AT', 'album_count': 570}, {'_id': 'ME', 'album_count': 570}, {'_id': 'NE', 'album_count': 570}, {'_id': 'TZ', 'album_count': 570}, {'_id': 'HK', 'album_count': 570}, {'_id': 'UA', 'album_count': 570}, 

In [None]:
# Question 6
oldest_album_release = albums_collection.find({},
 {"artist_name": 1, "release_date": 1, "_id": 0}).sort("release_date", 1).limit(1)
print(f"Oldest Album Release: {list(oldest_album_release)}")

Oldest Album Release: [{'release_date': '1983', 'artist_name': 'Nile Rodgers'}]
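
Spotify reports `release_date` as a string whose precision varies: it can be a year (`1983`), a year and month, or a full date, with the precision given in the `release_date_precision` field. Sorting the strings works here, but converting them to real dates makes comparisons explicit. A minimal sketch, assuming missing parts should default to 1 (`parse_release_date` is a hypothetical helper and the sample dates are made up):

```python
from datetime import date


def parse_release_date(value):
    """Turn 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' into a date, padding missing parts with 1."""
    parts = [int(part) for part in value.split("-")]
    while len(parts) < 3:
        parts.append(1)
    return date(*parts)


sample_dates = ["1991-08-27", "1983", "2006-05"]
oldest = min(sample_dates, key=parse_release_date)
print(oldest, parse_release_date(oldest))
```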


In [None]:
# Question 7
top_5_albums_most_track = albums_collection.aggregate([
    {"$match":{"album_type":"album"}},
    {"$unwind":"$artists"},
    {"$group" :
     {"_id":{"album_name":"$name","artist_name":"$artist_name"},
     "tracks_count": {"$sum":"$total_tracks"}}},

    {"$sort":{"tracks_count":-1}},
    {"$limit":5}
])
print(f"Top 5 albums with most tracks: {list(top_5_albums_most_track)}")

Top 5 albums with most tracks: [{'_id': {'album_name': 'Sounds Of InStereo, Vol. 3', 'artist_name': 'Honey Dijon'}, 'tracks_count': 153}, {'_id': {'album_name': 'reputation Stadium Tour Surprise Song Playlist', 'artist_name': 'Taylor Swift'}, 'tracks_count': 138}, {'_id': {'album_name': '2 Sides To Every Story', 'artist_name': 'DJ Premier'}, 'tracks_count': 126}, {'_id': {'album_name': 'Siblings', 'artist_name': 'Eyedress'}, 'tracks_count': 126}, {'_id': {'album_name': 'Affable with Pointed Teeth', 'artist_name': 'Eyedress'}, 'tracks_count': 108}]
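
The pipeline above adds up `total_tracks` for every document that shares an album name and artist, and `$unwind` creates one copy of each document per credited artist. If an album is stored more than once or has several artists, `tracks_count` becomes a multiple of its real track count. One possible alternative, sketched here, groups by the Spotify album `id` and keeps the first value of each field. The names `top_albums_pipeline` and `top_albums_by_tracks` and the sample documents are made up for illustration.

```python
top_albums_pipeline = [
    {"$match": {"album_type": "album"}},
    # One group per Spotify album id, so repeated documents collapse into one
    {"$group": {
        "_id": "$id",
        "album_name": {"$first": "$name"},
        "artist_name": {"$first": "$artist_name"},
        "total_tracks": {"$first": "$total_tracks"},
    }},
    {"$sort": {"total_tracks": -1}},
    {"$limit": 5},
]


def top_albums_by_tracks(albums, n=5):
    """Plain-Python equivalent of the pipeline, handy for checking its result."""
    unique = {}
    for album in albums:
        if album["album_type"] == "album":
            unique.setdefault(album["id"], album)
    ranked = sorted(unique.values(), key=lambda a: a["total_tracks"], reverse=True)
    return [(a["name"], a["artist_name"], a["total_tracks"]) for a in ranked[:n]]


# Made-up documents; album "x" is stored twice on purpose.
sample_albums = [
    {"id": "x", "name": "Album X", "artist_name": "Artist A", "album_type": "album", "total_tracks": 17},
    {"id": "x", "name": "Album X", "artist_name": "Artist A", "album_type": "album", "total_tracks": 17},
    {"id": "y", "name": "Album Y", "artist_name": "Artist B", "album_type": "album", "total_tracks": 21},
    {"id": "z", "name": "Single Z", "artist_name": "Artist B", "album_type": "single", "total_tracks": 1},
]
print(top_albums_by_tracks(sample_albums))
```

In the notebook, the pipeline would be run with `albums_collection.aggregate(top_albums_pipeline)`.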


In [None]:
# Question 8
find_album = albums_collection.aggregate([
    {"$project": {"name": 1, "available_markets": 1, "artist_name": 1}},
    {"$match": {"$expr": {"$gt": [{"$size": "$available_markets"}, 60]}}}
])

for a in find_album:
    print(f"{a['name']} by {a['artist_name']} is available in {len(a['available_markets'])} markets.")


THE TORTURED POETS DEPARTMENT: THE ANTHOLOGY by Taylor Swift is available in 183 markets.
THE TORTURED POETS DEPARTMENT by Taylor Swift is available in 183 markets.
1989 (Taylor's Version) [Deluxe] by Taylor Swift is available in 183 markets.
1989 (Taylor's Version) by Taylor Swift is available in 183 markets.
Speak Now (Taylor's Version) by Taylor Swift is available in 183 markets.
Midnights (The Til Dawn Edition) by Taylor Swift is available in 183 markets.
Midnights (3am Edition) by Taylor Swift is available in 183 markets.
Midnights by Taylor Swift is available in 183 markets.
Red (Taylor's Version) by Taylor Swift is available in 184 markets.
Fearless (Taylor's Version) by Taylor Swift is available in 184 markets.
evermore (deluxe version) by Taylor Swift is available in 184 markets.
evermore by Taylor Swift is available in 183 markets.
folklore: the long pond studio sessions (from the Disney+ special) [deluxe edition] by Taylor Swift is available in 184 markets.
folklore (deluxe 

In [None]:
# Question 9
artist_album = albums_collection.aggregate([
    {"$match": {"album_type": "album"}},
    {"$group": {
        "_id": "$artist_name",
        "album_count": {"$sum": 1},
        "average_tracks": {"$avg": "$total_tracks"}
    }},
    {"$sort": {"album_count": -1}}
])

for a in artist_album:
    artist_name = a["_id"]
    album_count = a["album_count"]
    avg_tracks = round(a["average_tracks"], 2)
    print(f"Artist: {artist_name} has {album_count} albums and an average of {avg_tracks} tracks per album.")

Artist: Taylor Swift has 87 albums and an average of 19.9 tracks per album.
Artist: Nas has 75 albums and an average of 14.44 tracks per album.
Artist: Pearl Jam has 57 albums and an average of 14.47 tracks per album.
Artist: Tee Grizzley has 39 albums and an average of 16.15 tracks per album.
Artist: Eyedress has 33 albums and an average of 17.45 tracks per album.
Artist: Olamide has 33 albums and an average of 15.45 tracks per album.
Artist: Rauw Alejandro has 24 albums and an average of 14.12 tracks per album.
Artist: The Lumineers has 21 albums and an average of 15.29 tracks per album.
Artist: Kygo has 21 albums and an average of 14.29 tracks per album.
Artist: Nicky Jam has 21 albums and an average of 16.29 tracks per album.
Artist: DJ Premier has 21 albums and an average of 14.71 tracks per album.
Artist: Fontaines D.C. has 18 albums and an average of 11.83 tracks per album.
Artist: Lucky Daye has 18 albums and an average of 15.17 tracks per album.
Artist: Peso Pluma has 18 album

In [None]:
# Question 10
deluxe_albums = albums_collection.find(
    {"name": {"$regex": "Deluxe", "$options": "i"}},
    {"artist_name": 1, "name": 1, "_id": 0}
)
print(f"Deluxe albums: {list(deluxe_albums)}")

Deluxe albums: [{'name': "1989 (Taylor's Version) [Deluxe]", 'artist_name': 'Taylor Swift'}, {'name': 'evermore (deluxe version)', 'artist_name': 'Taylor Swift'}, {'name': 'folklore: the long pond studio sessions (from the Disney+ special) [deluxe edition]', 'artist_name': 'Taylor Swift'}, {'name': 'folklore (deluxe version)', 'artist_name': 'Taylor Swift'}, {'name': '1989 (Deluxe Edition)', 'artist_name': 'Taylor Swift'}, {'name': 'Red (Deluxe Edition)', 'artist_name': 'Taylor Swift'}, {'name': 'Speak Now (Deluxe Edition)', 'artist_name': 'Taylor Swift'}, {'name': 'Candydrip (Deluxe)', 'artist_name': 'Lucky Daye'}, {'name': 'Painted (Deluxe Edition)', 'artist_name': 'Lucky Daye'}, {'name': 'Carte Blanche (Deluxe)', 'artist_name': 'DJ Snake'}, {'name': 'My Jesus (Anniversary Deluxe)', 'artist_name': 'Anne Wilson'}, {'name': 'Life Is Good (Deluxe)', 'artist_name': 'Nas'}, {'name': 'Sad Romance (Deluxe)', 'artist_name': 'CKay'}, {'name': 'Tee’s Coney Island (Deluxe)', 'artist_name': 'Tee

<a name='10'></a>
#10 - Create an interactive map using Folium!

Folium is a Python library that simplifies the creation of interactive, visually appealing maps. It acts as a wrapper for the Leaflet.js JavaScript library, allowing users to create maps in Python without needing to write JavaScript. [Learn More](https://python-visualization.github.io/folium/latest/)

Here are a few key concepts about this library:

* **Map Initialization**: Folium provides a Map class that lets users set a central location and zoom level, initializing a map on which they can place markers or other geographic elements.
* **Adding Markers**: Folium’s Marker class lets students place icons on the map at specific locations, which can contain popups with details (like artist names and album information). This feature is key for visualizing different locations where an artist’s album is available.
* **Customization and Interactivity**: The library supports customizing marker icons, colors, and popups, making the map both interactive and visually informative. Users can click on markers to view additional information about the album or artist, which makes exploring data on a map engaging.

In this assignment, Folium will allow you to see where each album is available. Each marker represents a market (country) in which the artist’s album has been released, making it easy to see the global spread and reach of the album. By adding artist and album information to the markers, you can visually assess which artists and albums have the widest distribution.

**Geopy** is a Python library that enables geocoding—converting addresses or location names (like country codes) into latitude and longitude coordinates, which can then be used for plotting on a map.

Here’s how we will be using it:

* **Geocoding Services**: Geopy can connect to multiple geocoding providers (like Nominatim, Google Maps, etc.) to look up geographic data. When given a country code, Geopy queries the provider to retrieve the corresponding latitude and longitude.
* **Caching and Rate Limiting**: Geopy includes rate limiting to prevent users from overwhelming the service with requests. This is particularly helpful in this assignment because many albums are available in multiple countries; without a cache, we would send one request per country per album.
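
The caching idea can be sketched independently of Geopy: wrap any lookup function so that each key is requested only once. `make_cached_lookup` and `fake_geocode` are hypothetical names, and the coordinates are made-up values; the fake function stands in for a real geocoding request so the effect on the number of calls is visible.

```python
def make_cached_lookup(lookup):
    """Wrap `lookup` so that each key is only looked up once."""
    cache = {}

    def cached(key):
        if key not in cache:
            cache[key] = lookup(key)
        return cache[key]

    return cached


calls = []


def fake_geocode(country_code):
    """Stand-in for a geocoding request; records every call it receives."""
    calls.append(country_code)
    return {"CA": (1.0, 2.0)}.get(country_code)  # made-up coordinates


get_coords = make_cached_lookup(fake_geocode)
results = [get_coords(market) for market in ["CA", "US", "CA", "CA"]]
print(results)
print(calls)
```

Four lookups result in only two calls to the underlying function, one per distinct country code. Failed lookups (`None`) are cached as well, so an unknown code is not requested again.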

In [None]:
import folium
import time
from pymongo import MongoClient
from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut

Your tasks:


1.   Because your collection can have many albums and many markets, you first need to create a new query using pymongo to get the specific records from your collection that will be used for your map. Your task is to create a variable that stores the top 10 albums ordered by the latest `release_date`.
2.   We will send requests to the Nominatim geocoding provider, so you need to give a name to the geolocator.
3.   We have defined a function called `get_coordinates` that takes a `country_code` as input and returns a tuple with the coordinates of that country. We want to avoid repeated API requests for countries whose coordinates we have already requested (multiple albums may share the same market). Your task is to check whether the `country_code` passed to `get_coordinates` has already been looked up by checking if it is in our `coordinates_cache` dictionary.
4.   If the country code has not been retrieved before, pass it to the geocode method to get the coordinates.
5.   Create a tuple with the `latitude` and `longitude` values.
6.   In step 1 you got a variable with a list of albums that you need to map using Folium. Loop through it and get the following information for each album: `artist_name`, `name` and `available_markets`.
7.   Because the `available_markets` field of each record can contain multiple countries, loop through each country, call the `get_coordinates` function you just created, and pass it the current market.
8.   Now you are ready to create a Folium Marker. This marker should have the `artist_name`, `album_name` and `market`. You can change the colors and icon if you like. Learn more about this [here](https://python-visualization.github.io/folium/latest/reference.html#folium.map.Marker)



NOTE: The way we have constructed the map only allows a single record to be shown on the map. Please avoid creating a query that returns multiple records, as only the last record will be mapped using the code below.




In [None]:
# Initialize a Folium map centered globally
map_ = folium.Map(location=[20, 0], zoom_start=2)
coordinates_cache = {}

#START YOUR CODE HERE
geolocator = Nominatim(user_agent="albums_geolocator")   #Name your geolocator

latest_albums = albums_collection.find({}, {"artist_name": 1, "name": 1, "available_markets": 1, "_id": 0}).sort("release_date", -1).limit(10)      #Create your query using pymongo here

def get_coordinates(country_code):
    # Reuse coordinates that were already requested for this country
    if country_code in coordinates_cache:
        return coordinates_cache[country_code]

    try:
        location = geolocator.geocode(country_code, timeout=10)
    except GeocoderTimedOut:
        return None
    finally:
        time.sleep(1)   # Pause only after a real request, not after a cache hit

    if location:
        coords = (location.latitude, location.longitude)
        coordinates_cache[country_code] = coords
        return coords
    return None

# Loop through each album and add markers for available markets
for album in latest_albums:
    artist_name = album["artist_name"]
    album_name = album["name"]
    available_markets = album["available_markets"]

    for market in available_markets:
        coords = get_coordinates(market)
        if coords:
            # Add a marker with a popup showing artist and album information
            folium.Marker(
                location=coords,
                popup=f"Artist: {artist_name}<br>Album: {album_name}<br>Market: {market}",
                icon=folium.Icon(color="blue", icon="music")
            ).add_to(map_)
#END YOUR CODE HERE
map_