# GOLDFINCH
![title](gold_1x.JPG)

## Spotify Song Recommendation Based on Audio Feature Clustering

    1- Web scraping with BeautifulSoup to get the Billboard Hot 100
    2- Spotify API to build the Spotify dataset
    3- Clustering with KMeans (5 clusters)
    4- Flexible search (song or artist)
    5- Typos are tolerated: a query still matches when it is at least 60% similar to a title or name (difflib's default cutoff), i.e. up to roughly 40% error
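
The 40% typo tolerance in the list above comes from `difflib.get_close_matches`, whose default `cutoff` is 0.6: a candidate is returned only if its similarity ratio to the query is at least 0.6. The snippet below is a minimal sketch of that behaviour. The `titles` list and the `closest` helper are illustrative and not part of the notebook.

```python
import difflib

# Hypothetical list of titles; the notebook matches against DataFrame columns.
titles = ["positions", "blinding lights", "lemonade", "holy"]

def closest(query, candidates, cutoff=0.6):
    """Return the best fuzzy match for query, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

typo = closest("blindng lihgts", titles)            # misspelled query
miss = closest("dfifduoifjodfjkfjjvkdx", titles)    # gibberish query

# The similarity ratio that get_close_matches compares against the cutoff.
ratio = difflib.SequenceMatcher(None, "blindng lihgts", "blinding lights").ratio()

print(typo)  # blinding lights
print(miss)  # None
```

Raising `cutoff` makes the search stricter, and lowering it accepts looser matches at the cost of more false positives.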

## Importing libraries

In [1]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import difflib
import os.path, time
import datetime
import pickle
from my_credentials import *

# Data Importing

## Checking Billboard Hot 100 Last Update

In [2]:
# Date of the last update of the local chart file
last_update = time.strftime('%m/%d/%Y', time.gmtime(os.path.getmtime('../Day 2/top100.csv')))
# Number of days since that update
days = (datetime.datetime.now() - datetime.datetime.strptime(last_update, '%m/%d/%Y')).days

# If the file is more than a week old, scrape the current chart
# (note: the scraped chart is not written back to top100.csv here)
if days > 7:
    from bs4 import BeautifulSoup
    import requests
    url = "https://www.billboard.com/charts/hot-100"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    song = [i.get_text() for i in soup.select(".chart-element__information__song")]
    artist = [i.get_text() for i in soup.select(".chart-element__information__artist")]
    top100 = pd.DataFrame({'song': song, 'artist': artist})
else:
    # Otherwise load the saved Billboard dataset
    top100 = pd.read_csv('../Day 2/top100.csv')
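
The freshness check above can also be written as a small helper that compares the file's modification time with the current time directly, without going through a formatted date string. This is only a sketch: `file_age_days`, `needs_refresh`, and the `max_age_days` parameter are illustrative names, and treating a missing file as stale is an assumption the notebook itself does not make.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400

def file_age_days(path, now=None):
    """Whole days elapsed since path was last modified."""
    now = time.time() if now is None else now
    return int((now - os.path.getmtime(path)) // SECONDS_PER_DAY)

def needs_refresh(path, max_age_days=7, now=None):
    """True if the file is missing or older than max_age_days."""
    if not os.path.exists(path):
        return True
    return file_age_days(path, now) > max_age_days

# Demonstration on a throwaway file instead of the real top100.csv.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    demo_path = tmp.name

fresh = needs_refresh(demo_path)                                   # just created
stale = needs_refresh(demo_path, now=time.time() + 8 * SECONDS_PER_DAY)  # simulated clock
os.remove(demo_path)
missing = needs_refresh(demo_path)                                 # file no longer exists
```

Passing `now` explicitly keeps the helper easy to test, since the clock can be simulated instead of waiting a week.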


In [3]:
top100

| | song | artist |
|---|---|---|
| 0 | mood | 24kgoldn featuring iann dior |
| 1 | therefore i am | billie eilish |
| 2 | positions | ariana grande |
| 3 | i hope | gabby barrett featuring charlie puth |
| 4 | laugh now cry later | drake featuring lil durk |
| ... | ... | ... |
| 95 | tap in | saweetie |
| 96 | rockstar chainz | future |
| 97 | kacey talk | youngboy never broke again |
| 98 | practice | dababy |


## Importing Spotify Playlist

In [4]:
spotify = pd.read_csv('spotify_clusterd.csv')

In [5]:
## Make sure that all names are strings
spotify[['artist','song']]=spotify[['artist','song']].astype(str)

In [6]:
spotify

| | song | artist | cluster |
|---|---|---|---|
| 0 | Take Me To Church | Hozier | 3 |
| 1 | Cooler Than Me - Single Mix | Mike Posner, Gigamesh | 0 |
| 2 | See You Again (feat. Kali Uchis) | Tyler, The Creator, Kali Uchis | 3 |
| 3 | Pompeii | Bastille | 1 |
| 4 | Hips Don't Lie (feat. Wyclef Jean) | Shakira, Wyclef Jean | 0 |
| ... | ... | ... | ... |
| 4890 | Prisoner (feat. Dua Lipa) | Miley Cyrus, Dua Lipa | 0 |
| 4891 | Therefore I Am | Billie Eilish | 0 |
| 4892 | Dakiti | Bad Bunny, Jhay Cortez | 3 |
| 4893 | Levitating (feat. DaBaby) | Dua Lipa, DaBaby | 0 |


![title](Project.JPG)
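
The notebook loads `scaler.pkl` and `kmeans.pkl` but does not show how they were produced. The sketch below shows one plausible way, assuming the audio features were standardized with `StandardScaler` and then clustered with `KMeans` using 5 clusters, as the overview states. The feature matrix is random stand-in data, and the column count of 11, `n_init`, and `random_state` are assumptions made for illustration only.

```python
import pickle

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Stand-in for the audio-feature matrix (rows = tracks, columns = features).
# Random values, used only to illustrate the shapes and the workflow.
rng = np.random.default_rng(0)
features = rng.random((200, 11))

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# Group the tracks into 5 clusters.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = kmeans.fit_predict(scaled)

# Round-trip through pickle; the notebook does the same via files on disk.
scaler_restored = pickle.loads(pickle.dumps(scaler))
kmeans_restored = pickle.loads(pickle.dumps(kmeans))

# Assign a new track to a cluster with the same scale-then-predict steps.
new_track = rng.random((1, 11))
cluster = int(kmeans_restored.predict(scaler_restored.transform(new_track))[0])
```

A new track has to be transformed with the same fitted scaler before prediction, which is why both objects are pickled together with the clustered dataset.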

# The Main Function

In [7]:
def music(s):
    """Print song recommendations for a song title or an artist name."""
    s = str(s)
    # The query matches nothing in the Top 100
    if len(difflib.get_close_matches(s, top100.song)) == 0 and len(difflib.get_close_matches(s, top100.artist)) == 0:
        track = difflib.get_close_matches(s, spotify.song)
        singer = difflib.get_close_matches(s, spotify.artist)
        # Not in the Spotify playlist either: search Spotify online
        if len(track) == 0 and len(singer) == 0:
            sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=client_id,
                                                                       client_secret=client_secret))
            results = sp.search(q=s, limit=1)
            # Nothing found online: show the first ten songs of the Top 100
            if len(results["tracks"]["items"]) == 0:
                print("\n \U0001F6D1 Sorry, there is nothing that matches your search. Try searching by artist name.\U0001F6D1 \n")
                print("\nYou can listen to our top songs this week: \U0001F4AF \n")
                rec = top100.head(10)
                for i in range(10):
                    print("{}- \U0001F51D '{}' \U0001F3B5 by {} \U0001F3A4 \n".format(i+1, rec.iloc[i,0], rec.iloc[i,1]))
            # Found online: recommend songs from the predicted cluster
            else:
                # Audio features from the Spotify API
                search = pd.DataFrame(sp.audio_features(results["tracks"]["items"][0]["uri"]))
                search.drop(columns=['type','id','uri','track_href','duration_ms','time_signature','analysis_url'],
                            inplace=True)
                # Scale with the pickled scaler
                with open('scaler.pkl', 'rb') as file:
                    scaler = pickle.load(file)
                search = scaler.transform(search)
                # Predict the cluster with the pickled KMeans model
                with open('kmeans.pkl', 'rb') as file:
                    kmeans = pickle.load(file)
                clus = kmeans.predict(search)[0]  # required cluster
                # Recommended songs
                print("\nRecommended Songs: \U0001F4F6 \n")
                rec = spotify[spotify['cluster'] == clus].sample(n=5)
                for i in range(5):
                    print("{}- \U0001F3A7 '{}' \U0001F3B5 by {} \U0001F3A4 ".format(i+1, rec.iloc[i,0], rec.iloc[i,1]))
        # In the Spotify playlist: recommend songs from the same cluster
        else:
            # Matched by song
            if len(track) != 0:
                track = track[0]
                clus = spotify[spotify['song'] == track]['cluster'].iloc[0]
            # Matched by artist
            else:
                singer = singer[0]
                clus = spotify[spotify['artist'] == singer]['cluster'].iloc[0]
            print("\n Recommended Songs: \U0001F197 \n")
            rec = spotify[spotify['cluster'] == clus].sample(n=5)
            for i in range(5):
                print("{}- \U0001F3A7 '{}' \U0001F3B5 by {} \U0001F3A4 ".format(i+1, rec.iloc[i,0], rec.iloc[i,1]))
    # In the Top 100: recommend songs from the chart
    else:
        rec = top100.sample(n=5)
        print("\nRecommended Songs from TOP100: \U0001F4AF \n")
        for i in range(5):
            print("{}- \U0001F51D '{}' \U0001F3B5 by {} \U0001F3A4 ".format(i+1, rec.iloc[i,0], rec.iloc[i,1]))
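
Two edge cases in `music` are worth keeping in mind. `difflib` comparisons are case-sensitive, so a query such as "Mood" scores lower against the lower-case chart titles than "mood" does. Also, `sample(n=5)` raises an error when a cluster holds fewer than five songs. The standalone sketch below shows one way to handle both, using plain Python lists in place of the DataFrames. `CATALOGUE` holds a few rows from the Spotify table above, and `find_cluster` and `recommend` are illustrative names rather than part of the notebook.

```python
import difflib
import random

# A few (song, artist, cluster) rows taken from the Spotify table above.
CATALOGUE = [
    ("Take Me To Church", "Hozier", 3),
    ("Pompeii", "Bastille", 1),
    ("Dakiti", "Bad Bunny, Jhay Cortez", 3),
    ("Therefore I Am", "Billie Eilish", 0),
]

def find_cluster(query, catalogue):
    """Cluster of the closest song title, else of the closest artist, else None."""
    q = query.lower()
    for column in (0, 1):  # 0 = song, 1 = artist
        # Compare in lower case so capitalization does not affect the score.
        lookup = {row[column].lower(): row[2] for row in catalogue}
        match = difflib.get_close_matches(q, list(lookup), n=1)
        if match:
            return lookup[match[0]]
    return None

def recommend(query, catalogue, n=5, rng=random):
    """Pick up to n (song, artist) pairs from the matched cluster."""
    cluster = find_cluster(query, catalogue)
    if cluster is None:
        return []
    pool = [(song, artist) for song, artist, c in catalogue if c == cluster]
    # Never ask for more items than the cluster contains.
    return rng.sample(pool, min(n, len(pool)))

picks = recommend("take me to chruch", CATALOGUE)   # typo in the title
by_artist = find_cluster("BASTILLE", CATALOGUE)     # upper-case artist name
nothing = recommend("dfifduoifjodfjkfjjvkdx", CATALOGUE)
```

With pandas the same guard could be written as `sample(n=min(5, len(subset)))` on the filtered DataFrame.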

## You can search using this cell

In [13]:
mysong=music(input("Enter a Song or an Artist Name: "))

Enter a Song or an Artist Name: dfifduoifjodfjkfjjvkdx

 🛑 Sorry, there is nothing that matches your search. Try searching by artist name.🛑 


You can listen to our top songs this week: 💯 

1- 🔝 'mood' 🎵 by 24kgoldn featuring iann dior 🎤 

2- 🔝 'therefore i am' 🎵 by billie eilish 🎤 

3- 🔝 'positions' 🎵 by ariana grande 🎤 

4- 🔝 'i hope' 🎵 by gabby barrett featuring charlie puth 🎤 

5- 🔝 'laugh now cry later' 🎵 by drake featuring lil durk 🎤 

6- 🔝 'holy' 🎵 by justin bieber featuring chance the rapper 🎤 

7- 🔝 'blinding lights' 🎵 by the weeknd 🎤 

8- 🔝 'lemonade' 🎵 by internet money & gunna featuring don toliver & nav 🎤 

9- 🔝 'dakiti' 🎵 by bad bunny & jhay cortez 🎤 

10- 🔝 'for the night' 🎵 by pop smoke featuring lil baby & dababy 🎤 

