# **Database Creation for the Spotify Project**  

## **Objective**  
This notebook describes the process used to create a dedicated database for this project.  

## **Database Overview**  
The database is built using **SQLite** and is designed to store historical listening data and curated playlists for a recommendation algorithm.  

### **Database Details**  
- **Name:** `spotify_project`  
- **Tables:**  
  - `tracks_history`: Stores every track I have listened to. This table will be used to adapt the recommendation algorithm to my listening history and preferences.  
  - **Curated Playlists for Recommendations:**  
    - `playlist_rap_us`  
    - `playlist_house`  
    - `playlist_drill`  
    - `playlist_annees_80`  
    - `playlist_afrobeat`  

These playlists will serve as a reference to improve the recommendation algorithm and tailor suggestions to my musical tastes.
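
As a quick check of the layout described above, the sketch below lists the tables present in a SQLite database by querying `sqlite_master`. The helper name `list_tables` is hypothetical, and the demo runs on an in-memory database with single-column placeholder tables, not on the real `spotify_project.db`.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Demo on an in-memory database; the one-column schema is only a placeholder.
demo_conn = sqlite3.connect(":memory:")
for table in ("tracks_history", "playlist_drill", "playlist_house"):
    demo_conn.execute(f"CREATE TABLE {table} (track_id TEXT)")

demo_tables = list_tables(demo_conn)
print(demo_tables)
demo_conn.close()
```

Pointing `sqlite3.connect` at the project's database file instead of `":memory:"` should list `tracks_history` and the five playlist tables once they have been created.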

In [57]:
# Import libraries and define global variables
import os
import spotify
import sqlite3
import pandas as pd
%load_ext autoreload
%autoreload 2

DB_name = 'spotify_project.db'

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [41]:
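# View the contents of a specific table
def view_table(table_name='tracks_history', db_name=DB_name):
    conn = sqlite3.connect(db_name)
    try:
        # table_name is interpolated into the query, so pass only trusted names
        return pd.read_sql(f"SELECT * FROM {table_name}", conn)
    finally:
        # Close the connection even if the query fails
        conn.close()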
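
Because `view_table` builds its query with an f-string, a mistyped or untrusted table name goes straight into the SQL. One possible safeguard, sketched here with the standard `sqlite3` module only, is to confirm that the name matches an existing table before interpolating it. The helpers `table_exists` and `safe_select_all` are hypothetical and not part of the notebook.

```python
import sqlite3


def table_exists(conn, table_name):
    """Look the name up in sqlite_master using a bound parameter."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def safe_select_all(conn, table_name):
    """Return all rows of table_name, refusing names that are not existing tables."""
    if not table_exists(conn, table_name):
        raise ValueError(f"Unknown table: {table_name!r}")
    # Identifiers cannot be bound as parameters, so quote the validated name.
    quoted = table_name.replace('"', '""')
    return conn.execute(f'SELECT * FROM "{quoted}"').fetchall()


# Demo on an in-memory database with placeholder data.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE playlist_drill (track_id TEXT, track_name TEXT)")
demo.execute("INSERT INTO playlist_drill VALUES ('id_1', 'Example track')")

rows = safe_select_all(demo, "playlist_drill")
try:
    safe_select_all(demo, "no_such_table")
    rejected = False
except ValueError:
    rejected = True

print(rows, rejected)
demo.close()
```

The same existence check could be placed in front of the `pd.read_sql` call in `view_table` if the table name ever comes from user input.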

In [50]:
view_table(table_name='playlist_drill')

Unnamed: 0,album_id,album_name,release_date,album_artists_id,album_artists_name,duration,track_id,track_name,popularity,track_artists_id,track_artists_name,track_listeners,track_playcount,similar_artists,track_tags,image_url,spotify_url
0,208okgvV6yEx0Aq9t8zT4p,Polémique,2021-05-15 00:00:00,7GWvguZN6WpXDVaMDjg97v,pauldscrga,176.427,5nLe4rNMf6B5v9sJ6gGCQO,Polémique,0,"7GWvguZN6WpXDVaMDjg97v, 5OVwoZ9WDfQiyUkblkOGLe","pauldscrga, Freeze Cee",1519,11302,"667, Adekhey, XIII B, Mini RTTCLAN, Swey Zen",,https://i.scdn.co/image/ab67616d00001e02e8af6a...,https://open.spotify.com/track/5nLe4rNMf6B5v9s...
1,1hRmY5K4OzG8Wdxmn6YNlg,"669, Pt. 2",2021-05-16 00:00:00,4FjcWWBsbgD3TAEf2jQuVv,667,478.693,55rPRY6upmsnuCdJTW7UQt,"669, Pt. 2",37,"4FjcWWBsbgD3TAEf2jQuVv, 43mlbNLGRuLXwqTE8G61JB","667, Lyonzon",7704,88497,"EniMa, Afro S, Sazamyzy, DOC OVG, Mini RTTCLAN","Hip-Hop, french, hip hop, seen live, metal, ro...",https://i.scdn.co/image/ab67616d00001e02dcc5c7...,https://open.spotify.com/track/55rPRY6upmsnuCd...
2,0FEXxdB1W49qvYwKL0wnJh,669,2018-09-20 00:00:00,43mlbNLGRuLXwqTE8G61JB,Lyonzon,305.600,3nSL8n2FdvP3nLycf3Ip93,669,25,43mlbNLGRuLXwqTE8G61JB,Lyonzon,3132,25189,,"Hip-Hop, french, hip hop, trap, rap",https://i.scdn.co/image/ab67616d00001e02f927ab...,https://open.spotify.com/track/3nSL8n2FdvP3nLy...
3,34cYXBHmeJ49Tr4Y3plD9H,"Ashe Tape, Vol. 3",2021-04-09 00:00:00,3tTvSeZiFDP3CY5EdPGcR4,ASHE 22,227.040,02uSh14aUckmfTVj5MEhmg,Scellé Part. 3,0,"3tTvSeZiFDP3CY5EdPGcR4, 76Pl0epAMXVXJspaSuz8im","ASHE 22, Freeze corleone",6829,62105,"Freeze Corleone, Kaaris, Osirus Jack, Alpha Wa...","Hip-Hop, french, lyonzon, drill, france, rap",https://i.scdn.co/image/ab67616d00001e02ed1912...,https://open.spotify.com/track/02uSh14aUckmfTV...
4,7zrUnXByeOKHbXY3fyn6vz,LMF,2020-09-11 00:00:00,76Pl0epAMXVXJspaSuz8im,Freeze corleone,170.106,5IkofYa6Ac1plKIf6nYkDE,Scellé part.2,45,"76Pl0epAMXVXJspaSuz8im, 3tTvSeZiFDP3CY5EdPGcR4","Freeze corleone, ASHE 22",21604,273035,"Freeze Corleone, Kaaris, Osirus Jack, Alpha Wa...","Hip-Hop, french, lyonzon, drill, france, rap",https://i.scdn.co/image/ab67616d00001e0204a2bc...,https://open.spotify.com/track/5IkofYa6Ac1plKI...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,4I8edkU7TNJmSv4cpWjfSC,140 BPM 2,2021-02-05 00:00:00,5gs4Sm2WQUkcGeikMcVHbh,Hamza,253.906,1Ujvp85FTkDkCdJPPM21wI,Spaghetti (feat. Gazo & Guy2Bezbar),0,"5gs4Sm2WQUkcGeikMcVHbh, 5gqmbbfjcikQBzPB5Hv13I...","Hamza, Gazo, Guy2Bezbar",,,"Ninho, Tiakola, Damso, Leto, RSKO, Josman, Nis...","Hip-Hop, french, junk, belgian, House, Belgium...",https://i.scdn.co/image/ab67616d00001e0267c883...,https://open.spotify.com/track/1Ujvp85FTkDkCdJ...
96,31H0KWvM2hCC3p8jkctCWN,DRILL FR,2021-02-26 00:00:00,5gqmbbfjcikQBzPB5Hv13I,Gazo,205.253,10nCB4Iq9BI3aW2iEldiR9,A$AP,45,5gqmbbfjcikQBzPB5Hv13I,Gazo,9705,92491,"Tiakola, Leto, Niska, Koba LaD, Ninho","Hip-Hop, french, drill, french rap, france",https://i.scdn.co/image/ab67616d00001e02f8e112...,https://open.spotify.com/track/10nCB4Iq9BI3aW2...
97,7IRchhaQTK6BFQAPkcPcHm,1PLIKÉ140 x Fumez The Engineer - Plugged In Fr...,2020-10-15 00:00:00,"0ksX396B3t2Gt8kwr0BJZk, 4Ue6MAZqz18NlaOQomRXLU","Fumez The Engineer, 1PLIKÉ140",177.103,1eQXQNNSYGInObudf5yXbg,1PLIKÉ140 x Fumez The Engineer - Plugged In Fr...,34,"0ksX396B3t2Gt8kwr0BJZk, 4Ue6MAZqz18NlaOQomRXLU","Fumez The Engineer, 1PLIKÉ140",3360,21173,"Beendo Z, Niska, menace Santana, Gazo, Ziak","Hip-Hop, hip hop, numeric, drill, UK Drill, nu...",https://i.scdn.co/image/ab67616d00001e02892bb9...,https://open.spotify.com/track/1eQXQNNSYGInObu...
98,1tTWT8dn8Iyh3BBVsltJBg,1PLIKTOI BIEN,2020-05-28 00:00:00,4Ue6MAZqz18NlaOQomRXLU,1PLIKÉ140,140.434,7m1wsat0RhxlxviJhQ3sfJ,1PLIKTOI BIEN,49,4Ue6MAZqz18NlaOQomRXLU,1PLIKÉ140,7048,63764,"Beendo Z, Niska, menace Santana, Gazo, Ziak","numeric, numbers, french rap, 92140",https://i.scdn.co/image/ab67616d00001e02076777...,https://open.spotify.com/track/7m1wsat0Rhxlxvi...


In [42]:
# Export a specific table to a CSV file
def load_tracks_history_csv(table_name='tracks_history', db_name=DB_name):
    csv_path = 'track_history.csv'

    conn = sqlite3.connect(db_name)
    try:
        tracks_history_df = pd.read_sql(f"SELECT * FROM {table_name}", conn)
    finally:
        conn.close()

    # Check for an existing file before writing so the message is accurate
    already_exists = os.path.exists(csv_path)
    tracks_history_df.to_csv(csv_path, index=False)

    action = "replaced" if already_exists else "created"
    print(f"File '{csv_path}' has been {action} successfully")

In [29]:
load_tracks_history_csv()

File 'track_history.csv' has been created successfully


In [43]:
# Define the Spotify playlist IDs
playlist_drill = '1z1tOO60TXJaLEfXb5Z1pw'
playlist_house = '1vIMNWoiysQgw4q13PErN4'
playlist_rap_us = '4OZ02mQrmS1LU8bkG09vq7'
playlist_afrobeat = '25Y75ozl2aI0NylFToefO5'
playlist_annees_80 = '0slE73JFtRr3F2KnfoWlbO'

playlist_ids = {'playlist_drill':playlist_drill, 'playlist_house':playlist_house, 'playlist_rap_us':playlist_rap_us,
                'playlist_afrobeat':playlist_afrobeat, 'playlist_annees_80':playlist_annees_80}

In [44]:
def create_playlist_tables(playlist_ids, DB_name="spotify_project.db"):
    conn = sqlite3.connect(DB_name)

    for playlist_name, playlist_id in playlist_ids.items():
        # Fetch the tracks of the playlist
        new_tracks_df = spotify.get_playlist_tracks(playlist_id)

        # Check whether the table already exists in the database
        query = f"SELECT name FROM sqlite_master WHERE type='table' AND name='{playlist_name}';"
        table_exists = pd.read_sql(query, conn)

        if table_exists.empty:
            # If the table does not exist, create it and add all the tracks
            new_tracks_df.to_sql(playlist_name, conn, if_exists="replace", index=False)
            print(f"Table {playlist_name} created and {len(new_tracks_df)} tracks added.")
        else:
            # If the table exists, fetch the track IDs already stored
            existing_tracks_query = f"SELECT track_id FROM {playlist_name}"
            try:
                existing_tracks_df = pd.read_sql(existing_tracks_query, conn)
            except Exception:
                # Treat an unreadable table as having no stored tracks
                existing_tracks_df = None

            # Keep only the new tracks that are not already in the database
            if existing_tracks_df is not None and not existing_tracks_df.empty:
                new_tracks_df = new_tracks_df[~new_tracks_df['track_id'].isin(existing_tracks_df['track_id'])]

            # Append only the new tracks
            if not new_tracks_df.empty:
                new_tracks_df.to_sql(playlist_name, conn, if_exists="append", index=False)
                print(f"{len(new_tracks_df)} new tracks added to {playlist_name}.")
            else:
                print(f"No new tracks to add for {playlist_name}.")

    conn.close()

In [45]:
create_playlist_tables(playlist_ids)

No new tracks to add for playlist_drill.
No new tracks to add for playlist_house.
No new tracks to add for playlist_rap_us.
No new tracks to add for playlist_afrobeat.
No new tracks to add for playlist_annees_80.
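
`create_playlist_tables` avoids duplicates by reading the stored `track_id` values and filtering the new DataFrame in pandas. An alternative, shown here only as a sketch and not as the notebook's method, is to let SQLite enforce uniqueness with a `PRIMARY KEY` on `track_id` and `INSERT OR IGNORE`. The helper `append_new_tracks`, the two-column schema, and the sample IDs are all assumptions made for illustration.

```python
import sqlite3


def append_new_tracks(conn, table_name, tracks):
    """Insert (track_id, track_name) pairs, skipping IDs already stored.

    Returns the number of rows actually added.
    """
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table_name}" '
        "(track_id TEXT PRIMARY KEY, track_name TEXT)"
    )
    before = conn.total_changes
    # Rows whose track_id already exists are silently skipped by OR IGNORE.
    conn.executemany(
        f'INSERT OR IGNORE INTO "{table_name}" (track_id, track_name) VALUES (?, ?)',
        tracks,
    )
    conn.commit()
    return conn.total_changes - before


# Demo on an in-memory database with made-up track IDs.
conn = sqlite3.connect(":memory:")
first_run = append_new_tracks(conn, "playlist_demo", [("a", "Track A"), ("b", "Track B")])
second_run = append_new_tracks(conn, "playlist_demo", [("b", "Track B"), ("c", "Track C")])
stored = conn.execute("SELECT COUNT(*) FROM playlist_demo").fetchone()[0]
print(first_run, second_run, stored)
conn.close()
```

The trade-off is that the table schema has to be declared up front, whereas `DataFrame.to_sql` infers it from the data.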
