# Introduction

In this notebook we explore the table "Spotify-Top-Songs" in the "Spotify-Songs" database that we created for the project. We also document the development of the project.

## Description of the project

For this project, we're going to build a Power BI dashboard that shows the most popular songs on Spotify in each country. To do this, we first upload the dataset to a PostgreSQL database. We then use this notebook to preview the data, detect any issues, and clean the table. Once the data is clean, we connect the database to Power BI to create the dashboard.
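As a rough illustration of the "detect any issues" step, the sketch below counts missing values per column and rows that repeat the same key. The helper name `summarize_issues` and the choice of (`spotify_id`, `country`, `snapshot_date`) as the key are assumptions made for this example, and the sample rows are made up rather than taken from the table.

```python
import pandas as pd


def summarize_issues(frame, key_columns):
    """Count missing values per column and rows that repeat the same key."""
    null_counts = frame.isna().sum()
    return {
        "rows": len(frame),
        # Keep only the columns that actually contain missing values.
        "nulls": {col: int(n) for col, n in null_counts.items() if n > 0},
        # Rows whose key already appeared earlier in the frame.
        "duplicate_keys": int(frame.duplicated(subset=key_columns).sum()),
    }


# Tiny made-up sample, only to show the shape of the report.
sample = pd.DataFrame({
    "spotify_id": ["a1", "a1", "b2"],
    "country": ["SV", "SV", None],
    "snapshot_date": ["2023-12-16", "2023-12-16", "2023-12-16"],
})

# Assumed key: one row per song, country, and snapshot date.
issue_report = summarize_issues(sample, ["spotify_id", "country", "snapshot_date"])
print(issue_report)
```

In the notebook, the same helper could be called on the DataFrame loaded from the database instead of the sample.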

## Data Description

The provided dataset has the following columns:

* `spotify_id`: Unique identifier for the song in Spotify
* `name`: The title of the song
* `artists`: Name(s) of the artist(s) associated with the song
* `daily_rank`: Daily rank of the song in the top 50 list (1 to 50)
* `daily_movement`: Change in rankings compared to the previous day (-49 to 50)
* `weekly_movement`: Change in rankings compared to the previous week (-49 to 50) 
* `country`: ISO code of the country of the Top 50 playlist
* `snapshot_date`: Date on which the data was collected from Spotify 
* `popularity`: Measure of the song's current popularity on Spotify (0 to 100)
* `is_explicit`: Whether the song contains explicit lyrics
* `duration_ms`: The duration of the song in milliseconds 
* `album_name`: Name of the album of the song
* `album_release_date`: Date of the release of the album of the song
* `danceability`
* `energy`
* `key`
* `loudness`
* `mode`
* `speechiness`
* `acousticness`
* `instrumentalness`
* `liveness`
* `valence`
* `tempo`
* `time_signature`
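Some of the value ranges documented above can be turned into a quick sanity check. The sketch below is one possible way to do it: the helper `count_out_of_range` is a hypothetical name, the bounds are copied from the list for three of the columns, and the sample values are invented to trigger the check.

```python
import pandas as pd

# Inclusive bounds copied from the column list above.
DOCUMENTED_RANGES = {
    "daily_movement": (-49, 50),
    "weekly_movement": (-49, 50),
    "popularity": (0, 100),
}


def count_out_of_range(frame, ranges=DOCUMENTED_RANGES):
    """Return {column: number of non-null values outside the inclusive bounds}."""
    counts = {}
    for column, (low, high) in ranges.items():
        if column not in frame.columns:
            continue  # Skip columns that are absent from this frame.
        values = frame[column].dropna()
        counts[column] = int((~values.between(low, high)).sum())
    return counts


# Invented values: one movement and one popularity score fall outside the bounds.
sample = pd.DataFrame({
    "daily_movement": [-5, 1, 60],
    "weekly_movement": [-1, -10, 1],
    "popularity": [75, 89, 101],
})
range_report = count_out_of_range(sample)
print(range_report)
```

A non-zero count does not necessarily mean the data is wrong; it may also mean the documented range needs to be revised.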

## Procedure



In [1]:
# Importing libraries
import pandas as pd
from sqlalchemy import create_engine

In [3]:
# Connection to the database
db_config = {'user': 'postgres',
                'pwd': '0123456789',
                'host': 'localhost',
                'port': '5432',
                'db': 'Spotify-Songs'}

connection_string = 'postgresql://{}:{}@{}:{}/{}'.format(db_config['user'], db_config['pwd'], db_config['host'], db_config['port'], db_config['db'])

engine = create_engine(connection_string)
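
Formatting the connection string by hand can break when the password contains characters such as `@` or `/`. As an alternative sketch, SQLAlchemy's `URL.create` builds the URL from its parts and escapes them when rendering. The environment variable name `SPOTIFY_DB_PASSWORD` and the placeholder password are assumptions made for this example; they are not part of the project.

```python
import os

from sqlalchemy.engine import URL

# Hypothetical variable name; the placeholder default is only for illustration.
password = os.environ.get("SPOTIFY_DB_PASSWORD", "p@ss/word")

connection_url = URL.create(
    drivername="postgresql",
    username="postgres",
    password=password,
    host="localhost",
    port=5432,
    database="Spotify-Songs",
)

# Render with the password visible to show how special characters are escaped.
rendered = connection_url.render_as_string(hide_password=False)
print(rendered)

# The URL object can be passed to create_engine in place of a string:
# engine = create_engine(connection_url)
```

Reading the password from the environment also keeps it out of the notebook itself.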

In [5]:
# Query
query = """SELECT *
FROM "Spotify-Top-Songs"
LIMIT 10;
"""

df = pd.read_sql(query, con=engine)

df.head(5)

Unnamed: 0,spotify_id,name,artists,daily_rank,daily_movement,weekly_movement,country,snapshot_date,popularity,is_explicit,...,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
0,2ctjDCCg1wHoQSjIJ8p6U4,Candy,Plan B,47,-5,-1,SV,2023-12-16,75,False,...,11,-3.735,0,0.147,0.252,0.0,0.517,0.542,95.981,4
1,0DWdj2oZMBFSzRsi2Cvfzf,TQG,"KAROL G, Shakira",48,1,-10,SV,2023-12-16,89,True,...,4,-3.547,0,0.277,0.673,0.0,0.0936,0.607,179.974,4
2,7JbMsR4rZh6J77LNafur8U,¿por Que Te Demoras?,Plan B,49,1,1,SV,2023-12-16,63,False,...,2,-5.923,0,0.0769,0.0386,0.0,0.0547,0.941,96.018,4
3,69Ej1xrGjOcHvIMtMKxK0G,Dile,Don Omar,50,-5,-5,SV,2023-12-16,84,False,...,4,-7.501,0,0.141,0.184,0.000132,0.042,0.714,94.001,4
4,06qMRF18gwbOYYbnP2du6i,Last Christmas - Single Version,Wham!,1,0,0,SK,2023-12-16,88,False,...,2,-8.228,1,0.0278,0.212,4e-06,0.156,0.935,107.732,4
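
The preview above elides the middle columns (shown as `...`). One way to see every column is to lift pandas' column limit for a single display, sketched here with a made-up 25-column frame standing in for the query result.

```python
import pandas as pd

# Stand-in frame with 25 columns, roughly as wide as the query result.
wide = pd.DataFrame([range(25)], columns=[f"col_{i}" for i in range(25)])

# Temporarily remove the column limit; the previous setting is restored on exit.
with pd.option_context("display.max_columns", None):
    full_preview = repr(wide)

print(full_preview)
```

In the notebook, calling `display(df.head(5))` inside the same `with` block should render the full table, and `df.info()` is another option for listing every column together with its dtype.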
