# 📊 Music Data Analysis - Spotify & YouTube

## 🎯 Project Context

**You have been hired by Mateus Music** as a Data Scientist to analyze the company's music catalog. Your mission is to explore a dataset of **more than 20,000 songs** and extract insights about how artists perform on Spotify and YouTube.

The company needs to understand:
- Who the most relevant artists are
- Which songs perform best
- How to optimize investments in digital marketing

## 📁 About the Dataset

### Key Characteristics:
- **20,000+ songs** in the full catalog
- **Spotify stream** counts
- **YouTube view** counts
- **YouTube video links** for each song
- **Complete metadata** for artists and tracks

### 1. Import Pandas and show the general information of the DataFrame

In [1]:
import pandas as pd

In [2]:
df = pd.read_parquet('Dados_Artistas.parquet')
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20718 entries, 0 to 20717
Data columns (total 27 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Artist            20718 non-null  object 
 1   Url_spotify       20718 non-null  object 
 2   Track             20718 non-null  object 
 3   Album             20718 non-null  object 
 4   Album_type        20718 non-null  object 
 5   Uri               20718 non-null  object 
 6   Danceability      20716 non-null  float64
 7   Energy            20716 non-null  float64
 8   Key               20716 non-null  float64
 9   Loudness          20716 non-null  float64
 10  Speechiness       20716 non-null  float64
 11  Acousticness      20716 non-null  float64
 12  Instrumentalness  20716 non-null  float64
 13  Liveness          20716 non-null  float64
 14  Valence           20716 non-null  float64
 15  Tempo             20716 non-null  float64
 16  Duration_ms       20716 non-null  float64
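
The `df.info()` output above shows 20716 non-null values out of 20718 rows for the audio-feature columns, which means a few rows have missing values. A minimal sketch of how those gaps could be counted per column, using a small made-up frame (the `toy` data is hypothetical and only reuses column names from the real dataset):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real catalog; the values are invented for illustration.
toy = pd.DataFrame({
    "Artist": ["A", "B", "C"],
    "Danceability": [0.8, np.nan, 0.6],
    "Energy": [0.7, 0.9, np.nan],
})

# Count missing values per column and keep only the columns that have any.
missing = toy.isna().sum()
missing = missing[missing > 0]
print(missing)
```

On the real data the same two lines would be applied to `df` instead of `toy`.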

### 2. Show the first five rows of the DataFrame

In [3]:
df.head()

| | Artist | Url_spotify | Track | Album | Album_type | Uri | Danceability | Energy | Key | Loudness | ... | Url_youtube | Title | Channel | Views | Likes | Comments | Description | Licensed | official_video | Stream |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Gorillaz | https://open.spotify.com/artist/3AA28KZvwAUcZu... | Feel Good Inc. | Demon Days | album | spotify:track:0d28khcov6AiegSCpG5TuT | 0.818 | 0.705 | 6.0 | -6.679 | ... | https://www.youtube.com/watch?v=HyHNuVaZJ-k | Gorillaz - Feel Good Inc. (Official Video) | Gorillaz | 693555221.0 | 6220896.0 | 169907.0 | Official HD Video for Gorillaz' fantastic trac... | True | True | 1040235000.0 |
| 1 | Gorillaz | https://open.spotify.com/artist/3AA28KZvwAUcZu... | Rhinestone Eyes | Plastic Beach | album | spotify:track:1foMv2HQwfQ2vntFf9HFeG | 0.676 | 0.703 | 8.0 | -5.815 | ... | https://www.youtube.com/watch?v=yYDmaexVHic | Gorillaz - Rhinestone Eyes [Storyboard Film] (... | Gorillaz | 72011645.0 | 1079128.0 | 31003.0 | The official video for Gorillaz - Rhinestone E... | True | True | 310083700.0 |
| 2 | Gorillaz | https://open.spotify.com/artist/3AA28KZvwAUcZu... | New Gold (feat. Tame Impala and Bootie Brown) | New Gold (feat. Tame Impala and Bootie Brown) | single | spotify:track:64dLd6rVqDLtkXFYrEUHIU | 0.695 | 0.923 | 1.0 | -3.93 | ... | https://www.youtube.com/watch?v=qJa-VFwPpYA | Gorillaz - New Gold ft. Tame Impala & Bootie B... | Gorillaz | 8435055.0 | 282142.0 | 7399.0 | Gorillaz - New Gold ft. Tame Impala & Bootie B... | True | True | 63063470.0 |
| 3 | Gorillaz | https://open.spotify.com/artist/3AA28KZvwAUcZu... | On Melancholy Hill | Plastic Beach | album | spotify:track:0q6LuUqGLUiCPP1cbdwFs3 | 0.689 | 0.739 | 2.0 | -5.81 | ... | https://www.youtube.com/watch?v=04mfKJWDSzI | Gorillaz - On Melancholy Hill (Official Video) | Gorillaz | 211754952.0 | 1788577.0 | 55229.0 | `Follow Gorillaz online:\nhttp://gorillaz.com \...` | True | True | 434663600.0 |
| 4 | Gorillaz | https://open.spotify.com/artist/3AA28KZvwAUcZu... | Clint Eastwood | Gorillaz | album | spotify:track:7yMiX7n9SBvadzox8T5jzT | 0.663 | 0.694 | 10.0 | -8.627 | ... | https://www.youtube.com/watch?v=1V_xRb0x9aw | Gorillaz - Clint Eastwood (Official Video) | Gorillaz | 618480958.0 | 6197318.0 | 155930.0 | The official music video for Gorillaz - Clint ... | True | True | 617259700.0 |


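### 3. Let's see which artists are in our df and count how many distinct artists the dataset has

In [4]:
# Series.nunique() counts the distinct artist names directly.
# Caution: df['Artist'].value_counts().nunique() answers a different question,
# namely how many distinct per-artist track counts exist (7 in this dataset).
df['Artist'].nunique()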
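
Counting distinct artists and counting distinct track counts are easy to confuse, because both end in `.nunique()`. The sketch below contrasts the two on a tiny made-up frame (the `toy` data and the variable names are hypothetical):

```python
import pandas as pd

# Toy data: artist A and B have two tracks each, artist C has one.
toy = pd.DataFrame({"Artist": ["A", "A", "B", "B", "C"]})

# Number of distinct artist names.
n_artists = toy["Artist"].nunique()

# Number of distinct values among the per-artist track counts (here: 2 and 1).
n_distinct_counts = toy["Artist"].value_counts().nunique()

print(n_artists, n_distinct_counts)
```

Only the first expression answers "how many different artists are there?".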

### 4. Which 10 artists have the most songs in our dataset?

In [14]:
# Check which artists have the most songs in the list
df['Artist'].value_counts().head(10)

Artist
SICK LEGEND              10
Gorillaz                 10
Red Hot Chili Peppers    10
50 Cent                  10
Metallica                10
Coldplay                 10
Daft Punk                10
Linkin Park              10
Radiohead                10
AC/DC                    10
Name: count, dtype: int64
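
Every artist in the output above has exactly 10 songs, so which ten names `head(10)` returns depends on how ties happen to be ordered. A minimal sketch, on made-up data, of how one might check how many artists share the maximum count before reading too much into such a ranking:

```python
import pandas as pd

# Toy data: A and B are tied with two tracks each, C has one.
toy = pd.DataFrame({"Artist": ["A", "A", "B", "B", "C"]})

counts = toy["Artist"].value_counts()

# Artists whose track count equals the maximum, i.e. everyone tied for first place.
tied = counts[counts == counts.max()]
print(len(tied), "artists share the top count of", counts.max())
```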

### 5. Which 5 songs have the most views on YouTube?

In [28]:
# Show the 5 songs with the most views on YouTube
top5 = df.sort_values(by='Views', ascending=False).head(5)
top5[['Track', 'Views','Artist']]



| | Track | Views | Artist |
|---|---|---|---|
| 1147 | Despacito | 8079649000.0 | Luis Fonsi |
| 365 | Despacito | 8079647000.0 | Daddy Yankee |
| 12452 | Shape of You | 5908398000.0 | Ed Sheeran |
| 14580 | See You Again (feat. Charlie Puth) | 5773798000.0 | Charlie Puth |
| 12469 | See You Again (feat. Charlie Puth) | 5773797000.0 | Wiz Khalifa |
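
The result above lists "Despacito" and "See You Again" twice, once per credited artist, so it contains only three distinct songs. One possible way to get five distinct songs is to drop repeated videos before taking the top rows. The sketch below assumes that rows for the same song share the same `Url_youtube` value, which has not been verified on the real data; the `toy` frame is made up:

```python
import pandas as pd

# Toy data: "Song X" appears twice, once per credited artist, with the same video URL.
toy = pd.DataFrame({
    "Track": ["Song X", "Song X", "Song Y", "Song Z"],
    "Artist": ["Artist 1", "Artist 2", "Artist 3", "Artist 4"],
    "Url_youtube": ["u1", "u1", "u2", "u3"],
    "Views": [900.0, 900.0, 500.0, 100.0],
})

top = (
    toy.sort_values(by="Views", ascending=False, kind="stable")
       .drop_duplicates(subset="Url_youtube")  # keep one row per video
       .head(2)
)
print(top[["Track", "Views"]])
```

If the URLs turn out to differ between the duplicated rows, deduplicating on `Track` would be an alternative, at the risk of merging different songs that share a title.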


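### 6. Which 5 artists have the most streams on Spotify?

In [29]:
# Sort individual tracks by Spotify streams and list the artists of the 5 most-streamed tracks.
# Streams are not summed per artist here, so the same artist can appear more than once.
top5 = df.sort_values(by='Stream', ascending=False).head(5)
top5[['Artist']]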

| | Artist |
|---|---|
| 15250 | The Weeknd |
| 12452 | Ed Sheeran |
| 19186 | Lewis Capaldi |
| 17937 | Post Malone |
| 17938 | Post Malone |
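
The cell above ranks single tracks, which is why Post Malone shows up twice. To rank artists by their total streams, the streams can be summed per artist first. A minimal sketch on made-up data (the `toy` frame is hypothetical; `Artist` and `Stream` are the column names from the dataset):

```python
import pandas as pd

# Toy data: artist A has two tracks, B and C have one each.
toy = pd.DataFrame({
    "Artist": ["A", "A", "B", "C"],
    "Stream": [100.0, 50.0, 120.0, 30.0],
})

# Total streams per artist, largest first.
by_artist = toy.groupby("Artist")["Stream"].sum().sort_values(ascending=False)
top_artists = by_artist.head(2)
print(top_artists)
```

Here artist A ranks first on total streams even though B has the single most-streamed track, which is exactly the difference between the two approaches.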


### 7. Convert and save the file as Parquet to upload to GitHub

In [30]:
# Save the DataFrame as a Parquet file
df.to_parquet('Dados_Artistas_V2.parquet', index=False)