# Analysis of the Impact of Music on Mental Health

This project analyzes data from a survey that investigates the relationship between music-listening habits and mental health. The focus is on exploring how different musical preferences, listening frequencies, and personal characteristics correlate with self-reported mental health conditions such as anxiety, depression, insomnia, and OCD (obsessive-compulsive disorder).

---
## Importing Libraries and Reading the Data

In [44]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import func

In [45]:
df = pd.read_csv("mxmh_survey_results.csv")

In [46]:
df.head()

Unnamed: 0,Timestamp,Age,Primary streaming service,Hours per day,While working,Instrumentalist,Composer,Fav genre,Exploratory,Foreign languages,...,Frequency [R&B],Frequency [Rap],Frequency [Rock],Frequency [Video game music],Anxiety,Depression,Insomnia,OCD,Music effects,Permissions
0,8/27/2022 19:29:02,18.0,Spotify,3.0,Yes,Yes,Yes,Latin,Yes,Yes,...,Sometimes,Very frequently,Never,Sometimes,3.0,0.0,1.0,0.0,,I understand.
1,8/27/2022 19:57:31,63.0,Pandora,1.5,Yes,No,No,Rock,Yes,No,...,Sometimes,Rarely,Very frequently,Rarely,7.0,2.0,2.0,1.0,,I understand.
2,8/27/2022 21:28:18,18.0,Spotify,4.0,No,No,No,Video game music,No,Yes,...,Never,Rarely,Rarely,Very frequently,7.0,7.0,10.0,2.0,No effect,I understand.
3,8/27/2022 21:40:40,61.0,YouTube Music,2.5,Yes,No,Yes,Jazz,Yes,Yes,...,Sometimes,Never,Never,Never,9.0,7.0,3.0,3.0,Improve,I understand.
4,8/27/2022 21:54:47,18.0,Spotify,4.0,Yes,No,No,R&B,Yes,No,...,Very frequently,Very frequently,Never,Rarely,7.0,2.0,5.0,9.0,Improve,I understand.


---
## Changing the Data Types

Several columns were loaded with the "object" type, which prevented us from running numerical analyses on them. We convert them with a function we wrote in this project's "func.py" file.

In [47]:
func.alterar_type(df)
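
The source of "func.py" is not shown here, so the following is only a sketch of what such a conversion might look like. The two mappings are inferred from comparing the tables before and after the call in this notebook (for example, "Yes" becomes 1.0 and "Very frequently" becomes 3.0); the function name `alterar_type_sketch` and its exact behavior are assumptions, not the project's actual implementation.

```python
import pandas as pd

# Mappings inferred from the before/after tables in this notebook (assumed).
YES_NO = {"No": 0.0, "Yes": 1.0}
FREQUENCY = {"Never": 0.0, "Rarely": 1.0, "Sometimes": 2.0, "Very frequently": 3.0}


def alterar_type_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Encode Yes/No and frequency answers as numbers, in place.

    Illustrative stand-in only; the real func.alterar_type may differ.
    """
    for column in df.columns:
        if pd.api.types.is_numeric_dtype(df[column]):
            continue  # already numeric, nothing to convert
        answers = set(df[column].dropna().unique())
        if answers and answers <= set(YES_NO):
            df[column] = df[column].map(YES_NO)
        elif answers and answers <= set(FREQUENCY):
            df[column] = df[column].map(FREQUENCY)
        # Any other text column (e.g. favorite genre) is left untouched.
    return df


sample = pd.DataFrame({
    "While working": ["Yes", "No", None],
    "Frequency [Rock]": ["Never", "Very frequently", "Rarely"],
    "Fav genre": ["Latin", "Rock", "Jazz"],
})
alterar_type_sketch(sample)
print(sample)
```

Missing answers stay missing after the mapping, which is consistent with the float values (rather than integers) seen in the converted table.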

---
## Translating Columns

To make the dataset easier to work with, we translate its column names into Portuguese.

In [48]:
colunas_traduzidas = [
    "data_e_hora_envio", "idade", "servico_de_streaming_principal",
    "horas_por_dia", "enquanto_trabalha",
    "instrumentalista", "compositor", "genero_fav",
    "exploratorio", "em_outros_idiomas", "bpm",
    "frequencia_musica_classica", "frequencia_country",
    "frequencia_edm", "frequencia_folk", "frequencia_gospel",
    "frequencia_hip_hop", "frequencia_jazz", "frequencia_kpop",
    "frequencia_latin", "frequencia_lofi", "frequencia_metal",
    "frequencia_pop", "frequencia_rb", "frequencia_rap",
    "frequencia_rock", "frequencia_musica_videogame",
    "ansiedade", "depressao", "insonia", "toc",
    "efeitos_na_saude_mental", "permissoes"
]

In [49]:
df.columns = colunas_traduzidas
df

Unnamed: 0,data_e_hora_envio,idade,servico_de_streaming_principal,horas_por_dia,enquanto_trabalha,instrumentalista,compositor,genero_fav,exploratorio,em_outros_idiomas,...,frequencia_rb,frequencia_rap,frequencia_rock,frequencia_musica_videogame,ansiedade,depressao,insonia,toc,efeitos_na_saude_mental,permissoes
0,8/27/2022 19:29:02,18.0,Spotify,3.0,1.0,1.0,1.0,Latin,1.0,1.0,...,2.0,3.0,0.0,2.0,3.0,0.0,1.0,0.0,,I understand.
1,8/27/2022 19:57:31,63.0,Pandora,1.5,1.0,0.0,0.0,Rock,1.0,0.0,...,2.0,1.0,3.0,1.0,7.0,2.0,2.0,1.0,,I understand.
2,8/27/2022 21:28:18,18.0,Spotify,4.0,0.0,0.0,0.0,Video game music,0.0,1.0,...,0.0,1.0,1.0,3.0,7.0,7.0,10.0,2.0,No effect,I understand.
3,8/27/2022 21:40:40,61.0,YouTube Music,2.5,1.0,0.0,1.0,Jazz,1.0,1.0,...,2.0,0.0,0.0,0.0,9.0,7.0,3.0,3.0,Improve,I understand.
4,8/27/2022 21:54:47,18.0,Spotify,4.0,1.0,0.0,0.0,R&B,1.0,0.0,...,3.0,3.0,0.0,1.0,7.0,2.0,5.0,9.0,Improve,I understand.
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
731,10/30/2022 14:37:28,17.0,Spotify,2.0,1.0,1.0,0.0,Rock,1.0,1.0,...,0.0,1.0,3.0,0.0,7.0,6.0,0.0,9.0,Improve,I understand.
732,11/1/2022 22:26:42,18.0,Spotify,1.0,1.0,1.0,0.0,Pop,1.0,1.0,...,0.0,0.0,2.0,2.0,3.0,2.0,2.0,5.0,Improve,I understand.
733,11/3/2022 23:24:38,19.0,Other streaming service,6.0,1.0,0.0,1.0,Rap,1.0,0.0,...,2.0,2.0,1.0,1.0,2.0,2.0,2.0,2.0,Improve,I understand.
734,11/4/2022 17:31:47,19.0,Spotify,5.0,1.0,1.0,0.0,Classical,0.0,0.0,...,0.0,0.0,0.0,2.0,2.0,3.0,2.0,1.0,Improve,I understand.
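
Assigning a list to `df.columns` is positional: pandas checks that the list has the right length, but not that each name lands on the column it was meant for. One possible alternative, sketched below on a small made-up frame, is an explicit mapping passed to `DataFrame.rename`, so every translated name is tied to its original. The `traducao` dictionary and the sample data are illustrative assumptions.

```python
import pandas as pd

# Tiny stand-in for the survey data (illustrative values only).
exemplo = pd.DataFrame({
    "Age": [18.0, 63.0],
    "Hours per day": [3.0, 1.5],
    "Fav genre": ["Latin", "Rock"],
})

# Explicit original -> translated mapping (only a few columns shown).
traducao = {
    "Age": "idade",
    "Hours per day": "horas_por_dia",
    "Fav genre": "genero_fav",
}

# Fail early if the mapping mentions a column that does not exist.
desconhecidas = set(traducao) - set(exemplo.columns)
if desconhecidas:
    raise KeyError(f"columns not found: {sorted(desconhecidas)}")

renomeado = exemplo.rename(columns=traducao)
print(renomeado.columns.tolist())
```

The trade-off is verbosity: the mapping repeats every original name, but a reordered or newly added column in the CSV can no longer shift the labels silently.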


---
## Selecting Only Numeric Columns

In [50]:
lista_colunas_numericas = df.select_dtypes(include=['number']).columns.tolist()
print("List of numeric columns")
print(lista_colunas_numericas)

List of numeric columns
['idade', 'horas_por_dia', 'enquanto_trabalha', 'instrumentalista', 'compositor', 'exploratorio', 'em_outros_idiomas', 'bpm', 'frequencia_musica_classica', 'frequencia_country', 'frequencia_edm', 'frequencia_folk', 'frequencia_gospel', 'frequencia_hip_hop', 'frequencia_jazz', 'frequencia_kpop', 'frequencia_latin', 'frequencia_lofi', 'frequencia_metal', 'frequencia_pop', 'frequencia_rb', 'frequencia_rap', 'frequencia_rock', 'frequencia_musica_videogame', 'ansiedade', 'depressao', 'insonia', 'toc']


---
## Computing Measures of Central Tendency and Dispersion

In [51]:
medidas_dispersao = {}

for column in lista_colunas_numericas:
    medidas_dispersao[column] = {
        'Media': df[column].mean(),
        'Mediana': df[column].median(),
        'Moda': df[column].mode()[0],
        'Desvio Padrão': df[column].std(),
        'Variância': df[column].var(),
        'Amplitude': df[column].max() - df[column].min()
    }

# Convert to a DataFrame (one row per column of the original data)
medidas_df = pd.DataFrame(medidas_dispersao).T
medidas_df.columns = ['Media', 'Mediana', 'Moda', 'Desvio Padrão', 'Variância', 'Amplitude']
medidas_df

Unnamed: 0,Media,Mediana,Moda,Desvio Padrão,Variância,Amplitude
idade,25.2068,21.0,18.0,12.05497,145.3223,79.0
horas_por_dia,3.572758,3.0,2.0,3.028199,9.169988,24.0
enquanto_trabalha,0.7899045,1.0,1.0,0.4076544,0.1661821,1.0
instrumentalista,0.3210383,0.0,0.0,0.4671947,0.2182709,1.0
compositor,0.1714286,0.0,0.0,0.3771397,0.1422343,1.0
exploratorio,0.7133152,1.0,1.0,0.4525205,0.2047748,1.0
em_outros_idiomas,0.5519126,1.0,1.0,0.4976378,0.2476434,1.0
bpm,1589948.0,120.0,120.0,39872610.0,1589825000000000.0,999999999.0
frequencia_musica_classica,1.335598,1.0,1.0,0.9884416,0.9770168,3.0
frequencia_country,0.8179348,1.0,0.0,0.9225838,0.8511609,3.0
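
The same table can also be built without an explicit loop. The sketch below uses `DataFrame.agg` for the measures pandas provides directly and adds the mode and the range separately; the helper name `resumo_estatistico` and the sample data are assumptions made for illustration.

```python
import pandas as pd


def resumo_estatistico(df: pd.DataFrame) -> pd.DataFrame:
    """Central-tendency and dispersion measures for the numeric columns.

    Hypothetical helper; mirrors the loop above but is not part of the project.
    """
    numericas = df.select_dtypes(include="number")
    resumo = numericas.agg(["mean", "median", "std", "var"]).T
    resumo.columns = ["Media", "Mediana", "Desvio Padrão", "Variância"]
    # mode() may return several values per column; take the first row,
    # which matches df[column].mode()[0] in the loop version.
    resumo.insert(2, "Moda", numericas.mode().iloc[0])
    resumo["Amplitude"] = numericas.max() - numericas.min()
    return resumo


exemplo = pd.DataFrame({
    "idade": [18.0, 18.0, 21.0, 63.0],
    "horas_por_dia": [3.0, 1.5, 4.0, None],
})
resumo = resumo_estatistico(exemplo)
print(resumo)
```

Both versions skip missing values by default, so the results should agree with the loop on the same data.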


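As the table shows, the raw output mixes very different precisions and magnitudes (the bpm variance, for instance, runs to sixteen digits), which makes it hard to read. We therefore format the values to two decimal places.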

In [52]:
# Format numeric values as strings with two decimal places and thousands separators
df_medidas_centralidade = medidas_df.map(lambda x: f"{x:,.2f}" if isinstance(x, (int, float)) else x)

# Display the formatted DataFrame
display(df_medidas_centralidade)

Unnamed: 0,Media,Mediana,Moda,Desvio Padrão,Variância,Amplitude
idade,25.21,21.0,18.0,12.05,145.32,79.0
horas_por_dia,3.57,3.0,2.0,3.03,9.17,24.0
enquanto_trabalha,0.79,1.0,1.0,0.41,0.17,1.0
instrumentalista,0.32,0.0,0.0,0.47,0.22,1.0
compositor,0.17,0.0,0.0,0.38,0.14,1.0
exploratorio,0.71,1.0,1.0,0.45,0.2,1.0
em_outros_idiomas,0.55,1.0,1.0,0.5,0.25,1.0
bpm,1589948.34,120.0,120.0,39872606.18,1589824723615030.0,999999999.0
frequencia_musica_classica,1.34,1.0,1.0,0.99,0.98,3.0
frequencia_country,0.82,1.0,0.0,0.92,0.85,3.0
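
The bpm row stands out in both tables: its median and mode are 120, yet its range is 999,999,999, so at least one entry lies far outside any plausible tempo and dominates the mean and the variance. Formatting does not change that. A minimal sketch of how such entries could be masked before recomputing the statistics follows; the 20 to 300 BPM bounds and the helper name `mascarar_bpm_implausivel` are assumptions for illustration, not part of the original analysis.

```python
import pandas as pd

# Assumed plausibility bounds for a tempo in beats per minute (illustrative).
BPM_MIN, BPM_MAX = 20, 300


def mascarar_bpm_implausivel(bpm: pd.Series) -> pd.Series:
    """Replace values outside [BPM_MIN, BPM_MAX] with NaN (hypothetical helper)."""
    return bpm.where(bpm.between(BPM_MIN, BPM_MAX))


# Made-up values, including one extreme entry like the one implied by the table.
exemplo = pd.Series([120.0, 95.0, 140.0, 999999999.0], name="bpm")
limpo = mascarar_bpm_implausivel(exemplo)

print("mean before:", exemplo.mean())
print("mean after: ", limpo.mean())
print("median after:", limpo.median())
```

Masking rather than dropping rows keeps the other answers from the same respondent available for the rest of the analysis.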
