<a href="https://colab.research.google.com/github/saumilhj/projects/blob/main/MusicVSMentalHealth.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**EFFECT OF MUSIC ON MENTAL HEALTH**

Dataset from Kaggle: https://www.kaggle.com/datasets/catherinerasgaitis/mxmh-survey-results

Music therapy is the use of music to improve overall mental health. It is an evidence-based practice that uses music as a catalyst to release "happy" hormones such as oxytocin.<br><br>
This dataset is used to explore the correlation between an individual's music taste and their self-reported mental health.

In [1]:
import pandas as pd
import numpy as np

import plotly.express as px

Import data

Data summary:


*   Respondents rated how often they listen to each of 16 music genres: Never, Rarely, Sometimes or Very frequently
*   Respondents rated their Anxiety, Depression, Insomnia and OCD on a scale of 0 to 10.
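Because the four frequency answers have a natural order, they can be stored as an ordered categorical when a numeric or sortable form is needed. The block below is a minimal sketch on a hypothetical toy column, not part of the original analysis; `FREQ_LEVELS`, `toy` and `rock_code` are names introduced here for illustration.

```python
import pandas as pd

# Ordered levels, lowest to highest listening frequency (as listed in the summary above).
FREQ_LEVELS = ["Never", "Rarely", "Sometimes", "Very frequently"]

# Hypothetical toy answers standing in for one 'Frequency [...]' column.
toy = pd.DataFrame(
    {"Frequency [Rock]": ["Never", "Very frequently", "Rarely", "Sometimes"]}
)

# An ordered categorical keeps the labels but lets pandas compare and sort them.
freq = pd.Categorical(toy["Frequency [Rock]"], categories=FREQ_LEVELS, ordered=True)
toy["Frequency [Rock]"] = freq

# Integer codes 0..3 follow the order of FREQ_LEVELS.
toy["rock_code"] = freq.codes
print(toy)
```

Treating the codes as evenly spaced numbers is itself an assumption; the ordering is what the survey actually guarantees.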



In [2]:
df = pd.read_csv("music_survey.csv")

In [3]:
df.head()

Unnamed: 0,Timestamp,Age,Primary streaming service,Hours per day,While working,Instrumentalist,Composer,Fav genre,Exploratory,Foreign languages,...,Frequency [R&B],Frequency [Rap],Frequency [Rock],Frequency [Video game music],Anxiety,Depression,Insomnia,OCD,Music effects,Permissions
0,8/27/2022 19:29:02,18.0,Spotify,3.0,Yes,Yes,Yes,Latin,Yes,Yes,...,Sometimes,Very frequently,Never,Sometimes,3.0,0.0,1.0,0.0,,I understand.
1,8/27/2022 19:57:31,63.0,Pandora,1.5,Yes,No,No,Rock,Yes,No,...,Sometimes,Rarely,Very frequently,Rarely,7.0,2.0,2.0,1.0,,I understand.
2,8/27/2022 21:28:18,18.0,Spotify,4.0,No,No,No,Video game music,No,Yes,...,Never,Rarely,Rarely,Very frequently,7.0,7.0,10.0,2.0,No effect,I understand.
3,8/27/2022 21:40:40,61.0,YouTube Music,2.5,Yes,No,Yes,Jazz,Yes,Yes,...,Sometimes,Never,Never,Never,9.0,7.0,3.0,3.0,Improve,I understand.
4,8/27/2022 21:54:47,18.0,Spotify,4.0,Yes,No,No,R&B,Yes,No,...,Very frequently,Very frequently,Never,Rarely,7.0,2.0,5.0,9.0,Improve,I understand.


In [4]:
list(df.columns)

['Timestamp',
 'Age',
 'Primary streaming service',
 'Hours per day',
 'While working',
 'Instrumentalist',
 'Composer',
 'Fav genre',
 'Exploratory',
 'Foreign languages',
 'BPM',
 'Frequency [Classical]',
 'Frequency [Country]',
 'Frequency [EDM]',
 'Frequency [Folk]',
 'Frequency [Gospel]',
 'Frequency [Hip hop]',
 'Frequency [Jazz]',
 'Frequency [K pop]',
 'Frequency [Latin]',
 'Frequency [Lofi]',
 'Frequency [Metal]',
 'Frequency [Pop]',
 'Frequency [R&B]',
 'Frequency [Rap]',
 'Frequency [Rock]',
 'Frequency [Video game music]',
 'Anxiety',
 'Depression',
 'Insomnia',
 'OCD',
 'Music effects',
 'Permissions']

Drop columns that are not needed

In [5]:
df.drop(columns=['Timestamp', 'BPM', 'Permissions'], inplace=True)

In [6]:
print(f"Total number of records: {len(df.index)}")

Total number of records: 736


Check duplicates

In [7]:
df.duplicated(keep='first').sum()

0

Check NaN

In [8]:
df.isna().sum()

Age                             1
Primary streaming service       1
Hours per day                   0
While working                   3
Instrumentalist                 4
Composer                        1
Fav genre                       0
Exploratory                     0
Foreign languages               4
Frequency [Classical]           0
Frequency [Country]             0
Frequency [EDM]                 0
Frequency [Folk]                0
Frequency [Gospel]              0
Frequency [Hip hop]             0
Frequency [Jazz]                0
Frequency [K pop]               0
Frequency [Latin]               0
Frequency [Lofi]                0
Frequency [Metal]               0
Frequency [Pop]                 0
Frequency [R&B]                 0
Frequency [Rap]                 0
Frequency [Rock]                0
Frequency [Video game music]    0
Anxiety                         0
Depression                      0
Insomnia                        0
OCD                             0
Music effects 

In [9]:
df.dropna(how='any', axis=0, inplace=True)
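Dropping every row with any missing value is simple but can discard more data than a given analysis needs. The following sketch, on a hypothetical toy frame, shows one way to count how many rows are lost and an alternative that only requires the columns actually used; `toy`, `cleaned` and `cleaned_subset` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the notebook applies dropna to the survey DataFrame instead.
toy = pd.DataFrame(
    {
        "Age": [18.0, np.nan, 25.0, 40.0],
        "Music effects": ["Improve", "Improve", np.nan, "No effect"],
    }
)

# Same arguments as the cell above, but returning a new frame so the original stays available.
cleaned = toy.dropna(how="any", axis=0)

# Number of rows removed because at least one value was missing.
rows_dropped = len(toy) - len(cleaned)
print(f"Dropped {rows_dropped} of {len(toy)} rows")

# Alternative: only require the columns a particular analysis uses.
cleaned_subset = toy.dropna(subset=["Music effects"])
print(f"Rows kept when only 'Music effects' is required: {len(cleaned_subset)}")
```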

Number of users on each streaming platform

In [10]:
df_platform = df.groupby(['Primary streaming service'], as_index=False).count()[['Primary streaming service','Age']].sort_values(by='Age', ascending=False)
df_platform

Unnamed: 0,Primary streaming service,Age
4,Spotify,450
5,YouTube Music,90
1,I do not use a streaming service.,69
0,Apple Music,50
2,Other streaming service,49
3,Pandora,10


In [11]:
fig = px.bar(df_platform, x='Primary streaming service', y='Age', labels={'Primary streaming service': 'Streaming Service', 'Age': 'No. of users'},
             title='No. of users per streaming service')
fig.show()

Average age of users on each streaming platform

In [12]:
df_avg_age = df.groupby(['Primary streaming service'], as_index=False).agg({'Age': 'mean'})
df_avg_age['Age'] = df_avg_age['Age'].round()
df_avg_age.sort_values(by='Age', ascending=False, inplace=True)

In [13]:
fig = px.bar(df_avg_age, x='Primary streaming service', y='Age', labels={'Primary streaming service': 'Streaming Service', 'Age': 'Average age of users'}, 
             title='Average age of users for music streaming services')
fig.show()

Spotify has the most users, and its users have the lowest average age of all the services.
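The user count and the average age can also be read side by side from a single table. This is a sketch using pandas named aggregation on hypothetical toy data (the ages below are invented and do not come from the survey); `summary`, `users` and `avg_age` are names chosen for this example.

```python
import pandas as pd

# Hypothetical toy rows; in the notebook this would be the cleaned survey DataFrame.
toy = pd.DataFrame(
    {
        "Primary streaming service": ["Spotify", "Spotify", "Spotify", "Pandora", "Pandora"],
        "Age": [18, 22, 20, 60, 50],
    }
)

# One row per service with both the number of users and their mean age.
summary = (
    toy.groupby("Primary streaming service", as_index=False)
    .agg(users=("Age", "count"), avg_age=("Age", "mean"))
    .sort_values("users", ascending=False)
)
print(summary)
```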

Total number of users who reported positive, negative or no effect

In [14]:
df_effect = df.groupby(['Music effects'], as_index=False).count()[['Music effects', 'Age']]
df_effect

Unnamed: 0,Music effects,Age
0,Improve,535
1,No effect,166
2,Worsen,17


In [15]:
fig = px.bar(df_effect, x='Music effects', y='Age', labels={'Music effects': 'Music Effects', 'Age': 'Total people'},
             title='Type of effect vs total people')
fig.show()

535 people reported an improvement in their mental state, 166 reported no effect, and only 17 reported a worsening.

Favourite genre of users and the effect

In [16]:
fig = px.sunburst(df, path=['Music effects', 'Fav genre'], title='Effect of music and favourite genre count of users')
fig.show()

Rock is the most common favourite genre in all three categories.
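A reading taken from a sunburst chart can be cross-checked numerically. The sketch below uses `pd.crosstab` and `idxmax` on hypothetical toy rows (deliberately not the survey data, so its result differs from the statement above) to find the most frequent favourite genre within each effect category.

```python
import pandas as pd

# Hypothetical toy rows standing in for the 'Music effects' and 'Fav genre' columns.
toy = pd.DataFrame(
    {
        "Music effects": ["Improve", "Improve", "Improve", "No effect", "No effect", "No effect", "Worsen"],
        "Fav genre": ["Rock", "Rock", "Pop", "Pop", "Pop", "Rock", "Rock"],
    }
)

# Rows are effect categories, columns are genres, cells are respondent counts.
counts = pd.crosstab(toy["Music effects"], toy["Fav genre"])

# For each effect category, the genre with the highest count.
top_genre = counts.idxmax(axis=1)
print(top_genre)
```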

Distribution of effect based on listening to music when working

In [17]:
df_while_working_effect = df.groupby(['While working', 'Music effects'], as_index=False).count()[['While working', 'Music effects', 'Age']]
df_while_working_effect

Unnamed: 0,While working,Music effects,Age
0,No,Improve,90
1,No,No effect,55
2,No,Worsen,6
3,Yes,Improve,445
4,Yes,No effect,111
5,Yes,Worsen,11


In [18]:
fig = px.bar(df_while_working_effect, x='While working', y='Age', color='Music effects', labels={'Age': 'Total people'},
           title='Distribution of effect based on listening to music when working')
fig.show()

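Almost five times as many respondents who listen to music while working reported an improvement (445) as respondents who do not (90). The first group is also much larger, so the rates are a fairer comparison: about 78% (445 of 567) of those who listen while working reported an improvement, against about 60% (90 of 151) of those who do not.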
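Since the two groups differ greatly in size, within-group percentages are easier to compare than raw counts. The sketch below re-enters the counts from the table above and computes each effect's share within its own group; the column names `Total people` and `Share (%)` are chosen for this example.

```python
import pandas as pd

# Counts copied from the 'While working' table above.
counts = pd.DataFrame(
    {
        "While working": ["No", "No", "No", "Yes", "Yes", "Yes"],
        "Music effects": ["Improve", "No effect", "Worsen"] * 2,
        "Total people": [90, 55, 6, 445, 111, 11],
    }
)

# Size of each 'While working' group, broadcast back to every row.
group_total = counts.groupby("While working")["Total people"].transform("sum")

# Share of each effect within its own group, in percent.
counts["Share (%)"] = (100 * counts["Total people"] / group_total).round(1)
print(counts)
```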

Foreign languages effect

In [19]:
df_fl_effect = df.groupby(['Foreign languages', 'Music effects'], as_index=False).count()[['Foreign languages', 'Music effects', 'Age']]
df_fl_effect

Unnamed: 0,Foreign languages,Music effects,Age
0,No,Improve,239
1,No,No effect,77
2,No,Worsen,7
3,Yes,Improve,296
4,Yes,No effect,89
5,Yes,Worsen,10


In [20]:
fig = px.bar(df_fl_effect, x='Foreign languages', y='Age', color='Music effects', labels={'Age': 'Total people'},
           title='Distribution of effect based on listening to foreign language music')
fig.show()

Listening to foreign language music makes almost no difference to the reported effect: about 75% (296 of 395) of those who do reported an improvement, against about 74% (239 of 323) of those who do not.
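One way to make "negligible" more concrete is a chi-square test of independence on the counts above. This check is an addition for illustration rather than part of the original analysis; the sketch computes the statistic with NumPy and compares it with the standard 5% critical value for 2 degrees of freedom (5.991).

```python
import numpy as np

# Rows: Foreign languages = No, Yes. Columns: Improve, No effect, Worsen.
observed = np.array([[239, 77, 7],
                     [296, 89, 10]], dtype=float)

# Expected counts if the two variables were independent.
row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 3)
expected = row_totals @ col_totals / observed.sum()

# Pearson chi-square statistic and its degrees of freedom.
chi2 = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Critical value of the chi-square distribution at alpha = 0.05 with 2 degrees of freedom.
CRITICAL_5PCT_DOF2 = 5.991
print(f"chi2 = {chi2:.3f}, dof = {dof}")
print("Reject independence at 5%:", chi2 > CRITICAL_5PCT_DOF2)
```

A statistic well below the critical value is consistent with the reading that the two variables are unrelated in this sample.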

Relation between psychological ailment and effect of music

In [21]:
df_ailment = df[['Anxiety', 'Depression', 'Insomnia', 'OCD', 'Music effects']]
df_ailment = df_ailment.groupby(['Music effects'], as_index=False).mean()
df_ailment

Unnamed: 0,Music effects,Anxiety,Depression,Insomnia,OCD
0,Improve,6.038318,4.84486,3.720561,2.693458
1,No effect,5.096386,4.439759,3.626506,2.39759
2,Worsen,6.764706,7.176471,4.529412,3.117647


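This table suggests an association: respondents who reported that music worsens their condition have, on average, the highest scores for all four ailments, most notably depression (7.18 against 4.84 for those who reported an improvement). This group contains only 17 respondents, and the table does not show whether music is the cause.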
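The comparison between groups can be made explicit by re-entering the means from the table above together with the group sizes reported earlier. This is a small sketch; `means`, `group_size`, `highest_group` and `gap_vs_improve` are names introduced for this example.

```python
import pandas as pd

# Mean self-reported scores per 'Music effects' group, copied from the table above.
means = pd.DataFrame(
    {
        "Anxiety": [6.038318, 5.096386, 6.764706],
        "Depression": [4.844860, 4.439759, 7.176471],
        "Insomnia": [3.720561, 3.626506, 4.529412],
        "OCD": [2.693458, 2.397590, 3.117647],
    },
    index=["Improve", "No effect", "Worsen"],
)

# Group sizes from the earlier 'Music effects' count table.
group_size = pd.Series({"Improve": 535, "No effect": 166, "Worsen": 17})

# Which group has the highest mean for each ailment.
highest_group = means.idxmax(axis=0)

# How far the 'Worsen' means sit above the 'Improve' means.
gap_vs_improve = (means.loc["Worsen"] - means.loc["Improve"]).round(2)

print(highest_group)
print(gap_vs_improve)
print(group_size)
```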