### About Dataset:
#### The data comes from a public Kaggle dataset for music and mental health analysis: https://www.kaggle.com/datasets/catherinerasgaitis/mxmh-survey-results

### Data Context:
#### Music therapy (MT) is used to improve an individual's stress, mood, and overall mental health. MT is also recognized as an evidence-based practice, using music as a catalyst for "happy" hormones such as oxytocin. However, MT employs a wide range of genres, varying from one organization to the next.

### Data Includes:
#### * Age group and preferred music genre, as self-reported in the survey
#### * Conditions such as Anxiety, Depression, Insomnia, and OCD, and the time spent listening to music per day
#### * Effect of music on their condition

### Problem Statement:
#### * Find correlations between an individual's preferred music genre and mental health.
#### * Analyze and visualize the effect of music on their condition according to the data.

### Importing Libraries, Data and Getting data info

In [None]:
library(tidyverse)
library(tidyr)
library(ggplot2)
library(dplyr)
library(lubridate)
library(RColorBrewer)

In [None]:
#Import data
df<- read.csv('mxmh_survey_results.csv')

In [None]:
# Preview the first and last rows of the data
head(df)

In [None]:
tail(df)

In [None]:
#Summary of the data
summary(df)

In [None]:
# Convert literal "NA" strings to missing values
df[df == "NA"] <- NA

In [None]:
df<-drop_na(df,Age)
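
Before dropping rows, it can help to see how many values are actually missing in each column. The following is a minimal sketch in base R, not part of the original analysis. The `toy` data frame is a hypothetical stand-in for the survey data; only its column names mirror the dataset.

```r
# Hypothetical toy data standing in for the survey (values are made up)
toy <- data.frame(
  Age = c(18, NA, 25, 31),
  BPM = c(120, 140, NA, NA),
  Fav.genre = c("Rock", "Pop", "", "Jazz"),
  stringsAsFactors = FALSE
)

# Count missing values per column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Empty strings are not NA, so count them separately
empty_counts <- sapply(toy, function(col) sum(col == "", na.rm = TRUE))
print(empty_counts)
```

Running the same two checks on `df` would show which columns need `drop_na()` and which need a filter on empty strings, as used later in this notebook.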

### Analyzing and Visualizing Data

In [None]:
count(df,Age,Depression,Insomnia,Anxiety,OCD)

In [None]:
#plot distribution for Age:
ggplot(df, aes(x=Age))+
geom_histogram(bins=10,fill="purple")+
    labs(title = "Distribution of Age ", x="Age")

In [None]:
#plot distribution for Depression
ggplot(df, aes(x=Depression))+
geom_histogram(bins=10,fill="darkgreen")+
    labs(title = "Distribution of Depression ", x="Depression")


In [None]:
# plot distribution for Insomnia
ggplot(df, aes(x=Insomnia))+
geom_histogram(bins=10,fill="blue")+
    labs(title = "Distribution of Insomnia ", x="Insomnia")


In [None]:
# plot distribution for Anxiety
ggplot(df, aes(x=Anxiety))+
geom_histogram(bins=10,fill="pink")+
    labs(title = "Distribution of Anxiety ", x="Anxiety")


In [None]:
# plot distribution for OCD
ggplot(df, aes(x=OCD))+
geom_histogram(bins=10,fill="orange")+
    labs(title = "Distribution of OCD ", x="OCD")


### Cleaning Data for further analysis
##### Comparing the maximum values with the typical range of Age, Hours.per.day, and BPM shows outliers in the data.

In [None]:
# Drop rows where Age is over 60
df<-subset(df, Age <= 60 )

In [None]:
# Drop rows where Hours.per.day is over 15

df<- subset(df, Hours.per.day <= 15 )

In [None]:
# Drop rows where BPM is over 250

df <- subset(df, BPM <= 250 )
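
One side effect of filtering with `subset()` is worth keeping in mind: a condition that evaluates to `NA` is treated as `FALSE`, so rows with a missing value in the filtered column are dropped along with the outliers. The sketch below illustrates this on a hypothetical `toy` data frame (the values are made up) and shows one possible way to keep rows with missing values, if that is preferred.

```r
# Hypothetical toy data: one missing BPM and one implausible BPM
toy <- data.frame(id = 1:4, BPM = c(120, NA, 999, 90))

# subset() treats NA in the condition as FALSE, so row 2 is dropped too
strict <- subset(toy, BPM <= 250)

# Keep rows with missing BPM while still removing implausible values
lenient <- subset(toy, is.na(BPM) | BPM <= 250)

print(strict$id)
print(lenient$id)
```

Which variant is appropriate depends on whether later steps need BPM; the analysis in this notebook uses the strict form.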

In [None]:
# Drop columns Timestamp, Permissions

df <- subset(df, select = -c(Timestamp, Permissions))

In [None]:
df

### Data Analysis and Visualization

#### Streaming platforms and their user counts

In [None]:
#Create a dataframe for streaming platform

streaming_platform <- df%>%
filter (Primary.streaming.service != "") %>%
group_by (Primary.streaming.service)%>%
summarize (users = n())%>%
arrange (desc(users))
View (streaming_platform)

In [None]:
# Visualize streaming_platform

ggplot(data = streaming_platform, aes(x = reorder(Primary.streaming.service, -users), y = users)) + 
geom_col(fill = 'brown') + 
labs(title = "Streaming Platforms by Popularity")+
xlab("Streaming Platforms") + 
theme(axis.text.x=element_text(size =15, angle =90))


##### Spotify is the most popular streaming platform, while Pandora is the least used.

### Age distribution of the data

In [None]:
ggplot(df, aes(x = Age)) + geom_histogram(binwidth = 3, fill = "purple", colour = "black") + labs(title = "Age distribution")

##### Most respondents are between 15 and 40 years old, so this age group dominates the survey.

### Analyze for Genre Preference

In [None]:
Genre <- df%>%
group_by(Fav.genre)%>%
summarize(number = n())%>%
arrange(desc(number))

In [None]:
ggplot(Genre, aes(x = number, y = reorder(Fav.genre, number))) + geom_col(fill = 'pink') + labs(title = "Genre Preference") + ylab("Fav Genre")

##### Rock is the most preferred genre of music, while Latin is the least preferred.

In [None]:
# Check the age distribution of genre preference

ggplot(data = df, aes(x=Age,y=Fav.genre)) + geom_boxplot(colour = "blue")

### Music Listening Preference

In [None]:
working <- df %>%
filter (While.working != "") %>%
group_by (While.working)

In [None]:
ggplot(data=working, aes(x = While.working)) + geom_bar( fill = 'pink')  + labs(title = "Preference for listening to music while working")+ xlab("While working") + theme(axis.text.x= element_text(size =10))

##### Most people listen to music while working.

In [None]:
# Hours of music listened to per day
ggplot(df, aes(x = Hours.per.day)) + geom_histogram(binwidth = 1, fill = "purple", colour = "black") + labs(title = "Hours of Music listened to daily")

##### The majority of people listen to music for 1-3 hours per day.

### Visualizing the relationship between Age and hours of music listened to daily

In [None]:
ggplot(data =df, aes( x = Age, y = Hours.per.day)) + geom_smooth (method = 'lm', colour = "darkgreen") 
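
The plot above draws only the fitted line. To put a number on the relationship, one option is a rank correlation together with the slope of the same linear model. This is a sketch under assumptions: the `toy` data frame is hypothetical, and the choice of Spearman correlation is ours, not something the original analysis specifies.

```r
# Hypothetical toy data (values are made up); column names mirror the dataset
toy <- data.frame(
  Age = c(16, 19, 22, 30, 41, 55),
  Hours.per.day = c(6, 5, 4, 3, 2, 1)
)

# Spearman rank correlation between age and daily listening time
rho <- cor(toy$Age, toy$Hours.per.day,
           method = "spearman", use = "complete.obs")

# Slope of the linear fit that geom_smooth(method = 'lm') draws
fit <- lm(Hours.per.day ~ Age, data = toy)
slope <- coef(fit)[["Age"]]

print(rho)
print(slope)
```

Replacing `toy` with `df` would give the corresponding values for the survey data.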

### Comparing data between instrumentalists and composers

In [None]:
df%>%
filter (Instrumentalist != "") %>%
count(Instrumentalist) %>%
group_by (Instrumentalist)

In [None]:
df%>%
filter (Composer != "") %>%
count(Composer) %>%
group_by (Composer)

##### The result suggests that more listeners are instrumentalists than composers.

### Analysis for Mental Conditions

In [None]:
# Check the age distribution of people who experience each mental condition

ggplot(data = df, aes(x = Age, y = Depression)) + geom_smooth(colour = "darkgreen") + labs(title = "Age distribution: Depression")
ggplot(data = df, aes(x = Age, y = Anxiety)) + geom_smooth(colour = "navyblue") + labs(title = "Age distribution: Anxiety")
ggplot(data = df, aes(x = Age, y = Insomnia)) + geom_smooth(colour = "red") + labs(title = "Age distribution: Insomnia")
ggplot(data = df, aes(x = Age, y = OCD)) + geom_smooth(colour = "brown") + labs(title = "Age distribution: OCD")

##### Respondents aged 10 to 40 report higher levels of Anxiety, Insomnia, and OCD than older respondents. People in their twenties, however, report higher levels of Depression, and people in their forties report higher levels of Insomnia, than other age groups.

### Analyze and Visualize the effect of music on mental health conditions

In [None]:
mhc <- df %>%
filter (Music.effects != "") %>%
group_by (Music.effects) %>%
summarize(number =n())

View (mhc)

In [None]:
ggplot(mhc, aes(x= Music.effects, y = number)) + geom_col(fill = "navyblue") + labs(title = "Music effect on Mental Condition")+ xlab("Music.Effects") + theme(axis.text.x= element_text(size =10)) 

##### 75% of people indicated that their mental conditions improve with music, while 3% reported that their mental conditions worsened with music.
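
Percentages like these can be derived directly from the counts with `prop.table()`. The sketch below uses a small hypothetical vector of answers (made-up values, not the survey data) to show the calculation.

```r
# Hypothetical answers standing in for the Music.effects column
effects <- c("Improve", "Improve", "Improve", "No effect",
             "Worsen", "", "Improve")

# Drop empty answers, as in the filter used above
effects <- effects[effects != ""]

# Share of each answer in percent, rounded to one decimal place
share <- round(100 * prop.table(table(effects)), 1)
print(share)
```

Applied to `df$Music.effects`, the same three lines would give the shares for the survey.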

### View mental health effect per condition

In [None]:
#Create subset for Anxiety

Anxiety <- subset(df, select = c(Fav.genre, Anxiety, Music.effects))%>%
filter(Music.effects != "") %>%
filter(Anxiety > 0)

In [None]:
ggplot(Anxiety, aes(x = Fav.genre, fill = Music.effects)) + geom_bar(position="dodge") + labs(title = "Music effect on Anxiety") + xlab("Fav Genre") + theme(axis.text.x= element_text(size =10, angle = 90)) + scale_fill_manual(breaks =c("Improve","No effect","Worsen"),values=c("navyblue", "violet", "orange")) 

In [None]:
#Create subset for Depression

Depression <- subset(df, select = c(Fav.genre, Depression, Music.effects))%>%
filter(Music.effects != "")%>%
filter(Depression > 0)

In [None]:
ggplot(Depression, aes(x = Fav.genre, fill = Music.effects)) + geom_bar(position="dodge") + labs(title = "Music effect on Depression") + xlab("Fav Genre") + theme(axis.text.x= element_text(size =10, angle = 90)) + scale_fill_manual(breaks =c("Improve","No effect","Worsen"),values=c("pink", "darkgreen", "yellow")) 

In [None]:
#Create subset for Insomnia

Insomnia <- subset(df, select = c(Fav.genre, Insomnia, Music.effects))%>%
filter(Music.effects != "") %>%
filter(Insomnia > 0)

In [None]:
ggplot(Insomnia, aes(x = Fav.genre, fill = Music.effects)) + geom_bar(position="dodge") + labs(title = "Music effect on Insomnia") + xlab("Fav Genre") + theme(axis.text.x= element_text(size =10, angle = 90)) + scale_fill_manual(breaks =c("Improve","No effect","Worsen"),values=c("Purple", "brown", "violet")) 

In [None]:
#Create subset for OCD

ocd <- subset(df, select = c(Fav.genre, OCD, Music.effects))%>%
filter(Music.effects != "") %>%
filter(OCD > 0)

In [None]:
ggplot(ocd, aes(x = Fav.genre, fill = Music.effects)) + geom_bar(position="dodge") + labs(title = "Music effect on OCD") + xlab("Fav Genre") + theme(axis.text.x= element_text(size =10, angle = 90)) + scale_fill_manual(breaks =c("Improve","No effect","Worsen"),values=c("navyblue", "yellow", "grey")) 
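
The bar charts above show raw counts, so popular genres have tall bars for every answer. To compare genres on equal footing, the counts could be converted to within-genre shares. This is a minimal sketch with a hypothetical `toy` data frame (made-up values); it is an addition for illustration, not part of the original analysis.

```r
# Hypothetical toy data; column names mirror the dataset
toy <- data.frame(
  Fav.genre = c("Rock", "Rock", "Rock", "Rock", "Jazz", "Jazz"),
  Music.effects = c("Improve", "Improve", "Improve", "Worsen",
                    "Improve", "No effect")
)

# Cross-tabulate genre against reported effect
counts <- table(toy$Fav.genre, toy$Music.effects)

# margin = 1 normalizes each row, so every genre sums to 1
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

With shares per genre, a genre with few listeners can be compared directly with a popular one, although small groups give noisy estimates.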

**In this self-reported survey, music appears to be an effective tool for regulating mental health conditions. While some genres may worsen conditions for some listeners, most respondents report that music improves their mental health.**