# Mental Health Survey – Exploratory Analysis

This notebook explores a real-world mental health survey dataset.
The objective is to understand the structure of the data and identify variables relevant for behavioral and satisfaction analysis.


In [3]:
import pandas as pd

df = pd.read_csv('/content/Mental Health Dataset.csv')
df.head()

,Timestamp,Gender,Country,Occupation,self_employed,family_history,treatment,Days_Indoors,Growing_Stress,Changes_Habits,Mental_Health_History,Mood_Swings,Coping_Struggles,Work_Interest,Social_Weakness,mental_health_interview,care_options
0,2014-08-27 11:29:31,Female,United States,Corporate,,No,Yes,1-14 days,Yes,No,Yes,Medium,No,No,Yes,No,Not sure
1,2014-08-27 11:31:50,Female,United States,Corporate,,Yes,Yes,1-14 days,Yes,No,Yes,Medium,No,No,Yes,No,No
2,2014-08-27 11:32:39,Female,United States,Corporate,,Yes,Yes,1-14 days,Yes,No,Yes,Medium,No,No,Yes,No,Yes
3,2014-08-27 11:37:59,Female,United States,Corporate,No,Yes,Yes,1-14 days,Yes,No,Yes,Medium,No,No,Yes,Maybe,Yes
4,2014-08-27 11:43:36,Female,United States,Corporate,No,Yes,Yes,1-14 days,Yes,No,Yes,Medium,No,No,Yes,No,Yes


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 292364 entries, 0 to 292363
Data columns (total 17 columns):
 #   Column                   Non-Null Count   Dtype 
---  ------                   --------------   ----- 
 0   Timestamp                292364 non-null  object
 1   Gender                   292364 non-null  object
 2   Country                  292364 non-null  object
 3   Occupation               292364 non-null  object
 4   self_employed            287162 non-null  object
 5   family_history           292364 non-null  object
 6   treatment                292364 non-null  object
 7   Days_Indoors             292364 non-null  object
 8   Growing_Stress           292364 non-null  object
 9   Changes_Habits           292364 non-null  object
 10  Mental_Health_History    292364 non-null  object
 11  Mood_Swings              292364 non-null  object
 12  Coping_Struggles         292364 non-null  object
 13  Work_Interest            292364 non-null  object
 14  Social_Weakness     

In [5]:
columns_of_interest = [
    "Gender",
    "Occupation",
    "Days_Indoors",
    "Growing_Stress",
    "Mood_Swings",
    "Work_Interest"
]

# Take an explicit copy so later edits to df_clean do not touch df
df_clean = df[columns_of_interest].copy()
df_clean.head()


,Gender,Occupation,Days_Indoors,Growing_Stress,Mood_Swings,Work_Interest
0,Female,Corporate,1-14 days,Yes,Medium,No
1,Female,Corporate,1-14 days,Yes,Medium,No
2,Female,Corporate,1-14 days,Yes,Medium,No
3,Female,Corporate,1-14 days,Yes,Medium,No
4,Female,Corporate,1-14 days,Yes,Medium,No


In [6]:
for col in df_clean.columns:
    print(col)
    print(df_clean[col].value_counts())
    print("-" * 30)

Gender
Gender
Male      239850
Female     52514
Name: count, dtype: int64
------------------------------
Occupation
Occupation
Housewife    66351
Student      61794
Corporate    61229
Others       52841
Business     50149
Name: count, dtype: int64
------------------------------
Days_Indoors
Days_Indoors
1-14 days             63548
31-60 days            60705
Go out Every day      58366
More than 2 months    55916
15-30 days            53829
Name: count, dtype: int64
------------------------------
Growing_Stress
Growing_Stress
Maybe    99985
Yes      99653
No       92726
Name: count, dtype: int64
------------------------------
Mood_Swings
Mood_Swings
Medium    101064
Low        99834
High       91466
Name: count, dtype: int64
------------------------------
Work_Interest
Work_Interest
No       105843
Maybe    101185
Yes       85336
Name: count, dtype: int64
------------------------------


In [7]:
stress_work_interest = (
    df_clean
    .groupby("Growing_Stress")["Work_Interest"]
    .value_counts(normalize=True)
    .rename("proportion")
    .reset_index()
)

stress_work_interest


,Growing_Stress,Work_Interest,proportion
0,Maybe,No,0.422303
1,Maybe,Maybe,0.296865
2,Maybe,Yes,0.280832
3,No,Maybe,0.390236
4,No,No,0.342504
5,No,Yes,0.267261
6,Yes,Maybe,0.35441
7,Yes,Yes,0.325881
8,Yes,No,0.319709


In [10]:
stress_mood_swings = (
    df_clean
    .groupby("Growing_Stress")["Mood_Swings"]
    .value_counts(normalize=True)
    .rename("proportion")
    .reset_index()
)

stress_mood_swings

,Growing_Stress,Mood_Swings,proportion
0,Maybe,Low,0.360594
1,Maybe,Medium,0.325199
2,Maybe,High,0.314207
3,No,Low,0.345329
4,No,High,0.333261
5,No,Medium,0.321409
6,Yes,Medium,0.388809
7,Yes,Low,0.318696
8,Yes,High,0.292495


Among respondents who answered "Yes" to growing stress, mood swings appear at every intensity level: about 38.9% report medium, 31.9% low, and 29.2% high. Moderate fluctuations are the most common and extreme ones the least.

## Emotional Patterns Under Stress

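Respondents reporting growing stress show mood swings at low, medium, and high levels, with the largest share at medium intensity. In the "No" and "Maybe" groups, low is the most common level instead.
This suggests that stress is associated with moderate emotional variability rather than predominantly extreme mood changes. The gaps between levels are only a few percentage points, so the pattern is descriptive rather than conclusive.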
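
The comparison across stress groups is easier to read in wide form, with one row per `Growing_Stress` answer and one column per `Mood_Swings` level. The sketch below is one possible way to do that, not part of the original notebook: it rebuilds the long-format table from the proportions printed above so that it runs on its own, and the names `mood_order`, `wide`, and `dominant_level` are introduced here for illustration.

```python
import pandas as pd

# Proportions copied from the stress_mood_swings output above (long format).
stress_mood_swings = pd.DataFrame(
    {
        "Growing_Stress": ["Maybe"] * 3 + ["No"] * 3 + ["Yes"] * 3,
        "Mood_Swings": [
            "Low", "Medium", "High",
            "Low", "High", "Medium",
            "Medium", "Low", "High",
        ],
        "proportion": [
            0.360594, 0.325199, 0.314207,
            0.345329, 0.333261, 0.321409,
            0.388809, 0.318696, 0.292495,
        ],
    }
)

# Illustrative column order, from least to most intense.
mood_order = ["Low", "Medium", "High"]

# Pivot to one row per stress answer and one column per mood-swing level.
wide = (
    stress_mood_swings
    .pivot(index="Growing_Stress", columns="Mood_Swings", values="proportion")
    .reindex(columns=mood_order)
)

# Most common mood-swing level within each stress group.
dominant_level = wide.idxmax(axis=1)

print(wide.round(3))
print(dominant_level)
```

In the notebook itself, the hard-coded frame could be dropped and the `pivot` call applied directly to the `stress_mood_swings` object computed earlier.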
