# **Project Name**    -   Mental Health Survey EDA Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual

# **Project Summary -**

This exploratory data analysis project investigates mental health trends and treatment-seeking behaviors among technology industry professionals. The study examines a comprehensive dataset containing survey responses from tech workers across various company sizes, demographics, and work environments to understand the complex relationship between workplace factors and mental health outcomes.

The analysis focuses on identifying key patterns in mental health treatment adoption, exploring demographic influences, and understanding workplace-related barriers that may prevent employees from seeking appropriate mental health support. Through systematic data visualization and statistical analysis, this project aims to uncover actionable insights that can inform better mental health policies and support systems within technology companies.

Key areas of investigation include the impact of company size on treatment accessibility, the role of family history in treatment decisions, workplace interference levels caused by mental health issues, and the effectiveness of current mental health benefits and anonymity protections. The project employs multiple visualization techniques including distribution plots, correlation heatmaps, cross-tabulations, and comparative analyses to reveal underlying trends and relationships in the data.

The findings from this analysis provide valuable insights for HR professionals, company leadership, and mental health advocates working to create more supportive work environments in the technology sector. By understanding the factors that influence mental health treatment seeking behavior, organizations can develop more targeted interventions and policies to support their employees' wellbeing.

# **GitHub Link -**

https://github.com/Aryan1212a/EDA_Mental_Health_In_Tech_S

# **Problem Statement**


**Primary Research Question**: What factors influence mental health treatment-seeking behavior among technology industry professionals, and how do workplace policies and demographics impact employees' willingness to seek mental health support?

**Specific Business Objectives**:

1. **Identify Treatment Barriers**: Determine the primary obstacles preventing tech employees from seeking mental health treatment, including workplace stigma, fear of consequences, and lack of adequate benefits.

2. **Demographic Analysis**: Analyze how age, gender, company size, and remote work arrangements correlate with mental health treatment adoption rates.

3. **Workplace Impact Assessment**: Quantify how mental health issues interfere with work performance and productivity across different organizational contexts.

4. **Policy Effectiveness Evaluation**: Assess the relationship between existing mental health benefits, anonymity protections, and actual treatment utilization rates.

5. **Risk Factor Identification**: Identify high-risk employee segments who may be underserved by current mental health support systems.

The technology industry is known for its high-stress environment, long working hours, and competitive culture, which can significantly impact employee mental health. Despite growing awareness, many tech professionals still hesitate to seek mental health treatment due to various workplace and personal factors. Understanding these patterns is crucial for developing effective mental health strategies that not only support individual employees but also contribute to organizational success through improved productivity, reduced turnover, and enhanced workplace culture.

#### **Define Your Business Objective?**

**Primary Goal**: To provide data-driven insights that enable technology companies to develop more effective mental health support programs, reduce treatment barriers, and create psychologically safe work environments that encourage employees to seek help when needed.

**Strategic Outcomes**:
- Improve employee wellbeing and job satisfaction
- Reduce mental health-related productivity losses
- Decrease employee turnover due to untreated mental health issues
- Enhance company reputation as an employer of choice
- Develop targeted interventions for high-risk employee groups

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. For each chart, make sure the following format is used and every question is answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
#  Import Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
%matplotlib inline

### Dataset Loading

In [None]:
# Install gdown
!pip install gdown

import gdown
import pandas as pd

# Download using gdown
file_id = '12d2oMT1OZTLLvbpB7yplh-44exnhrIcB'
url = f'https://drive.google.com/uc?id={file_id}'

# Download file
gdown.download(url, 'dataset.csv', quiet=False)

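# Read the CSV; on failure, show the start of the file to help diagnose, then re-raise
try:
    df = pd.read_csv('dataset.csv')
    print(f"✅ Success! Shape: {df.shape}")
except Exception as e:
    print(f"Error: {e}")
    # Show file content to diagnose (a non-CSV payload usually means the download failed)
    with open('dataset.csv', 'r', errors='replace') as f:
        print("First 200 characters:")
        print(repr(f.read(200)))
    # Re-raise so later cells do not run against an undefined df
    raise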

### Dataset First View

In [None]:
# Dataset First Look
display(df.head())

### Dataset Rows & Columns count

In [None]:
#  Dataset Rows & Columns count
print(f"The dataset has {df.shape[0]} rows and {df.shape[1]} columns.")

### Dataset Information

In [None]:
#  Dataset Information

df.info()


#### Duplicate Values

In [None]:
# Duplicate Values
# Count of duplicate rows
duplicate_rows = df.duplicated().sum()
print(f"Number of duplicate rows: {duplicate_rows}")

#### Missing Values/Null Values

In [None]:
#  Missing Values/Null Values Count
print("Missing values count per column:")
print(df.isnull().sum())

In [None]:
#  Visualizing the missing values

!pip install missingno

import missingno as msno
import matplotlib.pyplot as plt

# Visualize missing values as a matrix
msno.matrix(df)
plt.title('Missing Values Matrix')
plt.show()

# Visualize missing values as a bar plot
msno.bar(df)
plt.title('Missing Values Bar Plot')
plt.show()

# Visualize missing values as a heatmap to see correlation of missingness
msno.heatmap(df)
plt.title('Missing Values Heatmap')
plt.show()

### What did you know about your dataset?

- **Dimensions:** The dataset contains 1259 rows and 27 columns.
- **Data Types:** Most of the columns are of object type, representing categorical data. There is one numerical column, 'Age'.
- **Missing Values:** There are missing values in several columns:
  - `state` has a significant number of missing values (515).
  - `work_interfere` has missing values (264).
  - `self_employed` has a small number of missing values (18).
  - `comments` has a very large number of missing values (1095).
- **Duplicate Values:** We checked for duplicate rows and found none.
- **Unique Values**: We've seen the unique values for all columns. Many categorical columns have a limited number of unique values, while others like Timestamp, Age, Gender, Country, state, and comments have a larger variety of unique entries.
- **Initial Data View:** We looked at the first few rows, which gave us a glimpse of the data structure and the types of responses in columns like `Gender`, `Country`, `self_employed`, `treatment`, `work_interfere`, etc.
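
The raw counts above are easier to compare as percentages of the total row count. The following is a minimal sketch, not part of the original notebook: `missing_summary` is a hypothetical helper, and `toy` is a small illustrative frame standing in for the survey data.

```python
import numpy as np
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    # Keep only the columns that actually contain gaps
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy frame standing in for the survey data (values are illustrative only)
toy = pd.DataFrame({
    "Age": [25, 31, 40, 29],
    "state": ["CA", np.nan, np.nan, "NY"],
    "comments": [np.nan, np.nan, np.nan, "some text"],
})
print(missing_summary(toy))
```

Calling `missing_summary(df)` on the survey frame before wrangling should make it easy to see which columns cross a chosen drop threshold.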

## ***2. Understanding Your Variables***

In [None]:
#  Dataset Columns
df.columns

In [None]:
#  Dataset Describe
df.describe(include='all')

### Variables Description

1.  **`Timestamp`**: Object (string). The date and time when the survey response was submitted. Each entry is a specific timestamp.
2.  **`Age`**: Integer (int64). The age of the survey respondent. This is the only numerical column. We've seen there were some invalid entries that needed cleaning.
3.  **`Gender`**: Object (string). The self-reported gender of the respondent. This column had many inconsistent entries which we cleaned into categories like 'male', 'female', and 'other/unknown'.
4.  **`Country`**: Object (string). The country where the respondent resides. There are many unique countries represented.
5.  **`state`**: Object (string). The state within the US where the respondent resides. Only applicable for respondents in the United States, so this column had a significant number of missing values, which we imputed with the column mode (most frequent value) during data wrangling.
6.  **`self_employed`**: Object (string). Indicates if the respondent is self-employed ('Yes' or 'No'; missing values were imputed with the mode).
7.  **`family_history`**: Object (string). Indicates if the respondent has a family history of mental health issues ('Yes' or 'No').
8.  **`treatment`**: Object (string). Indicates if the respondent has sought treatment for a mental health condition ('Yes' or 'No'). This is a key target variable for this analysis.
9.  **`work_interfere`**: Object (string). Describes how often a mental health condition interferes with their work ('Often', 'Rarely', 'Never', or 'Sometimes'; missing values were imputed with the mode).
10. **`no_employees`**: Object (string). Represents the size of the company the respondent works for, categorized in ranges (e.g., '6-25', 'More than 1000').
11. **`remote_work`**: Object (string). Indicates if the respondent works remotely ('Yes' or 'No').
12. **`tech_company`**: Object (string). Indicates if the respondent works for a tech company ('Yes' or 'No'). Given the project context, most respondents are expected to be in tech.
13. **`benefits`**: Object (string). Indicates if the employer provides mental health benefits ('Yes', 'No', or "Don't know").
14. **`care_options`**: Object (string). Describes if the employer offers mental healthcare options or resources ('Yes', 'No', or 'Not sure').
15. **`wellness_program`**: Object (string). Indicates if the employer has a wellness program ('Yes', 'No', or "Don't know").
16. **`seek_help`**: Object (string). Indicates if the employer provides resources to learn more about mental health issues and how to seek help ('Yes', 'No', or "Don't know").
17. **`anonymity`**: Object (string). Indicates if the employer provides anonymity protection for mental health treatment ('Yes', 'No', or "Don't know").
18. **`leave`**: Object (string). Describes how easy or difficult it is to take medical leave for a mental health condition ('Somewhat easy', "Don't know", 'Somewhat difficult', 'Very difficult', 'Very easy').
19. **`mental_health_consequence`**: Object (string). Indicates if the respondent believes discussing a mental health issue at work would have negative consequences ('Yes', 'No', or 'Maybe').
20. **`phys_health_consequence`**: Object (string). Indicates if the respondent believes discussing a physical health issue at work would have negative consequences ('Yes', 'No', or 'Maybe').
21. **`coworkers`**: Object (string). Describes comfort level discussing mental health issues with coworkers ('Some of them', 'No', or 'Yes').
22. **`supervisor`**: Object (string). Describes comfort level discussing mental health issues with a direct supervisor ('Yes', 'No', or 'Some of them').
23. **`mental_health_interview`**: Object (string). Indicates willingness to discuss a mental health issue with a potential employer during an interview ('No', 'Yes', or 'Maybe').
24. **`phys_health_interview`**: Object (string). Indicates willingness to discuss a physical health issue with a potential employer during an interview ('Maybe', 'No', or 'Yes').
25. **`mental_vs_physical`**: Object (string). Indicates if the respondent feels the employer takes mental health as seriously as physical health ('Yes', "Don't know", or 'No').
26. **`obs_consequence`**: Object (string). Indicates if the respondent has observed negative consequences for coworkers who discussed a mental health issue at work ('No' or 'Yes').
27. **`comments`**: Object (string). Open-ended comments from respondents. This column had a very high number of missing values and was dropped during data wrangling.

This provides an overview of each variable and what it represents in the context of the survey.

### Check Unique Values for each variable.

In [None]:
#  Check Unique Values for each variable.
for col in df.columns:
    print(f"Column: {col}")
    unique_values = df[col].unique()
    print(f"Number of unique values: {len(unique_values)}")
    if len(unique_values) < 20: # Print unique values if there are not too many
        print(f"Unique values: {unique_values}")
    else:
        print("Unique values: Too many to display.")
    print("-" * 30)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

import numpy as np

# Handle Missing Values
# Option 1: Drop columns with a high fraction of missing values
threshold = 0.5  # fraction of missing values above which a column is dropped
missing_percentage = df.isnull().sum() / len(df)
cols_to_drop = missing_percentage[missing_percentage > threshold].index.tolist()
df.drop(columns=cols_to_drop, inplace=True)
print(f"Dropped columns with > {threshold*100}% missing values: {cols_to_drop}")

# Option 2: Impute missing values for remaining columns
# For numerical columns, use the median (more robust to outliers than the mean)
numerical_cols = df.select_dtypes(include=np.number).columns
for col in numerical_cols:
    if df[col].isnull().sum() > 0:
        median_val = df[col].median()
        # Assign the result back: fillna(inplace=True) on a column selection
        # triggers warnings in newer pandas versions and may not modify df
        df[col] = df[col].fillna(median_val)
        print(f"Imputed missing values in '{col}' with median.")

# For categorical columns, use mode
categorical_cols = df.select_dtypes(include='object').columns
for col in categorical_cols:
    if df[col].isnull().sum() > 0:
        mode_val = df[col].mode()[0]
        df[col] = df[col].fillna(mode_val)
        print(f"Imputed missing values in '{col}' with mode.")

# Verify missing values are handled
print("\nMissing values count after handling:")
print(df.isnull().sum())

# Handle Duplicate Values (already counted, now remove)
if duplicate_rows > 0:
    df.drop_duplicates(inplace=True)
    print(f"\nRemoved {duplicate_rows} duplicate rows.")

# Data Type Conversion (if necessary)
# Examine unique values of object type columns to see if they can be converted
# For example, if a column should be boolean but is 'Yes'/'No' or 1/0 strings
for col in df.select_dtypes(include='object').columns:
    print(f"\nUnique values in potential object type for conversion: {col}")
    print(df[col].unique())


# Final check of the dataset info and first few rows
print("\nDataset Info after Wrangling:")
df.info()
print("\nDataset Head after Wrangling:")
display(df.head())



### What all manipulations have you done and insights you found?

**Data Manipulations:**

1. **Handling Missing Values:**
   - We dropped columns with more than 50% missing values. Only `comments` exceeded that threshold (1095 of 1259 values missing, about 87%), which is why it no longer appears in `df.info()` after wrangling.
   - For the remaining columns with missing values (`state`, `self_employed`, `work_interfere`), all of which are categorical, we imputed each with its mode (most frequent value).
   - The numerical `Age` column had no missing values, but it contains invalid entries (negative or extremely large values), which we filter out before plotting the histogram.
2. **Handling Duplicate Values**: We checked for duplicate rows and confirmed there were no duplicate rows to remove.
3. **Data Cleaning (Gender):** We specifically cleaned the 'Gender' column to normalize inconsistent entries and remove extra spaces, grouping similar responses into 'male', 'female', and 'other/unknown' categories for accurate visualization.
4. **Data Cleaning (Age):** For the Age distribution chart, we filtered out unrealistic age values to ensure a meaningful visualization.


**Insights Found So Far:**
- The dataset primarily consists of categorical data, with 'Age' being the only numerical variable.
- Several columns had missing values, particularly 'state', 'work_interfere', and 'comments' (the last of which was dropped).
- There are no duplicate rows in the dataset.
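
Mode imputation folds every missing answer into the most common category, which can inflate that category in later charts. One alternative, shown here only as a sketch under stated assumptions, is to keep missingness visible as its own label. The helper name `fill_unknown` and the `'Unknown'` label are illustrative choices, not something the wrangling cell above does.

```python
import numpy as np
import pandas as pd


def fill_unknown(frame: pd.DataFrame, columns, label: str = "Unknown") -> pd.DataFrame:
    """Return a copy of `frame` with missing values in `columns` replaced by `label`."""
    out = frame.copy()
    for col in columns:
        # Assign back instead of using inplace=True on a column selection
        out[col] = out[col].fillna(label)
    return out


# Toy data standing in for the survey (values are illustrative only)
toy = pd.DataFrame({
    "work_interfere": ["Often", np.nan, "Never", np.nan],
    "treatment": ["Yes", "No", "No", "Yes"],
})
filled = fill_unknown(toy, ["work_interfere"])
print(filled["work_interfere"].value_counts())
```

The trade-off is an extra category in every chart that uses the column, in exchange for not attributing an answer to respondents who gave none.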

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart 1 - Distribution of Mental Health Treatment Seeking
plt.figure(figsize=(10, 6))
treatment_counts = df['treatment'].value_counts()
colors = ['#FF6B6B', '#4ECDC4']
plt.pie(treatment_counts.values, labels=treatment_counts.index, autopct='%1.1f%%',
        colors=colors, startangle=90)
plt.title('Distribution of Mental Health Treatment Seeking Behavior', fontsize=16, fontweight='bold')
plt.axis('equal')
plt.show()

##### 1. Why did you pick the specific chart?

I chose a pie chart because it effectively shows the proportion of people seeking vs. not seeking mental health treatment. Pie charts are ideal for displaying parts of a whole, making it easy to visualize the overall treatment-seeking behavior distribution in the dataset at a glance.

##### 2. What is/are the insight(s) found from the chart?

* The chart reveals the percentage split between employees who seek mental health treatment versus those who don't
* It provides a baseline understanding of treatment-seeking behavior prevalence in the workplace
* Shows whether the majority of employees are proactive about their mental health or tend to avoid treatment
* Highlights the potential gap between those who need help and those who actually seek it

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact:** Yes, this insight is crucial for business planning. If a large percentage of employees aren't seeking treatment, companies can:

* Design targeted mental health awareness campaigns
* Allocate appropriate resources for mental health programs
* Understand the scale of potential productivity issues related to untreated mental health conditions
* Develop strategies to encourage treatment-seeking behavior

**Potential Negative Growth:** If the majority avoid treatment, this could indicate underlying workplace culture issues, stigma, or inadequate mental health support systems that may lead to decreased productivity, increased absenteeism, and higher turnover rates.
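
To state the split in numbers alongside the pie chart, the same shares can be computed directly. This is a minimal sketch on toy data; on the survey frame the equivalent call would use `df['treatment']`.

```python
import pandas as pd

# Toy responses standing in for df['treatment'] (values are illustrative only)
toy = pd.DataFrame({"treatment": ["Yes", "No", "Yes", "Yes", "No"]})

# normalize=True returns fractions; convert to percentages for reporting
share = (toy["treatment"].value_counts(normalize=True) * 100).round(1)
print(share)
```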

#### Chart - 2

In [None]:
# Chart 2 - Age Distribution of Survey Respondents
# Define a reasonable age range
min_age = 0
max_age = 100 # Assuming a reasonable upper limit for survey respondents

# Filter out rows with ages outside the reasonable range
df_cleaned_age = df[(df['Age'] >= min_age) & (df['Age'] <= max_age)].copy()

plt.figure(figsize=(12, 6))
plt.hist(df_cleaned_age['Age'], bins=30, color='skyblue', alpha=0.7, edgecolor='black')
plt.axvline(df_cleaned_age['Age'].mean(), color='red', linestyle='--', linewidth=2, label=f'Mean Age: {df_cleaned_age["Age"].mean():.1f}')
plt.xlabel('Age', fontsize=12)
plt.ylabel('Frequency', fontsize=12)
plt.title('Age Distribution of Survey Respondents', fontsize=16, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

##### 1. Why did you pick the specific chart?

A histogram is perfect for showing the distribution of a continuous variable like age. It reveals the shape of the distribution, identifies the most common age groups, and helps detect any skewness or outliers in the data. The mean line provides additional context for central tendency.

##### 2. What is/are the insight(s) found from the chart?

- Shows the age demographics of survey participants
- Identifies which age groups are most represented in the workplace mental health discussion
- Reveals if the data is normally distributed or skewed toward younger/older employees
- The mean age line helps understand the typical age of respondents
- May reveal generational patterns in workplace mental health awareness

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Understanding age distribution enables:
- Age-specific mental health program design
- Targeted communication strategies for different generations
- Resource allocation based on demographic needs
- Identification of age groups that might need special attention

**Potential Negative Growth**: If certain age groups are underrepresented, it might indicate:
- Sampling bias that could lead to ineffective programs
- Generational gaps in mental health awareness
- Potential discrimination or accessibility issues for certain age groups
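
The age filter used for this chart keeps everything between 0 and 100, which still admits ages that are implausible for a workplace survey. A stricter variant might look like the sketch below. The bounds of 18 and 75 are an assumption made for illustration, and `plausible_ages` is a hypothetical helper; it also reports how many rows were excluded so the filtering stays transparent.

```python
import pandas as pd

MIN_AGE, MAX_AGE = 18, 75  # assumed working-age bounds, adjust as needed


def plausible_ages(frame: pd.DataFrame, column: str = "Age",
                   low: int = MIN_AGE, high: int = MAX_AGE):
    """Return (filtered copy, number of rows dropped) for ages within [low, high]."""
    mask = frame[column].between(low, high)  # inclusive on both ends
    return frame.loc[mask].copy(), int((~mask).sum())


# Toy ages standing in for the survey column (values are illustrative only)
toy = pd.DataFrame({"Age": [-29, 5, 23, 34, 41, 329]})
kept, dropped = plausible_ages(toy)
print(f"Kept {len(kept)} rows, dropped {dropped}")
```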

#### Chart - 3

In [None]:
# Chart 3 - Gender Distribution

# Clean 'Gender' column: lowercase and strip whitespace, then normalize inconsistent entries
df['Gender'] = df['Gender'].str.lower().str.strip()
df['Gender'] = df['Gender'].replace(['m', 'male', 'male-ish', 'maile', 'mal', 'male (cis)', 'cis male', 'make', 'guy (-ish) ^_^', 'male leaning androgynous', 'man', 'msle', 'mail', 'cis man', 'malr'], 'male')
df['Gender'] = df['Gender'].replace(['f', 'female', 'femake', 'cis female', 'woman', 'female ', 'cis-female/femme', 'female (trans)', 'female (cis)', 'femail'], 'female')
df['Gender'] = df['Gender'].replace(['something kinda male?', 'queer/she/they', 'non-binary', 'all', 'enby', 'fluid', 'genderqueer', 'androgyne', 'agender', 'trans woman', 'neuter', 'queer', 'a little about you', 'p', 'ostensibly male, unsure what that really means'], 'other/unknown')


plt.figure(figsize=(10, 6))
gender_counts = df['Gender'].value_counts().head(10)  # Top 10 to avoid clutter
sns.barplot(x=gender_counts.values, y=gender_counts.index, palette='viridis', hue=gender_counts.index, legend=False)
plt.xlabel('Count', fontsize=12)
plt.ylabel('Gender', fontsize=12)
plt.title('Gender Distribution of Survey Respondents (Cleaned)', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A horizontal bar chart is ideal for categorical data with potentially long labels (gender categories). It provides clear comparison between different gender identities and handles multiple categories better than a pie chart, especially when there are many gender options.

##### 2. What is/are the insight(s) found from the chart?

- Shows the gender diversity of survey respondents
- Identifies which gender groups are most/least represented
- Reveals the inclusivity of the survey in capturing diverse gender identities
- Helps understand if mental health discussions are inclusive across gender spectrum
- May indicate which gender groups are more likely to participate in mental health surveys

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Gender distribution insights enable:
- Development of inclusive mental health programs
- Targeted support for underrepresented gender groups
- Ensuring equal access to mental health resources
- Creating safe spaces for all gender identities

**Potential Negative Growth**: Significant gender imbalances might indicate:
- Workplace discrimination or bias
- Barriers to participation for certain gender groups
- Inadequate support systems for diverse gender identities
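
The gender cleanup relies on long literal lists, and any spelling not in those lists stays as its own category. A lookup-based variant is sketched below under stated assumptions: `normalize_gender` and the two sets are hypothetical and deliberately short, and unlike the cell above, this version sends every unrecognized entry to 'other/unknown'.

```python
import pandas as pd

# Abbreviated lookup sets for illustration; extend them with the spellings found in the data
MALE = {"m", "male", "man", "cis male", "male (cis)"}
FEMALE = {"f", "female", "woman", "cis female", "female (cis)"}


def normalize_gender(value) -> str:
    """Map a free-text gender entry to 'male', 'female', or 'other/unknown'."""
    if pd.isna(value):
        return "other/unknown"
    cleaned = str(value).strip().lower()
    if cleaned in MALE:
        return "male"
    if cleaned in FEMALE:
        return "female"
    return "other/unknown"


# Toy entries standing in for df['Gender'] (values are illustrative only)
toy = pd.Series(["M", " Female ", "Cis Male", "non-binary", None])
normalized = toy.map(normalize_gender)
print(normalized.value_counts())
```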

#### Chart - 4

In [None]:
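# Chart 4 - Mental Health Treatment by Company Size
# Row-normalize so each company-size bar sums to 100%
company_treatment = pd.crosstab(df['no_employees'], df['treatment'], normalize='index') * 100
# Create the axes explicitly; DataFrame.plot would otherwise open a second figure
# and leave the one from plt.figure() empty
fig, ax = plt.subplots(figsize=(12, 6))
company_treatment.plot(kind='bar', stacked=True, color=['#FF6B6B', '#4ECDC4'], ax=ax)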
plt.xlabel('Company Size (Number of Employees)', fontsize=12)
plt.ylabel('Percentage', fontsize=12)
plt.title('Mental Health Treatment Distribution by Company Size', fontsize=16, fontweight='bold')
plt.legend(title='Seeking Treatment', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A stacked bar chart effectively shows the relationship between two categorical variables (company size and treatment seeking). It allows for easy comparison of treatment rates across different company sizes. Because each bar is normalized to 100%, the chart shows proportions, not absolute respondent counts.

##### 2. What is/are the insight(s) found from the chart?

- Reveals how company size influences mental health treatment seeking behavior
- Shows whether larger or smaller companies have higher treatment rates
- Identifies potential correlation between organizational resources and employee treatment access
- Highlights which company sizes might need targeted interventions
- May reveal economies of scale in mental health support provision

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Understanding company size effects enables:
- Size-appropriate mental health program development
- Benchmarking against similar-sized organizations
- Targeted policy recommendations for different company sizes
- Resource allocation strategies based on organizational capacity

**Potential Negative Growth**: If smaller companies show lower treatment rates, it might indicate:
- Resource limitations preventing adequate mental health support
- Lack of awareness about available options
- Potential competitive disadvantage in talent retention
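
Because `no_employees` is stored as text, the crosstab sorts the size bands alphabetically (for example '100-500' before '26-100'), which makes trends across company size harder to read. The sketch below reorders the rows by size. `SIZE_ORDER` is an assumed list of the bands in the survey and should be checked against `df['no_employees'].unique()`.

```python
import pandas as pd

# Assumed size bands in ascending order; verify against the actual column values
SIZE_ORDER = ["1-5", "6-25", "26-100", "100-500", "500-1000", "More than 1000"]

# Toy data standing in for the survey (values are illustrative only)
toy = pd.DataFrame({
    "no_employees": ["100-500", "1-5", "More than 1000", "6-25", "1-5", "26-100"],
    "treatment": ["Yes", "No", "Yes", "No", "Yes", "No"],
})

rates = pd.crosstab(toy["no_employees"], toy["treatment"], normalize="index") * 100
# Reorder rows by company size, skipping bands that do not occur in the data
present = [size for size in SIZE_ORDER if size in rates.index]
rates = rates.reindex(present)
print(rates)
```

Passing the reordered table to `.plot(kind='bar', stacked=True)` should then draw the bars from smallest to largest company.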

#### Chart - 5

In [None]:
# Chart 5 - Family History vs Treatment Seeking
plt.figure(figsize=(10, 6))
family_treatment = pd.crosstab(df['family_history'], df['treatment'])
sns.heatmap(family_treatment, annot=True, fmt='d', cmap='Blues', cbar_kws={'label': 'Count'})
plt.xlabel('Seeking Treatment', fontsize=12)
plt.ylabel('Family History of Mental Health Issues', fontsize=12)
plt.title('Family History vs Mental Health Treatment Seeking', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A heatmap is excellent for visualizing the relationship between two categorical variables. It uses color intensity to encode the number of respondents in each combination of family history and treatment status, so the largest cells stand out immediately.

##### 2. What is/are the insight(s) found from the chart?

- Shows the correlation between family mental health history and personal treatment seeking
- Reveals whether people with family history are more likely to seek treatment
- Identifies patterns in hereditary mental health awareness
- Highlights the role of family background in mental health decisions
- May indicate the influence of family support systems on treatment access

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Family history insights enable:
- Targeted outreach to employees with family mental health history
- Development of family-inclusive mental health programs
- Early intervention strategies for at-risk employees
- Personalized support based on family background

**Potential Negative Growth**: If family history doesn't correlate with treatment seeking, it might indicate:
- Insufficient education about hereditary mental health risks
- Stigma preventing treatment despite family history
- Barriers to accessing family mental health information
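
The heatmap shows counts, but it does not put a number on how strongly the two variables are associated. One common summary for a contingency table is Cramér's V, derived from the Pearson chi-square statistic (0 means no association, 1 means perfect association). The sketch below computes it with NumPy only, without a continuity correction; `cramers_v` is a hypothetical helper and the toy data is illustrative.

```python
import numpy as np
import pandas as pd


def cramers_v(table: pd.DataFrame) -> float:
    """Cramér's V for a contingency table, from the uncorrected Pearson chi-square."""
    observed = table.to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: row total * column total / grand total
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    k = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))


# Toy data standing in for the survey (values are illustrative only)
toy = pd.DataFrame({
    "family_history": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes"],
    "treatment":      ["Yes", "Yes", "Yes", "No", "No", "No", "Yes", "No"],
})
table = pd.crosstab(toy["family_history"], toy["treatment"])
v = cramers_v(table)
print(f"Cramér's V: {v:.2f}")
```

On the survey data, `cramers_v(family_treatment)` would give a single figure to report next to the heatmap. It describes association only and says nothing about causation.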

#### Chart - 6

In [None]:
# Chart 6 - Work Interference Levels
plt.figure(figsize=(12, 6))
work_interfere_counts = df['work_interfere'].value_counts()
colors = plt.cm.Set3(np.linspace(0, 1, len(work_interfere_counts)))
plt.bar(work_interfere_counts.index, work_interfere_counts.values, color=colors)
plt.xlabel('Work Interference Level', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.title('How Mental Health Issues Interfere with Work', fontsize=16, fontweight='bold')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart clearly displays the frequency of different work interference levels, making it easy to compare how often mental health issues affect work performance. The categorical nature of interference levels makes bar charts the most appropriate visualization.

##### 2. What is/are the insight(s) found from the chart?

- Shows the extent to which mental health issues interfere with work performance
- Identifies the most common level of work interference
- Reveals the productivity impact of mental health issues
- Helps quantify the business cost of untreated mental health problems
- Indicates the urgency of mental health interventions

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Work interference data enables:
- Quantification of productivity losses due to mental health issues
- ROI calculations for mental health program investments
- Development of accommodation strategies for affected employees
- Evidence-based arguments for mental health resource allocation

**Potential Negative Growth**: High work interference levels indicate:
- Significant productivity losses
- Potential quality issues in work output
- Increased absenteeism and turnover risk
- Competitive disadvantage due to reduced team performance
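
A single headline figure can support the productivity argument above: the share of respondents who report interference at least sometimes. The sketch below assumes that 'Sometimes' and 'Often' together count as frequent interference, which is a definition chosen for illustration. Note that if missing answers were imputed with the mode, the imputed rows are included in this share.

```python
import pandas as pd

FREQUENT = {"Sometimes", "Often"}  # assumed definition of frequent interference

# Toy responses standing in for df['work_interfere'] (values are illustrative only)
toy = pd.Series(["Never", "Sometimes", "Often", "Rarely", "Sometimes"],
                name="work_interfere")

# isin() yields a boolean mask; its mean is the fraction of True values
frequent_share = round(toy.isin(FREQUENT).mean() * 100, 1)
print(f"Frequent interference: {frequent_share}% of respondents")
```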

#### Chart - 7

In [None]:
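# Chart 7 - Remote Work vs Mental Health Treatment
# Row-normalize so the bars within each remote-work group sum to 100%
remote_treatment = pd.crosstab(df['remote_work'], df['treatment'], normalize='index') * 100
# Create the axes explicitly; DataFrame.plot would otherwise open a second figure
# and leave the one from plt.figure() empty
fig, ax = plt.subplots(figsize=(10, 6))
remote_treatment.plot(kind='bar', color=['#FF9999', '#66B2FF'], ax=ax)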
plt.xlabel('Remote Work', fontsize=12)
plt.ylabel('Percentage', fontsize=12)
plt.title('Mental Health Treatment by Remote Work Status', fontsize=16, fontweight='bold')
plt.legend(title='Seeking Treatment')
plt.xticks(rotation=0)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A grouped bar chart effectively compares treatment seeking rates between remote and non-remote workers. Because the values are normalized within each group, it shows percentages, not absolute counts, which keeps the comparison fair when the two groups differ in size.

##### 2. What is/are the insight(s) found from the chart?

- Reveals whether remote work affects mental health treatment seeking behavior
- Shows if remote workers have better or worse access to mental health resources
- Identifies potential isolation effects of remote work on treatment decisions
- Highlights the need for remote-work-specific mental health strategies
- May indicate differences in work-life balance between remote and office workers

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Remote work insights enable:
- Development of remote-work-specific mental health programs
- Targeted support for isolated remote workers
- Hybrid work policies that optimize mental health outcomes
- Virtual mental health resource deployment strategies

**Potential Negative Growth**: If remote workers show lower treatment rates, it might indicate:
- Increased isolation and reduced support access
- Blurred work-life boundaries affecting mental health
- Inadequate virtual mental health resource provision

#### Chart - 8

In [None]:
# Chart 8 - Mental Health Benefits Availability
plt.figure(figsize=(12, 6))
benefits_counts = df['benefits'].value_counts()
plt.pie(benefits_counts.values, labels=benefits_counts.index, autopct='%1.1f%%',
        colors=['#FFB6C1', '#87CEEB', '#98FB98'], startangle=90)
plt.title('Availability of Mental Health Benefits at Work', fontsize=16, fontweight='bold')
plt.axis('equal')
plt.show()

##### 1. Why did you pick the specific chart?

A pie chart shows how respondents split across the answers to the benefits question, so the share who have mental health benefits can be compared at a glance with the share who lack them or are unsure. It provides a clear visual representation of benefit coverage gaps in the workplace.

##### 2. What is/are the insight(s) found from the chart?

- Shows the percentage of employees with access to mental health benefits
- Reveals gaps in mental health benefit provision
- Identifies the scope of employer-sponsored mental health support
- Highlights potential disparities in benefit access
- Indicates the level of organizational commitment to employee mental health
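
The pie chart's percentages are computed only over respondents who answered the question, because `value_counts()` drops missing values by default. As a small sketch with hypothetical answers (the category labels are assumptions about how the `benefits` column is coded), the shares can be tabulated with missing answers kept as their own category:

```python
import pandas as pd

# Hypothetical answers; None stands for a respondent who skipped the question.
toy_benefits = pd.Series(
    ["Yes", "No", "Don't know", "Yes", None, "Yes", "No", "Don't know"]
)

# dropna=False keeps missing answers instead of silently excluding them.
shares = toy_benefits.value_counts(normalize=True, dropna=False).mul(100).round(1)
print(shares)
```

If the column has no missing values, this gives the same percentages as the pie chart.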

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Benefits availability data enables:
- Identification of coverage gaps that need addressing
- Benchmarking against industry standards for mental health benefits
- ROI analysis for expanding mental health benefit offerings
- Targeted advocacy for improved benefit packages

**Potential Negative Growth**: Limited benefit availability indicates:
- Potential legal compliance issues with mental health parity laws
- Competitive disadvantage in talent acquisition and retention
- Higher out-of-pocket costs for employees seeking treatment
- Increased likelihood of untreated mental health conditions

#### Chart - 9

In [None]:
# Chart 9 - Anonymity vs Treatment Seeking
plt.figure(figsize=(10, 6))
anonymity_treatment = pd.crosstab(df['anonymity'], df['treatment'])
sns.heatmap(anonymity_treatment, annot=True, fmt='d', cmap='Greens',
            cbar_kws={'label': 'Count'})
plt.xlabel('Seeking Treatment', fontsize=12)
plt.ylabel('Anonymity Protection', fontsize=12)
plt.title('Anonymity Protection vs Mental Health Treatment Seeking', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A heatmap effectively visualizes the relationship between anonymity protection and treatment seeking behavior. The color intensity makes it easy to identify patterns in how privacy concerns affect mental health treatment decisions.


##### 2. What is/are the insight(s) found from the chart?

- Shows how anonymity protection influences treatment seeking behavior
- Reveals the importance of privacy in mental health decisions
- Identifies whether lack of anonymity creates barriers to treatment
- Highlights the role of trust in employer-employee mental health relationships
- May indicate stigma levels in the workplace regarding mental health
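
A heatmap of raw counts shows where respondents cluster, but it does not say how strong the association between the two variables is. One common summary is Cramér's V, which rescales the chi-square statistic to a 0 to 1 range. The sketch below implements it with NumPy on a hypothetical table of counts; the function name `cramers_v` and the numbers are illustrative and not taken from the survey.

```python
import numpy as np
import pandas as pd

def cramers_v(table: pd.DataFrame) -> float:
    """Cramér's V for a contingency table of counts (0 = no association, 1 = perfect)."""
    observed = table.to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: row total * column total / n.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * min_dim)))

# Hypothetical counts: rows = anonymity answer, columns = treatment answer.
toy_table = pd.DataFrame(
    {"No": [30, 20, 10], "Yes": [10, 20, 30]},
    index=["No", "Don't know", "Yes"],
)
v = cramers_v(toy_table)
print(round(v, 3))
```

In the notebook, the same function could be applied to the `anonymity_treatment` crosstab. This version applies no bias correction and assumes every row and column has at least one respondent.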

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Anonymity insights enable:
- Development of confidential mental health programs
- Implementation of privacy-protected treatment options
- Building trust through transparent privacy policies
- Reduction of stigma-related barriers to treatment

**Potential Negative Growth**: If anonymity concerns prevent treatment seeking, it indicates:
- Trust issues between employees and employers
- Potential discrimination fears affecting treatment decisions
- Inadequate privacy protections in current mental health programs
- Cultural stigma that may affect team dynamics and productivity

#### Chart - 10

In [None]:
# Chart 10 - Mental Health Consequences Fear
plt.figure(figsize=(12, 6))
consequence_counts = df['mental_health_consequence'].value_counts()
sns.barplot(x=consequence_counts.index, y=consequence_counts.values, palette='rocket')
plt.xlabel('Fear of Mental Health Consequences', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.title('Fear of Consequences for Mental Health Issues at Work', fontsize=16, fontweight='bold')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart clearly displays the distribution of employee fears regarding mental health consequences at work. It effectively shows the frequency of different concern levels and makes it easy to identify the most common fears.

##### 2. What is/are the insight(s) found from the chart?

- Shows the level of fear employees have about mental health consequences at work
- Reveals whether workplace stigma is a significant barrier to treatment
- Identifies the prevalence of discrimination concerns
- Highlights the psychological safety level in the workplace regarding mental health
- Indicates the effectiveness of current anti-discrimination policies

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact**: Fear assessment data enables:
- Development of anti-stigma campaigns and training programs
- Implementation of stronger non-discrimination policies
- Creation of psychologically safe work environments
- Targeted interventions to address specific fear categories

**Potential Negative Growth**: High fear levels indicate:
- Toxic workplace culture that may affect overall employee wellbeing
- Potential legal risks related to mental health discrimination
- Reduced productivity due to stress and anxiety about consequences
- Talent retention issues as employees may seek more supportive work environments

#### Chart - 11

In [None]:
# Chart 11 - Top 10 Countries by Response Count
plt.figure(figsize=(12, 8))
country_counts = df['Country'].value_counts().head(10)
sns.barplot(y=country_counts.index, x=country_counts.values, palette='tab10')
plt.xlabel('Number of Responses', fontsize=12)
plt.ylabel('Country', fontsize=12)
plt.title('Top 10 Countries by Survey Response Count', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I chose a horizontal bar chart to display the top 10 countries by survey response count because it effectively shows the distribution of survey participants across different countries. The horizontal orientation makes it easy to read country names, and the descending order helps identify which countries have the highest representation in the dataset.

##### 2. What is/are the insight(s) found from the chart?

- The United States dominates the survey responses, indicating a significant bias toward American participants
- There's a clear geographical concentration, with English-speaking countries (US, UK, Canada) contributing the highest response counts
- The dataset shows limited global representation, with most responses coming from developed Western countries
- There's a steep drop-off in participation after the top few countries, suggesting the survey may have been primarily distributed in specific regions
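
The concentration described above can be put into numbers by computing each country's share of responses and the cumulative share. This is a sketch on hypothetical toy data, not the survey's actual counts:

```python
import pandas as pd

# Hypothetical responses standing in for df['Country'].
toy_country = pd.Series(
    ["United States"] * 6 + ["United Kingdom"] * 2 + ["Canada", "Germany"]
)

counts = toy_country.value_counts()
share = counts / counts.sum() * 100

concentration = pd.DataFrame({
    "responses": counts,
    "share_pct": share,
    "cumulative_pct": share.cumsum(),  # share covered by this country and all larger ones
})
print(concentration.round(1))
```

The `cumulative_pct` column makes it easy to state, for example, how many countries account for most of the responses.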

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact:** Understanding the geographical distribution helps in:
- Tailoring mental health programs to specific cultural contexts
- Recognizing that current insights may be most applicable to Western work environments
- Planning targeted outreach for underrepresented regions

**Potential Negative Growth:** The geographical bias could limit the generalizability of findings to global organizations, potentially leading to ineffective mental health strategies in non-Western markets.

#### Chart - 12

In [None]:
# Chart 12 - Wellness Program vs Treatment
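wellness_treatment = pd.crosstab(df['wellness_program'], df['treatment'], normalize='index') * 100
# Plot on an explicit Axes so the figsize applies and no empty extra figure is created
fig, ax = plt.subplots(figsize=(10, 6))
wellness_treatment.plot(kind='bar', color=['#FFA07A', '#20B2AA'], ax=ax)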
plt.xlabel('Wellness Program Available', fontsize=12)
plt.ylabel('Percentage', fontsize=12)
plt.title('Wellness Program Availability vs Mental Health Treatment', fontsize=16, fontweight='bold')
plt.legend(title='Seeking Treatment')
plt.xticks(rotation=0)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I selected a grouped bar chart with normalized percentages to compare treatment-seeking behavior between employees who have access to wellness programs versus those who don't. This visualization clearly shows the relationship between wellness program availability and mental health treatment patterns.

##### 2. What is/are the insight(s) found from the chart?

- Employees with access to wellness programs show higher rates of seeking mental health treatment
- This pattern is consistent with wellness programs reducing stigma and encouraging help-seeking behavior, although a cross-sectional survey cannot establish causation
- There's an association between organizational support (wellness programs) and employee willingness to address mental health issues
- The difference in treatment rates suggests wellness programs may serve as a gateway to mental health care
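
To judge whether a difference in treatment rates between two groups is larger than sampling noise, a confidence interval for the difference in proportions can be computed. The sketch below uses the standard normal-approximation (Wald) interval; the helper name `treatment_rate_diff` and the counts are hypothetical.

```python
import math

def treatment_rate_diff(yes_a, n_a, yes_b, n_b, z=1.96):
    """Difference in treatment rates (group A minus group B) with an approximate 95% interval."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, (diff - z * se, diff + z * se)

# Hypothetical counts: 60 of 100 with a wellness program vs 50 of 100 without.
diff, (low, high) = treatment_rate_diff(yes_a=60, n_a=100, yes_b=50, n_b=100)
print(f"difference = {diff:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

In this toy example the interval includes zero, so a 10-point gap between two groups of 100 would not be clear evidence of a real difference. The approximation is less reliable for very small groups or rates near 0% or 100%.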

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact:**
- Demonstrates ROI of wellness programs in encouraging mental health treatment
- Provides evidence for investing in comprehensive employee wellness initiatives
- Shows that organizational support directly impacts employee mental health outcomes

**Negative Growth Risk:** Organizations without wellness programs may see lower treatment rates, potentially leading to untreated mental health issues affecting productivity and retention.


#### Chart - 13

In [None]:
# Chart 13 - Mental vs Physical Health Treatment Comparison
plt.figure(figsize=(10, 6))
mental_vs_physical_counts = df['mental_vs_physical'].value_counts()
colors = ['#FF6347', '#4682B4', '#32CD32']
plt.bar(mental_vs_physical_counts.index, mental_vs_physical_counts.values, color=colors)
plt.xlabel('Mental vs Physical Health Treatment Comparison', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.title('How Employees View Mental vs Physical Health Treatment', fontsize=16, fontweight='bold')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I chose a bar chart to visualize how employees perceive the treatment of mental health compared to physical health in their workplace. This chart effectively shows the distribution of employee perceptions across different categories.

##### 2. What is/are the insight(s) found from the chart?

- Most employees perceive mental health treatment as less favorable compared to physical health treatment
- There's a significant disparity in how mental and physical health are handled in workplace settings
- A smaller portion of employees feel mental health is treated equally or better than physical health
- This perception gap indicates systemic issues in organizational mental health support

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Impact:**
- Identifies areas where organizations need to improve mental health parity
- Provides baseline data for measuring improvement in mental health treatment equality
- Highlights the need for policy changes and awareness programs

**Negative Growth Risk:** The perception of unequal treatment may discourage employees from seeking help, leading to decreased productivity, higher turnover, and potential legal compliance issues.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Chart 14 - Correlation Heatmap (numeric columns plus one-hot encoded categorical columns)
plt.figure(figsize=(12, 8))
# Create dummy variables for categorical columns to include in correlation
df_encoded = pd.get_dummies(df.select_dtypes(include=['object']), drop_first=True)
df_corr = pd.concat([df.select_dtypes(include=['number']), df_encoded], axis=1)
correlation_matrix = df_corr.corr()

# Select top correlations with treatment
treatment_corr = correlation_matrix['treatment_Yes'].abs().sort_values(ascending=False).head(15)
selected_features = treatment_corr.index.tolist()
selected_corr = correlation_matrix.loc[selected_features, selected_features]

sns.heatmap(selected_corr, annot=True, cmap='coolwarm', center=0,
            square=True, fmt='.2f', cbar_kws={'label': 'Correlation Coefficient'})
plt.title('Correlation Heatmap: Key Factors Related to Mental Health Treatment',
          fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I selected a correlation heatmap to identify the relationships between various factors and mental health treatment seeking behavior. This visualization efficiently shows multiple correlations simultaneously and helps identify the strongest predictors of treatment-seeking behavior.

##### 2. What is/are the insight(s) found from the chart?

- Strong correlations exist between certain workplace factors and treatment-seeking behavior
- Family history of mental health issues shows significant correlation with personal treatment seeking
- Workplace openness and supervisor support correlate positively with treatment rates
- Company size and industry type may influence mental health treatment accessibility
- Age and gender show varying correlation strengths with treatment patterns
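
One-hot encoding every text column can produce a very wide matrix if the data contains free-text or high-cardinality columns, since each distinct value becomes its own dummy column. A possible safeguard, sketched here on hypothetical toy data, is to encode only columns with a small number of distinct values. The threshold `MAX_LEVELS` and the toy column `comments` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data; `comments` mimics a free-text column with one unique value per row.
toy = pd.DataFrame({
    "Age": [25, 32, 41, 29, 37, 45],
    "treatment": ["Yes", "No", "Yes", "No", "Yes", "No"],
    "family_history": ["Yes", "No", "Yes", "No", "No", "Yes"],
    "comments": ["a", "b", "c", "d", "e", "f"],
})

MAX_LEVELS = 5  # hypothetical cut-off for "low cardinality"

cat_cols = toy.select_dtypes(exclude="number").columns
low_card = [c for c in cat_cols if toy[c].nunique() <= MAX_LEVELS]

# dtype=float makes the dummies explicitly numeric for the correlation step.
encoded = pd.get_dummies(toy[low_card], drop_first=True, dtype=float)
corr_input = pd.concat([toy.select_dtypes(include="number"), encoded], axis=1)

treatment_corr = (
    corr_input.corr()["treatment_Yes"]
    .drop("treatment_Yes")          # the self-correlation of 1.0 is not informative
    .sort_values(key=abs, ascending=False)
)
print(treatment_corr)
```

Pearson correlation between two 0/1 dummy columns equals the phi coefficient, so these values describe association strength between categories rather than a linear trend.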

#### Chart - 15

In [None]:
# Chart 15 - Tech Company vs Treatment Seeking
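tech_treatment = pd.crosstab(df['tech_company'], df['treatment'], normalize='index') * 100
# Plot on an explicit Axes so the figsize applies and no empty extra figure is created
fig, ax = plt.subplots(figsize=(10, 6))
tech_treatment.plot(kind='bar', color=['#FF69B4', '#00CED1'], ax=ax)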
plt.xlabel('Works at Tech Company', fontsize=12)
plt.ylabel('Percentage', fontsize=12)
plt.title('Mental Health Treatment by Tech Company Employment', fontsize=16, fontweight='bold')
plt.legend(title='Seeking Treatment')
plt.xticks(rotation=0)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I chose a grouped bar chart to compare treatment-seeking rates between tech and non-tech companies. This visualization clearly shows the difference in mental health treatment patterns across industry types.

##### 2. What is/are the insight(s) found from the chart?

- Tech companies show higher rates of employees seeking mental health treatment
- There's a notable difference in mental health awareness and treatment accessibility between tech and non-tech industries
- Tech industry's progressive culture may contribute to reduced stigma around mental health
- The difference suggests industry-specific factors influence mental health treatment patterns

#### Chart - 16

In [None]:
# Chart 16 - Self-Employed vs Treatment
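self_emp_treatment = pd.crosstab(df['self_employed'], df['treatment'], normalize='index') * 100
# Plot on an explicit Axes so the figsize applies and no empty extra figure is created
fig, ax = plt.subplots(figsize=(10, 6))
self_emp_treatment.plot(kind='bar', color=['#DDA0DD', '#90EE90'], ax=ax)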
plt.xlabel('Self-Employed Status', fontsize=12)
plt.ylabel('Percentage', fontsize=12)
plt.title('Mental Health Treatment by Self-Employment Status', fontsize=16, fontweight='bold')
plt.legend(title='Seeking Treatment')
plt.xticks(rotation=0)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?
I selected a grouped bar chart to compare treatment-seeking behavior between self-employed individuals and traditional employees. This comparison reveals how employment structure affects mental health care access and utilization.

##### 2. What is/are the insight(s) found from the chart?
- Self-employed individuals show different treatment-seeking patterns compared to traditional employees
- Employment structure significantly impacts mental health care accessibility
- Self-employed individuals may face barriers such as lack of employer-sponsored health insurance
- Traditional employees benefit from organizational mental health resources and support systems


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


**Positive Impact:**
- Organizations can develop targeted support for contract workers and freelancers
- Identifies opportunities for expanding mental health services to non-traditional workers
- Helps design inclusive mental health programs

**Negative Growth Risk:** Neglecting self-employed workers' mental health needs could impact project quality and long-term business relationships.


#### Chart - 17

In [None]:
# Chart 17 - Comfort Level with Coworkers
plt.figure(figsize=(12, 6))
coworker_counts = df['coworkers'].value_counts()
plt.pie(coworker_counts.values, labels=coworker_counts.index, autopct='%1.1f%%',
        colors=plt.cm.Pastel1(np.linspace(0, 1, len(coworker_counts))), startangle=90)
plt.title('Comfort Level Discussing Mental Health with Coworkers', fontsize=16, fontweight='bold')
plt.axis('equal')
plt.show()

##### 1. Why did you pick the specific chart?
I chose a pie chart to show the distribution of comfort levels when discussing mental health with coworkers. The pie chart effectively displays proportions and makes it easy to see which comfort levels are most common.

##### 2. What is/are the insight(s) found from the chart?
- Most employees feel uncomfortable discussing mental health issues with coworkers
- There's significant variation in comfort levels, indicating diverse workplace cultures
- A substantial portion of employees maintain neutral or uncertain positions
- The discomfort suggests ongoing stigma around mental health in workplace settings

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


**Positive Impact:**
- Identifies need for workplace mental health awareness programs
- Helps design peer support initiatives and safe spaces for mental health discussions
- Provides benchmark for measuring cultural change initiatives

**Negative Growth Risk:** Persistent discomfort may lead to isolation, reduced team cohesion, and untreated mental health issues affecting overall workplace productivity.

#### Chart - 18

In [None]:
# Chart 18 - Supervisor Discussion Comfort
plt.figure(figsize=(12, 6))
supervisor_counts = df['supervisor'].value_counts()
plt.pie(supervisor_counts.values, labels=supervisor_counts.index, autopct='%1.1f%%',
        colors=plt.cm.Set2(np.linspace(0, 1, len(supervisor_counts))), startangle=90)
plt.title('Comfort Level Discussing Mental Health with Supervisor', fontsize=16, fontweight='bold')
plt.axis('equal')
plt.show()

##### 1. Why did you pick the specific chart?
I selected a pie chart to visualize employee comfort levels when discussing mental health with supervisors. This chart clearly shows the distribution of comfort levels and highlights the relationship between management and employee mental health openness.

##### 2. What is/are the insight(s) found from the chart?
- Many employees feel uncomfortable discussing mental health with supervisors
- There's a trust gap between employees and management regarding mental health topics
- Supervisors may need training on mental health sensitivity and support
- The comfort level varies significantly, suggesting inconsistent management approaches
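
Charts 17 and 18 are separate pie charts, which makes it hard to compare coworkers and supervisors directly. A side-by-side table of percentages is one way to do that. The sketch below uses hypothetical toy answers, and the category labels are assumptions about how the `coworkers` and `supervisor` columns are coded.

```python
import pandas as pd

# Hypothetical answers standing in for df['coworkers'] and df['supervisor'].
toy = pd.DataFrame({
    "coworkers":  ["Some of them", "No", "Yes", "Some of them", "No", "Some of them"],
    "supervisor": ["Yes", "No", "Yes", "Some of them", "Yes", "No"],
})

# One column of percentages per question, aligned on the answer labels.
comparison = pd.DataFrame({
    col: toy[col].value_counts(normalize=True).mul(100).round(1)
    for col in ["coworkers", "supervisor"]
}).fillna(0)
print(comparison)
```

Reading across a row shows whether respondents are more willing to talk to one group than the other.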

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


**Positive Impact:**
- Identifies critical need for supervisor training on mental health awareness
- Helps develop management guidelines for mental health conversations
- Enables creation of safe reporting mechanisms for mental health concerns

**Negative Growth Risk:** Poor supervisor relationships regarding mental health could lead to decreased employee engagement, higher turnover, and potential legal liabilities.


#### Chart - 19

In [None]:
# Chart 19 - Mental Health Interview Disclosure
plt.figure(figsize=(10, 6))
interview_counts = df['mental_health_interview'].value_counts()
sns.barplot(x=interview_counts.index, y=interview_counts.values, palette='muted')
plt.xlabel('Would Discuss Mental Health in Interview', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.title('Willingness to Discuss Mental Health Issues in Job Interview', fontsize=16, fontweight='bold')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?
I chose a bar chart to show the distribution of willingness to discuss mental health during job interviews. This visualization effectively displays the different levels of openness candidates have about mental health disclosure.

##### 2. What is/are the insight(s) found from the chart?
- Most candidates are reluctant to discuss mental health issues during interviews
- There's significant fear of discrimination in the hiring process
- The reluctance indicates perceived stigma in recruitment practices
- Few candidates feel comfortable with full disclosure during interviews

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


**Positive Impact:**
- Highlights need for inclusive hiring practices and anti-discrimination policies
- Identifies opportunities to create safer interview environments
- Helps develop diversity and inclusion strategies that address mental health

**Negative Growth Risk:** Discriminatory hiring practices could result in legal issues, reduced talent pool, and damage to company reputation.

#### Chart - 20

In [None]:
# Chart 20 - Leave Difficulty for Mental Health
plt.figure(figsize=(12, 6))
leave_counts = df['leave'].value_counts()
colors = plt.cm.RdYlBu(np.linspace(0, 1, len(leave_counts)))
plt.bar(leave_counts.index, leave_counts.values, color=colors)
plt.xlabel('Difficulty Taking Leave for Mental Health', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.title('Difficulty in Taking Leave for Mental Health Issues', fontsize=16, fontweight='bold')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?
I selected a bar chart to visualize the difficulty levels employees face when taking leave for mental health reasons. This chart effectively shows the distribution of experiences and barriers to mental health leave.

##### 2. What is/are the insight(s) found from the chart?
- Many employees find it difficult to take leave for mental health reasons
- There are significant barriers to accessing mental health leave
- The difficulty levels vary, suggesting inconsistent policies across organizations
- Easy access to mental health leave is not the norm in most workplaces
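
`value_counts()` sorts categories by frequency, so the bars in Chart 20 do not necessarily run from easiest to most difficult. Reindexing the counts to a logical order can make the distribution easier to read. The sketch below assumes the five answer labels listed in `LEAVE_ORDER`, and the placement of "Don't know" in the middle is a presentation choice rather than something defined by the survey.

```python
import pandas as pd

# Assumed answer labels for the `leave` column, ordered from easiest to hardest.
LEAVE_ORDER = [
    "Very easy", "Somewhat easy", "Don't know", "Somewhat difficult", "Very difficult",
]

# Hypothetical answers standing in for df['leave'].
toy_leave = pd.Series([
    "Don't know", "Somewhat easy", "Very easy", "Don't know",
    "Very difficult", "Somewhat difficult", "Don't know",
])

# reindex imposes the logical order; fill_value=0 keeps categories nobody selected.
leave_counts = toy_leave.value_counts().reindex(LEAVE_ORDER, fill_value=0)
print(leave_counts)
```

The reordered `leave_counts` can be passed to the same `plt.bar` call used in the chart.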

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.


**Positive Impact:**
- Identifies critical need for clear mental health leave policies
- Helps develop supportive leave processes that reduce barriers
- Enables benchmarking against best practices for mental health leave

**Negative Growth Risk:** Difficult leave processes may lead to employee burnout, decreased productivity, and higher turnover rates, ultimately impacting business performance and potentially creating legal compliance issues.

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the business objective?
Explain briefly.

Based on the comprehensive analysis of all charts, organizations should:

1. **Implement comprehensive mental health programs** that address the specific barriers identified in the data
2. **Develop targeted interventions** for different demographic groups and company sizes
3. **Invest in privacy-protected mental health resources** to address anonymity concerns
4. **Create anti-stigma initiatives** to reduce fear of consequences
5. **Provide adequate mental health benefits** to ensure accessibility
6. **Adapt programs for remote work environments** to address modern workplace challenges
7. **Monitor and measure progress** using similar metrics to track improvement over time
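
For the monitoring step in recommendation 7, one option is to compute the same handful of headline percentages each time the survey is repeated. The following is a minimal sketch: the function name `mental_health_kpis`, the choice of metrics, and the assumption that "Yes" is the answer of interest in each column are all illustrative.

```python
import pandas as pd

def mental_health_kpis(survey: pd.DataFrame) -> pd.Series:
    """Headline percentages for one survey wave (share of respondents answering 'Yes')."""
    return pd.Series({
        "treatment_rate_pct": (survey["treatment"] == "Yes").mean() * 100,
        "has_benefits_pct": (survey["benefits"] == "Yes").mean() * 100,
        "fears_consequence_pct": (survey["mental_health_consequence"] == "Yes").mean() * 100,
    }).round(1)

# Hypothetical survey wave with four respondents.
toy_wave = pd.DataFrame({
    "treatment": ["Yes", "No", "Yes", "No"],
    "benefits": ["Yes", "Don't know", "No", "Yes"],
    "mental_health_consequence": ["No", "Maybe", "Yes", "No"],
})
kpis = mental_health_kpis(toy_wave)
print(kpis)
```

Running the function on each new wave and stacking the results gives a simple table for tracking change over time.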

The insights from these charts provide a roadmap for creating more supportive, inclusive, and mentally healthy workplace environments that can improve employee wellbeing, productivity, and organizational success.

# **Conclusion**

This comprehensive exploratory data analysis of mental health in the technology industry reveals critical insights that can transform how organizations approach employee mental health support. The study demonstrates clear relationships between workplace factors and treatment-seeking behaviors, highlighting both opportunities and challenges in the current landscape.

**Key Findings Summary**:

The analysis reveals significant variations in treatment-seeking behavior across different demographic groups and company sizes, with family history emerging as a strong predictor of treatment adoption. Workplace factors, particularly the availability of mental health benefits and anonymity protections, show substantial correlation with employees' willingness to seek professional help. The data indicates that fear of workplace consequences remains a significant barrier, with many employees reporting concerns about career impact despite existing policies.

**Strategic Implications**:

Organizations should prioritize creating psychologically safe environments where mental health discussions are normalized and destigmatized. The findings suggest that simply providing benefits is insufficient; companies must actively communicate these resources and ensure employees feel protected when accessing them. Smaller companies may need different approaches compared to larger organizations, as the data shows varying patterns of treatment utilization across company sizes.

**Recommendations for Implementation**:

Based on the analysis, technology companies should implement multi-tiered mental health strategies that address both systemic and individual factors. This includes enhancing manager training on mental health awareness, improving the clarity and accessibility of mental health benefits, and establishing robust anonymity protections. Special attention should be given to employee segments identified as high-risk or underserved by current programs.

**Business Impact Potential**:

The insights derived from this analysis have significant potential for positive business impact, including reduced healthcare costs, improved employee retention, enhanced productivity, and stronger organizational resilience. Companies that proactively address the identified barriers and implement data-driven mental health strategies are likely to see substantial returns on investment through improved employee wellbeing and organizational performance.

**Future Research Directions**:

This foundational analysis opens pathways for longitudinal studies tracking the effectiveness of implemented interventions, deeper investigation into specific demographic subgroups, and exploration of emerging factors such as the long-term impact of remote work on mental health outcomes in the tech industry.

The project successfully demonstrates the value of data-driven approaches to understanding and addressing mental health challenges in the workplace, providing a solid foundation for evidence-based decision making in organizational mental health strategy development.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***