
# **Social Media vs Productivity Analysis**

This project is an **Exploratory Data Analysis (EDA)**–focused study that investigates how **social media usage and digital habits impact productivity, stress, and well-being**. It combines data cleaning, statistical analysis, and visualization to uncover meaningful patterns and relationships.

The analysis investigates how social media usage patterns correlate with productivity metrics using a comprehensive dataset of 30,000+ records (sampled to 3,000, cleaned to 1,354 complete rows). Key focus areas include digital habits (daily social media time, notifications, screen time before sleep), productivity scores (perceived vs actual), and well-being factors (stress, sleep, burnout).




## **1. Project Objective**

The main goal of the project is to:
- Understand the relationship between **daily social media usage** and **productivity levels**
- Analyze how factors like **sleep hours, stress, screen time before sleep, notifications**, and **work hours** influence both **perceived** and **actual productivity**
- Practice **real-world data analysis skills** using a messy dataset



## **2. Dataset Overview**

The dataset represents **behavioral and lifestyle data**, including:
- Daily social media time
- Work hours per day
- Sleep duration
- Stress level
- Screen time before sleep
- Productivity scores (perceived vs actual)
- Burnout indicators
- Digital wellbeing / focus app usage

The data is intentionally **imperfect**, containing:
- Missing values
- Zero or unrealistic values
- Mixed data types  
This makes it ideal for a **realistic data analytics project**.


**Dataset Features (17 columns):**



| Category       | Key Variables                                                                               |
| -------------- | ------------------------------------------------------------------------------------------- |
| Demographics   | age, gender, job_type                                                                       |
| Digital Habits | daily_social_media_time, number_of_notifications, screen_time_before_sleep, uses_focus_apps |
| Productivity   | perceived_productivity_score, actual_productivity_score, work_hours_per_day                 |
| Well-being     | stress_level, sleep_hours, days_feeling_burnout_per_month, job_satisfaction_score           |


## **3. Data Cleaning & Preparation**

1. Data Preprocessing Summary
text
- Original: 3,000 rows × 17 columns
- Missing values handled: ~10-12% NaNs dropped (e.g., 350 in daily_social_media_time)
- Final clean dataset: 1,354 rows (no duplicates, no nulls)
- Transformations: Categorical → 'category' dtype; Scores scaled to percentages (0-100%)

Key preprocessing steps include:
- Identifying and handling **null values**
- Removing or replacing **invalid zero values**
- Fixing data types
- Selecting relevant numerical and categorical features

This step ensures the dataset is reliable before analysis.

2. Key Statistics (cleaned data):
​

* text
Mean daily_social_media_time: 3.39 hours
Mean perceived_productivity_score: 54.9%
Mean actual_productivity_score: 49.4%
Stress_level (mean): 55.6%



## **4. Exploratory Data Analysis (EDA)**

The core strength of the project lies in EDA using **Matplotlib and Seaborn**:

### Visualizations used:
- **Histograms & KDE plots** → understand distributions and skewness
* **ViolinPlots** → It is used to understand the distribution of a numerical variable across different categories.
- **Boxplots** → detect outliers
- **Count plots** → categorical behavior patterns
- **Scatter plots** → relationships between social media time and productivity
- **Correlation heatmaps** → strength and direction of relationships
- **Clustermap** → grouping related productivity and lifestyle metrics

### **Statistical concepts applied:**
- Distribution shape (normal, left/right skewed)
- Correlation analysis
- Comparison of perceived vs actual productivity



## **5. Key Insights**

Although exact values may vary, the analysis generally shows:
- Higher **social media and screen time** tends to relate to **lower productivity**
- **Sleep hours** have a strong positive impact on productivity
- **Stress levels** increase as productivity decreases
- Perceived productivity does not always match actual productivity
- Digital wellbeing tools and focus apps show potential benefits

* **Productivity Self-Assessment Accuracy:** Users accurately perceive their productivity (96% correlation), validating self-reported metrics.
​

* **Social Media Impact Limited:** Daily usage averages 3.4 hours but shows weak direct correlation with productivity decline.
​

**Critical Lifestyle Factors:**

* Screen time before sleep and poor sleep hours cluster together (lifestyle factor)

* High notifications (mean: 60/day) but low focus app adoption (~50% usage)
​

**Underutilized Tools:** Only partial adoption of digital wellbeing features despite burnout prevalence (mean: 16 days/month)



##**6. Skills Demonstrated**

This project clearly showcases:
- Python (Pandas, NumPy)
- Data cleaning & preprocessing
- EDA techniques
- Data visualization best practices
- Analytical thinking and interpretation
- Real-world dataset handling


# **7. Actionable Recommendations**

##**For Individuals**
1. **Notification Management**: Reduce to <40/day (bottom quartile) - current avg 60
2. **Pre-Sleep Digital Detox**: Limit screen time <1hr before bed (mean: 1.1hr)
3. **Focus Tools Adoption**: Enable focus apps + digital wellbeing (only ~50% current usage)
4. **Scheduled Offline Time**: Target 12+ weekly offline hours (mean: 11.5hr)

##**For Organizations**

1. **Digital Wellbeing Training**: Mandatory focus app setup + notification limits
2. **Burnout Monitoring**: Track days_feeling_burnout_per_month >20 as intervention trigger
3. **Flexible Work Policies**: Prioritize sleep hours over work_hours_per_day (weak correlation)

##**Expected Impact**

* Productivity Lift: 5-10% from lifestyle interventions (based on top quartile benchmarks)
* Burnout Reduction: 20-30% via digital habit optimization



# **8.Final Conclusions**

**Primary Insight:** Social media usage has minimal direct productivity impact. The real drivers are lifestyle clusters (sleep quality, pre-bed screen time, notification overload) and underutilized digital tools.

**Strategic Priority:** Focus on preventable digital habits rather than total social media elimination. Current tool adoption gaps represent the lowest-hanging fruit for productivity gains.

**Business Value:** Organizations implementing these recommendations can expect measurable improvements in actual_productivity_score without structural work changes.