
# Digital Addiction and Psychological Well-being

This project analyzes the relationship between digital device usage (smartphones, gaming, and social media) and psychological well-being. It explores how different patterns of digital usage correlate with mental health indicators such as anxiety, depression, and stress.

---

## Project Objectives

1. To analyze the relationship between digital addiction patterns and psychological well-being.
2. To use predictive models to estimate well-being based on digital usage behaviors.
3. To employ inferential statistics to identify significant differences in well-being based on usage patterns.
4. To explore potential causal links between digital addiction and mental health indicators.
5. To visualize the data for better understanding and communication of findings.

---

## Brief

The analysis is based on a simulated dataset of 1,000 participants. The dataset includes variables such as age, gender, hours of digital usage per day, primary digital activity, most used device, addiction scores, and various psychological well-being indicators.
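
Before running the analysis, it can help to confirm that the loaded file actually contains the columns the later cells rely on. The following is a minimal sketch: the column names are taken from the code cells in this notebook, while the helper `missing_columns` and the sample values are hypothetical and only serve as illustration.

```python
import pandas as pd

# Column names used by the analysis cells in this notebook.
EXPECTED_COLUMNS = [
    'Age', 'Gender', 'Digital_Usage_Hours_Per_Day', 'Primary_Activity',
    'Device_Used_Most', 'Addiction_Score', 'Wellbeing_Score',
    'Anxiety_Level', 'Depression_Level', 'Stress_Level',
]


def missing_columns(frame):
    """Return the expected columns that are absent from the frame, in order."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# Hypothetical two-row frame that lacks the three mental health indicators.
sample = pd.DataFrame({
    'Age': [21, 34],
    'Gender': ['Female', 'Male'],
    'Digital_Usage_Hours_Per_Day': [5.5, 2.0],
    'Primary_Activity': ['Gaming', 'Browsing'],
    'Device_Used_Most': ['Smartphone', 'Laptop'],
    'Addiction_Score': [7.2, 3.1],
    'Wellbeing_Score': [4.0, 8.5],
})

print(missing_columns(sample))
```

An empty list means every expected column is present.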


In [None]:

# Load necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from scipy.stats import ttest_ind, f_oneway
import statsmodels.api as sm

# Load the dataset
df = pd.read_csv('digital_addiction_psychological_wellbeing_data.csv')

# Data Preprocessing: Encode categorical variables
label_encoders = {}
for column in ['Gender', 'Primary_Activity', 'Device_Used_Most']:
    le = LabelEncoder()
    df[column] = le.fit_transform(df[column])
    label_encoders[column] = le

# Standardize numerical features in place (z-scores: mean 0, standard deviation 1).
# Note: these three columns stay on the standardized scale in every later cell.
numeric_cols = ['Digital_Usage_Hours_Per_Day', 'Addiction_Score', 'Wellbeing_Score']
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

df.head()
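
`LabelEncoder` replaces each category with an integer, so later tables and plots show codes rather than names. The sketch below shows how the mapping can be inspected and reversed. The category values are hypothetical, since the dataset's actual labels are not listed here.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical category values; the real dataset's labels may differ.
activities = pd.Series(['Gaming', 'Social Media', 'Browsing', 'Gaming', 'Streaming'])

encoder = LabelEncoder()
codes = encoder.fit_transform(activities)

# classes_ holds the original labels in sorted order, so each label's
# position in classes_ is its integer code.
code_to_label = dict(enumerate(encoder.classes_))

# inverse_transform recovers the original labels from the codes.
decoded = encoder.inverse_transform(codes)

print(code_to_label)
print(list(decoded))
```

In this notebook the fitted encoders are kept in `label_encoders`, so the same `classes_` and `inverse_transform` calls apply to `label_encoders['Primary_Activity']` and the other encoded columns.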


In [None]:

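# Predictive Modeling
X = df[['Age', 'Gender', 'Digital_Usage_Hours_Per_Day', 'Primary_Activity', 'Device_Used_Most', 'Addiction_Score']]

# Wellbeing_Score is continuous (it was standardized above), and scikit-learn
# classifiers reject continuous targets. Binarize it at the median so the
# classifiers below predict high (1) versus low (0) well-being.
# The median split is an assumed cutoff chosen to keep the classes balanced.
y = (df['Wellbeing_Score'] > df['Wellbeing_Score'].median()).astype(int)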

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Random Forest Classifier
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)
rf_pred = rf_model.predict(X_test)
rf_accuracy = accuracy_score(y_test, rf_pred)

# Decision Tree Classifier
dt_model = DecisionTreeClassifier(random_state=42)
dt_model.fit(X_train, y_train)
dt_pred = dt_model.predict(X_test)
dt_accuracy = accuracy_score(y_test, dt_pred)

(rf_accuracy, dt_accuracy)
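
A single train/test split yields one accuracy figure per model, and that figure depends on which rows land in the test set. The following is a minimal sketch of 5-fold cross-validation as a more stable comparison. It runs on synthetic data, and the names `X_demo`, `y_demo`, and `cv_scores` are illustrative rather than part of the analysis above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: a binary target driven by the first feature plus noise.
rng = np.random.default_rng(42)
n_samples = 300
X_demo = rng.normal(size=(n_samples, 3))
y_demo = (X_demo[:, 0] + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

models = {
    'random_forest': RandomForestClassifier(n_estimators=50, random_state=42),
    'decision_tree': DecisionTreeClassifier(random_state=42),
}

# cross_val_score returns one accuracy value per fold.
cv_scores = {
    name: cross_val_score(model, X_demo, y_demo, cv=5, scoring='accuracy')
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Reporting the mean and spread across folds makes it easier to judge whether a gap between two models is larger than the variation between splits.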


In [None]:

# Inferential Statistics

# ANOVA test for differences in Addiction Score across different Primary Activities
# (one group per encoded activity, rather than hard-coding the codes 0 to 3)
activity_groups = [group['Addiction_Score'] for _, group in df.groupby('Primary_Activity')]
anova_result_addiction = f_oneway(*activity_groups)

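# T-test for Wellbeing Score between high and low Addiction Scores.
# Addiction_Score was standardized above, so the raw cutoffs 7 and 3 no longer
# separate high from low scorers (a z-score above 7 is essentially impossible).
# Split at the upper and lower quartiles instead; this is an assumed substitute
# for the original raw-scale cutoffs.
high_cutoff = df['Addiction_Score'].quantile(0.75)
low_cutoff = df['Addiction_Score'].quantile(0.25)
high_addiction_wellbeing = df[df['Addiction_Score'] > high_cutoff]['Wellbeing_Score']
low_addiction_wellbeing = df[df['Addiction_Score'] < low_cutoff]['Wellbeing_Score']
ttest_result_wellbeing = ttest_ind(high_addiction_wellbeing, low_addiction_wellbeing)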

(anova_result_addiction, ttest_result_wellbeing)
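
The ANOVA and t-test report whether group differences are statistically significant, not how large they are. With 1,000 participants even small differences can reach significance, so an effect size is a useful companion. The sketch below computes Cohen's d for two groups and eta squared for several groups. The helper names and the toy values are hypothetical.

```python
import numpy as np


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def eta_squared(groups):
    """Share of total variance explained by group membership (one-way ANOVA)."""
    arrays = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(arrays)
    grand_mean = all_values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in arrays)
    ss_total = ((all_values - grand_mean) ** 2).sum()
    return ss_between / ss_total


# Toy values for illustration only.
d = cohens_d([2.0, 3.0, 4.0], [4.0, 5.0, 6.0])
eta2 = eta_squared([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(d, eta2)
```

The same functions could be applied to `high_addiction_wellbeing` and `low_addiction_wellbeing`, or to the per-activity addiction score groups.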


In [None]:

# Regression analysis of well-being predictors
# (OLS coefficients describe associations; on their own they do not establish causation)
X = df[['Digital_Usage_Hours_Per_Day', 'Addiction_Score', 'Anxiety_Level', 'Depression_Level', 'Stress_Level']]
y = df['Wellbeing_Score']

# Adding a constant term for the intercept
X = sm.add_constant(X)

# Fit the regression model
causal_regression_model = sm.OLS(y, X).fit()

causal_regression_model.summary()


In [None]:

# Visualizations

# Visualization 1: Pie Chart - Distribution of Primary Activity
plt.figure(figsize=(6, 6))
df['Primary_Activity'].value_counts().plot.pie(autopct='%1.1f%%', startangle=90, colors=['#ff9999','#66b3ff','#99ff99', '#ffcc99'])
plt.title('Distribution of Primary Activity')
plt.ylabel('')
plt.show()

# Visualization 2: Bar Graph - Average Addiction Score by Primary Activity
plt.figure(figsize=(8, 6))
# errorbar=None replaces ci=None, which is deprecated as of seaborn 0.12
sns.barplot(x='Primary_Activity', y='Addiction_Score', data=df, errorbar=None)
plt.title('Average Addiction Score by Primary Activity')
plt.xlabel('Primary Activity')
plt.ylabel('Addiction Score')
plt.show()

# Visualization 3: Scatter Plot - Wellbeing Score by Digital Usage Hours
plt.figure(figsize=(8, 6))
sns.scatterplot(x='Digital_Usage_Hours_Per_Day', y='Wellbeing_Score', hue='Primary_Activity', data=df, palette='viridis')
plt.title('Wellbeing Score by Digital Usage Hours')
plt.xlabel('Digital Usage Hours per Day')
plt.ylabel('Wellbeing Score')
plt.show()

# Visualization 4: Heatmap - Correlation Matrix
plt.figure(figsize=(10, 8))
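# numeric_only=True skips any non-numeric columns, which would raise an error in pandas 2.x
corr_matrix = df.corr(numeric_only=True)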
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0)
plt.title('Heatmap of Correlation Matrix')
plt.show()

# Visualization 5: Boxplot - Stress Level by Device Used
plt.figure(figsize=(8, 6))
sns.boxplot(x='Device_Used_Most', y='Stress_Level', data=df)
plt.title('Stress Level by Device Used')
plt.xlabel('Device Used Most')
plt.ylabel('Stress Level')
plt.show()
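
A correlation matrix is symmetric, so a full heatmap shows every value twice. The sketch below hides the upper triangle and the diagonal with a boolean mask. It uses synthetic data and illustrative column names, and is one possible presentation rather than the one used above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data with four numeric columns.
rng = np.random.default_rng(1)
demo = pd.DataFrame(
    rng.normal(size=(100, 4)),
    columns=['usage', 'addiction', 'stress', 'wellbeing'],
)
corr = demo.corr()

# True marks cells to hide: the diagonal and everything above it.
mask = np.triu(np.ones_like(corr, dtype=bool))

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f', cmap='coolwarm', center=0, ax=ax)
ax.set_title('Lower-Triangle Correlation Heatmap')
plt.show()
```

The same mask can be built from `corr_matrix` in the cell above and passed to `sns.heatmap` through its `mask` argument.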



## Summary and Conclusion

This analysis examines the relationship between digital addiction and psychological well-being. Predictive models showed moderate accuracy, suggesting that digital usage is related to well-being but that other factors also contribute. Inferential statistics identified significant differences in addiction scores across primary activities, and the regression analysis highlighted predictors associated with well-being, although OLS coefficients alone do not establish causation.

Further research is needed to explore additional factors and more sophisticated modeling techniques to fully understand the impact of digital addiction on mental health.

---

### Key Takeaways
1. **Predictive Models**: Random Forest and Decision Tree classifiers were compared on held-out test accuracy for predicting well-being from digital usage features.
2. **Significant Differences**: High addiction scores are associated with lower well-being.
3. **Primary Activity Influence**: Different primary activities show varying levels of addiction and impact on well-being.

### Recommendations
1. Promote awareness about healthy digital usage habits.
2. Encourage further research to explore other factors influencing digital addiction.
3. Develop interventions to manage digital addiction and improve psychological well-being.

---

**Note**: The results are based on simulated data, and real-world implications should be interpreted with caution.
