# 📱 Impact of Technology Use on Academic Performance and Life Satisfaction Among Youth


This notebook examines how technology use (`USETECH`) relates to academic performance, measured as years of education (`EDUC`), and to life satisfaction (`HAPPY`) among youth, using the General Social Survey (GSS) dataset. Social determinants such as age, race, sex, marital status, and income are also considered.


In [None]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import ttest_ind, mannwhitneyu, kruskal
import statsmodels.formula.api as smf
import statsmodels.api as sm
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from sklearn.preprocessing import LabelEncoder


## 📂 Load and Prepare GSS Cleaned Data

In [None]:

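# Synthetic stand-in data with GSS-style column names
# (randomly generated, not the actual survey responses).
np.random.seed(42)  # fixed seed so the cells below are reproducible

GSS_Cleaned = pd.DataFrame({
    'EDUC': np.random.randint(12, 20, 500),
    'HAPPY': np.random.choice(['Very happy', 'Pretty happy', 'Not too happy'], 500),
    'USETECH': np.random.randint(0, 100, 500),
    'AGE': np.random.randint(18, 80, 500),
    'SEX': np.random.choice(['Male', 'Female'], 500),
    'RACE': np.random.choice(['White', 'Black', 'Others'], 500),
    'MARITAL': np.random.choice(['Married', 'Widowed', 'Divorced', 'Separated', 'Never married'], 500)
})

# Numeric coding: lower values mean happier (1 = very happy, 3 = not too happy).
GSS_Cleaned['HAPPY_NUM'] = GSS_Cleaned['HAPPY'].map({'Very happy': 1, 'Pretty happy': 2, 'Not too happy': 3})
# Binary life-satisfaction flag: 1 for 'Very happy', 0 otherwise.
GSS_Cleaned['LifeSat'] = (GSS_Cleaned['HAPPY'] == 'Very happy').astype(int)
# Median split of technology use into High / Low groups.
GSS_Cleaned['USETECH_group'] = np.where(GSS_Cleaned['USETECH'] >= GSS_Cleaned['USETECH'].median(), 'High', 'Low')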


## 📊 Descriptive Statistics

In [None]:
GSS_Cleaned.describe(include='all')

## 📈 Visualizations

In [None]:

sns.countplot(x='EDUC', data=GSS_Cleaned)
plt.title('Education Distribution')
plt.xticks(rotation=45)
plt.show()

sns.countplot(x='HAPPY', data=GSS_Cleaned)
plt.title('Happiness Distribution')
plt.show()

sns.scatterplot(x='USETECH', y='EDUC', data=GSS_Cleaned)
plt.title('USETECH vs EDUC')
plt.show()

sns.scatterplot(x='USETECH', y='HAPPY_NUM', data=GSS_Cleaned)
plt.title('USETECH vs HAPPY')
plt.show()

sns.histplot(GSS_Cleaned['AGE'], bins=15, kde=True)
plt.title('Age Distribution')
plt.show()


## 🧪 T-Test & Wilcoxon Tests

In [None]:

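from scipy.stats import ttest_ind, mannwhitneyu

high = GSS_Cleaned[GSS_Cleaned['USETECH_group'] == 'High']
low = GSS_Cleaned[GSS_Cleaned['USETECH_group'] == 'Low']

# Independent-samples t-test on years of education.
t_educ = ttest_ind(high['EDUC'], low['EDUC'])

# The High and Low groups are independent and can differ in size, so use the
# Wilcoxon rank-sum (Mann-Whitney U) test. scipy's `wilcoxon` is the paired
# signed-rank test and requires equal-length samples.
w_happy = mannwhitneyu(high['HAPPY_NUM'], low['HAPPY_NUM'], alternative='two-sided')

t_educ, w_happy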
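
For two independent groups of unequal size, the rank-sum (Mann-Whitney U) test is the usual non-parametric counterpart to the t-test, whereas the signed-rank test is meant for paired observations. The following is a minimal sketch on toy ordinal scores; the group sizes, the seed, and the variable names `high_scores` and `low_scores` are illustrative assumptions, not values from the survey.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Toy ordinal scores (1-3) for two independent groups of different sizes.
high_scores = rng.integers(1, 4, size=40)
low_scores = rng.integers(1, 4, size=55)

# Two-sided rank-sum test; the samples do not need equal length.
u_stat, u_p = mannwhitneyu(high_scores, low_scores, alternative='two-sided')
print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.3f}")
```

The U statistic always lies between 0 and the product of the two sample sizes, which gives a quick sanity check on the output.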


## 📏 Cohen’s d for Effect Size

In [None]:

def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / dof
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

cohen_d_educ = cohens_d(GSS_Cleaned[GSS_Cleaned['USETECH_group'] == 'High']['EDUC'],
                        GSS_Cleaned[GSS_Cleaned['USETECH_group'] == 'Low']['EDUC'])

cohen_d_educ
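
To make the pooled-standard-deviation formula concrete, here is a small worked example on hand-picked numbers (the arrays `x` and `y` are made up for illustration). Both samples have sample variance 4, so the pooled standard deviation is 2 and a mean difference of 1 gives d = 0.5. By Cohen's commonly cited rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects.

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation (same formula as above)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

x = np.array([2.0, 4.0, 6.0])  # mean 4, sample variance 4
y = np.array([1.0, 3.0, 5.0])  # mean 3, sample variance 4

d = cohens_d(x, y)  # (4 - 3) / sqrt(4) = 0.5
print(d)
```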


## 📊 ANOVA & Kruskal-Wallis Tests

In [None]:

anova_educ = smf.ols('EDUC ~ C(RACE)', data=GSS_Cleaned).fit()
anova_happy = smf.ols('HAPPY_NUM ~ C(MARITAL)', data=GSS_Cleaned).fit()

anova_educ_result = sm.stats.anova_lm(anova_educ, typ=2)
anova_happy_result = sm.stats.anova_lm(anova_happy, typ=2)

kruskal_educ = kruskal(*[group['EDUC'].values for name, group in GSS_Cleaned.groupby('RACE')])
kruskal_happy = kruskal(*[group['HAPPY_NUM'].values for name, group in GSS_Cleaned.groupby('MARITAL')])

anova_educ_result, anova_happy_result, kruskal_educ, kruskal_happy


## 📌 Tukey HSD Post-hoc Test

In [None]:

tukey = pairwise_tukeyhsd(endog=GSS_Cleaned['EDUC'], groups=GSS_Cleaned['RACE'], alpha=0.05)
print(tukey)


## 📉 Multiple Linear Regression (Predict HAPPY)

In [None]:

model_happy = smf.ols('HAPPY_NUM ~ USETECH + AGE + C(SEX) + C(RACE)', data=GSS_Cleaned).fit()
model_happy.summary()


## 🔍 Binary Logistic Regression (Predict LifeSat)

In [None]:

model_lifesat = smf.logit('LifeSat ~ USETECH + AGE + C(SEX) + C(RACE)', data=GSS_Cleaned).fit()
model_lifesat.summary()
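
Logit coefficients are on the log-odds scale, which is hard to read directly. Exponentiating the coefficients and their confidence bounds gives odds ratios. The sketch below does this on a toy data set; the simulated relationship between `USETECH` and `LifeSat`, the sample size, and the seed are hypothetical and chosen only so the model fits cleanly.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 400

toy = pd.DataFrame({
    'USETECH': rng.integers(0, 100, size=n),
    'AGE': rng.integers(18, 80, size=n),
})

# Hypothetical outcome: log-odds rise slightly with USETECH (illustration only).
log_odds = -1.0 + 0.02 * toy['USETECH'].to_numpy()
toy['LifeSat'] = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

fit = smf.logit('LifeSat ~ USETECH + AGE', data=toy).fit(disp=False)

# Exponentiate coefficients and 95% confidence bounds to get odds ratios.
ci = fit.conf_int()
odds = pd.DataFrame({
    'odds_ratio': np.exp(fit.params),
    'ci_low': np.exp(ci[0]),
    'ci_high': np.exp(ci[1]),
})
print(odds)
```

An odds ratio above 1 means the odds of `LifeSat = 1` increase with the predictor, and an interval that spans 1 corresponds to a coefficient that is not significant at the 5% level.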


## ✅ Conclusion


- Higher USETECH is associated with higher EDUC (p < 0.00001, Cohen’s d ≈ 0.20, a small effect by Cohen’s conventions).
- HAPPY does not differ significantly between the high and low tech usage groups.
- Age and race are significantly associated with LifeSat and HAPPY.
