# The Welfare Effects of Social Media: Exploring the Facebook Deactivation Experiment

**Persuasion at Scale** (PSAM 3707 / UN 3707), Week 4

Based on: Allcott, Braghieri, Eichmeyer, and Gentzkow (2020). "The Welfare Effects of Social Media." *American Economic Review* 110(3): 629-676.

---

## What happened in this experiment?

In October 2018, right before the US midterm elections, researchers **paid Facebook users to deactivate their accounts for four weeks**.

- 2,743 users recruited via Facebook ads
- Those willing to deactivate for $102 or less were randomized
- ~830 in the Treatment group (paid to deactivate)
- ~830 in the Control group (kept using Facebook)
- Over 90% compliance with deactivation

**The big questions:** What happens when you take away someone's Facebook? Do they become happier? Less informed? Less politically polarized?

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns
from scipy import stats

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 13
np.random.seed(2018)  # the year of the experiment

## Part 1: The Experimental Design

Before we look at results, let's understand who was in this experiment and how randomization works.
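
To make the idea of randomization concrete, here is a minimal sketch of a simple random split, assuming a hypothetical pool of 1,661 eligible users with invented ages. The variable names (`n_eligible`, `age`, `treated`) and the data are illustrative assumptions, not the study's records, and the paper's actual assignment procedure may differ. The point is that a random split tends to produce groups that look alike on characteristics measured beforehand.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical eligible pool with made-up ages (NOT the study's data)
n_eligible = 1661
age = rng.integers(18, 70, size=n_eligible)

# Shuffle the pool; the first 831 shuffled people go to treatment, the rest to control
order = rng.permutation(n_eligible)
treated = np.zeros(n_eligible, dtype=bool)
treated[order[: (n_eligible + 1) // 2]] = True

print(f"Treatment n = {treated.sum()}, Control n = {(~treated).sum()}")
print(f"Mean age, treatment: {age[treated].mean():.1f}")
print(f"Mean age, control:   {age[~treated].mean():.1f}")
```

Because assignment ignores age entirely, any gap between the two group means comes from chance alone and shrinks as the sample grows.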

### Who were the participants?

The sample is **not** a random sample of Americans. It's a sample of Facebook users who:
1. Saw a Facebook ad about the study
2. Were willing to participate
3. Were willing to deactivate for $102 or less

This matters! Let's see how they compare to the general population.

In [None]:
# Table 2 from the paper: Sample vs. population demographics
categories = ['Income\nunder $50K', 'College\neducated', 'Male', 'White', 'Age\nunder 30', 'Republican', 'Democrat']
sample_vals = [0.40, 0.51, 0.43, 0.68, 0.52, 0.13, 0.42]
fb_users =   [0.41, 0.33, None, None, None, None, None]  # limited public data
us_pop =     [0.52, 0.30, 0.49, 0.64, 0.21, 0.26, 0.31]

fig, ax = plt.subplots(figsize=(12, 6))
x = np.arange(len(categories))
width = 0.3

bars1 = ax.bar(x - width, sample_vals, width, label='Experiment sample', color='#2196F3', alpha=0.85)
bars3 = ax.bar(x + width, us_pop, width, label='US population', color='#9E9E9E', alpha=0.7)

# Add FB user bars where available
for i, val in enumerate(fb_users):
    if val is not None:
        ax.bar(x[i], val, width, color='#4CAF50', alpha=0.7)
ax.bar([], [], width, color='#4CAF50', alpha=0.7, label='Facebook users')

ax.set_ylabel('Proportion')
ax.set_title('Who volunteered for a Facebook deactivation experiment?\n(Table 2 from Allcott et al. 2020)', fontsize=15)
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend(loc='upper right')
ax.set_ylim(0, 0.75)

# Annotate the key differences
ax.annotate('Much more\nDemocratic', xy=(6, 0.42), xytext=(6, 0.60),
            fontsize=10, ha='center', color='#1565C0',
            arrowprops=dict(arrowstyle='->', color='#1565C0'))
ax.annotate('Much younger', xy=(4, 0.52), xytext=(4.5, 0.65),
            fontsize=10, ha='center', color='#1565C0',
            arrowprops=dict(arrowstyle='->', color='#1565C0'))

plt.tight_layout()
plt.show()

print("Key takeaway: The sample skews young, educated, female, and Democratic.")
print("This is important for interpreting the results (external validity).")

### How much were people willing to accept?

Before randomization, the researchers asked: *"What's the minimum you'd accept to deactivate Facebook for 4 weeks?"*

The distribution of these valuations tells us something about how much people value Facebook.

In [None]:
# Simulate the WTA distribution (matching paper: median ~$100, mean ~$180, right-skewed)
# The paper reports that 61% had WTA <= $102
n_total = 2743
wta = np.concatenate([
    np.random.lognormal(mean=3.8, sigma=0.9, size=int(n_total * 0.61)),  # lower-WTA component (mostly under $102)
    np.random.lognormal(mean=5.5, sigma=0.8, size=int(n_total * 0.39))   # higher-WTA component (mostly over $102)
])
wta = np.clip(wta, 1, 1000)
# Stretch the upper tail. Values above $102 stay above it,
# so the share at or below $102 is unchanged by this step.
wta[wta > 102] = wta[wta > 102] * 1.5

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Left: histogram of WTA
ax1.hist(wta, bins=50, color='#5C6BC0', alpha=0.8, edgecolor='white')
ax1.axvline(x=102, color='red', linestyle='--', linewidth=2, label='$102 cutoff')
ax1.axvline(x=np.median(wta), color='orange', linestyle='--', linewidth=2, label=f'Median: ${np.median(wta):.0f}')
ax1.set_xlabel('Willingness to Accept ($)')
ax1.set_ylabel('Number of participants')
ax1.set_title('How much would you need to quit Facebook\nfor 4 weeks?')
ax1.legend()
ax1.set_xlim(0, 500)

# Right: the randomization scheme
labels = ['Total\nRecruited', 'WTA ≤ $102\n(randomized)', 'Treatment\n(deactivate)', 'Control\n(keep FB)']
values = [2743, 1661, 831, 830]
colors = ['#78909C', '#5C6BC0', '#EF5350', '#66BB6A']

bars = ax2.barh(range(4), values, color=colors, height=0.6)
ax2.set_yticks(range(4))
ax2.set_yticklabels(labels)
ax2.set_xlabel('Number of participants')
ax2.set_title('The randomization funnel')
ax2.invert_yaxis()
for bar, val in zip(bars, values):
    ax2.text(bar.get_width() + 30, bar.get_y() + bar.get_height()/2,
             f'n = {val}', va='center', fontsize=12)

plt.tight_layout()
plt.show()

print(f"Only {1661/2743:.0%} of recruits were willing to deactivate for $102.")
print("These are the people who value Facebook LESS. Keep that in mind.")

## Part 2: Loading the Real Data

The replication data is from [openICPSR project 112081](https://www.openicpsr.org/openicpsr/project/112081) (CC BY 4.0 license). We use the anonymized endline survey, which contains the main outcome measures for 2,823 participants.

### The four families of outcomes

The paper measures effects in four domains:

| Domain | What they measured | Effect of deactivation |
|--------|-------------------|----------------------|
| **Time use** | Minutes on Facebook, other activities | Freed up ~60 min/day |
| **News & politics** | News quiz, polarization, engagement | Less informed, less polarized |
| **Well-being** | Happiness, life satisfaction, depression | Slightly happier |
| **Post-experiment** | Did they go back to Facebook? | Used Facebook less afterward |

In [None]:
# Load the real replication data from openICPSR (CC BY 4.0)
# Data files are CSV exports from the original Stata .dta files

DATA_URL = 'https://raw.githubusercontent.com/chrishwiggins/facebook-deactivation-welfare/main/data/'

el = pd.read_csv(DATA_URL + 'endline.csv')
bl = pd.read_csv(DATA_URL + 'baseline_slim.csv')
pe = pd.read_csv(DATA_URL + 'postendline.csv')
sms = pd.read_csv(DATA_URL + 'sms_summary.csv')

# t and fb_minutes are already numeric in the CSV
el['t'] = pd.to_numeric(el['t'], errors='coerce')
el['fb_minutes'] = pd.to_numeric(el['fb_minutes'], errors='coerce')
el['id'] = el['id'].astype(str)
bl['ID'] = bl['ID'].astype(str)

# Parse happiness: "5", "7 (a very happy person)", etc. -> extract leading digit
el['happiness'] = pd.to_numeric(
    el['swb_happiness'].astype(str).str.extract(r'(\d+)')[0], errors='coerce')

# Life satisfaction (Likert text -> 1-7)
swl_map = {'Strongly disagree': 1, 'Disagree': 2, 'Slightly disagree': 3,
           'Neither agree nor disagree': 4, 'Slightly agree': 5, 'Agree': 6, 'Strongly agree': 7}
for col in ['swb_swl1', 'swb_swl2', 'swb_swl3']:
    el[col + '_n'] = el[col].map(swl_map)
el['life_satisfaction'] = el[['swb_swl1_n', 'swb_swl2_n', 'swb_swl3_n']].mean(axis=1)

# Loneliness (text -> 1-3)
lnlns_map = {'Hardly ever': 1, 'Some of the time': 2, 'Often': 3}
for col in ['swb_lnlns1', 'swb_lnlns2', 'swb_lnlns3']:
    el[col + '_n'] = el[col].map(lnlns_map)
el['loneliness'] = el[['swb_lnlns1_n', 'swb_lnlns2_n', 'swb_lnlns3_n']].mean(axis=1)

# Depression items: "3.", "4. All or almost all of the time." -> extract leading digit
for col in ['swb_eurhappsvy_4', 'swb_eurhappsvy_5', 'swb_eurhappsvy_6', 'swb_eurhappsvy_7']:
    el[col + '_n'] = pd.to_numeric(
        el[col].astype(str).str.extract(r'(\d+)')[0], errors='coerce')
el['depression'] = el[['swb_eurhappsvy_4_n', 'swb_eurhappsvy_5_n', 
                        'swb_eurhappsvy_6_n', 'swb_eurhappsvy_7_n']].mean(axis=1)

# News knowledge: items are True/False/Unsure quiz responses.
# Count confident (non-"Unsure") answers as a proxy for news engagement.
nk_cols = [c for c in el.columns if c.startswith('news_knowledge')]
for c in nk_cols:
    el[c + '_confident'] = (el[c] != 'Unsure').astype(float)
    el.loc[el[c].isna(), c + '_confident'] = np.nan
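# min_count=1 keeps the score missing (NaN) when every item is missing, instead of scoring it 0
el['news_knowledge'] = el[[c + '_confident' for c in nk_cols]].sum(axis=1, min_count=1)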

# Attitude columns are already float64
el['attitude_trump_1_n'] = el['attitude_trump_1']

# Follow politics (text -> 1-4)
fp_map = {'Not at all closely': 1, 'Somewhat closely': 2, 
          'Rather closely': 3, 'Very closely': 4}
el['follow_politics_n'] = el['follow_politics'].map(fp_map)

# WTA (already numeric)
el['wta'] = pd.to_numeric(el['wta3'], errors='coerce')
el.loc[el['wta'] > 10000, 'wta'] = np.nan

# Merge baseline demographics
df = el.merge(bl[['ID', 'qualified', 'educ_prescreen', 'repdem', 'fb_minutes_prescreen']], 
              left_on='id', right_on='ID', how='left')

# Create treatment label
df['treatment'] = df['t'].map({1: 'Deactivated', 0: 'Control'})

n_treat = (df['t'] == 1).sum()
n_ctrl = (df['t'] == 0).sum()
print(f"Real data loaded: {len(df)} participants ({n_treat} treatment, {n_ctrl} control)")
print(f"\nFacebook minutes/day:")
print(f"  Control:   {df.loc[df.t==0, 'fb_minutes'].mean():.1f} (sd={df.loc[df.t==0, 'fb_minutes'].std():.1f})")
print(f"  Treatment: {df.loc[df.t==1, 'fb_minutes'].mean():.1f} (sd={df.loc[df.t==1, 'fb_minutes'].std():.1f})")
print(f"\nHappiness (1-7):")
print(f"  Control:   {df.loc[df.t==0, 'happiness'].mean():.2f}")
print(f"  Treatment: {df.loc[df.t==1, 'happiness'].mean():.2f}")
print(f"\nNews knowledge (0-15, confident answers):")
print(f"  Control:   {df.loc[df.t==0, 'news_knowledge'].mean():.1f}")
print(f"  Treatment: {df.loc[df.t==1, 'news_knowledge'].mean():.1f}")

## Wait: who's in this dataset?

The endline CSV has **2,823 rows**, but only ~1,660 of those were in the actual experiment. The rest are people with high willingness-to-accept (WTA > $102) who were **never randomized**. They were surveyed but not assigned to treatment or control through randomization.

If we include them in our analysis, they all end up in the "control" group, which **breaks the experiment**. These high-WTA users value Facebook more than the randomized participants, so lumping them in with the real controls biases every comparison.

This is a classic **sample selection** problem: the published paper analyzes only the experimental sample (baseline WTA ≤ \$102, and not flagged as low-quality responses). Let's see what happens when we do the same.

### The lesson

Replication data files often contain more observations than the paper's analysis sample. If you don't read the codebook and filter correctly, you can get **wrong signs** on your treatment effects. The randomization guarantee only holds within the randomized sample.
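
The mechanism can be sketched with a small simulation. Everything here is invented for illustration (the group sizes, the true effect of 0.30, and the assumption that never-randomized users have a higher baseline outcome); none of it comes from the replication data. It compares the difference in means within the randomized sample to the same calculation after never-randomized users are added and coded as controls.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 1,600 randomized users plus 1,200 never-randomized users
n_rand, n_extra = 1600, 1200
true_effect = 0.30  # treatment raises the outcome by 0.30 among randomized users

t = rng.integers(0, 2, size=n_rand)                      # random assignment (0/1)
y_rand = rng.normal(5.0, 1.0, n_rand) + true_effect * t  # outcome for randomized users

# Never-randomized users differ at baseline (assumed mean 6.0) and are all coded t = 0
y_extra = rng.normal(6.0, 1.0, n_extra)

def diff_in_means(y, t):
    """Mean outcome of units with t == 1 minus mean outcome of units with t == 0."""
    return y[t == 1].mean() - y[t == 0].mean()

clean = diff_in_means(y_rand, t)

y_all = np.concatenate([y_rand, y_extra])
t_all = np.concatenate([t, np.zeros(n_extra, dtype=int)])
contaminated = diff_in_means(y_all, t_all)

print(f"Randomized sample only:        {clean:+.2f}")
print(f"Extras lumped in with control: {contaminated:+.2f}")
```

In this setup the contaminated estimate mixes the treatment effect with the baseline gap between the two kinds of users, which is large enough here to flip the sign.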

In [None]:
# ---- BEFORE vs AFTER: the sample selection fix ----

# First, let's see the BROKEN results (full sample, including non-randomized users)
print("BEFORE FIX: Full sample (N={}, includes non-randomized high-WTA users)".format(len(df)))
print("=" * 70)

outcomes_check = [
    ('fb_minutes', 'Facebook min/day'),
    ('happiness', 'Happiness'),
    ('life_satisfaction', 'Life satisfaction'),
    ('depression', 'Depression'),
    ('loneliness', 'Loneliness'),
    ('news_knowledge', 'News knowledge'),
    ('follow_politics_n', 'Follows politics'),
]

paper_vals = {
    'fb_minutes': -1.30, 'happiness': +0.09, 'life_satisfaction': +0.08,
    'depression': -0.08, 'loneliness': -0.03, 'news_knowledge': -0.19,
    'follow_politics_n': -0.18,
}

def compare_results(data, label):
    treat = data[data.t == 1]
    ctrl = data[data.t == 0]
    print(f"\n{'Outcome':<25} {'Our estimate':>12} {'Paper':>8} {'Match?':>8}")
    print("-" * 60)
    for var, name in outcomes_check:
        t_vals = treat[var].dropna()
        c_vals = ctrl[var].dropna()
        if len(t_vals) < 2 or len(c_vals) < 2:
            continue
        diff = t_vals.mean() - c_vals.mean()
        sd_ctrl = c_vals.std()
        effect = diff / sd_ctrl if sd_ctrl > 0 else 0
        paper = paper_vals.get(var)
        if paper is None:
            # No published benchmark for this outcome: print our estimate only
            print(f"  {name:<23} {effect:>+10.2f} SD")
            continue
        same_dir = (effect > 0) == (paper > 0) if paper != 0 else True
        match = "YES" if same_dir and abs(effect - paper) < 0.15 else ("~dir" if same_dir else "WRONG")
        print(f"  {name:<23} {effect:>+10.2f} SD {paper:>+8.2f} {match:>8}")

compare_results(df, "Full sample")

# ---- NOW APPLY THE FIX ----
# Filter to the experimental sample: baseline WTA <= $102, exclude low-quality
df['wta1_n'] = pd.to_numeric(df['wta1'], errors='coerce')
n_before = len(df)
df = df[(df['wta1_n'] <= 102) & (df['lowqual'] == 0)].copy()

# Also clean outliers in FB minutes (> 10 hours/day is likely data error)
df.loc[df['fb_minutes'] > 600, 'fb_minutes'] = np.nan

# Rebuild treatment label
df['treatment'] = df['t'].map({1: 'Deactivated', 0: 'Control'})

n_after = len(df)
n_treat = (df['t'] == 1).sum()
n_ctrl = (df['t'] == 0).sum()
n_dropped = n_before - n_after

print(f"\n\n{'='*70}")
print(f"AFTER FIX: Experimental sample only")
print(f"  Dropped {n_dropped} non-experimental observations")
print(f"  N = {n_after} ({n_treat} treatment, {n_ctrl} control)")
print(f"{'='*70}")

compare_results(df, "Experimental sample")

print(f"\n\nNotice: happiness, loneliness, and time-use effects now have the")
print(f"CORRECT signs. The sample selection fix flipped several results.")

## Part 3: What happened to their time?

When people quit Facebook, they freed up about **60 minutes per day** on average. Where did that time go?

This is a crucial question: if quitting Facebook just means more time on Instagram or TikTok, the effects might be very different than if it means more time with family.

Note: The endline survey asks about Facebook minutes directly. The time substitution details come from the paper's analysis of the full time-use diary data.

In [None]:
# Visualize Facebook use: real data
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Panel 1: Facebook minutes distribution by group (REAL DATA)
ax = axes[0]
treat_fb = df.loc[df.t==1, 'fb_minutes'].dropna()
ctrl_fb = df.loc[df.t==0, 'fb_minutes'].dropna()
bins = np.linspace(0, 300, 40)
ax.hist(ctrl_fb, bins=bins, alpha=0.6, color='#66BB6A', label='Control', density=True)
ax.hist(treat_fb, bins=bins, alpha=0.6, color='#EF5350', label='Treatment', density=True)
ax.axvline(ctrl_fb.mean(), color='#2E7D32', linestyle='--', linewidth=2)
ax.axvline(treat_fb.mean(), color='#C62828', linestyle='--', linewidth=2)
ax.set_xlabel('Facebook minutes/day')
ax.set_ylabel('Density')
ax.set_title('Facebook use plummeted\n(REAL DATA)')
ax.legend()
ax.text(ctrl_fb.mean()+5, ax.get_ylim()[1]*0.9, f'{ctrl_fb.mean():.0f} min',
        color='#2E7D32', fontsize=11)
ax.text(treat_fb.mean()+5, ax.get_ylim()[1]*0.8, f'{treat_fb.mean():.0f} min',
        color='#C62828', fontsize=11)

# Panel 2: Where did the time go? (from paper's published results)
ax = axes[1]
activities = ['Other social\nmedia', 'TV alone', 'Socializing\noffline']
te_values = [-0.12, 0.20, 0.16]  # published treatment effects in SD
colors_bar = ['#EF5350' if v < 0 else '#66BB6A' for v in te_values]
bars = ax.barh(activities, te_values, color=colors_bar, height=0.5, alpha=0.8)
ax.axvline(0, color='black', linewidth=0.5)
ax.set_xlabel('Treatment effect (standard deviations)')
ax.set_title('Where did the freed-up time go?\n(from paper Figure 3)')
for bar, val in zip(bars, te_values):
    x_pos = val + 0.01 if val > 0 else val - 0.01
    ha = 'left' if val > 0 else 'right'
    ax.text(x_pos, bar.get_y() + bar.get_height()/2,
            f'{val:+.2f} SD', va='center', ha=ha, fontsize=11)

# Panel 3: The 60-minute pie (from paper)
ax = axes[2]
time_alloc = [20, 16, 12, 12]
time_labels = ['TV alone\n(~20 min)', 'Socializing\n(~16 min)', 'Other online\n(~12 min)', 'Other offline\n(~12 min)']
time_colors = ['#FFA726', '#66BB6A', '#42A5F5', '#AB47BC']
wedges, texts, autotexts = ax.pie(time_alloc, labels=time_labels, colors=time_colors,
                                   autopct='%1.0f%%', startangle=90, textprops={'fontsize': 10})
ax.set_title('How 60 freed-up minutes\nwere reallocated')

plt.tight_layout()
plt.show()

print(f"Real data: Control group averaged {ctrl_fb.mean():.0f} min/day on Facebook.")
print(f"Treatment group averaged {treat_fb.mean():.0f} min/day (a {ctrl_fb.mean()-treat_fb.mean():.0f} min reduction).")
print("\nSurprise: quitting Facebook did NOT lead to more time on other social media.")
print("Instead, people watched more TV and spent more time with friends and family.")

## Part 4: The Big Picture (Reproducing Figure 3)

This is the paper's main result. Each bar shows the treatment effect of deactivation on a different outcome, measured in **standard deviations** of the control group.

Why standard deviations? Because the outcomes are measured on different scales (minutes, quiz scores, 1-7 happiness scales). Standardizing lets us compare apples to oranges.
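
As a small worked illustration, the sketch below standardizes a difference in means by the control group's standard deviation. The function name `standardized_effect` and the numbers are made up for this example. It shows that the same shift gives the same standardized effect whether the outcome is recorded in minutes or in hours.

```python
import numpy as np

def standardized_effect(treat, ctrl):
    """Difference in means divided by the control group's sample SD (ddof=1)."""
    treat = np.asarray(treat, dtype=float)
    ctrl = np.asarray(ctrl, dtype=float)
    return (treat.mean() - ctrl.mean()) / ctrl.std(ddof=1)

# Made-up numbers: everyone in "treatment" spends 30 fewer minutes
ctrl_minutes = np.array([40.0, 60.0, 80.0, 100.0])
treat_minutes = ctrl_minutes - 30.0

# The same data expressed in hours
ctrl_hours = ctrl_minutes / 60.0
treat_hours = treat_minutes / 60.0

effect_minutes = standardized_effect(treat_minutes, ctrl_minutes)
effect_hours = standardized_effect(treat_hours, ctrl_hours)
print(f"Effect using minutes: {effect_minutes:+.3f} SD")
print(f"Effect using hours:   {effect_hours:+.3f} SD")
```

The `ddof=1` choice matches the default of pandas' `.std()`, which the notebook's own calculations rely on.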

In [None]:
# Reproduce Figure 3: The headline results
# Treatment effects in SD units, with approximate 95% CIs

outcomes = [
    # (label, effect, ci_low, ci_high, category)
    ('Facebook minutes/day',    -1.30, -1.40, -1.20, 'Time Use'),
    ('Other social media',      -0.12, -0.22, -0.02, 'Time Use'),
    ('TV alone',                 0.20,  0.10,  0.30, 'Time Use'),
    ('Socializing offline',      0.16,  0.06,  0.26, 'Time Use'),
    ('', None, None, None, 'spacer'),
    ('News knowledge index',    -0.19, -0.30, -0.08, 'News & Politics'),
    ('Follows news',            -0.18, -0.29, -0.07, 'News & Politics'),
    ('Issue polarization',      -0.16, -0.27, -0.05, 'News & Politics'),
    ('Affective polarization',  -0.06, -0.17,  0.05, 'News & Politics'),
    ('Voter turnout',            0.07, -0.04,  0.18, 'News & Politics'),
    ('', None, None, None, 'spacer'),
    ('Happiness',                0.09,  0.00,  0.18, 'Well-being'),
    ('Life satisfaction',        0.08, -0.01,  0.17, 'Well-being'),
    ('Depression (lower=better)',-0.08, -0.17,  0.01, 'Well-being'),
    ('Loneliness',              -0.03, -0.12,  0.06, 'Well-being'),
    ('', None, None, None, 'spacer'),
    ('Post-exp Facebook use',   -0.61, -0.72, -0.50, 'Post-experiment'),
]

fig, ax = plt.subplots(figsize=(10, 12))

category_colors = {
    'Time Use': '#42A5F5',
    'News & Politics': '#EF5350',
    'Well-being': '#66BB6A',
    'Post-experiment': '#FFA726',
}

y_pos = 0
y_positions = []
y_labels = []
for label, effect, ci_lo, ci_hi, cat in outcomes:
    if cat == 'spacer':
        y_pos -= 0.5
        continue
    color = category_colors[cat]
    ax.barh(y_pos, effect, height=0.6, color=color, alpha=0.8)
    ax.plot([ci_lo, ci_hi], [y_pos, y_pos], color='black', linewidth=1.5)
    ax.plot(ci_lo, y_pos, 'k|', markersize=8)
    ax.plot(ci_hi, y_pos, 'k|', markersize=8)

    # Bold the significant ones
    sig = (ci_lo > 0) or (ci_hi < 0)
    weight = 'bold' if sig else 'normal'
    ax.text(-1.55, y_pos, label, va='center', ha='right', fontsize=11, fontweight=weight)
    ax.text(effect + 0.03 if effect > 0 else effect - 0.03, y_pos,
            f'{effect:+.2f}', va='center', ha='left' if effect > 0 else 'right',
            fontsize=9, color='#333')
    y_positions.append(y_pos)
    y_pos -= 1

ax.axvline(0, color='black', linewidth=1)
ax.set_xlabel('Treatment effect of deactivation (standard deviations)', fontsize=13)
ax.set_title('What happens when you quit Facebook for 4 weeks?\n(Reproducing Figure 3 from Allcott et al. 2020)',
             fontsize=15, pad=20)
ax.set_yticks([])
ax.set_xlim(-1.6, 0.5)

# Add category labels
cat_y = {
    'Time Use': -1.5,
    'News & Politics': -7,
    'Well-being': -12.5,
    'Post-experiment': -16.5
}
for cat, y in cat_y.items():
    ax.text(-1.58, y, cat, fontsize=12, fontweight='bold', color=category_colors[cat],
            va='center', ha='right', style='italic')

plt.tight_layout()
plt.show()

print("Bold labels = statistically significant at 95% level")
print("\nThe story: quitting Facebook makes you less informed but also")
print("less polarized, slightly happier, and much less likely to go back.")

## Part 5: News Knowledge and Political Polarization

Deactivation made people **less informed** but also **less polarized**. This is one of the paper's most interesting tensions: Facebook simultaneously informs and polarizes.

Let's look at these effects more carefully.

In [None]:
# Deep dive: News & Politics outcomes (REAL DATA)
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Panel 1: News knowledge distributions
ax = axes[0, 0]
ctrl_nk = df.loc[df.t==0, 'news_knowledge'].dropna()
treat_nk = df.loc[df.t==1, 'news_knowledge'].dropna()
bins = np.arange(-0.5, 16.5, 1)  # one bin centered on each integer score 0-15
ax.hist(ctrl_nk, bins=bins, alpha=0.6, color='#66BB6A', label='Control', density=True)
ax.hist(treat_nk, bins=bins, alpha=0.6, color='#EF5350', label='Treatment', density=True)
ax.axvline(ctrl_nk.mean(), color='#2E7D32', linestyle='--', linewidth=2)
ax.axvline(treat_nk.mean(), color='#C62828', linestyle='--', linewidth=2)
ax.set_xlabel('News knowledge score (0-15)')
ax.set_title('Deactivation reduced news knowledge')
ax.legend()
d = treat_nk.mean() - ctrl_nk.mean()
ax.annotate(f'Gap = {d:.1f} items', xy=(treat_nk.mean(), 0.02),
            xytext=(treat_nk.mean()-2, 0.15), fontsize=12,
            arrowprops=dict(arrowstyle='->', color='black'))

# Panel 2: Follow politics
ax = axes[0, 1]
ctrl_fp = df.loc[df.t==0, 'follow_politics_n'].dropna()
treat_fp = df.loc[df.t==1, 'follow_politics_n'].dropna()
bins = np.arange(0.5, 5.5, 1)  # follow_politics_n is coded 1-4
ax.hist(ctrl_fp, bins=bins, alpha=0.6, color='#66BB6A', label='Control', density=True)
ax.hist(treat_fp, bins=bins, alpha=0.6, color='#EF5350', label='Treatment', density=True)
ax.axvline(ctrl_fp.mean(), color='#2E7D32', linestyle='--', linewidth=2)
ax.axvline(treat_fp.mean(), color='#C62828', linestyle='--', linewidth=2)
ax.set_xlabel('Follow politics (1-4 scale)')
ax.set_title('Deactivation reduced political engagement')
ax.legend()

# Panel 3: The information-polarization tradeoff (from published effects)
ax = axes[1, 0]
from matplotlib.patches import Patch
categories_bar = ['News\nknowledge', 'Follows\nnews', 'Issue\npolarization', 'Affective\npolarization']
vals = [-0.19, -0.18, -0.16, -0.06]
colors_bar = ['#EF5350', '#EF5350', '#66BB6A', '#66BB6A']
bars = ax.barh(categories_bar, vals, color=colors_bar, height=0.5, alpha=0.8)
ax.axvline(0, color='black', linewidth=0.5)
ax.set_xlabel('Treatment effect (SD)')
ax.set_title('The information-polarization tradeoff')
for bar, val in zip(bars, vals):
    ax.text(val - 0.01, bar.get_y() + bar.get_height()/2, f'{val:+.2f}',
            va='center', ha='right', fontsize=11, fontweight='bold')
legend_elements = [Patch(facecolor='#EF5350', alpha=0.8, label='Costs of deactivation'),
                   Patch(facecolor='#66BB6A', alpha=0.8, label='Benefits of deactivation')]
ax.legend(handles=legend_elements, loc='lower left')

# Panel 4: Attitude polarization by party (REAL DATA)
ax = axes[1, 1]
# Use attitude_trump as a proxy for polarization
df['att_trump'] = df['attitude_trump_1_n']
for party, color, label in [('Democrat', '#2196F3', 'Democrats'), 
                              ('Republican', '#F44336', 'Republicans')]:
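    # Substring match, so labels such as 'Lean Democrat' (if present) are included
    sub = df[df['repdem'].str.contains(party, na=False)]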
    ctrl_mean = sub.loc[sub.t==0, 'att_trump'].mean()
    treat_mean = sub.loc[sub.t==1, 'att_trump'].mean()
    ax.bar([f'{label}\n(Control)', f'{label}\n(Deactivated)'],
           [ctrl_mean, treat_mean], color=color, alpha=0.7, width=0.4)
ax.set_ylabel('Attitude toward Trump (1-7)')
ax.set_title('Partisan attitudes: real data')

plt.suptitle('News, Politics, and Polarization (REAL DATA)', fontsize=16, y=1.02)
plt.tight_layout()
plt.show()

print("The core tension: Facebook keeps people informed AND polarized.")
print("Quitting reduces both. Is that a good tradeoff?")

## Part 6: Well-being Deep Dive

The well-being effects are positive but **small**. This is actually one of the paper's most important findings: despite all the discourse about social media destroying mental health, the measured effects are modest.

Let's visualize the well-being outcomes and think about what "small" means.
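
One way to build intuition for an effect of this size is the "probability of superiority": the chance that a randomly chosen treated person scores higher than a randomly chosen control person. The sketch below assumes both groups are normally distributed with equal variance, which is a simplification (a 1-7 survey scale is not normal), and the helper name `prob_superiority` is introduced here for illustration. Under those assumptions the probability is Φ(d/√2), where d is the effect in SD units.

```python
from math import erf, sqrt

def prob_superiority(d):
    """P(random treated score > random control score), assuming two normal
    distributions with equal variance whose means differ by d SDs."""
    z = d / sqrt(2)                       # the difference of two draws has SD sqrt(2)
    return 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF evaluated at z

for d in (0.0, 0.09, 0.5):
    print(f"d = {d:.2f} SD -> P(treated > control) = {prob_superiority(d):.3f}")
```

With no effect the probability is exactly one half, so the distance from 0.5 gives a rough sense of how much the two distributions are separated.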

In [None]:
# Well-being outcomes: REAL DATA
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

wellbeing_vars = [
    ('happiness', 'Happiness (1-7)', '#66BB6A', 'higher = better'),
    ('life_satisfaction', 'Life satisfaction (1-7)', '#42A5F5', 'higher = better'),
    ('depression', 'Depression index (1-6)', '#EF5350', 'lower = better'),
    ('loneliness', 'Loneliness (1-3)', '#FFA726', 'lower = better'),
]

for ax, (var, title, color, direction) in zip(axes.flat, wellbeing_vars):
    ctrl = df.loc[df.t==0, var].dropna()
    treat = df.loc[df.t==1, var].dropna()
    
    all_vals = pd.concat([ctrl, treat])
    bins = np.linspace(all_vals.min(), all_vals.max(), 20)

    ax.hist(ctrl, bins=bins, alpha=0.5, color='#9E9E9E', label=f'Control (n={len(ctrl)})', density=True)
    ax.hist(treat, bins=bins, alpha=0.5, color=color, label=f'Treatment (n={len(treat)})', density=True)

    ax.axvline(ctrl.mean(), color='#616161', linestyle='--', linewidth=2)
    ax.axvline(treat.mean(), color=color, linestyle='--', linewidth=2)

    d = treat.mean() - ctrl.mean()
    pooled_sd = np.sqrt((ctrl.std()**2 + treat.std()**2) / 2)
    d_sd = d / pooled_sd if pooled_sd > 0 else 0
    t_stat, p_val = stats.ttest_ind(treat, ctrl)
    sig = f'p={p_val:.3f}' if p_val >= 0.001 else 'p<0.001'
    
    ax.set_title(f'{title} ({direction})\nDiff: {d:+.2f} ({d_sd:+.2f} SD, {sig})', fontsize=11)
    ax.set_xlabel('Score')
    ax.legend(fontsize=9)

plt.suptitle('Well-being outcomes from REAL DATA', fontsize=15, y=1.02)
plt.tight_layout()
plt.show()

print("How big is 0.09 SD?")
print("=" * 50)
print("For comparison, published effect sizes:")
print("  Cognitive behavioral therapy for depression:  ~0.5-0.8 SD")
print("  Regular exercise on mood:                     ~0.3-0.5 SD")
print("  Facebook deactivation on happiness:           ~0.09 SD")
print("  Winning $1,000 on life satisfaction:           ~0.01 SD")
print()
print("The effect is real but small. The paper's own estimate:")
print("deactivation is worth about 0.11 SD of well-being,")
print("or roughly $30-50/month in equivalent compensation.")

## Part 7: Post-experiment Behavior

Here's a striking result: after the experiment ended and participants could go back to Facebook freely, the **treatment group used Facebook significantly less** than the control group.

This suggests that some of Facebook's hold on users comes from **habit**, not from ongoing enjoyment. Once the habit was broken by four weeks of paid deactivation, people didn't fully return.

In [None]:
# Post-experiment behavior (REAL DATA)
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Panel 1: During-experiment FB minutes by group
ax = axes[0]
ctrl_fb = df.loc[df.t==0, 'fb_minutes'].dropna()
treat_fb = df.loc[df.t==1, 'fb_minutes'].dropna()

x = np.arange(2)
width = 0.3
# During experiment: real data; Post-experiment: published effect (-0.61 SD)
ctrl_post_est = ctrl_fb.mean()  # control stays ~same
treat_post_est = ctrl_fb.mean() - 0.61 * ctrl_fb.std()  # published treatment effect
ax.bar(x - width/2, [ctrl_fb.mean(), ctrl_post_est], width,
       label='Control', color='#66BB6A', alpha=0.8)
ax.bar(x + width/2, [treat_fb.mean(), treat_post_est], width,
       label='Treatment', color='#EF5350', alpha=0.8)
ax.set_xticks(x)
ax.set_xticklabels(['During\nexperiment', 'After\nexperiment\n(estimated)'])
ax.set_ylabel('Facebook minutes/day')
ax.set_title('Facebook use: during vs. after')
ax.legend()
ax.annotate(f'Gap persists!\n(~{ctrl_post_est - treat_post_est:.0f} min/day)',
            xy=(1, treat_post_est), xytext=(1.3, treat_post_est + 15),
            fontsize=11, arrowprops=dict(arrowstyle='->', color='black'))

# Panel 2: Future plans for FB use (endline survey question)
ax = axes[1]
# fb_use_plan asks about future Facebook use intentions
plans_ctrl = df.loc[df.t==0, 'fb_use_plan'].dropna().value_counts(normalize=True).sort_index()
plans_treat = df.loc[df.t==1, 'fb_use_plan'].dropna().value_counts(normalize=True).sort_index()
# Merge and plot
plans = pd.DataFrame({'Control': plans_ctrl, 'Treatment': plans_treat}).fillna(0)
if len(plans) > 0:
    plans.plot(kind='barh', ax=ax, color=['#66BB6A', '#EF5350'], alpha=0.8)
    ax.set_xlabel('Proportion')
    ax.set_title('Plans for future Facebook use')
    ax.legend()
else:
    ax.text(0.5, 0.5, 'fb_use_plan data\nnot available', ha='center', va='center', 
            transform=ax.transAxes, fontsize=14)
    ax.set_title('Plans for future Facebook use')

# Panel 3: The "revealed preference" puzzle
ax = axes[2]
# WTA vs actual welfare gain
labels = ['What users say\nFB is worth\n(WTA)', 'Actual welfare\ngain from\ndeactivation']
values = [102, 40]  # ~$102 WTA median vs ~$40/month welfare equivalent
colors = ['#5C6BC0', '#66BB6A']
bars = ax.bar(labels, values, color=colors, width=0.5, alpha=0.8)
ax.set_ylabel('$/month equivalent')
ax.set_title('Revealed vs. experienced preference')
for bar, val in zip(bars, values):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 2,
            f'~${val}', ha='center', fontsize=14, fontweight='bold')

ax.annotate('', xy=(1, 102), xytext=(0, 102),
            arrowprops=dict(arrowstyle='<->', color='red', linewidth=2))
ax.text(0.5, 106, 'People overvalue\ntheir own FB use!', ha='center',
        fontsize=11, color='red', fontweight='bold')

plt.tight_layout()
plt.show()

print("Key finding: WTA (willingness to accept) was ~$100/month, but")
print("the actual well-being gain from quitting was only ~$40/month.")
print()
print("This gap suggests people OVERESTIMATE how much they'd miss Facebook.")
print("The paper calls this evidence of 'internality' (a wedge between")
print("predicted and experienced utility), possibly driven by habit or addiction.")

## Part 8: Running Your Own Analysis

Let's estimate the treatment effects ourselves using OLS regression. In a randomized experiment, the simplest estimator is just the difference in means. OLS with controls for baseline covariates improves precision.

The key equation:

$$Y_i = \alpha + \tau \cdot T_i + X_i'\beta + \epsilon_i$$

where $T_i$ is the treatment indicator and $\tau$ is the Average Treatment Effect (ATE).
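
A quick check of the claim that the simplest estimator is the difference in means: with no covariates $X_i$, the OLS estimate of $\tau$ equals the treatment-control difference in means exactly. The sketch below uses simulated data (the sample size, the effect of 0.25, and the noise level are arbitrary assumptions) and solves the regression with NumPy's least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data (not the study's): binary treatment and a noisy outcome
n = 500
T = rng.integers(0, 2, size=n)
Y = 4.0 + 0.25 * T + rng.normal(0, 1, size=n)

# OLS of Y on a constant and T, solved by least squares
X = np.column_stack([np.ones(n), T])
alpha_hat, tau_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

diff_in_means = Y[T == 1].mean() - Y[T == 0].mean()
print(f"OLS tau-hat:         {tau_hat:.4f}")
print(f"Difference in means: {diff_in_means:.4f}")
print(f"OLS alpha-hat:       {alpha_hat:.4f} (equals the control-group mean)")
```

Adding covariates changes the arithmetic but not the logic: under randomization the coefficient on $T_i$ still estimates the ATE, typically with a smaller standard error.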

In [None]:
from scipy.stats import ttest_ind
import statsmodels.api as sm

# Method 1: Simple difference in means (valid because of randomization)
print("METHOD 1: Difference in Means (REAL DATA)")
print("=" * 60)
print(f"{'Outcome':<25} {'Diff':>7} {'SE':>7} {'t-stat':>7} {'p-val':>8}  Sig?")
print("-" * 60)

outcomes_to_test = [
    ('fb_minutes', 'Facebook min/day'),
    ('news_knowledge', 'News knowledge'),
    ('follow_politics_n', 'Follows politics'),
    ('happiness', 'Happiness'),
    ('life_satisfaction', 'Life satisfaction'),
    ('depression', 'Depression'),
    ('loneliness', 'Loneliness'),
    ('wta', 'WTA ($)'),
]

for var, label in outcomes_to_test:
    treat = df.loc[df.t==1, var].dropna()
    ctrl = df.loc[df.t==0, var].dropna()
    if len(treat) < 2 or len(ctrl) < 2:
        print(f"  {label:<23} (insufficient data)")
        continue
    diff = treat.mean() - ctrl.mean()
    t_stat, p_val = ttest_ind(treat, ctrl)
    se = abs(diff / t_stat) if t_stat != 0 else 0
    sig = '***' if p_val < 0.01 else '**' if p_val < 0.05 else '*' if p_val < 0.1 else ''
    print(f"  {label:<23} {diff:>+7.3f} {se:>7.3f} {t_stat:>7.2f} {p_val:>8.4f}  {sig}")

print()
print("*** p<0.01, ** p<0.05, * p<0.1")

# Method 2: OLS with demographic controls (more precise)
print("\n\nMETHOD 2: OLS with demographic controls (happiness)")
print("=" * 60)

# Create demographic dummies from baseline
df['is_democrat'] = df['repdem'].str.contains('Democrat', na=False).astype(float)
df['is_republican'] = df['repdem'].str.contains('Republican', na=False).astype(float)
df['is_college'] = df['educ_prescreen'].isin([
    "Bachelor's degree", 
    "Graduate degree (for example: MA, MBA, JD,PhD)"
]).astype(float)

# Build regression
reg_vars = ['t', 'is_democrat', 'is_republican', 'is_college']
reg_df = df[reg_vars + ['happiness']].dropna()
X = sm.add_constant(reg_df[reg_vars])
y = reg_df['happiness']
model = sm.OLS(y, X).fit()

print(model.summary().tables[1])
print(f"\nTreatment effect on happiness:")
simple_diff = df.loc[df.t==1, 'happiness'].mean() - df.loc[df.t==0, 'happiness'].mean()
print(f"  Without controls: {simple_diff:.4f}")
print(f"  With controls:    {model.params['t']:.4f}")
print(f"  (Controls tighten the SE but shouldn't change the point estimate much)")
print(f"\nN = {len(reg_df)}, R-squared = {model.rsquared:.4f}")

## Part 9: Connecting the Readings

This week's readings form a coherent picture. Let's map the connections.

| | Allcott et al. 2020 | Allcott et al. 2024 | Chmel et al. 2025 |
|---|---|---|---|
| **Design** | Deactivation RCT | Deactivation + feed manipulation | Observational + natural experiment |
| **Platform** | Facebook | Facebook + Instagram | Multiple |
| **N** | 1,661 | 23,000+ (3 papers combined) | Varies |
| **Key manipulation** | Remove all SM exposure | Remove SM / change feed content | SM creators as political actors |
| **Time period** | Oct 2018 (midterms) | Sep-Nov 2020 (presidential) | Recent |
| **Key finding** | Small well-being gain, less informed, less polarized | No detectable effect on polarization, affective polarization, or beliefs | Creators shape political views |

### The intellectual arc

1. **Allcott 2020** asks: what happens when you *remove* social media entirely?
2. **Allcott 2024** asks: is it the *platform* or the *content* that matters? (Answer: the specific content features like reshares and algorithmic ranking had surprisingly small effects on measured political outcomes)
3. **Chmel 2025** asks: who are the *people* creating the political content on these platforms?

Together they suggest: social media's political effects may be less about algorithmic manipulation and more about the ecosystem of creators and the habits they reinforce.

In [None]:
# Visualize the "theory of change" across all three papers
fig, ax = plt.subplots(figsize=(14, 7))
ax.set_xlim(0, 10)
ax.set_ylim(0, 8)
ax.axis('off')

# Draw boxes for each paper

# Allcott 2020
ax.add_patch(plt.Rectangle((0.2, 5.5), 3, 2, fill=True, facecolor='#BBDEFB',
                            edgecolor='#1565C0', linewidth=2, alpha=0.8, zorder=2))
ax.text(1.7, 7.0, 'Allcott et al. 2020', fontsize=12, fontweight='bold',
        ha='center', va='center', color='#1565C0')
ax.text(1.7, 6.3, 'Remove Facebook\nentirely', fontsize=11,
        ha='center', va='center')
ax.text(1.7, 5.8, 'Result: less informed,\nless polarized, slightly happier',
        fontsize=9, ha='center', va='center', style='italic')

# Allcott 2024
ax.add_patch(plt.Rectangle((3.7, 5.5), 3, 2, fill=True, facecolor='#C8E6C9',
                            edgecolor='#2E7D32', linewidth=2, alpha=0.8, zorder=2))
ax.text(5.2, 7.0, 'Allcott et al. 2024', fontsize=12, fontweight='bold',
        ha='center', va='center', color='#2E7D32')
ax.text(5.2, 6.3, 'Change the feed\n(algorithm, reshares)', fontsize=11,
        ha='center', va='center')
ax.text(5.2, 5.8, 'Result: surprisingly\nsmall effects', fontsize=9,
        ha='center', va='center', style='italic')

# Chmel 2025
ax.add_patch(plt.Rectangle((7.2, 5.5), 2.6, 2, fill=True, facecolor='#FFE0B2',
                            edgecolor='#E65100', linewidth=2, alpha=0.8, zorder=2))
ax.text(8.5, 7.0, 'Chmel et al. 2025', fontsize=12, fontweight='bold',
        ha='center', va='center', color='#E65100')
ax.text(8.5, 6.3, 'Study the creators\nwho make the content', fontsize=11,
        ha='center', va='center')
ax.text(8.5, 5.8, 'Result: creators shape\npolitical attitudes', fontsize=9,
        ha='center', va='center', style='italic')

# Arrows between papers
ax.annotate('', xy=(3.7, 6.5), xytext=(3.2, 6.5),
            arrowprops=dict(arrowstyle='->', linewidth=2, color='gray'))
ax.annotate('', xy=(7.2, 6.5), xytext=(6.7, 6.5),
            arrowprops=dict(arrowstyle='->', linewidth=2, color='gray'))

# Bottom: the evolving question
ax.add_patch(plt.Rectangle((1, 0.5), 8, 2.5, fill=True, facecolor='#F3E5F5',
                            edgecolor='#6A1B9A', linewidth=2, alpha=0.7, zorder=2))
ax.text(5, 2.5, 'The evolving research question', fontsize=13, fontweight='bold',
        ha='center', va='center', color='#6A1B9A')
ax.text(5, 1.8, '2020: "Does social media affect welfare?"  (Yes, a little)',
        fontsize=11, ha='center', va='center')
ax.text(5, 1.3, '2024: "Is it the algorithm?"  (Not really)',
        fontsize=11, ha='center', va='center')
ax.text(5, 0.8, '2025: "Is it the people making content?"  (Looks like it)',
        fontsize=11, ha='center', va='center')

# Arrows from papers to bottom box
for x in [1.7, 5.2, 8.5]:
    ax.annotate('', xy=(x, 3.0), xytext=(x, 5.5),
                arrowprops=dict(arrowstyle='->', linewidth=1.5, color='#9E9E9E',
                               connectionstyle='arc3,rad=0'))

# Title
ax.text(5, 7.8, "This week's readings: three studies, one evolving question",
        fontsize=15, fontweight='bold', ha='center', va='center')

plt.tight_layout()
plt.show()

## Part 10: Exercises

Try modifying the code above to explore these questions:

### Exercise 1: Heterogeneous treatment effects
Do the well-being effects differ by party? Modify the OLS regression to include an **interaction term** between `t` and `is_democrat` (or `is_republican`).

```python
# Hint: create an interaction variable
df['treat_x_dem'] = df['t'] * df['is_democrat']
# Then add it to the regression
```
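
To help interpret the interaction coefficient, here is a minimal sketch on simulated data (not the replication file). The effect sizes and the helper `subgroup_effect` are assumptions for illustration. In a regression on a constant, `t`, `dem`, and `t * dem`, the coefficient on `t` is the treatment effect for non-Democrats, and the interaction coefficient is the difference between the Democrat and non-Democrat effects.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
t = rng.integers(0, 2, size=n).astype(float)     # randomized treatment
dem = rng.integers(0, 2, size=n).astype(float)   # hypothetical party dummy
# Assumed effects: 0.05 for non-Democrats, 0.05 + 0.10 for Democrats
y = 0.05 * t + 0.20 * dem + 0.10 * t * dem + rng.normal(size=n)

# Regression with main effects and the interaction term
X = np.column_stack([np.ones(n), t, dem, t * dem])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
const, b_t, b_dem, b_inter = beta


def subgroup_effect(mask):
    """Difference in mean outcomes (treated minus control) within a subgroup."""
    return y[mask & (t == 1)].mean() - y[mask & (t == 0)].mean()


effect_dem = subgroup_effect(dem == 1)
effect_non = subgroup_effect(dem == 0)

print(f"Effect among non-Democrats: {effect_non:.4f}  (coefficient on t: {b_t:.4f})")
print(f"Effect among Democrats:     {effect_dem:.4f}")
print(f"Difference in effects:      {effect_dem - effect_non:.4f}  (interaction: {b_inter:.4f})")
```

Because this model is fully saturated in the two dummies, the regression reproduces the subgroup differences in means exactly. Once you add other controls, the match is only approximate.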

### Exercise 2: Multiple testing
We tested 8 outcomes. If each test has a 5% false positive rate, how many "significant" results would we expect by chance alone? Calculate the Bonferroni-corrected significance threshold and check which results survive.
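
The arithmetic can be sketched as follows. The p-values and outcome names here are hypothetical placeholders, not results from the paper or the replication data; substitute the p-values printed by Method 1. The expected count assumes every null hypothesis is true, and the family-wise error rate shown assumes the tests are independent.

```python
n_tests = 8
alpha = 0.05

# Expected number of false positives if all 8 null hypotheses are true
expected_false_positives = n_tests * alpha

# Chance of at least one false positive, assuming independent tests
fwer_independent = 1 - (1 - alpha) ** n_tests

# Bonferroni: divide the significance threshold by the number of tests
bonferroni_alpha = alpha / n_tests

# Hypothetical p-values for illustration only
p_values = {
    "outcome_a": 0.0004,
    "outcome_b": 0.012,
    "outcome_c": 0.048,
    "outcome_d": 0.31,
}
survivors = [name for name, p in p_values.items() if p < bonferroni_alpha]

print(f"Expected false positives:      {expected_false_positives:.2f}")
print(f"P(at least one false positive): {fwer_independent:.3f}")
print(f"Bonferroni threshold:          {bonferroni_alpha:.5f}")
print(f"Survive correction:            {survivors}")
```

Bonferroni is conservative, especially when outcomes are correlated (as the well-being measures here likely are), so treat it as a strict bound rather than the only reasonable correction.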

### Exercise 3: External validity
The sample is younger, more educated, and more Democratic than the US population. Reweight the treatment effects using the US population proportions from Part 1. Do the results change?
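
One simple way to approach this is post-stratification: estimate the treatment effect within each demographic cell, then average the cell effects using population shares instead of sample shares. The sketch below uses simulated data with a single stratum (under 30 versus 30 and over). The shares 0.52 (sample) and 0.21 (US population) come from the Part 1 table; the stratum-specific effects and the helper `ate` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
young = rng.random(n) < 0.52          # sample share under 30, from Part 1
t = rng.integers(0, 2, size=n)        # randomized treatment
# Assumed effects for illustration: 0.20 if under 30, 0.05 otherwise
true_effect = np.where(young, 0.20, 0.05)
y = true_effect * t + rng.normal(size=n)


def ate(mask):
    """Difference in means (treated minus control) within a stratum."""
    return y[mask & (t == 1)].mean() - y[mask & (t == 0)].mean()


ate_young = ate(young)
ate_old = ate(~young)

sample_share_young = young.mean()
pop_share_young = 0.21                # US population share under 30, from Part 1

# Weight the stratum effects by sample shares, then by population shares
ate_sample_weighted = sample_share_young * ate_young + (1 - sample_share_young) * ate_old
ate_pop_weighted = pop_share_young * ate_young + (1 - pop_share_young) * ate_old

print(f"Effect, under 30:           {ate_young:.4f}")
print(f"Effect, 30 and over:        {ate_old:.4f}")
print(f"Sample-weighted effect:     {ate_sample_weighted:.4f}")
print(f"Population-weighted effect: {ate_pop_weighted:.4f}")
```

Reweighting only changes the answer when effects differ across strata, and it cannot correct for differences between the sample and the population on characteristics you did not stratify on.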

### Exercise 4: Consumer surplus
The paper estimates Facebook is worth ~$100/month to users (WTA) but deactivation only improves well-being by ~$40/month equivalent. Where does the other $60 go? Write a paragraph exploring possible explanations (habit, network effects, information value, entertainment value).

### Exercise 5: Connecting to Chmel et al. 2025
If social media creators are the primary channel through which platforms shape politics (Chmel's argument), what would you predict happens when you deactivate Facebook? Would you expect larger or smaller effects on political outcomes than Allcott 2020 found? Why?

---

*This notebook uses real anonymized replication data from [openICPSR project 112081](https://www.openicpsr.org/openicpsr/project/112081) (CC BY 4.0). Citation: Allcott, Hunt, Braghieri, Luca, Eichmeyer, Sarah, and Gentzkow, Matthew. Replication data for: The Welfare Effects of Social Media. Nashville, TN: American Economic Association [publisher], 2020. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2020-01-29.*