---
title: "Meta-analysis Assignment"
author: "Azizbek Ganiev"
date: today
output:
  pdf_document:
    fig-align: center
execute:
  enabled: true
  echo: false
  engine: jupyter
---


# Meta-Analysis of Children’s Toy Preferences


## Step 1: Import the Data



In [None]:
import pandas as pd

# Load Excel file
file_path = "../11. Metaanalysis/data/metaanalysis_data.xlsx"
df = pd.read_excel(file_path)

df.head()


## Step 2: Combine Effects (Meta-Analysis)
We compute an effect size for each study: the standardized mean difference (Cohen’s d) between boys and girls in time spent playing with male-typed and with female-typed toys.

Cohen’s d is calculated for two comparisons. In both, the group whose gender matches the toy type comes first, so a positive d means children played longer with toys typed for their own gender.

Toy Type 1: Male-Typed Toys

Compare:

- Boys vs Girls on time spent playing with male toys
- Variables: Mean_boys_play_male, SD_boys_play_male, N_boys
- Variables: Mean_girls_play_male, SD_girls_play_male, N_girls

Toy Type 2: Female-Typed Toys

Compare:

- Girls vs Boys on time spent playing with female toys
- Variables: Mean_girls_play_female, SD_girls_play_female, N_girls
- Variables: Mean_boys_play_female, SD_boys_play_female, N_boys
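
As a quick sanity check on the formula, the sketch below computes Cohen’s d and its approximate sampling variance for a single made-up study. The numbers (120 s vs 80 s, SD 40, 20 children per group) and the helper name `var_cohen_d` are assumptions for illustration only and do not come from the dataset.

```python
import numpy as np

def cohen_d(m1, m2, sd1, sd2, n1, n2):
    """Standardized mean difference using the pooled standard deviation."""
    sdp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sdp

def var_cohen_d(d, n1, n2):
    """Large-sample approximation to the sampling variance of d."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

# Hypothetical study: boys average 120 s (SD 40) and girls 80 s (SD 40)
# with male-typed toys, 20 children per group.
d = cohen_d(120, 80, 40, 40, 20, 20)
v = var_cohen_d(d, 20, 20)
print(f"d = {d:.2f}, variance = {v:.4f}, SE = {np.sqrt(v):.4f}")
```

Because both groups have the same SD here, the pooled SD is 40 and d is simply the 40 s mean difference divided by 40.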


In [None]:
import numpy as np

def cohen_d(m1, m2, sd1, sd2, n1, n2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    sdp = np.sqrt(((n1 - 1)*sd1**2 + (n2 - 1)*sd2**2)/(n1 + n2 - 2))
    return (m1 - m2)/sdp

# Male toys: Boys vs Girls
df['d_male_toys'] = cohen_d(
    df['Mean_boys_play_male'],
    df['Mean_girls_play_male'],
    df['SD_boys_play_male'],
    df['SD_girls_play_male'],
    df['N_boys'],
    df['N_girls']
)

# Female toys: Girls vs Boys
df['d_female_toys'] = cohen_d(
    df['Mean_girls_play_female'],
    df['Mean_boys_play_female'],
    df['SD_girls_play_female'],
    df['SD_boys_play_female'],
    df['N_girls'],
    df['N_boys']
)

# Approximate sampling variance of each d, and inverse-variance weights
df['var_d_male'] = (df['N_boys'] + df['N_girls']) / (df['N_boys'] * df['N_girls']) + df['d_male_toys']**2 / (2 * (df['N_boys'] + df['N_girls']))
df['weight_male'] = 1 / df['var_d_male']

df['var_d_female'] = (df['N_girls'] + df['N_boys']) / (df['N_girls'] * df['N_boys']) + df['d_female_toys']**2 / (2 * (df['N_girls'] + df['N_boys']))
df['weight_female'] = 1 / df['var_d_female']

# Pooled (fixed-effect) effect sizes: inverse-variance weighted means.
# pandas .sum() skips studies with missing values instead of returning NaN.
pooled_d_male = (df['d_male_toys'] * df['weight_male']).sum() / df['weight_male'].sum()
pooled_d_female = (df['d_female_toys'] * df['weight_female']).sum() / df['weight_female'].sum()

print(f"Pooled effect size for male-typed toys: {pooled_d_male:.2f}")
print(f"Pooled effect size for female-typed toys: {pooled_d_female:.2f}")
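
A pooled point estimate on its own says nothing about its uncertainty or about how much the studies disagree. The sketch below shows one common way to add a 95% confidence interval, a z-test, and the heterogeneity statistics Cochran's Q and I² to an inverse-variance pooled estimate. The four effect sizes and variances are made up for illustration; in the notebook they would be replaced by the `d_male_toys` and `var_d_male` columns.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical effect sizes and variances for four studies (illustration only)
d = np.array([0.9, 1.2, 0.6, 1.0])
var_d = np.array([0.10, 0.08, 0.12, 0.05])

w = 1 / var_d                        # inverse-variance weights
pooled = np.sum(w * d) / np.sum(w)   # fixed-effect pooled estimate
se = np.sqrt(1 / np.sum(w))          # standard error of the pooled estimate

z_crit = norm.ppf(0.975)             # about 1.96 for a 95% interval
ci_low, ci_high = pooled - z_crit * se, pooled + z_crit * se
p_value = 2 * norm.sf(abs(pooled / se))

# Cochran's Q: weighted squared deviations from the pooled estimate.
# I^2: share of the variation in Q beyond what sampling error alone predicts.
q = np.sum(w * (d - pooled) ** 2)
dof = len(d) - 1
i2 = max(0.0, (q - dof) / q) * 100 if q > 0 else 0.0

print(f"Pooled d = {pooled:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], p = {p_value:.4f}")
print(f"Q = {q:.2f} on {dof} df, I^2 = {i2:.1f}%")
```

A large I² would argue for a random-effects model rather than the fixed-effect pooling used above.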



## Step 3: Funnel Plot
A funnel plot shows each study's effect size against its precision (1/SE) and serves as a visual check for publication bias and heterogeneity.


In [None]:
import matplotlib.pyplot as plt

# Effect sizes, variances, and study labels computed in Step 2
yi = df['d_male_toys'].values
vi = df['var_d_male'].values
labels = df['Study'].values

plt.figure(figsize=(8,6))
plt.scatter(yi, 1/np.sqrt(vi))
# Label each point by position, so a non-default DataFrame index cannot break the lookup
for x, v, label in zip(yi, vi, labels):
    plt.text(x, 1/np.sqrt(v), str(label), fontsize=8)
plt.axvline(pooled_d_male, color='red', linestyle='--')  # pooled estimate
plt.xlabel("Effect Size (Cohen's d)")
plt.ylabel("Precision (1/SE)")
plt.title("Funnel Plot for Male-Typed Toys")
plt.show()

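A roughly symmetric funnel around the pooled estimate (red dashed line) suggests no major publication bias. Asymmetry, such as low-precision studies clustering on one side, may indicate small-study effects or selective reporting, although with only a few studies the visual impression is unreliable.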

## Step 4: Does Methodology / Quality Affect Results?

This step recomputes Cohen’s d and its variance for male-typed toys, then fits a meta-regression with study-level moderators.


In [None]:
import pandas as pd
import numpy as np
import statsmodels.api as sm


# Compute Cohen's d for male-typed toys
def cohen_d(m1, m2, sd1, sd2, n1, n2):
    sdp = np.sqrt(((n1 - 1)*sd1**2 + (n2 - 1)*sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sdp

# Male toys: Boys vs Girls
df['d_male_toys'] = cohen_d(
    df['Mean_boys_play_male'],
    df['Mean_girls_play_male'],
    df['SD_boys_play_male'],
    df['SD_girls_play_male'],
    df['N_boys'],
    df['N_girls']
)

# Variance of d (large-sample approximation; depends only on d and the group sizes)
def var_d(d, n1, n2):
    return (n1 + n2)/(n1 * n2) + d**2/(2*(n1 + n2))

df['var_d_male'] = var_d(
    df['d_male_toys'],
    df['N_boys'],
    df['N_girls']
)

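# Define moderators and drop studies with a missing effect size or moderator
moderators = ['NOS score', 'Country', 'Parent present', 'Setting']
df_clean = df.dropna(subset=['d_male_toys', 'var_d_male'] + moderators)

# Dummy-code any text-valued moderators (assumed here for columns such as Country);
# numeric columns pass through unchanged
X = pd.get_dummies(df_clean[moderators], drop_first=True, dtype=float)
X = sm.add_constant(X.astype(float))  # Add intercept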

# Effect sizes and weights
y = df_clean['d_male_toys']
weights = 1 / df_clean['var_d_male']

# Weighted regression
model = sm.WLS(y, X, weights=weights)
results = model.fit()

# Print summary
print(results.summary())

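- A significant coefficient indicates that the moderator is associated with the size of the boy-girl difference in play with male-typed toys.
- For example, a significant NOS score coefficient would mean that effect sizes vary with methodological quality.
- A significant Country coefficient would mean that effect sizes differ between countries, which could reflect country-level factors such as gender inequality.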

## Step 5: Does Author Gender Affect It?

In [None]:
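# Keep studies with a known effect size, variance, and author count;
# otherwise a missing 'Female authors' value would silently be coded as 0
df_author = df.dropna(subset=['d_male_toys', 'var_d_male', 'Female authors']).copy()
# Create a binary variable: 1 if there are any female authors, else 0
df_author['any_female_author'] = (df_author['Female authors'] > 0).astype(int)
# Use only author gender as predictor
X_author = sm.add_constant(df_author['any_female_author'])
y = df_author['d_male_toys']
weights = 1 / df_author['var_d_male']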

model_author = sm.WLS(y, X_author, weights=weights)
results_author = model_author.fit()

print(results_author.summary())

If any_female_author has a significant coefficient, it suggests that studies with at least one female author report different effect sizes than studies with none.